I’m currently doing Work Experience at Microsoft Research in Cambridge (MSR Cambridge) and despite my gripes about Microsoft software, the research centre and some of the stuff people are doing there is pretty cool. Here’s what I’ve done so far and what I think of the place / stuff.
I arrived in Cambridge to a 4-storey house all to myself (!) complete with dishwasher, washing machine, decent cooking stuff etc. I saw the wifi router but it wasn’t broadcasting beacons – it was a hidden SSID. So I spent about 3 hours capping packets to no avail on a nearby WEP wifi and finally realised that (a) there was an ethernet cable sticking out of the wifi router which I could plug into, and (b) the router had its hidden SSID written on it.
After waking up the next day to three alarms (alarm clock, laptop and watch, 5 minutes apart; I wanted to make sure I actually got up on time … for once), I left my laptop at home with three layers of Lifehacker-inspired protection: a physical lock, Yawcam for webcam motion detection, and LaptopAlarm. To steal it, a thief would have to have his/her photo uploaded to an FTP server and then walk around Cambridge with a chest of drawers attached to a laptop that starts screaming the second it’s unplugged. I then walked the 1.6 mile commute to MSR on the West Cambridge Site, NW of the city. When I eventually arrived I was issued a temporary pass, a Microsoft Research rucksack, some freebies and a ‘wottle’ – MS, in an attempt to do the environment some good, decided to issue refillable recycled plastic/rubber (?) bottles. The person who came up with the idea evidently has the same lack of aptitude as I do when it comes to names; it’s got ‘wottled by you’ written on the side…
I also met my supervisor, who explained a little about the project I was to be working on (Infer.NET, in the machine learning area) and what it is all about: Bayesian inference. It’s basically a system of statistical modelling which takes in information about distributions, observed values and interactions between random variables, and infers new distributions by applying Bayes’ Theorem (which SJRH gave us a talk on in the dying weeks of term). In terms of the theory, all he said was that when there are several distributions it gets pretty complicated and uber-badass integration (not his words) is necessary, which wasn’t really possible computationally until an approximate method was found a couple of years ago. He also introduced me to a couple of the team members, one of whom has a firm belief that random variables should be a part of programming and are as valid as, if not more valid than, ‘normal’ everyday data types like integers.
Eventually we settled on my project. I’d mentioned the Monty Hall problem when discussing Bayes’ Theorem, and he proposed that I build a display implementing it with Infer.NET, both as a simple project and to give me an idea of how something like that would be implemented using this framework. At that point I was more or less left to my own devices to familiarise myself with the workstation and Visual Studio and to complete a stack of paperwork I’d been given. I think it was at about this point that I discovered I was working on an 8-core Dell Precision workstation. Serious overkill – as Guy pointed out, employees probably spend most of their time gaming and consequently don’t get much done!
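For anyone who hasn't seen how Bayes’ Theorem settles the Monty Hall problem, here is a minimal sketch in plain Python. It is not Infer.NET code and not what I wrote at MSR; the function name `monty_hall_posterior` and its parameters are made up for illustration. It assumes the usual rules: the host knows where the car is, opens exactly one door that is neither the contestant's pick nor the car, and chooses uniformly at random when more than one door is allowed.

```python
from fractions import Fraction


def monty_hall_posterior(num_doors=3, picked=0, opened=2):
    """Posterior over the car's location after the host opens one goat door.

    Hypothetical helper for illustration: doors are numbered from 0,
    `picked` is the contestant's door and `opened` is the host's door.
    """
    doors = range(num_doors)

    # Prior: the car is equally likely to be behind any door.
    prior = {car: Fraction(1, num_doors) for car in doors}

    def likelihood(car):
        # P(host opens `opened` | car is behind `car`).
        # The host never opens the contestant's door or the car's door.
        allowed = [d for d in doors if d != picked and d != car]
        if opened not in allowed:
            return Fraction(0)
        return Fraction(1, len(allowed))

    # Bayes' Theorem: posterior is prior times likelihood, renormalised.
    unnormalised = {car: prior[car] * likelihood(car) for car in doors}
    evidence = sum(unnormalised.values())
    return {car: p / evidence for car, p in unnormalised.items()}


posterior = monty_hall_posterior()
for door, probability in posterior.items():
    print(f"door {door}: {probability}")
```

With the defaults (pick door 0, host opens door 2) the posterior comes out as 1/3 for sticking and 2/3 for switching to door 1, which is the classic answer.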
My supervisor (I think the legal paperwork, of which there was much, permits me to give his name – John Guiver) took me out for lunch at the Cavendish Laboratory Canteen (the Cavendish Lab is near the MS building), according to some sort of tradition involving new people joining the team. Actually the canteen there was pretty standard (I was expecting wonders from the great Cavendish), though the food is good and cheap – that’s all that really matters. One of the other team members, John Winn (random variables in programming dude), had been to MIT. Apparently they have whiteboards in the loos and people leave each other problems on them, sangaku style!
There was some sort of filming going on in the main lecture theatre – bright white theatre floodlights adorned the roof and several cameras were propped up on tripods all over the place. After lunch I ended up agreeing to help out which mostly involved milling around displays of Infer.NET related programs for a time lapse vid. This turned out to be really awesome because of the way the displays were implemented – massive multi-touch touch screen coffee tables! They could also be operated by circular counters placed on the tabletop, used like dials (twist to change settings), similar to a touch screen I’d seen on youtube demonstrating a physics simulator. There was also a(n incomplete) machine vision display which proudly labeled a pair of scissors as a book…
Afterwards I grabbed myself the Infer.NET DLLs and started trying to implement the Monty Hall problem. There was a problem with the model and I ended up (wilfully) working almost an hour overtime on my first day! I got home and tried for the first time (and failed) to play Dawn of War Soulstorm with Guy over Hamachi-induced WAN.
Thursday was a particularly interesting day. The first thing that happened was that, while infuriated with WPF’s apparent inability to do frame-by-frame animations, and by a *very* roundabout route, I worked out how to do multithreading in C#. It’s not actually that hard at all (1 line of code captures the essence of it) but was just one of those really useful things I’d always wanted to learn but hadn’t, owing to preconceptions about ploughing through hundreds of lines of unclear example code from an obscure online tutorial. Traditionally Thursday lunch is an Infer.NET group lunch, so I also got to chat more with the rest of the team about what they’re doing. One was trying to help a mathematician with something programmatic and was stressing about why on earth he kept insisting on doing things in unprogrammatic and ultimately unsuccessful ways, which ended up as a big discussion about mathematical vs computing methods, and how the two disciplines have drastically different approaches and concepts of ideal solutions despite their fundamental similarities. It also allowed me to use my rather scanty knowledge of linear algebra to justify the mathematician’s approach to writing an algorithm to determine if a graph has loops: create the graph’s incidence matrix and find its determinant. If the det is zero the graph has a loop. (Strictly, this needs the oriented incidence matrix, and since that is only square when there are as many edges as vertices, the general version of the test is whether its columns are linearly dependent.) Of course finding the determinant is an O(n^3) affair using Gauss-Jordan (factorial-time if you use the FP1-style cofactor method), so it’s computationally expensive. Thursday afternoon saw my discovery of how to make DLLs in C# using Visual Studio, the second simple yet important discovery of the day.
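To make the loop-detection idea concrete, here is a rough sketch in plain Python. It is my own reconstruction rather than the algorithm from the lunch discussion: instead of a determinant it uses the rank of the oriented incidence matrix, relying on the fact that a graph contains a cycle exactly when the edge columns of that matrix are linearly dependent. The function names are made up for illustration, and the rank is computed with the same O(n^3) Gauss-Jordan elimination, using exact fractions to avoid rounding trouble.

```python
from fractions import Fraction


def oriented_incidence_matrix(num_vertices, edges):
    """Rows are vertices, columns are edges: edge (u, v) gets -1 at u, +1 at v."""
    matrix = [[Fraction(0)] * len(edges) for _ in range(num_vertices)]
    for col, (u, v) in enumerate(edges):
        matrix[u][col] -= 1
        matrix[v][col] += 1  # a self-loop (u == v) leaves an all-zero column
    return matrix


def rank(matrix):
    """Rank via Gauss-Jordan elimination on a copy of the matrix."""
    rows = [row[:] for row in matrix]
    num_cols = len(rows[0]) if rows else 0
    rank_so_far = 0
    for col in range(num_cols):
        # Find a row at or below the current pivot row with a non-zero entry.
        pivot = next(
            (r for r in range(rank_so_far, len(rows)) if rows[r][col] != 0),
            None,
        )
        if pivot is None:
            continue  # this column depends on the earlier ones
        rows[rank_so_far], rows[pivot] = rows[pivot], rows[rank_so_far]
        pivot_value = rows[rank_so_far][col]
        rows[rank_so_far] = [x / pivot_value for x in rows[rank_so_far]]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != rank_so_far and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank_so_far])]
        rank_so_far += 1
    return rank_so_far


def has_cycle(num_vertices, edges):
    """True if the undirected graph has a cycle (edge columns are dependent)."""
    return rank(oriented_incidence_matrix(num_vertices, edges)) < len(edges)


triangle = [(0, 1), (1, 2), (2, 0)]
path = [(0, 1), (1, 2)]
print(has_cycle(3, triangle), has_cycle(3, path))
```

The programmer's answer would be a depth-first search or union-find, which does the same job in roughly linear time, so this is more a demonstration that the maths works than a recommendation.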
But most interesting / fun was the evening. A go karting event had been arranged for all the interns (and some employees) as part of a summer series of events and as a work experience employee I’d been invited. I discovered I wasn’t the only first timer which made me feel a little less worried about failing epically; so long as the damn thing doesn’t have a clutch I’ll be OK (just had my second driving lesson). Anyways it was all a lot of fun – I only crashed once because I first overestimated then underestimated the tyres’ grip (it was a classic understeer-oversteer-skid affair) and didn’t lean the right way (apparently you should lean towards the outside of a bend to get more grip which seems to me as a cyclist extremely counterintuitive). I also discovered it’s true that racing makes you ridiculously thirsty; luckily I had my wottle to hand! Afterwards while we were waiting for the coach back I noticed one of the employees was wearing a T-shirt with ‘i bing’ on the front and ‘u bing?’ on the back – apparently people who had worked on the Bing project were all issued them at the launch.
By Friday I’d more or less finished the Monty Hall implementation, so John and I discussed possibilities for next week. One was some intelligent text ‘understanding’ which uses a neat little trick involving Bayesian inference on a massive scale to partition a text into phrases (without using punctuation as a guide). Another alternative was to use statistical analysis on the relative locality of certain words to generate a summary of a very long text. I suggested the possibility of using Infer.NET for auto correcting which seemed viable. I was also led over to the uber-cool gadgets area of the building (‘computers in the home’) which was densely populated with enormous screens, multi-touch touch screen coffee tables, wireless [just about everything] and more or less all the stuff I’d want to have sitting in my room!
John also explained a little about modelling statistics and factor graphs: graphs in which nodes are either variables or functions, and edges are directed. Messages in the form of entire distributions are passed along these edges between the variables and functions and you end up getting complicated-looking graphs with arrows and squares and rectangles. These are used for describing models and the maths is apparently beyond 1st year undergrad… The storm that day was also quite epic – the electricity in the MS building seemed affected as the lights kept flickering out – I assume the PCs were all UPS’d.
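As a toy illustration of messages being passed along a factor graph, here is a sketch in plain Python. It is not Infer.NET and doesn't use its API; the rain/wet-grass model, the numbers and all the names are invented for the example. It does exact sum-product message passing on a tiny graph with two yes/no variables, whereas the real thing relies on approximate methods for models where this kind of exact sum isn't feasible.

```python
from fractions import Fraction

STATES = (True, False)

# Factor attached only to the variable `rain`: its prior distribution.
prior_rain = {True: Fraction(1, 5), False: Fraction(4, 5)}

# Factor linking `rain` and `wet`: P(wet | rain), keyed by (rain, wet).
wet_given_rain = {
    (True, True): Fraction(9, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(1, 5),
    (False, False): Fraction(4, 5),
}


def normalise(message):
    """Scale a message so its entries sum to one."""
    total = sum(message.values())
    return {state: value / total for state, value in message.items()}


def message_factor_to_rain(message_from_wet):
    """Message from the linking factor to `rain`: sum out `wet`."""
    return {
        rain: sum(wet_given_rain[(rain, wet)] * message_from_wet[wet] for wet in STATES)
        for rain in STATES
    }


# Observing wet grass sends a point-mass message from `wet` into the factor.
observed_wet = {True: Fraction(1), False: Fraction(0)}

# The two messages arriving at `rain`, one from each neighbouring factor.
message_from_prior = prior_rain
message_from_link = message_factor_to_rain(observed_wet)

# The posterior at a variable is the normalised product of incoming messages.
posterior_rain = normalise(
    {rain: message_from_prior[rain] * message_from_link[rain] for rain in STATES}
)
print(posterior_rain[True])  # 9/17
```

Each message here is a whole distribution over a variable's states, which is the point John was making; on bigger graphs the same two operations (multiply incoming messages, sum out the other variables) just get repeated along every edge.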
Anyways those are my thoughts at half time – I still have another week to go. I haven’t actually taken [m]any photos at all since I’ve been spending most of my hours either working or cooking/eating or sleeping (I’m back in London for the weekend); hopefully I’ll be able to take some next week.