Microsoft – Week 2

July 25, 2009

This week has gone pretty quickly and I’ve mostly been working on the text analyser / summary program. I even managed to take some photos! The week started with @dumbledad (= Tim) showing me some of the visualisation stuff he and an intern had been working on to visualise a book, some of which will appear shortly on a site somewhere… It’s all about new and interesting data presentation in the spirit of Information Aesthetics, and he sent me a link to some stuff he did on ManyEyes – word clouds (or ‘wordles’) comparing frequencies of words in narrative and speech. Some of the other ones are more difficult to describe but I’ll be sure to tweet links to them when they get published.

The idea of the summary program was that it split the book into sections, compared a histogram of word frequency densities in each section with another histogram for the entire book, then picked out the words most likely to be important to the section by choosing the most unusually frequent ones. The problem was that the program wasn’t picking out main characters because they were mentioned throughout the book. So I was to implement a system to split words into three categories: local to the section, local to the book (main characters) and common to the English language. The existing framework for the two-way split (local to section vs local to book) had already been written, so my job was to implement the three-way split.
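To make the two-way idea concrete, here is a minimal Python sketch. It is not the actual program (that was built on Infer.NET); the function names, the smoothing constant and the simple density ratio are all my own assumptions, chosen only to show how comparing section densities with book densities picks out section-specific words and skips a character who appears everywhere.

```python
from collections import Counter
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(sections, top_n=3, smoothing=1.0):
    """For each section, rank words by (section density / whole-book density).

    The additive smoothing is an assumption of this sketch; it keeps rare
    words from getting extreme scores.
    """
    section_tokens = [tokenize(section) for section in sections]
    book_counts = Counter(tok for toks in section_tokens for tok in toks)
    book_total = sum(book_counts.values())
    vocab_size = len(book_counts)

    results = []
    for toks in section_tokens:
        counts = Counter(toks)
        total = len(toks)

        def score(word):
            section_density = (counts[word] + smoothing) / (total + smoothing * vocab_size)
            book_density = (book_counts[word] + smoothing) / (book_total + smoothing * vocab_size)
            return section_density / book_density

        # Highest ratio first; ties are broken alphabetically.
        ranked = sorted(counts, key=lambda w: (-score(w), w))
        results.append(ranked[:top_n])
    return results


if __name__ == "__main__":
    toy_book = [
        "harry saw the dragon and the dragon saw harry",
        "harry met the wizard and the wizard met harry",
    ]
    print(distinctive_words(toy_book, top_n=2))
```

In the toy example ‘harry’ appears equally often in both sections, so its ratio stays below that of the section-specific words and it never gets picked, which is exactly the main-character problem described above.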

Factor graph showing the model

By Wednesday I’d finished the actual implementation so I started trying to invent a visualisation. My original idea was to have a ‘story line’ (no pun was actually intended) along which various threads would undulate, and the further out from the story line they are, the more important they are; think of it as a radial graph – I think I was probably inspired by the RealPlayer (yuk, I know) ‘cosmic string’ visualisation. I built a really flickery version as a mockup which was approved, and since I was by then starting to shy away from WPF I ended up learning DirectX overnight to implement a final 3D non-flickery version of it. After spending a whole day stressing over the edges of the scene getting cut off and finally realising I’d set the camera’s maximum viewing distance ridiculously low, I got it to work, and after writing some homebrew Bézier curve code it looked pretty good (if I may say so myself); Tim tells me he’ll probably add a screen video of it to the online display of visualisations so … watch this space.
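For anyone curious what homebrew Bézier code can look like, here is a small Python sketch using de Casteljau’s algorithm. It is not the code from the visualisation (that was C# with DirectX); the function name and the tuple representation of points are assumptions made for illustration.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t (0 <= t <= 1) with de Casteljau's algorithm.

    control_points is a sequence of equal-length coordinate tuples, e.g. 2D or 3D points.
    """
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one point is left.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


if __name__ == "__main__":
    controls = [(0, 0), (1, 2), (2, 0)]
    # Sample the curve at a few parameter values to get a polyline to draw.
    print([bezier_point(controls, step / 4) for step in range(5)])
```

Sampling the curve at evenly spaced values of t gives a list of points that can be joined up with short line segments, which is usually smooth enough for a thread in a visualisation.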

Another excitement of the week was a talk from TrueKnowledge (= TK), an internet answer engine. It’s similar to the famous Wolfram Alpha (= Walfa); however in my opinion it actually has more potential. Walfa throws manpower at writing new code to scrape information from various different sources on the fly which essentially means the more information you want, the more you’re going to need to work. TK on the other hand stores information in an enormous database which has a structure suitable for storing any type of information, and although work is done to ‘crawl’ Wikipedia and other sources for knowledge, it also sources the community for information which means it can gather lots of important knowledge very quickly with minimal effort. It also has awesome features of natural language parsing (ask it ‘what colour are red cars’ for example) and it can also give you a step-by-step explanation of the logical process that leads to its final answer.

The bottom half of the screenshot shows TK's stages of logical inference

It of course differs from Walfa in that it hasn’t got a tonne of Mathematica code behind it – its strengths are in factual and inferred knowledge as opposed to evaluating integrals. It’s currently in Beta and has an API (yay!) so I strongly encourage anyone who has used Walfa to give TK a go.

On Tuesday the weekly Mexican food van appeared – until then I’d never realised quite how amazingly good burritos can be! While we were eating we started discussing presentation of text. The problem is that a conventional layout presents the reader with a formidable block of text interspersed with some images which is difficult to follow and annoying to read since one always has to alternate between studying the image and reading the text. However attempts at producing non-linear presentations of information such as embedding text into the image as tooltips or expandable areas of the image etc. have always resulted in people simply not reading very much of the text and consequently missing out important stuff. The best solution we came up with is using an old method of collapsible clauses, just like collapsible code. For example, if a relative clause which in this case is italicised and relatively long yet somehow doesn’t contribute much to the sentence thus merely adds length and unnecessary information to the text making the ultimate meaning more difficult to discern is considered superfluous to the meaning of the sentence, it could be replaced by a small button that only shows the clause if clicked – such ideas are particularly relevant to German sentences which tend to have huge diversions into clauses before the verb is revealed right at the end. This way readers can quickly get the gist of what’s going on so they may study the image in an enlightened way, then go back and expand the text to get the full meaning.

There are also a few things I noticed about MSR in general. There is a strong sense of company loyalty – all employees seem to use Bing, and everyone I’ve seen even goes as far as using IE instead of Firefox! Using only Microsoft products to perform tasks however did make me aware of the wide range of programs they do produce – they even have Virtual Machine software and an internal proprietary alternative to SVN. I guess it does help the developers of these applications a lot if they have an enormous internal test group: all the employees and interns. There’s also pretty close integration with Redmond (Outlook + Office Communicator + global WAN shares) so feedback could be quite efficiently delivered. The entire place also operates in the spirit of trust – all users have admin rights (necessary for developers anyway) – which is so much better than what is implemented at school: a highly restrictive policy which, despite recent changes for the better, still filters out most protocols (FTP included) and in fact, instead of preventing people from doing things simply makes everything so much more difficult to do. Now I have to connect through encrypted VPN to use FTP…

Anyways overall it was a great two weeks. I enjoyed it hugely, I didn’t need to touch Excel, I didn’t make anyone coffee and I didn’t do any filing (who needs paper anyway? It’s a software company!) – instead I worked on real (and rather cool) projects, learnt some useful things, and made new acquaintances.

In other news, I’m off tomorrow to Cranfield for the Aerospace Challenge Finals – I’ll get to fly (actual!) planes, take lots of photos and it should be another great experience. They’d just better have wifi, though I’m bringing my Alfa Awus (ridiculously powerful) along in case of weak signal!


Microsoft Work Experience – Half Time

July 19, 2009

I’m currently doing Work Experience at Microsoft Research in Cambridge (MSR Cambridge) and despite my gripes about Microsoft software, the research centre and some of the stuff people are doing there is pretty cool. Here’s what I’ve done so far and what I think of the place / stuff.

I arrived in Cambridge to a 4-storey house all to myself (!) complete with dishwasher, washing machine, decent cooking stuff etc. I saw the wifi router but it wasn’t broadcasting beacons – it was a hidden SSID. So I spent about 3 hours capping packets to no avail on a nearby WEP wifi and finally realised that (a) there was an ethernet cable sticking out of the wifi router which I could plug into, and (b) the router had its hidden SSID written on it.

After waking up the next day to three alarms (alarm clock, laptop, watch; 5 minutes apart from each other; I wanted to make sure I actually got up on time … for once) I left my laptop at home with three layers of Lifehacker-inspired protection (physical lock, Yawcam for motion detection webcam and LaptopAlarm – a thief would have to have his/her photo uploaded to an FTP server and walk around Cambridge with a chest of drawers attached to a laptop that starts screaming the second it’s unplugged in order to steal my laptop) and walked the 1.6 mile commute to MSR in the West Cambridge Site NW of the city. I eventually arrived where I was issued a temporary pass, a Microsoft Research rucksack, some freebies and a ‘wottle’ – MS in an attempt to do the environment some good decided to issue refillable recycled plastic/rubber (?) bottles. The person who came up with the idea evidently had the same lack of aptitude as I do when it comes to names; it’s got ‘wottled by you’ written on the side…

I also met my supervisor who explained a little about what the project I was to be working on (Infer.NET in the machine learning area) is all about – Bayesian inference. It’s basically a system of statistical modelling which takes in information about distributions, observed values and interaction between random variables and infers new distributions by applying Bayes’ Theorem (which SJRH gave us a talk on in the dying weeks of term). In terms of the theory all he said was that when there are several distributions it gets pretty complicated and uber-badass integration (not his words) is necessary which, until an approximate method was found a couple of years ago, wasn’t really possible computationally. He also introduced me to a couple of the team members, one of whom has a firm belief that random variables should be a part of programming and are as valid as, if not more than, ‘normal’ everyday data types like integers.

Eventually we settled on my project. I’d mentioned the Monty Hall problem when discussing Bayes’ Theorem and he proposed that I work on a display implementing that using Infer.NET as a simple project, and also to give me an idea of how something like that would be implemented using this framework. At that point I was more or less left to my own devices to familiarise myself with the workstation and Visual Studio and to complete a stack of paperwork I’d been given. I think it was at about this point that I discovered I was working on an 8-core Dell Precision workstation. Serious overkill – as Guy pointed out, probably employees spend most of their time gaming and consequently don’t get much done!
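For reference, the Monty Hall problem can be worked through with Bayes’ Theorem directly. The Python sketch below is not the Infer.NET model I built; it just enumerates the three hypotheses by hand, and the function name and door numbering are my own assumptions. It assumes the standard rules: the host knows where the car is, never opens the picked door or the car’s door, and chooses at random when he has a choice.

```python
from fractions import Fraction


def monty_hall_posterior(picked=0, opened=2):
    """Posterior probability of the car being behind each of three doors.

    picked is the contestant's door, opened is the door the host reveals.
    """
    doors = [0, 1, 2]
    if opened == picked:
        raise ValueError("the host never opens the contestant's door")

    prior = {car: Fraction(1, 3) for car in doors}

    # Likelihood of the host opening `opened`, given where the car really is.
    likelihood = {}
    for car in doors:
        if car == opened:
            likelihood[car] = Fraction(0)     # the host never reveals the car
        elif car == picked:
            likelihood[car] = Fraction(1, 2)  # two goat doors to choose from
        else:
            likelihood[car] = Fraction(1)     # only one door he is allowed to open

    # Bayes' Theorem: posterior = prior * likelihood / evidence.
    evidence = sum(prior[car] * likelihood[car] for car in doors)
    return {car: prior[car] * likelihood[car] / evidence for car in doors}


if __name__ == "__main__":
    for door, probability in monty_hall_posterior().items():
        print(f"door {door}: {probability}")
```

With door 0 picked and door 2 opened, sticking wins with probability 1/3 and switching to door 1 wins with probability 2/3.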

My supervisor (I think the legal paperwork of which there was much permits me to give his name – John Guiver) took me out for lunch according to some sort of tradition involving new people joining the team at the Cavendish Laboratory Canteen (the Cavendish Lab was near the MS building). Actually the canteen there was pretty standard (I was expecting wonders from the great Cavendish) though the food is good and cheap – that’s all that really matters. One of the other team members, John Winn (random variables in programming dude) had been to MIT. Apparently they have whiteboards in the loos and people leave each other problems on them, sangaku style!

There was some sort of filming going on in the main lecture theatre – bright white theatre floodlights adorned the roof and several cameras were propped up on tripods all over the place. After lunch I ended up agreeing to help out which mostly involved milling around displays of Infer.NET related programs for a time lapse vid. This turned out to be really awesome because of the way the displays were implemented – massive multi-touch touch screen coffee tables! They could also be operated by circular counters placed on the tabletop, used like dials (twist to change settings), similar to a touch screen I’d seen on youtube demonstrating a physics simulator. There was also a(n incomplete) machine vision display which proudly labeled a pair of scissors as a book…

Afterwards I grabbed myself the Infer.NET dll’s and started trying to implement the Monty Hall problem. There was a problem with the model and I ended up (wilfully) working almost an hour overtime on my first day! I got home and tried for the first time (and failed) to play Dawn of War Soulstorm with Guy over Hamachi-induced WAN.

Thursday was a particularly interesting day. The first thing that happened was, while infuriated with WPF’s apparent inability to do frame by frame animations and in a *very* roundabout method of doing things, I worked out how to do multithreading in C#. It’s not actually that hard at all (1 line of code captures the essence of it) but was just one of those really useful things I’d always wanted to learn but hadn’t owing to preconceptions about ploughing through hundreds of lines of unclear example code from an obscure online tutorial. Traditionally Thursday lunch is an Infer.NET group lunch so I also got to chat more with the rest of the team about what they’re doing. One was trying to help a mathematician with something programmatic and was stressing about why on earth he kept insisting on doing things in unprogrammatic and ultimately unsuccessful ways, which ended up as a big discussion about mathematical v computing methods, and how the two disciplines have drastically different approaches and concepts of ideal solutions despite their fundamental similarities. It also allowed me to use my rather scanty knowledge of linear algebra to justify the mathematician’s approach to writing an algorithm to determine if a graph has loops: create the graph’s incidence matrix and find its determinant. If the det is zero the graph has a loop. Of course finding the determinant is an O(n^3) affair using Gauss-Jordan (exponential if you use the FP1-style cofactor method) so it’s computationally expensive. Thursday afternoon saw my discovery of how to make dlls in C# using Visual Studio, the second simple yet important discovery of the day.
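Here is a rough Python sketch of the determinant idea. One caveat: an incidence matrix is only square when the graph has as many edges as vertices, so this sketch takes the determinant of BᵀB instead, where B is the oriented incidence matrix. That choice is my own assumption to make the idea work in general, not necessarily what the mathematician had in mind. The determinant is zero exactly when the edge columns are linearly dependent, which happens exactly when the graph contains a loop. All function names are made up for illustration.

```python
from fractions import Fraction


def oriented_incidence(num_vertices, edges):
    """Rows are vertices, columns are edges: +1 at one endpoint, -1 at the other."""
    matrix = [[Fraction(0)] * len(edges) for _ in range(num_vertices)]
    for col, (u, v) in enumerate(edges):
        matrix[u][col] += 1
        matrix[v][col] -= 1
    return matrix


def determinant(matrix):
    """Exact determinant by Gaussian elimination, O(n^3) arithmetic operations."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if a[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            a[i], a[pivot] = a[pivot], a[i]
            det = -det  # swapping two rows flips the sign
        det *= a[i][i]
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return det


def has_cycle(num_vertices, edges):
    """True if the undirected graph contains a loop (det of B^T B is zero)."""
    b = oriented_incidence(num_vertices, edges)
    m = len(edges)
    gram = [
        [sum(b[k][i] * b[k][j] for k in range(num_vertices)) for j in range(m)]
        for i in range(m)
    ]
    return determinant(gram) == 0


if __name__ == "__main__":
    print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # triangle
    print(has_cycle(3, [(0, 1), (1, 2)]))          # simple path
```

In practice a depth-first search or union-find finds loops in roughly linear time, which was more or less the computing side of the lunchtime argument.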

But most interesting / fun was the evening. A go karting event had been arranged for all the interns (and some employees) as part of a summer series of events and as a work experience employee I’d been invited. I discovered I wasn’t the only first timer which made me feel a little less worried about failing epically; so long as the damn thing doesn’t have a clutch I’ll be OK (just had my second driving lesson). Anyways it was all a lot of fun – I only crashed once because I first overestimated then underestimated the tyres’ grip (it was a classic understeer-oversteer-skid affair) and didn’t lean the right way (apparently you should lean towards the outside of a bend to get more grip which seems to me as a cyclist extremely counterintuitive). I also discovered it’s true that racing makes you ridiculously thirsty; luckily I had my wottle to hand! Afterwards while we were waiting for the coach back I noticed one of the employees was wearing a T-shirt with ‘i bing’ on the front and ‘u bing?’ on the back – apparently people who had worked on the Bing project were all issued them at the launch.

By Friday I’d more or less finished the Monty Hall implementation, so John and I discussed possibilities for next week. One was some intelligent text ‘understanding’ which uses a neat little trick involving Bayesian inference on a massive scale to partition a text into phrases (without using punctuation as a guide). Another alternative was to use statistical analysis on the relative locality of certain words to generate a summary of a very long text. I suggested the possibility of using Infer.NET for auto correcting which seemed viable. I was also led over to the uber-cool gadgets area of the building (‘computers in the home’) which was densely populated with enormous screens, multi-touch touch screen coffee tables, wireless [just about everything] and more or less all the stuff I’d want to have sitting in my room!

John also explained a little about modelling statistics and factor graphs: graphs in which nodes are either variables or functions, and edges are directed. Messages in the form of entire distributions are passed along these edges between the variables and functions and you end up getting complicated-looking graphs with arrows and squares and rectangles. These are used for describing models and the maths is apparently beyond 1st year undergrad… The storm that day was also quite epic – the electricity in the MS building seemed affected as the lights kept flickering out – I assume the PCs were all UPS’d.
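To give a feel for what passing a distribution along an edge means, here is a tiny Python sketch with two binary variables A and B, one factor holding a prior on A and one factor linking A to B. It is plain Python and does not use Infer.NET; the numbers and names are made up for illustration, and it does exact sums where a real system would typically need approximations.

```python
def normalise(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


# Factor attached to A only: a prior over the two states of A (made-up numbers).
PRIOR_A = [0.7, 0.3]
# Factor linking A and B: LINK[a][b] is the weight for A=a together with B=b.
LINK = [[0.9, 0.1],
        [0.2, 0.8]]


def marginal_of_b(prior_a, link):
    """Pass messages prior -> A -> link -> B and read off B's distribution."""
    msg_prior_to_a = prior_a
    # A has only one other neighbour, so it forwards the message unchanged.
    msg_a_to_link = msg_prior_to_a
    # The link factor sums out A to produce a message about B.
    msg_link_to_b = [
        sum(msg_a_to_link[a] * link[a][b] for a in range(len(msg_a_to_link)))
        for b in range(len(link[0]))
    ]
    return normalise(msg_link_to_b)


def posterior_of_a(prior_a, link, observed_b):
    """Observe B, pass a message back through the link factor and update A."""
    msg_link_to_a = [link[a][observed_b] for a in range(len(prior_a))]
    # A multiplies together every incoming message, then normalises.
    return normalise([p * m for p, m in zip(prior_a, msg_link_to_a)])


if __name__ == "__main__":
    print(marginal_of_b(PRIOR_A, LINK))
    print(posterior_of_a(PRIOR_A, LINK, observed_b=1))
```

Each message here is a whole distribution (a list of weights, one per state) rather than a single number, which is the point John was making.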

Anyways those are my thoughts at half time – I still have another week to go. I haven’t actually taken [m]any photos at all since I’ve been spending most of my hours either working or cooking/eating or sleeping (I’m back in London for the weekend); hopefully I’ll be able to take some next week.


Snowdonia Walking Trip

July 6, 2009

I’ve just (yesterday) returned from the annual walking trip (this year to Snowdonia) in the mountainous midge breeding ground that is Wales. I managed to write a brief record of our daily activities so here is (more or less) an illustrated account of our adventures.

I’m trying out Picasa as an alternative to Flickr so all the photos I’ve published of this trip are in my Picasa Web Album.

Sunday: Journey up

This was rather boring until we stopped at Betws-y-Coed (the teachers pronounced it roughly as ‘Battersea Coyd’ – I’m pretty sure that’s wrong) for lunch. It appeared that they were having some sort of summer festival so we grabbed some burgers and sausage rolls from a rather smokey and carcinogenic-looking stand and sat on a bench contemplating a police car that appeared to be on show (pop music emanated from its speakers, its sirens went off at apparently random intervals, and members of the public kept crawling in and out of the car while two police officers stood nearby sipping pints), a madman standing in a fenced area wielding a chainsaw (there was a sign saying ‘wood carving’ though the ‘toadstools’ that he produced were arguably less aesthetic than the original stumps) and a van with ‘water incident unit’ painted on its side (we postulated it had something to do with rescuing vehicles / people from lakes, hence the dry suits hanging up inside. I prefer the hypothesis that it rushes to the rescue whenever it rains and erects a large umbrella). We sauntered further into the town and saw the railway station, though we decided the ice cream shop was far more interesting.

We left Betws-y-Coed and passed by Beddgelert (where the most awesome and famous ice cream shop of Wales is) where we saw this rather amusing advert on the back of a bus:

Amusing advert on the back of a bus as we went over the bridge in Beddgelert

Finally we pulled into the campsite with Queen blasting through the minibus’ stereo and set up our tents in the otherwise deserted campsite.

Abuse of a mallet for hammering in tent pegs

In the absence of KPZ, it was GL’s turn to cook – dinner was rather good but plagued by a flashmob of midges (which lasted for the next six days). We 8th formers did the washing up to set an example and returned to find the entire campsite had been invaded by a small battalion of siege cows.

Easily visible in the background is a herd of cattle that somehow managed to get into the campsite.

We retreated from the insect-mammal assault into the minibus (where we could observe the campsite owner attempting to chase the cows off his land with a quad bike) and played a game of mafia (very difficult in a minibus). We eventually got to bed at around 10-11 pm.

Monday: Light walk, blazing sun

Everyone was awoken by the dawn chorus at the unholy hour of 6am – if not for the cows joining in I might have been able to get some more sleep before the planned 8am breakfast, at which GL described the previous night’s cooking as ‘look[ing] like a polluted river’: he’d cooked it in Speckled Hen which created a substantial amount of froth…

Alastair looking triumphant at the top of the mountain

People in Wales seem to have a great sense of humour. We'd also seen a sign in Betws-y-Coed saying 'Children left unattended will be sold to the circus'

We split into three groups for the walk, after which two groups stopped by at Beddgelert for ice cream. DAE decided to pull an amusing trick on GL: since GL had already parked when we (DAE’s bus) had arrived, with endorsement from TCIM (‘what could go wrong?’) DAE moved GL’s bus somewhere inconspicuous and parked our bus in its spot, and went as far as transferring GL’s sandals to DAE’s bus! After a most satisfactory ice cream we watched mirthfully through binoculars from an unsubtle distance as GL became increasingly stressed!

Several hours after we returned to the campsite, we witnessed the much anticipated arrival of Max and Marius, two OPs who left the U8 last year but wanted to join us for a laugh. After dinner we all sat in the TV room and watched one of the most drawn-out tennis competitions I have ever seen which lasted until 10:30 and resulted in Murray winning 6-3 (I think). It was about then that I discovered the campsite offered free wifi and began downloading the missed episode of Top Gear (S13 E02).

Tuesday: Snowdon

The weather was perfect on the ascent: it was cloudy and breezy which made walking uphill effortless and chilled. As we reached the top we began the customary cursing of fat tourists sitting at the top of Snowdon wearing snow white trainers who had evidently taken the train up but still had the impudence to buy badges and T-shirts with words to the effect of ‘I walked up Snowdon’. Suggestions were made about using the fattest ones as train fuel (!), making the train treadmill-powered, not providing a train journey back down the mountain, and stopping the train half way up Snowdon, forcing everyone to do some work before reaching the top. I was also disappointed by the lack of free drinking water at the cafe – apparently tap water there is straight from the lake.

DAE found / remembered a tunnel off the beaten track which led to an abandoned quarry

Max & Marius appear to have found a rather good vantage point

The train to the top of Snowdon seems to run partly on steam though the actual lifting is probably a rack and pinion affair. I don't really know much about trains...

We were fortunate and managed to get a pretty good view (not much mist)

The descent was extremely hot and humid which made most of us feel like jumping into the remarkably clear plunge pools of the river running alongside the path. GL was keen to return quickly to the campsite to grab meat for a BBQ that night so we would have been unable to grab an ice cream from Beddgelert if not for a rather timely and extremely spectacular incident involving the contents of a 5th former’s stomach and several people around the epicentre who took some … splash damage. This event occurred precisely outside the ice cream shop, forcing DAE to do an emergency stop in the shop’s car park. I jokingly suggested to the Reverend (who was with us on that trip) that God must have really wanted us to have an ice cream. Max and Marius arrived and chauffeured GL away to do some meat shopping leaving us with the rather iconic image of a grinning Max sitting in the back seat clutching two double ice cream cones. That evening after the BBQ some of us watched Top Gear and played Dawn of War on my laptop until about 11pm.

DAE found a (slightly shorter) shortcut down Snowdon

When you're hot and sweaty it is unbelievably tempting to just jump in...

Sir we found a reverend!

Wednesday: Cnicht

When I was on the same trip three years ago this was the first day’s mountain. On Wednesday we approached it from a different face which involved quite a lot of scrambling. It was hot and humid but the steep ascent and striking view from the top made it all worth it and not much of a slog. We made our customary stop at Beddgelert for another ice cream, then we downloaded and watched ‘Night at the Museum’ (both I and II).

There was a pretty awesome view from the top of Cnicht. According to Marius the name Cnicht means Viking Helmet which is what the mountain resembles from the sea

Don't fall off!

A helicopter appeared while we were descending Cnicht and the friendly guy waved at us. None of us dared wave back lest we be mistaken for hikers in distress

Thursday: Beach / Campsite

We awoke to the soothing (or alarming, depending on who you are) sound of a torrent merrily splashing onto the tops of our tents. The weather forecast was unpromising so we had the option of going to the beach or staying at the campsite (by now there were several people feeling unwell from the heat of the previous days). Max and Marius left (but not before tearful goodbye hugs from everyone who happened to be around), and Guy and I stayed at the campsite (I wanted to get on with some reading and he played an 8-way Dawn of War battle). I ended up cooking some rather sumptuous sausages and eggs for us, at which point I discovered what a mess the kitchen was.

Dinner was lamb, mushrooms and potatoes with cake dessert, after which we watched Alien v Predator 2 until midnight. It had rained most of the afternoon and night.

Friday: Half-day walk, getting lost

We woke up to an extremely wet morning so we cowered in the communal tent for breakfast. We had lunch in the campsite then ventured out in the sunny afternoon to where we had gone the previous trip to do orienteering. We got lost several times, and I managed to get a large number of amusing photos of our leaders peering confusedly at the map. Most of my downhill journey was spent making witty banter with the Reverend about religion terminating in us finally agreeing on where we were on the map. Again the trip ended in ice cream at Beddgelert, and we returned at about 5:45 to the campsite where we discovered rather rude campers had decided to pitch their tent between the school minibus and our tents!

Here are several pictures of our group leaders getting extremely lost:

… and the absolute classic (caption competition anyone?):

After GL’s slightly tipsy Scheherazade (all teachers are young at heart, especially after a few pints!) some of us congregated in the TV room to watch Saw (which seemed to have quite an impact on the subject of conversations for the rest of the trip) followed by Jonathan Ross interviewing Emma Watson. I then made some attempts at taking night photos before going to bed.

Saturday: Rainy walk

We woke up for the third day in a row to torrential rain, but at 10:30 the teachers decided we might as well do a walk around the base of Tryfan. The highest point of the walk was extremely windy and wet, making for freezing horizontal rain. We lunched inside the (extremely) orange emergency shelter which wasn’t quite large enough for all of us, which caused us some entertainment. We also got a bit of a laugh from imagining people outside asking a heaving, seething, complaining, munching orange mass in the middle of the footpath on the side of a mountain whether it was OK, and whether it had seen a party of six people walk past!

Inside the emergency shelter. Apparently a dog poked its snout into one of the air holes at one point.

Eventually the sun came out and even though we’d been shivering some of the way down the mountain we decided an ice cream at Beddgelert was called for (I went for a double cone: Chocolate & Ginger and Rum & Raisin).

At 5pm the internet was abruptly cut off while I was downloading Saw II and we were told we’d been downloading too much and were using up the monthly cap. We’d been getting 600KB/s; I knew there had to be a catch. So instead of watching a film we rebuilt our dam from three years ago and I took some more photos.

The classic trick with rapids - take a long exposure. For some of these I had to put the stand in the middle of the stream - I hope it doesn't rust

Sunday: Journey back

There’s not much to say about yesterday: we packed everything up, woke the apparently unwakeable 6th formers and helped them pack their tent, and drove back to London. We all cheered when we saw the sign welcoming us into Shropshire at the Welsh-English border and listened to Queen’s Greatest Hits as we coasted down the bus lane on the M4.

Final Thoughts

I thought it was an altogether highly enjoyable trip. The mounds of earth that had segregated our part of the campsite from everyone else three years ago were gone and there was no electricity near our tents which was a slight inconvenience, and I felt too many days were rained off / taken lightly … but overall I think most people had a great time, nothing / nobody got broken, and it was a great way to relax after the end of a tough school year.

Meanwhile – this is too good not to repeat:

Perhaps if someone thinks up a good caption … ?
