Complex Hearts

December 12, 2009

So, I was sitting in The Eagle with Andrew the night before my Corpus interview (blog post written, pending non-disclosure agreement expiry in January) when he told me he’d discovered Complex Hearts and suggested we play it at some point after interviews. The idea is simple: normal hearts with an imaginative twist (Phil Plait style pun :P ). The point cards are hearts (+1), Q of Spades (+13 i), J of Diamonds (-10) and 10 of Clubs (multiplies your round’s score by 2i). The first person to end up with a score of magnitude > 100 loses (and I guess the one with the smallest magnitude wins). Shooting is an option – after a successful shoot you can opt to give everyone 13 + 13 i points, which I did in the first round we ever played (!)

We’re also wondering whether each game necessarily ends at some point or whether it is actually possible for a game to run on forever. At first glance it seems like it can go on forever but the details are genuinely non-trivial – inventing a situation in which it does never end is harder than it seems and can’t be thought up in a matter of seconds (our discussion of this game property was brief to say the least).

Program

We came up with this crazy idea involving plotting players’ scores on an Argand diagram for some extra amusement. The first time we played I used an Excel spreadsheet to plot a circle of radius 100 and some points representing people’s scores. If someone goes outside the circle they lose. Then I wrote a program – and here’s the result (it’s still being developed and features are being added as and when people suggest them):

Complex Hearts scoring program which I wrote. Click to embiggen

It was a very quick hack of a program so I wrote some pretty messy code (I really shouldn’t get into this habit) but it’s pretty nifty and I’m pleased with it.

I’ve also been considering attempting to write an online version so people can play against each other but learning Flash seems like a lot of effort…

Tactics

Compared to many other card games, normal Hearts is fairly sparse on tactics – whatever you do don’t get the Q of spades. But this variant introduces a whole new level of complexity (oh no not again). I’ve noticed a couple of things about gameplay: In every game I’ve played, people lose by drifting up. The reason’s pretty simple: you can gain imaginary points easily: either get the Q of spades or get a load of hearts then the 10 of Clubs. But it’s almost impossible to lose imaginary points: you have to get the J and the 10 in the same round. Also, in terms of real points, ignoring the 10 of clubs, on average each person will gain 0.75 points per round just because of the Jack. And in every game I’ve played people tend to drift in the negative real direction, so winning hearts tricks seems to be a good thing. Perhaps a good tactic would be to avoid passing strong hearts and try to win hearts tricks. That way, if you try to do the Jack-10 combo to gain negative imaginary points and all you have is the Jack, you still gain from it by losing points. After all, chances are someone’s going to have shot off the top of the Argand diagram by the time anyone wins a total of 100 real points (which would involve winning almost all the hearts in every round for at least 8 rounds).

AI for Complex Hearts?

I was considering writing an AI for complex hearts at one point – one that attempts to guess what everyone else has and play accordingly. It would have a points engine to work out which cards it wants to win/lose far more rationally than a human (in this game I’m convinced you want to win hearts) so it wouldn’t need to be *too* complicated to do reasonably well. Then again, the rules are non-trivial. Perhaps I could use Microsoft Inter.NET (cf. my work experience) to work some Bayesian Inference magic – using that framework is pretty amazing: give it some information and command it to “INFER!” (a bit like Mathematica: just using the single line Solve[...] saves you about a week’s worth of sweat, blood and tears).


Young Rewired State

August 23, 2009

I’ve spent the last two days at the Google HQ in London attending Young Rewired State [hit link for more info about event] (#youngrewiredstate), and it’s been nothing short of epic.

And of course, I’ve taken some photos.

The schedule (shamelessly copied from the site) was as follows:

Saturday 22nd August:
10:00 Start
10:30 Planning session
12:30 Lunch
13:30 Hacking starts
17:30 Dinner
18:30 Home (Hacking overnight allowed!)

Sunday 23rd August:
10:00 Back to hacking
11:30 Brunch
12:30 Back to hacking
16:00 Presentations to Judges and Press
18:30 Prizes announced

On the first day we split into groups and started thinking up ideas. At about 4pm we finally settled on our idea: to make something very similar to RentACoder, but much simpler, targetted at talented coders who need experience in order to get a proper job. Here are a couple of screenshots of the final result (click to embiggen).

We decided on a PHP/MySQL project and as luck would have it, I was the only PHP/MySQL programmer in the group! So it was fairly frantic work (solid coding from 10 till about 3 on the last day) and we ran into all sorts of problems with versioning and people overwriting each others’ work in FTP, especially as the CSS people tended to be working on the same files as I was at the same time!

IRC

As with all hack days, IRC was one of the most important methods of communication. Literally everyone had their laptops out during talks, especially during the presentations at the end and there was a fairly constant stream of chatter on the channel. @samhale123 also put up a bot on the channel to tweet things over IRC – we had several hours of fun attempting to overload the script / twitter / the server!

Immaturity with Twitterfall

Immaturity with Twitterfall

Google

Google is an amazing place with by far the best decor I’ve seen in a company building. The floor is laid out like the London underground and the meeting rooms are more or less in the right place for stations (with consistent naming). There are ducks on the ceiling and random awesome other bits of furniture / decor adorning the walls / ceiling / floor.

We were also given a load of Google freebies, including Google yo-yo’s, Google cakes, Google water, Google pens, Google notebooks…

This actually was a telephone box!

This actually was a telephone box!

Google and Youtube Cakes

Google and Youtube Cakes

People

Of course it was a floor full of geeks, which essentially means a brilliant selection of geek T-shirts (I spotted several from ThinkGeek, at least one from the xkcd store…). The mentors (helping out with coding / guiding the groups) were also working in all sorts of fantastic companies; one of our mentors is working at last.fm, one at moo, one with the BBC etc. And needless to say there was a wide array of OS’ – the large majority seemed to be using Macs, those with PCs were probably split 50/50 between linux (mostly ubuntu, one debian that I know of) and windows.

There was also a brilliant selection of judges, including people from Wired (for some reason looks very familiar; came to school to give a talk maybe?), C4, etc.

Some of the judges

Some of the judges

The presentations were good fun – there were something like 40 people from the press / outside making the buzz all the more exciting. And we (@workforpeanuts) won the “Wish I’d thought of that” award!

Anyways, this is the first hack event that I’ve ever been to, and if this is anything to go by, I’m definitely game for another at some point. Heck, maybe DEFCON next year… *MANY* thanks to @hubmum for organising such an amazing event.

And I took other cool photos so go for it and browse!


Microsoft – Week 2

July 25, 2009

This week has gone pretty quickly and I’ve mostly been working on the text analyser / summary program. I even managed to take some photos! The week started with @dumbledad (= Tim) showing me some of the visualisation stuff he and an intern had been working on to visualise a book, some of which will appear shortly on a site somewhere… It’s all in the spirit of new and interesting data presentation in the spirit of Information Aesthetics and he sent me a link to some stuff he did on ManyEyes – word clouds (or ‘wordles’) comparing frequencies of words in narrative and speech. Some of the other ones are more difficult to describe but I’ll be sure to tweet link to them when they get published.

The idea of the summary program was that it split the book into sections then compared a histogram of word frequency densities in each section with another histogram for the entire book, then picked out the words which were most likely to be important to the section by choosing the most unusually frequently used ones. The problem with that was the program wasn’t picking out main characters because they were being mentioned all throughout the book. So I was to implement a system to split words into three categories: local to the section, local to the book (main characters) and common to the English language. The existing framework for a two-way local to section vs local to book had already been written so I was to implement the three-way split.

Factor graph showing the model

Factor graph showing the model

By Wednesday I’d finished the actual implementation so I started trying to invent a visualisation. My original idea was to have a ’story line’ (no pun was actually intended) along which various threads would undulate, and the further out from the story line they are, the more important they are; think of it as a radial graph – I think I was probably inspired by the RealPlayer (yuk, I know) ‘cosmic string’ visualisation. I built a really flickery version as a mockup which was approved, and since I was by then starting to shy away from WPF I ended up learning DirectX overnight to implement a final 3D non-flickery version of it. After spending a whole day stressing over the edges of the scene getting cut off and finally realising I’d set the camera’s maximum viewing distance ridiculously low, I finally got it to work, and after writing some homebrew bezier curve code it looked pretty good (if I may say so myself); Tim tells me he’ll probably add a screen video of it to the online display of visualisations so … watch this space.

Another excitement of the week was a talk from TrueKnowledge (= TK), an internet answer engine. It’s similar to the famous Wolfram Alpha (= Walfa); however in my opinion it actually has more potential. Walfa throws manpower at writing new code to scrape information from various different sources on the fly which essentially means the more information you want, the more you’re going to need to work. TK on the other hand stores information in an enormous database which has a structure suitable for storing any type of information, and although work is done to ‘crawl’ Wikipedia and other sources for knowledge, it also sources the community for information which means it can gather lots of important knowledge very quickly with minimal effort. It also has awesome features of natural language parsing (ask it ‘what colour are red cars’ for example) and it can also give you a step-by-step explanation of the logical process that leads to its final answer.

The bottom half of the screenshot shows TKs stages of logical inference

The bottom half of the screenshot shows TK's stages of logical inference

It of course differs from Walfa in that it hasn’t got a tonne of Mathematica code behind it – its strengths are in factual and inferred knowledge as opposed to evaluating integrals. It’s currently in Beta and has an API (yay!) so I strongly encourage anyone who has used Walfa to give TK a go.

On Tuesday the weekly Mexican food van appeared – until then I’d never realised quite how amazingly good burritos can be! While we were eating we started discussing presentation of text. The problem is that a conventional layout presents the reader with a formidable block of text interspersed with some images which is difficult to follow and annoying to read since one always has to alternate between studying the image and reading the text. However attempts at producing non-linear presentations of information such as embedding text into the image as tooltips or expandable areas of the image etc. have always resulted in people simply not reading very much of the text and consequently missing out important stuff. The best solution we came up with is using an old method of collapsible clauses, just like collapsible code. For example, if a relative clause which in this case is italicised and relatively long yet somehow doesn’t contribute much to the sentence thus merely adds length and unnecessary information to the text making the ultimate meaning more difficult to discern is considered superfluous to the meaning of the sentence, it could be replaced by a small button that only shows the clause if clicked – such ideas are particularly relevant to German sentences which tend to have huge diversions into clauses before the verb is revealed right at the end. This way readers can quickly get the gist of what’s going on so they may study the image in an enlightened way, then go back and expand the text to get the full meaning.

There are also a few things I noticed about MSR in general. There is a strong sense of company loyalty – all employees seem to use Bing, and everyone I’ve seen even goes as far as using IE instead of Firefox! Using only Microsoft products to perform tasks however did make me aware of the wide range of programs they do produce – they even have Virtual Machine software and an internal proprietary alternative to SVN. I guess it does help the developers of these applications a lot if they have an enormous internal test group: all the employees and interns. There’s also pretty close integration with Redmond (Outlook + Office Communicator + global WAN shares) so feedback could be quite efficiently delivered. The entire place also operates in the spirit of trust – all users have admin rights (necessary for developers anyway) – which is so much better than what is implemented at school: a highly restrictive policy which, despite recent changes for the better, still filters out most protocols (FTP included) and in fact, instead of preventing people from doing things simply makes everything so much more difficult to do. Now I have to connect through encrypted VPN to use FTP…

Anyways overall it was a great two weeks. I enjoyed it hugely, I didn’t need to touch Excel, I didn’t make anyone coffee and I didn’t do any filing (who needs paper anyway? It’s a software company!) – instead I worked on real (and rather cool) projects, learnt some useful things, and made new acquaintances.

In other news, I’m off tomorrow to Cranfield for the Aerospace Challenge Finals – I’ll get to fly (actual!) planes, take lots of photos and it should be another great experience. They’d just better have wifi, though I’m bringing my Alfa Awus (ridiculously powerful) along in case of weak signal!


Inovazone

June 25, 2009

It suddenly strikes me that I haven’t been blogging much recently. Exams finished several weeks ago but I have been somewhat busy.

In particular I’ve been working on a new site for a client, Inovazone. In the words of the site’s inventor, Alastair Darwood:

How does an invention go from a scribble on a page to a world-changing product that advances humanity? The answer is that at the heart of the invention there lies a set of distinct and crucial necessities that the invention is addressing … Until now, the only way in which necessities were discovered was through large companies carrying out expensive research, or a spark of genius from someone who suddenly sees one and thinks of a solution. Inovazone is designed to change this. The idea is that users of the site post their ‘necessities’ (short explanations of problems, ‘necessities’, they think need to be solved to benefit them (or humanity)), or rough outlines of inventions they would like to see and anyone can browse the necessities if they want to and look for interesting problems to try to solve through innovation. This could be in the form of an invention or simply a quick online response to the post on our comments system.

Interestingly enough, there doesn’t actually yet exist a site or system available to the general public that serves this particular function by providing a framework within which ideas for inventions are openly submitted and accessed, so I agreed to work on it, especially considering the core of submitting and displaying necessities is a relatively simple PHP/SQL project.

Submitting a necessity

Submitting a necessity. Click to embiggen

What I found particularly compelling was that, according to Alastair, everyone with whom he’s discussed the site has expressed enormous enthusiasm for it, and even a seasoned inventor he’d talked to had described the idea as long overdue and predicted huge success. Although I originally liked the idea and somehow felt it would do pretty well owing to its novelty, I for some reason assumed one would obtain mixed reactions in discussions.

Original image at http://rockstartemplate.com/wp-content/uploads/2008/12/social-bookmark-icon.jpg

Original image at http://rockstartemplate.com/wp-content/uploads/2008/12/social-bookmark-icon.jpg

In terms of the hard sell, we’re employing several different methods to publicise Inovazone. We’ve created a Facebook fan page and a Twitter account – feel free to follow us (@inovazone). One.com also kindly provide adwords coupons so with some SEO we might be able to appear in ads on relevant Google searches.

The site is currently about to go into beta testing. From wiki:

Beta testing comes after alpha testing. Versions of the software, known as beta versions, are released to a limited audience outside of the programming team. The software is released to groups of people so that further testing can ensure the product has few faults or bugs. Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users.

The idea of this is simply to fine-tune what we already have rather than to add more features, though both these are desirable outcomes from this round of testing; though for me personally the primary objective is to see whether the system works well with a large user base. So for that to happen we need some willing volunteers to go test out the site: anyone reading this is welcome to join the test group. Go ahead and post as many necessities as you deem reasonable, and comment on existing ones. Try out creating new subcategories and send us feedback about features that you think should be created or made better via the awesome uservoice-powered feedback utility. I’m especially interested in bugs and vulnerabilities you find; if you somehow work out how to delete posted necessities or spam the site with adverts, or if the site spews out a series of errors while you’re using it under normal circumstances, get in touch (there is a contact page). Chances are the database will be reverted to its initial (more or less empty) state before the site is released in a few weeks’ time, so disasters should be recoverable.

Before you (readers) go, I’d be interested to hear your feedback on the idea of the site – do you think it has potential? Will it be useful?

That’s all from me for now. The site will get a working and hopefully regularly-updated blog pretty soon for general status information and news so you probably won’t be hearing much more about it from me.

๏̯͡๏﴿


Automatic Email Reminders

June 22, 2009

Many services seem to provide automatic email reminders these days – notably Google Calendar; however what I need is something that can send me daily reminders about things, and setting up a Google calendar with daily repeats seems an altogether inelegant solution. Virgin Media operates a throttling policy in which downloading is limited between 10am and 3pm, and again between 4pm and 9pm. As far as I’m concerned, all this means is I need to make sure I’m not downloading much between those times. I thought of setting up my various alarm clocks and watches to ring at 10am, 3pm, 4pm and 9pm to remind me – but considering I’d have to use four different devices (none of my alarm clocks support multiple alarms) and the fact that it would all be useless if I’m not at home, the best solution is to use a daily email reminder which would alert me even if I’m working off my laptop at school (for example).

I also jumped at the opportunity to find out more about Linux and PHP; besides I didn’t want to sign up for several free email reminder services hunting for a good one so opted simply to write my own. For the sake of anyone attempting to implement any of the features of PHP and Linux I used, here is my solution which can essentially be broken down into four parts:

1. Emailing script

It was fairly easy to find out how to send an email via SMTP in PHP so I set up a script to connect to my gmail and send an email from there to myself. After turning it all into a function, in accordance with the modular approach to development, I was ready to proceed.

2. Scheduler

This was more difficult. PHP can’t by itself do anything on a schedule so it was necessary to delve into Debian’s scheduling system. I tried to look up how to make the damn thing work and eventually found (amongst other irrelevant info – hurry up Wolfram Alpha and add support for coders!) how to use it. It seems like cron is already pre-installed upon Debian installation, and is constantly running as a daemon. It checks a file every minute to check whether it should be executing a scheduled task. To edit this file you type ‘crontab -e’ and get a plaintext editing interface (looks like vi). The format of the file is a list of lines, each one representing a scheduled task. Each line’s format is:

[minute : int] [hour : int] [day of month : int] [month of year : int] [year : int] [script pathname : string]

So to run the program ‘rtorrent’ at 17:30 every 3rd of the month, you go:

30 17 3 * * rtorrent

Asterisks are, as always, wildcards. To execute a process every minute, the first 5 terms look like ‘* * * * *’. There is a slight problem with this: you can’t use vi from a PHP script (at least it’s not possible using exec). It turns out there’s an alternative way of using crontab – by importing a file. So the command looks like:

crontab foo.txt

Great. So the PHP script creates a text file containing the new line then executes a command to add that file to the cron file. Eventually I set it to run the script every minute and have the script itself check for whether it should be doing anything by referencing a MySQL database.

So in the end the PHP looked like this:

$fh = fopen(“foo.txt”, ‘w’) or die(“can’t open file”);
$stringData = “* * * * * /opt/lampp/bin/php /opt/lampp/htdocs/php/ereminders/s_mail.php &> /dev/null\n”;
fwrite($fh, $stringData);
fclose($fh);
exec(“crontab foo.txt” . ‘ 2>&1′, $output);

The &> /dev/null is just to stop it ‘helpfully’ sending email to root every time it runs (i.e. every minute) containing a log of exactly what happened.

3. Database

It’s a fairly simple MySQL thing – nothing fancy. I wrote a nice function in PHP to tabulate the results of a query which I use in my screenshot. It’s a single table and I haven’t bothered to normalise it or anything. Nothing to see here. Move on.

4. Admin panel

This was, like with most good things, the final stage of development. I was also getting lazy and bored so it’s pretty rudimentary; I wrote it just so I don’t have to go into PHPMyAdmin to change things. It makes use of that rather neat ‘tabulate’ function that I had written which tabulates the MySQL query. The var_dump is the contents of the cron file.

Final thoughts

In hindsight, this is pretty good for two hours’ work, especially considering about half an hour was spent writing scripts to automate things from an admin panel. I’ve also actually found it quite useful (pardon the surprise) – the other day I wanted to download an episode of Lost Windows 7 but the throttling period had already started and that one download would probably have pushed me over the download limit. At exactly 9pm I got an email reminding me to download so was able to watch the episode install the OS that very night. Though I doubt it’ll be much use to anyone other than me since there exist systems out there that do the same thing, just much better (probably).

๏̯͡๏﴿


Songbird v Foobar

April 29, 2009

Interestingly enough I switched away from iTunes 7 and haven’t touched it ever since their highly hyped update to 8. I switched to foobar2000 which is actually a pretty awesome bit of software. I have however been constantly hearing about Songbird and its amazing features so I’ve now finally got round to installing it and testing it out. Here are my thoughts.

Foobar > Songbird

One of the reasons I switched away from iTunes in the first place was obscene memory usage. I’m not sure how iTunes 8 is with memory but I had many grievances about the performance of iTunes 7 when I used it. Testing Songbird on a decent laptop (3GB RAM, Intel Core2 Duo T8100 @ 2.10 GHz, a processor that benchmarks faster than most in its clock speed range), it took 5 seconds for the program to start up fully while foobar loaded instantly. Foobar’s memory footprint was absolutely miniscule at 10MB while Songbird required a hefty 80MB, though that’s fairly unsurprising considering its capabilities as a browser.

In terms of usability, as a foobar2000 user, I miss features like Cursor Follows Playback (and more importantly Playback Follows Cursor), complete ID3 tag control, advanced syntactical filters and fully customisable shortcut keys, for which I have yet to find Songbird extensions. Whatever the case these are minor concerns and are bound to be ironed out / provided in the long run by extensions or built in natively. However my concern is that Songbird seems directed more at less savvy / control-freak users who don’t necessarily want to use something like a RegEx string or SQL query to perform operations or filter their music – the functionality is based more around forms and buttons rather than console, debug window and command prompt. While most people probably welcome this user-friendly approach, I personally enjoy the ‘hackability’ and almost complete controllability of foobar. Of course, since Songbird is open-source a real hardcore user may prefer to hard code in mods, though I for one prefer not to have to recompile software to make it do what I want.

There are also several components which come natively with foobar (or as pre-installed plugins) such as ReplayGain (very important; Songbird’s equivalent is the ‘VolumeProfiles’ addon); minimise to tray (again critical [to me]; Songbird has the ‘MinimizeToTray’ addon); and a ‘resume playback after restart’ option (a nice touch to foobar; Songbird has an addon called ‘last track resume’).

This demonstrates the syntax of a Foobar preference element - a lot of the preferences are like this. Theres just so much control

This demonstrates the syntax of a Foobar preference element - a lot of the preferences are like this. There's just so much control

You can even control exactly what text is in the window title, status bar and system tray tooltip

You can even control exactly what text is in the window title, status bar and system tray tooltip

Songbird > Foobar

Enough nitpicking. Songbird really does have some really awesome features. Its integration with the web is very nciely done – I get the impression more or less every online music service is supported to some extent, and the whole browser integration is a brilliant idea. Foobar’s web integration comes in the form of ‘freedb’ which I assume is some sort of tags downloader though it’s never given me any vaguely sensible suggestions so isn’t very good. There’s also a mini player built in which foobar doesn’t seem to have without resorting to skinning. Ratings are native which foobar is critically missing – you have to use ‘quick tagger’ [addon]. The default iTunes interface was offputting at first but the browse library by artist/genre/album etc at the top is another feature foobar lacks but Songbird has. And, of course, Songbird is open source.

It’s interesting that Songbird was developed as an open source project thus appealing to the techies while also being amazingly pleasant to use with some of the most useful and critial features built in and vast extensionability. Someone commented Songbird is like the Firefox of media players. I can’t say I disagree.

I find the way theyve built a media player around a browser quite cool and certainly in line with the whole web integration thing

I find the way they've built a media player around a browser quite cool and certainly in line with the whole web integration thing

Songbird has a clear iTunes-like interface and the mashTape (web integration with artist/song info, reviews, even youtube) is a pretty cool feature IMHO

Songbird has a clear iTunes-like interface and the mashTape (web integration with artist/song info, reviews, even youtube) is a pretty cool feature IMHO

Songbird, Foobar > iTunes

Despite a slow load time, Songbird wipes the floor with iTunes when it comes to performance. There was a problem with iTunes 7 in which scrolling through a large library was a misery owing to the intense slowness of just about everything. Songbird on the other hand is actually pretty snappy. And of course Foobar runs like lighting.
Both are extensionable. I know there are iTunes addons etc. but both these alternatives take extensionability to a much higher level. Songbird probably uses extensions about as much as Firefox while Foobar takes extensionability to an extreme by more or less requiring them to function normally (hence the pre-installed ones).
And of course neither associates itself with a store that sells DRM music ;) So it’s all good.

Overall, based on my experience of them so far, both are far more than adequate replacements for iTunes (unless you’re a fool and actually use the iTunes store in which case your music is useless if played by anything but Apple products). Foobar even has support for iPods (not sure about Songbird). Neither has performance issues, and both are more or less customisable enough for the standard user. If you’re after an easy and pleasant-to-use player with an automatically decent-looking interface with truly wonderful web integration, go download Songbird. If you’re a control-freak in search of hackability and control almost to the extent of writing your own RegEx (and also a completely no-nonsense player), foobar’s the one for you. On the other hand if you want a program that is slow, memory-hogging and defaults to buying music from a store with hideous DRM, go ahead and download iTunes.

๏̯͡๏﴿


British Informatics Olympiad Finals at Cambridge

March 29, 2009

I’ve just got back from the British Informatics Olympiad finals (they chose the top 14 in the country to take this – I was an experimental error) which took place at Trinity College in Cambridge. Despite a fairly epic fail, I thought the weekend was quite productive and I’ve definitely acquired better skills from the only formal or otherwise training in programming the contestants were (strongly) encouraged to undertake. More photos can be found on my Flickr photostream.

Night shot of housing

Night shot of housing

As always with such activities, the people there shared many interests, and since programming is (at St Paul’s anyway) an interest few people take vaguely seriously, it was a particularly unique weekend for me in that I could talk about Dijkstra’s Algorithm without getting suspicious sideways glances and a general awkward diffusion of human density away from myself… Pretty much all the people there were doing double Maths and were very much into it, and even at a school like St Paul’s one probably wouldn’t be able to mention the Collatz Conjecture and have every person in the room nod and begin elucidating their own hypotheses on where the proof will (if ever) eventually come from (e.g. graph theory, number theory, computing etc). One of the people who had a big hand in organising and managing the whole affair (Tom), seemed to know a great deal about Physics, Maths and computing and the night before the olympiad he and a few of us had a long argument about string theory, something unheard of even in Physics lessons at school. Tom also seemed to know my maths teacher from helping him set ridiculously hard BMO questions at some point in the past (I think I can just about forgive him for doing that). The atmosphere was also quite singular since geek humour actually worked, and maths and physics jokes got a higher laugh:groan ratio than usual; though I couldn’t help facepalming when, while collecting in papers, Tom remarked ‘all your papers are belong to us’.

I think I probably did get quite a bit out of this whole experience. Since being invited to the finals I’ve been constantly prodded to complete USACO training challenges which are combined with a form of structured training / algorithmic tutorial thing which gives users formal training of algorithms. This training is actually the only time I’ve ever been ‘trained’ in programming and I’d been sort of making things up on the fly since I started programming (in visual basic…) back in Colet Court – I didn’t actually know what a greedy algorithm was until this year, and my understanding of even what algorithms are was still sketchy when I took the BIO for the second time (last year). So by going through a beginners through intermediate algorithms training course was definitely useful for me and I now approach problems more analytically rather than just trying to convince a compiler to automatically do what I would personally do if faced with a problem (no, not give up straight away and watch a film instead). I have to say though that despite its utility I acquired a few bad habits from it. Since for each assignment infinite attempts are allowed, I tend not to actually test programs against anything other than sample data and submit it, hoping something that can solve a specific case will probably work with all other cases. Inevitably it tends not to work first time and rather than sitting down and debugging it, I take the test data and just try to make the program work for that by doing silly things like making loops slightly longer or incrementing variables by 1 randomly. USACO helpfully also provides the full answers for test data. I have to say though that the grading system on USACO is impressive – after you submit a solution the backend code automatically compiles and runs the program for each test case and grades its answer. Inspired, I’ve now got something vaguely similar set up on my Debian server (albeit requiring both FTP and SSH connections to sort of do everything manually).

Another thing I’ve always wanted to do was learn C++ as every linux package seems to require compilation with g++, so C/C++ seems to be the language to learn. Since the BIO Finals required code in C/C++ or Delphi (that nobody used, surprise surprise) I was forced to learn a new language, and after discovering the awesomeness of pointers, I have an incentive to endure the confusion and use some really cool programming features (at the risk of corrupting random critical data in the system memory).

And as always with a visit to Cambridge, I got to see more of the college and living quarters. I have to say, the Corpus rooms were more spacious though assuming a specific comparison is representative of a more general comparison would be displeasing to my stats teacher.

And of course since the BIO is sponsored by Lionhead Studios (gaming company) we got to meet a representative and should find obtaining work experience there much easier.

So anyway, here’s what I remember from the itinerary:

Day 1
Erroll and I arrived at Trinity only to be told we had to be at the porters of Burrell College on Grange Road, which is somehow related to Trinity. After a bit of a trek we arrived, Pauline-style, fashionably late. We were shown our rooms, were provided with food and were shown the computer rooms where we got used to our environments. It was actually a very simple question on summing squares repeatedly (up to 2^63 times) – all we had to do was notice there was a repeating sequence – but the unfamiliar environment and unfamiliar method of file input (fscanf in <stdio> as opposed to fin >> in <fstream>) made me do all sorts of stupid things. I think after three months of USACO challenges in C++ I think I still prefer C# / Java as languages. Perhaps it’s got something to do with compilers (my 2003 MS C++ compiler at home goes kaput randomly).

Day 2
The brain uses an insane amount of energy when working hard and doing olympiads was even suggested (as a joke) as a means of weight loss. For this reason an awesome breakfast was bestowed upon us, consisting of baconey, eggy, toastey goodness. The morning papers were both written, the first being on Turing Machines and the second on emotion/sentiment detection in text. This was actually probably the most enjoyable part of the contest – the questions on turing machines were a bit like a mix of maths, logic and electronis (involving truth table like things and train track systems etc). It was also the only bit I could actually seem vaguely competent doing – an education in maths and electronics probably helped me.

The afternoon was when the hardcore competitioning kicked in with a whopping five-hour paper consisting of four questions (the best people completed two questions). I’ll talk about those later, but I failed quite epically, only managing to write a program to solve a question for about half the test data. It also transpired I’d spent all my time on the hardest question, and that although I saw immediately an algorithm that Dr Forster said worked, my implementation was total crap. Later that evening I managed to think up a linear time program to solve the same problem which when I mentioned it the next morning was apparently the best solution Dr Forster had thought of. I still blame C++ :P

After the olympiad I attempted to take some night photography with a tripod which is apparently banned. Oops. We all then went to dinner at Ask and got quite stuffed up before being asked to eat more in the chill room afterwards.

Day 3
Today was mostly free time in which Erroll, I and another person we met spent our 2 hours of free time pulling newspaper, broadband adverts and IR spectra printouts out of a pool table in an attempt to get the balls out. Our efforts included using mobile phones as endoscopes to get a better view of the inside of the machine, using wire coat hangers as hooks to reach the newspaper and peering inside the machine to aid our end. We were eventually thwarted as we concluded our over-zealous table tilting had resulted in the balls falling out of the mechanism entirely and ending up unretrievably on the bottom of the table. For future reference: pool table lock picking doesn’t work with coat hanger wires.

After an inventive method of going round (in three dimensions) a gate without a keycard I and a few others got to the main Trinity college. The tension was apparently high as they announced the IoI competitors though I didn’t notice; my expectations were clearly low as I had just organised three weeks’ work experience with Microsoft at the same time as the finals. What was nice though was that we got very cool 4GB memory sticks with ‘British Informatics Olympiad’ on them, and we were issued with compulsory free games and ‘Introduction to Algorithms’. I already had the book and had in fact brought it to the contest in order to seem vaguely intelligent with no real intention of looking at it.

My room

My room

The questions

I’ve put thumbnails of scans – click to enlarge. I cut out my retarded scribblings.

Desperate Measures

This is the one I half did, and also the hardest. You’re given a cross section of a polygon and are supposed to divide it into triangles using only given vertices. In the question the polygon is the cross section of a cave tunnel…

My initial (and apparently correct) thought was to cut off all the bits sticking out of the polygon (i.e. turn them into triangles and ignore the outside points) and end up with a convex polygon and subsequently just a triangle in the middle; sort of repeatedly going round the polygon chopping off bits. My second (and apparently better) thought was to go from left to right joining up points, an algorithm that runs in linear time.

River

This is apparently the easiest question: a river (= straight line) of up to length 2^31 kilometres (the unit was some amusing invented thing but I can’t remember what it was) can be split up in one of up to 1000 (=n) ways into n unequal segments (= up to 1000 different sections using up to 1000 different partitions). The question was to split it into n segments so that each different segment is a segment described in a different partition and there’s no overlap of partitions. OK that’s a crap explanation – the scan does it better.

I didn’t actually look at this which is a pity – it was the easiest. When I looked at it before going to bed that night I managed to come up with an algo straight away which would have worked in O(n^2 ish) – the biggest case would have taken about a second to run. Pity I saw 2^31 and immediately moved on when actually doing the competition.

All work and no play

The scan says it all. It was a really annoying question because I started working on it about an hour before the end. I wrote an efficient O(n) algo for finding how many ways there are of making n in such a way, had an amazing pascal’s triangle/combinatorics thing going, and was about to write it when I realised it all got screwed up by the blocks only going up to 10. Unable to repair my program, my final submission only worked up to n=10 (needs to work up to n=64). I might also have forgotten to change the output method from console to file output. Meh.

Spies

I couldn’t think of an efficient way of doing this (not even in the evening though I was very sleepy by then) and am certainly not going to attempt it again until I have lots of free time, i.e. summer. This was the second hardest.

๏̯͡๏﴿


Cambridge Eng + Comp Sci Lectures

February 28, 2009


Today was the fourth time I’ve been to Cambridge (bringing my total by the end of this year to six, as I mentioned here), this time for another CareersMCS event about engineering and computer science. Like both the other conferences organised by the same people, I ended up leaving somewhat inspired, and wanting to do all the courses they talked about. As one of my friends put it, now I don’t want to be a banker/medic/programmer/whatever – I just want to be a student all my life. As Chef from South Park put it, “there’s a time and place for everything … and it’s called college”.

The day started at silly o’clock when I woke up, arriving in time for a 9:45 start. After an intro (at the beginning of which all parents in the hall were, as always, assertively invited to leave – presumably to allow us to feel free to ask questions) we launched straight into Aerospace Engineering.

I feel I took away something new, interesting and possibly useful from each of the nine lectures today. Aerospace was all about fluid dynamics and how aerofoils generate lift, footballs spin and paper aeroplanes do ‘loop-de-loops’. The speaker quickly dismissed the higly yesteryear longer path theory (and to think I was misled by the Science Museum!) and moved onto explaining exactly what Cambridge professors think is going on. It’s all about bends in streamlines creating pressure gradients and subsequent forces. Theories involving Bernoulli’s speed-pressure relationships are also apparently flawed. Fluid dynamics are one of the Physics topics tragically lost in the teaching of AS Physics (*very* casually touched upon by KPZ when we were calculating electron drift speeds in wires) so it was certainly something novel for me to see.

The speaker touched on the intriguing Physics behind shock fronts and sonicbooms

The speaker touched on the intriguing Physics behind shock fronts and sonicbooms

Computer science was largely familiar to me – it was on security: mostly cryptography and corresponding cryptanalysis, but also briefly touching on stego, social engineering and PGP certs. It’s interesting to note that contrary to what I might have expected a year ago, the course places little emphasis on learning programming languages such as C++ well, but far more about simply working out algorithms and computational methods for solving problems. Having now done some (sort of) formal algorithm training (and being aware of the syntax of many different types of languages), I’m seeing the language divide beginning to melt away and starting to see the use of teaching, say, Dijkstra’s Algorithm in pseudocode rather than C++. And of course, it was no surprise that Mathematics (particularly llinear algebra) played a pretty hefty role.

Chemical engineering was quite a new one for me since I hadn’t a clue what it was all about beforehand. Turns out it’s essentially process engineering. After this we were given a time-lag-style summary of a first-year student’s perspective on applying to engineering at Cambridge. Following lunch was an intriguing talk on mechanical engineering involving some familiar circular motion revision and a pretty awesome demo of a Morrison Shelter, with the speaker’s mobile phone as collateral (sitting inside a flimsy-looking metal-wire shelter as a 5KG weight was dropped on it)! This was followed by an excellent demo-rich talk on electrical engineering (I was delighted to be able to recognise an AS-style power amp, albeit one with three concatenated push-pull amplifiers [power!]) which revised some electromag physics (dPhi/dt, I(lxB), q(vxB) etc). There were also some nice experiments with a big coil and an iron rod going through it – though I felt rather sad about being so delighted by seeing Physics demonstrated so wonderfully. Maybe I’m just weird that way…

Morrison Shelter. It used concepts of plasticity to reduce damage to occupants (like car crumple zones)

Morrison Shelter. It used concepts of plasticity to reduce damage to occupants (like car crumple zones)

The talk on applications of IT to engineering was highly interesting theoretically though the program appeared depressingly slow – crunching a few matrix equations would have taken MATLAB about five seconds at the most. The speaker left his program running for a good half of his talk before it finally finished… The guy in charge gave a talk on applying and UCAS and other such helpful stuff in which he cracked the same joke as he does for every single one of these: apparently someone wrote on his personal statement [paraphrased]:

I do lots of music. I enjoy playing the flute; sometimes I play with myself

The day was concluded with a talk on civil engineering, though by that time my brain was complaining about lack of sleep.

The thing I love about studying things like Physics and Maths is that every now and then as I learn new topics and areas in the subjects, I come across proofs and lines of logical thinking and generally ‘things’ which just make me stop and think, ‘oh yeah – never thought about it that way before’, and marvel at the genius of whoever came up with it (for example OLCT told us about an intriguing interpretation of regression lines involving dot products of vectors). Maths and Physics have intriguing subtleties which never fail to inspire. I still think after today that I’ll go with Physics/Maths. Engineering subjects are interesting and probably far more directly useful practically speaking. But there were moments in the lectures when I felt the subjects were more superficial than I would like. There’s just something about delving deep into a subject and finding something surprising and counter-intuitive yet logically beautiful that makes me want to find out more. I must be *really* weird…

Again, as I was walking around Cambridge, I observed huge contrasts in the types of people around. There’s something I love about students – they make up the most diverse age group: the norm is to be radically different and anti-trendy. The town is simply buzzing full of life, from street musicians (pretty good ones as well!) to punters (literally), to human rights demonstrators sitting in cages outside King’s. This time we also got to see Corpus, Trinity and St John’s all on the same street, the three colleges about which I know enough to want to apply to. Cambridge is simply an awesome place; there’s no question about it.


End of the Road for CAPTCHAs

February 23, 2009

Signing up to a forum the other day, I couldn’t help noticing the increasingly pervasive presence of anti-spam and anti-bot modules built in to online registration forms.

The problem with CAPTCHAs in my opinion is simply that they are stressful to use for a modern end-user. Research (c.f. some New Scientist article that I read) indicates online users have over several years become far more ‘aggressive’ in that website visits are increasingly about simply extracting information and leaving: people are becoming more ‘hit-and-run’ as opposed to ‘come join our wonderful community at Experts Exchange’! This speed and efficiency (which I think is largely attributable to Google) of surfing however goes completely out the window when it comes to CAPTCHAs: rather than simply clicking ‘OK’, users are forced to squint at a series of curly, obfuscated, low-contrast, specked, half-obscured characters, often only to be told they’ve typed the wrong characters and forced to repeat the whole unpleasant experience. CAPTCHAs break the line of thought and direction of a user, serving merely to annoy him/her; most people I know only ever think of CAPTCHAs as infuriating and time-consuming wastes of space. They’re sometimes even vaguely funny!

There’s actually an argument that the whole concept of forcing users to read half-illegible characters is doomed anyway. There are very, very good programmers and hackers out there who are working towards better text-recognition software. ABBYY Finereader is one of the leading Optical Character Recognition (OCR) programs out on the market, and Evernote wasn’t deceived even by my handwriting – technology has surpassed most of the humans I know! I am quite convinced that there will come a day on which computers and bots will eventually become better than humans at recognising characters and completing optical identification tests. I’m pretty sure CAPTCHAs seem to get harder and harder to the (ridiculous) point at which it takes several attempts to get one right, for example the ‘cat’ CAPTCHA at Rapidshare:

I became an expert at doing these and I assure you the answer is KOI8

I became an expert at doing these and I assure you the answer is 'KOI8'

and this one which I think is a fake but is far too good not to include:

So CAPTCHAs are ultimately heavily limited in their applications, especially considering computers are growing in power at an exponential rate (Moore’s Law). Alternatives often offered to CAPTCHAs involve having the characters read out through the speakers. Unfortunately I think speech-to-text software is also at a highly advanced stage (so long as Vista’s attempt at it is ignored).

So, have people had any other workable anti-bot systems? I certainly think so. In fact the example I gave near the top was a good example of such an alternative: a turing test. Programs which have the ability to ‘parse’ English and make decent ’sense’ out of it are still things of science fiction. Admittedly there are some very limited versions of artificial intelligence, however those tend to cost rather a lot more than the average spammer can afford, and your average bot is still a long way from answering questions like “is ice hot or cold?”

BT: is ice hot or cold?
SmarterChild: People always ask me if there’s a smarter bot than SmarterChild. Well, now there is! Add wilma@live.com to your Contact List to see what I mean!

During the course of our conversations you may see sponsored messages. These messages have the word “sponsorship” at the end. Sponsors help me remain free of cost to you. To learn more about sponsored messages, click here —> http://web.smarterchild.com/ct?xCkO/0

Huh?
BT: is ice hot or cold?
SmarterChild: I’m sorry, I don’t know what you’re saying.

———-
*Adds wilma@live.com as a contact*
———-

BT: is ice hot or cold?
Wilma: I am sorry, I didn’t get that.

I’m just getting started, so I can’t answer everything yet. I’m getting smarter all the time!

Would you like to try a search for “is ice hot or cold”?
BT: you know what, never mind…

I have to say though, my favourite proposed alternative is this one:

Perhaps the people who designed it were attempting to guarantee their site an intelligent user-base!

So anyways, I think the CAPTCHA is near the end of its life. When confronted with a picture of what appears to be a random doodle and expected to use their artistic licence to deduce some vague form of meaning from it, I’m pretty sure many people in a hurry would be annoyed, end up wasting time getting the CAPTCHA wrong multiple times and end up losing track what he/she was supposed to be doing when signing up/posting in the first place. I hope the future holds something easier to use for users of t’internet, much as I love the idea of getting users to solve Maths questions before allowing them to post comments. Perhaps ultimately Man will lose his battle against the machine and whatever anti-bot system ends up being put into place will be solved faster and with greater accuracy than humans! For now though, most machines powerful enough to perform complex human intellectual feats in any reasonable amount of time occupy several hundred square metres of floorspace in labs at NASA and IBM. I sure hope NASA aren’t secretly spammers…

Interestingly, a productive use for CAPTCHAs has actually been found. For those who still haven’t come across them, reCAPTCHA systems have been deployed all over the web. As stated on the reCAPTCHA website:

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.

CAPTCHAs were useful once, back in the year 2000 when computers ran on processors just about capable 7 flops and could barely run fortran compilers (yes I am being ironic). In fact, they were quite a good, and rather an ingenious, solution to the problem, as before they began to get ridiculous they were convenient to completely and only ever took about 2 seconds, and computers were more or less baffled by even slightly complicated/curly/irregularly typed/written text versions. But in a modern world I think the whole process of exacerbating myopia is bloated, unnecessary and pointless. But then maybe that’s just me!

๏̯͡๏﴿


MATLAB Spiel

February 6, 2009

[NB: Scroll to the bottom if you don't want to read my Maths rant and just want to see some cool graphs! Click to view full size.]

MATLAB graph of saddle-shaped z = y^2 - x^2

MATLAB graph of saddle-shaped z = y^2 - x^2. This negative curvature is, according to some theories, the shape of the universe (of course in 5-D or something): such universes are described by hyperbolic geometry.

I’m not going to pretend I know everything about MATLAB. But the more I learn the more I’m impressed with this immensely powerful software. While reading up on matrices I was at one point inspired to write a program to solve matrix equations since it really seemed like a job for a computer, and also a good test of my 2-D array programming skills. It goes without saying of course that the program never materialised owing to laziness and high work load from school. However as I was drawn deeper and and deeper into the world of linear algebra I kept coming across this thing called MATLAB. The lecturer kept going on about how MATLAB would try to avoid not only non-zero pivots but also small pivots for the sake of computational accuracy while doing elimination, and how MATLAB only solves the RHS of an equation after getting the LHS in echelon form, and generally how MATLAB is so great, and eventually I broke and decided I had to find out more about this program. I really didn’t regret it – it is sincerely awesome. So what’s so good about it?

Matrices

The lecture series I was watching was on matrices so I decided to test MATLAB’s apparently awesome matrix capabilities first of all. I created A, a random 100×100 matrix using MATLAB’s random number function:

>> A = rand(100,100);

The semicolon is to prevent it printing the entire matrix to the screen which just wastes time. This took an apparently infinitesimal amount of time so decided to make things a bit more interesting by making it a 1000×1000 matrix instead. MATLAB barely blinked; OK big deal, it can generate a million random numbers in very little time. Now for the proper tests. Elimination is generally speaking O(n^3). In fact it’s about (1/3)n^3. Well, precisely speaking it would involve 333,833,500 calculations for a 1000 square matrix. It’s not actually that many calculations for a computer to perform but it was the natural first test. Elimination is basically the same process as the factorisation of A into two triangular matrices: PA = LU. so let’s get MATLAB to do just that, with a massive matrix:

>> [L,U,P] = lu(A);

Quite impressive – it was fairly speedy: again, it did it in less than a blink of the eye. In fact I ran all sorts of tests with a single 1000×1000 matrix (generating a new one each time to get round falsely fast results from caching). Transposing was no sweat, and inverse ran for half a second. Curiously enough, rref (Row Reduced Echelon Form) took forever to run (ran for over a minute before I stopped it) although both factorisation and inverse were incredibly speedy. Surely getting from Upper Triangular to rref shouldn’t take that long? Especially since it’s obvious you’ll get a huge 1000×1000 Identity matrix (MATLAB managed to invert it so it has to be invertible). I also at first thought it was odd that no matter how many times I regenerated A, I always got its det as either infinity or -infinity, although after thinking about it, a 1000×1000 matrix is likely to have a pretty extreme det given its size, larger than MATLAB can cope with displaying.

Equation Solver

I subsequently discovered MATLAB’s equation solver which is _very_ awesome and includes complex solutions (e.g. x^5 + x = 0 has 5 solutions, 4 of which are complex – MATLAB got them all). Tying in with our FP1 syllabus, MATLAB confirmed 1 has 6 sixth roots (4 complex).


>> solve('x^6 = 1')

ans =

-1
1
-1/2*(-2+2*i*3^(1/2))^(1/2)
1/2*(-2+2*i*3^(1/2))^(1/2)
-1/2*(-2-2*i*3^(1/2))^(1/2)
1/2*(-2-2*i*3^(1/2))^(1/2)

It also managed to get a solution for cos(y) = sin(x) although failed to give me an entire solutionspace, possibly something to do with its insisting by default on solving for x, which was slightly peeving:


>> solve('cos(y)=sin(x)')

ans =

1/2*pi-y

Graph Plotting

And now enough of me speaking gibberish that I don’t properly understand and forwards: towards the awesome(-ish) visual bit of MATLAB: the graph plotter. There are some annoying things about this: you can’t just type in some equation involving x and y like a circle/ellipse equation (or indeed sin(x) = cos(y)) and ask it to plot it – you have to physically create a set of x values:


>> x = -10:0.1:10; %-- set of numbers between -10 and 10, resolution of 0.1

then create a second set of data, y, and define it as a function of x. Only then do you plot y against x. I personally find this method of plotting ‘datum by datum’ fairly annoying but still, graphers are graphers and thus by definition awesome.


>> [x,y] = meshgrid([-10:0.01:10],[-10:0.01:10]);
>> z = sin(x./y);
>> mesh(x,y,z);

MATLAB plot of z = sin(x/y)

MATLAB plot of z = sin(x/y)


>> z = sin(sqrt(x.^2 + y.^2));

z = sin(sqrt(x^2 + y^2))

z = sin(sqrt(x^2 + y^2))


>> z = 1 ./ (nthroot(x.^2 + y.^2, 2)) .* sin(sqrt(x.^2 + y.^2));

z = 1 / (nthroot(x^2 + y^2, 2)) * sin(sqrt(x^2 + y^2))

z = 1 / (nthroot(x^2 + y^2, 2)) * sin(sqrt(x^2 + y^2))

Awesemo, huh?