Young Rewired State

August 23, 2009

I’ve spent the last two days at the Google HQ in London attending Young Rewired State (#youngrewiredstate), and it’s been nothing short of epic.

And of course, I’ve taken some photos.

The schedule (shamelessly copied from the site) was as follows:

Saturday 22nd August:
10:00 Start
10:30 Planning session
12:30 Lunch
13:30 Hacking starts
17:30 Dinner
18:30 Home (Hacking overnight allowed!)

Sunday 23rd August:
10:00 Back to hacking
11:30 Brunch
12:30 Back to hacking
16:00 Presentations to Judges and Press
18:30 Prizes announced

On the first day we split into groups and started thinking up ideas. At about 4pm we finally settled on our idea: to make something very similar to RentACoder, but much simpler, targeted at talented coders who need experience in order to get a proper job. Here are a couple of screenshots of the final result (click to embiggen).

We decided on a PHP/MySQL project and as luck would have it, I was the only PHP/MySQL programmer in the group! So it was fairly frantic work (solid coding from 10 till about 3 on the last day) and we ran into all sorts of problems with versioning and people overwriting each other’s work in FTP, especially as the CSS people tended to be working on the same files as I was at the same time!

IRC

As with all hack days, IRC was one of the most important methods of communication. Literally everyone had their laptops out during talks, especially during the presentations at the end and there was a fairly constant stream of chatter on the channel. @samhale123 also put up a bot on the channel to tweet things over IRC – we had several hours of fun attempting to overload the script / twitter / the server!

Immaturity with Twitterfall

Google

Google is an amazing place with by far the best decor I’ve seen in a company building. The floor is laid out like the London underground and the meeting rooms are more or less in the right place for stations (with consistent naming). There are ducks on the ceiling and random awesome other bits of furniture / decor adorning the walls / ceiling / floor.

We were also given a load of Google freebies, including Google yo-yos, Google cakes, Google water, Google pens, Google notebooks…

This actually was a telephone box!

Google and Youtube Cakes

People

Of course it was a floor full of geeks, which essentially means a brilliant selection of geek T-shirts (I spotted several from ThinkGeek, at least one from the xkcd store…). The mentors (helping out with coding / guiding the groups) were also working in all sorts of fantastic companies; one of our mentors is working at last.fm, one at moo, one with the BBC etc. And needless to say there was a wide array of OSes – the large majority seemed to be using Macs, while those with PCs were probably split 50/50 between Linux (mostly Ubuntu, one Debian that I know of) and Windows.

There was also a brilliant selection of judges, including people from Wired (for some reason looks very familiar; came to school to give a talk maybe?), C4, etc.

Some of the judges

The presentations were good fun – there were something like 40 people from the press / outside making the buzz all the more exciting. And we (@workforpeanuts) won the “Wish I’d thought of that” award!

Anyways, this is the first hack event that I’ve ever been to, and if this is anything to go by, I’m definitely game for another at some point. Heck, maybe DEFCON next year… *MANY* thanks to @hubmum for organising such an amazing event.

And I took other cool photos so go for it and browse!


Microsoft – Week 2

July 25, 2009

This week has gone pretty quickly and I’ve mostly been working on the text analyser / summary program. I even managed to take some photos! The week started with @dumbledad (= Tim) showing me some of the visualisation stuff he and an intern had been working on to visualise a book, some of which will appear shortly on a site somewhere… It’s all about new and interesting data presentation, in the spirit of Information Aesthetics, and he sent me a link to some stuff he did on ManyEyes – word clouds (or ‘wordles’) comparing frequencies of words in narrative and speech. Some of the other ones are more difficult to describe but I’ll be sure to tweet links to them when they get published.

The idea of the summary program was that it split the book into sections then compared a histogram of word frequency densities in each section with another histogram for the entire book, then picked out the words which were most likely to be important to the section by choosing the most unusually frequently used ones. The problem with that was the program wasn’t picking out main characters because they were being mentioned all throughout the book. So I was to implement a system to split words into three categories: local to the section, local to the book (main characters) and common to the English language. The existing framework for a two-way local to section vs local to book had already been written so I was to implement the three-way split.
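As a rough illustration of the two-way comparison described above, here is a minimal PHP sketch. It is not the project's code: a plain density ratio stands in for the probabilistic model in the factor graph, and the function names (tokenize, word_densities, distinctive_words) are made up for this example.

```php
<?php
// Hypothetical helpers: a simple frequency ratio, not the real model.

/** Lower-case a text and split it into words. */
function tokenize(string $text): array
{
    preg_match_all("/[a-z']+/", strtolower($text), $matches);
    return $matches[0];
}

/** Density of each word: occurrences divided by total word count. */
function word_densities(array $words): array
{
    $total = count($words);
    $densities = [];
    foreach (array_count_values($words) as $word => $count) {
        $densities[$word] = $count / $total;
    }
    return $densities;
}

/**
 * Rank the words of one section by how much denser they are in the
 * section than in the whole book, and return the top $limit words.
 */
function distinctive_words(string $section, string $book, int $limit = 3): array
{
    $sectionDensity = word_densities(tokenize($section));
    $bookDensity = word_densities(tokenize($book));

    $scores = [];
    foreach ($sectionDensity as $word => $density) {
        // Skip words missing from the book (only possible if the section
        // is not actually part of it), which avoids dividing by zero.
        if (isset($bookDensity[$word])) {
            $scores[$word] = $density / $bookDensity[$word];
        }
    }
    arsort($scores); // highest ratio first
    return array_slice(array_keys($scores), 0, $limit);
}

$sectionOne = 'the dragon slept the dragon woke';
$sectionTwo = 'the knight rode the knight fought the knight won';
print_r(distinctive_words($sectionOne, $sectionOne . ' ' . $sectionTwo));
```

In this toy input "the" is equally dense in the section and in the book, so its ratio is 1 and it ranks below the section-only words. A main character mentioned evenly throughout a book would score close to 1 in the same way, which is the weakness the three-way split was meant to address.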

Factor graph showing the model

By Wednesday I’d finished the actual implementation so I started trying to invent a visualisation. My original idea was to have a ‘story line’ (no pun was actually intended) along which various threads would undulate, and the further out from the story line they are, the more important they are; think of it as a radial graph – I think I was probably inspired by the RealPlayer (yuk, I know) ‘cosmic string’ visualisation. I built a really flickery version as a mockup which was approved, and since I was by then starting to shy away from WPF I ended up learning DirectX overnight to implement a final 3D non-flickery version of it. After spending a whole day stressing over the edges of the scene getting cut off and finally realising I’d set the camera’s maximum viewing distance ridiculously low, I finally got it to work, and after writing some homebrew Bézier curve code it looked pretty good (if I may say so myself); Tim tells me he’ll probably add a screen video of it to the online display of visualisations so … watch this space.

Another excitement of the week was a talk from TrueKnowledge (= TK), an internet answer engine. It’s similar to the famous Wolfram Alpha (= Walfa); however in my opinion it actually has more potential. Walfa throws manpower at writing new code to scrape information from various different sources on the fly which essentially means the more information you want, the more you’re going to need to work. TK on the other hand stores information in an enormous database which has a structure suitable for storing any type of information, and although work is done to ‘crawl’ Wikipedia and other sources for knowledge, it also sources the community for information which means it can gather lots of important knowledge very quickly with minimal effort. It also has awesome features of natural language parsing (ask it ‘what colour are red cars’ for example) and it can also give you a step-by-step explanation of the logical process that leads to its final answer.

The bottom half of the screenshot shows TK's stages of logical inference

It of course differs from Walfa in that it hasn’t got a tonne of Mathematica code behind it – its strengths are in factual and inferred knowledge as opposed to evaluating integrals. It’s currently in Beta and has an API (yay!) so I strongly encourage anyone who has used Walfa to give TK a go.

On Tuesday the weekly Mexican food van appeared – until then I’d never realised quite how amazingly good burritos can be! While we were eating we started discussing presentation of text. The problem is that a conventional layout presents the reader with a formidable block of text interspersed with some images which is difficult to follow and annoying to read since one always has to alternate between studying the image and reading the text. However attempts at producing non-linear presentations of information such as embedding text into the image as tooltips or expandable areas of the image etc. have always resulted in people simply not reading very much of the text and consequently missing out important stuff. The best solution we came up with is using an old method of collapsible clauses, just like collapsible code. For example, if a relative clause which in this case is italicised and relatively long yet somehow doesn’t contribute much to the sentence thus merely adds length and unnecessary information to the text making the ultimate meaning more difficult to discern is considered superfluous to the meaning of the sentence, it could be replaced by a small button that only shows the clause if clicked – such ideas are particularly relevant to German sentences which tend to have huge diversions into clauses before the verb is revealed right at the end. This way readers can quickly get the gist of what’s going on so they may study the image in an enlightened way, then go back and expand the text to get the full meaning.
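To make the collapsible-clause idea a little more concrete, here is a small PHP sketch. It assumes (purely for illustration) that the writer marks optional clauses with {curly braces}; the function names and the "[+]" placeholder, which stands in for the small button, are hypothetical.

```php
<?php
// Assumption: collapsible clauses are wrapped in {curly braces} by the writer.

/** Collapsed view: each marked clause is replaced by a "[+]" placeholder. */
function collapsed_view(string $text): string
{
    return preg_replace('/\{[^{}]*\}/', '[+]', $text);
}

/** Expanded view: drop the markers but keep the clause text. */
function expanded_view(string $text): string
{
    return str_replace(['{', '}'], '', $text);
}

$sentence = 'The camera{, which I had configured badly,} clipped the scene.';
echo collapsed_view($sentence), "\n"; // The camera[+] clipped the scene.
echo expanded_view($sentence), "\n";  // The camera, which I had configured badly, clipped the scene.
```

On a real page the placeholder would be a clickable element that swaps in the hidden clause, so a reader can take in the gist first and expand the detail afterwards.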

There are also a few things I noticed about MSR in general. There is a strong sense of company loyalty – all employees seem to use Bing, and everyone I’ve seen even goes as far as using IE instead of Firefox! Using only Microsoft products to perform tasks however did make me aware of the wide range of programs they do produce – they even have Virtual Machine software and an internal proprietary alternative to SVN. I guess it does help the developers of these applications a lot if they have an enormous internal test group: all the employees and interns. There’s also pretty close integration with Redmond (Outlook + Office Communicator + global WAN shares) so feedback could be quite efficiently delivered. The entire place also operates in the spirit of trust – all users have admin rights (necessary for developers anyway) – which is so much better than what is implemented at school: a highly restrictive policy which, despite recent changes for the better, still filters out most protocols (FTP included) and in fact, instead of preventing people from doing things simply makes everything so much more difficult to do. Now I have to connect through encrypted VPN to use FTP…

Anyways overall it was a great two weeks. I enjoyed it hugely, I didn’t need to touch Excel, I didn’t make anyone coffee and I didn’t do any filing (who needs paper anyway? It’s a software company!) – instead I worked on real (and rather cool) projects, learnt some useful things, and made new acquaintances.

In other news, I’m off tomorrow to Cranfield for the Aerospace Challenge Finals – I’ll get to fly (actual!) planes, take lots of photos and it should be another great experience. They’d just better have wifi, though I’m bringing my Alfa Awus (ridiculously powerful) along in case of weak signal!


Inovazone

June 25, 2009

It suddenly strikes me that I haven’t been blogging much recently. Exams finished several weeks ago but I have been somewhat busy.

In particular I’ve been working on a new site for a client, Inovazone. In the words of the site’s inventor, Alastair Darwood:

How does an invention go from a scribble on a page to a world-changing product that advances humanity? The answer is that at the heart of the invention there lies a set of distinct and crucial necessities that the invention is addressing … Until now, the only way in which necessities were discovered was through large companies carrying out expensive research, or a spark of genius from someone who suddenly sees one and thinks of a solution. Inovazone is designed to change this. The idea is that users of the site post their ‘necessities’ (short explanations of problems, ‘necessities’, they think need to be solved to benefit them (or humanity)), or rough outlines of inventions they would like to see and anyone can browse the necessities if they want to and look for interesting problems to try to solve through innovation. This could be in the form of an invention or simply a quick online response to the post on our comments system.

Interestingly enough, there doesn’t actually yet exist a site or system available to the general public that serves this particular function by providing a framework within which ideas for inventions are openly submitted and accessed, so I agreed to work on it, especially considering the core of submitting and displaying necessities is a relatively simple PHP/SQL project.

Submitting a necessity

What I found particularly compelling was that, according to Alastair, everyone with whom he’s discussed the site has expressed enormous enthusiasm for it, and even a seasoned inventor he’d talked to had described the idea as long overdue and predicted huge success. Although I originally liked the idea and somehow felt it would do pretty well owing to its novelty, I for some reason assumed one would obtain mixed reactions in discussions.

Original image at http://rockstartemplate.com/wp-content/uploads/2008/12/social-bookmark-icon.jpg

In terms of the hard sell, we’re employing several different methods to publicise Inovazone. We’ve created a Facebook fan page and a Twitter account – feel free to follow us (@inovazone). One.com also kindly provide AdWords coupons so with some SEO we might be able to appear in ads on relevant Google searches.

The site is currently about to go into beta testing. From wiki:

Beta testing comes after alpha testing. Versions of the software, known as beta versions, are released to a limited audience outside of the programming team. The software is released to groups of people so that further testing can ensure the product has few faults or bugs. Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users.

The idea of this is simply to fine-tune what we already have rather than to add more features, though both are desirable outcomes from this round of testing; for me personally the primary objective is to see whether the system works well with a large user base. So for that to happen we need some willing volunteers to go test out the site: anyone reading this is welcome to join the test group. Go ahead and post as many necessities as you deem reasonable, and comment on existing ones. Try out creating new subcategories and send us feedback about features that you think should be created or made better via the awesome uservoice-powered feedback utility. I’m especially interested in bugs and vulnerabilities you find; if you somehow work out how to delete posted necessities or spam the site with adverts, or if the site spews out a series of errors while you’re using it under normal circumstances, get in touch (there is a contact page). Chances are the database will be reverted to its initial (more or less empty) state before the site is released in a few weeks’ time, so disasters should be recoverable.

Before you (readers) go, I’d be interested to hear your feedback on the idea of the site – do you think it has potential? Will it be useful?

That’s all from me for now. The site will get a working and hopefully regularly-updated blog pretty soon for general status information and news so you probably won’t be hearing much more about it from me.

๏̯͡๏﴿


Automatic Email Reminders

June 22, 2009

Many services seem to provide automatic email reminders these days – notably Google Calendar; however what I need is something that can send me daily reminders about things, and setting up a Google calendar with daily repeats seems an altogether inelegant solution. Virgin Media operates a throttling policy in which downloading is limited between 10am and 3pm, and again between 4pm and 9pm. As far as I’m concerned, all this means is I need to make sure I’m not downloading much between those times. I thought of setting up my various alarm clocks and watches to ring at 10am, 3pm, 4pm and 9pm to remind me – but considering I’d have to use four different devices (none of my alarm clocks support multiple alarms) and the fact that it would all be useless if I’m not at home, the best solution is to use a daily email reminder which would alert me even if I’m working off my laptop at school (for example).

I also jumped at the opportunity to find out more about Linux and PHP; besides I didn’t want to sign up for several free email reminder services hunting for a good one so opted simply to write my own. For the sake of anyone attempting to implement any of the features of PHP and Linux I used, here is my solution which can essentially be broken down into four parts:

1. Emailing script

It was fairly easy to find out how to send an email via SMTP in PHP so I set up a script to connect to my gmail and send an email from there to myself. After turning it all into a function, in accordance with the modular approach to development, I was ready to proceed.
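For anyone wanting a starting point, here is a minimal sketch of the "turn it into a function" step. It is not the script described above: it uses PHP's built-in mail(), which hands the message to the server's local mail transport instead of connecting to Gmail over SMTP, and the function names and array layout are invented for the example.

```php
<?php
// Sketch only: mail() relies on the server's own mail setup, not Gmail SMTP.
// build_reminder() and send_reminder() are hypothetical names.

/** Assemble the parts of a reminder email. */
function build_reminder(string $to, string $what, DateTimeInterface $when): array
{
    return [
        'to'      => $to,
        'subject' => 'Reminder: ' . $what,
        'body'    => $what . ' (sent ' . $when->format('H:i') . ')',
        'headers' => 'From: ' . $to, // sending to yourself, from yourself
    ];
}

/** Send a reminder; true means the message was accepted for delivery. */
function send_reminder(array $message): bool
{
    return mail(
        $message['to'],
        $message['subject'],
        $message['body'],
        $message['headers']
    );
}
```

Keeping the message assembly separate from the sending means the first half can be checked without a working mail setup.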

2. Scheduler

This was more difficult. PHP can’t by itself do anything on a schedule so it was necessary to delve into Debian’s scheduling system. I tried to look up how to make the damn thing work and eventually found (amongst other irrelevant info – hurry up Wolfram Alpha and add support for coders!) how to use it. It seems like cron is already pre-installed upon Debian installation, and is constantly running as a daemon. It checks a file every minute to check whether it should be executing a scheduled task. To edit this file you type ‘crontab -e’ and get a plaintext editing interface (looks like vi). The format of the file is a list of lines, each one representing a scheduled task. Each line’s format is:

[minute : int] [hour : int] [day of month : int] [month of year : int] [day of week : int] [script pathname : string]

So to run the program ‘rtorrent’ at 17:30 every 3rd of the month, you go:

30 17 3 * * rtorrent

Asterisks are, as always, wildcards. To execute a process every minute, the first 5 terms look like ‘* * * * *’. There is a slight problem with this: you can’t use vi from a PHP script (at least it’s not possible using exec). It turns out there’s an alternative way of using crontab – by importing a file. So the command looks like:

crontab foo.txt

Great. So the PHP script creates a text file containing the new line then executes a command to install that file as the cron file (note that ‘crontab foo.txt’ replaces the user’s existing crontab rather than appending to it). Eventually I set it to run the script every minute and have the script itself check whether it should be doing anything by referencing a MySQL database.

So in the end the PHP looked like this:

// Write the single cron entry to a temporary file
$fh = fopen('foo.txt', 'w') or die("can't open file");
// Run s_mail.php every minute and discard all of its output
$stringData = "* * * * * /opt/lampp/bin/php /opt/lampp/htdocs/php/ereminders/s_mail.php > /dev/null 2>&1\n";
fwrite($fh, $stringData);
fclose($fh);
// Install foo.txt as the crontab, capturing any error messages in $output
exec('crontab foo.txt 2>&1', $output);

The > /dev/null 2>&1 is just to stop it ‘helpfully’ sending email to root every time it runs (i.e. every minute) containing a log of exactly what happened.
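The every-minute check mentioned above might look something like the following sketch. The column names (hour, minute, message) are assumptions, and a plain PHP array stands in for the rows that would really come from the MySQL table.

```php
<?php
// Assumption: each reminder row has 'hour', 'minute' and 'message' columns.
// A plain array stands in for the MySQL result set.

/** Return the reminders scheduled for the given moment. */
function due_reminders(array $rows, DateTimeInterface $now): array
{
    $hour = (int) $now->format('G');   // 0-23, no leading zero
    $minute = (int) $now->format('i'); // 00-59

    return array_values(array_filter(
        $rows,
        function (array $row) use ($hour, $minute): bool {
            return (int) $row['hour'] === $hour
                && (int) $row['minute'] === $minute;
        }
    ));
}

$rows = [
    ['hour' => 10, 'minute' => 0, 'message' => 'Throttling starts'],
    ['hour' => 15, 'minute' => 0, 'message' => 'Throttling ends'],
    ['hour' => 16, 'minute' => 0, 'message' => 'Throttling starts'],
    ['hour' => 21, 'minute' => 0, 'message' => 'Throttling ends'],
];

// Cron would call the script once a minute; here the time is fixed at 21:00.
foreach (due_reminders($rows, new DateTimeImmutable('2009-06-22 21:00')) as $row) {
    echo $row['message'], "\n"; // prints "Throttling ends"
}
```

Letting cron fire every minute and keeping the schedule in the database means the crontab never needs rewriting when a reminder is added or changed.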

3. Database

It’s a fairly simple MySQL thing – nothing fancy. I wrote a nice function in PHP to tabulate the results of a query which I use in my screenshot. It’s a single table and I haven’t bothered to normalise it or anything. Nothing to see here. Move on.

4. Admin panel

This was, like with most good things, the final stage of development. I was also getting lazy and bored so it’s pretty rudimentary; I wrote it just so I don’t have to go into PHPMyAdmin to change things. It makes use of that rather neat ‘tabulate’ function that I had written which tabulates the MySQL query. The var_dump is the contents of the cron file.
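The original ‘tabulate’ function isn't shown, so the following is only a guess at its general shape: a sketch that takes rows as associative arrays (as a MySQL fetch would produce) and renders them as an HTML table, escaping each value along the way.

```php
<?php
/**
 * Render query rows as an HTML table.
 * Assumes $rows is a list of associative arrays sharing the same keys.
 */
function tabulate(array $rows): string
{
    if ($rows === []) {
        return '<table></table>';
    }

    // Header row taken from the column names of the first row
    $html = '<table><tr>';
    foreach (array_keys($rows[0]) as $column) {
        $html .= '<th>' . htmlspecialchars((string) $column) . '</th>';
    }
    $html .= '</tr>';

    // One table row per result row, with values escaped for HTML
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $html .= '</tr>';
    }

    return $html . '</table>';
}

echo tabulate([
    ['hour' => 21, 'minute' => 0, 'message' => 'Throttling ends'],
]);
```

Escaping with htmlspecialchars matters even on a private admin panel, since reminder text containing < or & would otherwise break the markup.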

Final thoughts

In hindsight, this is pretty good for two hours’ work, especially considering about half an hour was spent writing scripts to automate things from an admin panel. I’ve also actually found it quite useful (pardon the surprise) – the other day I wanted to download an episode of Lost, er, Windows 7 but the throttling period had already started and that one download would probably have pushed me over the download limit. At exactly 9pm I got an email reminding me to download so was able to watch the episode, er, install the OS that very night. Though I doubt it’ll be much use to anyone other than me since there exist systems out there that do the same thing, just much better (probably).

๏̯͡๏﴿


Songbird v Foobar

April 29, 2009

Interestingly enough I switched away from iTunes 7 and haven’t touched it ever since their highly hyped update to 8. I switched to foobar2000 which is actually a pretty awesome bit of software. I have however been constantly hearing about Songbird and its amazing features so I’ve now finally got round to installing it and testing it out. Here are my thoughts.

Foobar > Songbird

One of the reasons I switched away from iTunes in the first place was obscene memory usage. I’m not sure how iTunes 8 is with memory but I had many grievances about the performance of iTunes 7 when I used it. Testing Songbird on a decent laptop (3GB RAM, Intel Core2 Duo T8100 @ 2.10 GHz, a processor that benchmarks faster than most in its clock speed range), it took 5 seconds for the program to start up fully while foobar loaded instantly. Foobar’s memory footprint was absolutely minuscule at 10MB while Songbird required a hefty 80MB, though that’s fairly unsurprising considering its capabilities as a browser.

In terms of usability, as a foobar2000 user, I miss features like Cursor Follows Playback (and more importantly Playback Follows Cursor), complete ID3 tag control, advanced syntactical filters and fully customisable shortcut keys, for which I have yet to find Songbird extensions. Whatever the case these are minor concerns and are bound to be ironed out / provided in the long run by extensions or built in natively. However my concern is that Songbird seems directed more at less savvy / control-freak users who don’t necessarily want to use something like a RegEx string or SQL query to perform operations or filter their music – the functionality is based more around forms and buttons rather than console, debug window and command prompt. While most people probably welcome this user-friendly approach, I personally enjoy the ‘hackability’ and almost complete controllability of foobar. Of course, since Songbird is open-source a real hardcore user may prefer to hard code in mods, though I for one prefer not to have to recompile software to make it do what I want.

There are also several components which come natively with foobar (or as pre-installed plugins) such as ReplayGain (very important; Songbird’s equivalent is the ‘VolumeProfiles’ addon); minimise to tray (again critical [to me]; Songbird has the ‘MinimizeToTray’ addon); and a ‘resume playback after restart’ option (a nice touch to foobar; Songbird has an addon called ‘last track resume’).

This demonstrates the syntax of a Foobar preference element - a lot of the preferences are like this. There's just so much control

You can even control exactly what text is in the window title, status bar and system tray tooltip

Songbird > Foobar

Enough nitpicking. Songbird really does have some awesome features. Its integration with the web is very nicely done – I get the impression more or less every online music service is supported to some extent, and the whole browser integration is a brilliant idea. Foobar’s web integration comes in the form of ‘freedb’ which I assume is some sort of tags downloader though it’s never given me any vaguely sensible suggestions so isn’t very good. There’s also a mini player built in which foobar doesn’t seem to have without resorting to skinning. Ratings are native which foobar is critically missing – you have to use ‘quick tagger’ [addon]. The default iTunes interface was offputting at first but the browse library by artist/genre/album etc at the top is another feature foobar lacks but Songbird has. And, of course, Songbird is open source.

It’s interesting that Songbird was developed as an open source project thus appealing to the techies while also being amazingly pleasant to use with some of the most useful and critical features built in and vast extensibility. Someone commented Songbird is like the Firefox of media players. I can’t say I disagree.

I find the way they've built a media player around a browser quite cool and certainly in line with the whole web integration thing

Songbird has a clear iTunes-like interface and the mashTape (web integration with artist/song info, reviews, even youtube) is a pretty cool feature IMHO

Songbird, Foobar > iTunes

Despite a slow load time, Songbird wipes the floor with iTunes when it comes to performance. There was a problem with iTunes 7 in which scrolling through a large library was a misery owing to the intense slowness of just about everything. Songbird on the other hand is actually pretty snappy. And of course Foobar runs like lightning.
Both are extensible. I know there are iTunes addons etc. but both these alternatives take extensibility to a much higher level. Songbird probably uses extensions about as much as Firefox while Foobar takes extensibility to an extreme by more or less requiring them to function normally (hence the pre-installed ones).
And of course neither associates itself with a store that sells DRM music ;) So it’s all good.

Overall, based on my experience of them so far, both are far more than adequate replacements for iTunes (unless you’re a fool and actually use the iTunes store in which case your music is useless if played by anything but Apple products). Foobar even has support for iPods (not sure about Songbird). Neither has performance issues, and both are more or less customisable enough for the standard user. If you’re after an easy and pleasant-to-use player with an automatically decent-looking interface with truly wonderful web integration, go download Songbird. If you’re a control-freak in search of hackability and control almost to the extent of writing your own RegEx (and also a completely no-nonsense player), foobar’s the one for you. On the other hand if you want a program that is slow, memory-hogging and defaults to buying music from a store with hideous DRM, go ahead and download iTunes.

๏̯͡๏﴿


How to Download Youtube Video/Audio

April 21, 2009

There are lots of tools out there claiming they can stream Youtube video and audio to file. I’ve tried to use many different ones but most seem to fail either at the downloading stage or the playback stage, i.e. they produce unreadable output files. Even worse is when you end up with an FLV file which only seems playable in VLC. I think I’ve found the perfect solution which can stream Youtube to an AVI file (properly, using a standard algorithm and producing a fully legitimate file) which can subsequently be converted into any format at all, be it audio, video or whatever.

1. Download

The best solution in my opinion is VDownloader. It provides a no-nonsense interface with settings for the use of a proxy, AVI codec, and download directory. It’s freeware of course, with a small encouragement at the bottom of the window to donate. To use it, simply paste the Youtube URL into the URL box and hit download. It first downloads it as a file without extension before converting it to AVI.

A potential alternative is Orbit Downloader with Grab++ which can grab streamed flash off any website. While it will work without fail, you end up with an FLV file which for some reason only VLC media player seems to be able to play and recognise as any sort of legitimate video file, preventing the next step from working. I thus never managed to get VLC to stream anything to a file. Perhaps it uses an outdated / incorrect .flv format.

2. Convert

If you feel the need to convert it from AVI to, say, an audio format like MP3 (some very rare classical recordings only exist on Youtube for some reason, but I’m not sure about the legality of downloading them to local storage), by far the best all-purpose media converter I’ve come across is SUPER. Frankly I find the interface revolting but the functionality is truly wonderful and more than makes up for looks – just take a look at their site to see what features are on offer (the site is also not very pretty…). Again, it’s all free.

To use it, first install then run the program. You’ll be confronted with a hideous UI. Select the top-left drop down menu and select the target format (e.g. mp3). If necessary change the codec (top right drop-down). In the blue ‘Audio’ rounded rectangle set your preferences for the output stream. Drag and drop the AVI you downloaded with VDownloader into the grey file list box near the bottom. Tick the file you want converted and press ‘Encode’. It might take a while (and the window sometimes freezes) but chances are it’ll complete and you’ll have an mp3 waiting in the output folder. The default is C:\Program Files\SUPER\OutPut\

So there you have it – that’s the best way I’ve found to download youtube content and convert it into any format desired. I’ll leave the legality of this for you to decide / work out / research – if you download music from Youtube and it turns out to be illegal, don’t blame me!

Good hunting ;)

๏̯͡๏﴿


The Pirate Bay Situation

April 19, 2009

It’s big (and by now fairly vintage) news in the torrenting and general technology community that a verdict has been reached for the lawsuit against The Pirate Bay’s four founders. I won’t say much about the gory details of the trials – there are plenty of articles on good websites that will give you all sorts of facts; I’m just going to state some of my opinions on the matter. In case you don’t already know, the verdict was a jail sentence and a $3.6M fine.

Firstly my thoughts on file sharing in general. BitTorrent is used for a whole host of good things – I’ve used it on multiple occasions to grab up-to-date linux distributions and it’s a fantastic way to download without limitations on server upload speeds (private trackers have exceptionally high ratios and speeds but linux and other open source stuff tends to download fast as well even on public trackers). There’s also the whole debate about whether or not piracy really does harm the economy as much as Sony would like us to think. But personally I think there’s no hope for companies trying to shut down piracy because it stems directly from the entire point of the internet: sharing information. If torrenting somehow gets shut down (an incredibly unlikely scenario), an alternative P2P system will immediately spring up to replace it, and there are a great many out there waiting to be exploited. But basically what I’m saying is that an attempt to target the infrastructure of filesharing is just a pathetic way for companies to seek some sort of revenge for probably mostly imagined and definitely largely over-hyped and bloated losses.

It is transparently obvious that the trial is much bigger than just The Pirate Bay – the verdict poses a threat to the entire community of file sharing. The Pirate Bay may have had certain special circumstances that made this verdict even vaguely plausible: something to do with Sweden possibly. But the verdict has set an incredibly dangerous precedent – if the team really end up facing significant jail time and massive fines, it would serve as a massive deterrent to anyone even considering starting up a novel platform for sharing, be it open-source software, ideas or whatever. My opinion is that the entire spirit of a collaborative internet is being broken apart piece by piece, while the pirates will still always find a method of sharing illegally obtained and distributed material. The supposedly illegal side always tends to be far more determined to keep sharing than the average supposedly law-abiding person who is probably fairly ambivalent anyway about whether or not to share those photos on Flickr.

There’s also a huge amount of wastage. I noticed Isohunt have put a notice on their front page linking to some legal material. I wouldn’t be surprised if other trackers are calling their lawyers right now, preparing for a legal assault on their communities. I’m not saying lawyers’ pay is waste, but the sheer amount of effort and time going into nit-picking against a multi-corporate legal mob in front of an unconvinced and generally non-tech-savvy jury seems to me at least a somewhat inefficient use of resources.

And to keep everything in perspective, the recording industry are fighting against a phenomenon they themselves are helping to create. The measures being adopted to prevent piracy such as music DRM make life a misery for law-abiding citizens who pay for their music; for example iTunes forced all its customers to re-download and thus re-buy all their music just to (supposedly) remove one layer of DRM from audio files. All this hassle actually makes pirated music of higher quality than purchased music, a ridiculous situation created by companies like Apple. How can anyone blame me if I decide to download a torrent of a few songs (which I’ve already paid for) just to be able to play them in something other than iTunes?

In my opinion, it will become increasingly difficult in the future to download plain DRM-free music and films, and indeed the risk of being caught doing so will probably increase, as will the penalty. The current trend is that more and more companies are getting involved – once it was just bodies such as the MPAA and RIAA who were targeting file sharers. Then more private companies joined in for the money such as MediaDefender, and now even ISPs and governments have joined the witch-hunt. If you want my take on this, I suggest that if you already download and share pirated material, do so while you can and max out on it; the window of opportunity to get hold of clean untrackable media may well be closing.

Good hunting ;)

๏̯͡๏﴿


Copyright Infringement

March 18, 2009

This is in fact my Economics essay for the Barnett Prize. The person who marked it thought it was a terrible essay (fair enough), but I still think it’s interesting to discuss. So here it is, however horrific an essay, after being cut down to exactly 1500 words, and with a less bellicose attitude towards the RIAA than normal (for the sake of being PC). It’s entitled “Steal this Essay”. This is also an experiment to see how well WordPress’ ‘import from Word’ feature works. I have to say, I’m impressed by how it deals with footnotes and citations/bibliographies.

Using a specific microeconomic case study from either the UK or abroad, assess how governments can deal with market failure.

Copyright infringement is a growing concern in the music and film industry. Despite the best efforts of governments, private firms, law-enforcement agencies and cyber-police, the amount of music and film in illegal circulation over the internet has grown at an exponential rate since the conception of Napster, the first peer-to-peer (P2P) file sharing service to hit the internet. According to Ars Technica (1), usage of BitTorrent, a popular tool used by file sharers, grew 24% in five months, and BitTorrent is apparently responsible for 80% of the world’s traffic. Global music sales dropped from $38bn in 1999 to $32bn in 2003 and American studios reported $2.3bn in losses to film piracy in 2005 alone (2). Free market forces are failing: talent starved of adequate funding will fail to flourish and music, entertainment and culture may be eroded as a result. Without consumers willing to pay for goods produced by talented artists, such artists will be unable to invest profits in creating more music (they may opt to record fewer CDs), and supply of the product decreases, increasing its cost which in this case further exacerbates losses leading to a vicious cycle. With revenue from their music falling, artists may choose to switch careers, discontinuing their contributions to music: the labour supply decreases, as does the pool of musical talent. Culturally, the negative externalities of this are highly significant, resulting in a shallower music culture.

In my opinion, the primary reason for file sharing is simply convenience. The internet allows even very large amounts of data to be transferred at zero cost and supersonic speeds; the temptation to cheat the system becomes irresistible: the opportunity cost of buying a CD from a local store, which involves both monetary as well as time sacrifices, is far greater than the single mouse click it takes to download the entire album. Pirated media can be seen as a substitute good for which consumers pay with risk of getting caught rather than with conventional currency; industry is losing out to this illegal competition.

Existent government measures have proven ineffectual at best. For a while law enforcement worked: the RIAA[1] and MPAA[2] succeeded in intimidating file sharers into accepting monetary settlements out of court, thereby recouping losses and deterring potential copyright criminals. However when exonerated file sharers began to sue the RIAA back (3), subsequent copyright lawsuits became somewhat anathematised and digital rights organisations procured an increasingly unpopular and disrespected reputation for aggressive methods (as illustrated). One of the primary difficulties is the issue of evidence: piracy is exceedingly difficult to detect with ever-advancing encryption technology. Taking the current trial of The Pirate Bay[3] as an example of ineffectuality, on the second day of the trial half the charges have already been dropped (4) (at the time of writing this). To make things worse, the music industry’s attempts to cover itself from piracy only punish the law-abiding: employment of digital rights management on music by online retailers such as Apple renders the files unusable with anything but iTunes and iPods, causing frustration, heavily discouraging the buying of music.

The government has attempted to enforce the law through Internet Service Providers (ISPs): since they provide the means to perform illegal activities, perhaps it should be their responsibility to police their networks. In 2008, the UK’s six largest[4] ISPs[5] agreed (5) to a code laid out by the government: if an ISP has reasonable evidence upon which to suspect a customer of illegally downloading music, it will throttle his download speeds significantly. This appears promising: a large proportion of the country is now under surveillance by ISPs and users have an incentive to stop file sharing; Virgin Media even sent out warning letters to several hundred of its file sharing customers. However the government is pitting itself directly against the free market forces: as pointed out by the Wired article (5), it is far more probable that they were merely taking measures against losing customers: firms appreciate the revenue from them. In addition, ISPs abiding by this code are likely to lose business owing to a pervasive sense of intrusion from being constantly monitored. Besides, imposing regulations on businesses raises supply costs (employing a monitoring team for example), shifting the supply curve leftwards, resulting in a more expensive, and less abundant, good or service:

[Insert bog standard Economics AS/AD diagram]

One suggestion was for the government to accept the fact that internet users will share files, and rather than fight this unwinnable war, to tax broadband usage and return tax revenue to industry, thus compensating for the market failure. Unsurprisingly this has been met with much fury: not only does this demonstrate great cynicism and mistrust on the government’s part, but it may actually exacerbate the problem: consumers might decide that since they have paid for their illegal downloads they are entitled to download copyrighted material.

Alternatively, similarly to using disturbing television advertisements about lacking TV licences, the government could attempt to threaten the population into submission. There is good evidence that advertising works with anti-smoking campaigns, so there appears to be a high probability of success with this measure. Again the government is no longer working against the free market: it is rather injecting information into the market and allowing consumers to make a better-informed decision. Unlike regulation, such measures preserve human rights and can be highly effective in combination with other measures. Unfortunately, unlike the case of smoking and even TV licences, a savvy file sharer knows he can hide his activities indefinitely: threatening advertisements do not work for such people (who also tend to be the heaviest sharers). Education may eventually curb the problem through generations of law-abiding citizens, but such slow-acting measures may not be sufficiently effective in the short run to avert cultural erosion.

In September 2007, it was discovered that a large (needless to say illegal) cyber-offensive was being planned (6) against The Pirate Bay by MediaDefender[6] in an attempt to halt file sharing activity. Although it was discovered and averted, perhaps such measures are the only ones that will work: aggressive attacks on the central hive of activity; this is after all what governments are accustomed to doing when dealing with terrorists and criminals. However, despite a worldwide offensive on terrorism, ever since September 2001, little, if not negative, progress has been made against it; what guarantees the success of an offensive against a worldwide network of highly intelligent anonymous criminals?

Perhaps to understand fully the nature of this market failure, one should reconsider the extent and type of damage done by file sharing, and also take into account its positive aspects. According to US District Judge James P. Jones (7), ‘17,000 illegal downloads don’t equal 17,000 lost sales’. If a customer wants some music but is not prepared to pay the price quoted on Amazon (indicating he is not willing and able to pay for it), he would not be in the market in the first place, so the music industry should be indifferent to whether he downloads that music illegally in the end. Of course this line of reasoning cannot be extended too far, but the point is that not every illegal download harms industry. In fact there was a study (8) (English synopsis (9)) commissioned by the Dutch government which concluded that file sharing contributes €100m per year to the Dutch economy. Apparently much downloaded content becomes treated as sample content to be bought later, and downloaders tend to buy more games than non-downloaders (possibly an effect of exposure to online marketing). In addition, the positive externalities of file sharing include broader cultural wealth. By annihilating file sharing, the government would lose out on positive externalities as well as negative ones. The report concludes that most losses can be attributed to things other than piracy, including competition with other forms of entertainment. Notably in the Netherlands, downloading media for personal use is legal. Perhaps an unconventional solution to the market failure is to legalise file sharing, thus maximising the social benefits and accepting the (minimal) social costs. Besides, surveys show that 80% of British people would be in favour of this measure (10).
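
The judge's point can be illustrated with a rough sketch. The Python below counts a download as a lost sale only when the downloader's willingness to pay is at least the retail price. The willingness-to-pay figures and the price are invented purely for illustration and are not drawn from any of the sources cited here.

```python
def lost_sales(willingness_to_pay, price):
    """Count downloaders who would actually have bought at the given price."""
    return sum(1 for w in willingness_to_pay if w >= price)

# Hypothetical willingness to pay (in pounds) of ten people who downloaded an album.
downloaders = [0.0, 0.0, 1.5, 2.0, 3.0, 4.0, 5.0, 8.0, 10.0, 12.0]
price = 8.0  # hypothetical retail price of the album

sales_lost = lost_sales(downloaders, price)
revenue_lost = sales_lost * price

print(f"{len(downloaders)} downloads, {sales_lost} lost sales, "
      f"£{revenue_lost:.2f} lost revenue")
```

Under these made-up numbers only three of the ten downloads displace a sale, which is the sense in which the number of downloads overstates the loss to industry.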

In conclusion, business has taken a radical new direction since the concept of ‘free goods’ first appeared. Google provides millions with enormously powerful search facilities for free and receives its revenue largely from advertising. According to Wikinomics (11), ‘free and collaborative’ (complete with externalities) is the future, whether we like it or not. I believe governments should embrace this future and work with the market, rather than fight it. Whatever the solution to the problem of copyright infringement eventually turns out to be, I suspect, and hope, that the RIAA and MPAA will not be closely involved, that free market tools will be capitalised upon, and that the positive externalities of change will be fully appreciated.

Bibliography

1. Bangeman, Eric. BitTorrent use soars as MPAA fights on against P2P sites. ars technica. [Online] 17 04 2008. http://arstechnica.com/old/content/2008/04/bittorrent-use-soars-as-mpaa-fights-on-against-p2p-sites.ars.

2. File sharing. Wikipedia. [Online] http://en.wikipedia.org/wiki/File_sharing.

3. Oregon RIAA Victim Fights Back. Recording Industry vs The People. [Online] http://recordingindustryvspeople.blogspot.com/2005/10/oregon-riaa-victim-fights-back-sues.html.

4. enigmax. 50% of Charges Against Pirate Bay Dropped. TorrentFreak. [Online] 17 02 2009. http://torrentfreak.com/50-of-charges-against-pirate-bay-dropped-090217/.

5. Buskirk, Eliot Van. British ISPs Agree To Curb File Sharers’ Internet Access. Wired. [Online] 23 07 2008. http://blog.wired.com/music/2008/07/uk-could-announ.html.

6. Leyden, John. Pirate Bay sues media giants for ‘sabotage’. The Register. [Online] 24 09 2007. http://www.theregister.co.uk/2007/09/24/pirate_bay_counterstrike/.

7. Cheng, Jacqui. Judge: 17,000 illegal downloads don’t equal 17,000 lost sales. ars technica. [Online] 19 01 2009. http://arstechnica.com/tech-policy/news/2009/01/judge-17000-illegal-downloads-dont-equal-17000-lost-sales.ars.

8. Ups and downs – Economische en culturele gevolgen van file sharing voor muziek, film en games. TNO. [Online] 2009. http://tno.nl/content.cfm?context=markten&content=publicatie&laag1=182&laag2=1&item_id=473.

9. Ernesto. Economy Profits From File-Sharing, Report Concludes. TorrentFreak. [Online] 19 01 2009. http://torrentfreak.com/economy-profits-from-file-sharing-report-concludes-090119/.

10. Orlowski, Andrew. 80% want legal P2P – survey. The Register. [Online] 16 06 2008. http://www.theregister.co.uk/2008/06/16/bmr_music_survey/.

11. Tapscott, Don and Williams, Anthony D. Wikinomics. London : Atlantic Books, 2007. ISBN 978-1-84354-637-5.

12. P2P Survey Results. In HIIT. [Online] 2007. http://inhiit.blogspot.com/2007/09/p2p-survey-results.html.


[1] Recording Industry Association of America

[2] Motion Picture Association of America

[3] A large BitTorrent tracker and hub of piracy

[4] Virgin Media, Sky, Carphone Warehouse, BT, Orange and Tiscali

[5] Internet Service Providers

[6] Anti-piracy company


Cambridge Eng + Comp Sci Lectures

February 28, 2009


Today was the fourth time I’ve been to Cambridge (bringing my total by the end of this year to six, as I mentioned here), this time for another CareersMCS event about engineering and computer science. Like both the other conferences organised by the same people, I ended up leaving somewhat inspired, and wanting to do all the courses they talked about. As one of my friends put it, now I don’t want to be a banker/medic/programmer/whatever – I just want to be a student all my life. As Chef from South Park put it, “there’s a time and place for everything … and it’s called college”.

The day started at silly o’clock when I woke up, arriving in time for a 9:45 start. After an intro (at the beginning of which all parents in the hall were, as always, assertively invited to leave – presumably to allow us to feel free to ask questions) we launched straight into Aerospace Engineering.

I feel I took away something new, interesting and possibly useful from each of the nine lectures today. Aerospace was all about fluid dynamics and how aerofoils generate lift, footballs spin and paper aeroplanes do ‘loop-de-loops’. The speaker quickly dismissed the highly yesteryear longer path theory (and to think I was misled by the Science Museum!) and moved onto explaining exactly what Cambridge professors think is going on. It’s all about bends in streamlines creating pressure gradients and subsequent forces. Theories involving Bernoulli’s speed-pressure relationships are also apparently flawed. Fluid dynamics is one of the Physics topics tragically lost in the teaching of AS Physics (*very* casually touched upon by KPZ when we were calculating electron drift speeds in wires) so it was certainly something novel for me to see.

The speaker touched on the intriguing Physics behind shock fronts and sonic booms

Computer science was largely familiar to me – it was on security: mostly cryptography and corresponding cryptanalysis, but also briefly touching on stego, social engineering and PGP certs. It’s interesting to note that contrary to what I might have expected a year ago, the course places little emphasis on learning programming languages such as C++ well, but far more on simply working out algorithms and computational methods for solving problems. Having now done some (sort of) formal algorithm training (and being aware of the syntax of many different types of languages), I’m seeing the language divide beginning to melt away and starting to see the use of teaching, say, Dijkstra’s Algorithm in pseudocode rather than C++. And of course, it was no surprise that Mathematics (particularly linear algebra) played a pretty hefty role.

Chemical engineering was quite a new one for me since I hadn’t a clue what it was all about beforehand. Turns out it’s essentially process engineering. After this we were given a time-lag-style summary of a first-year student’s perspective on applying to engineering at Cambridge. Following lunch was an intriguing talk on mechanical engineering involving some familiar circular motion revision and a pretty awesome demo of a Morrison Shelter, with the speaker’s mobile phone as collateral (sitting inside a flimsy-looking metal-wire shelter as a 5 kg weight was dropped on it)! This was followed by an excellent demo-rich talk on electrical engineering (I was delighted to be able to recognise an AS-style power amp, albeit one with three concatenated push-pull amplifiers [power!]) which revised some electromag physics (dPhi/dt, I(lxB), q(vxB) etc). There were also some nice experiments with a big coil and an iron rod going through it – though I felt rather sad about being so delighted by seeing Physics demonstrated so wonderfully. Maybe I’m just weird that way…

Morrison Shelter. It used concepts of plasticity to reduce damage to occupants (like car crumple zones)

The talk on applications of IT to engineering was highly interesting theoretically though the program appeared depressingly slow – crunching a few matrix equations would have taken MATLAB about five seconds at the most. The speaker left his program running for a good half of his talk before it finally finished… The guy in charge gave a talk on applying and UCAS and other such helpful stuff in which he cracked the same joke as he does for every single one of these: apparently someone wrote on his personal statement [paraphrased]:

I do lots of music. I enjoy playing the flute; sometimes I play with myself

The day was concluded with a talk on civil engineering, though by that time my brain was complaining about lack of sleep.

The thing I love about studying things like Physics and Maths is that every now and then as I learn new topics and areas in the subjects, I come across proofs and lines of logical thinking and generally ‘things’ which just make me stop and think, ‘oh yeah – never thought about it that way before’, and marvel at the genius of whoever came up with it (for example OLCT told us about an intriguing interpretation of regression lines involving dot products of vectors). Maths and Physics have intriguing subtleties which never fail to inspire. I still think after today that I’ll go with Physics/Maths. Engineering subjects are interesting and probably far more directly useful practically speaking. But there were moments in the lectures when I felt the subjects were more superficial than I would like. There’s just something about delving deep into a subject and finding something surprising and counter-intuitive yet logically beautiful that makes me want to find out more. I must be *really* weird…

Again, as I was walking around Cambridge, I observed huge contrasts in the types of people around. There’s something I love about students – they make up the most diverse age group: the norm is to be radically different and anti-trendy. The town is simply buzzing full of life, from street musicians (pretty good ones as well!) to punters (literally), to human rights demonstrators sitting in cages outside King’s. This time we also got to see Corpus, Trinity and St John’s all on the same street, the three colleges about which I know enough to want to apply to. Cambridge is simply an awesome place; there’s no question about it.


End of the Road for CAPTCHAs

February 23, 2009

Signing up to a forum the other day, I couldn’t help noticing the increasingly pervasive presence of anti-spam and anti-bot modules built in to online registration forms.

The problem with CAPTCHAs in my opinion is simply that they are stressful to use for a modern end-user. Research (c.f. some New Scientist article that I read) indicates online users have over several years become far more ‘aggressive’ in that website visits are increasingly about simply extracting information and leaving: people are becoming more ‘hit-and-run’ as opposed to ‘come join our wonderful community at Experts Exchange’! This speed and efficiency (which I think is largely attributable to Google) of surfing however goes completely out the window when it comes to CAPTCHAs: rather than simply clicking ‘OK’, users are forced to squint at a series of curly, obfuscated, low-contrast, specked, half-obscured characters, often only to be told they’ve typed the wrong characters and forced to repeat the whole unpleasant experience. CAPTCHAs break the line of thought and direction of a user, serving merely to annoy him/her; most people I know only ever think of CAPTCHAs as infuriating and time-consuming wastes of space. They’re sometimes even vaguely funny!

There’s actually an argument that the whole concept of forcing users to read half-illegible characters is doomed anyway. There are very, very good programmers and hackers out there who are working towards better text-recognition software. ABBYY Finereader is one of the leading Optical Character Recognition (OCR) programs out on the market, and Evernote wasn’t deceived even by my handwriting – technology has surpassed most of the humans I know! I am quite convinced that there will come a day on which computers and bots will eventually become better than humans at recognising characters and completing optical identification tests. I’m pretty sure CAPTCHAs seem to get harder and harder to the (ridiculous) point at which it takes several attempts to get one right, for example the ‘cat’ CAPTCHA at Rapidshare:

I became an expert at doing these and I assure you the answer is KOI8

and this one which I think is a fake but is far too good not to include:

So CAPTCHAs are ultimately heavily limited in their applications, especially considering computers are growing in power at an exponential rate (Moore’s Law). Alternatives often offered to CAPTCHAs involve having the characters read out through the speakers. Unfortunately I think speech-to-text software is also at a highly advanced stage (so long as Vista’s attempt at it is ignored).

So, have people had any other workable anti-bot systems? I certainly think so. In fact the example I gave near the top was a good example of such an alternative: a Turing test. Programs which have the ability to ‘parse’ English and make decent ‘sense’ out of it are still things of science fiction. Admittedly there are some very limited versions of artificial intelligence, however those tend to cost rather a lot more than the average spammer can afford, and your average bot is still a long way from answering questions like “is ice hot or cold?”

BT: is ice hot or cold?
SmarterChild: People always ask me if there’s a smarter bot than SmarterChild. Well, now there is! Add wilma@live.com to your Contact List to see what I mean!

During the course of our conversations you may see sponsored messages. These messages have the word “sponsorship” at the end. Sponsors help me remain free of cost to you. To learn more about sponsored messages, click here —> http://web.smarterchild.com/ct?xCkO/0

Huh?
BT: is ice hot or cold?
SmarterChild: I’m sorry, I don’t know what you’re saying.

———-
*Adds wilma@live.com as a contact*
———-

BT: is ice hot or cold?
Wilma: I am sorry, I didn’t get that.

I’m just getting started, so I can’t answer everything yet. I’m getting smarter all the time!

Would you like to try a search for “is ice hot or cold”?
BT: you know what, never mind…
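
To make the question-based idea a bit more concrete, here's a minimal sketch in Python of how a signup form might check such an answer. The question bank, the function names and the way answers are tidied up are all made up for illustration; none of it is taken from any real forum software.

```python
import random

# Hypothetical question bank: each question maps to the set of accepted answers.
QUESTIONS = {
    "Is ice hot or cold?": {"cold"},
    "What colour is grass: green or purple?": {"green"},
    "How many legs does a cat have?": {"4", "four"},
}

def normalise(answer: str) -> str:
    """Strip surrounding whitespace and trailing punctuation, then lower-case."""
    return answer.strip().strip(".!").lower()

def pick_question(rng: random.Random) -> str:
    """Choose one question at random to show on the form."""
    return rng.choice(sorted(QUESTIONS))

def check_answer(question: str, answer: str) -> bool:
    """Return True if the answer is one of the accepted answers for the question."""
    return normalise(answer) in QUESTIONS.get(question, set())

question = pick_question(random.Random())
print(question)
print(check_answer("Is ice hot or cold?", "  Cold. "))
```

The obvious weakness is that a small fixed list like this can simply be memorised by whoever writes the bot, so a real site would need a large or frequently changing pool of questions.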

I have to say though, my favourite proposed alternative is this one:

Perhaps the people who designed it were attempting to guarantee their site an intelligent user-base!

So anyways, I think the CAPTCHA is near the end of its life. When confronted with a picture of what appears to be a random doodle and expected to use their artistic licence to deduce some vague form of meaning from it, I’m pretty sure many people in a hurry would be annoyed, waste time getting the CAPTCHA wrong multiple times and end up losing track of what they were supposed to be doing when signing up/posting in the first place. I hope the future holds something easier to use for users of t’internet, much as I love the idea of getting users to solve Maths questions before allowing them to post comments. Perhaps ultimately Man will lose his battle against the machine and whatever anti-bot system ends up being put into place will be solved faster and with greater accuracy than humans! For now though, most machines powerful enough to perform complex human intellectual feats in any reasonable amount of time occupy several hundred square metres of floorspace in labs at NASA and IBM. I sure hope NASA aren’t secretly spammers…

Interestingly, a productive use for CAPTCHAs has actually been found. For those who still haven’t come across them, reCAPTCHA systems have been deployed all over the web. As stated on the reCAPTCHA website:

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.
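
The quote doesn't spell out how an answer can be checked when nobody knows what the word says. As far as I understand it, the trick is to show the unreadable word next to a control word whose answer is already known: get the control word right and your guess at the other one is counted as a vote. Here's a rough Python sketch of that idea. The class name, the vote threshold and the example words are my own assumptions for illustration, not reCAPTCHA's actual code or settings.

```python
from collections import Counter

def normalise(text: str) -> str:
    """Ignore case and surrounding whitespace when comparing words."""
    return text.strip().lower()

class WordPairChallenge:
    """One known 'control' word shown alongside one word the OCR could not read."""

    def __init__(self, control_word: str, votes_needed: int = 3):
        self.control_word = normalise(control_word)
        self.votes_needed = votes_needed  # hypothetical agreement threshold
        self.votes = Counter()            # guesses collected for the unknown word

    def submit(self, control_guess: str, unknown_guess: str) -> bool:
        """Return True if the user passes; record their guess for the unknown word."""
        if normalise(control_guess) != self.control_word:
            return False                  # failed the word we can actually verify
        self.votes[normalise(unknown_guess)] += 1
        return True

    def transcription(self):
        """Return the agreed reading of the unknown word, or None if undecided."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.votes_needed else None

challenge = WordPairChallenge("morning", votes_needed=2)
challenge.submit("Morning", "upon")
challenge.submit("morning", "Upon")
print(challenge.transcription())
```

Only users who pass the control word get a say in the transcription, so a bot typing rubbish doesn't pollute the digitised text.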

CAPTCHAs were useful once, back in the year 2000 when computers ran on processors just about capable of 7 flops and could barely run Fortran compilers (yes I am being ironic). In fact, they were quite a good, and rather an ingenious, solution to the problem: before they began to get ridiculous they were convenient to complete and only ever took about 2 seconds, and computers were more or less baffled by even slightly complicated/curly/irregularly typed/written text versions. But in a modern world I think the whole process of exacerbating myopia is bloated, unnecessary and pointless. But then maybe that’s just me!

๏̯͡๏﴿