Why node.js

I’ve moved from Python to Javascript on the server side – specifically, Tornado to Node.js. Three years ago, I moved from Perl to Python because I got free hosting at AppEngine. Python’s a cleaner language, but that was not enough to make me move. Free hosting was. Initially, my apps were on AppEngine, but that wouldn’t work for corporate apps, so I tried Django. IMHO, Django’s too bulky, has too much “magic”, and templates are restrictive. Then I tried Tornado: small; independent modules; easy to learn. I used it for almost 2 years. ...

HTML 4 & 5: The Complete Reference

HTML 4 & 5: The Complete Reference is an iPhone / iPad app that does exactly what it says: a reference for HTML 4 and 5. It has a list of all tags, clearly demarcated as HTML4, HTML5 or both. The application is fairly easy to scroll through to find the tag or attribute you want.

Clicking on a tag, you get:

a brief description of what it’s for
what attributes are valid – the good part is you can see clearly which attributes are specific to the element, and which ones are common (like class, id, etc.). You can also see the possible values for the attribute, which helps.
and an example of how the tag is used. The examples are quite simplistic, and there’s only one per tag, but it does have a rendered version of the code, which helps.

You can also scroll through the list of attributes and see which tags they’re valid for. ...

Yahoo Clues API

Yahoo Clues is like Google Insights for Search. It has one interesting thing that the latter doesn’t though: search flows. It doesn’t have an official API, so I thought I’d document the unofficial one.

The API endpoint is http://clues.yahoo.com/clue

The query parameters are:

q1 – the first query string
q2 – the second query string
ts – the time span. 0 = today, 1 = past 7 days, 2 = past 30 days
tz – time zone? Not sure how it works. It’s just set to “0” for me
s – the format? No value other than “j” seems to work

So a search for “gmat” for the last 30 days looks like this: ...
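To make the parameters concrete, here is a rough Python sketch that assembles a query URL from them. It is not the URL from the post itself: the function name clues_url is made up for illustration, and leaving q2 out of a single-term search is an assumption about how the unofficial endpoint behaves.

```python
from urllib.parse import urlencode

# Unofficial endpoint quoted in the post; it may no longer respond.
ENDPOINT = "http://clues.yahoo.com/clue"


def clues_url(q1, q2=None, ts=2, tz=0, s="j"):
    """Build a Yahoo Clues query URL (hypothetical helper).

    ts: 0 = today, 1 = past 7 days, 2 = past 30 days
    tz: time zone; the post only ever saw "0"
    s:  format; only "j" seemed to work
    """
    params = {"q1": q1}
    if q2 is not None:  # assumption: q2 is optional for single-term searches
        params["q2"] = q2
    params.update({"ts": ts, "tz": tz, "s": s})
    return ENDPOINT + "?" + urlencode(params)


print(clues_url("gmat"))          # one term, past 30 days
print(clues_url("gmat", "gre"))   # comparing two terms
```

urlencode takes care of escaping, so multi-word queries are safe to pass in as-is.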

Automated image enhancement

There are some standard enhancements that I apply to my photos consistently: auto-levels, increase saturation, increase sharpness, etc. I’d also read that Flickr sharpens uploads (at least, the resized ones) so that they look better. So last week, I took 100 of my photos and created 4 versions of each image:

The base image itself (example)
A sharpened version (example). I used a sharpening factor of 200%
A saturated version (example). I used a saturation factor of 125%
An auto-levelled version (example)

I created a test asking people to compare these. The differences between these are not always noticeable when placed side-by-side, so the test flashed two images at the same place. ...

Shortening sentences

When writing Mixamail, I wanted tweets automatically shortened to 140 characters – but in the most readable manner. Some steps are obvious. Removing redundant spaces, for example. And URL shortening. I use bit.ly because it has an API. I’ll switch to Goo.gl, once theirs is out.

I tried a few more strategies:

Replace words with short forms. “u” for “you”, “&” for “and”, etc.
Remove articles – a, an, the
Remove optional punctuation – comma, semicolon, colon and quotes, in particular
Replace “one” with “1”, “to” or “too” with “2”, etc. “Before” becomes “Be4”, for example
Remove spaces after punctuation. So “a, b” becomes “a,b” – the space after the comma is removed
Remove vowels in the middle. nglsh s lgbl wtht vwls.

How did they pan out? I tested these on the English sentences in the Tanaka Corpus, which has about 150,000 sentences. (No, they’re not typical tweets, but hey…). By just doing these, independently, here is the percentage reduction in the size of text: ...
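A few of the strategies above can be sketched with regular expressions in Python. This is only an illustration, not Mixamail’s actual code: the function names are invented, the vowel rule strips every vowel (which is what the “nglsh s lgbl wtht vwls” example suggests), and “quotes” is taken to mean double quotes only so that apostrophes survive.

```python
import re

ARTICLES = re.compile(r"\b(a|an|the)\s+", re.IGNORECASE)


def collapse_spaces(text):
    """Remove redundant spaces."""
    return re.sub(r"\s+", " ", text).strip()


def remove_articles(text):
    """Drop a, an, the (along with the space that follows them)."""
    return collapse_spaces(ARTICLES.sub("", text))


def remove_optional_punctuation(text):
    """Drop commas, semicolons, colons and double quotes."""
    return re.sub(r'[,;:"]', "", text)


def tighten_punctuation(text):
    """Remove the space after punctuation: 'a, b' becomes 'a,b'."""
    return re.sub(r"([,.;:!?])\s+", r"\1", text)


def strip_vowels(text):
    """Remove every vowel (assumption: matches the post's example)."""
    return re.sub(r"[aeiou]", "", text, flags=re.IGNORECASE)


def percent_reduction(original, shortened):
    """Percentage by which the text shrank."""
    return 100.0 * (len(original) - len(shortened)) / len(original)


sentence = "English is legible without vowels"
shorter = strip_vowels(sentence)
print(shorter, round(percent_reduction(sentence, shorter), 1))
```

Keeping each strategy as its own function makes it easy to measure them independently, as the post does, before chaining them.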

HTML5: Up and Running

HTML5: Up and Running is the book version of Mark Pilgrim’s comprehensive introduction to HTML5 at DiveIntoHTML5.org. Whether you buy the book or read it online, it’s the best introduction to the topic you’ll find. Mark begins with the history of HTML5 (using email archaeology, as he calls it). You’d never guess that many of the problems we have with XHTML, MIME types, etc. have roots in discussions over 20 years ago. From then on, he moves into feature detection (which uses the Modernizr library), new tags, canvas, video, geo-location, storage, offline web apps, new form features and microdata. Each chapter can be read independently – so if you’re planning to use this as a reference, you may be better off reading the links kept up-to-date at DiveIntoHTML5.org. If you’re interested in learning about the features, it’s a very readable book: terse, simple, and above all, delightfully intelligent. ...

Modular CSS frameworks

A fair number of the CSS frameworks I’ve seen – Blueprint, Tripoli, YUI, SenCSS – are monolithic. What I’d like is to be able to mix and match specific components of these. For example, 960.gs has a simple grid system that I’d love to combine with the vertical rhythm that SenCSS offers. (Vertical rhythm ensures that sentences align vertically.) I’d love to have a CSS framework that just sets the fonts, for example, and touches nothing else. Or something that defines the colour schemes, and lets you change the theme like Microsoft Office does. ...

Make backgrounds transparent

This is the simplest way that I’ve found to make the background colour of an image transparent.

1. Download GIMP
2. Open your image. I’ll pick this one:
3. Optional: Select Image – Mode – RGB if it’s not RGB.
4. Select Colors – Colors to Alpha…
5. Click on the white button next to “From” and select the eye-dropper.
6. Pick the green colour on the image, and click OK

The anti-aliasing is preserved as well. ...

Shopping with Cooliris

I just put together this little demo that scrapes John Lewis’ site and creates a MediaRSS file out of it. CoolIris has got to be the best way to shop. Apart from being really pretty, it’s quite useful when you know what something looks like, but don’t quite know how to search for it. For example, I was trying to look for a headphone-microphone (you know, the ones that connect into an iPhone or a Blackberry). I didn’t have a clue what it’s called. (TRRS, if you’re interested. I found out later.) The only way I could get it was to browse the wall… ...

ImportHtml doesn’t auto-refresh

A cool thing about Google Spreadsheets is that you can scrape websites using external data functions like importHtml. It’s really easy to use. The formula: =importHtml("http://www.imdb.com/chart/top", "table", 1) imports the Internet Movie Database top 250 table on to Google Spreadsheets. Since you can publish these as RSS feeds, it ought to, in theory, be a great way of generating RSS feeds out of arbitrary content. There’s just one problem: it doesn’t auto update. There are claims that it does every hour. Maybe it does when the sheet is open. I don’t know. But it definitely does not when the sheet is closed. I wrote a simple script that logs the time at which the script was accessed, and prints the log every time it is accessed. ...
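A test script of the kind described, one that records each access time and returns the whole log on every request, could look roughly like this in Python. It is a sketch under assumptions, not the script from the post: the names record_access, LogHandler and serve are invented, and the log lives in memory rather than in a file.

```python
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer

ACCESS_LOG = []  # in-memory log; a real script would persist this to disk


def record_access(log, now=None):
    """Append the access time to the log and return the full log as text."""
    now = now or datetime.now(timezone.utc)
    log.append(now.strftime("%Y-%m-%d %H:%M:%S UTC"))
    return "\n".join(log)


class LogHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every GET is logged, then the entire log is sent back.
        body = record_access(ACCESS_LOG).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the logging server until interrupted."""
    HTTPServer(("", port), LogHandler).serve_forever()
```

Calling serve() and pointing an import formula at the page would show, on the next manual visit, exactly when the spreadsheet fetched it.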

Command line alarm

When I’m in front of my laptop, I usually forget the world around. Sadly, the world around has important things that need to get done on time. Like eating medicines, turning off the washing machine or the hob, etc. The one thing I’ve been lacking on my machine was a simple alarm system. I’d like to set an alarm to remind me to do something in 5 minutes, for example. And it should be dead simple to set up. ...
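One way such an alarm might look in Python is sketched below. It is an assumption about the shape of the tool, not the solution the post goes on to describe: the argument format (minutes first, then an optional message) and the names parse_alarm and main are invented for illustration.

```python
import sys
import time


def parse_alarm(args):
    """Turn ['5', 'turn', 'off', 'the', 'hob'] into (300.0, 'turn off the hob')."""
    if not args:
        raise ValueError("usage: alarm MINUTES [MESSAGE...]")
    minutes = float(args[0])
    message = " ".join(args[1:]) or "Time's up!"
    return minutes * 60, message


def main(args):
    """Sleep for the requested time, then print the reminder."""
    seconds, message = parse_alarm(args)
    time.sleep(seconds)
    # "\a" rings the terminal bell where the terminal supports it.
    print("\a" + message)
```

Saved as a script that ends with main(sys.argv[1:]), it could be run as “alarm.py 5 turn off the hob”.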

SSH Tunneling through web filters

You can defeat most web filters by spending 0 cents/hr (it used to be around 8 cents/hr) on Amazon EC2. (It’s usually worth the money. It’s a fraction of the cost of a phone call or a sandwich. And I usually end up wasting that money anyway on calling someone or eating my way out of the misery of corporate proxies.) Most web filters and proxies block all ports except the HTTP port (80) and the HTTPS port (443). But port 443 is used to carry encrypted traffic, and, as Mark explains: ...

Open source in corporates

Last month, my first application went live. I’ve been writing code for 20 years. Until then, not one line of my code had been officially deployed in a corporate. (Loser…) It’s a happy feeling. Someone defined happiness as the intersection of pleasure and meaning. Writing code is pleasurable. Others using it is meaningful. But this post isn’t quite about that. It’s about the hoops I’ve had to jump through to make this happen. ...

Inline form validation

A List Apart’s article on Inline Validation is one of the most informative I’ve read in a while, and it’s backed by solid data. Some useful lessons:

Inline validation can reduce form completion time by 40%
Use inline validations where the user doesn’t know if they’ll get it wrong (e.g. is a username available?). Don’t use them if the user knows the answer (e.g. their name)
Validate on blur, not on keypress (it’s distracting, and users can’t multitask)

Comments

jesse, 25 Sep 2009 4:15 pm: maybe u should add some inline validation on your comments form, instead of the wordpress error page?

Round buttons with Python Image Library

After much hunting, I finally settled on Hedger Wang’s simple round CSS links as the most acceptable cross-browser round button implementation. The minified CSS is about 2.5KB, and the syntax is very simple. To make an input button into a round button, just wrap it within a <span class="button">:

    <span class="button"><input type="submit"></span>

… and it’s just as easy to convert a link into a rounded button:

    <a class="button" href="/"><span>Home</span></a>

It works by using a transparent PNG / GIF that looks like this: ...

Error logging with Google Analytics

A quick note: I blogged earlier about JavaScript error logging, saying that you can wrap every function in your code (automatically) in a try{} catch{} block, and log the error message in the catch{} block. I used to write the error message to a Perl script. But now I use Google’s event tracking.

    var s = [];
    for (var i in err) s.push(i + "=" + err[i]);
    s = s.join(" ").substr(0, 500);
    pageTracker._trackEvent("Error", function_name, s);

The good part is that it makes error monitoring a whole lot easier. Within a day of implementing this, I managed to get a couple of errors fixed that had been pending for months. ...

Short URLs

With all the discussion around URL shorteners, Diggbar, blocking it, and the rev=canonical proposal, I decided to implement a URL shortening service on this blog with the least effort possible. This probably won’t impact you just yet, but when tools become more popular and sophisticated, it would hopefully eliminate the need for tinyurl, bit.ly, etc. Since the blog runs on WordPress, every post has an ID. The short URL for any post will simply be http://www.s-anand.net/the_ID. For example, http://s-anand.net/17 is a link to the post on Ubuntu on a Dell Latitude D420. At 21 characters, it’s roughly the same size as most URL shorteners could make it. ...
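The scheme is simple enough to express in a couple of lines. The Python below only illustrates the mapping from post ID to short URL and checks the character count. It says nothing about how the blog actually performs the redirect, and short_url is a hypothetical helper.

```python
BASE = "http://s-anand.net/"  # short form of the blog's address, as in the example


def short_url(post_id):
    """Return the short URL for a WordPress post ID (hypothetical helper)."""
    if post_id <= 0:
        raise ValueError("post IDs are positive integers")
    return f"{BASE}{post_id}"


url = short_url(17)
print(url, len(url))  # the example from the post, and its length
```

Because the ID is the only variable part, the URL grows by just one character for every tenfold increase in the number of posts.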

Automating PowerPoint with Python

Writing a program to draw or change slides is sometimes easier than doing it manually. To change all fonts on a presentation to Arial, for example, you’d write this Visual Basic macro:

    Sub Arial()
        For Each Slide In ActivePresentation.Slides
            For Each Shape In Slide.Shapes
                Shape.TextFrame.TextRange.Font.Name = "Arial"
            Next
        Next
    End Sub

If you didn’t like Visual Basic, though, you could write the same thing in Python:

    import win32com.client, sys
    Application = win32com.client.Dispatch("PowerPoint.Application")
    Application.Visible = True
    Presentation = Application.Presentations.Open(sys.argv[1])
    for Slide in Presentation.Slides:
        for Shape in Slide.Shapes:
            Shape.TextFrame.TextRange.Font.Name = "Arial"
    Presentation.Save()
    Application.Quit()

Save this as arial.py and type “arial.py some.ppt” to convert some.ppt into Arial. ...

WordPress themes on Live Writer

One of the reasons I moved to WordPress was the ability to write posts offline, for which I use Windows Live Writer most of the time. The beauty of this is that I can preview the post exactly as it will appear on my site. Nothing else that I know is as WYSIWYG, and it’s very useful to be able to type knowing exactly where each word will be. The only hitch is: if you write your own WordPress theme, Live Writer probably won’t be able to detect your theme — unless you’re an expert theme writer. ...

Client side scraping for contacts

By curious coincidence, just a day after my post on client side scraping, I had a chance to demo this to a client. They were making a contacts database. Now, there are two big problems with managing contacts.

Getting complete information
Keeping it up to date

Now, people are happy to fill out information about themselves in great detail. If you look at the public profiles on LinkedIn, you’ll find enough and more details about most people. ...