Thursday 3 June 2010

Websocket gets an update, and it breaks stuff.

Scroll to the bottom of this post for a cheat sheet of what has changed.
Unfortunately the new draft (-76) is not backward compatible. From the blog post:
"These changes make it incompatible with draft-hixie-thewebsocketprotocol-75; a client implementation of -75 can’t talk with a server implementation of -76, and vice versa."
So, let's take a look at the changes, and try to make sense of them.
The specification document is just not readable unless you want to go completely insane. Here are a few lovely bits from the document.
26. Let /key3/ be a string consisting of eight random bytes (or
equivalently, a random 64 bit integer encoded in big-endian
order).
EXAMPLE: For example, 0x47 0x30 0x22 0x2D 0x5A 0x3F 0x47 0x58.
What??? Wait. Let me read that again. a random 64 bit integer encoded in big-endian order. Sure. Make sure you don't use a random 64 bit integer encoded in LITTLE-endian order, that would completely mess up the protocol.
32. Let /fields/ be a list of name-value pairs, initially empty.
33. _Field_: Let /name/ and /value/ be empty byte arrays.
34. Read a byte from the server.
If the connection closes before this byte is received, then fail
the WebSocket connection and abort these steps.
Otherwise, handle the byte as described in the appropriate entry below:
-> If the byte is 0x0D (ASCII CR)
If the /name/ byte array is empty, then jump to the fields processing step. Otherwise, fail the
WebSocket connection and abort these steps.
-> If the byte is 0x0A (ASCII LF)
Fail the WebSocket connection and abort these steps.
-> If the byte is 0x3A (ASCII :)
Move on to the next step.
-> If the byte is in the range 0x41 to 0x5A (ASCII A-Z)
Append a byte whose value is the byte's value plus 0x20 to the /name/ byte array and redo this step for the next byte.
-> Otherwise Append the byte to the /name/ byte array and redo this step for the next byte.
NOTE: This reads a field name, terminated by a colon, converting upper-case ASCII letters to lowercase, and aborting if a stray CR or LF is found.
35. Let /count/ equal 0.
NOTE: This is used in the next step to skip past a space character after the colon, if necessary.
36. Read a byte from the server and increment /count/ by 1.
If the connection closes before this byte is received, then fail the WebSocket connection and abort these steps.
Otherwise, handle the byte as described in the appropriate entry below:
-> If the byte is 0x20 (ASCII space) and /count/ equals 1 Ignore the byte and redo this step for the next byte.
-> If the byte is 0x0D (ASCII CR) Move on to the next step.
-> If the byte is 0x0A (ASCII LF) Fail the WebSocket connection and abort these steps.
-> Otherwise Append the byte to the /value/ byte array and redo this step for the next byte. NOTE: This reads a field value, terminated by a CRLF, skipping past a single space after the colon if there is one.
37. Read a byte from the server. If the connection closes before this byte is received, or if the byte is not a 0x0A byte (ASCII LF), then fail the WebSocket connection and abort these steps. NOTE: This skips past the LF byte of the CRLF after the field.
38. Append an entry to the /fields/ list that has the name given by the string obtained by interpreting the /name/ byte array as a UTF-8 byte stream and the value given by the string obtained by interpreting the /value/ byte array as a UTF-8 byte stream.
39. Return to the "Field" step above.
Do you enjoy seeing basic HTTP header parsing code rewritten in English??? For me, it's beyond excruciating. It's verging on obfuscation. Where is the actual spec? Can we please just see an example packet dump conversation from client to server? You know, the 10 lines or so we actually need?
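For comparison, here is roughly what steps 32 to 39 boil down to as code. This is only a sketch in JavaScript for Node, not anything taken from the spec: the name parseFields, returning null to mean "fail the WebSocket connection", and the LF check after the blank line's CR (that step isn't in the excerpt above) are all my own assumptions.

```javascript
// Sketch of steps 32-39: read "name: value\r\n" fields from a byte buffer
// until a blank line. Returns null wherever the spec says "fail".
function parseFields(bytes) {
  const fields = [];
  let pos = 0;
  // -1 stands in for "the connection closed before this byte was received".
  const next = () => (pos < bytes.length ? bytes[pos++] : -1);

  for (;;) {
    const name = [];
    const value = [];

    // Step 34: field name, lowercasing A-Z, terminated by ':'.
    for (;;) {
      const b = next();
      if (b === -1 || b === 0x0a) return null;
      if (b === 0x0d) {
        if (name.length > 0) return null;
        // Blank line: end of headers. Assumed: the CR must be followed by LF.
        return next() === 0x0a ? fields : null;
      }
      if (b === 0x3a) break;
      name.push(b >= 0x41 && b <= 0x5a ? b + 0x20 : b);
    }

    // Steps 35-36: field value, skipping a single space right after the colon.
    let count = 0;
    for (;;) {
      const b = next();
      count += 1;
      if (b === -1 || b === 0x0a) return null;
      if (b === 0x0d) break;
      if (b === 0x20 && count === 1) continue;
      value.push(b);
    }

    // Step 37: the LF of the CRLF.
    if (next() !== 0x0a) return null;

    // Step 38: store the pair, interpreting both byte arrays as UTF-8.
    fields.push({
      name: Buffer.from(name).toString("utf8"),
      value: Buffer.from(value).toString("utf8"),
    });
  }
}
```

Calling parseFields(Buffer.from("Upgrade: WebSocket\r\n\r\n")) gives a single entry with the name lowercased to "upgrade" and the value left as "WebSocket".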
So, in the initial spec, things were reasonably sane, the client sent over "Hey can I be your friend and play websocket?", and the server sent back "hehe sure lets play bro". Then the two conversed using a reasonably sane binary protocol.
It seems this was judged to have potential for misuse. Presumably, if there were an insecure server out there (not HTTP, something else) that you could trick into saying "hehe sure lets play bro", you could then establish a connection to it and send it more or less arbitrary data from then on (it would have packet headers, but you might still be able to do damage).
Note that the issue here doesn't seem to be anything within the expected usage, but rather forcing a browser to connect to some other server, say a mail or IRC server, and getting it to do bad stuff.
So in this new version of the spec, they've added a simple challenge / response. I say simple, but I mean needlessly complex.
Firstly, there are 2 new headers in the request: sec-websocket-key1 and sec-websocket-key2. These contain 2 integer keys. But for some reason, those keys are interspersed with random characters!
16. Let /spaces_1/ be a random integer from 1 to 12 inclusive.
Hickson Expires November 24, 2010 [Page 21]
Internet-Draft The WebSocket protocol May 2010
Let /spaces_2/ be a random integer from 1 to 12 inclusive.
EXAMPLE: For example, 5 and 9.
17. Let /max_1/ be the largest integer not greater than
4,294,967,295 divided by /spaces_1/.
Let /max_2/ be the largest integer not greater than
4,294,967,295 divided by /spaces_2/.
EXAMPLE: Continuing the example, 858,993,459 and 477,218,588.
18. Let /number_1/ be a random integer from 0 to /max_1/ inclusive.
Let /number_2/ be a random integer from 0 to /max_2/ inclusive.
EXAMPLE: For example, 777,007,543 and 114,997,259.
19. Let /product_1/ be the result of multiplying /number_1/ and
/spaces_1/ together.
Let /product_2/ be the result of multiplying /number_2/ and
/spaces_2/ together.
EXAMPLE: Continuing the example, 3,885,037,715 and
1,034,975,331.
20. Let /key_1/ be a string consisting of /product_1/, expressed in
base ten using the numerals in the range U+0030 DIGIT ZERO (0)
to U+0039 DIGIT NINE (9).
Let /key_2/ be a string consisting of /product_2/, expressed in
base ten using the numerals in the range U+0030 DIGIT ZERO (0)
to U+0039 DIGIT NINE (9).
EXAMPLE: Continuing the example, "3885037715" and "1034975331".
21. Insert between one and twelve random characters from the ranges
U+0021 to U+002F and U+003A to U+007E into /key_1/ at random
positions.
Insert between one and twelve random characters from the ranges
U+0021 to U+002F and U+003A to U+007E into /key_2/ at random
positions.
NOTE: This corresponds to random printable ASCII characters
other than the digits and the U+0020 SPACE character.
EXAMPLE: Continuing the example, this could lead to "P388O503D&
ul7{K%gX(%715" and "1N?|kUT0or3o4I97N5-S3O31".
22. Insert /spaces_1/ U+0020 SPACE characters into /key_1/ at random
positions other than the start or end of the string.
Insert /spaces_2/ U+0020 SPACE characters into /key_2/ at random
positions other than the start or end of the string.
EXAMPLE: Continuing the example, this could lead to "P388 O503D&
ul7 {K%gX( %7 15" and "1 N ?|k UT0or 3o 4 I97N 5-S3O 31".
23. Add the string consisting of the concatenation of the string
"Sec-WebSocket-Key1:", a U+0020 SPACE character, and the /key_1/
value, to /fields/.
Add the string consisting of the concatenation of the string
"Sec-WebSocket-Key2:", a U+0020 SPACE character, and the /key_2/
value, to /fields/.
24. For each string in /fields/, in a random order: send the string,
encoded as UTF-8, followed by a UTF-8-encoded U+000D CARRIAGE
RETURN U+000A LINE FEED character pair (CRLF). It is important
that the fields be output in a random order so that servers not
depend on the particular order used by any particular client.
Oh sweet Jesus what were you smoking? So instead of just sending over a challenge, we're sending 2 challenges, interspersed with some random characters. OK, whatever. But wait, you're saying that the client should send the headers in a *random* order? That's just crazy. That's not a good solution to "Dumb ass server expects headers in specific order".
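If you'd rather read steps 16 to 22 as code, here's a rough sketch of how a client could build one of those keys. It's JavaScript, the names randomInt, insertAt and makeKey are made up for illustration, and Math.random is used purely for brevity (a real client would presumably want a better random source).

```javascript
// Random integer in [min, max], inclusive.
function randomInt(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Insert ch into str so that it ends up at index pos.
function insertAt(str, pos, ch) {
  return str.slice(0, pos) + ch + str.slice(pos);
}

function makeKey() {
  const spaces = randomInt(1, 12);              // step 16
  const max = Math.floor(4294967295 / spaces);  // step 17
  const number = randomInt(0, max);             // step 18
  let key = String(number * spaces);            // steps 19-20

  // Step 21: 1 to 12 random characters from U+0021-U+002F (15 characters)
  // and U+003A-U+007E (69 characters), at random positions.
  const junk = randomInt(1, 12);
  for (let i = 0; i < junk; i++) {
    const r = randomInt(0, 83);
    const code = r < 15 ? 0x21 + r : 0x3a + (r - 15);
    key = insertAt(key, randomInt(0, key.length), String.fromCharCode(code));
  }

  // Step 22: the spaces, anywhere except the start or end of the string.
  for (let i = 0; i < spaces; i++) {
    key = insertAt(key, randomInt(1, key.length - 1), " ");
  }

  // number and spaces are returned only so the result can be checked.
  return { key, number, spaces };
}
```

The server only ever sees key. It recovers number by stripping everything that isn't a digit and dividing by the count of spaces.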
2 challenges enough for you? Apparently not.
26. Let /key3/ be a string consisting of eight random bytes (or equivalently, a random 64 bit integer
encoded in big-endian order). EXAMPLE: For example, 0x47 0x30 0x22 0x2D 0x5A 0x3F 0x47 0x58.
27. Send /key3/ to the server.
This 8 byte random key is sent after the initial headers. Don't forget, big-endian or it won't work ;)
OK, so we have the new request from the client, with THREE challenge keys. 2 of them as headers, and 1 of them after the headers.
To make our server work with this, all we need to do is, firstly, send back the headers Sec-WebSocket-Origin and Sec-WebSocket-Location (they were WebSocket-Origin and WebSocket-Location in the previous version of the protocol), and then, after the headers have been sent, send the response to the challenge, which is md5(BIG_ENDIAN_4byte(key1) + BIG_ENDIAN_4byte(key2) + key3). Here key1 and key2 are the numbers recovered from the two headers (the digits divided by the number of spaces), and + means byte concatenation.
Here's the cheat sheet for people who have better things to do than read endless English descriptions of code:
Client sends over
* header - sec-websocket-key1 - extract numerics (0-9) from value, and convert to int base 10. Divide by number of space characters in value!
* header - sec-websocket-key2 - extract numerics (0-9) from value, and convert to int base 10. Divide by number of space characters in value!
* key3 sent straight after the initial headers. 8 bytes data.
Server response
* headers Sec-WebSocket-Origin and Sec-WebSocket-Location must be sent instead of WebSocket-Origin and WebSocket-Location
* After headers have been sent, send 16 byte md5 of (key1 + key2 + key3) (BIG_ENDIAN). That's 4 byte big endian key1, 4 byte big endian key2, and 8 byte key3.
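The cheat sheet above can be sketched in a few lines for Node. This is an illustration rather than the Mibbit server code: keyNumber and challengeResponse are hypothetical names, key3 is assumed to arrive as an 8 byte Buffer, and returning null for a malformed key is my own choice of error handling.

```javascript
const nodeCrypto = require("crypto");

// Recover the number hidden in a Sec-WebSocket-Key1/Key2 header value:
// keep the digits, then divide by the number of space characters.
function keyNumber(headerValue) {
  const digits = headerValue.replace(/[^0-9]/g, "");
  const spaces = headerValue.split(" ").length - 1;
  if (digits === "" || spaces === 0) return null;
  const n = Number(digits);
  // Reject values that don't fit in 32 bits or don't divide evenly.
  if (n > 0xffffffff || n % spaces !== 0) return null;
  return n / spaces;
}

// 16 byte MD5 of: 4 byte big-endian key1, 4 byte big-endian key2, 8 byte key3.
function challengeResponse(key1, key2, key3) {
  const n1 = keyNumber(key1);
  const n2 = keyNumber(key2);
  if (n1 === null || n2 === null || key3.length !== 8) return null;
  const buf = Buffer.alloc(16);
  buf.writeUInt32BE(n1, 0);
  buf.writeUInt32BE(n2, 4);
  key3.copy(buf, 8);
  return nodeCrypto.createHash("md5").update(buf).digest();
}
```

With the spec's own example key, keyNumber("P388 O503D& ul7 {K%gX( %7 15") picks out the digits 3885037715, counts 5 spaces, and returns 777007543. The Buffer that challengeResponse returns is what gets written to the socket straight after the response headers.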
It's fairly simple to set up your server to support both WebSocket versions. If the new key1/key2 headers are present, proceed with the new version. Else use the old.
Why couldn't the spec just include the 'meat'? It's a simple protocol which can be summed up in a page or two. The current spec runs to 55 pages! I'd bet that's far longer than any implementation of the spec.
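The version switch can be as small as this. It's a sketch which assumes the request headers have already been parsed into a plain object keyed by lowercased header name; handshakeVersion and responseHeaderNames are made-up names.

```javascript
// Returns 76 if both new challenge headers are present, otherwise 75.
function handshakeVersion(headers) {
  const hasKey1 = Object.prototype.hasOwnProperty.call(headers, "sec-websocket-key1");
  const hasKey2 = Object.prototype.hasOwnProperty.call(headers, "sec-websocket-key2");
  return hasKey1 && hasKey2 ? 76 : 75;
}

// The origin/location response header names differ between the two drafts.
function responseHeaderNames(version) {
  return version === 76
    ? { origin: "Sec-WebSocket-Origin", location: "Sec-WebSocket-Location" }
    : { origin: "WebSocket-Origin", location: "WebSocket-Location" };
}
```

For a -76 client the server then also has to read the 8 bytes of key3 after the headers and send the MD5 response. For a -75 client it carries on exactly as before.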
Are 3 challenge keys more secure than 1? Is adding random characters into the middle of the keys more secure than not doing that? Will it work if we use little-endian for the random number instead of big-endian?
Sometimes these things seem to be ridiculously over engineered to me. It took me far more time to read the spec than it did to update the Mibbit server to support it.

Wednesday 31 March 2010

Google Closure Compiler Advanced mode

Benefits of using Google Closure Compiler
Closure is a set of tools to help developers write javascript. At its core is a compiler designed to reduce script size drastically by shortening variables and other techniques, much like the objectives of JSMin and YUI Compressor.
At Mibbit, we serve our web based webchat client to millions of unique users every week. Written entirely in Javascript, it's made up of 42 separate files totaling 428KB, so even zipped up it's responsible for over 100GB of bandwidth each week.
As well as wanting to reduce our bandwidth usage, we want our code to load as fast as possible for users.
Why Google closure?
Closure takes the idea of a Javascript minifier a step further than ever before. It does this by doing real compilation to remove unused code, inline variables and rewrite code to make it as small as possible.
Comparison
Now let's take a look at some JS minifiers, to see how Closure compares. Note though that the Closure compiler is much more than a simple minifier: it also goes to great lengths to analyze program flow, re-organizing and removing unused code as much as it can.
The table below includes gzipped byte counts, because Javascript should always be served up gzipped if the browser can accept it.
We compare against JSMin, by Douglas Crockford, and Yahoo!'s YUI Compressor.
Minifier | Bytes | % of original | gz Bytes | gz % of original
None | 428,264 | 100% | 91,750 | 100%
JSMin | 249,372 | 58% | 57,338 | 62%
YUI | 235,214 | 55% | 55,990 | 61%
Closure (STANDARD) | 219,446 | 51% | 53,515 | 58%
Closure (ADVANCED) | 166,774 | 39% | 47,372 | 52%
As you can see, the best by a mile is Closure running in ADVANCED mode.
Real world experience and tips
To get started with Closure, check out the website here http://code.google.com/closure/compiler/ or jump straight in and test out how the compiler will change things in your code by using the online version: http://closure-compiler.appspot.com/
You should be able to use the STANDARD mode with your js without any issues. If you're after the ADVANCED mode though, you're going to need to do a little more work. Firstly, in advanced mode, closure checks what parts of the js are actually used, and which bits can be safely left out. This is useful if you have libraries which you only pick and choose specific parts of. Remember to keep your original js files as this is a one-way trip.
If you reference your js code from an html file, you need to tell closure that, by exporting the functions the page calls under quoted names (not to be confused with externs files, which are covered further down). Otherwise it will assume your code isn't used by anything, and remove it! This is done in the following manner:
function init() {
// Some code
}
// Export it under a quoted name so that things outside the compiled script can access it.
window["init"] = init;
The other thing you need to do is understand how closure treats foo.bar vs foo["bar"].
With foo.bar the bar will be shortened if possible. With foo["bar"] the value bar is left unchanged by the compiler, as it is a string. This means that if you have any external interfaces, such as JSON, you'll need to access that JSON data using ["key"] rather than .key
At first look, this seems counter-intuitive. foo["bar"] is more characters than foo.bar. However, the closure compiler actually changes those foo["bar"] to foo.bar to save space, so everything will be ok :)
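Here's a small made-up example of the two styles side by side. The JSON payload, formatLine and state are hypothetical names, not Mibbit code. The point is just which property names the compiler is allowed to rename in ADVANCED mode.

```javascript
// Hypothetical JSON from a server: its key names are outside the compiler's control.
var msg = JSON.parse('{"nick": "alice", "text": "hello"}');

function formatLine(m) {
  // Quoted access: "nick" and "text" are strings, so they are left alone
  // and keep matching the keys in the JSON.
  return "<" + m["nick"] + "> " + m["text"];
}

// Purely internal object: dotted access, so the compiler is free to
// shorten unreadCount to something like a.
var state = { unreadCount: 0 };
state.unreadCount += 1;
```

If formatLine used m.nick instead, the compiled code could end up looking for a renamed property that the JSON doesn't have, and you'd get undefined with no error.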
The other issue you may come across is externs. If you use external libraries, you need to tell closure about the external functions you're going to be calling, so that it doesn't minify any of those function names. This can be done using an externs file as per their guide. Closure already has an extensive list of externs, which it won't touch. For example, it knows about standard style names. So if you have foo.borderTop = "1px solid #ccc" it will already know to leave 'borderTop' alone. A list can be found here.
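As a rough idea of what an externs file looks like: it's plain JavaScript containing only declarations, with empty function bodies. The library below (chartLib and its draw function) is entirely made up for illustration. As far as I know the file is handed to the compiler with its --externs option and is never shipped to the browser.

```javascript
// externs.js (hypothetical): declarations only, no real implementation.
// It tells the compiler that these names exist elsewhere and must not be renamed.

/** @const */
var chartLib = {};

/**
 * @param {string} elementId
 * @param {Array.<number>} points
 */
chartLib.draw = function(elementId, points) {};
```

With that in place, a call like chartLib.draw("graph", [1, 2, 3]) in your own code should survive ADVANCED mode with both names intact.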
This is a little bit of extra work, since you won't see any error if closure has minified something it shouldn't have, but your code won't work. One way to catch this is to ask closure to output a property map, in which it reports exactly which names it has shortened. You can then look through this, and if you notice something that is, for example, only in JSON data, or should be an external library reference, you can adjust and compile again. There's also a Firebug extension for Closure, called Closure Inspector, for better debugging.
The closure compiler is a really good tool for getting your js down to the smallest size possible, saving you and your users bandwidth and CPU. It's worth looking at the issues list for Closure http://code.google.com/p/closure-compiler/issues/list . Of course, javascript is often the least of your worries if you're serving up images, sound files, video etc. But it's good to get as many things optimized as you can. As you can see in the table above though, the worst case (raw js, no gz) compared to the best case (closure advanced, gz) means a reduction to just 11% of the original.
Benefits
So, summing up: the benefits are of course smaller size and lower bandwidth, which should also mean better performance in the browser. One area where Mibbit is keen to see how closure performs is mobile devices, where bandwidth and processing power aren't as plentiful as on the desktop. Any small gain in code efficiency is good news for mobile, where excessive downloads or processor usage just chew the battery. Minifying in ADVANCED mode also makes it harder for others to read or steal parts of your code. So if you're into protecting your code, here's a fair start.
So, look for real world examples like this, try out closure on your code, and see the benefits.
If you have any questions or comments on web development in general, we can be found from time to time in http://mibbit.com/#webdev
Mibbit is going to Google i/o, are you? Follow us on Twitter http://twitter.com/mibbit