Friday, 11 December 2009

WebSocket - some numbers

I just implemented WebSocket support in the Mibbit server, and thought I'd get some real numbers on performance. Having recently updated the Mibbit server to use deflate compression on XHR responses where it provides a net gain, I wanted to see how the two compare.
Note that one of my main focuses is bandwidth usage. We use a fair amount of bandwidth, and anything we can do to optimize this is a good thing. It also usually means a speed-up for users, which is extremely important.
First of all, a recap/explanation of how the conventional Comet works here.
We have two connections, which are set to keep-alive. One is for sending from browser to server, the other one is for server to browser.
The server->browser one is opened by the browser, doing an XHR POST, and held open by the server until data is ready, or until a timeout. Then a new request is again sent from the browser. This means that as soon as data is ready, it's delivered to the browser.
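As a rough sketch of the loop described above (written in Python rather than browser JavaScript, and not Mibbit's actual client code): the `fetch` callable is a hypothetical stand-in for one held-open XHR POST, which returns data when the server has some, or `None` when the server times out.

```python
from typing import Callable, Iterator, Optional


def long_poll(fetch: Callable[[], Optional[str]], max_requests: int) -> Iterator[str]:
    """Yield server messages using the hold-open-then-re-request pattern.

    `fetch` is hypothetical: it models one request that the server holds
    open until data is ready (returns the data) or a timeout (returns None).
    """
    for _ in range(max_requests):
        payload = fetch()        # the server holds this request open
        if payload is not None:  # data was ready: deliver it straight away
            yield payload
        # Data or timeout, the next request is issued immediately.


# Fake transport for illustration: two messages with one timeout in between.
responses = iter(["hello", None, "world"])
received = list(long_poll(lambda: next(responses), max_requests=3))
print(received)
```

Every pass through the loop is a full HTTP request, which is where the header overhead discussed below comes from.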
There are a couple of downsides to the above method.
1. Keep-alive isn't failsafe. Sometimes browsers, proxies, etc. ignore you, or for other reasons decide to open new connections. Creating a new TCP/IP connection is expensive, and may introduce lag. For the vast majority of cases though, no new TCP/IP connections are created, and you just have your 2 connections to the Mibbit server to handle all communications.
2. Every HTTP request sent from the browser includes headers, whether you want them or not. These are not small headers: expect 2k+ per request. You can remove *some* of these headers (see previous post), but most of them you're stuck with.
Now, enter WebSocket. This basically gives you a bi-directional socket with the server after a small HTTP handshake. The advantages are that there are no HTTP headers from then on, and there shouldn't be any lag due to keep-alive issues.
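To make the difference concrete, here is a toy model of how the two kinds of overhead grow. The 2048-byte header size is taken from the "2k+ per request" estimate above; the 300-byte handshake size is a made-up placeholder, and the model deliberately ignores any per-message framing bytes WebSocket adds.

```python
def xhr_header_overhead(requests: int, header_bytes: int = 2048) -> int:
    """Request-header bytes: every XHR request repeats its headers."""
    return requests * header_bytes


def websocket_overhead(handshake_bytes: int = 300) -> int:
    """One-off cost: the HTTP handshake, with no HTTP headers afterwards.

    300 is a hypothetical placeholder, not a measured value.
    """
    return handshake_bytes


for n in (1, 10, 100):
    print(n, xhr_header_overhead(n), websocket_overhead())
```

The XHR figure grows linearly with the number of requests, while the WebSocket figure stays flat after the connection is set up.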
For the initial test, I opened a Mibbit Widget pointed at a channel, said a couple of things, and did a whois lookup. A reasonably small scale test, but a useful one involving packets sent both ways, and some large packets (MOTD, topic).
First, let's see the results for standard XHR:
data recvd: 1222
data sent: 7220
overhead In: 4456
overhead Out: 1229
Total data: 14127
OK, so we have 14k of data, and about 5.7k of that (5,685 bytes) is 'overhead' - HTTP headers/request/responses.
Let's see how WebSocket improves on that:
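The arithmetic behind those figures can be checked with a few lines of Python. The numbers are the ones listed above; the dictionary keys are just labels chosen for this sketch.

```python
# Byte counts from the standard XHR run above.
xhr = {
    "data_recvd": 1222,
    "data_sent": 7220,
    "overhead_in": 4456,
    "overhead_out": 1229,
}

total = sum(xhr.values())
overhead = xhr["overhead_in"] + xhr["overhead_out"]
overhead_share = overhead / total

print(f"total={total} overhead={overhead} share={overhead_share:.1%}")
```

So roughly 40% of the bytes on the wire in the plain XHR run are headers rather than data.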
data recvd: 1350
data sent: 7307
overhead In: 118
overhead Out: 176
Total data: 8951
Wow. That's a big improvement. We've cut the overhead down to just 294 bytes (Basically the initial handshake).
Given the above data, it's clear that using WebSocket is a big win both in terms of bandwidth usage, and (although I haven't measured yet) lag. Anecdotally, the WebSocket version did seem a lot 'snappier', so I'd expect the lag to be reduced.
However, we haven't looked into one other area - compression. With HTTP, we can compress responses from the server, and the browser will decompress them fast and pass them on to JavaScript. There is no mechanism for this with WebSockets (yet). If you want compression with WebSockets, you're likely going to have to do it yourself in JavaScript, which may burn precious browser CPU cycles.
So, finally, here are the numbers for XHR+deflate:
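A minimal sketch of the "compress only where it's a net gain" idea, using Python's zlib module. This is an illustration rather than the Mibbit server's code: the function name and the sample payloads are made up, and the exact wire format the server uses isn't stated here.

```python
import zlib
from typing import Tuple


def compress_if_smaller(payload: bytes) -> Tuple[bytes, bool]:
    """Return (body, compressed): compress only when it actually saves bytes."""
    packed = zlib.compress(payload)
    if len(packed) < len(payload):
        return packed, True
    return payload, False  # tiny payloads often grow when compressed


# Hypothetical payloads: repetitive chat-like text versus a tiny message.
chatty = b"PRIVMSG #channel :hello world\r\n" * 50
body, compressed = compress_if_smaller(chatty)
print(len(chatty), len(body), compressed)

tiny_body, tiny_compressed = compress_if_smaller(b"hi")
print(len(tiny_body), tiny_compressed)
```

With HTTP, the server would also have to tell the browser which responses were compressed (normally via the Content-Encoding header), which is the part WebSocket has no equivalent for yet.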
data recvd: 1222
data sent: 1868 (Compressed)
overhead In: 4456
overhead Out: 1229
Total data: 8775
So, this just beats WebSocket for bandwidth usage. It would depend heavily on the type of data you're sending as to how good your compression is, and how the numbers compare with WebSocket.
Just to recap,
XHR: 14,127 bytes
WebSocket: 8,951 bytes
XHR+deflate: 8,775 bytes
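For reference, the totals and the savings relative to plain XHR can be recomputed from the three sets of numbers above (the tuple order is data recvd, data sent, overhead in, overhead out):

```python
# (data recvd, data sent, overhead in, overhead out) for each run above.
runs = {
    "XHR": (1222, 7220, 4456, 1229),
    "WebSocket": (1350, 7307, 118, 176),
    "XHR+deflate": (1222, 1868, 4456, 1229),
}

totals = {name: sum(parts) for name, parts in runs.items()}
baseline = totals["XHR"]
savings = {name: 1 - total / baseline for name, total in totals.items()}

for name in runs:
    print(f"{name:12s} {totals[name]:6d} bytes  saving {savings[name]:.1%}")

# How well the server->browser payload compressed in this test.
payload_ratio = runs["XHR+deflate"][1] / runs["XHR"][1]
print(f"compressed payload is {payload_ratio:.0%} of the original")
```

Both alternatives save a bit over a third of the total bytes compared with plain XHR, and the gap between them is only 176 bytes.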
Note that this is a reasonably small scale test, but I do believe the numbers will scale pretty much like this. In general, for our type of traffic, the HTTP headers in XHR double the traffic. Again, for our type of traffic, the compression pretty much halves the traffic. So XHR+deflate vs WebSocket is pretty close.
We'll be rolling out WebSocket support in Mibbit in the next few days, and will be able to get some more definitive data on how the two compare. We have quite a large Chrome userbase, and hopefully some of those users are on the dev builds which support WebSocket.
It's certainly a great upgrade to the web, and hopefully compression support will come in due course.
Implementing WebSocket wasn't actually too bad at all. There were a couple of hoops to jump through, but the protocol seems reasonably sane. Sadly the protocol doc is completely insane, and tries to describe what you should do using plain English instead of just giving you the data you need, e.g. "take the value \b\ and bitwise and it with 0x7f and put the result in a variable \b2\"
We have a very early alpha Mibbit Widget set up with support for WebSocket if available, else XHR+deflate, or XHR worst case. To see if it's using WebSocket, click on the debug tab, and you should see a message saying WebSocket created. Alternatively, if you use the developer timeline, you can see whether there's any XHR going on.
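For what it's worth, that kind of sentence usually boils down to a few lines of code. The sketch below decodes a length stored as 7 bits per byte, with the high bit meaning "more bytes follow". Treat the framing details as an assumption to check against the protocol draft; the function name is made up, and only the variable names `b` and `b2` come from the quoted text.

```python
from typing import Tuple


def read_length(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a 7-bits-per-byte length; return (length, next_offset).

    Assumed scheme: the low 7 bits of each byte carry data, and a set
    high bit means another length byte follows.
    """
    length = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated length field")
        b = data[offset]
        offset += 1
        b2 = b & 0x7F                # "bitwise and it with 0x7f"
        length = length * 128 + b2   # shift in the next 7 bits
        if not (b & 0x80):           # high bit clear: last length byte
            return length, offset


print(read_length(bytes([0x05])))
print(read_length(bytes([0x81, 0x00])))
```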

Tuesday, 16 June 2009

Revenue / Browser

Here's some interesting stats from Mibbit... I checked out the average revenue generated per 1,000 visits on the main site. The data covered about 800k visits, so reasonably statistically valid I think.
First off, here's the visit breakdown:
Firefox: 58.75%
IE: 26.11%
Chrome: 6.46%
Opera: 3.95%
Safari: 3.63%
Mozilla: 0.67%
Now here's the average revenue generated per 1,000 visits for each browser. Calculated for example as Safari_revenue * 1000 / Safari_visits:
Safari: $2.392
Firefox: $1.599
Mozilla: $1.476
Chrome: $1.053
IE: $1.050
Opera: $0.388
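The calculation itself is a one-liner. In the sketch below, the $50 / 25,000 visits input is a made-up example to show the formula (the raw revenue figures aren't given here), while the per-browser values are the ones listed above.

```python
def revenue_per_thousand(revenue: float, visits: int) -> float:
    """Average revenue per 1,000 visits: revenue * 1000 / visits."""
    return revenue * 1000 / visits


# Hypothetical input, purely to illustrate the formula.
example = revenue_per_thousand(50.0, 25_000)
print(example)

# Revenue per 1,000 visits from the list above, in dollars.
rpm = {
    "Safari": 2.392,
    "Firefox": 1.599,
    "Mozilla": 1.476,
    "Chrome": 1.053,
    "IE": 1.050,
    "Opera": 0.388,
}

# Each browser expressed as a fraction of the top earner.
relative_to_safari = {browser: value / rpm["Safari"] for browser, value in rpm.items()}
for browser, ratio in sorted(relative_to_safari.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{browser:8s} {ratio:.0%}")
```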
This sort of went against some of my assumptions. I imagined that IE would be the top revenue generator, as you sort of imagine IE users as being less tech savvy, more 'used to' clicking on adverts etc.
The other interesting point to note is that you should never believe the extremely vocal minority who tell you that all Firefox users have AdBlockPlus installed. They don't. As you can see, Firefox users are the 2nd best revenue generators.
I didn't know where to place Safari before I did the calculations, but it does make some sense. Apple users are more used to spending money (they likely value their time more than their money), so perhaps this is why they generate more revenue.
The shocker was Opera. An average Opera user generates just 16% of the revenue of an average Safari user! That's really poor. Someone mentioned something about built-in content blocking in Opera, but I couldn't find it in a default install.
So should I start pushing people away from Opera, and toward Firefox and Safari? Well, no: people probably pick their browser *because* of their advertising behavior, not the other way around... Still, food for thought.
So what about OSes?
Macintosh: $2.156
Linux: $2.076
Windows: $1.285
So once again, we have Mac users ready to spend money, click on ads, etc. The surprise is that Linux users generate quite a bit more revenue than Windows users, counter to what I had assumed previously.
Note that these stats exclude iPhone/iPod/Opera Mobile, which aren't really big enough to draw many conclusions from - also people don't click often on ads on mobiles.
The stats were generated using Google Analytics tied to AdSense, which works really well for things like this.
In summary then:
  • Apple users are good at generating revenue - they buy stuff
  • Linux and Firefox users are also good - don't listen to the overly vocal AdBlockPlus user that likes to tell you how everyone using Firefox doesn't see any ads anyway
  • IE/Windows is solid enough
  • Opera is terrible
  • Google Analytics rocks
If you have any thoughts on why Opera should be so bad, please post a comment. Perhaps it's to do with the 'turbo mode'? AFAIK this puts everything through their Opera Mini web proxies, so perhaps that blocks ads?
As mentioned by some commenters, this may have more to do with different locations than browsers, coupled to the fact that some browsers have definite geographical biases. For example, Opera usage in these stats for the US is 2%, whilst Opera usage for eastern Europe is 8%. In short, Opera may have a geographical bias toward less-easily-monetized countries (at least using AdSense).
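A toy illustration of that geographic effect, with entirely made-up numbers: two browsers whose users behave identically within each region can still show very different overall revenue per 1,000 visits if their visits are spread across regions differently. The region names, per-region values and visit shares below are all hypothetical.

```python
from typing import Dict


def blended_rpm(visit_share: Dict[str, float], rpm_by_region: Dict[str, float]) -> float:
    """Overall revenue per 1,000 visits as a visit-weighted average."""
    return sum(share * rpm_by_region[region] for region, share in visit_share.items())


# Hypothetical: the same revenue per 1,000 visits in a region for every browser.
rpm_by_region = {"region_a": 2.0, "region_b": 0.5}

# Hypothetical visit mixes: browser_x skews to region_a, browser_y to region_b.
browser_x = {"region_a": 0.8, "region_b": 0.2}
browser_y = {"region_a": 0.2, "region_b": 0.8}

x = blended_rpm(browser_x, rpm_by_region)
y = blended_rpm(browser_y, rpm_by_region)
print(f"browser_x: ${x:.2f}  browser_y: ${y:.2f}")
```

Here the browser with most of its visits in the lower-earning region ends up with less than half the overall figure, even though nothing about the browser itself differs.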