Sunday, January 03, 2010

Content Delivery Networks

My company has a very specific requirement: we need to get our application onto any desktop in the world in less than three minutes. There are business drivers for this that I shall not go into; essentially it is so that potential customers don’t get bored waiting for our application to install and run. We are currently failing to meet that target for some users, and we suspect we are losing customers because we fall at the first fence.

The Problem is Discovered

Our installer is about 50MB, which is not huge, but we have been seeing an enormous variation in deployment times to various parts of the world. Currently we use a UK-based hosting service with high symmetric bandwidth, but routine log analysis revealed that the install times for some users exceeded 10 minutes, and many did not complete. A quick web search revealed that this is a well-known problem, so well known in fact that there are many commercial solutions that come under the generic title of Content Delivery Networks (CDNs). The big players are companies like Akamai and Limelight, but I am allergic to companies that won’t tell you the price, and I suspect our needs are too modest to be worth their while addressing. There is however a new class of companies like GoGrid emerging, and established hosting players like Amazon (with CloudFront) and Rackspace (using Limelight’s CDN network) are offering CDNs too. The new kid on the block is Microsoft, which beta-launched its Azure CDN solution just as my investigations began.

CDN, like all hosting, is a highly commoditized product. There are certainly modest differences in things like upload flexibility (Azure stinks), clever torrent links (Amazon S3 rocks), and general UI friendliness, but there were no showstoppers. The only really important metrics are speed, reliability, and cost. Cost was easy: any provider that didn’t make its pricing clear on its website within the first two minutes was discarded (are you starting to understand our business drivers now?), and the remaining companies were all so cheap that it wasn’t worth worrying about. This is because we are talking about a very small amount of data (50MB × 100 installs per month = 5GB), and the pricing is never more than about 25 cents per GB. These businesses are built for large streaming media and Flash files, not for tiny desktop installers like ours.
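As a sanity check on that cost claim, here is the back-of-the-envelope sum. The 25-cents-per-GB rate is the upper end of the range quoted above, not any specific provider’s price list:

```python
# Monthly CDN transfer cost, back-of-the-envelope.
# The per-GB rate is illustrative (upper end of the range quoted above).
installer_mb = 50
installs_per_month = 100
price_per_gb = 0.25  # USD

transfer_gb = installer_mb * installs_per_month / 1000  # 5.0 GB
monthly_cost = transfer_gb * price_per_gb               # $1.25
print(f"{transfer_gb:.1f} GB/month at ${price_per_gb}/GB = ${monthly_cost:.2f}/month")
```

At those volumes the bill is pocket change, which is why cost dropped out of the comparison entirely.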

Reliability next: we are not particularly concerned about reliability given that we are statistically unlikely to lose enough business in the difference between four nines and five nines to make it worth basing a decision on. Everybody can do four nines.
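For scale, the gap between four nines and five nines is tiny in absolute terms; a quick calculation of the implied annual downtime:

```python
# Annual downtime implied by "four nines" vs "five nines" availability.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # ~525,960

for nines in (4, 5):
    availability = 1 - 10 ** -nines           # 0.9999 or 0.99999
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nines: ~{downtime_min:.0f} minutes of downtime per year")
```

Less than an hour a year either way; for a 50MB installer download, that difference is statistical noise.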

So that left speed, which comes in two flavors: latency and bandwidth. Latency is critical for that snappy website that puts your shop window in front of the customer in less than a few seconds (which is sometimes all you have). Incidentally, I didn’t come across any CDN-backed web hosts, particularly any that support ASP.NET, but you have to imagine that is coming from Azure. In our case bandwidth was going to dominate, so that is what we needed to know about.

During my research, I came across Ryan Kearney’s comparison of CDN providers. He gives a great round-up of the price and features of many of the providers, as well as some latency statistics for a handful of international locations. He was kind enough to host a file for my test rig on his Rackspace account, which was much appreciated.

So there are plenty of CDN providers, but very little information available to allow you to compare them. For instance, India and China are two very important markets for us, but what is the bandwidth to them from each of the providers? Clearly we needed to do some measurements.

The Game is Afoot

How do you measure the bandwidth of a host to every country in the world? Well, there are many companies that offer website monitoring and will alert you if your website goes down; some of these have international monitoring capabilities, and some of them have page download time statistics. However, to get an accurate picture of download speeds you need a fairly sizable file, so that the transfer time dominates other factors such as DNS resolution and server latency. Only one web monitoring service actually downloaded the whole file, allowing us to make an accurate estimate of bandwidth. They are WebSitePulse, and I could not have done this analysis without them. They have the most monitoring stations in the world, the most detailed statistics, and a 30-day free trial, which I used for this investigation. I highly recommend them to anyone looking for sophisticated, international website monitoring.
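For anyone wanting to replicate the measurement, the idea can be sketched in a few lines of Python. The URL below is a placeholder; any large static file on the host under test will do:

```python
# Estimate effective bandwidth by timing a complete download. The file
# must be large enough that transfer time dominates DNS lookup and
# server latency, otherwise you are measuring latency, not bandwidth.
import time
import urllib.request

def mbit_per_second(num_bytes: int, elapsed_s: float) -> float:
    """Convert a byte count and wall-clock time into Mbit/s."""
    return num_bytes * 8 / 1_000_000 / elapsed_s

def measure(url: str) -> float:
    """Download url completely and return the effective speed in Mbit/s."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = resp.read()  # read the whole body, not just the headers
    return mbit_per_second(len(body), time.perf_counter() - start)

# Example (placeholder URL):
# print(measure("https://cdn.example.com/Test1MB.zip"))
```

A single sample is noisy, of course; WebSitePulse effectively runs this on a schedule from stations all over the world and averages the results.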

We created a test file called Test1MB.zip: a zip file truncated to exactly 1MB. Zip data is largely incompressible, and the extension stops most servers from trying to compress it (actually few offer HTTP compression, which is a serious omission but beyond the scope of this post). This was uploaded to each host, and WebSitePulse was configured to download the files periodically. The WebSitePulse trial limits you to 20 monitoring stations at a time (and excludes Auckland and Melbourne), and I didn’t have access to all of the hosts from the beginning, so the statistics are not done to laboratory standards. However, the statistical picture that emerges is reliable enough to allow business decisions to be made.
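A sketch of how such a probe file can be generated. This uses random bytes, which are effectively incompressible, rather than truncating a real archive, and takes 1MB to mean 1,048,576 bytes; both are assumptions of the sketch, not the original procedure:

```python
# Create a 1MB probe file of incompressible data. The .zip extension
# discourages servers from attempting on-the-fly HTTP compression.
import os

TARGET_BYTES = 1024 * 1024  # exactly 1 MiB

with open("Test1MB.zip", "wb") as f:
    f.write(os.urandom(TARGET_BYTES))  # random bytes are ~incompressible

print(os.path.getsize("Test1MB.zip"))  # 1048576
```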

The Runners and Riders

Host | CDN Capable | Notes
RapidSwitch | No | Our current host and representative of good-quality hosting in the UK.
Azure CDN | Yes | Still in beta; we literally started using it the day it opened, so there were teething problems.
Rackspace | Yes | Huge player in hosting and cloud computing.
Amazon CloudFront | Yes | CDN at the front, Amazon’s S3 at the back. Nominally still in beta, but frankly charging for something means you must be judged as a commercial product.
Amazon S3 | No | Our S3 hosting is in the US, so this is the standard candle for US-based cloud hosting.
GoGrid CDN | Yes | A high number of international points-of-presence, with more on the way.

Very few of the CDN companies offer free trials for some reason, but I think all are pay-as-you-go, which costs pennies for what we want. It took a bit of back-and-forth to get my GoGrid account set up, but their Twitter guy was great at fixing the problem once I made him aware of it. This meant that there are slightly fewer results for GoGrid. The whole trial ran for the best part of a month with roughly 15-minute poll intervals for every host. I had to change things around a bit as I went along to stay within the T&Cs of the WebSitePulse trial – you get $1000 to play with in total.

The following locations were monitored: Amsterdam, Bangalore, Beijing (2 monitors), Boston, Brisbane, Buenos Aires, Chicago, Dusseldorf, Guangzhou, Hong Kong, Houston, London, Los Angeles, Miami, Montreal, Mumbai, Munich, New York, Paris, San Francisco, Sao Paulo, Seattle, Shanghai, Singapore, Stockholm, Sydney (2 monitors), Tokyo, Toronto, Trumbull, Vancouver, Washington

The Results

The summary of the results is shown below:

Host | Uptime | Average 1MB DL Time (s)
GoGrid | 100.00% | 2.03
Rackspace CDN | 100.00% | 2.70
Amazon CloudFront | 100.00% | 4.46
Azure CDN | 99.52% | 4.67
Amazon S3 | 100.00% | 5.04
RapidSwitch | 99.98% | 7.43

Here are the detailed results for all of the monitoring stations and hosts sorted into average download time order:

Location | GoGrid | Rackspace | Amazon CloudFront | Azure CDN | Amazon S3 | RapidSwitch | Average
New York | 0.12 | 0.19 | 0.24 | 1.00 | 0.50 | 1.39 | 0.57
Boston | 0.17 | 0.24 | 0.42 | 1.06 | 0.54 | 1.31 | 0.62
Trumbull | 0.16 | 0.39 | 0.55 | 1.38 | 0.50 | 1.36 | 0.72
Washington | 0.21 | 0.36 | 0.98 | 1.04 | 0.34 | 1.80 | 0.79
Houston | 0.24 | 0.34 | 0.30 | 0.73 | 1.20 | 2.20 | 0.84
Paris | 0.23 | 0.31 | 0.40 | 2.39 | 1.78 | 0.24 | 0.89
Dusseldorf | 0.20 | 0.29 | 0.18 | 3.08 | 1.92 | 0.27 | 0.99
Amsterdam | 0.16 | 0.15 | 0.47 | 2.43 | 2.70 | 0.24 | 1.03
Chicago | 0.05 | 0.19 | 1.95 | 1.02 | 1.60 | 1.48 | 1.05
San Francisco | 0.30 | 0.30 | 0.22 | 1.58 | 1.83 | 2.31 | 1.09
London | 0.15 | 0.37 | 0.40 | 3.68 | 1.89 | 0.18 | 1.11
Vancouver | 0.15 | 0.41 | 0.23 | 1.57 | 1.64 | 2.69 | 1.12
Toronto | 0.40 | 0.91 | 0.48 | 2.58 | 1.99 | 1.67 | 1.34
Seattle | 0.16 | 0.31 | 0.21 | 2.15 | 1.64 | 3.66 | 1.36
Munich | 0.63 | 0.29 | 0.31 | 4.17 | 3.06 | 0.67 | 1.52
Miami | 0.35 | 4.06 | 0.70 | 1.72 | 0.83 | 2.25 | 1.65
Stockholm | 0.71 | 0.20 | 0.82 | 4.48 | 4.82 | 0.67 | 1.95
Los Angeles | 0.29 | 0.39 | 0.34 | 2.99 | 3.81 | 6.47 | 2.38
Sao Paulo | 2.53 | 2.77 | 2.70 | 2.79 | 3.49 | 3.50 | 2.96
Brisbane | 0.45 | 0.42 | 3.40 | 4.26 | 5.96 | 6.00 | 3.42
Tokyo | 2.00 | 1.01 | 1.40 | 3.04 | 4.35 | 8.84 | 3.44
Sydney | 1.17 | 1.25 | 3.19 | 4.15 | 5.81 | 5.78 | 3.56
Bangalore | 3.32 | 0.76 | 1.75 | 4.61 | 8.33 | 2.66 | 3.57
Sydney 2 | 0.18 | 4.63 | 3.74 | 5.07 | 7.41 | 5.29 | 4.39
Montreal | 1.01 | 1.57 | 1.46 | 12.44 | 2.14 | 8.40 | 4.51
Mumbai | 4.23 | 2.96 | 2.00 | 4.87 | 10.40 | 3.70 | 4.69
Buenos Aires | 5.80 | 7.03 | 6.41 | 7.60 | 6.25 | 5.33 | 6.40
Singapore | 3.52 | 1.27 | 3.24 | 9.66 | 8.67 | 13.62 | 6.66
Hong Kong | 1.24 | 1.76 | 1.24 | 7.65 | 9.28 | 26.92 | 8.02
Beijing 2 | 5.72 | 8.23 | 11.62 | 7.95 | 11.20 | 17.53 | 10.38
Guangzhou | 5.43 | 10.99 | 19.48 | 8.23 | 10.30 | 10.91 | 10.89
Beijing | 8.50 | 14.33 | 23.92 | 13.37 | 16.29 | 71.55 | 24.66
Shanghai | 17.32 | 20.56 | 52.27 | 19.37 | 23.83 | 24.16 | 26.25
Average | 2.03 | 2.70 | 4.46 | 4.67 | 5.04 | 7.43 | 4.39

Here are the raw stats if you would like to do any further analysis of your own.

Conclusions

Clearly GoGrid and Rackspace are the best providers from the hosts tested. GoGrid has the best average performance and is unbeaten to almost all of the monitoring stations.

Asia is very badly served by all the hosts tested. Obviously there are dedicated hosting services for Asia, but the whole point of a CDN is that it is global. I expect partnerships are being drafted as I type.

Amazon CloudFront only barely outperforms plain S3 on average, although its per-city download times are much better in some cases.

Montreal did much worse than I expected given that Canada is so well connected to the US.

Amazon and Azure CDNs perform about equally well, although the uptime of Azure looks bad. Actually the Azure uptime was only really bad for the first few days; after that it was very good, so it is probably not a fair measure.

Did We Win?

Our original aim was to move 50MB in less than three minutes. Therefore our target time for 1MB is 180 / 50 = 3.6 seconds. Even with the fastest CDN host, we are still failing to meet this target for several cities. For Shanghai, we are a factor of five off. And of course this is before we get from where the monitoring stations are (which is probably a well connected hub) out to users at the network edge.
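To make that concrete, here is the same check applied to the slowest GoGrid per-city averages from the table above (target: 180 s / 50 MB = 3.6 s per MB):

```python
# Cities that miss the 3.6 s/MB target even on the fastest host.
# Times are GoGrid per-city averages (seconds per 1MB) from the
# results table above.
gogrid_slowest = {
    "Bangalore": 3.32, "Singapore": 3.52, "Mumbai": 4.23,
    "Guangzhou": 5.43, "Beijing 2": 5.72, "Buenos Aires": 5.80,
    "Beijing": 8.50, "Shanghai": 17.32,
}
TARGET = 180 / 50  # 3.6 seconds per MB

failing = {city: t for city, t in gogrid_slowest.items() if t > TARGET}
for city, t in sorted(failing.items(), key=lambda kv: kv[1]):
    print(f"{city}: {t:.2f}s ({t / TARGET:.1f}x over target)")
```

Shanghai comes out at roughly 4.8× the target, which is where the "factor of five" above comes from.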

So big-iron can help us make significant improvements for very little effort and cost, but the war goes on. I might tell you how we finally win in a future post. Hint: we make the installer smaller.

3 comments:

jon said...

Interesting article. Just thought I'd mention Panther Express as well who are part of CD Networks now. I work for EveryCity - we're a managed cloud hosting provider and we partner with CD Networks to deliver CDN services to our customers. They are excellent and competitively priced.

CloudHarmony said...

We've done some CDN comparison shopping as well. Akamai is the big dog in this market, but far from affordable if you are an smb with smaller needs. Surprisingly, CacheFly, a smaller CDN also performed the best in our limited tests using pingdom. They were followed by Akamai, Limelight (Rackspace), Edgecast (GoGrid), Internap (SoftLayer), CloudFront, Highwinds, simpleCDN, and finally Azure CDN. http://blog.cloudharmony.com/2010/01/bandwidth-disparity-in-cloud.html

Rupert said...

@CloudHarmony Great article - thanks for the info.