The hazards of 301 (permanent) redirects

June 15th, 2015 by

When you visit a web page, you’ll often see the URL change as it loads. For example, if you attempt to visit http://mythic-beasts.com you’ll end up at https://www.mythic-beasts.com. This is achieved using HTTP redirects: a response from a server that tells your browser that the page it is trying to load has moved.

HTTP redirects come in two flavours:

Permanent (301)
This tells the client that the page requested has moved permanently, and crucially, if it wants to load the page again, it needn’t bother checking the old URL to see if the situation has changed. This is a good way of redirecting something that you never want to undo, for example, if you’re permanently moving a website from one domain to another.
Temporary (302)
As the name suggests, this tells the client that the page has moved, but only temporarily, so the client should continue requesting the old URL if it wants to load the page again. This is a good way of telling users that your site is down for maintenance, that they don’t have enough credit to access a site, or that there’s some other issue that is likely to change.

#makeitwrong

three-301-fail

Getting this wrong can be a massive pain for your users. For example, Three use a permanent redirect if you’ve run out of credit on your data plan, if you’re trying to use tethering in the wrong country, or if you’ve hit some other temporary problem.

So imagine what happens when you run out of data on your plan. You attempt to visit your favourite website, say, http://www.xkcd.com. Three tell you that that page has been replaced by http://tethering.three.co.uk/TetherNoProductPost. Permanently.

Now find a working internet connection, attempt to load http://www.xkcd.com, and find that your browser quite reasonably takes you straight to the Three fail page, even if you’re no longer using a Three connection. Shift+Reload doesn’t help; even restarting your browser may not help.

Three have told your browser that every page you visited whilst out of credit has moved permanently to their fail page.

Expiring permanent redirects

The example given above is very obviously a place where a temporary 302 redirect should be used, but webmasters are often encouraged to prefer 301s in the name of improving search rankings. 301 redirects allow you to tell search engines that your .co.uk site really is the same site as your .com site, thus accumulating all your google juice in the right place. They also speed up repeat visits slightly, because a client that has cached the redirect can skip the extra HTTP request next time.

Even when used legitimately, 301 redirects are obviously hazardous, as there’s no way to undo a permanent redirect once it’s been cached by a client.

The safe way to do a 301 redirect is to specify that it will expire, even if you don’t expect to ever change it. This can be done using the Cache-Control header. For example, the redirect that we issue for http://mythic-beasts.com includes the following header:

Cache-Control: max-age=3600

This tells clients that they can remember the redirect for at most one hour, allowing us to change it relatively easily at some point in the future. We use the mod_expires Apache module to create this header, which also produces a matching “Expires” header (the older HTTP/1.0 counterpart of Cache-Control).
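Putting it together, the response for http://mythic-beasts.com looks something like this (the Location and max-age are real; the Date and Expires values below are illustrative):

HTTP/1.1 301 Moved Permanently
Date: Mon, 15 Jun 2015 12:00:00 GMT
Location: https://www.mythic-beasts.com/
Cache-Control: max-age=3600
Expires: Mon, 15 Jun 2015 13:00:00 GMT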

.htaccess example

The above can be implemented using a .htaccess file as follows:

ExpiresActive on
ExpiresDefault "access plus 1 hour"
Redirect 301 / https://www.mythic-beasts.com/

This example uses mod_alias and mod_expires which may need enabling globally in your web server. In Debian, Ubuntu and similar distributions, this is done by running the following command as root:

a2enmod alias expires
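You can check that the redirect and caching headers are coming out as intended with curl (the -I flag fetches only the headers):

curl -sI http://mythic-beasts.com/ | grep -iE 'location|cache-control|expires'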

mod_rewrite example

Redirects are often implemented using Apache’s mod_rewrite. Unfortunately, mod_expires doesn’t add headers to redirects issued by RewriteRules, but mod_headers can be used instead:

RewriteRule ^.* http://www.mythic-beasts.com/ [L,R=301,E=limitcache:1]
Header always set Cache-Control "max-age=3600" env=limitcache

The RewriteRule sets an environment variable, which is then used to conditionally add a Cache-Control header. Thanks to Mark Kolich’s blog for the inspiration.

Again, you may need to enable mod_rewrite and mod_headers on your web server:

a2enmod rewrite headers

Escaping 301 hell

Fortunately, if you’re unlucky enough to get caught by a broken 301 redirect, such as the one issued by Three, there is an easy way to get to the page you actually wanted: simply append a query string to the end of the URL. For example, http://www.xkcd.com/?foo=bar. Browsers won’t assume that the cached redirect is valid for this new URL and websites will almost always ignore unexpected query parameters.

2015-07-03 – Updated to add mod_rewrite example
2020-03-16 – Updated to note that the relevant modules may need enabling

DNSSEC

May 29th, 2015 by

We’re pleased to announce that we can now set DS records for any domains registered with us. At present, only UK domains can be configured through the control panel. For any other domains, please email support and we’ll put the records in place for you.
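For reference, a DS record is a short digest of your zone’s signing key, published in the parent zone. In zone-file form it looks something like this (every value below is made up for illustration):

example.com. 3600 IN DS 12345 13 2 6A9D5D8C3B0F4E2A7C1D9E8F0B3A5C7D2E4F6A8B0C1D3E5F7A9B2C4D6E8F0A1B

The four fields after “DS” are the key tag, DNSSEC algorithm, digest type and digest; your signing software will give you the exact values to send us.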

Control panel integration and other DNSSEC improvements will be coming soon.


Virtual Server Snapshots

May 18th, 2015 by

We’ve just rolled out a beta of our snapshot functionality for our virtual servers. This allows you to take an instantaneous image of your server’s disk space which can then be restored at a later date to either the same or a different server. This can be used for cloning a virtual server, for backups, or just to take a copy of your server before making significant configuration changes such as an operating system upgrade.

Snapshots are stored in our distributed storage cloud, which replicates the image across three separate data centres.

The system is in beta testing at the moment, and during this beta we’re offering free storage for images.  Once the beta is complete, storage space will become chargeable, but we’ll contact all customers who’ve made use of the service prior to issuing any bills.

If you want to try it out, simply use the snapshot panel for your server in the customer control panel, or use the snapshot command on the admin console. Hopefully it’s self-explanatory; if it’s not, tell us and we’ll make it better!

A non-party political broadcast from Mythic Beasts

May 6th, 2015 by

Here at Mythic Beasts it’s fair to say that our staff hold a wide spectrum of political beliefs, but I think one thing we can all agree on is that all the major political parties have at least some irredeemably stupid policies (and possibly also that some of the minor parties only have stupid policies).

This makes voting for a political party a pretty depressing prospect. So, what about voting for an elected representative who will look after our interests?

Our founders reside in two constituencies with notable MPs: Witney and Cambridge.

The MP for Witney is notable for being the Prime Minister. The MP for Cambridge, Julian Huppert, is notable for being a Liberal Democrat and yet still being highly regarded by a large number of his constituents.

Now, if you want good data on whether your MP is any good or not, you should head over to the excellent They Work For You and find out what they’ve been up to in Parliament on your behalf.

But who wants good data when you can have some anecdotes? Let’s look at two issues that have got us wound up recently.

Firstly, the EU VAT MESS, which causes us an administrative burden far in excess of the value of the affected revenue.

Julian Huppert was very active on behalf of the constituents who contacted him on this issue (Mythic only got as far as a tweet…), including submitting written questions in parliament, which received a predictably useless response.

On the other hand, Paul wrote to David Cameron twice (the first letter went AWOL), and received only a hopeless response which completely failed to address any of the issues raised.

Secondly, banning secure encryption. As a hosting company, the ability to undertake transactions securely online is quite important to our everyday business (see previous notes).

The appalling jeering by other MPs, and the pathetic response given by Theresa May, to Julian Huppert’s questions asked in Parliament demonstrated that he was clearly one of the few MPs who actually grasped the implications of the proposal, rather than just resorting to the rhetoric that fuels the fear that terrorism relies on.

As for David Cameron, well, it’s his idea.

So what can we conclude from this? Not a lot, except that we’d probably be in a far better place if parliament were full of representatives who listened to and understood their constituents, rather than those who get in on the strength of a party political vote.

Debian 8.0 “Jessie” now available

April 27th, 2015 by

Jessie

The new stable version of Debian, named “Jessie”, was released on Saturday. The new version is now available for use on all of our Virtual Server hosts. Jessie is fully available at the Mythic Beasts mirror, and we’re included in the installer’s default mirror menu so you can easily install directly from our mirror.

Mythic Beasts make extensive use of Debian and would like to thank all the Debian developers by donating our usual firkin of beer from the ever-excellent Milton Brewery to the summer Debian UK barbecue, so everyone within the Debian community can have a pint on us. Possibly more than one.

Virtual Servers – SSDs and disk upgrades

April 17th, 2015 by

Following on from recent upgrades to RAM and bandwidth for our Virtual Servers, we’re pleased to announce upgrades to Virtual Server storage options.

We’ve launched a new range of SSD Virtual Servers, offering the ultimate in I/O performance. The range starts with our VPS2 SSD which replaces the 40GB disk in our standard VPS 2 with a 10GB SSD drive.

Like our spinning rust-based Virtual Servers, our SSD storage is local to the host machine, and connected as RAID 1 mirrored pairs to a controller with a battery-backup unit.  This allows us to safely enable a large write cache, further boosting write performance.

We’ve also doubled the disk space available with all of our HDD-based Virtual Servers, so our basic VPS2 now includes 40GB of disk, 2GB RAM and 1TB of monthly bandwidth.

Existing customers can upgrade to the new storage capacity by typing “upgrade” on the admin console, and then adding new partitions or resizing existing partitions to make use of the new capacity.
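As a rough sketch of that second step (the device name, partition layout and filesystem below are assumptions – adjust them for your own setup), growing the last partition and an ext4 filesystem on it might look like:

apt-get install cloud-guest-utils    # provides growpart (cloud-utils on older releases)
growpart /dev/vda 1                  # grow partition 1 into the newly added space
resize2fs /dev/vda1                  # grow the ext4 filesystem to fill the partition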


IPv6 bites again

April 10th, 2015 by

Every now and again, one of our users will either get their SMTP credentials stolen, or will get a machine on our network compromised. More often than not, the miscreants responsible will then proceed to send a whole bunch of adverts for V1@gr@ or whatever through our mail servers. This typically results in our mail servers getting (not unreasonably) added to various blacklists, which affects all our users, creates work for us and generally makes for sad times.

We’ve got various measures to counter this, one of which relies on the fact that spam lists are typically very dirty and will generate a lot of rejections. We can use this fact to freeze outgoing mail for a particular user or IP address if it is generating an unreasonable number of delivery failures. The approach we use is based on the generally excellent Block Cracking config.

Unfortunately, both we and the author of the above overlooked what happens when you start adding IPv6 addresses to a file which uses “:” as its key/value separator, such as that used by Exim’s lsearch lookup. Yesterday evening, a customer’s compromised machine started a spam run to us over IPv6.
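To illustrate (the file below is a mock-up, not our actual config): in an lsearch file the key ends at the first colon, so a block list that works fine for IPv4 falls apart for IPv6:

192.0.2.10: frozen - too many delivery failures
2001:db8::42: frozen - too many delivery failures

For the second line, Exim takes “2001” as the key and the rest of the line as data, so a lookup for the full address 2001:db8::42 never matches. For what it’s worth, Exim’s iplsearch lookup type is designed for exactly this situation: it takes IP-address keys, with IPv6 ones enclosed in double quotes.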

Our system raises a ticket in our support queue every time it adds a new IP to our block list so that we can get in touch with the customer quickly. Unfortunately, if the lookup doesn’t work because you haven’t correctly escaped an IPv6 address, it’ll happily keep adding the same IP for each spam email seen, and raising a new ticket each time. Cue one very busy support queue.

Needless to say, the fix was simple enough, but the moral, if there is one, is: a) test everything that you do with both IPv6 and IPv4, and b) start preparing for IPv6 now, as it’s going to take you ages to find everything that it breaks.

Assumptions in code about what an IP address looks like that will be broken by IPv6 are almost certainly more prevalent now than two-digit year assumptions were 15 years ago.

WP Super Cache vs Raspberry Pi 2

March 3rd, 2015 by

On Monday, the Raspberry Pi 2 was announced, and The Register’s predictions of global geekgasm proved to be about right. Slashdot, BBC News, global trending on Twitter and many other sources covering the story resulted in quite a lot of traffic. We saw 11 million page requests from over 700,000 unique IP addresses in our logs from Monday, around 6x the normal traffic load.

The Raspberry Pi website is hosted on WordPress using the WP Super Cache plugin. This plugin generally works very well, resulting in the vast majority of page requests being served from a static file rather than hitting PHP and MySQL. The second major part of the site is the forums, and the different parts of the site have wildly differing performance characteristics. In addition to this, the site is fronted by four load balancers which supply most of the downloads directly and scrub some malicious requests. We can cope with roughly:

Cached WordPress page: 160 pages/second
Uncached WordPress page: 10 pages/second
Forum page: 10 pages/second
Maintenance page: at least 10,000 pages/second

Back in 2012, during the original launch, we had a rather smaller server setup. That meant we simply put up a maintenance page and directed everyone to buy a Pi direct from Farnell or RS, both of whom had some trouble coping with the demand. We also launched at 6am GMT so that most of our potential customers would still be in bed, spreading the initial surge over several hours.

This time, with Raspberry Pi being a larger organisation and with coordination across multiple news outlets and press conferences, the launch time was fixed for 9am on Feb 2nd 2015. Everything would happen then, apart from the odd journalist with premature timing problems – you know who you are.

Our initial plan was to leave the site up as normal, but set the maintenance page to be the launch announcement. That way, if the launch overwhelmed things, everyone would see the announcement served directly from the load balancers, and otherwise the site would function as normal. Plan B was to disable the forums, giving more resources to the main blog so people could comment there.

The Launch

turtlebeach

It is a complete coincidence that our director Pete took off to go to this isolated beach in the tropics five minutes after the Raspberry Pi 2 launch.

At 9:00 the announcement went live. Within a few minutes traffic volumes on the site had increased by more than a factor of five, and the forum users were starting to make comments and chatter to each other. The server load increased from its usual level of 2 to over 400 – we now had a massive queue of users waiting for page requests, because all of the server CPU time was being taken generating those slow forum pages, which starved the main blog of server time to deliver those fast cached pages.

At this point our load balancers started to kick in and deliver the maintenance page to a large fraction of our site users – the fallback plan. This did annoy the forum and blog users who had posted comments and received the maintenance page back, having just had their submission thrown away – sorry. During the day we did a little bit of tweaking to the server to improve throughput: removing nf_conntrack from the firewall to free up CPU for page rendering, and changing the Apache settings to queue earlier so people received either their requested page or the maintenance page more quickly.

Disabling the forums freed up lots of CPU time for the main page and gave us a mostly working site. Sometimes it’d deliver the maintenance page, but mostly people were receiving cached WordPress pages of the announcement and most of the comments were being accepted.

Super Cache not quite so super

Unfortunately, we were still seeing problems. The site would cope with the load happily for a good few minutes, and then suddenly have a load spike to the point where pages were not being generated fast enough. It appears that WP Super Cache wasn’t behaving exactly as intended.

When someone posts a comment, Super Cache invalidates its cache of the corresponding page, and starts to rebuild a new one, but providing you have this option ticked…

supercache-anonymouse

…(we did), the now out-of-date cached page should continue to be served until it is overwritten by the newer version.

After a while, we realised that the symptoms we were seeing were entirely consistent with this not working correctly, and once you hit very high traffic levels this behaviour becomes critical. If cached versions are not served whilst the page is being rebuilt, then subsequent requests also trigger rebuilds, so more and more CPU time goes on generating copies of the missing cached page, which makes each rebuild take longer, which in turn means yet more rebuilds pile up.

Now we can build a ludicrously overly simple model of this with a short bit of perl and draw a graph of how long it takes to rebuild the main page based on hit rate – and it looks like this.

Supercache performance

This tells us that performance falls off a cliff fairly suddenly at around 60-70 hits/second. At 12 hits/sec (typical usage) a rebuild of the page completes in considerably under a second, at 40 hits/sec (very busy) it’s about 4s, at 60 hits/sec it’s 30s, and at 80 hits/sec it’s well over five minutes. At that point the load balancers kick in and just display the maintenance page, waiting for the load to die down again before serving traffic as normal.
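For illustration, here’s a crude Python version of that sort of model (it is not the original perl; the ~10 uncached pages/second figure comes from the table above, everything else is made up). It shows the same runaway feedback, even if the exact numbers differ from the graph:

# Crude model of a cache stampede: while the cached page is missing, every hit
# starts its own rebuild, and all in-flight rebuilds share a fixed amount of
# rendering capacity, so the first rebuild takes longer and longer to finish.

def rebuild_time(hit_rate, capacity=10.0, dt=0.01):
    """Seconds until the first rebuild completes at a given hit rate (hits/sec)."""
    in_flight = 1.0   # the rebuild triggered by the invalidating comment
    progress = 0.0    # fraction of that first rebuild completed
    t = 0.0
    while progress < 1.0:
        progress += (capacity / in_flight) * dt  # CPU is shared between rebuilds
        in_flight += hit_rate * dt               # each hit starts another rebuild
        t += dt
        if t > 600:                              # effectively never recovers
            return float("inf")
    return t

for rate in (12, 40, 60, 80):
    print(f"{rate:3d} hits/sec -> {rebuild_time(rate):.1f}s to rebuild the page")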

We still don’t know exactly what the cause of this was – either it’s something else with exactly the same symptoms, or this setting wasn’t working, or it was interacting badly with another plugin – but as soon as we’d figured out the issue, we implemented the sensible workaround: we put a rewrite hack in to serve the front page and announcement page completely statically, then recreated those pages once every five minutes from cron, picking up all the newest comments. As if by magic the load returned to sensible levels, although there was now a small delay before new comments appeared.
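The exact hack isn’t shown here, but the general shape is simple enough to sketch (the URL, paths and schedule below are illustrative, not the actual config): a cron job regenerates a static copy of the page, and a rewrite rule serves that copy instead of invoking WordPress:

# crontab: refresh the static copy of the front page every five minutes,
# fetching it locally so the rewrite rule below doesn't hand back the old copy
*/5 * * * * curl -s -H 'Host: www.raspberrypi.org' http://127.0.0.1/ -o /var/www/static/front.html

# Apache: short-circuit everyone else's requests for the front page to the static copy
RewriteEngine on
RewriteCond %{REMOTE_ADDR} !=127.0.0.1
RewriteRule ^/?$ /static/front.html [L]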

Re-enabling the forums

With stable traffic levels, we turned the forums back on. And then immediately off again. They very quickly backed up the database server with connections, causing the forums to stop working and the main website to run slowly. A little further investigation into the InnoDB parameters revealed some contention on database locks, so we reconfigured, and this happened.

Our company pedant points out that actually only the database server process fell over, and it needed restarting, not rebooting. Cunningly, we’d managed to find a set of improved settings for InnoDB that allowed us to see all the tables in the database but not read any data out of them. A tiny bit of fiddling later and everything was happy.

The bandwidth graphs

We end up with a traffic graph that looks like this.

raspi-launch-bwgraph

On the launch day it’s a bit lumpy; this is because when we’re serving the maintenance page nobody can get to the downloads page, and downloads of operating system images and NOOBS normally dominate the traffic graphs. Over the next few days the HTML volume starts dropping and the number of system downloads for newly purchased Raspberry Pis starts increasing rapidly. At this point we were reminded of the work we did last year to build a fast distributed downloads setup, and were rather thankful, because we’re considerably beyond the traffic levels you can sanely serve from a single host.

Could do a bit better

The launch of Raspberry Pi 2 was a closely guarded secret, and although we were told in advance, we didn’t have a lot of time to prepare for the increased traffic. There are a few things we’d like to improve, and we will be talking to Raspberry Pi about them over the coming months. One is to upgrade the hardware, adding some more cores and RAM to the setup. Whilst we’re doing this it would be sensible to look at splitting the parts of the site into different VMs, so that the forums/database/WordPress have some separation from each other, making it easier to scale things. It would have been really nice to have put our extremely secret test setup with HipHop Virtual Machine into production, but that’s not yet well enough tested for primetime, although a seven-fold performance increase on page rendering certainly would be nice.

Schoolboy error

Talking with Ben Nuttall, we realised that the stripped-down, minimal, super-fast maintenance page didn’t have analytics on it. So the difference between our stats of 11 million page requests and Ben’s of 1.5 million indicates how many people during the launch saw the static maintenance page rather than a WordPress-generated page with comments. In hindsight, putting analytics on the maintenance page would have been a really good idea. Not every HTTP request which received the maintenance page was necessarily a request to see the launch, nor was each definitely a different visitor. Without detailed analytics that we don’t have, we can estimate the number of people who saw the announcement to be more than 1.5 million but less than 11 million.

Flaming, Bleeding Servers

Liz occasionally has slightly odd ideas about exactly how web servers work:

is-this-thing-on

Now, much to her disappointment we don’t have any photographs of servers weeping blood or catching fire. [Liz interjects: it’s called METAPHOR, Pete.] But when we retire servers we like to give them a bit of a special send-off.

Virtual Server performance boost

February 6th, 2015 by

We’ve just added an option to allow Virtual Servers to get full access to the CPU extensions available on the host server.

By default, virtual servers see a subset of CPU features that is available consistently across all of our hosts. For most users this has no impact on performance, but for some applications, such as performing certain types of encryption, speed can be substantially improved if certain processor extensions are available.

We’ve noticed significant improvements in OpenVPN throughput and latency after turning on this option on some of our servers.

CPU mode on our virtual servers can be configured using the “cpu” command on the admin shell.
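If you want to see what a guest actually gets after changing the CPU mode, the “aes” flag in /proc/cpuinfo shows whether the AES-NI instructions are exposed, and openssl’s built-in benchmark gives a quick before-and-after comparison (a generic check, not a Mythic Beasts tool):

grep -w -o -m1 aes /proc/cpuinfo     # prints "aes" if AES-NI is visible to the guest
openssl speed -evp aes-128-cbc       # compare throughput before and after switching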

Bring Your Own ISO

January 30th, 2015 by

Our Virtual Servers come with a virtual CD drive, allowing you to load an ISO image from our library and install an operating system of your choice, configured exactly how you want it.

We’ve just launched our “Bring Your Own ISO” feature, allowing you to upload your own ISO images, giving you complete freedom to install your choice of operating system, or to run a “live CD” distribution.

All users have a free 5GB allocation on our storage cluster for images, and files can be fetched from anywhere on the internet via HTTP, HTTPS, git, FTP or rsync.

Customers can upload a boot image via the “Boot Media” option on our customer control panel.