Some more benchmarks
CPU benchmarks
Here are some Geekbench scores (32-bit tryout version) for some of our servers and, for comparison, some Amazon EC2, Azure, ElasticHosts and Linode instances.
Server | Geekbench score |
---|---|
Dual Hex Core 2GHz Sandy Bridge (debian) (E5-2630L) | 18265 |
Hex Core 2GHz Sandy Bridge (debian) (E5-2630L) | 11435 |
Quad Core 2.3GHz Ivy Bridge (ubuntu) (i7-3615QM) | 12105 |
Quad Core 2.0GHz Sandy Bridge (debian) (i7-2635QM) | 9135 |
Dual Core 2.3GHz Sandy Bridge (debian) (i5-2415M) | 6856 |
Dual Core 2.66GHz Core 2 Duo (debian) (P8800) | 3719 |
Dual Core 1.83GHz Core 2 Duo (debian) (T5600) | 2547 |
Toshiba Z930 laptop (Ivy Bridge i7-3667U) | 6873 |
Amazon EC2 t1.micro instance (ubuntu) (E5430 1 virtual core) | 2550 |
Amazon EC2 c1.xlarge instance (ubuntu) (E5506 8 virtual cores) | 7830 |
Amazon EC2 hi1.4xlarge instance (ubuntu) (E5620 16 virtual cores) | 10849 |
Azure Small (1 core AMD Opteron 4171 HE @ 2.09GHz / 1.75GB) | 2510 |
Azure Extra Large (8 core AMD Opteron 4171 HE @ 2.09GHz / 14GB) | 7471 |
ElasticHosts ‘2000MHz’ single core VM (Opteron 6128) | 2163 |
ElasticHosts ‘20000MHz’ eight core VM (Opteron 6128) | 6942 |
Linode 512MB VDS (L5520 4 virtual cores) | 4469 |
Mythic Beasts 1GB VDS (L5630 1 virtual core) | 2966 |
Mythic Beasts 64GB VDS (L5630 4 virtual cores) | 4166 |
The method here is pretty simple: take the default OS install, copy the Geekbench 32-bit tryout edition onto the machine, run it and record the results.
It’s important to remember that Geekbench performs a mixture of tests, some of which don’t parallelise. This means a server with a fast core will receive a higher score than one with lots of slower cores. As a result the Sandy Bridge and Ivy Bridge machines score very highly, because those CPUs boost the clock speed of a single core when the other cores are idle.
Disk benchmarks
We have several disk subsystems available: single disk, dual-disk mirrored software RAID, dual-disk mirrored hardware RAID, 8-disk hardware RAID array and a PCI-E SSD accelerator card.
Read only benchmarks
The benchmark here is carried out with iops, a small Python script that performs random reads.
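For illustration, here is a minimal sketch of this kind of random-read test; it is not the actual iops script, and the device path, block size and run time are placeholder assumptions:

```
#!/usr/bin/env python
# Minimal random-read benchmark sketch (illustrative only, not the iops
# script itself). It seeks to random block-aligned offsets in a device or
# large file, reads one block at a time, and reports IOPS and data rate.
# Note: without O_DIRECT, or a target much larger than RAM, the page
# cache will flatter the numbers.
import os
import random
import time

def random_read_benchmark(path, block_size=4096, duration=10):
    fd = os.open(path, os.O_RDONLY)
    size = os.lseek(fd, 0, os.SEEK_END)        # size of the device/file
    reads = 0
    start = time.time()
    while time.time() - start < duration:
        offset = random.randrange(0, size - block_size)
        offset -= offset % block_size          # keep reads block-aligned
        os.lseek(fd, offset, os.SEEK_SET)
        os.read(fd, block_size)
        reads += 1
    os.close(fd)
    iops = reads / (time.time() - start)
    print("%.1f IOPS, %.1f kB/sec" % (iops, iops * block_size / 1024.0))

if __name__ == "__main__":
    # e.g. run against a whole disk (needs root) or a large pre-made file
    random_read_benchmark("/dev/sda", block_size=4096)
```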
4kB reads
IO Subsystem | IOPS | Data rate |
---|---|---|
Single SATA disk | 60.5 | 242kB/sec |
Mirrored SATA disk | 149 | 597kB/sec |
Hardware RAID 1 SATA disk | 160.2 | 640kB/sec |
Hardware RAID 10 SATA 6-disk | 349 | 1.4MB/sec |
Hardware RAID 10 4 disk Intel 520 SSD | 21426 | 83.7MB/sec |
Hardware RAID 0 6 disk SAS 15krpm | 104 | 416kB/sec |
Intel 910 SSD | 28811 | 112MB/sec |
Apple 256GB SATA SSD | 21943 | 85.7MB/sec |
Intel 710 300GB SSD RAID1 Hardware BBU | 24714 | 96.5MB/sec |
Amazon micro instance (EBS) | 557 | 2.2MB/sec |
Amazon c1.xlarge instance (local) | 1746 | 6.8MB/sec |
Amazon c1.xlarge instance xvda (local) | 325 | 1.2MB/sec |
Amazon m1.xlarge EBS optimised, 2000IOPS EBS | 69 | 277kB/sec |
Amazon hi1.4xlarge software RAID on 2x1TB SSD | 22674 | 88.6MB/sec |
Azure small (sda) | 73.3 | 293kB/sec |
Azure small (sdb) | 16010 | 62.5MB/sec |
Azure Extra Large (sda) | 86.4 | 345kB/sec |
Azure Extra Large (sdb) | 10136 | 39.6MB/sec |
ElasticHosts disk storage | 54.1 | 216.6kB/sec |
ElasticHosts SSD storage | 437 | 1.7MB/sec |
Mythic Beasts 1GB VDS | 65.3 | 261kB/sec |
Linode 512MB VDS | 475 | 1.9MB/sec |
1MB reads
IO Subsystem | IOPS | Data rate |
---|---|---|
Single SATA disk | n/a | n/a |
Mirrored SATA disk | 48.7 | 48.7MB/sec |
Hardware RAID 1 SATA disk | 24.9 | 24.9MB/sec |
Hardware RAID 10 SATA disk | 23.2 | 23.2MB/sec |
Intel 910 SSD | 525 | 524MB/sec |
Apple 256GB SATA SSD | 477 | 477MB/sec |
Intel 710 300GB SSD RAID1 Hardware BBU | 215 | 215MB/sec |
Hardware RAID 10 4 disk Intel 520 SSD | 734 | 734MB/sec |
Hardware RAID 0 6 disk SAS 15krpm | 24 | 24MB/sec |
Amazon micro instance (EBS) | 71 | 71MB/sec |
Amazon c1.xlarge instance xvdb (local) | 335 | 335MB/sec |
Amazon c1.xlarge instance xvda (local) | 81.4 | 114MB/sec |
Amazon m1.xlarge EBS optimised, 2000IOPS EBS | 24 | 24MB/sec |
Amazon hi1.4xlarge software RAID on 2x1TB SSD | 888 | 888MB/sec |
Azure small (sda) | n/a | n/a |
Azure small (sdb) | | |
Azure Extra Large (sda) | n/a | n/a |
Azure Extra Large (sdb) | 1817 | 1.8GB/sec |
ElasticHosts disk storage | n/a | n/a |
ElasticHosts SSD storage | 49.6 | 49.6MB/sec |
Mythic Beasts 1GB VDS | 44.7 | 44.7MB/sec |
Linode 512MB VDS | 28 | 28MB/sec |
It’s worth noting that with 64MB reads the Intel 910 delivers 1.2GB/sec and the hi1.4xlarge instance 1.1GB/sec (curiously the Amazon machine was quicker with 16MB blocks). At the smaller block sizes the machine appears to be bottlenecked on CPU rather than on the PCI-E accelerator card. The RAID10 array had a stripe size of 256kB, so a 1MB read requires a seek on every disk – hence performance similar to that of a single disk, as the limitation is seek rather than transfer time. There’s a reasonable argument that a more sensible setup is RAID1 pairs with LVM striping on top, giving much larger stripe sizes than the controller natively supports.
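To make the arithmetic behind that argument explicit, here is a back-of-envelope sketch; the layout (a 6-disk RAID10 treated as three mirrored pairs striped together) is an assumption for illustration rather than a precise model of the controller:

```
# Back-of-envelope check of the stripe-size argument above. The figures
# are illustrative assumptions: a 6-disk RAID10 modelled as 3 mirrored
# pairs striped together with a 256kB stripe.
stripe_kb = 256      # controller stripe size
read_kb = 1024       # benchmark block size (1MB)
pairs = 3            # 6 disks in RAID10 = 3 mirrored pairs striped together

stripes_per_read = read_kb // stripe_kb               # -> 4 stripes
print("stripes touched per 1MB read: %d" % stripes_per_read)
print("read spans every mirrored pair: %s" % (stripes_per_read >= pairs))
# Since every request makes every spindle seek, independent 1MB reads
# cannot be serviced from different disks in parallel, so the array's
# random-read IOPS collapses towards that of a single seek-bound disk.
```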
We’re not sure why the SAS array benchmarks so slowly; it is an old machine (five years old) but it is set up for performance rather than reliability.
Write only benchmarks
I went back to rate.c, a synchronous disk benchmarking tool we wrote when investigating and improving UML disk performance back in 2006. I generated a 2GB file, ran random-sized synchronous writes into it, and then read out the performance for 4kB and 1MB block sizes. The reason for a 2GB file is that our Linode instance runs a 32-bit OS and rate.c does all of its benchmarking into a single file limited to 2GB.
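For anyone who wants to reproduce something similar, here is a minimal Python sketch of a synchronous random-write test of this kind; it is not rate.c (which is a C program), and the file name, run time and block sizes are illustrative:

```
# Minimal synchronous random-write benchmark sketch (illustrative; rate.c
# itself is a small C program). It writes block_size chunks at random
# offsets in a pre-allocated file, fsync()ing after every write so the
# numbers reflect the disk/controller rather than the page cache.
import os
import random
import time

def random_write_benchmark(path, file_size=2 * 1024**3, block_size=4096,
                           duration=30):
    block = b"\0" * block_size
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.ftruncate(fd, file_size)            # pre-allocate a 2GB test file
    writes = 0
    start = time.time()
    while time.time() - start < duration:
        offset = random.randrange(0, file_size - block_size)
        offset -= offset % block_size      # keep writes block-aligned
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, block)
        os.fsync(fd)                       # force it to stable storage
        writes += 1
    os.close(fd)
    print("%d-byte writes: %.1f IOPS" % (block_size,
                                         writes / (time.time() - start)))

if __name__ == "__main__":
    # e.g. run at both block sizes used in the table below
    random_write_benchmark("testfile", block_size=4096)
    random_write_benchmark("testfile", block_size=1024 * 1024)
```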
Write performance
IO Subsystem | IOPS at 4kB | IOPS at 1MB |
---|---|---|
Software RAID 1 | 84 | 31 |
Linode 512MB VM | 39 | 25 |
Mythic Beasts 1G VM | 116 | 119 |
Mythic Beasts 1G VM | 331 | 91 |
Mythic Beasts 1G VM | 425 | 134 |
2x2TB RAID1 pair with BBU | 746 | 54 |
6x2TB RAID10 array with BBU | 995 | 99 |
400GB Intel 910 SSD | 2148 | 379 |
256GB Apple SATA SSD | 453 | 96 |
2x300GB Intel 710 SSD RAID1 pair with BBU | 3933 | 194 |
Hardware RAID 10 with 4xIntel 520 SSD | 3113 | 623 |
Hardware RAID 0 with 6x15krpm SAS | 2924 | 264 |
Amazon EC2 micro, EBS | 78 | 23 |
Amazon EC2 m1.xlarge, EBS | 275 | 24 |
Amazon EC2 m1.xlarge, EBS provisioned with 600IOPS | 577 | 35 |
Amazon EC2 m1.xlarge, instance storage | 953 | 45 |
Amazon EC2 m1.xlarge, EBS optimised, EBS | 246 | 27 |
Amazon EC2 m1.xlarge, EBS optimised, EBS with 2000IOPS | 670 | 42 |
Amazon EC2 hi1.4xlarge, software RAID on 2x1TB SSD | 2935 | 494 |
Azure small (sda) | 24.5 | 5.8 |
Azure small (sdb) | 14 | 11 |
Azure Extra Large (sda) | 34 | 6 |
Azure Extra Large (sdb) | 6.1 | 5.1 |
ElasticHosts disk storage | 12.8 | 7.7 |
ElasticHosts SSD storage | 585 | 50 |
I think there’s a reasonable argument that this is reading high for small writes on the BBU controllers (including the VMs and the Linode VM). It’s entirely possible that the controllers manage to cache the vast majority of writes in RAM, and that the performance wouldn’t be sustained over the longer term.
Real world test
We presented these results to one of our customers who has a moderately large database (150GB). Each night they take a database backup, post-process it, then reimport it into another database server in order to do some statistical processing on it. The bottleneck in their process is the database import. We borrowed their database, and this is the timing data for a PostgreSQL restore. The restore file is pulled from the same media the database is written to.
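Roughly speaking, each timing below is the wall-clock time of a restore along these lines; this is an illustrative sketch with placeholder database and dump file names, not the customer's actual pipeline:

```
# Rough sketch of timing a PostgreSQL restore (illustrative; "testdb" and
# "backup.dump" are placeholders and the options are just an example).
import subprocess
import time

start = time.time()
subprocess.check_call(["pg_restore", "--no-owner", "-d", "testdb",
                       "backup.dump"])
elapsed = time.time() - start
print("restore took %dh %dm %ds" % (elapsed // 3600,
                                    (elapsed % 3600) // 60,
                                    elapsed % 60))
```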
Server | Restore time |
---|---|
Hex core 2.0GHz Sandy Bridge, 128GB RAM, 2TB SATA hardware RAID 1 with BBU | 2h 35m 24s |
Hex core 2.0GHz Sandy Bridge, 128GB RAM, 400GB Intel 910 SSD | 1h 45m 8s |
Hex core 2.0GHz Sandy Bridge, 128GB RAM, 2x300GB Intel 710 SSD hardware RAID 1 with BBU | 2h 0m 33s |
Quad core 2.3GHz Ivy Bridge, 4GB RAM, 1TB SATA software RAID 1 | 4h 16m 14s |
Quad core 2.3GHz Ivy Bridge, 16GB RAM, 1TB SATA software RAID 1 | 3h 38m 3s |
Quad core 2.3GHz Ivy Bridge, 16GB RAM, 256GB SATA SSD | 1h 54m 38s |
Quad core E3-1260L 2.4GHz Ivy Bridge, 32GB RAM, 4xIntel 520 SSD hardware RAID 10 with BBU | 1h 29m 33s |
Hex core E5450 3GHz, 24GB RAM, 6x15krpm SAS hardware RAID 0 with BBU | 1h 58m |
Amazon EC2 m1.xlarge with 200GB of 600IOPS EBS | 5h 55m 36s |
Amazon EC2 m1.xlarge with 200GB of 2000IOPS EBS | |
Amazon EC2 hi1.4xlarge with 2x1TB RAID1 SSD |
Azure Extra Large sdb (ephemeral storage) | 6h 18m 29s |
ElasticHosts 4000MHz / 4GB / 200GB hard disk | 5h 57m 39s |
ElasticHosts 20000MHz / 32GB / 200GB SSD | 3h 16m 55s |
KVM Virtual Machine (8GB / 8 cores) running on 16GB 2.3GHz Ivy Bridge server, software RAID1 with unsafe caching | 4h 10m 30s |
The Postgres import is mostly single threaded – usually the server sits at 100% CPU on one core with the others idle, with only occasional bursts of parallelism. Consequently the CPU is usually bursting to 2.5GHz (Sandy Bridge) or 3.3GHz (Ivy Bridge). The Ivy Bridge RAID1 machine is actually a Mac Mini. In many ways this is an application perfectly suited to ‘the cloud’, because you’d want to spin up a fast VM, import the database, then start querying it. It’s important to note that the estimated lifetime of the Intel 520 RAID 10 array under this workload is six months; the performance gain there over the 910 SSD is entirely due to faster single threaded performance on the CPU.
Bias
Whilst I’ve tried to be impartial, these results are obviously biased. When Mythic Beasts choose hardware for our dedicated and virtual server platforms we deliberately seek out the servers that we think offer the best value, so to some extent our servers have been chosen because historically they’ve performed well at the type of benchmarks we test with. There’s also publication bias: if the results had said emphatically that our servers were slow and overpriced, we’d have fixed our offering and then rewritten the article based on the newer, faster servers we now had.
Notes
The real world test covers two scenarios: the delay in getting a test copy of the database up for querying, for which temporary storage may be fine, and, in the event of something going hideously wrong, a measure of how long your site is down until the database comes back, in which case persistent storage is terrifically important.
I plan to add an m2.xlarge + EBS instance, and a hi1.4xlarge instance. I originally didn’t include the hi1.4xlarge because it doesn’t have EBS optimised volumes for persistent storage. I might also add some test Mythic Beasts VMs with both safe and unsafe storage (i.e. all writes cached in the host’s RAM, ignoring sync calls), which is a cheap and easy way to get the equivalent of instance storage with a huge performance benefit. I excluded the Linode VM from the final test as it’s too small.