One square inch of the Unicode poster (1 A0 sheet) at 300dpi

A Unicode Poster

Update: I've updated the scripts so that they work with Unicode 5.1.0 and made them available again in a git repository on github.

Strangely, it seems as if lots of people don't understand why I think the Unicode and ISO 10646 character set standard is such a marvellous, doomed, never-ending endeavour. Anyway, I thought it would be excellent to have a giant poster of all the displayable characters in Unicode, and it turned out that someone (Ian Albert) had already created such a poster. So why did I generate my own instead of just downloading and printing out that version? Essentially:

It's clear from this Flickr photo that lots of other people have done this too. After doing the first version of this script, I also discovered another page about making Unicode posters by Peter Harkins, which had some helpful suggestions that I incorporated.

The effect that these posters have on people is rather interesting. There seems to be a sharp divide between those whose reaction is total disinterest and those who get stuck for ages looking through all the different scripts and writing systems.

Thank you to Chris and James for their advice on various aspects of putting this poster together.

What does the poster look like?

See the reduced versions in the section Layout Alternatives below.

Legibility Issues

The chief problem with this poster is that there are so many characters that even when printed onto A0, you basically need a magnifying glass to see anything. The image above and to the right is one square inch of the final poster (reduced to 300dpi). I've played around with different ways of formatting the poster and separating blocks - my current favourite is the "inline" version described below. If you want to get an idea of how large the characters will be on various numbers of A0 sheets, the images below show the top left 1x2 inches, each reduced to 72dpi (a typical screen resolution).

On 1 A0 sheet
On 2 A0 sheets

One option for printing out the poster is to divide it up with pamdice into 16 A4 sheets and stick them together. The advantage of this is that you can get a good idea of what it looks like just from a normal printer (my absurdly-cheap-but-good Samsung ML-1610 had no problem printing the 600dpi A4 bitmaps), but of course the ideal is to print it out on the highest-definition A0 plotter you can find.
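The 16-sheet split works because a 4 x 4 grid of A4 pages has almost exactly the dimensions of one A0 sheet. As a rough sketch of where the cuts fall (the function names are made up for illustration, and the pixel sizes assume ISO 216 paper dimensions rounded to whole pixels), the crop box for each tile could be computed like this:

```python
MM_PER_INCH = 25.4

# ISO 216 A0 in millimetres (portrait).
A0_MM = (841, 1189)


def mm_to_px(mm, dpi):
    """Convert a length in millimetres to whole pixels at the given dpi."""
    return round(mm / MM_PER_INCH * dpi)


def tile_boxes(width_px, height_px, cols=4, rows=4):
    """Return (left, top, right, bottom) crop boxes, row by row.

    Every edge is derived from the full image size, so neighbouring
    tiles share edges exactly and no pixels are dropped or duplicated.
    """
    boxes = []
    for row in range(rows):
        top = height_px * row // rows
        bottom = height_px * (row + 1) // rows
        for col in range(cols):
            left = width_px * col // cols
            right = width_px * (col + 1) // cols
            boxes.append((left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    dpi = 600
    width_px = mm_to_px(A0_MM[0], dpi)
    height_px = mm_to_px(A0_MM[1], dpi)
    print(f"A0 at {dpi}dpi: {width_px} x {height_px} pixels")
    for box in tile_boxes(width_px, height_px):
        print(box)
```

In practice pamdice does the actual cutting; this only illustrates the geometry of the tiles it needs to produce.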

Downloadable Versions

Sadly, I don't believe that I'm allowed to provide downloadable versions of the final poster. Each of the code charts has the following text in it:

“You may freely use these code charts for personal or internal business uses only. You may not incorporate them either wholly or in part into any product or publication, or otherwise distribute them without express written permission from the Unicode Consortium. However, you may provide links to these charts.
“The fonts and font data used in production of these Code Charts may NOT be extracted, or used in any other way in any product or publication, without permission or license granted by the typeface owner(s).”

I may write to them to clarify their position on such posters, but I suspect the answer will be disappointing. ☹ I can't see anything that prohibits me from using the charts for this purpose, which is clearly just personal use, I think.

How was the poster generated?

If you want to see all the gory details of how to generate these posters, you can download the scripts from github - the repository is here:

    http://github.com/mhl/unicode-poster/tree/master

I don't necessarily recommend trying this yourself - you'll need 30GiB of free disk space and lots of patience, and you'll have to be happy about editing hastily-written Ruby scripts to do what you want.

(In fact, there's a much better way of doing this that James pointed out. All the code charts have fonts for the character grids embedded in the PDFs (you can see them with pdffonts) and these can be simply extracted. So, you could create a much more elegant PostScript version with all the fonts embedded which would be smaller, faster, better, etc. Of course, this approach is prohibited by the terms of use, so a better approach would be to use some of the many genuinely free TrueType fonts covering much of Unicode.)

I roughly followed the method suggested by Ian Albert, but with a hotchpotch of Ruby scripts. One lesson that became clear early on is that ImageMagick is basically hopeless for this kind of task. For example, "convert" uses Ghostscript to convert from PDFs to PNGs, but in such a way that it takes forever (and All The Memory) on larger files. Just invoking gs directly avoids this. In addition, "montage" just doesn't work for joining very large images. GIMP and recent versions of the netpbm tools, on the other hand, cope very well with the large images.
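To give an idea of what invoking gs directly looks like, here is a minimal sketch in Python rather than Ruby. It is not taken from the scripts in the repository: the function names are made up, and the output device and resolution are assumptions chosen for illustration. The flags themselves are standard Ghostscript options.

```python
import subprocess


def ghostscript_command(pdf_path, out_path, dpi=600, device="pnggray"):
    """Build a Ghostscript command line that rasterises a PDF.

    The device and resolution defaults are assumptions, not the
    settings that were used to make the poster.
    """
    return [
        "gs",
        "-q",                  # suppress the startup banner
        "-dSAFER",             # restrict file access from the document
        "-dBATCH",             # exit once the input has been processed
        "-dNOPAUSE",           # don't wait for a keypress between pages
        f"-sDEVICE={device}",  # output device (pnggray = 8-bit grey PNG)
        f"-r{dpi}",            # output resolution in dots per inch
        f"-sOutputFile={out_path}",
        pdf_path,
    ]


def rasterise(pdf_path, out_path, dpi=600):
    """Run Ghostscript; raises CalledProcessError if gs fails."""
    subprocess.run(ghostscript_command(pdf_path, out_path, dpi), check=True)
```

For a multi-page chart, an output name such as "chart-%03d.png" (a hypothetical file name) makes Ghostscript write one numbered file per page, for example rasterise("chart.pdf", "chart-%03d.png").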

The scripts will basically:

Printing Issues

Bitmaps of this size are a real pain to print. If you have access to an HP DesignJet without an annoying print server in the way, then you're in luck: you can send HP RTL to it, or TIFF files might work too. If you must go via PostScript (as is enforced by the informatics print server, for example) then you're probably not going to be able to print it out losslessly at the full resolution of the printer. The current version on my wall was produced by using Photoshop and the HP DesignJet printer driver to print-to-file, then sending that file to the printer. Unfortunately, this only seems to work if (aarrgh) you check the JPEG encoding box, and I suspect that the version I have is only 300dpi. Later versions of the poster were printed by converting the 600dpi PGM file to PCL3GUI, wrapping it in PJL and sending it directly to an HP DesignJet (thanks to code supplied by James, who also kindly printed a couple out for me).
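The PJL wrapping step is simple enough to sketch. The following is only an illustration and not the code that was actually used: wrap_in_pjl is a made-up name, the payload is assumed to be valid PCL3GUI data produced by some other tool, and a real job may well need further PJL commands that your particular printer expects.

```python
# PJL "Universal Exit Language" sequence, which starts and ends a job.
UEL = b"\x1b%-12345X"


def wrap_in_pjl(payload, language="PCL3GUI"):
    """Wrap already-encoded printer data in a minimal PJL envelope.

    `payload` must already be in the named printer language; this
    function does no conversion of its own.
    """
    header = (
        UEL
        + b"@PJL\r\n"
        + b"@PJL ENTER LANGUAGE=" + language.encode("ascii") + b"\r\n"
    )
    return header + payload + UEL
```

The result is a byte string that can be written to a file and sent to the printer by whatever raw route you have available.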

Layout Alternatives

I've played around with a few different layout strategies for these posters, and settled on two which I think are best. The first has quite a bit of whitespace to make the separation into blocks clear. The other one just puts all the block headings and characters inline so that the characters are as large as possible.

The layout with lots of whitespace is shown below, but with all the characters fitted onto two A4 pages (at 300dpi) - these are useless for posters, of course, but will show you what the layout of the A0 posters would be. (Note that this is for version 4.1 of the standard, which has many fewer characters, so comparing the layouts directly with the inline version is difficult.)

The layout with everything inline is similarly shown below - this is for version 5.0, though, which has many more characters.

This is the inline layout again, but squeezed onto one A4 sheet at 300dpi. For some indication of how legible this is, see the photos below:

The following two images are two characters cropped out of a 300dpi single-sheet A0 bitmap, and a microscope image of those characters on a test printing of the same image. (The microscope images were taken using the Proscope ("As seen on the hit CBS TV series CSI & CSI Miami!"!!?!!1111). Sadly, I couldn't get it to work on Linux, so those are taken from a Mac.)

U+2614 UMBRELLA WITH RAIN DROPS and U+2615 HOT BEVERAGE in the final bitmap
U+2614 UMBRELLA WITH RAIN DROPS and U+2615 HOT BEVERAGE on a test printing; the ruler gradations are 1mm apart

And similarly with two other characters:

U+01E2 LATIN CAPITAL LETTER AE WITH MACRON and U+01E3 LATIN SMALL LETTER AE WITH MACRON in the final bitmap
U+01E2 LATIN CAPITAL LETTER AE WITH MACRON and U+01E3 LATIN SMALL LETTER AE WITH MACRON on a test printing; the ruler gradations are 1mm apart

The DesignJet printers understand 24bit data at 600dpi, regardless of what the actual output looks like, so I could try doing some more versions at twice the resolution to see what they look like. They're very pricy to print, though, so I'll have to wait for some further funding :(
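Doubling the resolution quadruples the number of pixels, and going from greyscale to 24-bit colour triples the size again, so it's worth working out the numbers before trying it. A quick sketch of the uncompressed sizes involved, assuming ISO 216 A0 dimensions and a made-up function name:

```python
MM_PER_INCH = 25.4
A0_MM = (841, 1189)  # ISO 216 A0, portrait


def bitmap_bytes(dpi, bits_per_pixel, size_mm=A0_MM):
    """Uncompressed size in bytes of a bitmap covering the whole sheet."""
    width_px = round(size_mm[0] / MM_PER_INCH * dpi)
    height_px = round(size_mm[1] / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    for dpi, bpp in [(300, 8), (600, 8), (600, 24)]:
        gib = bitmap_bytes(dpi, bpp) / 2**30
        print(f"{dpi}dpi, {bpp}-bit: {gib:.2f} GiB")
```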