Conversion from PBM bitmap format to PostScript

Introduction

This describes my efforts to submit some old PPPL reports to http://arxiv.org. The text for these was written in TeX or LaTeX, but at the time the reports were written there was no provision to include graphics in TeX output. In those days, figures were submitted to journals as glossy photographs. For the arxiv submissions, I therefore scanned the figures from paper copies of the reports and reran LaTeX including the scanned figures. The discussion here supplements the advice given at http://arxiv.org/help/bitmap.

Black & white or gray?

I initially assumed that scanning the figures as gray-scale images would be preferable. After all, that was the way to avoid the “jaggies”. However, I eventually persuaded myself that black & white images were better for several reasons:

  1. Black & white scans at 300 dots per inch result in a legible printed version.
  2. These can be compressed really well as PostScript (as we shall see).
  3. The various tools that produce PDF files have a nasty habit of using lossy compression for gray and color images (which gives a horrible blotched effect with line drawings), but always use lossless compression for bitmaps.
  4. Black & white scans result in less “background” noise, e.g., from the printing on the other side of the paper.
  5. Printers do not print grays consistently. Thus you might be happy with a gray image as printed on one printer, but find that the apparently “white” background prints with gray splotches on another printer.
Scanning as gray-scale is useful, however, if you need to adjust the threshold between black and white. In this case, convert the scanned image to a bitmap with, for example, pgmtopbm -threshold -value 0.7. Gray-scale is also better if you need to correct a rotation in the image (for example, if the paper feed on your scanner sometimes misaligns the paper); rotate the image with pnmrotate, then convert it with pgmtopbm.
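The rotation and thresholding steps can be combined in one pipeline. The following is a minimal sketch, assuming the Netpbm tools are installed; the function name gray_to_bitmap and the file names are hypothetical, and the threshold of 0.7 is simply the example value from the text.

```shell
# Hypothetical helper: rotate a gray-scale scan, then threshold it to a bitmap.
# Usage: gray_to_bitmap ANGLE THRESHOLD INPUT.pgm > OUTPUT.pbm
gray_to_bitmap() {
    if [ "$#" -ne 3 ]; then
        echo "usage: gray_to_bitmap ANGLE THRESHOLD INPUT.pgm" >&2
        return 2
    fi
    # pnmrotate takes the rotation angle in degrees; pgmtopbm then
    # converts the rotated gray-scale image to black & white.
    pnmrotate "$1" "$3" | pgmtopbm -threshold -value "$2"
}
```

For example, gray_to_bitmap 1.5 0.7 scan.pgm > scan.pbm would rotate a (hypothetical) scan.pgm by 1.5 degrees before thresholding. Rotating first matters: the rotation is done while the gray levels are still available, and only the final result is reduced to a bitmap.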

An aside: pnmrotate uses 3 shears to implement the rotation and the manual page credits the method to

Alan W. Paeth,
“A Fast Algorithm for General Raster Rotation”,
Proc. Graphics Interface '86, Canadian Information Processing Soc., pp. 77–81, May 1986.
In fact, I had discovered the same method a year earlier. See Section V of
Charles F. F. Karney,
“Numerical techniques for the study of long-time correlations”,
Particle Accelerators, 19(1–4):213–221, May 1986
[Proc. Workshop on Orbital Dynamics and Applications to Accelerators, Berkeley, CA, Mar. 7–12, 1985;
Princeton Univ. Rept. PPPL–2218 (May 1985) 9 pp.]
The same method is also described in
A. Tanaka, M. Kameyama, S. Kazama, and O. Watanabe,
“A Rotation Method for Raster Image Using Skew Transformation”,
Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 272–277, June 1986.
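
For reference, the three-shear idea rests on a standard matrix identity. The notation below (a counterclockwise rotation by an angle \theta) is an assumption chosen for illustration and is not taken from the manual page or from the papers cited above; the sign conventions in those sources may differ.

```latex
\[
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
=
\begin{pmatrix} 1 & -\tan(\theta/2) \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ \sin\theta & 1 \end{pmatrix}
\begin{pmatrix} 1 & -\tan(\theta/2) \\ 0 & 1 \end{pmatrix}
\]
% Check of the diagonal entries:
%   1 - \tan(\theta/2)\sin\theta = 1 - 2\sin^2(\theta/2) = \cos\theta
% Check of the upper-right entry:
%   -\tan(\theta/2)\,(1 + \cos\theta) = -2\sin(\theta/2)\cos(\theta/2) = -\sin\theta
```

Each factor is a shear that moves pixels along one axis only, so each pass reduces to shifting whole rows or whole columns of the raster.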

Useful formats

Here is a test on the bitmap image test.png (this is figure 8 from arxiv:nlin.CD/0501023). This image is a 2021 × 1956 scan at 300 dpi of a paper copy of the figure. The following table shows the size of the image file in various formats.

format                       size (kbytes)
PBM (raw)                    484
PNG (-compress 9)             27
PDF                           14
PS (pnmtops)                 984
PS (pnmtops -rle)            139
PS (imgtops -2, v 1.0)       148
PS (imgtops -3, v 1.0)        64
PS (imgtops -2, v 1.0a)       92
PS (imgtops -3, v 1.0a)       40
PS (tiff2ps)                  18

The PNG file is produced with

res=300
resm=`expr '(' 10000 '*' $res + 127 ')' / 254` # dots/meter
pnmtopng -compress 9 -size "$resm $resm 1" test.pbm > test.png
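
The expr line converts the resolution from dots per inch to dots per meter (1 inch = 0.0254 m), rounding to the nearest integer. The same arithmetic can be sketched with POSIX shell arithmetic expansion; the function name dpi_to_dpm is hypothetical.

```shell
# Hypothetical helper: dots per inch -> dots per meter, rounded to nearest.
# dots/meter = dpi / 0.0254 = 10000 * dpi / 254; adding 127 (half of 254)
# before the integer division rounds the result instead of truncating it.
dpi_to_dpm() {
    echo $(( (10000 * $1 + 127) / 254 ))
}

dpi_to_dpm 300   # prints 11811
```
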
The PDF file is produced with
res=300
pnmtops -dpi $res -equalpixels -noturn -rle -nosetpage test.pbm |
  epstopdf --filter > test.pdf
Surprisingly, PDF offers much better compression than PNG for bitmaps. Thus PDF is the preferred format for inclusion of bitmap graphics using pdflatex.

In order to include graphics with latex, it is necessary to convert the image to PostScript and here we have several options.

The clear winner here is tiff2ps. This uses the same compression as PDF, with the ratio of the resulting file sizes being 4:5 due to encoding the binary image in ASCII85. The file prints OK on a Level 2 PostScript printer. (However, on some models printing may be slow. Printing on an HP LaserJet 2100M is about 7 times slower than the output of pnmtops. Printing on an HP LaserJet 4m Plus, on the other hand, is about 30% faster than the output of pnmtops.) I found it convenient to write a script, pbmtops, to convert a PBM file to PostScript via tiff2ps. I include a comment in pbmtops indicating how to convert the PostScript file so that black and white are replaced by some other colors, although I'm not sure how useful this will be in practice. pbmtops calls dscframe, which puts DSC comments around the image so that other software can skip over the image quickly. It also means that it's OK for an image line to begin with %%.
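
The 4:5 ratio follows from the way ASCII85 works: each group of 4 binary bytes is written as 5 printable characters. As a rough check against the table, the following sketch estimates the encoded size, ignoring line breaks, the end-of-data marker, and the surrounding PostScript code; the function name ascii85_size is hypothetical.

```shell
# Hypothetical helper: approximate size after ASCII85 encoding.
# Computes ceil(5 * n / 4) with integer arithmetic.
ascii85_size() {
    echo $(( (5 * $1 + 3) / 4 ))
}

ascii85_size 14   # prints 18
```

An input of 14 gives 18, which is consistent with the 14 kbyte PDF and 18 kbyte tiff2ps entries in the table.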

Recommendations


Charles Karney (2005-06-27)