Conversion from PBM bitmap format to PostScript
Introduction
This describes my efforts to submit some old
PPPL reports to
http://arxiv.org. The text for these
was written in TeX or LaTeX, but at the time the reports were written
there was no provision to include graphics in TeX output. In those
days, figures were submitted to journals as glossy photographs. For the
arxiv submissions, I therefore scanned the figures from paper copies of
the reports and reran LaTeX including the scanned figures. The discussion
here supplements the advice given at
http://arxiv.org/help/bitmap.
Black & white or gray?
I initially assumed that scanning the figures as gray-scale images would
be preferable. After all, that was the way to avoid the
“jaggies”. However, I eventually persuaded myself that
black & white images were better for several reasons:
- Black & white scans at 300 dots per inch result in a legible printed
version.
- These can be compressed really well as PostScript (as we shall see).
- The various tools that produce PDF files have a nasty habit of using
lossy compression for gray and color images (which gives a horrible
blotched effect with line drawings), but always use lossless
compression for bitmaps.
- Black & white scans result in less “background” noise,
e.g., from the printing on the other side of the paper.
- Printers do not print grays consistently. Thus you might be happy with
a gray image as printed on one printer, but find that the apparently
“white” background prints with gray splotches on another
printer.
Scanning as gray-scale is useful, however, if you need to fiddle with the
threshold between black and white. In this case, convert the scanned
image to a bitmap with pgmtopbm -threshold -value 0.7, for
example. Gray-scale is also better for correcting a rotation in the image
(for example, if the paper feed on your scanner sometimes misaligns the
paper); this can be corrected with pnmrotate, followed by pgmtopbm.
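The two steps can be chained in a small shell function. This is only a sketch: the function name deskew_to_bitmap, the default threshold of 0.7, and the file names in the usage comment are illustrative assumptions, and the rotation angle has to be found by inspecting the scan.

```shell
#!/bin/sh
# Hypothetical helper (not part of netpbm): rotate a gray-scale PGM read
# from standard input by $1 degrees, then threshold it at $2 (default 0.7)
# to give a PBM on standard output.
deskew_to_bitmap() {
    angle=$1
    value=${2:-0.7}
    # Rotate while the image is still gray-scale, then reduce to a bitmap.
    pnmrotate "$angle" | pgmtopbm -threshold -value "$value"
}

# Example use (file names are placeholders):
#   deskew_to_bitmap 1.5 0.7 < scan.pgm > scan.pbm
```

Rotating before thresholding lets pnmrotate interpolate between gray levels, so the edges of lines are less ragged than they would be if an already thresholded bitmap were rotated.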
An aside: pnmrotate uses 3 shears to implement the rotation and the
manual page credits the method to
Alan W. Paeth,
“A Fast Algorithm for General Raster Rotation”,
Proc. Graphics Interface '86, Canadian Information Processing Soc.,
pp. 77–81, May 1986.
In fact, I had discovered the same method a year earlier. See Section V of
Charles F. F. Karney,
“Numerical techniques for the study of
long-time correlations”,
Particle Accelerators, 19(1–4):213–221, May 1986
[Proc. Workshop on Orbital Dynamics and Applications to Accelerators,
Berkeley, CA, Mar. 7–12, 1985;
Princeton Univ. Rept. PPPL–2218 (May 1985) 9 pp.]
The same method is also described in
A. Tanaka, M. Kameyama, S. Kazama, and O. Watanabe,
“A Rotation Method for Raster Image Using Skew Transformation”,
Proc. IEEE Conference on Computer Vision and Pattern Recognition,
pp. 272–277, June 1986.
Useful formats
- PBM, the raw bitmap format of the
netpbm package.
- PS, PostScript—can be inserted as graphics by LaTeX.
- PNG, lossless compressed image format, one of the standard image
formats for the web—can be inserted as graphics by pdflatex.
- PDF, portable document format—can be inserted as graphics by
pdflatex.
Here is a test on the bitmap image test.png (this
is figure 8 from
arxiv:nlin.CD/0501023).
This image is a 2021 × 1956 scan at 300 dpi of a paper copy of the
figure. The following table shows the size of the image file in various
formats.
| format | size (kbytes) |
| PBM (raw) | 484 |
| PNG (-compress 9) | 27 |
| PDF | 14 |
| PS (pnmtops) | 984 |
| PS (pnmtops -rle) | 139 |
| PS (imgtops -2, v 1.0) | 148 |
| PS (imgtops -3, v 1.0) | 64 |
| PS (imgtops -2, v 1.0a) | 92 |
| PS (imgtops -3, v 1.0a) | 40 |
| PS (tiff2ps) | 18 |
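The raw PBM figure can be checked by hand, since raw PBM stores one bit per pixel with each row padded to a whole number of bytes. The sketch below assumes a minimal header (the magic number P4, the dimensions, single separators, no comment lines) and assumes that a kbyte in the table means 1024 bytes rounded up; the function name pbm_raw_bytes is made up for this illustration.

```shell
#!/bin/sh
# Size in bytes of a raw PBM file of the given width and height,
# assuming a minimal "P4\nWIDTH HEIGHT\n" header with no comments.
pbm_raw_bytes() {
    width=$1
    height=$2
    # One bit per pixel; each row is padded to a whole number of bytes.
    row_bytes=$(( (width + 7) / 8 ))
    echo $(( $(printf 'P4\n%d %d\n' "$width" "$height" | wc -c) + row_bytes * height ))
}

bytes=$(pbm_raw_bytes 2021 1956)
echo "$bytes bytes, $(( (bytes + 1023) / 1024 )) kbytes"
```

For the 2021 × 1956 test image this prints 494881 bytes, 484 kbytes, in agreement with the first row of the table.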
The PNG file is produced with
res=300
resm=`expr '(' 10000 '*' $res + 127 ')' / 254` # dots/meter
pnmtopng -compress 9 -size "$resm $resm 1" test.pbm > test.png
The PDF file is produced with
res=300
pnmtops -dpi $res -equalpixels -noturn -rle -nosetpage test.pbm |
epstopdf --filter > test.pdf
Surprisingly, PDF offers much better compression than PNG for bitmaps.
Thus PDF is the preferred format for inclusion of bitmap graphics using
pdflatex.
In order to include graphics with latex, it is necessary to convert the
image to PostScript and here we have several options.
- pnmtops is part of the netpbm package. -rle turns on run length
encoding.
- imgtops is a program written by Doug Zongker and available at
http://imgtops.sourceforge.net/.
Version 1.0 has a “bug” where it treats a bitmap as an 8-bit
image. This is fixed in the CVS repository (changes dated 2005-01-01),
which I call version 1.0a. -3 emits Level 3 PostScript, which will
not print correctly on Level 2 PostScript printers, so it
is not a suitable format for distributing graphics.
- tiff2ps is part of the LibTIFF package.
The clear winner here is tiff2ps. This uses the same compression as
PDF, with the ratio of the resulting file sizes being 4:5 due to
encoding the binary image in ASCII85. The file prints OK on a Level 2
PostScript printer. (However, on some models printing may be slow.
Printing on an HP LaserJet 2100M is about 7 times slower than printing
the output of pnmtops. Printing on an HP LaserJet 4m Plus, on the other
hand, is about 30% faster than printing the output of pnmtops.) I found it
convenient to write a script, pbmtops, to
convert a PBM file to PostScript via tiff2ps. I include a comment in
pbmtops indicating how to convert the PostScript file so that black and
white are replaced by some other colors—I'm not sure how useful
this will be in practice. pbmtops calls
dscframe, which puts DSC comments around the
image so that other software can skip over the image quickly. It also
means that it's OK for an image line to begin with %%.
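The 4:5 ratio quoted above for the tiff2ps output follows from ASCII85 writing every 4 bytes of binary data as 5 printable characters. As a rough sketch (it ignores line breaks, the end-of-data marker, and the PostScript wrapper around the image, and the function name ascii85_size is made up for this illustration):

```shell
#!/bin/sh
# Approximate size of binary data after ASCII85 encoding: every 4 input
# bytes become 5 output characters, rounded up to a whole number.
ascii85_size() {
    echo $(( ($1 * 5 + 3) / 4 ))
}

ascii85_size 14   # size of the PDF file in kbytes, from the table
```

This prints 18, which matches the size of the tiff2ps output in the table.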
Recommendations
- Scan images as bitmaps at 300 dpi.
- For more flexibility, scan in gray scale and then convert to a bitmap
with pgmtopbm -threshold.
- Convert the bitmap to PDF format for inclusion of graphics with pdflatex.
- Convert the bitmap to PS format (using the pbmtops script given above) for
inclusion of graphics with latex and for submission of graphics to
arxiv.org.
Charles Karney (2005-06-27)