2016-12-20

2016-12-20 02:58 pm

PDF and Postscript

Some interesting Open Source command line programs output their results in Postscript and/or PDF format. That makes it useful to have lightweight PDF and PostScript viewers that don't require a lot of dependencies to build so you can view their output quickly. It's also nice on Linux systems if they work in framebuffer mode, so you don't have to start an X Windows session just to view results. Cross-platform viewers work on a variety of systems from Linux and Mac to Windows to DOS and even Android. Having tools to edit/modify PDF and PostScript output is also useful.

I've been searching a lot lately for lightweight PostScript and PDF viewer options. Thought I'd share some of what I've learned about cross-platform PostScript and PDF support. If you know of any other viewers or PDF/PostScript related programs with minimal build dependencies that I may have missed, please let me know. I'm always looking for additions for my Open Source applications list: http://www.distasis.com/cpp/osrclist.htm

First, here some of the programs I use that output to PDF or PostScript.

wkhtmltopdf
http://wkhtmltopdf.org/
This does have a lot of build dependencies (needs Qt and webkit), but it's one of the few programs that I've been able to find that does a good job of converting HTML to PDF.

Other HTML to PostScript/PDF alternatives are html2ps ( http://user.it.uu.se/~jan/html2ps.html ) which requires Perl and PhantomJS ( http://phantomjs.org/ ).

abcm2ps
http://moinejf.free.fr/
I use abcm2ps to convert songs in ABC notation to sheet music. Later versions of abcm2ps have the ability to output to SVG as well as PostScript. I have a lightweight SVG viewer based on nanosvg and SDL that displays SVG, so that should be an interesting display option if a PostScript viewer isn't readily available. I can upload the build scripts for it to my archive if anyone wants to try out the SVG viewer.

lcal and pcal
http://pcal.sourceforge.net/
Personal calendars and lunar calendars can also be output to PostScript format.

gle
http://glx.sourceforge.net
The graphics layout engine outputs charts and graphs to PostScript, PDF and some graphics formats.

Other applications can output starcharts, barcodes, graphs, etc. to PostScript format. The applications I listed above are the ones I personally use the most in this category. If you have other favorites, would be interested to hear about them.

Next, I'll mention some PDF libraries and viewers.

The two main PDF rendering libraries/applications are mupdf and poppler.

Poppler
https://poppler.freedesktop.org/
Poppler is a library based on the xpdf viewer, so in most cases you really only need one of these on your system. They both have a lot of the same utility programs. When building poppler, there's additional support for GTK and Qt frameworks if you have those on your system. I typically build with these turned off to lower dependencies.

MuPDF
http://mupdf.com/
Mupdf was written by the developers of Ghostscript. It renders very quickly and is supposed to be very lightweight. However, what I dislike about it is its growing list of dependencies. Newer versions of mupdf provide more and more functionality, but they're requiring more dependencies to do so. Later versions of mupdf now require harfbuzz which I typically don't use with any other applications. (I'm not using Opentype fonts.) Also, the developers recommend you use their third party libraries rather than system libraries. There are work-arounds to use the system libraries, but it's a nuisance to modify the build scripts each time the program is updated. The API is not very stable either. Every time I download a new version, it seems to break third party utilities built using the mupdf library.

Mupdf has example viewers that come with it. The latest example includes an OpenGL based viewer. I did some work on a SDL based viewer using mupdf, but the API kept changing, making it difficult to maintain. When the OpenGL based viewer was added, I looked into using picogl in place of OpenGL/Mesa as a backend for the viewer. Using a viewer that comes with mupdf meant the mupdf developers would be responsible for dealing with the API changes. I got the OpenGL based viewer to build with picogl, but picogl still needs additional functionality to provide clear rendering of the text displayed by the mupdf viewer. Hope to work on this further in the future and would love to hear from anyone else who'd be interested in helping on this project.

There are some SDL and framebuffer based PDF viewers. However, they were either not easy to port to other platforms or had heavy dependencies I didn't want to add to my system. I found it interesting that some viewers used SDL as the actual graphical user interface (and could run in framebuffer mode on Linux), but still required GTK or Qt as an additional interface when rendering pages.

I finally found a lightweight, cross-platform PDF viewer that uses FLTK as a front end. It's called SpRead ( https://github.com/muranoya/SpRead ). It can be run in framebuffer on Linux using FLTK built with nano-x. The viewer is very lightweight and does seem to render quickly. Dependencies are really minimal compared to other viewers I've looked at. However, mupdf renders faster in many cases. SpRead uses poppler for PDF rendering and libarchive for zipped/archived file access.

Some other interesting PDF libraries available for editing and creating PDFs are qpdf ( http://qpdf.sourceforge.net/ ) and libharu ( http://libharu.org/ ).

Poppler and xpdf both come with PDF to text conversion utilities. This is very useful if you want to work with grep to search for a word or phrase in a collection of PDF files. Mupdf has added the ability to convert to text in more recent versions. However, their help/man pages on how to use their tools could use improvement. When I was searching for a lightweight PDF to text converter, I ran across pdftxt at https://litcave.rudi.ir/ As I mentioned, the mupdf library API keeps changing with newer releases, so typically when I update my version of mupdf, the pdftxt build breaks. I've uploaded patches to build pdftxt with the version of mupdf that I've most recently updated to. You can find patches and build scripts at the archive link on this page:
http://www.distasis.com/cpp/lmbld.htm

Since lightweight PDF viewers are more plentiful than PostScript viewers, I searched for PostScript to PDF converters. The main option in this category is Ghostscript. Like mupdf, it requires several dependencies and uses its own versions rather than system versions of libraries. It's actually worse in this area than mupdf. The only viable option to Ghostscript that I could find for PostScript rendering was xpost ( https://github.com/luser-dr00g/xpost ). It has less dependencies that Ghostscript and is easier to build. However, it's still a work in progress. I'm hoping at some point that it will replace Ghostscript for the functionality I'm interested in, but it at this point, it still doesn't have all the functionality I require. It does look like a promising project though.

Those are some of the interesting libraries and applications I've found on my search for PostScript and PDF related utilities. Hope you find some of them useful. If you know of other lightweight, cross-platform alternatives, I'd love to hear from you about them.