Looking for a PDF viewer
Dec. 6th, 2018 10:02 am
I've been investigating Open Source C/C++ PDF viewers/rendering libraries. There really isn't a lot available out there. Most projects use poppler. It was developed as a fork of xpdf and is supposed to be somewhat more efficient than it. Both are licensed under GPL. The only other real competitor is mupdf. There have been some comparisons done on efficiency between mupdf and poppler and mupdf seems to be the clear winner. Just running mupdf versus a poppler based PDF viewer on a very large PDF file being viewed on a slow computer gives a good idea of the differences between the two. Mupdf was originally licensed under a GPL license and is now using AGPL which is even more restrictive. There are also commercial licenses available for poppler and mupdf which companies hope that projects will use if the GPL and AGPL licenses are too limiting. Projects would need to make all of their source code freely available if they want to include either of these libraries and don't want to purchase a commercial license. With AGPL, the source code needs to be available even if the program is never distributed and is only used by someone over a network.
One possible alternative to these options is pdf.js ( https://github.com/mozilla/pdf.js ). The license is Apache 2.0, which doesn't require that you distribute your own source code. pdf.js, as the name implies, is written primarily in JavaScript. So, to run this with a C/C++ program, one could use an embedded web browser that can handle JavaScript. One such alternative that works with SDL, FLTK and other graphics or GUI libraries is the wkccc library (based on webkit) in Netrider ( https://sourceforge.net/projects/netrider/ ). Another alternative would be to convert the JavaScript code to C. This should greatly improve rendering efficiency and speed if done properly. It's an alternative I'm seriously considering should I need a PDF rendering library with a more liberal license at some point.
Since I just wanted a simple, efficient Open Source PDF viewer that I could freely distribute (along with its source code) if needed, mupdf seemed like a good possibility. However, I'd prefer to avoid AGPL licenses. Occasionally I see developers keeping an older version of an Open Source software project alive because they prefer the license used by the older version. When I ran across a GPLv2 version of mupdf at GitHub ( https://github.com/technosaurus/mupdf-GPL2 ), it encouraged me to consider the option of using an older version of mupdf with a more lenient license. I investigated using the GPLv2 version at GitHub. However, it was missing several features of later versions of mupdf and it used jam as a build system. Going back to the last GPLv3 versions of mupdf provided many more features than the GPLv2 version plus a more standard build system. Searching for mupdf GPLv3 on GitHub even yielded a handful of projects that went on to add new features. So, assuming ebook rendering isn't needed (and I already use applications like bard for that), the GPLv3 version of mupdf really has all the features needed for creating a fast, basic, cross-platform PDF viewer. My next step was to investigate what was working in the GPLv3 version, what was missing and what could be easily added using the various GPLv3 forks or other Open Source projects.
After further investigation and some prototyping, I now have a GPLv3 version of mupdf. It already had basic cbz support. However, I really wanted a cbr viewer as well. I have been searching for a decent, open licensed cbr library for a long time and I finally found an Open Source LGPL library that is part of sumatrapdf. I removed much of the cbz code in mupdf and substituted code that uses the unarr library from sumatrapdf for cbz and cbr access. If one can view graphics in cbz or cbr format, one can also view graphics that aren't archived, so I made sure support for that was working properly. I also added support for SVG format using nanoSVG. Viewing SVG files with a later version of the mupdf application and comparing that with my results, my version actually seemed to do a better job using the nanoSVG code for rendering than the mupdf AGPL licensed rendering engine did.
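One practical detail when a single code path handles both cbz and cbr: the file extension is not always trustworthy, since a .cbr file may actually be a ZIP archive and vice versa. Below is a minimal sketch in C of how a viewer could classify an archive from its leading bytes before handing it to an archive library. The names comic_kind and detect_comic_archive are hypothetical and made up for this illustration; they are not part of mupdf or unarr, and this is not the code I actually used. It only covers the common signatures (a ZIP local file header and the RAR marker block), so treat it as a starting point.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch; not part of mupdf or unarr. */
typedef enum {
    COMIC_UNKNOWN = 0,
    COMIC_ZIP,  /* the usual container behind .cbz */
    COMIC_RAR   /* the usual container behind .cbr */
} comic_kind;

/*
 * Classify an archive from its first bytes instead of its extension.
 * buf points at the start of the file, len is how many bytes were read.
 */
comic_kind detect_comic_archive(const unsigned char *buf, size_t len)
{
    /* ZIP local file header signature: "PK\x03\x04". */
    static const unsigned char zip_sig[] = { 'P', 'K', 0x03, 0x04 };
    /* Common prefix of the RAR 4 and RAR 5 markers: "Rar!\x1A\x07". */
    static const unsigned char rar_sig[] = { 'R', 'a', 'r', '!', 0x1A, 0x07 };

    if (buf == NULL)
        return COMIC_UNKNOWN;
    if (len >= sizeof zip_sig && memcmp(buf, zip_sig, sizeof zip_sig) == 0)
        return COMIC_ZIP;
    if (len >= sizeof rar_sig && memcmp(buf, rar_sig, sizeof rar_sig) == 0)
        return COMIC_RAR;
    return COMIC_UNKNOWN;
}
```

To use it, read the first eight or so bytes of the file with fread and pass them in. Anything that comes back as COMIC_UNKNOWN could then be tried as a plain image file, which matches the idea of also viewing graphics that aren't archived.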
I happen to like the Unix philosophy of finding and working with programs that do one thing well. Since unarr is LGPL licensed, I realized I could have a cbz/cbr viewer completely independent of any mupdf code and the whole project could be licensed under LGPL. I have the basics working with unarr and SDL. I just need to create a user interface to handle choosing files and navigating through the pages. However, my original goal was a PDF viewer. It would still require a library such as mupdf whether I split off cbz/cbr viewing functionality or not. That currently leaves me with two choices for PDF viewing. I can use the mupdf GPLv3 library with a mupdf or custom SDL front end or I can attempt to port a library such as pdf.js to C and create an SDL front end for it.
I can't help wondering if I'm the only person looking for a lightweight, efficient, C based Open Source PDF viewer that can be used with a Linux distribution (or on other platforms when needed). There are some interesting lightweight Linux distributions being developed with goals such as working on older computers or making the most of their hardware. There are projects where developers build everything themselves (such as Linux from Scratch). Simple-to-build programs with limited dependencies are often preferable in those cases. There are some alternative free operating systems in the works (like FreeDOS and Haiku). There are also users who like to specialize in finding light, efficient, limited dependency software for their desktop computers. Plus, with cross-platform programs, it would be nice to run the same familiar software on mobile devices as well as on desktops and laptops. Considering all that, there should be other developers and users who might be interested in similar goals. If that's true, I'd love to hear from you and compare ideas on how best to accomplish the goal of creating a simple, lightweight Open Source C PDF viewer. You can contact me via the CppDesign mailing list ( http://groups.yahoo.com/group/cppdesign ) or other methods and share notes on development and building software. It would be nice to combine resources on a project of this nature. I would also be happy to discuss finding/working on/building other useful, lightweight Open Source C/C++ projects.
I haven't fully decided which way to go for a PDF viewer, so I haven't made my source code enhancements to mupdf (GPLv3) available as yet. Not much point taking up web space to share code if no one's using the project at all. If there are others interested in this route, please let me know.