Was at a PHP meetup where the group was discussing Docker. Members and the presenter knew certain Linux distributions had a smaller footprint for use with Docker. I was surprised to find out they really didn't know why that was. One of the key factors is the C runtime library. Basic C runtime libraries just cover the functions and data structures that are part of the ISO C standard. Many C runtime libraries also add functions and data structures that are part of the POSIX standard as documented by the Open Group. Some C runtime libraries are rather bloated and provide a wide variety of functions (even beyond those documented by the ISO C and POSIX standards). Others provide a bare minimum. Some, especially those targeting embedded systems are designed for efficiency. Others are designed for functionality. Some provide no Unicode support (locale 'C' only). Some like musl, concentrate on UTF-8 support. Some try to support a large variety of characters sets and internationalization features. All these factors can affect code size and efficiency when compiling programs.

Alpine Linux previously used the uclibc runtime library and now uses the musl library. Most major Linux distributions use glibc. It was a big step, but a positive one when Debian (and Ubuntu) made the switch to eglibc. The choice of C runtime library can make a huge difference for the operating system.

For Windows developers, if you're using MinGW, the GNU compiler on Windows, you're using the Microsoft runtime libraries. The original version of MinGW used crtdll.dll but later versions use msvcrt.dll. Windows systems typically have one of these runtime libraries already installed. So while, you can't distribute Microsoft's libraries, you really don't have to. At least one is already on any particular Windows system. It gets even more complicated because different versions of Windows have different versions of msvcrt (such as msvcr70.dll, msvcr80.dll). MinGW uses a subset of the runtime functions based on Visual Studio 6.0. There are ways to access runtime functions from later versions of Visual Studio, but the application becomes less portable to various versions of Windows. Cygwin which also works on Windows, avoids the problem of dealing with multiple Windows runtime libraries and dlls and provides better POSIX compatibility for their runtime library by using a runtime library based on newlib.

There are a wide variety of runtime libraries designed for embedded systems. They're typical more compact and less bloated than a library like glibc and many are easier to port to various platforms. Of the runtime libraries available for embedded systems that are highly portable, I thought the newlib design was interesting. newlib is currently maintained by Red Hat. There are a limited number of syscalls (typically functionality provided by the kernel) that one needs to provide code for to get the library to work. Newlib uses a combination of public domain and BSD style licenses. Cygwin uses a runtime library derived from newlib, but adds several POSIX functions. It also uses a GNU GPL license. That means whenever you distribute programs linked with their runtime library you must also distribute the source code in order to be compliant with the GPL license.

As mentioned, the standard C library on most Linux distributions is glibc which was developed by the Free Software Foundation. The glibc project was slow to take patches and is a rather large library, so some distributions switched to eglibc which is binary compatible with glibc. The binary compatibility makes it easy to switch between the two. Some Linux distributions wanted to avoid bloated C libraries and work well on low resource or older systems. They chose uclibc which was designed for embedded systems. When I worked with a uclibc based distribution, I found myself making several patches to Open Source programs just to get them to compile. Many Linux distributions such as Alpine Linux are now using musl. It's designed for efficiency and standards compliance including full POSIX standards compliance.

musl was designed to work with a Linux kernel. So unlike choices such as newlib, it is not easy to port to other operating systems. The midipix project is endeavoring to create a POSIX compatible layer for Windows, so that musl will work out of the box on that system. Musl uses a MIT license. The midipix project will use a more restrictive license similar to Cygwin. Many basic embedded system libraries tend to use less restrictive licenses like MIT and BSD so that they will be adopted by companies. However, project or company adapts a C library to a particular system and adds a lot of functionality, they typically tend to use GNU GPL or other more restrictive licenses. Many projects dual license in hopes of selling companies a commercial license with less restrictions.

Google developed the Bionic C library for it's Android operating systems. It has a BSD license. It's also designed to work on low resource systems so it tends to offer less functionality than other C libraries such as glibc. It has partial POSIX support.

I've been hitting many limitations with the MinGW runtime libraries when it comes to porting. Alternatives such as Cygwin's or midipix's runtime libraries would overcome the issues. However, the licenses are much more restrictive. I know I definitely wanted more POSIX compatibility on Windows than MinGW offers out of the box. As a cross-platform programmer, it would be nice to take whatever C runtime library I end up with on Windows and reuse it on Linux and other systems. I looked at C runtime libraries for embedded systems which typically port well. Of those, newlib seemed the most interesting, because according to some of the documentation it only requires 17 syscalls to port it to an operating system. Some Windows CE compiler ports use newlib with added functionality for the Windows CE operating system to derive a working C/C++ compiler. When I investigated the newlib code, I did not particularly like the design, especially the way it handled threading by providing standard and threaded versions of functions. The implementation of file I/O looked like it had been modified several times and at this point could use a major refactoring. When I read some of the comments in that section of the code, I felt very uncomfortable using the library. I wanted a simple, clean, basic design that I can add to.

Another option I looked at was PDClib. I love the idea of a public domain library. MinGW original licensed their WIN32 and runtime code (which integrates with Microsoft's code) as public domain.
PDCLib was based on an earlier Public Domain library project originally at Sourceforge. I tried PDClib on Windows (which the developer says he uses it with), but I was unable to get file I/O to work properly. I contacted the developer to see if he needed with the project, but he really didn't seem to need any assistance at the time. PDClib only supports standard C functions. It does not provide any POSIX functionality. So, it would have limited usage as is for running most Open Source programs.

A number of original operating systems use their own C runtime libraries. I figure, if they can reinvent the wheel and create a C runtime library for their particular purposes, so can I.

I've been coding functions that are typically part of the C runtime library in order to provide better porting support for Windows. I wrote a C11 compatible thread library, several BSD and POSIX string functions, some POSIX file functions, etc. That left me with the dilemma of how best to integrate the additions with the runtime library. They really should be part of the library and part of the C standard headers. I currently have them implemented as supplemental libraries that have to be added separately. It's easier to test and integrate on various operating systems that way. Ideally, it would be nice to have everything accessible as one library though.

I'm very familiar in the various methods of connecting to the kernel in Windows. Some projects such as midipix just use ntdll.dll. Other projects connect to other Microsoft dlls such as kernel32.dll. One can use LoadLibrary, GetProcAddress to connect t a dll or if the library is already linked in, one can skip LoadLibrary. A few Open Source projects I've seen actually implement the LoadLibrary functionality from scratch. I'm not as familiar with the techniques to connect to the Linux, BSD and other kernels and would love to find more clear documentation on this subject. If you run across any good materials, please let me know ( http://www.distasis.com/connect.htm ). Linux uses techniques such as vdso, vsyscall, syscall to call kernel functions.

I find it fascinating to consider the design trade-offs of various C runtime libraries. With that in mind, here's a list of some of the C runtime library options:

eglibc:
http://www.eglibc.org/home
uclibc:
https://www.uclibc.org/
musl:
https://www.musl-libc.org/

Linux from Scratch build instructions (including musl)
http://clfs.org/view/clfs-embedded/arm/
midipix
http://midipix.org/
BSD regular expression library
(Musl regular expression support was forked from this library.)
https://github.com/laurikari/tre

PDCLib:
http://pdclib.e43.eu/
Original Public Domain C library:
https://sourceforge.net/projects/pdclib/
libTom Public Domain libraries for math and cryptographics functions
(Some of musl's functions were forked from these.)
http://www.libtom.net/

Bionic:
https://github.com/android/platform_bionic

Newlib:
https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=tree
libgw32c:
http://gnuwin32.sourceforge.net/packages/libgw32c.htm

Windows CE cross-compiler:
http://cegcc.sourceforge.net/
https://sourceforge.net/p/cegcc/code/HEAD/tree/

ELKS
https://github.com/jbruchon/elks

lib43:
https://github.com/lunixbochs/lib43
I've listed some core utilities options besides GNU. I thought I'd share something about what utilities I personally prefer to use. My main requirement in a good set of core utilities is portability. This is rather hard to find. You would think that if a utility was efficient and lightweight, it would be easy to port. However, that's not necessarily so. Many utilities that are designed for efficiency take advantage of features of a particular operating system which makes them harder to port.

At first, I considered starting with sbase which had stated goals similar to what I was looking for, but it didn't have enough features to effectively replace the GNU core utilities when developing and building programs. While newer versions of sbase have added a lot of functionality, they've become much less portable.

My favorite source for inspiration is Minix. Earlier versions provided some interesting and fairly portable versions of a variety of utilites:
https://www.minix-vmd.org/cgi-bin/raw/source/std/1.7.5/src/commands/simple/
Some of the utilities don't have sufficient UTF-8 support or lack some newer functionality found in GNU utilities that makes them fail when attempting to build applications. However, they make a useful starting point.

In some cases, the OBase or BSD utilities do a better job than the older Minix ones and still do that job efficiently. I particularly like the version of patch found on BSD systems. It's an earlier variant of the Free Software Foundation's patch program. Unlike the FSF's version of patch which uses the GNU license, it uses a BSD style license.

For some utilities, I've consulted the POSIX standards ( http://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html ) and rewritten them from scratch.

Rather than trying to port utilities such as the Free Software Foundations coreutils, I thought having a lightweight, efficient, highly portable option would be a useful alternative. Many of the FSF developers have little interest in portability making it hard to get later versions of their programs working on non-POSIX systems. I was surprised at how little interest most users have in developing portable alternatives to the core utilities that could be used to build software. Not only was their little interest, some people posted extremely negative comments when anyone suggested creating alternatives to the FSF software. I was also surprised by some of the negative reactions I read about wonderful projects like SBase.

I have a growing collection of public domain, BSD and MIT licensed alternatives to the GNU core utilities. For now, I just use them for my own projects. If you have an interest in portable utilities and tools, would like to see a viable portable alternative to the FSF's GNU coreutils or would like to further discuss related topics in a positive light, feel free to contact me. I'd enjoy talking with other developers and utility users on the topic.

You'll find some added information on my utilities and information on how to discuss the topic further at:
http://www.distasis.com/cpp/lmbld.htm
Most systems (other than BSD based ones) use GNU's core utilities. It's used by most Linux distributions. Cygwin, MinGW and gnuwin32 run ported versions of the GNU applications as well. Even Microsoft's SFU/SUA included some of the GNU utilities. However, the GNU core utilities are typically more bloated and have more feature creep than other versions of standard Unix utility programs. BSD systems have their versions of core utilities. The latest version of Minix has adopted the BSD utilities. They tend to be less bloated than the GNU versions, but are still more bloated than other options out there. The BSD utilities also tend toward adding new features similar to but not to the same extent as the GNU utilities. Also, some of their utilities aren't as well optimized as the GNU versions. Busybox seems like the most viable option for a lightweight but still comprehensive version of core utilities. I'm currently using it on my Debian system instead of the GNU core utilities. Toybox is a similar alternative to Busybox. It has a better license option than Busybox, but it's lacking some features and tools that Busybox has.

Here are some links to core utility collections:

Earlier Minix alternatives
http://www.minix-vmd.org/cgi-bin/raw/source/std/1.7.5/src/commands/simple/
Earlier versions of Minix put together an interesting collection of lightweight utilities from various sources.

Busybox
https://busybox.net/
Windows ports of Busybox:
https://frippery.org/busybox/
https://github.com/pclouds/busybox-w32

Toybox
http://landley.net/toybox/

Heirloom Project
http://heirloom.sourceforge.net/
Based on traditional implementations of standard Unix utilities. Not very portable to non-POSIX systems. Not as bloated as GNU or BSD core utilities.

OBase
https://github.com/chneukirchen/obase
Port of OpenBSD userland to Linux.

SBase
http://git.suckless.org/sbase
This started out as a discussion on one of the suckless.org mailing lists of how to write efficient core utilites that weren't all part of one executable like Busybox or Toybox. Some good examples were posted and the project was started. Then project development was quiet for a while. The project became active again and one of the main goals besides efficiency was UTF-8/internationalization support. Looks like they've borrowed some UTF-8 support concepts (such as Runes) from Plan 9. It's not designed to be portable to non-POSIX systems. However, it does look like they've covered replacing most of the basic core utilities with lightweight, efficient versions.

Other alternatives:
https://github.com/jbruchon/elks/tree/master/elkscmd
https://github.com/EtchedPixels/FUZIX/tree/master/Applications
https://github.com/rofl0r/hardcore-utils
http://git.musl-libc.org/cgit/noxcuse/tree/
https://github.com/pikhq/pikhq-coreutils
https://github.com/DracoProject/dcore
http://www.fefe.de/embutils/
https://github.com/dimkr/lazy-utils
https://github.com/arsv/minitools
http://git.suckless.org/ubase
https://github.com/minoca/swiss

July 2017

S M T W T F S
      1
234 5678
9101112131415
16171819202122
23242526272829
3031     

Syndicate

RSS Atom

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jul. 23rd, 2017 10:48 pm
Powered by Dreamwidth Studios