[personal profile] lmemsm
I was at a PHP meetup where the group was discussing Docker. Members and the presenter knew that certain Linux distributions have a smaller footprint for use with Docker. I was surprised to find out they really didn't know why that was. One of the key factors is the C runtime library. Basic C runtime libraries just cover the functions and data structures that are part of the ISO C standard. Many C runtime libraries also add functions and data structures that are part of the POSIX standard as documented by the Open Group. Some C runtime libraries are rather bloated and provide a wide variety of functions (even beyond those documented by the ISO C and POSIX standards). Others provide a bare minimum. Some, especially those targeting embedded systems, are designed for efficiency. Others are designed for functionality. Some provide no Unicode support (locale 'C' only). Some, like musl, concentrate on UTF-8 support. Some try to support a large variety of character sets and internationalization features. All these factors can affect code size and efficiency when compiling programs.
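To make the locale point concrete, here is a minimal sketch in standard C that asks the runtime library how many bytes one character may occupy in a given locale. The function name locale_max_bytes is made up for this illustration. Which locale names a library accepts (for example "C.UTF-8") varies from one library and system to another, so treat any name other than "C" as an assumption.

```c
#include <locale.h>
#include <stdlib.h>

/* Returns MB_CUR_MAX (the most bytes a single character can take) for the
   named locale, or 0 if the C library does not provide that locale.
   Simplification: the locale is reset to "C" afterwards rather than to
   whatever was active before the call. */
size_t locale_max_bytes(const char *name)
{
    size_t n;

    if (setlocale(LC_CTYPE, name) == NULL)
        return 0;                 /* this library has no such locale */
    n = (size_t)MB_CUR_MAX;
    setlocale(LC_CTYPE, "C");
    return n;
}
```

Calling locale_max_bytes("C") gives 1 on a byte-oriented "C" locale. A result greater than 1 for a UTF-8 locale name means the library ships multibyte support, and that support is part of what adds to its size.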

Alpine Linux previously used the uclibc runtime library and now uses the musl library. Most major Linux distributions use glibc. It was a big step, but a positive one when Debian (and Ubuntu) made the switch to eglibc. The choice of C runtime library can make a huge difference for the operating system.

For Windows developers, if you're using MinGW, the GNU compiler on Windows, you're using the Microsoft runtime libraries. The original version of MinGW used crtdll.dll but later versions use msvcrt.dll. Windows systems typically have one of these runtime libraries already installed. So while you can't distribute Microsoft's libraries, you really don't have to. At least one is already on any particular Windows system. It gets even more complicated because different versions of Windows have different versions of msvcrt (such as msvcr70.dll, msvcr80.dll). MinGW uses a subset of the runtime functions based on Visual Studio 6.0. There are ways to access runtime functions from later versions of Visual Studio, but the application becomes less portable to various versions of Windows. Cygwin, which also works on Windows, avoids the problem of dealing with multiple Windows runtime libraries and DLLs. It also provides better POSIX compatibility by using a runtime library based on newlib.

There are a wide variety of runtime libraries designed for embedded systems. They're typically more compact and less bloated than a library like glibc and many are easier to port to various platforms. Of the runtime libraries available for embedded systems that are highly portable, I thought the newlib design was interesting. newlib is currently maintained by Red Hat. There are a limited number of syscalls (typically functionality provided by the kernel) that one needs to provide code for to get the library to work. Newlib uses a combination of public domain and BSD style licenses. Cygwin uses a runtime library derived from newlib, but adds several POSIX functions. Cygwin also uses a GNU GPL license. That means whenever you distribute programs linked with their runtime library you must also distribute the source code in order to be compliant with the GPL license.
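As a rough sketch of what "providing code for the syscalls" looks like, here are two stubs in the spirit of a newlib port: one for output and one for growing the heap. Everything here is an assumption made for illustration. The names port_write, port_sbrk and board_putc are hypothetical, the console is just a memory buffer, and a real port would export the functions under whatever names and signatures its newlib syscall layer expects.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical output device: a real port would push bytes to a UART or
   debug channel instead of a memory buffer. */
static char console[256];
static size_t console_len;

static void board_putc(char c)
{
    if (console_len < sizeof console)
        console[console_len++] = c;
}

/* Output stub: the library ends up here underneath printf, puts, fwrite. */
int port_write(int fd, const char *buf, int len)
{
    int i;

    (void)fd;   /* this sketch sends every descriptor to the same console */
    for (i = 0; i < len; i++)
        board_putc(buf[i]);
    return len;
}

/* Heap stub: malloc asks for more memory through a call like this one.
   Simplification: a fixed 4 KB arena, and shrinking is not supported. */
static char heap[4096];
static size_t heap_used;

void *port_sbrk(ptrdiff_t incr)
{
    char *prev = heap + heap_used;

    if (incr < 0 || (size_t)incr > sizeof heap - heap_used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    heap_used += (size_t)incr;
    return prev;   /* start of the newly granted region */
}
```

The point of the design is that the library itself stays platform neutral. Only a small set of functions like these has to be rewritten for each new system.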

As mentioned, the standard C library on most Linux distributions is glibc, which was developed by the Free Software Foundation. The glibc project was slow to take patches and is a rather large library, so some distributions switched to eglibc, which is binary compatible with glibc. The binary compatibility makes it easy to switch between the two. Some Linux distributions wanted to avoid bloated C libraries and work well on low resource or older systems. They chose uclibc, which was designed for embedded systems. When I worked with a uclibc-based distribution, I found myself making several patches to Open Source programs just to get them to compile. Many Linux distributions such as Alpine Linux are now using musl. It's designed for efficiency and standards compliance, including full POSIX standards compliance.

musl was designed to work with a Linux kernel. So unlike choices such as newlib, it is not easy to port to other operating systems. The midipix project is endeavoring to create a POSIX compatible layer for Windows, so that musl will work out of the box on that system. Musl uses an MIT license. The midipix project will use a more restrictive license similar to Cygwin. Many basic embedded system libraries tend to use less restrictive licenses like MIT and BSD so that they will be adopted by companies. However, when a project or company adapts a C library to a particular system and adds a lot of functionality, they typically tend to use GNU GPL or other more restrictive licenses. Many projects dual license in hopes of selling companies a commercial license with fewer restrictions.

Google developed the Bionic C library for its Android operating system. It has a BSD license. It's also designed to work on low resource systems so it tends to offer less functionality than other C libraries such as glibc. It has partial POSIX support.
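Several of the libraries mentioned so far can be told apart at compile time through macros their headers or compilers predefine. The sketch below is a best-effort guess, not an official detection method, and the function name libc_name is made up for this illustration. uClibc is checked before glibc because it also defines __GLIBC__ for compatibility, and musl deliberately defines no identifying macro, so it lands in the fallback branch.

```c
#include <stdlib.h>   /* any standard header pulls in the library's feature macros */

/* Best-effort guess at which C runtime library this file is compiled against. */
const char *libc_name(void)
{
#if defined(__UCLIBC__)
    return "uClibc";                      /* must come before the glibc test */
#elif defined(__GLIBC__)
    return "glibc (or eglibc)";
#elif defined(__BIONIC__)
    return "Bionic";
#elif defined(__CYGWIN__)
    return "Cygwin (newlib-derived)";
#elif defined(__NEWLIB__) || defined(_NEWLIB_VERSION)
    return "newlib";
#elif defined(__MINGW32__) || defined(_MSC_VER)
    return "Microsoft runtime";
#else
    return "unknown (musl defines no identifying macro)";
#endif
}
```

Printing libc_name() from a small test program is a quick way to confirm what a Docker base image or cross-compiler is really linking against.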

I've been hitting many limitations with the MinGW runtime libraries when it comes to porting. Alternatives such as Cygwin's or midipix's runtime libraries would overcome the issues. However, the licenses are much more restrictive. I know I definitely wanted more POSIX compatibility on Windows than MinGW offers out of the box. As a cross-platform programmer, it would be nice to take whatever C runtime library I end up with on Windows and reuse it on Linux and other systems. I looked at C runtime libraries for embedded systems, which typically port well. Of those, newlib seemed the most interesting because, according to some of the documentation, it only requires 17 syscalls to port it to an operating system. Some Windows CE compiler ports use newlib with added functionality for the Windows CE operating system to derive a working C/C++ compiler. When I investigated the newlib code, I did not particularly like the design, especially the way it handled threading by providing standard and threaded versions of functions. The implementation of file I/O looked like it had been modified several times and at this point could use a major refactoring. When I read some of the comments in that section of the code, I felt very uncomfortable using the library. I wanted a simple, clean, basic design that I could add to.

Another option I looked at was PDCLib. I love the idea of a public domain library. MinGW originally licensed their WIN32 and runtime code (which integrates with Microsoft's code) as public domain. PDCLib was based on an earlier Public Domain library project originally at Sourceforge. I tried PDCLib on Windows (which the developer says he uses it with), but I was unable to get file I/O to work properly. I contacted the developer to see if he needed help with the project, but he really didn't seem to need any assistance at the time. PDCLib only supports standard C functions. It does not provide any POSIX functionality. So, it would have limited usage as is for running most Open Source programs.

A number of original operating systems use their own C runtime libraries. I figure, if they can reinvent the wheel and create a C runtime library for their particular purposes, so can I.

I've been coding functions that are typically part of the C runtime library in order to provide better porting support for Windows. I wrote a C11 compatible thread library, several BSD and POSIX string functions, some POSIX file functions, etc. That left me with the dilemma of how best to integrate the additions with the runtime library. They really should be part of the library and part of the C standard headers. I currently have them implemented as supplemental libraries that have to be added separately. It's easier to test and integrate on various operating systems that way. Ideally, it would be nice to have everything accessible as one library though.
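For a sense of what such a supplemental function looks like, here is a minimal sketch of the BSD strlcpy routine in portable C. It is not the code from my libraries. The name compat_strlcpy is chosen for this illustration so it cannot clash with a C library that already ships strlcpy.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst, writing at most size bytes including the
   terminating NUL. Returns the full length of src, so a result that is
   greater than or equal to size tells the caller the copy was truncated. */
size_t compat_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size != 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';            /* always terminated when size > 0 */
    }
    return len;
}
```

Keeping functions like this in a separate library under a distinct prefix is what makes them easy to test on several systems. Folding them into the runtime library proper would mean exposing them under their standard names and in the standard headers.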

I'm very familiar with the various methods of connecting to the kernel in Windows. Some projects such as midipix just use ntdll.dll. Other projects connect to other Microsoft DLLs such as kernel32.dll. One can use LoadLibrary and GetProcAddress to connect to a DLL or, if the library is already linked in, one can skip LoadLibrary. A few Open Source projects I've seen actually implement the LoadLibrary functionality from scratch. I'm not as familiar with the techniques to connect to the Linux, BSD and other kernels. Linux uses techniques such as vdso, vsyscall and syscall to call kernel functions. I would love to find clearer documentation on this subject. If you run across any good materials, please let me know ( http://www.distasis.com/connect.htm ).
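The two approaches can be put side by side in a small sketch that fetches the process ID without going through the usual C library wrapper. The function name kernel_pid is made up for this illustration. On Windows it uses GetModuleHandleA (my addition, one way to skip LoadLibrary when the DLL is already mapped) and falls back to LoadLibraryA, then looks up GetCurrentProcessId with GetProcAddress. On Linux it uses the generic syscall() entry point with SYS_getpid. Other systems simply call getpid() in this sketch.

```c
#define _GNU_SOURCE   /* asks glibc to declare syscall(); must precede all includes */

#ifdef _WIN32
#include <windows.h>

typedef DWORD (WINAPI *get_pid_fn)(void);

long kernel_pid(void)
{
    /* kernel32.dll is normally already mapped, so try the cheap lookup first. */
    HMODULE k32 = GetModuleHandleA("kernel32.dll");
    get_pid_fn fn;

    if (k32 == NULL)
        k32 = LoadLibraryA("kernel32.dll");
    if (k32 == NULL)
        return -1;
    fn = (get_pid_fn)GetProcAddress(k32, "GetCurrentProcessId");
    return fn ? (long)fn() : -1;
}

#elif defined(__linux__)
#include <unistd.h>
#include <sys/syscall.h>

long kernel_pid(void)
{
    /* Ask the kernel directly by system call number. */
    return syscall(SYS_getpid);
}

#else
#include <unistd.h>

long kernel_pid(void)
{
    return (long)getpid();   /* plain library call on other systems */
}
#endif
```

A runtime library has to make this kind of choice for every service it needs from the operating system, which is a large part of why porting one to a new kernel is so much work.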

I find it fascinating to consider the design trade-offs of various C runtime libraries. With that in mind, here's a list of some of the C runtime library options:

eglibc:
http://www.eglibc.org/home
uclibc:
https://www.uclibc.org/
musl:
https://www.musl-libc.org/

Linux from Scratch build instructions (including musl):
http://clfs.org/view/clfs-embedded/arm/
midipix:
http://midipix.org/
BSD regular expression library:
(Musl regular expression support was forked from this library.)
https://github.com/laurikari/tre

PDCLib:
http://pdclib.e43.eu/
Original Public Domain C library:
https://sourceforge.net/projects/pdclib/
libTom Public Domain libraries for math and cryptographic functions:
(Some of musl's functions were forked from these.)
http://www.libtom.net/

Bionic:
https://github.com/android/platform_bionic

Newlib:
https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=tree
libgw32c:
http://gnuwin32.sourceforge.net/packages/libgw32c.htm

Windows CE cross-compiler:
http://cegcc.sourceforge.net/
https://sourceforge.net/p/cegcc/code/HEAD/tree/

ELKS:
https://github.com/jbruchon/elks

lib43:
https://github.com/lunixbochs/lib43
