lmemsm wrote 2020-07-08 03:23 pm

Machine Learning and Deep Learning Resources for C/C++

Machine learning and deep learning have become very popular. Unfortunately for C/C++ developers, most of the tools for these fields are written in other languages, even though many of the core libraries are still written in C/C++. This list tracks the FLOSS libraries and resources in the field that C/C++ developers can work with.

The list is by no means comprehensive and really doesn't offer that many options. So, if you have other suggestions for machine learning C/C++ source code, please let me know.

One thing desperately needed by some projects is a decent FLOSS speech-to-text recognizer. Some suggestions are included below, but other options would be very useful, especially compact ones that work on mobile devices as well as standard desktop computers. I've also included a list of FLOSS text-to-speech programs.


C resources:

https://pjreddie.com/darknet/
Darknet is an open source neural network framework written in C and CUDA.

https://github.com/yui0/catseye
Neural network library.

https://github.com/liuliu/ccv
http://libccv.org
C-based/Cached/Core Computer Vision Library

https://github.com/alrevuelta/cONNXr
An ONNX runtime written in pure C99 with zero dependencies, focused on embedded devices.

https://github.com/100/Cranium
Portable, header-only, artificial neural network library written in C99.

https://github.com/jeffheaton/encog-c
Encog machine learning framework port to C/C++ for experimentation with CUDA.

http://leenissen.dk/fann/wp/
FANN, Fast Artificial Neural Network Library, is a free open source neural network library which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks.

https://github.com/codeplea/genann
Genann is a minimal, well-tested library for training and using feed-forward artificial neural networks (ANN) in C.

https://github.com/attractivechaos/kann
KANN is a standalone and lightweight library in C for constructing and training small to medium artificial neural networks such as multi-layer perceptrons, convolutional neural networks and recurrent neural networks (including LSTM and GRU).

https://github.com/jppbsi/LibDEEP
LibDEEP is a deep learning library developed in C language for the development of AI techniques.

https://github.com/karpathy/llama2.c
Llama 2 inference in one file of pure C.

https://github.com/fomichev/llm.c
Large language model in C. GPT-2 inference implementation in pure C.

https://github.com/GHamrouni/LocusCode
LocusCode allows you to perform similarity search on web scale datasets using C.

https://github.com/siavashserver/neonrvm
neonrvm is an Open Source machine learning library written in C for performing regression tasks using RVM technique.

https://github.com/GHamrouni/Recommender
A C library for product recommendations/suggestions using collaborative filtering (CF).

https://github.com/xiph/rnnoise
Recurrent neural network for audio noise reduction and suppression written in C.

https://github.com/LuisWohlers/simpleCnet
SimpleCNet is a simple single-header library for neural networks written in C (C89). Training is accomplished using backpropagation.

https://github.com/symisc/sod
An Embedded Computer Vision & Machine Learning Library that is CPU Optimized & IoT Capable.

https://github.com/glouw/tinn
Tinn (Tiny Neural Network) is a 200-line, dependency-free neural network library written in C99.

https://github.com/Imetomi/TinY-ANN
TinY ANN is a simple library to create neural networks in C for smaller data science projects.

https://www.vlfeat.org/
The VLFeat open source library implements popular computer vision algorithms specializing in image understanding and local feature extraction and matching. Algorithms include Fisher Vector, VLAD, SIFT, MSER, k-means, hierarchical k-means, agglomerative information bottleneck, SLIC superpixels, quick shift superpixels, large scale SVM training and many others. It is written in C for efficiency and compatibility, with interfaces in MATLAB for ease of use.

https://igraph.org/c/
Igraph is a C library for creating, manipulating and analysing graphs. It is intended to be as powerful and fast as possible to enable working with large graphs.

http://htk.eng.cam.ac.uk/
Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.
HTK consists of a set of library modules and tools available in C source form. The source is available, but the license is not listed at OSI.


C++ resources:

https://opencv.org/
OpenCV is a C++ library of programming functions mainly aimed at real-time computer vision.

https://aogmaneo.handmade.network/
AOgmaNeo is a machine learning system in C++.

https://github.com/iVishalr/cDNN
cDNN is a deep learning library written in C which provides functions that can be used to create artificial neural networks (ANN).

https://github.com/dmlc/cxxnet
CXXNET is a fast, concise, distributed deep learning framework in C++.

https://code.google.com/archive/p/cuda-convnet/
High-performance C++/CUDA implementation of convolutional neural networks.

https://github.com/jolibrain/deepdetect
DeepDetect is a machine learning API and server written in C++11.

http://dlib.net/
Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems.

http://eblearn.sourceforge.net/
Eblearn is an object-oriented C++ library that implements various machine learning models.

https://github.com/FidoProject/Fido
Fido is a lightweight, highly modular C++ machine learning library for embedded electronics and robotics.

https://github.com/nickgillian/grt
The Gesture Recognition Toolkit (GRT) is a cross-platform, Open Source, C++ machine learning library designed for real-time gesture recognition.

https://github.com/tboox/hnr
hnr is an off-line handwritten numeral recognition system written in C++.

https://github.com/mosdeo/LKYDeepNN
Low dependency (using C++11 and STL only), portable, header-only, deep neural networks for embedded systems.

https://github.com/ggerganov/llama.cpp
Port of Facebook's LLaMA model in C/C++.

https://www.mlpack.org/
mlpack is a fast, flexible machine learning library written in C++ that aims to provide extensible implementations of cutting-edge machine learning algorithms.

https://github.com/CMU-Perceptual-Computing-Lab/openpose
OpenPose is a real-time multi-person keypoint detection library in C++ for body, face, hand and foot estimation.

http://image.diku.dk/shark/sphinx_pages/build/html/index.html
SHARK is a fast, modular, feature-rich open-source C++ machine learning library.

https://github.com/Tyill/skynet
Skynet is a neural network library written from scratch in C++ using only STL and OpenBLAS for calculation.

https://github.com/tensor-compiler/taco
Tensor algebra compiler in C++.

https://github.com/LanguageMachines/timbl/
TiMBL implements several memory-based learning algorithms using C++.

https://github.com/tiny-dnn/tiny-dnn
Header only, dependency-free deep learning framework in C++14.

https://github.com/aksnzhy/xlearn
xLearn is a high performance, easy-to-use, and scalable machine learning package written in C++ that contains linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM), all of which can be used to solve large-scale machine learning problems.

https://github.com/google/XNNPACK
XNNPACK is a highly optimized library of floating-point neural network inference operators written in C/C++ for ARM, WebAssembly, and x86 platforms. It provides low-level performance primitives for accelerating high-level machine learning frameworks.


TensorFlow related:

https://github.com/tensorflow/tensorflow
An Open Source Machine Learning Framework in C++ for everyone.

https://github.com/google/mediapipe
MediaPipe is the simplest way for researchers and developers to build world-class ML solutions and applications for mobile, desktop/cloud, web and IoT devices. MediaPipe on the Web is an effort to run the same ML solutions built for mobile and desktop in web browsers as well. Written in C++.

https://github.com/terryky/tflite_gles_app
GPU accelerated deep learning inference applications using TensorFlow Lite GPUDelegate/TensorRT, written in C.

https://github.com/uTensor/uTensor
TinyML AI inference library for C++11.


Open Source Speech-to-Text resources:

https://github.com/cmusphinx/pocketsphinx
PocketSphinx is one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech recognition engines written mainly in C.
https://github.com/cmusphinx/sphinxbase
Basic libraries shared by the CMU Sphinx trainer and all the Sphinx decoders (Sphinx-II, Sphinx-III, and PocketSphinx) written in C.
https://sourceforge.net/projects/cmusphinx/files/sphinx2/
CMU Sphinx 2 is a speaker-independent large vocabulary continuous speech recognizer written in C and released under BSD style license.

https://github.com/kaldi-asr/kaldi
Kaldi is a speech recognition toolkit written in C++, intended for use by speech recognition researchers and professionals.

https://github.com/julius-speech/julius
Open Source large vocabulary continuous speech recognition engine written in C.

https://github.com/arthurv/srec
Fork of Android's speech recognition engine in C. From Android 4.1 aka Jelly Bean.

https://github.com/mozilla/DeepSpeech/tree/master/native_client
DeepSpeech from Mozilla (uses TensorFlow). Written in C++.

https://github.com/nyumaya/nyumaya_audio_recognition_lib
C/C++ audio recognition library using TensorFlow.

https://github.com/ggerganov/whisper.cpp
OpenAI's Whisper model in C/C++.


Open Source Text-to-Speech resources:

http://www.festvox.org/flite/index.html
CMU Flite (festival-lite) is a small, fast run-time open source text to speech synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative text to speech synthesis engine to Festival for voices built using the FestVox suite of voice building tools. C only, very portable source. (See also Festival.)

http://espeak.sourceforge.net/
eSpeak is a compact Open Source speech synthesizer for English and other languages written in C.

https://github.com/espeak-ng/espeak-ng
eSpeak NG is a compact Open Source speech synthesizer that supports more than a hundred languages and accents. It is written in C.

https://github.com/MycroftAI/mimic
Mycroft's TTS engine, based on CMU's Flite (Festival Lite) in C.

https://github.com/robotology/speech
Speech synthesis and speech recognition programs including PicoTTS.

https://github.com/gmn/nanotts
NanoTTS is a command line speech synthesizer that improves on pico2wave, part of SVOX PicoTTS.

https://github.com/syoyo/tacotron-tts-cpp
Tacotron text to speech in C++ (synthesize only). Uses TensorFlow.