A C++ implementation of a flexible neural network library with forward and backward propagation, featuring multiple activation functions, modern optimizers, and comprehensive unit tests.
Network Architecture:
- ✅ Fully connected (dense) feedforward networks
- ✅ Arbitrary layer sizes and network depths
- ✅ Flexible layer connections (supports arbitrary network topologies)
- ✅ Composable networks (networks as layers)
Activation Functions:
- ✅ Sigmoid - Classic activation, range (0, 1)
- ✅ ReLU - Rectified Linear Unit, a common default for hidden layers
- ✅ Leaky ReLU - Prevents dying ReLU problem
- ✅ Tanh - Hyperbolic tangent, range (-1, 1)
- ✅ Softmax - Multi-class classification output
- ✅ ELU - Exponential Linear Unit
- ✅ SELU - Self-normalizing ELU variant
- ✅ Linear - Identity function for regression
Optimizers:
- ✅ SGD - Stochastic Gradient Descent with gradient clipping
- ✅ Momentum - SGD with momentum (β=0.9 default)
- ✅ Adam - Adaptive moment estimation (β1=0.9, β2=0.999)
Training & Optimization:
- ✅ Backpropagation for error computation
- ✅ Xavier/Glorot weight initialization (sketch below)
- ✅ Gradient clipping for stability (sketch below)
- ✅ Configurable learning rates
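
For reference, Xavier/Glorot initialization and gradient clipping can be sketched as below. These are the standard formulas only; the helper functions are hypothetical and not this library's internals:

```cpp
#include <cmath>
#include <random>
#include <vector>

// Xavier/Glorot uniform initialization: draw weights from U(-limit, +limit)
// with limit = sqrt(6 / (fan_in + fan_out)). Illustrative helper, not library code.
std::vector<double> xavierInit(std::size_t fanIn, std::size_t fanOut) {
    const double limit = std::sqrt(6.0 / static_cast<double>(fanIn + fanOut));
    std::mt19937 gen{std::random_device{}()};
    std::uniform_real_distribution<double> dist(-limit, limit);
    std::vector<double> weights(fanIn * fanOut);
    for (double& w : weights) w = dist(gen);
    return weights;
}

// Gradient clipping by value: bound each gradient component to [-c, +c].
void clipGradients(std::vector<double>& grads, double c) {
    for (double& g : grads) {
        if (g > c) g = c;
        else if (g < -c) g = -c;
    }
}
```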
Infrastructure:
- ✅ Matrix operations optimized with OpenMP and SSE
- ✅ Model serialization (save/load weights to/from JSON)
- ✅ Comprehensive unit test suite
- ✅ CI/CD with GitHub Actions
- ✅ Cross-platform (x86/x64 and ARM/Apple Silicon)
Advanced Network Types:
- ❌ Convolutional layers (CNN)
- ❌ Pooling layers (MaxPool, AvgPool)
- ❌ Recurrent layers (RNN, LSTM, GRU)
- ❌ Attention mechanisms
Training Features:
- ❌ Batch training (currently sample-by-sample only)
- ❌ Mini-batch gradient descent
- ❌ Batch normalization
- ❌ Dropout regularization
- ❌ L1/L2 weight regularization
Loss Functions:
- ❌ Cross-entropy loss (currently using MSE only)
- ❌ Multiple loss function options
Advanced Optimizers:
- ❌ AdaGrad, RMSprop
- ❌ Learning rate scheduling
- ❌ Adaptive learning rate decay
mkdir build && cd build
cmake ..
make -j4

# Build and run unit tests
cd build
make test_network
./test_network
# Run model save/load tests
make test_model_save_load
./test_model_save_load
# Run activation functions and optimizer tests
g++ -std=c++11 -O3 -fopenmp -msse2 -I. \
test_activations_optimizers.cpp Matrix/matrix.cpp thirdparty/jsonxx/jsonxx.cpp \
-o test_activations_optimizers
./test_activations_optimizers
# Or use CTest to run all tests
ctest --output-on-failure

The project includes toy example training tests that demonstrate network learning on simple patterns:
# From project root
./build.sh # Build the project
cd build
make test_training # Compile training tests
./test_training # Run all training tests

The test_training executable runs multiple training scenarios:
- Linear Regression (y = 2x + 1)
  - Architecture: 1-8-1 (1 input, 8 hidden, 1 output)
  - 50,000 epochs with learning rate 0.1
  - Target: MSE < 0.01
  - Performance: ~5-15 seconds (with -O3 optimization)
- XOR Logic Gate (see the sketch after this list)
  - Architecture: 2-4-1
  - 10,000 epochs with learning rate 0.1
  - Target: >90% accuracy
- AND Logic Gate
  - Architecture: 2-2-1
  - 5,000 epochs with learning rate 0.5
  - Target: >90% accuracy
- OR Logic Gate
  - Architecture: 2-2-1
  - 5,000 epochs with learning rate 0.5
  - Target: >90% accuracy
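
For orientation, the XOR scenario might be wired up with the API described later in this README roughly as follows. This is a hedged sketch, not the actual test_training source; layer names and activation choices are illustrative:

```cpp
#include "network.h"
using namespace ml;

// Illustrative sketch of the XOR scenario (2-4-1, 10,000 epochs, learning rate 0.1).
Network<double>* net = new Network<double>();
Layer<double>* in  = new Layer<double>(2, "Input",  ActivationType::LINEAR);
Layer<double>* hid = new Layer<double>(4, "Hidden", ActivationType::RELU);
Layer<double>* out = new Layer<double>(1, "Output", ActivationType::SIGMOID);

net->setInputLayer(in);
net->connect(in, hid);
net->connect(hid, out);
net->setOutputLayer(out);
net->init();

const double xorIn[4][2] = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
const double xorOut[4]   = {0, 1, 1, 0};

for (int epoch = 0; epoch < 10000; ++epoch) {
    for (int s = 0; s < 4; ++s) {            // sample-by-sample training
        Mat<double> x(1, 2, 0);
        x.setAt(0, 0, xorIn[s][0]);
        x.setAt(0, 1, xorIn[s][1]);
        Mat<double> target(1, 1, xorOut[s]);

        Mat<double> y = net->feed(x);
        out->setErrors(Diff(target, y));     // error signal (Diff of target and prediction)
        net->backprop();
        net->updateWeights(0.1);             // learning rate 0.1
    }
}
```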
Each test now includes timing information:
- Training time: Time spent in the training loop
- Testing time: Time spent evaluating the network
Example output:
>> Training complete. Time: 12345 ms
>> Testing complete. Time: 2 ms
>> Final MSE (on normalized values): 0.00173
>> Accuracy (within 5% tolerance): 72.7% (8/11 samples)
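
Timing like this can be captured with std::chrono around the training and testing loops; a minimal sketch (the actual tests may measure it differently):

```cpp
#include <chrono>
#include <iostream>

// Wall-clock timing around a training loop.
auto start = std::chrono::steady_clock::now();
// ... training loop ...
auto end = std::chrono::steady_clock::now();
auto ms  = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
std::cout << ">> Training complete. Time: " << ms << " ms" << std::endl;
```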
For faster training, build in Release mode with optimizations:
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make test_training -j4
./test_training

Performance difference:
- Debug mode (-O0): ~5+ minutes for linear regression
- Release mode (-O3): ~5-15 seconds for linear regression (10-60x faster!)
A PyTorch baseline implementation is provided for comparison:
# Install dependencies (if needed)
pip install torch numpy matplotlib
# Run PyTorch comparison
python3 pytorch_toy_example/linear_regression_comparison.py

This generates:
- Console output with training progress
- Comparison plot: pytorch_toy_example/comparison_results.png
- Performance benchmarks for both linear regression and neural network approaches
#include "network.h"
using namespace ml;
// Create a 3-layer network with ReLU hidden layers and sigmoid output
// Architecture: 2 inputs -> 4 hidden (ReLU) -> 1 output (Sigmoid)
Network<double>* network = new Network<double>();
// Create layers with specific activation functions
Layer<double>* inputLayer = new Layer<double>(2, "Input", ActivationType::LINEAR);
Layer<double>* hiddenLayer = new Layer<double>(4, "Hidden", ActivationType::RELU);
Layer<double>* outputLayer = new Layer<double>(1, "Output", ActivationType::SIGMOID);
// Connect layers
network->setInputLayer(inputLayer);
network->connect(inputLayer, hiddenLayer);
network->connect(hiddenLayer, outputLayer);
network->setOutputLayer(outputLayer);
// Set optimizer (default is SGD)
network->setOptimizerType(OptimizerType::ADAM); // or MOMENTUM, SGD
// Initialize weights
network->init();
// Forward pass
Mat<double> input(1, 2, 0);
input.setAt(0, 0, 1.0);
input.setAt(0, 1, 0.5);
Mat<double> output = network->feed(input);
// Backward pass and weight update
Mat<double> targetOutput(1, 1, 0.8);
Mat<double> error = Diff(targetOutput, output);
outputLayer->setErrors(error);
network->backprop();
network->updateWeights(0.01); // learning rate = 0.01
// Training loop example
for (int epoch = 0; epoch < 1000; epoch++) {
Mat<double> output = network->feed(input);
outputLayer->setErrors(Diff(targetOutput, output));
network->backprop();
network->updateWeights(0.01);
}

// When creating layers, specify activation type:
Layer<double>* layer1 = new Layer<double>(10, "Layer1", ActivationType::RELU);
Layer<double>* layer2 = new Layer<double>(10, "Layer2", ActivationType::TANH);
Layer<double>* layer3 = new Layer<double>(10, "Layer3", ActivationType::SIGMOID);
Layer<double>* layer4 = new Layer<double>(10, "Layer4", ActivationType::SOFTMAX);
// For Leaky ReLU or ELU, you can specify the alpha parameter:
Layer<double>* leaky = new Layer<double>(10, "Leaky", ActivationType::LEAKY_RELU, 0.01);
Layer<double>* elu = new Layer<double>(10, "ELU", ActivationType::ELU, 1.0);
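
The alpha argument sets the negative-side behaviour of these activations. A minimal illustrative sketch of the usual definitions (standard formulas, not the library's Activate() implementation):

```cpp
#include <cmath>

// Standard definitions, shown for reference only.
double leakyRelu(double x, double alpha = 0.01) {
    return x > 0.0 ? x : alpha * x;                    // alpha = slope for x < 0
}

double elu(double x, double alpha = 1.0) {
    return x > 0.0 ? x : alpha * (std::exp(x) - 1.0);  // alpha = saturation level for x < 0
}
```

// Set optimizer type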
network->setOptimizerType(OptimizerType::SGD); // Basic SGD with gradient clipping
network->setOptimizerType(OptimizerType::MOMENTUM); // SGD with momentum (β=0.9)
network->setOptimizerType(OptimizerType::ADAM); // Adam optimizer (β1=0.9, β2=0.999)
// Or create and set a custom optimizer
AdamOptimizer<double>* adam = new AdamOptimizer<double>(0.9, 0.999, 1e-8);
network->setOptimizer(adam);
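
For reference, the per-weight update that an Adam-style optimizer applies combines bias-corrected first- and second-moment estimates of the gradient. A minimal, self-contained sketch with the default β1=0.9, β2=0.999 (illustrative only, not the library's AdamOptimizer internals; adamStep is a hypothetical helper):

```cpp
#include <cmath>
#include <vector>

// Illustrative Adam update for a flat weight vector (t is the 1-based step count).
// Not the library's AdamOptimizer implementation.
void adamStep(std::vector<double>& w, const std::vector<double>& grad,
              std::vector<double>& m, std::vector<double>& v,
              int t, double lr,
              double beta1 = 0.9, double beta2 = 0.999, double eps = 1e-8) {
    for (std::size_t i = 0; i < w.size(); ++i) {
        m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];            // first moment
        v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];  // second moment
        const double mHat = m[i] / (1.0 - std::pow(beta1, t));    // bias correction
        const double vHat = v[i] / (1.0 - std::pow(beta2, t));
        w[i] -= lr * mHat / (std::sqrt(vHat) + eps);              // gradient-descent step
    }
}
```

The library includes optimized matrix operations: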
// Element-wise multiplication
Mat<double> result = ElementMult(m1, m2);
// Matrix multiplication
Mat<double> result = Mult(m1, m2);
// Activation functions (using unified interface)
Mat<double> activated = Activate(input, ActivationType::RELU);
Mat<double> grad = ActivateGrad(activated, ActivationType::RELU);
// Or use specific activation functions directly
Mat<double> sigmoid_out = Sigmoid(input);
Mat<double> relu_out = ReLU(input);
Mat<double> tanh_out = Tanh(input);
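
These operations can also be composed by hand, for example to compute a single dense layer outside a Network. A short sketch, assuming Mult performs standard matrix multiplication with the shapes shown:

```cpp
// Sketch: one dense layer computed manually with the matrix API above.
// Assumes Mult() is standard matrix multiplication (1x2 times 2x4 -> 1x4).
Mat<double> x(1, 2, 0);      // row-vector input
x.setAt(0, 0, 1.0);
x.setAt(0, 1, 0.5);

Mat<double> W(2, 4, 0.1);    // weight matrix (constant-initialized for illustration)
Mat<double> z = Mult(x, W);                          // pre-activation
Mat<double> a = Activate(z, ActivationType::RELU);   // layer output
```

Save and load trained network weights to/from JSON files: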
// After training your network...
network->init();
// ... perform training ...
// Save the model to a file
if (network->saveToFile("my_model.json")) {
std::cout << "Model saved successfully!" << std::endl;
}

// Create a network with the same structure as the saved model
Network<double>* network = new Network<double>();
ILayer<double>* inputLayer = new Layer<double>(2, "Input");
ILayer<double>* hiddenLayer = new Layer<double>(4, "Hidden");
ILayer<double>* outputLayer = new Layer<double>(1, "Output");
network->setInputLayer(inputLayer);
network->connect(inputLayer, hiddenLayer);
network->connect(hiddenLayer, outputLayer);
network->setOutputLayer(outputLayer);
// Initialize with random weights (will be overwritten)
network->init();
// Load the saved weights
if (network->loadFromFile("my_model.json")) {
std::cout << "Model loaded successfully!" << std::endl;
// Now you can use the network for inference
Mat<double> input(1, 2, 0);
input.setAt(0, 0, 1.0);
input.setAt(0, 1, 0.5);
Mat<double> output = network->feed(input);
}

The saved model contains:
- layers: Array of layer information (name, input size, output size)
- weights: Weight matrices between connected layers
Example:
{
"layers": [
{"index": 0, "name": "Input", "input_size": 2, "output_size": 3},
{"index": 1, "name": "Hidden", "input_size": 4, "output_size": 5},
{"index": 2, "name": "Output", "input_size": 1, "output_size": 2}
],
"weights": [
{
"from_index": 0,
"to_index": 1,
"rows": 4,
"cols": 3,
"values": [0.123, -0.456, ...]
}
]
}

Important: When loading a model, the network structure (number of layers and their sizes) must match the saved model exactly.
Run the save/load test to see a complete example:
cd build
make test_model_save_load
./test_model_save_load

- ILayer: Base interface for all layer types
- Layer: Standard neural network layer implementation
- Network: Container that manages multiple layers and propagation
- Mat: Template-based matrix class with reference counting
- C++11 or later
- CMake 3.2+
- OpenMP (optional, for parallel operations)
- GCC or Clang with C++11 support
The library supports both x86/x64 and ARM architectures (including Apple Silicon). SSE optimizations are automatically disabled on non-x86 platforms, and the build system configures the appropriate settings for your platform.
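
One common way such conditional SIMD support is expressed in C++ is a compile-time guard (an illustrative pattern, not necessarily how this library's Matrix code is organized):

```cpp
// Illustrative pattern: guard SIMD code paths by architecture at compile time.
#if defined(__SSE2__)
  #include <emmintrin.h>   // SSE2 intrinsics, available on x86/x64 builds
  // ... vectorized implementation ...
#else
  // ... portable scalar fallback (e.g. on ARM/Apple Silicon) ...
#endif
```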
Contributions are welcome! Please ensure:
- All tests pass: make test_network && ./test_network
- Code follows existing style conventions
- New features include corresponding unit tests
See License.txt for details.