Basic Operations
- construct network
- access layer properties
- access weight vector for each layer
- train network
- visualize each layer's activations
- visualize convolution kernels
- save and load network
- load dataset
- change weight initialization
- handle errors
You can construct a network by chaining layers with operator <<, from top (input) to bottom (output).
...
// input: 32x32x1 (1024 dimensions), output: 10
network<mse, adagrad> net;
net << convolutional_layer<tan_h>(32, 32, 5, 1, 6)    // 32x32 input, 5x5 kernel, 1 -> 6 channels
    << average_pooling_layer<tan_h>(28, 28, 6, 2)     // 28x28x6 input, 2x2 pooling
    << fully_connected_layer<tan_h>(14 * 14 * 6, 120) // 1176 -> 120
    << fully_connected_layer<identity>(120, 10);      // 120 -> 10
...
// input: 32x32x3 (3072 dimensions), output: 40
network<cross_entropy, RMSprop> net;
net << convolutional_layer<relu>(32, 32, 5, 3, 9)     // 32x32 input, 5x5 kernel, 3 -> 9 channels
    << average_pooling_layer<relu>(28, 28, 9, 2)      // 28x28x9 input, 2x2 pooling
    << fully_connected_layer<tan_h>(14 * 14 * 9, 120) // 1764 -> 120
    << fully_connected_layer<softmax>(120, 40);       // 120 -> 40
If your network is a simple multi-layer perceptron (MLP), you can also use the make_mlp function.
...
auto mynet = make_mlp<mse, gradient_descent, tan_h>({ 32 * 32, 300, 10 });
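The object returned by make_mlp is an ordinary network, so the calls shown in the following sections (training, layer access, serialization) apply to it unchanged. A minimal sketch, continuing from the make_mlp call above; "images" and "labels" are placeholders for a dataset loaded as in the load-dataset section later in this page:

mynet.train(images, labels, 50, 20); // minibatch size 50, 20 epochs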
After construction, you can access each layer with operator[].
...
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);

for (int i = 0; i < nn.depth(); i++) {
    cout << "#layer:" << i << "\n";
    cout << "layer type:" << nn[i]->layer_type() << "\n";
    cout << "input:" << nn[i]->in_size() << "(" << nn[i]->in_shape() << ")\n";
    cout << "output:" << nn[i]->out_size() << "(" << nn[i]->out_shape() << ")\n";
    cout << "num of parameters:" << nn[i]->param_size() << "\n"; // weights + biases
}
output:
#layer:0
layer type:conv
input:3072(32x32x3)
output:4704(28x28x6)
num of parameters:456
#layer:1
layer type:max-pool
input:4704(28x28x6)
output:1176(14x14x6)
num of parameters:0
#layer:2
layer type:fully-connected
input:1176(1176x1x1)
output:10(10x1x1)
num of parameters:11770
The weight vector and bias vector of each layer can be obtained via weight() and bias():
...
vec_t& weight = nn[i]->weight();
vec_t& bias = nn[i]->bias();
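For example, a minimal sketch (reusing only depth(), weight() and bias() from above, plus <cmath> for std::fabs) that prints the parameter counts and the mean absolute weight of every layer:

for (int i = 0; i < nn.depth(); i++) {
    const vec_t& w = nn[i]->weight();
    const vec_t& b = nn[i]->bias();
    cout << "layer " << i << ": " << w.size() << " weights, " << b.size() << " biases";
    if (!w.empty()) {                          // e.g. max-pooling layers have no weights
        double sum = 0.0;
        for (auto v : w) sum += std::fabs(v);  // accumulate |w| over all weights
        cout << ", mean |w| = " << sum / w.size();
    }
    cout << "\n";
}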
To train the network, call train(). Without a callback:
...
// minibatch size = 50, number of epochs = 20
nn.train(train_images, train_labels, 50, 20);
With callbacks (the first is called after every minibatch, the second after every epoch):
...
// test & save the network after each epoch
int epoch = 0;
nn.train(train_images, train_labels, 50, 20,
    [](){},  // per-minibatch callback: do nothing
    [&](){   // per-epoch callback: evaluate and serialize
        result res = nn.test(test_images, test_labels);
        cout << res.num_success << "/" << res.num_total << endl;
        ofstream ofs(("epoch_" + to_string(epoch++)).c_str());
        ofs << nn;
    });
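After training, you can run the model on new samples. A minimal sketch, assuming test_images holds inputs preprocessed the same way as the training data; predict() returns the raw output vector, and the index of its largest element is taken as the predicted class (std::max_element and std::distance need <algorithm>/<iterator>):

vec_t scores = nn.predict(test_images[0]);
auto it = std::max_element(scores.begin(), scores.end());
int predicted = static_cast<int>(std::distance(scores.begin(), it));
cout << "predicted class: " << predicted << "\n";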
You can visualize the activations of each layer as an image:
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
image img = nn[0]->output_to_image(); // visualize the activations for the most recent input
img.write("layer0.bmp");
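A small sketch extending this to every layer, one bitmap per layer; it reuses depth(), output_to_image() and image::write() from above, "input_image" is a placeholder for a sample you have already loaded, and a forward pass is run first so there are activations to draw:

nn.predict(input_image); // forward pass fills each layer's output
for (int i = 0; i < nn.depth(); i++) {
    string filename = "layer" + to_string(i) + ".bmp";
    image img = nn[i]->output_to_image();
    img.write(filename.c_str());
}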
You can also visualize the convolution kernels. Use at<T>() to access a layer as its concrete type, then call weight_to_image():
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
image img = nn.at<convolutional_layer<tan_h>>(0).weight_to_image(); // 5x5 kernels of layer 0
img.write("kernel0.bmp");
Simply use operator << and >> to save/load network weights.
save
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
std::ofstream output("nets.txt");
output << nn;
load
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
std::ifstream input("nets.txt");
input >> nn;
tiny_cnn saves only the weight/bias arrays, not the network structure itself, so you must construct the same network as at training time before loading.
From MNIST idx format:
vector<vec_t> images;
vector<label_t> labels;
// scale pixel values to [-1.0, 1.0] and pad 2 pixels on each border (28x28 -> 32x32)
parse_mnist_images("train-images.idx3-ubyte", &images, -1.0, 1.0, 2, 2);
parse_mnist_labels("train-labels.idx1-ubyte", &labels);
From CIFAR-10 binary format:
vector<vec_t> images;
vector<label_t> labels;
// scale pixel values to [-1.0, 1.0], no padding (images stay 32x32x3)
parse_cifar10("data_batch_1.bin", &images, &labels, -1.0, 1.0, 0, 0);
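The loaded images and labels plug straight into train(). A minimal end-to-end sketch combining the MNIST loader above with the 32x32x1 network from the first example (the 2-pixel padding turns 28x28 digits into 32x32 inputs, matching the network's input size):

vector<vec_t> images;
vector<label_t> labels;
parse_mnist_images("train-images.idx3-ubyte", &images, -1.0, 1.0, 2, 2);
parse_mnist_labels("train-labels.idx1-ubyte", &labels);

network<mse, adagrad> net;
net << convolutional_layer<tan_h>(32, 32, 5, 1, 6)
    << average_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 120)
    << fully_connected_layer<identity>(120, 10);

net.train(images, labels, 50, 20); // minibatch size 50, 20 epochs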
In neural network training, the initial values of weights and biases can affect training speed and accuracy. In tiny_cnn, weights are scaled with the Xavier algorithm by default and biases are filled with 0.
To change the initialization method and scaling factor, use the weight_init() and bias_init() functions of the network and layer classes.
- xavier ... automatic scaling using sqrt(scale / (fan-in + fan-out))
- lecun ... automatic scaling using scale / sqrt(fan-in)
- constant ... fill with a constant value
int num_units [] = { 100, 400, 100 };
auto nn = make_mlp<mse, gradient_descent, tan_h>(num_units, num_units + 3);
// change all layers at once
nn.weight_init(weight_init::lecun());
nn.bias_init(weight_init::xavier(2.0));
// change specific layer
nn[0]->weight_init(weight_init::xavier(4.0));
nn[0]->bias_init(weight_init::constant(1.0));
tiny_cnn throws tiny_cnn::nn_error exceptions at runtime. You can get a detailed error message from the nn_error::what() function.
try {
    network<cross_entropy, RMSprop> nn;
    ...
} catch (const nn_error& e) {
    cout << e.what();
}
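A concrete sketch of one typical failure: the output size of a layer does not match the input size of the next. The exact message (and the point at which it is raised) may vary by tiny_cnn version, but dimension mismatches like this are a common cause of nn_error:

try {
    network<mse, adagrad> nn;
    nn << convolutional_layer<tan_h>(32, 32, 5, 1, 6)
       << average_pooling_layer<tan_h>(28, 28, 6, 2)
       << fully_connected_layer<tan_h>(100, 10); // wrong: previous layer outputs 14 * 14 * 6 = 1176 values
} catch (const nn_error& e) {
    cout << e.what();
}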