construct network

You can construct a network by chaining operator << from top (input) to bottom (output).

...
// input: 32x32x1 (1024 dimensions)  output: 10
network<mse, adagrad> net;
net << convolutional_layer<tan_h>(32, 32, 5, 1, 6)   // 32x32 input, 5x5 kernel, 1 -> 6 channels
    << average_pooling_layer<tan_h>(28, 28, 6, 2)     // 28x28 input, 2x2 pooling
    << fully_connected_layer<tan_h>(14 * 14 * 6, 120)
    << fully_connected_layer<identity>(120, 10);
...
// input: 32x32x3 (3072 dimensions)  output: 40
network<cross_entropy, RMSprop> net;
net << convolutional_layer<relu>(32, 32, 5, 3, 9)     // 32x32 input, 5x5 kernel, 3 -> 9 channels
    << average_pooling_layer<relu>(28, 28, 9, 2)      // 28x28 input, 2x2 pooling
    << fully_connected_layer<tan_h>(14 * 14 * 9, 120)
    << fully_connected_layer<softmax>(120, 40);

If your network is a simple MLP (multi-layer perceptron), you can also use the make_mlp function.

...
auto mynet = make_mlp<mse, gradient_descent, tan_h>({ 32 * 32, 300, 10 });
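
The make_mlp call above builds roughly the same network as chaining fully_connected_layer objects by hand (a sketch; it assumes, as make_mlp does, that the same activation is applied to every layer):

...
network<mse, gradient_descent> mynet;
mynet << fully_connected_layer<tan_h>(32 * 32, 300)
      << fully_connected_layer<tan_h>(300, 10);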

access layer properties

You can access each layer by operator[] after construction.

...
network<cross_entropy, RMSprop> nn;

nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
    << max_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 10);

for (int i = 0; i < nn.depth(); i++) {
    cout << "#layer:" << i << "\n";
    cout << "layer type:" << nn[i]->layer_type() << "\n";
    cout << "input:" << nn[i]->in_size() << "(" << nn[i]->in_shape() << ")\n";
    cout << "output:" << nn[i]->out_size() << "(" << nn[i]->out_shape() << ")\n";
}

output:

#layer:0
layer type:conv
input:3072(32x32x3)
output:4704(28x28x6)
num of parameters:456
#layer:1
layer type:max-pool
input:4704(28x28x6)
output:1176(14x14x6)
num of parameters:0
#layer:2
layer type:fully-connected
input:1176(1176x1x1)
output:10(10x1x1)
num of parameters:11770
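
The parameter counts follow directly from the layer shapes: the convolutional layer has 5 * 5 * 3 * 6 = 450 weights plus 6 biases (456 in total), the max-pooling layer has no trainable parameters, and the fully-connected layer has 1176 * 10 = 11760 weights plus 10 biases (11770 in total).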

access weight vector for each layer

...
vec_t& weight = nn[i]->weight();
vec_t& bias = nn[i]->bias();
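
For example, you can dump the learned parameters of a single layer (a minimal sketch; layer index 0 and the stream output are just for illustration):

...
vec_t& w = nn[0]->weight();
vec_t& b = nn[0]->bias();

for (size_t j = 0; j < w.size(); j++)
    cout << w[j] << "\n";
for (size_t j = 0; j < b.size(); j++)
    cout << b[j] << "\n";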

train network

without callback

...
// minibatch=50, epoch=20
nn.train(train_images, train_labels, 50, 20);
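
Here train_images and train_labels are the containers produced by the dataset loaders described in the "load dataset" section below; a sketch of the expected types:

...
std::vector<vec_t> train_images;    // one vec_t (flattened pixel vector) per image
std::vector<label_t> train_labels;  // one class index per image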

with callback

...
// test & save for each epoch
int epoch = 0;
nn.train(train_images, train_labels, 50, 20,
         [](){},   // called after each minibatch (no-op here)
         [&](){    // called after each epoch
           result res = nn.test(test_images, test_labels);
           cout << res.num_success << "/" << res.num_total << endl;
           ofstream ofs(("epoch_" + to_string(epoch++)).c_str());
           ofs << nn;
         });

visualize each layer's activations

network<cross_entropy, RMSprop> nn;

nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
    << max_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
image img = nn[0]->output_to_image(); // visualize activations of recent input
img.write("layer0.bmp");

visualize convolution kernels

network<cross_entropy, RMSprop> nn;

nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
    << max_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
image img = nn.at<convolutional_layer<tan_h>>(0).weight_to_image();
img.write("kernel0.bmp");

save and load network

Simply use operator << and >> to save/load network weights.

save

network<cross_entropy, RMSprop> nn;

nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
    << max_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
std::ofstream output("nets.txt");
output << nn;

load

network<cross_entropy, RMSprop> nn;

nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
    << max_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
std::ifstream input("nets.txt");
input >> nn;

tiny_cnn saves only the weight/bias arrays, not the network structure itself, so you must construct the network (the same way as at training time) before loading.

load dataset

from MNIST idx format

vector<vec_t> images;
vector<label_t> labels;
parse_mnist_images("train-images.idx3-ubyte", &images, -1.0, 1.0, 2, 2);
parse_mnist_labels("train-labels.idx1-ubyte", &labels);

from cifar-10 binary format

vector<vec_t> images;
vector<label_t> labels;
parse_cifar10("data_batch1.bin", &images, &labels, -1.0, 1.0, 0, 0); 

change weight initialization

In neural network training, the initial values of the weights and biases can affect training speed and accuracy. In tiny-cnn, weights are scaled appropriately by the Xavier algorithm and biases are filled with 0 by default.

To change the initialization method and scaling factor, use the weight_init() and bias_init() functions of the network and layer classes:

  • xavier ... automatic scaling using sqrt(scale / (fan-in + fan-out))
  • lecun ... automatic scaling using scale / sqrt(fan-in)
  • constant ... fill constant value

int num_units[] = { 100, 400, 100 };
auto nn = make_mlp<mse, gradient_descent, tan_h>(num_units, num_units + 3);

// change all layers at once
nn.weight_init(weight_init::lecun());
nn.bias_init(weight_init::xavier(2.0));

// change specific layer
nn[0]->weight_init(weight_init::xavier(4.0));
nn[0]->bias_init(weight_init::constant(1.0));

handle error

tiny_cnn throws tiny_cnn::nn_error at runtime when something goes wrong. You can get a detailed error message from the nn_error::what() function.

try {
   network<cross_entropy, RMSprop> nn;
   ...
} catch (const nn_error& e) {
   cout << e.what();
}