Basic Operations
- construct network
- access layer properties
- access weight vector for each layer
- train network
- visualize each layer's activations
- visualize convolution kernels
- save and load network
- load dataset
- change weight initialization
- handle errors
You can construct a network by chaining layers with operator <<, from top (input) to bottom (output).
...
// input: 32x32x1 (1024 dimensions), output: 10
network<mse, adagrad> net;
net << convolutional_layer<tan_h>(32, 32, 5, 1, 6)    // 32x32 input, 5x5 kernel, 1 -> 6 channels
    << average_pooling_layer<tan_h>(28, 28, 6, 2)     // 28x28x6 input, 2x2 pooling
    << fully_connected_layer<tan_h>(14 * 14 * 6, 120) // 1176 -> 120
    << fully_connected_layer<identity>(120, 10);      // 120 -> 10
...
// input: 32x32x3 (3072 dimensions), output: 40
network<cross_entropy, RMSprop> net;
net << convolutional_layer<relu>(32, 32, 5, 3, 9)     // 32x32 input, 5x5 kernel, 3 -> 9 channels
    << average_pooling_layer<relu>(28, 28, 9, 2)      // 28x28x9 input, 2x2 pooling
    << fully_connected_layer<tan_h>(14 * 14 * 9, 120) // 1764 -> 120
    << fully_connected_layer<softmax>(120, 40);       // 120 -> 40
If your network is a simple multi-layer perceptron (MLP), you can also use the make_mlp function.
...
auto mynet = make_mlp<mse, gradient_descent, tan_h>({ 32 * 32, 300, 10 });
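The object returned by make_mlp is an ordinary network, so the calls shown in the following sections (training, layer access, serialization) apply to it unchanged. A minimal sketch, continuing from the make_mlp call above; "images" and "labels" are placeholders for a dataset loaded as in the load-dataset section later in this page:

mynet.train(images, labels, 50, 20); // minibatch size 50, 20 epochs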
After construction, you can access each layer with operator[].
...
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);

for (int i = 0; i < nn.depth(); i++) {
    cout << "#layer:" << i << "\n";
    cout << "layer type:" << nn[i]->layer_type() << "\n";
    cout << "input:" << nn[i]->in_size() << "(" << nn[i]->in_shape() << ")\n";
    cout << "output:" << nn[i]->out_size() << "(" << nn[i]->out_shape() << ")\n";
    cout << "num of parameters:" << nn[i]->param_size() << "\n"; // weights + biases
}
output:
#layer:0
layer type:conv
input:3072(32x32x3)
output:4704(28x28x6)
num of parameters:456
#layer:1
layer type:max-pool
input:4704(28x28x6)
output:1176(14x14x6)
num of parameters:0
#layer:2
layer type:fully-connected
input:1176(1176x1x1)
output:10(10x1x1)
num of parameters:11770
The weight vector and bias vector of each layer can be obtained via weight() and bias():
...
vec_t& weight = nn[i]->weight();
vec_t& bias = nn[i]->bias();
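For example, a minimal sketch (reusing only depth(), weight() and bias() from above, plus <cmath> for std::fabs) that prints the parameter counts and the mean absolute weight of every layer:

for (int i = 0; i < nn.depth(); i++) {
    const vec_t& w = nn[i]->weight();
    const vec_t& b = nn[i]->bias();
    cout << "layer " << i << ": " << w.size() << " weights, " << b.size() << " biases";
    if (!w.empty()) {                          // e.g. max-pooling layers have no weights
        double sum = 0.0;
        for (auto v : w) sum += std::fabs(v);  // accumulate |w| over all weights
        cout << ", mean |w| = " << sum / w.size();
    }
    cout << "\n";
}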
To train the network, call train(). Without a callback:
...
// minibatch size = 50, number of epochs = 20
nn.train(train_images, train_labels, 50, 20);
With callbacks (the first is called after every minibatch, the second after every epoch):
...
// test & save the network after each epoch
int epoch = 0;
nn.train(train_images, train_labels, 50, 20,
    [](){},  // per-minibatch callback: do nothing
    [&](){   // per-epoch callback: evaluate and serialize
        result res = nn.test(test_images, test_labels);
        cout << res.num_success << "/" << res.num_total << endl;
        ofstream ofs(("epoch_" + to_string(epoch++)).c_str());
        ofs << nn;
    });
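After training, you can run the model on new samples. A minimal sketch, assuming test_images holds inputs preprocessed the same way as the training data; predict() returns the raw output vector, and the index of its largest element is taken as the predicted class (std::max_element and std::distance need <algorithm>/<iterator>):

vec_t scores = nn.predict(test_images[0]);
auto it = std::max_element(scores.begin(), scores.end());
int predicted = static_cast<int>(std::distance(scores.begin(), it));
cout << "predicted class: " << predicted << "\n";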
You can visualize the activations of each layer as an image:
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
image img = nn[0]->output_to_image(); // visualize the activations for the most recent input
img.write("layer0.bmp");
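A small sketch extending this to every layer, one bitmap per layer; it reuses depth(), output_to_image() and image::write() from above, "input_image" is a placeholder for a sample you have already loaded, and a forward pass is run first so there are activations to draw:

nn.predict(input_image); // forward pass fills each layer's output
for (int i = 0; i < nn.depth(); i++) {
    string filename = "layer" + to_string(i) + ".bmp";
    image img = nn[i]->output_to_image();
    img.write(filename.c_str());
}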
You can also visualize the convolution kernels. Use at<T>() to access a layer as its concrete type, then call weight_to_image():
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
image img = nn.at<convolutional_layer<tan_h>>(0).weight_to_image(); // 5x5 kernels of layer 0
img.write("kernel0.bmp");
Simply use operator << and >> to save/load network weights.
save
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
std::ofstream output("nets.txt");
output << nn;
load
network<cross_entropy, RMSprop> nn;
nn << convolutional_layer<tan_h>(32, 32, 5, 3, 6)
   << max_pooling_layer<tan_h>(28, 28, 6, 2)
   << fully_connected_layer<tan_h>(14 * 14 * 6, 10);
...
std::ifstream input("nets.txt");
input >> nn;
tiny_cnn saves only the weight/bias arrays, not the network structure itself, so you must construct the same network as at training time before loading.
From MNIST idx format:
vector<vec_t> images;
vector<label_t> labels;
// scale pixel values to [-1.0, 1.0] and pad 2 pixels on each border (28x28 -> 32x32)
parse_mnist_images("train-images.idx3-ubyte", &images, -1.0, 1.0, 2, 2);
parse_mnist_labels("train-labels.idx1-ubyte", &labels);
From CIFAR-10 binary format:
vector<vec_t> images;
vector<label_t> labels;
// scale pixel values to [-1.0, 1.0], no padding (images stay 32x32x3)
parse_cifar10("data_batch_1.bin", &images, &labels, -1.0, 1.0, 0, 0);
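The loaded images and labels plug straight into train(). A minimal end-to-end sketch combining the MNIST loader above with the 32x32x1 network from the first example (the 2-pixel padding turns 28x28 digits into 32x32 inputs, matching the network's input size):

vector<vec_t> images;
vector<label_t> labels;
parse_mnist_images("train-images.idx3-ubyte", &images, -1.0, 1.0, 2, 2);
parse_mnist_labels("train-labels.idx1-ubyte", &labels);

network<mse, adagrad> net;
net << convolutional_layer<tan_h>(32, 32, 5, 1, 6)
    << average_pooling_layer<tan_h>(28, 28, 6, 2)
    << fully_connected_layer<tan_h>(14 * 14 * 6, 120)
    << fully_connected_layer<identity>(120, 10);

net.train(images, labels, 50, 20); // minibatch size 50, 20 epochs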
In neural network training, the initial values of weights and biases can affect training speed and accuracy. In tiny_cnn, weights are scaled with the Xavier algorithm by default and biases are filled with 0.
To change the initialization method and scaling factor, use the weight_init() and bias_init() functions of the network and layer classes.
- xavier ... automatic scaling using sqrt(scale / (fan-in + fan-out))
- lecun ... automatic scaling using scale / sqrt(fan-in)
- constant ... fill with a constant value
int num_units [] = { 100, 400, 100 };
auto nn = make_mlp<mse, gradient_descent, tan_h>(num_units, num_units + 3);
// change all layers at once
nn.weight_init(weight_init::lecun());
nn.bias_init(weight_init::xavier(2.0));
// change specific layer
nn[0]->weight_init(weight_init::xavier(4.0));
nn[0]->bias_init(weight_init::constant(1.0));
tiny_cnn throws tiny_cnn::nn_error exceptions at runtime. You can get a detailed error message from the nn_error::what() function.
try {
    network<cross_entropy, RMSprop> nn;
    ...
} catch (const nn_error& e) {
    cout << e.what();
}
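A concrete sketch of one typical failure: the output size of a layer does not match the input size of the next. The exact message (and the point at which it is raised) may vary by tiny_cnn version, but dimension mismatches like this are a common cause of nn_error:

try {
    network<mse, adagrad> nn;
    nn << convolutional_layer<tan_h>(32, 32, 5, 1, 6)
       << average_pooling_layer<tan_h>(28, 28, 6, 2)
       << fully_connected_layer<tan_h>(100, 10); // wrong: previous layer outputs 14 * 14 * 6 = 1176 values
} catch (const nn_error& e) {
    cout << e.what();
}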