cuda-convnet2 adds a new cross-entropy cost layer.
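1. What does this buy us?
A multi-class classifier built on softmax learns the relative ordering of its output units, and the index of the unit with the largest output is used as the assigned label.
In a binary classifier, by contrast, a single output unit learns only a Yes or No answer to one condition.
Applying this idea, if several output units each learn an independent decision, we get a neural network that can output several kinds of judgments for a single input.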
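The difference between the two output schemes can be sketched in a few lines of plain Python. This is only an illustration of the idea, not cuda-convnet2 code: the function names and the score values are made up for the example.

```python
import math

def softmax(scores):
    """Normalize scores so that they sum to 1 (the units compete)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(score):
    """Squash one score into (0, 1) independently of the other units."""
    return 1.0 / (1.0 + math.exp(-score))

def multiclass_label(scores):
    """Softmax classifier: the index of the largest output is the label."""
    probs = softmax(scores)
    return probs.index(max(probs))

def multilabel_flags(scores, threshold=0.5):
    """Independent binary units: each unit answers its own yes/no question."""
    return [1 if sigmoid(s) >= threshold else 0 for s in scores]

scores = [2.0, -1.5, 3.0]          # hypothetical pre-activation values
print(multiclass_label(scores))    # a single label index
print(multilabel_flags(scores))    # one flag per unit; several can be 1
```

With softmax only one unit can win, whereas the independent logistic units can raise several flags for the same input.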
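2. Preparing the experiment
(1) Experimental conditions
Since this blog's theme is MNIST, the experiment here uses MNIST as well.
1) The training and test images are the MNIST digit images.
2) The base network configuration is the one used in "(25) Automatic MNIST recognition with cuda-convnet2 (part 1)".
3) The output layer is changed to a detection cross-entropy cost layer.
4) The three output units are defined as follows.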
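Output unit No.1 | odd flag |
Output unit No.2 | even flag |
Output unit No.3 | multiple-of-3 flag |
Under these rules, the outputs for the MNIST digit images 0-9 are as follows. Note that in this table the digit 0 sets only the even flag; its multiple-of-3 flag is 0.
Digit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Output (bin) | 010 | 100 | 010 | 101 | 010 | 100 | 011 | 100 | 010 | 101 |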
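(2) Creating the training and test data
The cuda-convnet input data tool created in "Creating MNIST data for cuda-convnet (part 2)" and "(part 3)" needs a small modification. Only one place changes.
1) The teacher data is no longer a label number but the output-layer values defined above (a 3-dimensional vector).
Example) In the multi-class setup so far, the teacher data in each data_batch_N file was set like this (the values are arbitrary):
labels = [0,3,2,6,1,3]
For the binary classifier used here, it is set like this (the values are arbitrary):
labels = [[0,1,0],[1,0,0],[1,1,0],[1,0,1]]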
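The conversion from the old label numbers to the new 3-flag teacher data could look roughly like the sketch below. The helper names digit_to_flags and to_multilabel are hypothetical (they are not part of the original tool), and the rule for 0 is an assumption chosen so that the result matches the table of expected outputs in (1).

```python
def digit_to_flags(digit):
    """Return [odd, even, multiple-of-3] flags for a digit 0-9.

    Assumption: 0 is not flagged as a multiple of 3, matching the
    table of expected outputs in the experimental conditions.
    """
    odd = digit % 2
    even = 1 - odd
    tri = 1 if digit != 0 and digit % 3 == 0 else 0
    return [odd, even, tri]

def to_multilabel(labels):
    """Convert multi-class digit labels into per-image flag vectors."""
    return [digit_to_flags(d) for d in labels]

old_labels = [0, 3, 2, 6, 1, 3]         # multi-class style
new_labels = to_multilabel(old_labels)  # binary-classifier style
print(new_labels)
```

Only the labels entry changes; the image data itself stays the same as in the multi-class setup.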
(3) Creating the NN definition files
layers-MNIST-tri.cfg
The neuron of the last fully connected layer must be set to logistic; otherwise an error occurs at run time.
[data]
type=data
dataIdx=0

[labels]
type=data
dataIdx=1

[conv1]
type=conv
inputs=data
channels=1
filters=32
padding=0
stride=1
filterSize=5
neuron=tanh[1,1]
initW=0.0001
sumWidth=4
sharedBiases=1
gpu=0

[pool1]
type=pool
pool=max
inputs=conv1
start=0
sizeX=2
stride=2
outputsX=0
channels=32

[conv2]
type=conv
inputs=pool1
filters=32
padding=0
stride=1
filterSize=5
channels=32
neuron=tanh[1,1]
initW=0.01
sumWidth=2
sharedBiases=1

[pool2]
type=pool
pool=avg
inputs=conv2
start=0
sizeX=2
stride=2
outputsX=0
channels=32

[fcOut]
type=fc
outputs=3
inputs=pool2
initW=0.01
initB=0.1
neuron=logistic

[dce]
type=cost.dce
inputs=labels,fcOut
gpu=0
layer-params-MNIST-tri.cfg
[conv1]
epsW=0.01
epsB=0.01
momW=0.9
momB=0.9
wc=0.0001

[conv2]
epsW=0.01
epsB=0.01
momW=0.9
momB=0.9
wc=0.0001

[fcOut]
epsW=0.01
epsB=0.01
momW=0.9
momB=0.9
wc=0.0001

[dce]
coeff=1
topk=1
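As a sanity check on the layer definition, the output sizes can be worked out by hand from filterSize, padding, stride and sizeX. The sketch below uses the usual "valid window" arithmetic; it is my own calculation and not taken from the cuda-convnet2 source, and it assumes 28x28 input images and pooling windows that tile the input exactly. The results agree with the sizes reported in the startup log (24x24, 12x12, 8x8, 4x4, 18432 and 2048 outputs).

```python
def conv_out(size, filter_size, padding=0, stride=1):
    """Output width of a convolution that uses only complete windows."""
    return (size + 2 * padding - filter_size) // stride + 1

def pool_out(size, size_x, stride):
    """Output width of pooling; assumes the windows tile the input exactly."""
    return (size - size_x) // stride + 1

width = 28                                   # MNIST images are 28x28
width = conv_out(width, filter_size=5)       # conv1 -> 24
conv1_units = width * width * 32             # 32 filters
width = pool_out(width, size_x=2, stride=2)  # pool1 -> 12
width = conv_out(width, filter_size=5)       # conv2 -> 8
conv2_units = width * width * 32             # 32 filters
width = pool_out(width, size_x=2, stride=2)  # pool2 -> 4
fc_inputs = width * width * 32               # inputs fed into fcOut
print(conv1_units, conv2_units, fc_inputs)
```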
3. Results
I ran 10 epochs.
The log seems to contain more information than before.
2014年 9月 14日 日曜日 02:21:24 JST
Initialized data layer 'data', producing 784 outputs
Initialized data layer 'labels', producing 3 outputs
Initialized convolutional layer 'conv1' on GPUs 0, producing 24x24 32-channel output
Initialized max-pooling layer 'pool1' on GPUs 0, producing 12x12 32-channel output
Initialized convolutional layer 'conv2' on GPUs 0, producing 8x8 32-channel output
Initialized avg-pooling layer 'pool2' on GPUs 0, producing 4x4 32-channel output
Initialized fully-connected layer 'fcOut' on GPUs 0, producing 3 outputs
Initialized detection cross-entropy cost 'dce' on GPUs 0
Initialized neuron layer 'fcOut_neuron' on GPUs 0, producing 3 outputs
Initialized neuron layer 'conv2_neuron' on GPUs 0, producing 2048 outputs
Initialized neuron layer 'conv1_neuron' on GPUs 0, producing 18432 outputs
Layer conv2_neuron using acts from layer conv2
Layer fcOut_neuron using acts from layer fcOut
Layer conv1_neuron using acts from layer conv1
=========================
Importing cudaconvnet._ConvNet C++ module
Fwd terminal: dce
found bwd terminal conv1[0] in passIdx=0
=========================
Training ConvNet
Add PCA noise to color channels with given scale : 0 [DEFAULT]
Check gradients and quit? : 0 [DEFAULT]
Conserve GPU memory (slower)? : 0 [DEFAULT]
Convert given conv layers to unshared local :
Cropped DP: crop size (0 = don't crop) : 28
Cropped DP: test on multiple patches? : 0 [DEFAULT]
Data batch range: testing : 7-7
Data batch range: training : 1-6
Data path : /home/user/cuda/cuda-convnet2/data/MNIST-tri/
Data provider : DCE
Force save before quitting : 0 [DEFAULT]
GPU override : 0
Layer definition file : /home/user/cuda/cuda-convnet2/config/MNIST/layers-MNIST-tri.cfg
Layer file path prefix : [DEFAULT]
Layer parameter file : /home/user/cuda/cuda-convnet2/config/MNIST/layer-params-MNIST-tri.cfg
Load file : [DEFAULT]
Logreg cost layer name (for --test-out) : [DEFAULT]
Minibatch size : 128 [DEFAULT]
Number of epochs : 10
Output test case predictions to given path : [DEFAULT]
Save file override :
Save path : /home/user/cuda/cuda-convnet2/save/MNIST/
Subtract this scalar from image (-1 = don't) : -1 [DEFAULT]
Test and quit? : 0 [DEFAULT]
Test on one batch at a time? : 1 [DEFAULT]
Testing frequency : 6
Unshare weight matrices in given layers :
Write test data features from given layer : [DEFAULT]
Write test data features to this path (to be used with --write-features): [DEFAULT]
=========================
Running on CUDA device(s) 0
Current time: Sun Sep 14 02:21:26 2014
Saving checkpoints to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
=========================
1.1 (0.00%)... dce: (crossent) 1.199113, (err) 0.183633, (Godd) 0.813920, 0.890335, (Geven) 0.848251, 0.845842, (Gtri) 0.706835, 0.683478 (0.563 sec)
1.2 (1.67%)... dce: (crossent) 0.766782, (err) 0.103000, (Godd) 0.919650, 0.914490, (Geven) 0.908736, 0.915131, (Gtri) 0.832205, 0.823325 (0.253 sec)
1.3 (3.33%)... dce: (crossent) 0.567953, (err) 0.068433, (Godd) 0.950119, 0.945626, (Geven) 0.944995, 0.949025, (Gtri) 0.883268, 0.860500 (0.256 sec)
1.4 (5.00%)... dce: (crossent) 0.443588, (err) 0.052900, (Godd) 0.963820, 0.960780, (Geven) 0.958771, 0.963053, (Gtri) 0.897314, 0.897088 (0.251 sec)
1.5 (6.67%)... dce: (crossent) 0.410039, (err) 0.049000, (Godd) 0.964023, 0.958687, (Geven) 0.958937, 0.964177, (Gtri) 0.914705, 0.908954 (0.256 sec)
1.6 (8.33%)... dce: (crossent) 0.349523, (err) 0.040433, (Godd) 0.967851, 0.963834, (Geven) 0.963680, 0.966802, (Gtri) 0.938540, 0.928083
======================Test output======================
dce: (crossent) 0.317917, (err) 0.036367, (Godd) 0.973128, 0.970635, (Geven) 0.969439, 0.972391, (Gtri) 0.944646, 0.922921
----------------------Averages-------------------------
dce: (crossent) 0.317917, (err) 0.036367, (Godd) 0.973128, 0.970635, (Geven) 0.969439, 0.972391, (Gtri) 0.944646, 0.922921
-------------------------------------------------------
Layer 'conv1' weights[0]: 8.155207e-02 [7.599560e-04] [9.318661e-03]
Layer 'conv1' biases: 2.710742e-03 [2.686247e-05]
Layer 'conv2' weights[0]: 2.011198e-02 [1.423615e-04] [7.078444e-03]
Layer 'conv2' biases: 2.185383e-02 [2.591967e-04]
Layer 'fcOut' weights[0]: 8.837998e-02 [4.182930e-04] [4.732893e-03]
Layer 'fcOut' biases: 1.263399e-01 [7.117551e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.380 sec)
2.1 (10.00%)... dce: (crossent) 0.309526, (err) 0.035000, (Godd) 0.973669, 0.970020, (Geven) 0.969685, 0.973225, (Gtri) 0.941206, 0.938634 (0.262 sec)
2.2 (11.67%)... dce: (crossent) 0.305727, (err) 0.035700, (Godd) 0.973188, 0.969046, (Geven) 0.968035, 0.971641, (Gtri) 0.942188, 0.938213 (0.257 sec)
2.3 (13.33%)... dce: (crossent) 0.301048, (err) 0.033967, (Godd) 0.976615, 0.970843, (Geven) 0.971296, 0.975833, (Gtri) 0.943435, 0.931514 (0.256 sec)
2.4 (15.00%)... dce: (crossent) 0.269323, (err) 0.030567, (Godd) 0.978870, 0.976941, (Geven) 0.976287, 0.977873, (Gtri) 0.942836, 0.939759 (0.255 sec)
2.5 (16.67%)... dce: (crossent) 0.262750, (err) 0.030367, (Godd) 0.980865, 0.972722, (Geven) 0.972105, 0.980368, (Gtri) 0.945326, 0.943662 (0.258 sec)
2.6 (18.33%)... dce: (crossent) 0.232449, (err) 0.027100, (Godd) 0.979010, 0.977075, (Geven) 0.975994, 0.979352, (Gtri) 0.957322, 0.948595
======================Test output======================
dce: (crossent) 0.245786, (err) 0.028500, (Godd) 0.987382, 0.971620, (Geven) 0.971611, 0.986602, (Gtri) 0.976171, 0.911044
----------------------Averages-------------------------
dce: (crossent) 0.245786, (err) 0.028500, (Godd) 0.987382, 0.971620, (Geven) 0.971611, 0.986602, (Gtri) 0.976171, 0.911044
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.103154e-01 [6.913404e-04] [6.266943e-03]
Layer 'conv1' biases: 3.618342e-03 [1.293126e-05]
Layer 'conv2' weights[0]: 2.478795e-02 [1.239718e-04] [5.001291e-03]
Layer 'conv2' biases: 2.868914e-02 [1.545805e-04]
Layer 'fcOut' weights[0]: 1.136545e-01 [2.959493e-04] [2.603940e-03]
Layer 'fcOut' biases: 1.387306e-01 [3.311247e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.391 sec)
3.1 (20.00%)... dce: (crossent) 0.224097, (err) 0.025033, (Godd) 0.981969, 0.977515, (Geven) 0.976974, 0.981136, (Gtri) 0.958230, 0.957516 (0.274 sec)
3.2 (21.67%)... dce: (crossent) 0.230526, (err) 0.026400, (Godd) 0.979814, 0.976591, (Geven) 0.974845, 0.978679, (Gtri) 0.957795, 0.957320 (0.274 sec)
3.3 (23.33%)... dce: (crossent) 0.228185, (err) 0.025833, (Godd) 0.981050, 0.979117, (Geven) 0.977935, 0.981113, (Gtri) 0.956699, 0.949204 (0.268 sec)
3.4 (25.00%)... dce: (crossent) 0.213223, (err) 0.023333, (Godd) 0.984202, 0.982263, (Geven) 0.981969, 0.983963, (Gtri) 0.954568, 0.954568 (0.266 sec)
3.5 (26.67%)... dce: (crossent) 0.208459, (err) 0.023700, (Godd) 0.983148, 0.980233, (Geven) 0.979625, 0.982797, (Gtri) 0.960953, 0.953219 (0.266 sec)
3.6 (28.33%)... dce: (crossent) 0.185518, (err) 0.021233, (Godd) 0.983759, 0.981621, (Geven) 0.981018, 0.983401, (Gtri) 0.965438, 0.962016
======================Test output======================
dce: (crossent) 0.204320, (err) 0.023067, (Godd) 0.990608, 0.976941, (Geven) 0.976972, 0.990459, (Gtri) 0.983051, 0.923427
----------------------Averages-------------------------
dce: (crossent) 0.204320, (err) 0.023067, (Godd) 0.990608, 0.976941, (Geven) 0.976972, 0.990459, (Gtri) 0.983051, 0.923427
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.302679e-01 [6.364945e-04] [4.886041e-03]
Layer 'conv1' biases: 3.856979e-03 [8.572064e-06]
Layer 'conv2' weights[0]: 2.778544e-02 [1.004645e-04] [3.615725e-03]
Layer 'conv2' biases: 3.361420e-02 [9.831827e-05]
Layer 'fcOut' weights[0]: 1.289217e-01 [2.037354e-04] [1.580304e-03]
Layer 'fcOut' biases: 1.446893e-01 [1.287783e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.397 sec)
4.1 (30.00%)... dce: (crossent) 0.188140, (err) 0.021433, (Godd) 0.983564, 0.979684, (Geven) 0.978998, 0.983367, (Gtri) 0.967380, 0.965217 (0.270 sec)
4.2 (31.67%)... dce: (crossent) 0.199008, (err) 0.022367, (Godd) 0.981395, 0.979687, (Geven) 0.978917, 0.980335, (Gtri) 0.968337, 0.963772 (0.274 sec)
4.3 (33.33%)... dce: (crossent) 0.194720, (err) 0.021700, (Godd) 0.984011, 0.982072, (Geven) 0.981557, 0.983550, (Gtri) 0.962494, 0.959818 (0.273 sec)
4.4 (35.00%)... dce: (crossent) 0.178280, (err) 0.019867, (Godd) 0.987352, 0.984628, (Geven) 0.984015, 0.987211, (Gtri) 0.962003, 0.959588 (0.274 sec)
4.5 (36.67%)... dce: (crossent) 0.181196, (err) 0.020500, (Godd) 0.986097, 0.981419, (Geven) 0.981273, 0.986238, (Gtri) 0.967529, 0.959256 (0.266 sec)
4.6 (38.33%)... dce: (crossent) 0.163559, (err) 0.018567, (Godd) 0.985550, 0.983992, (Geven) 0.983633, 0.985425, (Gtri) 0.967636, 0.969106
======================Test output======================
dce: (crossent) 0.190747, (err) 0.021300, (Godd) 0.989081, 0.981868, (Geven) 0.981452, 0.988226, (Gtri) 0.986279, 0.926459
----------------------Averages-------------------------
dce: (crossent) 0.190747, (err) 0.021300, (Godd) 0.989081, 0.981868, (Geven) 0.981452, 0.988226, (Gtri) 0.986279, 0.926459
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.452356e-01 [5.611114e-04] [3.863456e-03]
Layer 'conv1' biases: 4.199057e-03 [9.355912e-06]
Layer 'conv2' weights[0]: 3.022003e-02 [9.920518e-05] [3.282762e-03]
Layer 'conv2' biases: 3.737093e-02 [1.362906e-04]
Layer 'fcOut' weights[0]: 1.403167e-01 [1.990263e-04] [1.418408e-03]
Layer 'fcOut' biases: 1.483610e-01 [2.851311e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.400 sec)
5.1 (40.00%)... dce: (crossent) 0.172653, (err) 0.019367, (Godd) 0.984570, 0.981657, (Geven) 0.981201, 0.984584, (Gtri) 0.974661, 0.965217 (0.269 sec)
5.2 (41.67%)... dce: (crossent) 0.177714, (err) 0.019533, (Godd) 0.986410, 0.982975, (Geven) 0.982663, 0.985510, (Gtri) 0.967629, 0.964268 (0.269 sec)
5.3 (43.33%)... dce: (crossent) 0.170255, (err) 0.018900, (Godd) 0.986751, 0.983058, (Geven) 0.982598, 0.986190, (Gtri) 0.968568, 0.965631 (0.277 sec)
5.4 (45.00%)... dce: (crossent) 0.163894, (err) 0.018067, (Godd) 0.986985, 0.986401, (Geven) 0.985804, 0.986805, (Gtri) 0.966801, 0.964859 (0.286 sec)
5.5 (46.67%)... dce: (crossent) 0.162201, (err) 0.018133, (Godd) 0.987108, 0.983791, (Geven) 0.983458, 0.986642, (Gtri) 0.971176, 0.966046 (0.271 sec)
5.6 (48.33%)... dce: (crossent) 0.148354, (err) 0.016267, (Godd) 0.987532, 0.986166, (Geven) 0.985651, 0.987247, (Gtri) 0.971660, 0.972398
======================Test output======================
dce: (crossent) 0.172621, (err) 0.019200, (Godd) 0.989861, 0.981277, (Geven) 0.981672, 0.989444, (Gtri) 0.985964, 0.940864
----------------------Averages-------------------------
dce: (crossent) 0.172621, (err) 0.019200, (Godd) 0.989861, 0.981277, (Geven) 0.981672, 0.989444, (Gtri) 0.985964, 0.940864
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.587927e-01 [5.413409e-04] [3.409105e-03]
Layer 'conv1' biases: 4.576801e-03 [1.049395e-05]
Layer 'conv2' weights[0]: 3.219037e-02 [9.470111e-05] [2.941908e-03]
Layer 'conv2' biases: 4.058194e-02 [1.346725e-04]
Layer 'fcOut' weights[0]: 1.493520e-01 [1.707393e-04] [1.143201e-03]
Layer 'fcOut' biases: 1.511034e-01 [2.599937e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.400 sec)
6.1 (50.00%)... dce: (crossent) 0.157575, (err) 0.017633, (Godd) 0.985004, 0.984615, (Geven) 0.983783, 0.984381, (Gtri) 0.974807, 0.970932 (0.268 sec)
6.2 (51.67%)... dce: (crossent) 0.163027, (err) 0.017433, (Godd) 0.988528, 0.983556, (Geven) 0.982485, 0.986959, (Gtri) 0.972865, 0.969727 (0.268 sec)
6.3 (53.33%)... dce: (crossent) 0.151389, (err) 0.015667, (Godd) 0.988946, 0.986998, (Geven) 0.986426, 0.988830, (Gtri) 0.973838, 0.968916 (0.269 sec)
6.4 (55.00%)... dce: (crossent) 0.150452, (err) 0.015900, (Godd) 0.988952, 0.987978, (Geven) 0.987830, 0.988632, (Gtri) 0.971270, 0.967369 (0.277 sec)
6.5 (56.67%)... dce: (crossent) 0.149220, (err) 0.017100, (Godd) 0.986931, 0.985175, (Geven) 0.984848, 0.986642, (Gtri) 0.974411, 0.967304 (0.268 sec)
6.6 (58.33%)... dce: (crossent) 0.141361, (err) 0.016100, (Godd) 0.986169, 0.986364, (Geven) 0.986829, 0.985830, (Gtri) 0.973178, 0.973917
======================Test output======================
dce: (crossent) 0.166090, (err) 0.019333, (Godd) 0.984634, 0.985022, (Geven) 0.984169, 0.984369, (Gtri) 0.985000, 0.945919
----------------------Averages-------------------------
dce: (crossent) 0.166090, (err) 0.019333, (Godd) 0.984634, 0.985022, (Geven) 0.984169, 0.984369, (Gtri) 0.985000, 0.945919
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.691544e-01 [5.411348e-04] [3.199059e-03]
Layer 'conv1' biases: 4.764370e-03 [1.533124e-05]
Layer 'conv2' weights[0]: 3.393728e-02 [1.179916e-04] [3.476754e-03]
Layer 'conv2' biases: 4.355372e-02 [1.882927e-04]
Layer 'fcOut' weights[0]: 1.571101e-01 [2.250832e-04] [1.432646e-03]
Layer 'fcOut' biases: 1.538871e-01 [4.014182e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.395 sec)
7.1 (60.00%)... dce: (crossent) 0.146865, (err) 0.016633, (Godd) 0.986553, 0.984024, (Geven) 0.983607, 0.985801, (Gtri) 0.977295, 0.973168 (0.270 sec)
7.2 (61.67%)... dce: (crossent) 0.148073, (err) 0.015967, (Godd) 0.988176, 0.986264, (Geven) 0.985133, 0.987580, (Gtri) 0.975330, 0.971216 (0.269 sec)
7.3 (63.33%)... dce: (crossent) 0.139719, (err) 0.014200, (Godd) 0.989154, 0.988180, (Geven) 0.987827, 0.988830, (Gtri) 0.976679, 0.973717 (0.275 sec)
7.4 (65.00%)... dce: (crossent) 0.137491, (err) 0.014633, (Godd) 0.989927, 0.987781, (Geven) 0.987047, 0.990053, (Gtri) 0.974817, 0.971637 (0.280 sec)
7.5 (66.67%)... dce: (crossent) 0.138026, (err) 0.015767, (Godd) 0.989094, 0.985966, (Geven) 0.985870, 0.988464, (Gtri) 0.974482, 0.970070 (0.283 sec)
7.6 (68.33%)... dce: (crossent) 0.131545, (err) 0.014533, (Godd) 0.988142, 0.988142, (Geven) 0.987647, 0.987247, (Gtri) 0.974249, 0.977209
======================Test output======================
dce: (crossent) 0.147773, (err) 0.017033, (Godd) 0.988528, 0.985022, (Geven) 0.984827, 0.988226, (Gtri) 0.981832, 0.956027
----------------------Averages-------------------------
dce: (crossent) 0.147773, (err) 0.017033, (Godd) 0.988528, 0.985022, (Geven) 0.984827, 0.988226, (Gtri) 0.981832, 0.956027
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.788537e-01 [6.211860e-04] [3.473151e-03]
Layer 'conv1' biases: 5.401216e-03 [1.268923e-05]
Layer 'conv2' weights[0]: 3.544452e-02 [1.119315e-04] [3.157935e-03]
Layer 'conv2' biases: 4.565623e-02 [1.653222e-04]
Layer 'fcOut' weights[0]: 1.637556e-01 [2.176586e-04] [1.329167e-03]
Layer 'fcOut' biases: 1.556429e-01 [3.749391e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.415 sec)
8.1 (70.00%)... dce: (crossent) 0.135743, (err) 0.015233, (Godd) 0.987367, 0.986588, (Geven) 0.986226, 0.987627, (Gtri) 0.978505, 0.972671 (0.272 sec)
8.2 (71.67%)... dce: (crossent) 0.142047, (err) 0.015733, (Godd) 0.988936, 0.985684, (Geven) 0.985145, 0.988408, (Gtri) 0.975579, 0.971464 (0.279 sec)
8.3 (73.33%)... dce: (crossent) 0.132628, (err) 0.015767, (Godd) 0.988544, 0.986013, (Geven) 0.985616, 0.988018, (Gtri) 0.974398, 0.971443 (0.282 sec)
8.4 (75.00%)... dce: (crossent) 0.134742, (err) 0.014500, (Godd) 0.990713, 0.988175, (Geven) 0.987654, 0.990662, (Gtri) 0.975246, 0.969127 (0.273 sec)
8.5 (76.67%)... dce: (crossent) 0.128446, (err) 0.014467, (Godd) 0.990871, 0.986954, (Geven) 0.987094, 0.990690, (Gtri) 0.974773, 0.971831 (0.280 sec)
8.6 (78.33%)... dce: (crossent) 0.122649, (err) 0.014300, (Godd) 0.988338, 0.988142, (Geven) 0.987664, 0.988664, (Gtri) 0.976408, 0.974677
======================Test output======================
dce: (crossent) 0.151681, (err) 0.018233, (Godd) 0.984077, 0.986598, (Geven) 0.986156, 0.983354, (Gtri) 0.979323, 0.957544
----------------------Averages-------------------------
dce: (crossent) 0.151681, (err) 0.018233, (Godd) 0.984077, 0.986598, (Geven) 0.986156, 0.983354, (Gtri) 0.979323, 0.957544
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.873097e-01 [5.005563e-04] [2.672345e-03]
Layer 'conv1' biases: 5.523826e-03 [1.317529e-05]
Layer 'conv2' weights[0]: 3.681655e-02 [9.820487e-05] [2.667411e-03]
Layer 'conv2' biases: 4.807946e-02 [1.357576e-04]
Layer 'fcOut' weights[0]: 1.698663e-01 [1.870220e-04] [1.100995e-03]
Layer 'fcOut' biases: 1.567201e-01 [2.885009e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.418 sec)
9.1 (80.00%)... dce: (crossent) 0.129455, (err) 0.014500, (Godd) 0.988145, 0.986391, (Geven) 0.986232, 0.988032, (Gtri) 0.978120, 0.977391 (0.269 sec)
9.2 (81.67%)... dce: (crossent) 0.138982, (err) 0.015867, (Godd) 0.989122, 0.985104, (Geven) 0.984130, 0.988408, (Gtri) 0.975124, 0.972705 (0.278 sec)
9.3 (83.33%)... dce: (crossent) 0.130423, (err) 0.014233, (Godd) 0.988760, 0.987786, (Geven) 0.987419, 0.988221, (Gtri) 0.978664, 0.973717 (0.281 sec)
9.4 (85.00%)... dce: (crossent) 0.125232, (err) 0.013933, (Godd) 0.990521, 0.988569, (Geven) 0.988450, 0.990256, (Gtri) 0.973427, 0.974649 (0.278 sec)
9.5 (86.67%)... dce: (crossent) 0.126634, (err) 0.015000, (Godd) 0.989501, 0.987349, (Geven) 0.987278, 0.989476, (Gtri) 0.975696, 0.969316 (0.269 sec)
9.6 (88.33%)... dce: (crossent) 0.119767, (err) 0.013300, (Godd) 0.990493, 0.988340, (Geven) 0.988083, 0.990283, (Gtri) 0.977665, 0.975437
======================Test output======================
dce: (crossent) 0.143420, (err) 0.015733, (Godd) 0.987016, 0.988766, (Geven) 0.988611, 0.986805, (Gtri) 0.982656, 0.959313
----------------------Averages-------------------------
dce: (crossent) 0.143420, (err) 0.015733, (Godd) 0.987016, 0.988766, (Geven) 0.988611, 0.986805, (Gtri) 0.982656, 0.959313
-------------------------------------------------------
Layer 'conv1' weights[0]: 1.955977e-01 [5.039437e-04] [2.576430e-03]
Layer 'conv1' biases: 5.744199e-03 [1.012264e-05]
Layer 'conv2' weights[0]: 3.810624e-02 [1.096704e-04] [2.878017e-03]
Layer 'conv2' biases: 5.125387e-02 [1.558181e-04]
Layer 'fcOut' weights[0]: 1.754434e-01 [2.125424e-04] [1.211459e-03]
Layer 'fcOut' biases: 1.569183e-01 [3.300385e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.410 sec)
10.1 (90.00%)... dce: (crossent) 0.124347, (err) 0.014267, (Godd) 0.987379, 0.987574, (Geven) 0.987024, 0.987424, (Gtri) 0.979811, 0.976646 (0.293 sec)
10.2 (91.67%)... dce: (crossent) 0.127718, (err) 0.014533, (Godd) 0.989333, 0.986845, (Geven) 0.985558, 0.988822, (Gtri) 0.978808, 0.974194 (0.268 sec)
10.3 (93.33%)... dce: (crossent) 0.121530, (err) 0.012833, (Godd) 0.991308, 0.988574, (Geven) 0.988252, 0.990861, (Gtri) 0.979436, 0.974981 (0.269 sec)
10.4 (95.00%)... dce: (crossent) 0.120518, (err) 0.012667, (Godd) 0.992289, 0.989160, (Geven) 0.988871, 0.992083, (Gtri) 0.977341, 0.974398 (0.270 sec)
10.5 (96.67%)... dce: (crossent) 0.121691, (err) 0.014867, (Godd) 0.988538, 0.988733, (Geven) 0.988266, 0.988666, (Gtri) 0.973784, 0.971579 (0.263 sec)
10.6 (98.33%)... dce: (crossent) 0.113142, (err) 0.013233, (Godd) 0.989715, 0.988933, (Geven) 0.988673, 0.989474, (Gtri) 0.977446, 0.976703
======================Test output======================
dce: (crossent) 0.136332, (err) 0.014800, (Godd) 0.985885, 0.991131, (Geven) 0.990814, 0.985384, (Gtri) 0.979770, 0.966894
----------------------Averages-------------------------
dce: (crossent) 0.136332, (err) 0.014800, (Godd) 0.985885, 0.991131, (Geven) 0.990814, 0.985384, (Gtri) 0.979770, 0.966894
-------------------------------------------------------
Layer 'conv1' weights[0]: 2.045221e-01 [5.905093e-04] [2.887264e-03]
Layer 'conv1' biases: 6.253900e-03 [9.817431e-06]
Layer 'conv2' weights[0]: 3.927734e-02 [9.667401e-05] [2.461318e-03]
Layer 'conv2' biases: 5.389512e-02 [1.231133e-04]
Layer 'fcOut' weights[0]: 1.806059e-01 [1.755370e-04] [9.719342e-04]
Layer 'fcOut' biases: 1.589827e-01 [2.841916e-04]
-------------------------------------------------------
Saved checkpoint to /home/user/cuda/cuda-convnet2/save/MNIST/ConvNet__2014-09-14_02.21.24
======================================================= (0.395 sec)
11.1 (100.00%)... dce: (crossent) 0.118477, (err) 0.013900, (Godd) 0.988551, 0.987771, (Geven) 0.987231, 0.988032, (Gtri) 0.980289, 0.976149 (0.290 sec)
2014年 9月 14日 日曜日 02:21:45 JST
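The Test output shows the following items.
(crossent) 0.136332 | Value of the loss function |
(err) 0.014800 | Error rate over all test data: number of wrong outputs / (number of test images × number of output units) |
(Godd) 0.985885, 0.991131 | Positive accuracy of output unit No.1. Left value is precision: TruePositive / DeclaredPositive. Right value is recall: TruePositive / Positive. Named Godd in batches.meta |
(Geven) 0.990814, 0.985384 | Accuracy of output unit No.2, in the same precision, recall layout. Named Geven in batches.meta |
(Gtri) 0.979770, 0.966894 | Accuracy of output unit No.3, in the same precision, recall layout. Named Gtri in batches.meta |
TruePositive | (true value, output value) = (1, 1) |
DeclaredPositive | output value = 1 |
Positive | true value = 1 |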
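The definitions in the table translate directly into code. The following is a small sketch written from those definitions, not the cuda-convnet2 implementation; the function names and the toy data are made up for illustration.

```python
def unit_metrics(targets, outputs, unit):
    """Precision and recall of one output unit over all images."""
    pairs = [(t[unit], o[unit]) for t, o in zip(targets, outputs)]
    true_pos = sum(1 for t, o in pairs if t == 1 and o == 1)
    declared_pos = sum(1 for _, o in pairs if o == 1)
    positive = sum(1 for t, _ in pairs if t == 1)
    precision = true_pos / declared_pos if declared_pos else 0.0
    recall = true_pos / positive if positive else 0.0
    return precision, recall

def error_rate(targets, outputs):
    """Wrong outputs / (number of images x number of output units)."""
    wrong = sum(t_k != o_k
                for t, o in zip(targets, outputs)
                for t_k, o_k in zip(t, o))
    return wrong / (len(targets) * len(targets[0]))

# Hypothetical toy data: 4 images, 3 units (odd, even, multiple of 3).
targets = [[1, 0, 0], [0, 1, 1], [1, 0, 1], [0, 1, 0]]
outputs = [[1, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 0]]
print(error_rate(targets, outputs))
print(unit_metrics(targets, outputs, unit=0))
```

Note that (err) counts every output unit separately, so one image with a single wrong flag contributes 1 error out of 3 outputs rather than a whole misclassified image.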
======================Test output======================
dce: (crossent) 0.136332, (err) 0.014800, (Godd) 0.985885, 0.991131, (Geven) 0.990814, 0.985384, (Gtri) 0.979770, 0.966894
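4. Just to be sure...
I wanted to confirm that learning was really progressing, so while the program above was running
I monitored the output-layer values in BinomialCrossEntropyCostLayer::fpropActs() (only the first three samples of each batch).
true label | teacher data |
prob. | raw output value of the output layer |
prob.th. | output value binarized with a threshold of 0.5 |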
1) epoch#1-batch#1
Unsurprisingly, the outputs are still nowhere near correct: every value sits around 0.52.
[src/layer.cu][fpropActs][2117] true label(:,0) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2118] true label(:,1) [0.000000][1.000000][1.000000]
[src/layer.cu][fpropActs][2119] true label(:,2) [0.000000][1.000000][0.000000]
[src/layer.cu][fpropActs][2123] prob. (:,0) [0.524935][0.525479][0.525543]
[src/layer.cu][fpropActs][2124] prob. (:,1) [0.525781][0.525362][0.524247]
[src/layer.cu][fpropActs][2125] prob. (:,2) [0.525054][0.525753][0.524198]
↓ threshold = 0.5
[src/layer.cu][fpropActs][2141] prob.th. (:,0) [1.000000][1.000000][1.000000]
[src/layer.cu][fpropActs][2142] prob.th. (:,1) [1.000000][1.000000][1.000000]
[src/layer.cu][fpropActs][2143] prob.th. (:,2) [1.000000][1.000000][1.000000]
2) epoch#1-batch#2
This is only the second batch (10,000 samples trained so far), yet all 3 of 3 are correct here!
[src/layer.cu][fpropActs][2117] true label(:,0) [1.000000][0.000000][1.000000]
[src/layer.cu][fpropActs][2118] true label(:,1) [0.000000][1.000000][0.000000]
[src/layer.cu][fpropActs][2119] true label(:,2) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2123] prob. (:,0) [0.995084][0.005174][0.962656]
[src/layer.cu][fpropActs][2124] prob. (:,1) [0.232831][0.785004][0.025421]
[src/layer.cu][fpropActs][2125] prob. (:,2) [0.992378][0.008622][0.050015]
↓ threshold = 0.5
[src/layer.cu][fpropActs][2141] prob.th. (:,0) [1.000000][0.000000][1.000000]
[src/layer.cu][fpropActs][2142] prob.th. (:,1) [0.000000][1.000000][0.000000]
[src/layer.cu][fpropActs][2143] prob.th. (:,2) [1.000000][0.000000][0.000000]
3) epoch#1-batch#3
This is only the third batch (20,000 samples trained so far). Close, but slightly wrong: one output whose true label is 1 comes out at 0.465876, just below the threshold.
[src/layer.cu][fpropActs][2117] true label(:,0) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2118] true label(:,1) [1.000000][0.000000][1.000000]
[src/layer.cu][fpropActs][2119] true label(:,2) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2123] prob. (:,0) [0.970718][0.028501][0.343004]
[src/layer.cu][fpropActs][2124] prob. (:,1) [0.854348][0.155817][0.465876]
[src/layer.cu][fpropActs][2125] prob. (:,2) [0.881519][0.121382][0.047769]
↓ threshold = 0.5
[src/layer.cu][fpropActs][2141] prob.th. (:,0) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2142] prob.th. (:,1) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2143] prob.th. (:,2) [1.000000][0.000000][0.000000]
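The thresholding step and the comparison with the teacher data can be reproduced by hand. The sketch below uses the epoch#1-batch#3 values; reading each (:,k) line as one sample with three unit outputs is my interpretation of the dump, and the helper names are made up.

```python
def binarize(probs, threshold=0.5):
    """Turn raw logistic outputs into 0/1 flags."""
    return [1 if p >= threshold else 0 for p in probs]

# Values copied from the epoch#1-batch#3 dump; each row is read as one sample
# (assumption) with the outputs of the odd, even and multiple-of-3 units.
true_labels = [[1, 0, 0], [1, 0, 1], [1, 0, 0]]
probs = [
    [0.970718, 0.028501, 0.343004],
    [0.854348, 0.155817, 0.465876],
    [0.881519, 0.121382, 0.047769],
]

predicted = [binarize(row) for row in probs]
# Indices of the samples where at least one flag differs from the teacher data.
wrong_samples = [i for i, (t, p) in enumerate(zip(true_labels, predicted)) if t != p]
print(predicted)
print(wrong_samples)
```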
4) epoch#9-batch#2
Skipping ahead to the second batch of epoch 9 (490,000 samples trained so far): the outputs are almost identical to the teacher data!
[src/layer.cu][fpropActs][2117] true label(:,0) [1.000000][0.000000][1.000000]
[src/layer.cu][fpropActs][2118] true label(:,1) [0.000000][1.000000][0.000000]
[src/layer.cu][fpropActs][2119] true label(:,2) [1.000000][0.000000][0.000000]
[src/layer.cu][fpropActs][2123] prob. (:,0) [0.999963][0.000038][0.999809]
[src/layer.cu][fpropActs][2124] prob. (:,1) [0.000016][0.999983][0.000180]
[src/layer.cu][fpropActs][2125] prob. (:,2) [0.999750][0.000245][0.001855]
↓ threshold = 0.5
[src/layer.cu][fpropActs][2141] prob.th. (:,0) [1.000000][0.000000][1.000000]
[src/layer.cu][fpropActs][2142] prob.th. (:,1) [0.000000][1.000000][0.000000]
[src/layer.cu][fpropActs][2143] prob.th. (:,2) [1.000000][0.000000][0.000000]
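To put a number on "almost identical", one can evaluate the textbook binary cross-entropy on the dumped values from epoch#1-batch#1 and epoch#9-batch#2. This is only a sketch: I have not checked how cuda-convnet2 sums or normalizes its (crossent) figure, so the results below are not expected to match the log, and the row-per-sample reading of the dump is again an assumption.

```python
import math

def binary_cross_entropy(target, prob):
    """Textbook loss of one unit: -[t*log(p) + (1-t)*log(1-p)]."""
    return -(target * math.log(prob) + (1 - target) * math.log(1 - prob))

def mean_sample_loss(targets, probs):
    """Sum the unit losses within each sample, then average over samples."""
    per_sample = [
        sum(binary_cross_entropy(t, p) for t, p in zip(t_row, p_row))
        for t_row, p_row in zip(targets, probs)
    ]
    return sum(per_sample) / len(per_sample)

# Values copied from the dumps; each row is read as one sample (assumption).
early_targets = [[1, 0, 0], [0, 1, 1], [0, 1, 0]]
early_probs = [[0.524935, 0.525479, 0.525543],
               [0.525781, 0.525362, 0.524247],
               [0.525054, 0.525753, 0.524198]]
late_targets = [[1, 0, 1], [0, 1, 0], [1, 0, 0]]
late_probs = [[0.999963, 0.000038, 0.999809],
              [0.000016, 0.999983, 0.000180],
              [0.999750, 0.000245, 0.001855]]

early_loss = mean_sample_loss(early_targets, early_probs)
late_loss = mean_sample_loss(late_targets, late_probs)
print(early_loss, late_loss)
```

The loss on these three samples drops by several orders of magnitude between the first batch and epoch 9, which is consistent with the thresholded outputs matching the teacher data.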
cuda-convnet2 has widened the range of things to play with (^o^)/