1. Program Specifications
I implemented this using the plain, simple network architecture found in introductory books on neural networks.
The program specifications are as follows.
(1) Train on the 60,000 MNIST training images
(2) Test on the 10,000 MNIST test images
(3) The input layer has 28×28 = 784 units
(4) The output layer has 10 units, corresponding to the digits 0-9
(5) The number of hidden layers and the number of units in each layer can be specified freely via an argument
(6) Display the accuracy for each digit separately
(7) The activation function of every layer is the sigmoid
(8) The error function is the sum-of-squares error (both are spelled out after this list)
(9) The number of epochs is 1
(10) Parameters are updated after each individual training sample
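For reference, specs (7) and (8) correspond to the following standard formulas (my restatement; the notation is not from the original post), where $y$ is the output-layer output and $t$ the one-hot label:

$$\sigma(a) = \frac{1}{1 + e^{-a}}, \qquad E = \frac{1}{2}\sum_k \left( y_k - t_k \right)^2$$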
2. Program Execution Results
I tried all nine layer configurations, chosen somewhat arbitrarily, with the results below.
Among these, the 4-layer configuration [784]-[256]-[64]-[10] achieved the highest accuracy at 91.9%.
(1) 3 layers [784]-[16]-[10]: 89.0%
octave:10> tic;NNET_control([784 16 10]);toc;
[ 0]  938 /  980 ( 95.7%)
[ 1] 1109 / 1135 ( 97.7%)
[ 2]  888 / 1032 ( 86.0%)
[ 3]  892 / 1010 ( 88.3%)
[ 4]  878 /  982 ( 89.4%)
[ 5]  719 /  892 ( 80.6%)
[ 6]  889 /  958 ( 92.8%)
[ 7]  896 / 1028 ( 87.2%)
[ 8]  803 /  974 ( 82.4%)
[ 9]  890 / 1009 ( 88.2%)
Total  8902 / 10000 ( 89.0%)
Elapsed time is 63.3441 seconds.
(2) 3 layers [784]-[24]-[10]: 89.9%
octave:7> tic;NNET_control([784 24 10]);toc;
[ 0]  949 /  980 ( 96.8%)
[ 1] 1109 / 1135 ( 97.7%)
[ 2]  907 / 1032 ( 87.9%)
[ 3]  907 / 1010 ( 89.8%)
[ 4]  887 /  982 ( 90.3%)
[ 5]  708 /  892 ( 79.4%)
[ 6]  882 /  958 ( 92.1%)
[ 7]  925 / 1028 ( 90.0%)
[ 8]  858 /  974 ( 88.1%)
[ 9]  856 / 1009 ( 84.8%)
Total  8988 / 10000 ( 89.9%)
Elapsed time is 65.8959 seconds.
(3) 3 layers [784]-[32]-[10]: 90.0%
octave:6> tic;NNET_control([784 32 10]);toc;
[ 0]  950 /  980 ( 96.9%)
[ 1] 1108 / 1135 ( 97.6%)
[ 2]  885 / 1032 ( 85.8%)
[ 3]  905 / 1010 ( 89.6%)
[ 4]  894 /  982 ( 91.0%)
[ 5]  718 /  892 ( 80.5%)
[ 6]  895 /  958 ( 93.4%)
[ 7]  945 / 1028 ( 91.9%)
[ 8]  828 /  974 ( 85.0%)
[ 9]  869 / 1009 ( 86.1%)
Total  8997 / 10000 ( 90.0%)
Elapsed time is 71.066 seconds.
(4) 3 layers [784]-[48]-[10]: 90.6%
octave:9> tic;NNET_control([784 48 10]);toc;
[ 0]  955 /  980 ( 97.4%)
[ 1] 1117 / 1135 ( 98.4%)
[ 2]  909 / 1032 ( 88.1%)
[ 3]  931 / 1010 ( 92.2%)
[ 4]  920 /  982 ( 93.7%)
[ 5]  746 /  892 ( 83.6%)
[ 6]  892 /  958 ( 93.1%)
[ 7]  917 / 1028 ( 89.2%)
[ 8]  824 /  974 ( 84.6%)
[ 9]  847 / 1009 ( 83.9%)
Total  9058 / 10000 ( 90.6%)
Elapsed time is 74.398 seconds.
(5) 3 layers [784]-[64]-[10]: 90.8%
octave:1> tic;NNET_control([784 64 10]);toc;
[ 0]  949 /  980 ( 96.8%)
[ 1] 1109 / 1135 ( 97.7%)
[ 2]  907 / 1032 ( 87.9%)
[ 3]  901 / 1010 ( 89.2%)
[ 4]  898 /  982 ( 91.4%)
[ 5]  745 /  892 ( 83.5%)
[ 6]  898 /  958 ( 93.7%)
[ 7]  929 / 1028 ( 90.4%)
[ 8]  859 /  974 ( 88.2%)
[ 9]  888 / 1009 ( 88.0%)
Total  9083 / 10000 ( 90.8%)
Elapsed time is 81.1507 seconds.
(6) 3 layers [784]-[96]-[10]: 90.4%
octave:4> tic;NNET_control([784 96 10]);toc;
[ 0]  953 /  980 ( 97.2%)
[ 1] 1114 / 1135 ( 98.1%)
[ 2]  910 / 1032 ( 88.2%)
[ 3]  915 / 1010 ( 90.6%)
[ 4]  851 /  982 ( 86.7%)
[ 5]  720 /  892 ( 80.7%)
[ 6]  888 /  958 ( 92.7%)
[ 7]  937 / 1028 ( 91.1%)
[ 8]  865 /  974 ( 88.8%)
[ 9]  888 / 1009 ( 88.0%)
Total  9041 / 10000 ( 90.4%)
Elapsed time is 112.633 seconds.
(7) 3 layers [784]-[128]-[10]: 91.5%
octave:5> tic;NNET_control([784 128 10]);toc;
[ 0]  951 /  980 ( 97.0%)
[ 1] 1112 / 1135 ( 98.0%)
[ 2]  931 / 1032 ( 90.2%)
[ 3]  897 / 1010 ( 88.8%)
[ 4]  879 /  982 ( 89.5%)
[ 5]  784 /  892 ( 87.9%)
[ 6]  892 /  958 ( 93.1%)
[ 7]  927 / 1028 ( 90.2%)
[ 8]  868 /  974 ( 89.1%)
[ 9]  904 / 1009 ( 89.6%)
Total  9145 / 10000 ( 91.5%)
Elapsed time is 152.437 seconds.
(8) 4 layers [784]-[256]-[32]-[10]: 91.5%
octave:12> tic;NNET_control([784 256 32 10]);toc;
[ 0]  957 /  980 ( 97.7%)
[ 1] 1098 / 1135 ( 96.7%)
[ 2]  929 / 1032 ( 90.0%)
[ 3]  901 / 1010 ( 89.2%)
[ 4]  876 /  982 ( 89.2%)
[ 5]  781 /  892 ( 87.6%)
[ 6]  899 /  958 ( 93.8%)
[ 7]  942 / 1028 ( 91.6%)
[ 8]  862 /  974 ( 88.5%)
[ 9]  902 / 1009 ( 89.4%)
Total  9147 / 10000 ( 91.5%)
Elapsed time is 301.459 seconds.
(9) 4 layers [784]-[256]-[64]-[10]: 91.9%
octave:11> tic;NNET_control([784 256 64 10]);toc;
[ 0]  960 /  980 ( 98.0%)
[ 1] 1109 / 1135 ( 97.7%)
[ 2]  904 / 1032 ( 87.6%)
[ 3]  882 / 1010 ( 87.3%)
[ 4]  909 /  982 ( 92.6%)
[ 5]  786 /  892 ( 88.1%)
[ 6]  910 /  958 ( 95.0%)
[ 7]  936 / 1028 ( 91.1%)
[ 8]  874 /  974 ( 89.7%)
[ 9]  921 / 1009 ( 91.3%)
Total  9191 / 10000 ( 91.9%)
Elapsed time is 309.785 seconds.
3. Program Source Code
(1) NNET_control.m
function NNET_control( num_unit_of_each_layer )
  % Load training images/labels and test images/labels from files
  [train_img, train_lbl] = load_MNIST( 'train-images-idx3-ubyte', 'train-labels-idx1-ubyte' );
  [test_img,  test_lbl ] = load_MNIST( 't10k-images-idx3-ubyte',  't10k-labels-idx1-ubyte' );
  % Normalize each image's pixel values to the range 0.0-1.0
  train_img = train_img / 255;
  test_img  = test_img  / 255;
  % Create the neural network with the specified number of layers and unit counts
  nn = NNET_setup( num_unit_of_each_layer );
  % Run training
  nn = NNET_learn( nn, train_img, train_lbl );
  % Run testing
  result = NNET_test( nn, test_img, test_lbl );
  % Display the test results
  for i=1 : 10
    printf('[%2d] %4d / %4d (%5.1f%%) \n', i-1, result(i,2), result(i,1), result(i,2)/result(i,1)*100);
  end
  sum_result = sum(result,1);
  printf('Total %5d / %5d (%5.1f%%) \n', sum_result(2), sum_result(1), sum_result(2)/sum_result(1)*100);
end
(2) load_MNIST.m
function [image label] = load_MNIST( image_file, label_file )
  %//////////////////////////////////////////////////////////////
  % Load the images
  % Output the images loaded from the file as an image[28][28][60000] array
  fid = fopen(image_file,'r','b');
  magic_number      = fread(fid, 1, 'int32');
  number_of_items   = fread(fid, 1, 'int32');
  number_of_rows    = fread(fid, 1, 'int32');
  number_of_columns = fread(fid, 1, 'int32');
  img = fread(fid, [number_of_rows*number_of_columns number_of_items],'uint8');
  image = permute(reshape(img, number_of_rows, number_of_columns, number_of_items),[2 1 3]);
  fclose(fid);
  %//////////////////////////////////////////////////////////////
  % Load the labels
  % Output the labels loaded from the file as a label[10][60000] array
  fid = fopen(label_file,'r','b');
  magic_number    = fread(fid, 1, 'int32');
  number_of_items = fread(fid, 1, 'int32');
  lbl = fread(fid, number_of_items, 'uint8');
  idx = [1:number_of_items]';
  lblidx = lbl * number_of_items + idx;
  label = zeros(number_of_items,10);
  label(lblidx) = 1;
  label = label';
  fclose(fid);
end
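The one-hot encoding above relies on Octave's column-major linear indexing: in the (number_of_items × 10) zero matrix, the linear index lbl*number_of_items + idx addresses row idx, column lbl+1. A minimal sketch with made-up toy values (my addition, not part of the original listing) to see the trick in isolation:

% Toy illustration of the linear-index one-hot trick (values are made up)
lbl = [2; 0; 1];                 % three example labels
n   = numel(lbl);                % plays the role of number_of_items
onehot = zeros(n, 10);
onehot(lbl * n + (1:n)') = 1;    % row i gets a 1 in column lbl(i)+1
disp(onehot(:, 1:4));            % shows the 1s in columns 3, 1, 2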
(3) NNET_setup.m
function nn = NNET_setup( num_unit_of_each_layer )
  % Initialize the random number generator
  rand('seed', 0);
  % Get the specified number of layers
  num_layer = numel( num_unit_of_each_layer );
  % Initialize every layer
  for i=2 : num_layer
    % Get the unit counts of the current layer and the previous layer
    num_unit_pre = num_unit_of_each_layer( i - 1 );
    num_unit     = num_unit_of_each_layer( i );
    % Initialize each connection weight with uniform random numbers in [-1, 1]
    nn.layer{i}.weight = -1 + rand( num_unit, num_unit_pre ) * 2;
    % Initialize the biases
    nn.layer{i}.bias = zeros( num_unit, 1 );
  end
end
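As a quick sanity check (my addition, not part of the original listing), the weight matrix shapes created for a given configuration can be inspected directly:

nn = NNET_setup([784 16 10]);
size(nn.layer{2}.weight)   % 16 x 784: hidden-layer weights
size(nn.layer{3}.weight)   % 10 x 16:  output-layer weights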
(4) NNET_learn.m
function nn = NNET_learn( nn, train_img, train_lbl )
  % Get the number of training samples
  num_data = size( train_img, 3 );
  % Shuffle the training data
  randvector = randperm( num_data );
  train_img = train_img(:,:,randvector);
  train_lbl = train_lbl(:,randvector);
  % Train on every training image, one sample at a time
  for i=1 : num_data
    % Forward propagation
    nn = NNET_propagation_forward( nn, train_img(:,:,i) );
    % Error backpropagation
    nn = NNET_propagation_back( nn, train_lbl(:,i) );
    % Parameter update
    nn = NNET_update( nn, 0.05 );
  end
end
(5) NNET_propagation_forward.m
function nn = NNET_propagation_forward( nn, train_img )
  % Store the input layer's output values
  nn.layer{1}.out = train_img(:); % [n0][1]
  % Run forward propagation through all layers
  for i=2 : numel(nn.layer)
    % a = Σ(w·z) + bias   w[n1][n0], z[n0][1], bias[n1][1]
    nn.layer{i}.actprm = nn.layer{i}.weight * nn.layer{i-1}.out + nn.layer{i}.bias;
    % out = sigmoid(a)   out[n1][1]
    nn.layer{i}.out = sigmoid( nn.layer{i}.actprm );
  end
end
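In matrix notation (my restatement of what the loop computes), each layer $i$ evaluates, with $z^{(1)}$ the flattened input image:

$$a^{(i)} = W^{(i)} z^{(i-1)} + b^{(i)}, \qquad z^{(i)} = \sigma\!\left(a^{(i)}\right)$$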
(6) NNET_propagation_back.m
function nn = NNET_propagation_back( nn, train_lbl )
  % Get the number of layers
  num_layer = numel(nn.layer);
  % Compute the error at the output layer and the initial gradient to backpropagate
  [err, nn.layer{num_layer}.grad] = lossfunc(nn.layer{num_layer}.out, train_lbl);
  % Run error backpropagation through all layers
  for i=num_layer : -1 : 2
    % Gradient propagated to each neuron of the previous layer:
    %   dout/din = w · h'(a)
    %   grad = Σ( gout · dout/din )
    % First combine the two arrays of matching size: grad[n1][1], out[n1][1]
    derr = nn.layer{i}.grad .* dsigmoid(nn.layer{i}.out);
    % Σ(w·derr)   w[n1][n0], derr[n1][1] -> grad[n0][1]
    nn.layer{i-1}.grad = nn.layer{i}.weight' * derr;
    % Compute the weight update amounts:
    %   dE/dw = grad · h'(a) · out
    % Computed per IN-unit/OUT-unit pair: out[n0][1], derr[n1][1] -> dw[n1][n0]
    nn.layer{i}.dweight = derr * nn.layer{i-1}.out';
    % Compute the bias update amounts
    nn.layer{i}.dbias = derr;
  end
end
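In standard notation (my restatement, with $\odot$ the elementwise product, $g^{(i)}$ for nn.layer{i}.grad, and $\delta^{(i)}$ for derr), each iteration of the loop computes:

$$\delta^{(i)} = g^{(i)} \odot \sigma'\!\left(a^{(i)}\right), \qquad g^{(i-1)} = \left(W^{(i)}\right)^{\!\top} \delta^{(i)}, \qquad \frac{\partial E}{\partial W^{(i)}} = \delta^{(i)} \left(z^{(i-1)}\right)^{\!\top}, \qquad \frac{\partial E}{\partial b^{(i)}} = \delta^{(i)}$$

starting from $g^{(L)} = z^{(L)} - t$ as returned by lossfunc.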
(7) NNET_update.m
function nn = NNET_update( nn, ratio )
  % Update the connection weights and biases of every layer
  for i=2 : numel(nn.layer)
    nn.layer{i}.weight = nn.layer{i}.weight - ratio * nn.layer{i}.dweight;
    nn.layer{i}.bias   = nn.layer{i}.bias   - ratio * nn.layer{i}.dbias;
  end
end
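This is plain gradient descent (my restatement); with the learning rate $\eta = 0.05$ passed in from NNET_learn:

$$W^{(i)} \leftarrow W^{(i)} - \eta \, \frac{\partial E}{\partial W^{(i)}}, \qquad b^{(i)} \leftarrow b^{(i)} - \eta \, \frac{\partial E}{\partial b^{(i)}}$$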
(8) NNET_test.m
function result = NNET_test( nn, test_img, test_lbl )
  % Get the number of test samples
  num_data = size(test_img, 3);
  % Initialize the storage area for the test results
  result = zeros(size(test_lbl,1),2);
  % Run the test on every test image
  for i=1 : num_data
    % Forward propagation
    nn = NNET_propagation_forward( nn, test_img(:,:,i) );
    % Get the classification result (the output-layer output values)
    result_lbl = nn.layer{numel(nn.layer)}.out;
    [~, idx_cor] = max(test_lbl(:,i)); % get the expected digit
    [~, idx_res] = max(result_lbl);    % get the predicted digit
    % Record the test result
    result(idx_cor ,1) = result(idx_cor ,1) + 1;   % per-digit test count +1
    if idx_cor == idx_res
      result(idx_cor ,2) = result(idx_cor ,2) + 1; % per-digit correct count +1
    end
  end
end
(9) sigmoid.m
function y = sigmoid( x )
  y = 1 ./ (1 + exp(-x));
end
(10) dsigmoid.m
function y = dsigmoid( x )
  % Note: the argument is the sigmoid output z = σ(a), since σ'(a) = σ(a)·(1 - σ(a)) = z·(1 - z)
  y = x .* (1 - x);
end
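As a quick check (my addition, not in the original code) that dsigmoid applied to the sigmoid output matches the true derivative, compare it against a central finite difference:

a = 0.7;
h = 1e-6;
analytic = dsigmoid(sigmoid(a));                   % σ'(a) via the z·(1-z) form
numeric  = (sigmoid(a+h) - sigmoid(a-h)) / (2*h);  % finite-difference estimate
printf('%.8f vs %.8f\n', analytic, numeric);       % the two values should agree closely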
(11) lossfunc.m
function [err grad] = lossfunc( out, lbl )
  % Compute the initial error to backpropagate
  grad = out - lbl; % out[n1][1], lbl[n1][1]
  % The error function is the sum-of-squares error:
  %   err = (1/2) * Σ (out - lbl)^2
  err = grad' * grad / 2;
end
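The returned grad is exactly the derivative of this error with respect to the output (my restatement):

$$E = \frac{1}{2}\sum_k \left( y_k - t_k \right)^2 \;\Longrightarrow\; \frac{\partial E}{\partial y_k} = y_k - t_k$$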
Next time, in "(5) Raising the Accuracy by Increasing the Epoch Count", I will try to improve the accuracy by training repeatedly.