[MATLAB] Splitting data into train and test sets for a CNN

MATLAB, CNN, Deep learning
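The goal is to compare how accurately a CNN classifies disease from different MRI sequences.
Each sequence folder contains images of the same patients, so the data must be split so that train and test hold the same patients in every sequence.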

The folder structure looks like this:
[Image: folder structure]

Steps

1. Get the folder structure
2. Create the destination folders
3. Generate the split indices with random numbers
4. Move the files with movefile
5. Repeat for each sequence
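
The point of the steps above is that the random indices are drawn once and then reused for every sequence, so the same patients end up in test everywhere. A minimal in-memory sketch of that idea follows. The seed, the image count, the sequence name T2, and the file-name pattern are all assumptions made for illustration, not taken from the actual data.

```matlab
rng(0);                         % fixed seed (an assumption) so the split is repeatable
n_img = 10;                     % hypothetical number of images per sequence
test_idx = randperm(n_img, round(n_img * 0.2));   % drawn once

seq_names = {'ADC', 'T2'};      % 'T2' is a hypothetical second sequence
test_files = cell(size(seq_names));
for s = 1:numel(seq_names)
    % Hypothetical file names: same patient order in every sequence
    files = arrayfun(@(k) sprintf('patient%02d_%s.png', k, seq_names{s}), ...
                     1:n_img, 'UniformOutput', false);
    test_files{s} = files(test_idx);   % reuse the same indices
end

% Patient IDs sent to test, per sequence (first 9 characters of each name)
ids_a = cellfun(@(f) f(1:9), test_files{1}, 'UniformOutput', false);
ids_b = cellfun(@(f) f(1:9), test_files{2}, 'UniformOutput', false);
disp(isequal(ids_a, ids_b))
```

This only works if every sequence folder lists the same patients in the same order, which is worth confirming before moving any files.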

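Get the folder structure

% Select the top-level data folder
main_fd = uigetdir();

% a: parent folder, b: name of the selected folder
[a, b] = fileparts(main_fd);

sub_fd = dir(main_fd);
% Remove '.', '..' and hidden files such as .DS_Store and ._*
% (ismember compares literally, so the pattern '._*' would not match anything)
sub_fd = sub_fd(~startsWith({sub_fd.name}, '.'));
% Convert to a cell array of names
sub_fd = {sub_fd.name};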
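
When filtering dir output, note that ismember compares names literally, so an entry such as '._*' only matches a file that is literally named ._* and does not act as a wildcard. The short comparison below, using made-up file names, shows the difference against a prefix test with startsWith.

```matlab
% Made-up listing with one macOS resource-fork file and one real image
names = {'.', '..', '.DS_Store', '._img1.png', 'img1.png'};

% Literal comparison: '._*' is treated as an ordinary name
mask_literal = ismember(names, {'.', '..', '.DS_Store', '._.DS_Store', '._*'});
kept_literal = names(~mask_literal);   % '._img1.png' survives

% Prefix test: drops every name that starts with a dot
mask_prefix = startsWith(names, '.');
kept_prefix = names(~mask_prefix);     % only 'img1.png' remains

disp(kept_literal)
disp(kept_prefix)
```

The prefix test assumes no real data file starts with a dot, which holds for typical image exports.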

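Create the destination folders for the split files
Here the original folder becomes train and the destination becomes test.
Back up the original data first (important), because movefile takes the files out of the original folder.

for x = 1:length(sub_fd)

    sub_sub_fd = dir(fullfile(main_fd, sub_fd{x}));
    % Remove '.', '..' and hidden files such as .DS_Store and ._*
    sub_sub_fd = sub_sub_fd(~startsWith({sub_sub_fd.name}, '.'));
    % Convert to a cell array of names
    sub_sub_fd = {sub_sub_fd.name};

    for y = 1:length(sub_sub_fd)
        % Mirror the folder tree under a sibling folder named test_<original name>
        test_img_fd_path = fullfile(a, strcat('test_', b), sub_fd{x}, sub_sub_fd{y});
        mkdir(test_img_fd_path);
    end
end
msgbox("Destination folders created")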
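
Since the files are later moved rather than copied, the backup mentioned above can be made with copyfile before going any further. The sketch below runs on a throwaway folder under tempname. The folder names and the backup_ prefix are hypothetical choices, not something the original workflow prescribes.

```matlab
% Throwaway source folder standing in for main_fd (all names hypothetical)
root = tempname;
main_fd_demo = fullfile(root, 'data');
mkdir(fullfile(main_fd_demo, 'seqA', 'class1'));
fclose(fopen(fullfile(main_fd_demo, 'seqA', 'class1', 'p01.png'), 'w'));

% Copy the whole tree to a sibling folder named backup_<original name>
[parent_fd, fd_name] = fileparts(main_fd_demo);
backup_fd = fullfile(parent_fd, strcat('backup_', fd_name));
[ok, msg] = copyfile(main_fd_demo, backup_fd);
if ~ok
    error('Backup failed: %s', msg);
end
```

Checking the returned status before continuing avoids moving files when the backup did not complete.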

Use length to get the number of files in each folder, then generate random indices for 20% of them

% Count the files in each folder (counts are taken from the first folder in sub_fd)
for yyy = 1:length(sub_sub_fd)
    tmp_fd = fullfile(main_fd, sub_fd{1}, sub_sub_fd{yyy});
    img_lst = dir(tmp_fd);
    % Remove '.', '..' and hidden files such as .DS_Store and ._*
    img_lst = img_lst(~startsWith({img_lst.name}, '.'));
    % Convert to a cell array of names
    img_lst = {img_lst.name};

    fd_nm{yyy} = length(img_lst);
end

% Pick 20% of the files in each folder, without repetition, as the test set
idx_1 = randperm(fd_nm{1}, round(fd_nm{1}*0.2));
idx_2 = randperm(fd_nm{2}, round(fd_nm{2}*0.2));
idx_3 = randperm(fd_nm{3}, round(fd_nm{3}*0.2));
idx_4 = randperm(fd_nm{4}, round(fd_nm{4}*0.2));

% Display the number of test files per folder
length_idx_1 = length(idx_1)
length_idx_2 = length(idx_2)
length_idx_3 = length(idx_3)
length_idx_4 = length(idx_4)
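
For reference, randperm(n, k) returns k unique integers chosen from 1 to n, so no file is picked twice. The sketch below fixes the random seed so the same split can be regenerated later and also derives the complementary train indices. The seed value and the file count are arbitrary assumptions for illustration.

```matlab
rng(42);                               % arbitrary seed (assumption) for a repeatable split
n = 25;                                % hypothetical number of files in one folder
k = round(n * 0.2);                    % 20% go to test
test_idx  = randperm(n, k);            % k unique indices in 1..n
train_idx = setdiff(1:n, test_idx);    % everything else stays in train

fprintf('test: %d files, train: %d files\n', numel(test_idx), numel(train_idx));
```

Saving test_idx (for example with save) is another way to keep a record of which files were moved.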

Move the files selected by the indices generated above with movefile

% fd1: move files (ADC)

for xx = 1:length(sub_fd)

    tmp_fd_1 = fullfile(main_fd, sub_fd{xx}, sub_sub_fd{1});

    img_lst_1 = dir(tmp_fd_1);
    % Remove '.', '..' and hidden files such as .DS_Store and ._*
    img_lst_1 = img_lst_1(~startsWith({img_lst_1.name}, '.'));
    % Convert to a cell array of names
    img_lst_1 = {img_lst_1.name};

    % The same idx_1 is used for every sub_fd{xx}, so the same file positions
    % are moved in each one. This relies on every folder listing the same
    % patients in the same order.
    for zz = idx_1

        tmp_img = fullfile(tmp_fd_1, img_lst_1{zz});
        destination = fullfile(a, strcat('test_', b), sub_fd{xx}, sub_sub_fd{1});

        movefile(tmp_img, destination)
    end
end
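
The block above handles one folder with idx_1. A possible generalization, sketched below on a throwaway tree, keeps one index vector per inner folder in a cell array and loops over everything, with a check that the file counts match before anything is moved. It assumes the layout is sequence folders containing class folders. All folder names, file names, and counts are hypothetical.

```matlab
% Build a tiny throwaway tree: 2 sequences x 2 classes x 5 files
root = tempname;
src = fullfile(root, 'data');
dst = fullfile(root, 'test_data');
seqs = {'seqA', 'seqB'};
classes = {'class1', 'class2'};
n_files = 5;
for s = 1:numel(seqs)
    for c = 1:numel(classes)
        mkdir(fullfile(src, seqs{s}, classes{c}));
        mkdir(fullfile(dst, seqs{s}, classes{c}));
        for k = 1:n_files
            fclose(fopen(fullfile(src, seqs{s}, classes{c}, sprintf('p%02d.png', k)), 'w'));
        end
    end
end

% One index vector per class, drawn once and reused for every sequence
idx = cell(1, numel(classes));
for c = 1:numel(classes)
    idx{c} = randperm(n_files, round(n_files * 0.2));
end

for s = 1:numel(seqs)
    for c = 1:numel(classes)
        src_fd = fullfile(src, seqs{s}, classes{c});
        lst = dir(src_fd);
        lst = lst(~[lst.isdir]);          % drop '.' and '..'
        names = sort({lst.name});         % fixed order in every folder
        % Stop early if a folder does not hold the expected number of files
        assert(numel(names) == n_files, 'File count differs in %s', src_fd);
        for k = idx{c}
            movefile(fullfile(src_fd, names{k}), fullfile(dst, seqs{s}, classes{c}));
        end
    end
end
```

Sorting the names and asserting the count are small safeguards against folders that are out of step with each other. They do not prove that the same position means the same patient, so file naming still needs to be consistent.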

Radi_tech’s blog

Radiological technologist in Japan / MRI / AI / Deep learning / MATLAB / R / Python
