A Better Brain for the Robot (Part 14)

The dataset now needs some preprocessing. It is laid out as follows:

Sean/
  |- images/            images
  |- labels/            annotations
  |- structure/         files that map each image to its annotation

1. In the dataset directory Sean, create the label file labelmap.prototxt

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "bunny"
  label: 1
  display_name: "bunny"
}
item {
  name: "doll"
  label: 2
  display_name: "doll"
}
item {
  name: "doraemon"
  label: 3
  display_name: "doraemon"
}
item {
  name: "snoopy"
  label: 4
  display_name: "snoopy"
}
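
For a longer class list, writing these entries by hand gets error-prone. The following is a minimal sketch of how the same text could be generated. The helper name build_labelmap is hypothetical and not part of the original workflow, and it assumes that label 0 is always reserved for the background entry shown above.

```python
def build_labelmap(class_names):
    """Return labelmap.prototxt text: background as label 0, classes from 1."""
    # The background entry mirrors the first item of the hand-written file.
    entries = [("none_of_the_above", 0, "background")]
    # Each real class uses its own name as display_name.
    entries += [(name, i, name) for i, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display in entries:
        blocks.append(
            'item {\n'
            f'  name: "{name}"\n'
            f'  label: {label}\n'
            f'  display_name: "{display}"\n'
            '}\n'
        )
    return "".join(blocks)


if __name__ == "__main__":
    # Prints the same five items as the listing above.
    print(build_labelmap(["bunny", "doll", "doraemon", "snoopy"]), end="")
```

Redirecting the printed text into Sean/labelmap.prototxt would give a file equivalent to the one above.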

2. In the structure directory, generate trainval.txt and test.txt

import os

def createMapTxt(baseDirDataSet, target):
    # Build one "image-path annotation-path" line per file found in images/<target>.
    buffer = ''
    baseDir = baseDirDataSet+'/images/'+target
    for filename in os.listdir(baseDir):
        filenameOnly, file_extension = os.path.splitext(filename)
        s = 'images/'+filenameOnly+'.jpg'+' '+'labels/'+filenameOnly+'.xml\n'
        print(repr(s))
        # Sanity check: the line must split back into exactly two fields.
        img_file, anno = s.strip("\n").split(" ")
        print(repr(img_file), repr(anno))
        buffer += s
    # Write the collected lines to structure/<target>.txt.
    with open(baseDirDataSet+'/structure/'+target+'.txt', 'w') as file:
        file.write(buffer)
    print('Done')

createMapTxt('/home/young/Sean', 'trainval')
createMapTxt('/home/young/Sean', 'test')

The contents of trainval.txt look roughly like this:

images/Bunny(8).jpg labels/Bunny(8).xml
images/Bunny(78).jpg labels/Bunny(78).xml
images/doraemon(148).jpg labels/doraemon(148).xml
images/Snoopy(108).jpg labels/Snoopy(108).xml
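
Before passing these files on, it can help to confirm that every line pairs an image with the annotation of the same base name. The sketch below is one way to do that. The function names parse_list_line and find_mismatched are hypothetical, and the check assumes that paths contain no spaces, since the format uses a single space as the separator.

```python
import os


def parse_list_line(line):
    """Split one list line into (image_path, annotation_path)."""
    # Raises ValueError if the line does not contain exactly two fields.
    image_path, annotation_path = line.strip().split(" ")
    return image_path, annotation_path


def find_mismatched(lines):
    """Return the lines whose image and annotation base names differ."""
    bad = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        image_path, annotation_path = parse_list_line(line)
        image_stem = os.path.splitext(os.path.basename(image_path))[0]
        anno_stem = os.path.splitext(os.path.basename(annotation_path))[0]
        if image_stem != anno_stem:
            bad.append(line.strip())
    return bad


if __name__ == "__main__":
    sample = [
        "images/Bunny(8).jpg labels/Bunny(8).xml\n",
        "images/Snoopy(108).jpg labels/Snoopy(108).xml\n",
    ]
    print(find_mismatched(sample))
```

An empty result means every listed image is paired with an annotation of the same name. The check does not verify that the files exist on disk.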