A Better Brain for the Robot (Part 11)

Well, after all that preparation, we can finally start raising the hexapod robot's IQ.

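The whole process consists of:

  1. Prepare the dataset
  2. Label the data
  3. Preprocess the data
  4. Generate the lmdb data
  5. Train
  6. Deploy
  7. Deploy to the NCS
  8. Run on the Raspberry Pi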

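To speed up training, we usually use transfer learning: training starts from the weights of a pretrained model instead of from random values.

caffe train -solver solver.prototxt -weights trained.caffemodel

Transfer learning in Caffe boils down to:
  1. Data preprocessing, including
    1. Downloading and labeling images
    2. Splitting into training and test sets
    3. Generating the lmdb
  2. Downloading the pretrained model
  3. Modifying the network definition (keeping it consistent with the pretrained model), including:
    1. The input data
    2. Freezing the earlier layers and training only the last layer or last few layers
  4. Modifying solver.prototxt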
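
The training/test split in step 1 can be sketched in a few lines of Python. This is only an illustration: the function name `split_dataset`, the 20% test ratio, the fixed seed and the file names are all assumptions, not something taken from the Caffe tools.

```python
import random

def split_dataset(items, test_ratio=0.2, seed=42):
    """Shuffle items reproducibly and split them into (train, test) lists."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = list(items)
    # A private Random instance keeps the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical file names, for illustration only.
images = ["img_%03d.jpg" % i for i in range(10)]
train, test = split_dataset(images, test_ratio=0.2)
print(len(train), len(test))
```

Fixing the seed means the same images land in the test set every time, which makes results from different training runs easier to compare.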

A Better Brain for the Robot (Part 10)

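Getting a robot to recognize things through its camera is what the AI field calls object detection. Dad says there are lots of object detection algorithms: R-CNN, Fast R-CNN, YOLO, SSD, and on and on. But since we need to run on a Raspberry Pi, Dad suggested I use MobileNet-SSD.

Next, let's first see on the laptop how MobileNet-SSD detects objects.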

1. Download MobileNet-SSD
$ cd $CAFFE_ROOT/examples
$ git clone --depth 1 https://github.com/chuanqi305/MobileNet-SSD
2. MobileNet-SSD ships with a pretrained model that recognizes 20 kinds of objects, including 'bicycle', 'bird', 'bottle', 'car', 'cat', 'cow', 'dog', 'horse', and so on. We'll use this model for testing.
import numpy as np
import sys,os
import cv2
caffe_root = '/home/young/caffe/'
sys.path.insert(0, caffe_root + 'python')
import caffe

net_file = 'deploy.prototxt'
caffe_model='mobilenet_iter_73000.caffemodel'
test_dir = "images"

# Stop early if the model or the network definition is missing
if not os.path.exists(caffe_model):
  print(caffe_model + " does not exist")
  sys.exit()
if not os.path.exists(net_file):
  print(net_file + " does not exist")
  sys.exit()
net = caffe.Net(net_file,caffe_model,caffe.TEST)

CLASSES = ('background', 'aeroplane', 'bicycle', 'bird', 
'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 
'diningtable', 'dog', 'horse', 'motorbike', 'person', 
'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor')

def preprocess(src):
  # Resize to the 300x300 network input and map pixel values from [0, 255] to about [-1, 1]
  img = cv2.resize(src, (300,300))
  img = img - 127.5
  img = img * 0.007843
  return img

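def postprocess(img, out):
  h = img.shape[0]
  w = img.shape[1]
  # Each detection row is [image_id, label, confidence, xmin, ymin, xmax, ymax],
  # with box coordinates normalized to 0..1, so scale them back to pixels
  box = out['detection_out'][0,0,:,3:7] * np.array([w, h, w, h])

  cls = out['detection_out'][0,0,:,1]
  conf = out['detection_out'][0,0,:,2]
  return (box.astype(np.int32), conf, cls)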

def detect(imgfile):
  origimg = cv2.imread(imgfile)
  img = preprocess(origimg)

  img = img.astype(np.float32)
  img = img.transpose((2, 0, 1))

  net.blobs['data'].data[...] = img
  out = net.forward()
  box, conf, cls = postprocess(origimg, out)

  for i in range(len(box)):
    p1 = (box[i][0], box[i][1])
    p2 = (box[i][2], box[i][3])
    cv2.rectangle(origimg, p1, p2, (0,255,0),2)
    p3 = (max(p1[0], 15),p1[1]-5)
    title = "%s:%.2f" % (CLASSES[int(cls[i])], conf[i])
    cv2.putText(origimg, title, p3, cv2.FONT_ITALIC, 0.8, (0, 255, 0), 2)
  cv2.imshow("SSD", origimg)
  cv2.imwrite(title+'.jpg', origimg)
  k = cv2.waitKey(0) & 0xff
  if k == 27 : 
    return False
  return True

for f in os.listdir(test_dir):
  if detect(test_dir + "/" + f) == False:
    break
3. Detection results
When MobileNet-SSD detects an object, it draws a box around it and labels it with the object's name and the confidence.
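
The script above draws every row the network returns, including weak guesses. A possible refinement is to keep only detections above a confidence threshold. The sketch below uses plain Python lists so it runs without Caffe; the function name `filter_detections`, the 0.5 threshold and the sample rows are assumptions made up for illustration, not real model output.

```python
def filter_detections(rows, img_w, img_h, min_conf=0.5):
    """Keep rows scoring at least min_conf and convert boxes to pixel coordinates.

    Each row is assumed to be
    [image_id, label, confidence, xmin, ymin, xmax, ymax]
    with box coordinates normalized to the range 0..1.
    """
    kept = []
    for _image_id, label, conf, xmin, ymin, xmax, ymax in rows:
        if conf < min_conf:
            continue  # too uncertain, skip this detection
        box = (int(xmin * img_w), int(ymin * img_h),
               int(xmax * img_w), int(ymax * img_h))
        kept.append((int(label), float(conf), box))
    return kept

# Made-up detections for a 640x480 image (not real model output).
rows = [
    [0, 12, 0.92, 0.25, 0.25, 0.50, 0.75],
    [0, 8, 0.30, 0.50, 0.25, 0.75, 0.75],
]
detections = filter_detections(rows, 640, 480)
for label, conf, box in detections:
    print(label, "%.2f" % conf, box)
```

A higher threshold gives fewer false boxes but may miss objects that are small or partly hidden, so the value is worth tuning on your own test images.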

A Better Brain for the Robot (Part 9)

Next comes the last step: installing the NCSDK on the Raspberry Pi.

1. Update the Raspberry Pi system to Raspbian Stretch

2. Install OpenCV 3

$ sudo apt-get install libhdf5-dev libhdf5-serial-dev
$ sudo apt-get install libqtwebkit4 libqt4-test
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py
$ sudo pip install opencv-contrib-python

3. Install the NCSDK

$ cd ~
$ git clone -b ncsdk2 http://github.com/Movidius/ncsdk && cd ncsdk && make install

A Better Brain for the Robot (Part 8)

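Now let's try moving the earlier mnist handwritten digit recognition model onto the NCS (Intel Neural Compute Stick) and see whether it runs.

First, generate the graph that the NCS needs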

$ cd ~/mnist
$ mkdir models
$ cp deploy.prototxt lenet_iter_10000.caffemodel models
$ mvNCCompile models/deploy.prototxt \
    -w models/lenet_iter_10000.caffemodel \
    -s 12 -is 300 300 -o models/mnist_graph
Test recognizing a handwritten digit with the NCS
# NCSDK API V2
from mvnc import mvncapi as mvnc
import numpy as np
import cv2
import time

labels = ['zero','one','two','three','four','five','six','seven','eight','nine']
GRAPH = 'models/mnist_graph'
IMAGE = 'images/5.jpg'

# discover our device
devices = mvnc.enumerate_devices()
device = mvnc.Device(devices[0])
device.open()

# load graph onto the device
graph = mvnc.Graph('graph1')
with open(GRAPH, 'rb') as f:
    graph_file = f.read()
input_fifo, output_fifo = graph.allocate_with_fifos(device, graph_file)

# Preprocess like the training net: one gray channel, 20x20, scaled by 1/256
image = cv2.imread(IMAGE, cv2.IMREAD_GRAYSCALE)
image_pro = cv2.resize(image, (20, 20)).astype(np.float32) * 0.00390625

# Run the inference on the NCS and time it
start = time.time()
graph.queue_inference_with_fifo_elem(input_fifo, output_fifo, image_pro, 'object1')
output, user_obj = output_fifo.read_elem()

scores = output.flatten()
indice =(-scores).argsort()[:1][0]
prediction = labels[indice]
print ('The number is %s' %prediction)
print("Done in %.2f s." % (time.time() - start))

input_fifo.destroy()
output_fifo.destroy()
graph.destroy()
device.close()
device.destroy()
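
The script only prints the single best guess. When a digit is badly written, it can help to look at the runner-up guesses too. Here is a small sketch of a top-k helper in plain Python; the name `top_k` and the score values are made up for illustration and do not come from a real NCS run.

```python
def top_k(scores, labels, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], scores[i]) for i in order[:k]]

labels = ['zero', 'one', 'two', 'three', 'four',
          'five', 'six', 'seven', 'eight', 'nine']
# Made-up softmax output for one image, one score per digit.
scores = [0.01, 0.02, 0.03, 0.05, 0.04, 0.70, 0.08, 0.02, 0.04, 0.01]

best = top_k(scores, labels, k=3)
for name, score in best:
    print("%-5s %.2f" % (name, score))
```

With `k=1` this gives the same answer as the `(-scores).argsort()[:1][0]` line in the script.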

A Better Brain for the Robot (Part 7)

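Install the NCSDK development environment on the laptop.

1. Install OpenCV 3.1

The installation steps are the same as in the previous post.

2. Install the NCSDK

This one is fairly simple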

$ cd 
$ git clone -b ncsdk2 http://github.com/Movidius/ncsdk
$ cd ncsdk 
$ make install

3. Test the NCSDK installation

Plug the Intel Neural Compute Stick into a USB port on the laptop, then:

$ cd ~/ncsdk/examples/apps/hello_ncs_py
$ make run



					

A Better Brain for the Robot (Part 6)

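In the last post we only watched Caffe train, but never used the trained model to predict a handwritten digit. This time, we train Caffe on images of handwritten digits and then use the model to recognize one.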

1. Download the mnist dataset in image format,

$ cd $CAFFE_ROOT/data
$ cp -r Mnist_image Mnist_image

and generate the label files for the data, that is, which digit each file corresponds to

#!/usr/bin/env sh
CAFFE_ROOT=~/caffe_BLVC
MNIST=$CAFFE_ROOT/data/Mnist_image
DATA_TRAIN=$MNIST/train
DATA_TEST=$MNIST/test
echo "Create train.txt..."
for i in 0 1 2 3 4 5 6 7 8 9
do
  # Quote the pattern so the shell does not expand it before find sees it
  find $DATA_TRAIN/$i/ -name "*.png" | cut -d '/' -f8-9 | sed "s/$/ $i/">>$MNIST/train.txt
done
echo "Create test.txt..."
for i in 0 1 2 3 4 5 6 7 8 9
do
  find $DATA_TEST/$i/ -name "*.png" | cut -d '/' -f8-9 | sed "s/$/ $i/">>$MNIST/test.txt
done
echo "All done"

We get two files, train.txt and test.txt

0/0_2257.png 0
0/0_5565.png 0
...
4/4_787.png 4
4/4_414.png 4
...
9/9_2947.png 9
9/9_4352.png 9

2. Convert to lmdb format

#!/usr/bin/env sh
set -e
CAFFE_ROOT=~/caffe_BLVC 
EXAMPLE=$CAFFE_ROOT/examples/Mnist_image 
DATA=$CAFFE_ROOT/data/Mnist_image 
TOOLS=$CAFFE_ROOT/build/tools
TRAIN_DATA_ROOT=$DATA/train/ 
TEST_DATA_ROOT=$DATA/test/

echo "Creating train lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=20 \
    --resize_width=20 \
    --shuffle \
    --gray=true \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/mnist_train_lmdb

echo "Creating test lmdb..."
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=20 \
    --resize_width=20 \
    --shuffle \
    --gray=true \
    $TEST_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/mnist_test_lmdb

echo "Done."

3. Compute the mean

Subtracting the mean from the images before training generally improves training speed and accuracy.

$ cd $CAFFE_ROOT
$ sudo build/tools/compute_image_mean examples/Mnist_image/mnist_train_lmdb examples/Mnist_image/mean.binaryproto

Create an all-zero mean file, meanfile.npy

# zeronp.py
import numpy as np
zeros = np.zeros((1,20,20), dtype=np.float32)
np.save('meanfile.npy', zeros)
$ python zeronp.py

4. Create the model

Create Caffe's two configuration files under examples/Mnist_image:

# lenet_solver.prototxt
net: "examples/Mnist_image/lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "examples/Mnist_image/lenet"
solver_mode: CPU
# lenet_train_test.prototxt
name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/Mnist_image/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/Mnist_image/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
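
To check that the layers fit together, it helps to follow the image size through the network. The sketch below does the arithmetic for the 20x20 input used here. It assumes Caffe's usual rounding rules (convolution rounds down, pooling rounds up); the helper names are made up for this example.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    """Output width/height of a pooling layer (rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

def lenet_shapes(input_size=20):
    """Return (layer, channels, size) for the conv/pool layers of the LeNet above."""
    s = conv_out(input_size, kernel=5)
    shapes = [("conv1", 20, s)]
    s = pool_out(s, kernel=2, stride=2)
    shapes.append(("pool1", 20, s))
    s = conv_out(s, kernel=5)
    shapes.append(("conv2", 50, s))
    s = pool_out(s, kernel=2, stride=2)
    shapes.append(("pool2", 50, s))
    return shapes

shapes = lenet_shapes(20)
for name, channels, size in shapes:
    print("%s: %d x %d x %d" % (name, channels, size, size))

# ip1 sees the flattened pool2 output.
_, channels, size = shapes[-1]
ip1_inputs = channels * size * size
print("ip1 inputs:", ip1_inputs)
```

With a 20x20 input the image shrinks to 16, 8, 4 and finally 2 pixels per side, so ip1 receives 50 x 2 x 2 = 200 values. An input much smaller than 20x20 would leave nothing for conv2 to work on.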

5. Train

#!/usr/bin/env sh
set -e
cd $CAFFE_ROOT
./build/tools/caffe train --solver=examples/Mnist_image/lenet_solver.prototxt $@

Start training. With the snapshot_prefix set in the solver, this produces lenet_iter_10000.caffemodel.
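
During training the learning rate does not stay at base_lr. With lr_policy "inv", Caffe lowers it at every iteration as base_lr * (1 + gamma * iter) ^ (-power). The sketch below plugs in the solver values from step 4 to show how the rate falls; the function name `inv_lr` is just a label for this example.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under Caffe's "inv" policy, using the solver values above."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

for it in (0, 2500, 5000, 10000):
    print("iter %5d  lr = %.6f" % (it, inv_lr(it)))
```

By the last iteration (max_iter: 10000) the rate has dropped to roughly 60% of where it started, so the network takes smaller steps as it gets closer to a good solution.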

6. Create deploy.prototxt in examples/Mnist_image/

# deploy.prototxt
name: "LeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 1 dim: 20 dim: 20 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
  }
} 
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}

7. Recognize an image
examples/Mnist_image/classify_mnist.py

#!/usr/bin/env python
#coding:utf-8
import sys
caffe_root = '/home/young/caffe_BLVC/'
sys.path.insert(0, caffe_root + 'python')
import numpy as np
import os
import time
import caffe

labels = ['zero','one','two','three','four','five','six','seven','eight','nine']
model_def = os.path.join(caffe_root, 'examples/Mnist_image/deploy.prototxt')
pretrained_model = os.path.join(caffe_root, 'examples/Mnist_image/lenet_iter_10000.caffemodel')
mean_file = os.path.join(caffe_root, 'examples/Mnist_image/meanfile.npy')
mean = np.load(mean_file).mean(1).mean(1)

def main(argv):
    caffe.set_mode_cpu()
    print("CPU mode")
    classifier = caffe.Classifier(model_def, pretrained_model,
            image_dims=(20,20), mean=mean, raw_scale=255)

    input_file = sys.argv[1]
    print("Loading file: %s" % input_file)
    inputs = [caffe.io.load_image(input_file, False)]

    start = time.time()
    scores = classifier.predict(inputs).flatten()
    print(scores)
    indice =(-scores).argsort()[:1][0]
    prediction = labels[indice]
    print ('The number is %s' %prediction)
    print("Done in %.2f s." % (time.time() - start))

if __name__ == '__main__':
    main(sys.argv)
$ python classify_mnist.py 4.jpg

A Better Brain for the Robot (Part 5)

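Now we need to test whether Caffe is installed correctly. Let's run the mnist example that ships with Caffe. mnist is a library of handwritten digit images, and it plays much the same role as the first Hello world program we write when learning C.

Caffe's advantage is that you can train a neural network just by editing a few text files. These files are:

train_test.prototxt: defines the layer structure of the neural network
solver.prototxt: defines how the neural network is trained and optimized

Set up the environment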

$ export CAFFE_ROOT=~/caffe
$ export PYTHONPATH=$CAFFE_ROOT/python
$ cd $CAFFE_ROOT

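Download the mnist data

$ sudo sh data/mnist/get_mnist.sh

After it runs successfully, there are four files in the ~/caffe/data/mnist/ directory (the sizes are those of the compressed downloads):

train-images-idx3-ubyte:  training set images (9912422 bytes)
train-labels-idx1-ubyte:  training set labels (28881 bytes)
t10k-images-idx3-ubyte:   test set images (1648877 bytes)
t10k-labels-idx1-ubyte:   test set labels (4542 bytes)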
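
These four files use the IDX format: a short big-endian header followed by raw bytes. Here is a sketch of reading that header with Python's struct module. The function name `read_idx_header` is an assumption, and the demo parses a synthetic header built in memory (shaped like the training image file, 60000 images of 28x28 pixels) instead of opening the real file.

```python
import struct

def read_idx_header(data):
    """Parse the header of an uncompressed IDX file held in a bytes object.

    Returns (dtype_code, dims). The header is big-endian: two zero bytes,
    one data-type code, one dimension count, then one 32-bit size per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    return dtype_code, dims

# Synthetic header shaped like train-images-idx3-ubyte (0x08 means unsigned byte).
image_header = struct.pack(">HBBIII", 0, 0x08, 3, 60000, 28, 28)
# Synthetic header shaped like train-labels-idx1-ubyte.
label_header = struct.pack(">HBBI", 0, 0x08, 1, 60000)

print(read_idx_header(image_header))
print(read_idx_header(label_header))
```

The "idx3" and "idx1" in the file names match the dimension count in the header: images have three dimensions (count, rows, columns) and labels have one.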

Convert the data to LMDB format

$ sudo sh examples/mnist/create_mnist.sh

After a successful conversion, two folders are created under examples/mnist/, namely mnist_train_lmdb and mnist_test_lmdb. The data.mdb and lock.mdb files inside them are the data we need for the run.

/mnist_train_lmdb
        |- data.mdb
        |- lock.mdb
/mnist_test_lmdb
        |- data.mdb
        |- lock.mdb

Run

$ sudo time sh examples/mnist/train_lenet.sh

You can see the results of Caffe training the neural network

The final trained model is saved in caffe/examples/mnist/lenet_iter_10000.caffemodel.