TinyMS ResNet50 Tutorial
This tutorial demonstrates how to train a ResNet50 model and run inference with it using the TinyMS API.
Prerequisites
Ubuntu: 18.04
Python: 3.7.x
Flask: 1.1.2
MindSpore: CPU-1.1.1
TinyMS: 0.1.0
numpy: 1.17.5
Pillow: 8.1.0
pip: 21.0.1
requests: 2.18.4
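Introduction
TinyMS is a high-level API designed to help beginners get started with deep learning more easily. It reduces the number of steps needed to build, train, evaluate, and run inference with a model. TinyMS also provides tutorials and documentation to help developers get started and build on it.
This tutorial consists of six steps: building the model, downloading the dataset, training the model, defining servable.json, starting the server, and running inference. The server is started in a subprocess.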
[1]:
import os
import json
from PIL import Image
from tinyms import context
from tinyms.serving import start_server, predict, list_servables, shutdown, server_started
from tinyms.data import Cifar10Dataset, download_dataset, ImageFolderDataset
from tinyms.vision import cifar10_transform, ImageViewer, imagefolder_transform
from tinyms.model import Model, resnet50
from tinyms.callbacks import ModelCheckpoint, CheckpointConfig, LossMonitor
from tinyms.metrics import Accuracy
from tinyms.optimizers import Momentum
from tinyms.losses import SoftmaxCrossEntropyWithLogits
[WARNING] ME(12653:139827079800640,MainProcess):2021-03-19-15:26:20.878.475 [mindspore/ops/operations/array_ops.py:2302] WARN_DEPRECATED: The usage of Pack is deprecated. Please use Stack.
WARNING: 'ControlDepend' is deprecated from version 1.1 and will be removed in a future version, use 'Depend' instead.
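1. Build the model
TinyMS wraps the init and construct functions of the MindSpore ResNet50 model, so building the network takes only a couple of lines instead of a long model definition: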
[2]:
# Build the network
net = resnet50(class_num=10)
model = Model(net)
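2. Download the dataset
This tutorial trains on the cifar10 dataset. Two pretrained ckpt files are also available for download: one trained on cifar10, and the other trained on mushroom images from the ImageNet2012 dataset.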
[3]:
# download the cifar10 dataset
cifar10_path = '/root/cifar10/cifar-10-batches-bin'
if not os.path.exists(cifar10_path):
    download_dataset('cifar10', '/root')
    print('************Download complete*************')
else:
    print('************Dataset already exists.**************')
************** Downloading the Cifar10 dataset **************
[███████████████████████████████████████████████████████████████████████████████████████████████████ ] 99.66%************Download complete*************
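Once the download finishes, the files under cifar-10-batches-bin can be sanity-checked without any deep learning library. The sketch below relies on the published CIFAR-10 binary layout, in which every record is 3073 bytes: one label byte followed by 3072 pixel bytes (32x32 red, green, and blue planes). The helper name `iter_cifar10_records` is hypothetical and is not part of TinyMS.

```python
import os

# One CIFAR-10 binary record: 1 label byte + 3 colour planes of 32x32 pixels.
RECORD_BYTES = 1 + 3 * 32 * 32


def iter_cifar10_records(bin_path):
    """Yield (label, pixels) pairs from one CIFAR-10 binary batch file."""
    size = os.path.getsize(bin_path)
    if size % RECORD_BYTES != 0:
        # A truncated download leaves a partial record at the end of the file.
        raise ValueError(
            f"{bin_path}: size {size} is not a multiple of {RECORD_BYTES}"
        )
    with open(bin_path, "rb") as f:
        while True:
            record = f.read(RECORD_BYTES)
            if not record:
                break
            # record[0] is the class index, the rest is the raw pixel data.
            yield record[0], record[1:]
```

For example, `sum(1 for _ in iter_cifar10_records(os.path.join(cifar10_path, 'data_batch_1.bin')))` counts the records in the first training batch; a `ValueError` suggests the file was cut short and should be downloaded again.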
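3. Train the model
This step defines the training and validation datasets as well as the training parameters. The ckpt file produced by training is saved to the /etc/tinyms/serving/resnet50_cifar10 folder for later use. After training, the model is evaluated and the Accuracy metric is printed.
Tip: training takes a very long time. Consider skipping this step and downloading the ckpt files provided with this tutorial for the inference steps that follow.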
[ ]:
# Check the ckpt folder and file path
cifar10_ckpt_folder = '/etc/tinyms/serving/resnet50_cifar10'
cifar10_ckpt_path = '/etc/tinyms/serving/resnet50_cifar10/resnet50.ckpt'
if not os.path.exists(cifar10_ckpt_folder):
    !mkdir -p /etc/tinyms/serving/resnet50_cifar10
else:
    print('resnet50_cifar10 ckpt folder already exists')
# Set training parameters
epoch_size = 90 # default is 90
batch_size = 32
# Set environment parameters
dataset_sink_mode = False
device_target = "CPU"
context.set_context(mode=context.GRAPH_MODE, device_target=device_target)
# Set dataset parameters (reuse batch_size instead of repeating the literal 32)
train_dataset = Cifar10Dataset(cifar10_path, num_parallel_workers=4, shuffle=True)
train_dataset = cifar10_transform.apply_ds(train_dataset, repeat_size=1, batch_size=batch_size, is_training=True)
eval_dataset = Cifar10Dataset(cifar10_path, num_parallel_workers=4, shuffle=True)
eval_dataset = cifar10_transform.apply_ds(eval_dataset, repeat_size=1, batch_size=batch_size, is_training=False)
step_size = train_dataset.get_dataset_size()
save_checkpoint_epochs = 5
ckpoint_cb = ModelCheckpoint(prefix="resnet_cifar10", config=CheckpointConfig(
    save_checkpoint_steps=save_checkpoint_epochs * train_dataset.get_dataset_size(),
    keep_checkpoint_max=10))
# Define the loss function
net_loss = SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
# Define the optimizer (learning rate 0.01, momentum 0.9)
net_opt = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), 0.01, 0.9)
model.compile(loss_fn=net_loss, optimizer=net_opt, metrics={"Accuracy": Accuracy()})
print('************************Start training*************************')
model.train(epoch_size, train_dataset, callbacks=[ckpoint_cb, LossMonitor()],dataset_sink_mode=dataset_sink_mode)
model.save_checkpoint(cifar10_ckpt_path)
print('************************Finished training*************************')
model.load_checkpoint(cifar10_ckpt_path)
print('************************Start evaluation*************************')
acc = model.eval(eval_dataset, dataset_sink_mode=dataset_sink_mode)
print("============== Accuracy:{} ==============".format(acc))
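The checkpoint callback above saves every save_checkpoint_epochs * get_dataset_size() steps, so it helps to see what that number works out to. The sketch below does the arithmetic in plain Python. The sample count of 50,000 (the CIFAR-10 training split) and the choice to drop the final partial batch are assumptions; the value actually returned by get_dataset_size() depends on how the dataset and transform are configured. The function names are hypothetical.

```python
def steps_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches in one pass over the dataset."""
    full, rest = divmod(num_samples, batch_size)
    # Without drop_remainder, a final partial batch counts as one more step.
    return full if drop_remainder or rest == 0 else full + 1


def checkpoint_plan(num_samples, batch_size, epochs, save_every_epochs,
                    drop_remainder=True):
    """Summarise when checkpoints are written during training."""
    steps = steps_per_epoch(num_samples, batch_size, drop_remainder)
    if steps == 0:
        raise ValueError("dataset is smaller than one batch")
    save_steps = save_every_epochs * steps
    total_steps = epochs * steps
    return {
        "steps_per_epoch": steps,
        "save_checkpoint_steps": save_steps,
        "total_steps": total_steps,
        "checkpoints_written": total_steps // save_steps,
    }


# Assumed values: 50,000 samples, plus the tutorial's batch size and epochs.
plan = checkpoint_plan(50000, 32, epochs=90, save_every_epochs=5)
print(plan)
```

Under these assumptions one epoch is 1562 steps, a checkpoint is written every 7810 steps, and 18 checkpoints are written over 90 epochs. With keep_checkpoint_max=10, only the most recent 10 of them would remain on disk.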
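Tip: if you skipped the training step, download the pretrained ckpt files and continue with the inference steps.
Click resnet_imagenet to download the ResNet50_imagenet ckpt file, or click resnet_cifar to download the ResNet50_cifar ckpt file, and save the file as /etc/tinyms/serving/resnet50_<dataset_name>/resnet50.ckpt
Alternatively, run the following code to download both the resnet_imagenet and resnet_cifar ckpt files: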
[4]:
# Check whether the resnet50_imagenet2012 ckpt folder exists, and download the ckpt file if it is missing
imagenet2012_ckpt_folder = '/etc/tinyms/serving/resnet50_imagenet2012'
imagenet2012_ckpt_path = '/etc/tinyms/serving/resnet50_imagenet2012/resnet50.ckpt'
if not os.path.exists(imagenet2012_ckpt_folder):
    !mkdir -p /etc/tinyms/serving/resnet50_imagenet2012
    !wget -P /etc/tinyms/serving/resnet50_imagenet2012 https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/ckpt_files/imagenet2012/resnet50.ckpt
else:
    print('imagenet2012 ckpt folder already exists')
    if not os.path.exists(imagenet2012_ckpt_path):
        !wget -P /etc/tinyms/serving/resnet50_imagenet2012 https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/ckpt_files/imagenet2012/resnet50.ckpt
    else:
        print('imagenet2012 ckpt file already exists')
# Check whether the resnet50_cifar10 ckpt folder exists, and download the ckpt file if it is missing
cifar10_ckpt_folder = '/etc/tinyms/serving/resnet50_cifar10'
cifar10_ckpt_path = '/etc/tinyms/serving/resnet50_cifar10/resnet50.ckpt'
if not os.path.exists(cifar10_ckpt_folder):
    !mkdir -p /etc/tinyms/serving/resnet50_cifar10
    !wget -P /etc/tinyms/serving/resnet50_cifar10 https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/ckpt_files/cifar10/resnet50.ckpt
else:
    print('cifar10 ckpt folder already exists')
    if not os.path.exists(cifar10_ckpt_path):
        !wget -P /etc/tinyms/serving/resnet50_cifar10 https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/ckpt_files/cifar10/resnet50.ckpt
    else:
        print('cifar10 ckpt file already exists')
imagenet2012 ckpt folder already exists
--2021-03-19 15:28:24-- https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/ckpt_files/imagenet2012/resnet50.ckpt
Resolving ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)... 49.4.112.113, 49.4.112.90, 49.4.112.5, ...
Connecting to ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)|49.4.112.113|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 188521005 (180M) [binary/octet-stream]
Saving to: ‘/etc/tinyms/serving/resnet50_imagenet2012/resnet50.ckpt’
resnet50.ckpt 100%[===================>] 179.79M 32.4MB/s in 6.0s
2021-03-19 15:28:31 (30.0 MB/s) - ‘/etc/tinyms/serving/resnet50_imagenet2012/resnet50.ckpt’ saved [188521005/188521005]
cifar10 ckpt folder already exists
--2021-03-19 15:28:31-- https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/ckpt_files/cifar10/resnet50.ckpt
Resolving ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)... 49.4.112.113, 49.4.112.90, 49.4.112.5, ...
Connecting to ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)|49.4.112.113|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 188462121 (180M) [binary/octet-stream]
Saving to: ‘/etc/tinyms/serving/resnet50_cifar10/resnet50.ckpt’
resnet50.ckpt 100%[===================>] 179.73M 4.07MB/s in 40s
2021-03-19 15:29:11 (4.52 MB/s) - ‘/etc/tinyms/serving/resnet50_cifar10/resnet50.ckpt’ saved [188462121/188462121]
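The download cell only checks whether a ckpt file exists, so an interrupted wget can leave a partial file that later fails to load. One way to guard against that is to compare the file size with the byte count wget reported. The sketch below takes its expected sizes from the "Length:" lines in the log above; they describe the files as hosted on that date and are an assumption for any later download. The helper name `is_complete_download` is hypothetical.

```python
import os

# Byte counts copied from the "Length:" lines of the wget log above.
EXPECTED_CKPT_BYTES = {
    "/etc/tinyms/serving/resnet50_imagenet2012/resnet50.ckpt": 188521005,
    "/etc/tinyms/serving/resnet50_cifar10/resnet50.ckpt": 188462121,
}


def is_complete_download(path, expected_bytes):
    """Return True only if the file exists and has exactly the expected size."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_bytes


for ckpt_path, expected in EXPECTED_CKPT_BYTES.items():
    status = "ok" if is_complete_download(ckpt_path, expected) else "missing or incomplete"
    print(f"{ckpt_path}: {status}")
```

A file reported as missing or incomplete can be deleted and fetched again with the same wget command.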
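4. Define servable.json
Run one of the following two code blocks to define the servable json file, which is used later for inference.
Run the following code to define the servable json file for the ResNet50_imagenet2012 model: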
[ ]:
servable_json = [{'name': 'resnet50_imagenet2012',
'description': 'This servable hosts a resnet50 model predicting mushrooms',
'model': {
"name": "resnet50",
"format": "ckpt",
"class_num": 9}}]
os.chdir("/etc/tinyms/serving")
json_data = json.dumps(servable_json, indent=4)
with open('servable.json', 'w') as json_file:
    json_file.write(json_data)
Run the following code to define the servable json file for the ResNet50_cifar10 model:
[5]:
servable_json = [{'name': 'resnet50_cifar10',
'description': 'This servable hosts a resnet50 model predicting 10 classes of objects',
'model': {
"name": "resnet50",
"format": "ckpt",
"class_num": 10}}]
os.chdir("/etc/tinyms/serving")
json_data = json.dumps(servable_json, indent=4)
with open('servable.json', 'w') as json_file:
    json_file.write(json_data)
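Because the server reads servable.json from disk, a quick read-back check can catch a malformed file before inference is attempted. The sketch below loads the file and checks that each entry has the same keys as the two definitions above. The set of required keys is inferred from this tutorial only, not from a documented TinyMS schema, and the helper name `load_servables` is hypothetical.

```python
import json

# Keys used by both servable definitions in this tutorial (assumed required).
REQUIRED_ENTRY_KEYS = {"name", "description", "model"}
REQUIRED_MODEL_KEYS = {"name", "format", "class_num"}


def load_servables(json_path):
    """Load servable.json and check that every entry has the expected keys."""
    with open(json_path) as json_file:
        servables = json.load(json_file)
    if not isinstance(servables, list):
        raise ValueError("servable.json must contain a list of servables")
    for entry in servables:
        missing = REQUIRED_ENTRY_KEYS - entry.keys()
        if missing:
            raise ValueError(f"servable is missing keys: {sorted(missing)}")
        missing = REQUIRED_MODEL_KEYS - entry["model"].keys()
        if missing:
            raise ValueError(f"model section is missing keys: {sorted(missing)}")
    return servables
```

For example, `load_servables('/etc/tinyms/serving/servable.json')[0]['name']` should return the name of the servable defined by whichever of the two cells was run last.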
5. Start the server
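This step brings up the serving backend that the inference calls in step 6 talk to. A minimal sketch follows, using only the start_server and server_started names imported in the first cell. It assumes that start_server() can be called without arguments and that server_started() returns True once the backend is reachable; neither is confirmed by the text here, so check the signatures against the installed TinyMS version. The wrapper name `ensure_server_started` is hypothetical.

```python
def ensure_server_started():
    """Start the TinyMS serving backend unless it is already running.

    Returns True if this call started the server, False if it was already up.
    """
    # Imported inside the function so the helper can be defined
    # even in an environment where TinyMS is not installed.
    from tinyms.serving import start_server, server_started

    if server_started() is True:
        return False
    # Assumption: start_server() needs no arguments in this TinyMS version.
    start_server()
    return True
```

Calling `ensure_server_started()` before `list_servables()` avoids sending requests to a backend that has not been launched, and the imported `shutdown` function can be used to stop the server when inference is finished.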
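6. Run inference
6.1 Upload an image
The ResNet50_imagenet2012 model expects a picture of a mushroom as input, while the ResNet50_cifar10 model expects a picture belonging to one of the following 10 classes: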
['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
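Click mushroom to download the mushroom picture used in this tutorial for ResNet50_imagenet2012, or click airplane for ResNet50_cifar10. Then upload the picture: from a terminal, use 'scp' or 'wget' to fetch it; in Jupyter, click the 'Upload' button at the top right of the menu and select the picture. Save the picture in the /root directory and name it 'mushroom.jpeg' or 'airplane.jpg'.
Alternatively, run the following code to download the mushroom picture (for the ResNet_imagenet model) and the airplane picture (for the ResNet_cifar model):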
[7]:
# Download the mushroom picture
if not os.path.exists('/root/mushroom.jpeg'):
    !wget -P /root/ https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/tinyms-test-pics/mushrooms/mushroom.jpeg
else:
    print('mushroom.jpeg already exists')
# Download the airplane picture
if not os.path.exists('/root/airplane.jpg'):
    !wget -P /root/ https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/tinyms-test-pics/objects/airplane.jpg
else:
    print('airplane.jpg already exists')
--2021-03-19 15:29:18-- https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/tinyms-test-pics/mushrooms/mushroom.jpeg
Resolving ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)... 49.4.112.113, 49.4.112.90, 49.4.112.5, ...
Connecting to ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)|49.4.112.113|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 76020 (74K) [image/jpeg]
Saving to: ‘/root/mushroom.jpeg’
mushroom.jpeg 100%[===================>] 74.24K --.-KB/s in 0.1s
2021-03-19 15:29:18 (563 KB/s) - ‘/root/mushroom.jpeg’ saved [76020/76020]
--2021-03-19 15:29:18-- https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/tinyms-test-pics/objects/airplane.jpg
Resolving ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)... 49.4.112.113, 49.4.112.90, 49.4.112.5, ...
Connecting to ascend-tutorials.obs.cn-north-4.myhuaweicloud.com (ascend-tutorials.obs.cn-north-4.myhuaweicloud.com)|49.4.112.113|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 151188 (148K) [image/jpeg]
Saving to: ‘/root/airplane.jpg’
airplane.jpg 100%[===================>] 147.64K 568KB/s in 0.3s
2021-03-19 15:29:19 (568 KB/s) - ‘/root/airplane.jpg’ saved [151188/151188]
6.2 List servables
Use the list_servables function to check which model the backend is currently serving:
[8]:
list_servables()
[8]:
[{'description': 'This servable hosts a resnet50 model predicting 10 classes of objects',
'model': {'class_num': 10, 'format': 'ckpt', 'name': 'resnet50'},
'name': 'resnet50_cifar10'}]
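If the description field of the output shows a resnet50 model, continue to the next step and send an inference request. The code that follows detects the backend model automatically and runs the matching inference code.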
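The automatic detection mentioned above amounts to mapping the servable name onto a dataset name and a sample image. A table-driven sketch of that mapping is shown below; the dataset names and image paths are the ones used in this tutorial, while the `INFERENCE_INPUTS` table and the `select_inference_input` helper are hypothetical. This version rejects unknown servable names rather than falling back to a default.

```python
# Hypothetical lookup table: servable name -> (dataset name, sample image path).
INFERENCE_INPUTS = {
    "resnet50_imagenet2012": ("imagenet2012", "/root/mushroom.jpeg"),
    "resnet50_cifar10": ("cifar10", "/root/airplane.jpg"),
}


def select_inference_input(servables):
    """Pick the dataset name and image path for the first listed servable."""
    if not servables:
        raise ValueError("the backend reports no servables")
    name = servables[0]["name"]
    try:
        dataset, image_path = INFERENCE_INPUTS[name]
    except KeyError:
        raise ValueError(f"unsupported servable: {name}") from None
    return name, dataset, image_path
```

Passing the result of `list_servables()` to `select_inference_input` yields the three values needed for a `predict` call, with the output strategy supplied separately.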
6.3 Send an inference request
Call the predict function to send an inference request. Set its fourth argument to TOP1_CLASS or TOP5_CLASS to choose the output strategy:
[9]:
# Set the paths of the two images (one per dataset) and the output strategy (TOP1_CLASS or TOP5_CLASS)
imagenet_image_path = "/root/mushroom.jpeg"
cifar_image_path = "/root/airplane.jpg"
strategy = "TOP1_CLASS"
# predict(image_path, servable_name, dataset, topk_strategy)
# The four arguments of predict are the image path, the servable name, the dataset name (defaults to MNIST, so it must be set explicitly here), and the output strategy
if server_started() is True:
    servable_name = list_servables()[0]['name']
    if servable_name == 'resnet50_imagenet2012':
        img_viewer = ImageViewer(Image.open(imagenet_image_path), imagenet_image_path)
        img_viewer.show()
        print(predict(imagenet_image_path, "resnet50_imagenet2012", "imagenet2012", strategy))
    else:
        img_viewer = ImageViewer(Image.open(cifar_image_path), cifar_image_path)
        img_viewer.show()
        print(predict(cifar_image_path, "resnet50_cifar10", 'cifar10', strategy))
else:
    print('Server not started')
TOP1: airplane, score: 0.99997282028198242188
Check the output
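If you run ResNet50_imagenet2012 and see output similar to the following:
TOP1: Amanita毒蝇伞,伞菌目,鹅膏菌科,鹅膏菌属,主要分布于我国黑龙江、吉林、四川、西藏、云南等地,有毒, score: 0.99119007587432861328
then inference succeeded. The label is returned in Chinese; it describes the fly agaric, a poisonous mushroom of the genus Amanita (family Amanitaceae, order Agaricales) found mainly in Heilongjiang, Jilin, Sichuan, Tibet, and Yunnan.
If you run ResNet50_cifar10, the output should be similar to:
TOP1: airplane, score: 0.99997282028198242188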
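To make the TOP1 and TOP5 outputs more concrete, the sketch below ranks class scores in plain Python. It assumes the score is a softmax probability computed from the network's raw outputs; the tutorial does not state how TinyMS derives the score, so treat this as an illustration of the idea rather than the library's implementation. The class list is the one given in section 6.1, and the function names are hypothetical.

```python
import math

CIFAR10_CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=1):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up raw scores for illustration; a real run would use the model output.
example_logits = [9.0, 1.0, 2.5, 0.5, 0.0, 0.3, -1.0, 0.8, 3.0, 1.2]
for label, score in top_k(example_logits, CIFAR10_CLASSES, k=5):
    print(f"{label}: {score:.4f}")
```

With k=1 this corresponds to the TOP1_CLASS strategy, and with k=5 to TOP5_CLASS.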
Switch models
To try training on the ImageNet2012 poisonous mushroom dataset instead, run the following code, and also run the corresponding servable_json code block:
[ ]:
# download the imagenet2012 mushroom dataset
imagenet_path = '/root/mushrooms'
if not os.path.exists(imagenet_path):
    !wget -P /root/ https://ascend-tutorials.obs.cn-north-4.myhuaweicloud.com/resnet-50/mushrooms/mushrooms.zip
    !mkdir /root/mushrooms/
    !unzip /root/mushrooms.zip -d /root/mushrooms/
    print('************Download complete*************')
else:
    print('************Dataset already exists.**************')
# check ckpt folder exists or not
imagenet_ckpt_folder = '/etc/tinyms/serving/resnet50_imagenet2012'
imagenet_ckpt_path = '/etc/tinyms/serving/resnet50_imagenet2012/resnet50.ckpt'
if not os.path.exists(imagenet_ckpt_folder):
    !mkdir -p /etc/tinyms/serving/resnet50_imagenet2012
else:
    print('resnet50_imagenet2012 ckpt folder already exists')
epoch_size = 90 # default is 90
batch_size = 32
# set environment parameters
dataset_sink_mode = False
device_target = "CPU"
context.set_context(mode=context.GRAPH_MODE, device_target=device_target)
# set dataset parameters
imagenet_train_path = '/root/mushrooms/train'
train_dataset = ImageFolderDataset(imagenet_train_path, num_parallel_workers=4, shuffle=True)
train_dataset = imagefolder_transform.apply_ds(train_dataset, repeat_size=1, batch_size=32, is_training=True)
imagenet_eval_path = '/root/mushrooms/eval'
eval_dataset = ImageFolderDataset(imagenet_eval_path, num_parallel_workers=4, shuffle=True)
eval_dataset = imagefolder_transform.apply_ds(eval_dataset, repeat_size=1, batch_size=32, is_training=False)
step_size = train_dataset.get_dataset_size()
save_checkpoint_epochs = 5
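ckpoint_cb = ModelCheckpoint(prefix="resnet_imagenet2012", config=CheckpointConfig(
    save_checkpoint_steps=save_checkpoint_epochs * train_dataset.get_dataset_size(),
    keep_checkpoint_max=10))
# rebuild the network with 9 output classes so it matches class_num in the mushroom servable
# (the network built in step 1 has 10 outputs for cifar10)
net = resnet50(class_num=9)
model = Model(net)
# define the loss function
net_loss = SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
# define the optimizer
net_opt = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), 0.01, 0.9)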
model.compile(loss_fn=net_loss, optimizer=net_opt, metrics={"Accuracy": Accuracy()})
print('************************Start training*************************')
model.train(epoch_size, train_dataset, callbacks=[ckpoint_cb, LossMonitor()],dataset_sink_mode=dataset_sink_mode)
model.save_checkpoint(imagenet_ckpt_path)
print('************************Finished training*************************')
model.load_checkpoint(imagenet_ckpt_path)
print('************************Start evaluation*************************')
acc = model.eval(eval_dataset, dataset_sink_mode=dataset_sink_mode)
print("============== Accuracy:{} ==============".format(acc))