Study Notes (03): Intel® OpenVINO™ Toolkit Intermediate Course, Chapter 3: Tuning with the Model Optimizer

OpenVINO Toolkits

Model Optimizer

mo.py is an offline tool. Its purpose is to improve performance, not accuracy, and it offers many options.

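  • Convert TF/Caffe/MXNet/ONNX models to the IR format
  • Map to supported library calls/kernels/layers
  • Many (preprocessing) options, such as changing the RGB channel order or subtracting the mean
  • Optimize the network and fuse network structures
  • Change the data type: FP32, FP16, INT8
  • Cut off parts of the network
  • Support custom layers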

Usage help

usage: mo.py [options]

optional arguments:
  -h, --help            show this help message and exit
  --framework {tf,caffe,mxnet,kaldi,onnx}
                        Name of the framework used to train the input model.

Framework-agnostic parameters:
  --input_model INPUT_MODEL, -w INPUT_MODEL, -m INPUT_MODEL
                        Tensorflow*: a file with a pre-trained model (binary
                        or text .pb file after freezing). Caffe*: a model
                        proto file with model weights
  --model_name MODEL_NAME, -n MODEL_NAME
                        Model_name parameter passed to the final create_ir
                        transform. This parameter is used to name a network in
                        a generated IR and output .xml/.bin files.
  --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                        Directory that stores the generated IR. By default, it
                        is the directory from where the Model Optimizer is
                        launched.
  --input_shape INPUT_SHAPE
                        Input shape(s) that should be fed to an input node(s)
                        of the model. Shape is defined as a comma-separated
                        list of integer numbers enclosed in parentheses or
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
                        example, [N,C,H,W] is used for Caffe* models and
                        [N,H,W,C] for TensorFlow* models. Model Optimizer
                        performs necessary transformations to convert the
                        shape to the layout required by Inference Engine
                        (N,C,H,W). The shape should not contain undefined
                        dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. If there
                        are multiple inputs in the model, --input_shape should
                        contain definition of shape for each input separated
                        by a comma, for example: [1,3,227,227],[2,4] for a
                        model with two inputs with 4D and 2D shapes.
                        Alternatively, you can specify shapes with the --input
                        option.
  --scale SCALE, -s SCALE
                        All input values coming from original network inputs
                        will be divided by this value. When a list of inputs
                        is overridden by the --input parameter, this scale is
                        not applied for any input that does not match with the
                        original input of the model.
  --reverse_input_channels
                        Switch the input channels order from RGB to BGR (or
                        vice versa). Applied to original inputs of the model
                        if and only if a number of channels equals 3. Applied
                        after application of --mean_values and --scale_values
                        options, so numbers in --mean_values and
                        --scale_values go in the order of channels used in the
                        original model.
  --log_level {CRITICAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Logger level
  --input INPUT         Quoted list of comma-separated input nodes names with
                        shapes, data types, and values for freezing. The shape
                        and value are specified as space-separated lists. The
                        data type of input node is specified in braces and can
                        have one of the values: f64 (float64), f32 (float32),
                        f16 (float16), i64 (int64), i32 (int32), u8 (uint8),
                        boolean. For example, use the following format to set
                        input port 0 of the node `node_name1` with the shape
                        [3 4] as an input node and freeze output port 1 of the
                        node `node_name2` with the value [20 15] of int32
                        typeand the shape [2]: "0:node_name1[3
                        4],node_name2:1[2]{i32}->[20 15]".
  --output OUTPUT       The name of the output operation of the model. For
                        TensorFlow*, do not add :0 to this name.
  --mean_values MEAN_VALUES, -ms MEAN_VALUES
                        Mean values to be used for the input image per
                        channel. Values to be provided in the (R,G,B) or
                        [R,G,B] format. Can be defined for desired input of
                        the model, for example: "--mean_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --scale_values SCALE_VALUES
                        Scale values to be used for the input image per
                        channel. Values are provided in the (R,G,B) or [R,G,B]
                        format. Can be defined for desired input of the model,
                        for example: "--scale_values
                        data[255,255,255],info[255,255,255]". The exact
                        meaning and order of channels depend on how the
                        original model was trained.
  --data_type {FP16,FP32,half,float}
                        Data type for all intermediate tensors and weights. If
                        original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are quantized
                        to FP16.
  --disable_fusing      Turn off fusing of linear operations to Convolution
  --disable_resnet_optimization
                        Turn off resnet optimization
  --finegrain_fusing FINEGRAIN_FUSING
                        Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
  --disable_gfusing     Turn off fusing of grouped convolutions
  --enable_concat_optimization
                        Turn on concat optimization
  --move_to_preprocess  Move mean values to IR preprocess section
  --extensions EXTENSIONS
                        Directory or a comma separated list of directories
                        with extensions. To disable all extensions including
                        those that are placed at the default location, pass an
                        empty string.
  --batch BATCH, -b BATCH
                        Input batch size
  --version             Version of Model Optimizer
  --silent              Prevent any output messages except those that
                        correspond to log level equals ERROR, that can be set
                        with the following option: --log_level. By default,
                        log level is already ERROR.
  --freeze_placeholder_with_value FREEZE_PLACEHOLDER_WITH_VALUE
                        Replaces input layer with constant node with provided
                        value, for example: "node_name->True". It will be
                        DEPRECATED in future releases. Use --input option to
                        specify a value for freezing.
  --generate_deprecated_IR_V7
                        Force to generate old deprecated IR V7 with layers
                        from old IR specification.
  --keep_shape_ops      [ Experimental feature ] Enables `Shape` operation
                        with all children keeping. This feature makes model
                        reshapable in Inference Engine
  --disable_weights_compression
                        Disable compression and store weights with original
                        precision
  --progress            Enables model conversion progress display
  --stream_output       Switches model conversion progress display to a
                        multiline mode
  --transformations_config TRANSFORMATIONS_CONFIG
                        Use the configuration file with transformations
                        description.

TensorFlow*-specific parameters:
  --input_model_is_text
                        TensorFlow*: treat the input model file as a text
                        protobuf format. If not specified, the Model Optimizer
                        treats it as a binary file by default.
  --input_checkpoint INPUT_CHECKPOINT
                        TensorFlow*: variables file to load.
  --input_meta_graph INPUT_META_GRAPH
                        Tensorflow*: a file with a meta-graph of the model
                        before freezing
  --saved_model_dir SAVED_MODEL_DIR
                        TensorFlow*: directory representing non frozen model
  --saved_model_tags SAVED_MODEL_TAGS
                        Group of tag(s) of the MetaGraphDef to load, in string
                        format, separated by ','. For tag-set contains
                        multiple tags, all tags must be passed in.
  --tensorflow_custom_operations_config_update TENSORFLOW_CUSTOM_OPERATIONS_CONFIG_UPDATE
                        TensorFlow*: update the configuration file with node
                        name patterns with input/output nodes information.
  --tensorflow_use_custom_operations_config TENSORFLOW_USE_CUSTOM_OPERATIONS_CONFIG
                        Use the configuration file with custom operation
                        description.
  --tensorflow_object_detection_api_pipeline_config TENSORFLOW_OBJECT_DETECTION_API_PIPELINE_CONFIG
                        TensorFlow*: path to the pipeline configuration file
                        used to generate model created with help of Object
                        Detection API.
  --tensorboard_logdir TENSORBOARD_LOGDIR
                        TensorFlow*: dump the input graph to a given directory
                        that should be used with TensorBoard.
  --tensorflow_custom_layer_libraries TENSORFLOW_CUSTOM_LAYER_LIBRARIES
                        TensorFlow*: comma separated list of shared libraries
                        with TensorFlow* custom operations implementation.
  --disable_nhwc_to_nchw
                        Disables default translation from NHWC to NCHW

Caffe*-specific parameters:
  --input_proto INPUT_PROTO, -d INPUT_PROTO
                        Deploy-ready prototxt file that contains a topology
                        structure and layer attributes
  --caffe_parser_path CAFFE_PARSER_PATH
                        Path to Python Caffe* parser generated from
                        caffe.proto
  -k K                  Path to CustomLayersMapping.xml to register custom
                        layers
  --mean_file MEAN_FILE, -mf MEAN_FILE
                        Mean image to be used for the input. Should be a
                        binaryproto file
  --mean_file_offsets MEAN_FILE_OFFSETS, -mo MEAN_FILE_OFFSETS
                        Mean image offsets to be used for the input
                        binaryproto file. When the mean image is bigger than
                        the expected input, it is cropped. By default, centers
                        of the input image and the mean image are the same and
                        the mean image is cropped by dimensions of the input
                        image. The format to pass this option is the
                        following: "-mo (x,y)". In this case, the mean file is
                        cropped by dimensions of the input image with offset
                        (x,y) from the upper left corner of the mean image
  --disable_omitting_optional
                        Disable omitting optional attributes to be used for
                        custom layers. Use this option if you want to transfer
                        all attributes of a custom layer to IR. Default
                        behavior is to transfer the attributes with default
                        values and the attributes defined by the user to IR.
  --enable_flattening_nested_params
                        Enable flattening optional params to be used for
                        custom layers. Use this option if you want to transfer
                        attributes of a custom layer to IR with flattened
                        nested parameters. Default behavior is to transfer the
                        attributes without flattening nested parameters.

Mxnet-specific parameters:
  --input_symbol INPUT_SYMBOL
                        Symbol file (for example, model-symbol.json) that
                        contains a topology structure and layer attributes
  --nd_prefix_name ND_PREFIX_NAME
                        Prefix name for args.nd and argx.nd files.
  --pretrained_model_name PRETRAINED_MODEL_NAME
                        Name of a pretrained MXNet model without extension and
                        epoch number. This model will be merged with args.nd
                        and argx.nd files
  --save_params_from_nd
                        Enable saving built parameters file from .nd files
  --legacy_mxnet_model  Enable MXNet loader to make a model compatible with
                        the latest MXNet version. Use only if your model was
                        trained with MXNet version lower than 1.0.0
  --enable_ssd_gluoncv  Enable pattern matchers replacers for converting
                        gluoncv ssd topologies.

Kaldi-specific parameters:
  --counts COUNTS       Path to the counts file
  --remove_output_softmax
                        Removes the SoftMax layer that is the output layer
  --remove_memory       Removes the Memory layer and use additional inputs
                        outputs instead
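
The --input_shape description above says that each shape is a comma-separated list of integers in square brackets or parentheses, that several shapes are separated by commas, and that undefined dimensions (? or -1) are not allowed. A minimal sketch of a parser for that format follows. The function name parse_input_shapes is hypothetical and is not part of mo.py; it only illustrates the format described in the help text.

```python
import re

def parse_input_shapes(text):
    """Parse a string such as "[1,3,227,227],[2,4]" into lists of ints.

    Hypothetical helper for illustration; mo.py has its own parser.
    """
    # Each shape is a comma-separated list enclosed in [] or ().
    groups = re.findall(r"[\[(]([^\])]*)[\])]", text)
    shapes = []
    for group in groups:
        # int() raises ValueError for "?" or an empty entry.
        dims = [int(d) for d in group.split(",")]
        if any(d < 0 for d in dims):
            raise ValueError("undefined dimension is not allowed: " + group)
        shapes.append(dims)
    return shapes

# The two-input example from the help text: one 4D shape and one 2D shape.
print(parse_input_shapes("[1,3,227,227],[2,4]"))
```

Rejecting negative values up front mirrors the rule in the help text that a shape must not contain undefined dimensions.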

Usage examples

Convert the model with the following parameters:
mo.py --input_model mobilenetv2-7.onnx --mean_values=data[123.675,116.28,103.53] --scale_values=data[58.624,57.12,57.375]  --reverse_input_channels -o $lab_dir

Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/dc2-user/****200/03/exercise-1/mobilenetv2-7.onnx
        - Path for generated IR:        /home/dc2-user/****200/03/exercise-1/
        - IR output name:       mobilenetv2-7
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model
        - Input layers:         Not specified, inherited from the model
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         Not specified, inherited from the model
        - Mean values:  data[123.675,116.28,103.53]
        - Scale values:         data[58.624,57.12,57.375]
        - Scale factor:         Not specified
        - Precision of IR:      FP32
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       True
ONNX specific parameters:
Model Optimizer version:

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/dc2-user/****200/03/exercise-1/mobilenetv2-7.xml
[ SUCCESS ] BIN file: /home/dc2-user/****200/03/exercise-1/mobilenetv2-7.bin
[ SUCCESS ] Total execution time: 22.76 seconds.
[ SUCCESS ] Memory consumed: 160 MB.
It's been a while, check for a new version of Intel(R) Distribution of OpenVINO(TM) toolkit here https://software.intel.com/en-us/openvino-toolkit/choose-download?cid=&source=upgrade&content=2020_3_LTS or on the GitHub*
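
The arithmetic behind --mean_values, --scale_values and --reverse_input_channels can be sketched in plain Python. This is an illustration only, under two assumptions that the notes do not state: the original model expects its channels in the order in which the mean and scale values are given, and the application feeds pixels in the opposite order. The function name preprocess_pixel is hypothetical.

```python
# Values copied from the mo.py command above, in the original model's channel order.
MEAN = (123.675, 116.28, 103.53)
SCALE = (58.624, 57.12, 57.375)

def preprocess_pixel(pixel, mean=MEAN, scale=SCALE, reverse_channels=True):
    """Return the normalized value of one 3-channel pixel (illustrative only)."""
    # Reverse first, so that mean and scale are applied in the order of
    # channels used by the original model, as the help text describes.
    if reverse_channels:
        pixel = pixel[::-1]
    return tuple((v - m) / s for v, m, s in zip(pixel, mean, scale))

# A pixel given in reversed channel order whose values equal the means
# normalizes to zero in every channel.
print(preprocess_pixel((103.53, 116.28, 123.675)))
```

Because these steps are embedded in the generated IR, the inference application would not need to repeat them on the host side.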

Using the converted model for classification

python3 classification_sample.py -m mobilenetv2-7.xml --labels labels.txt -i images/2.jpeg
[ INFO ] Creating Inference Engine
[ INFO ] Loading network files:
        mobilenetv2-7.xml
        mobilenetv2-7.bin
[ INFO ] Preparing input blobs
[ WARNING ] Image images/2.jpeg is resized from (454, 500) to (224, 224)
[ INFO ] Batch size is 1
[ INFO ] Loading model to the plugin
[ INFO ] Starting inference in synchronous mode
[ INFO ] Processing output blob
[ INFO ] Top 10 results:
Image images/2.jpeg

classid probability
------- -----------
Cardigan Welsh corgi          16.9098377
hound, Walker foxhound        15.7053862
beagle                        15.6501904
Bernard, St Bernard           15.2576942
Swiss Mountain dog            14.2999191
foxhound                      14.2940111
Appenzeller                   13.7427340
Pembroke Welsh corgi          12.8577442
EntleBucher                   12.7613630
collie                        12.4419489
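
The values printed under "probability" are larger than 1, so they look like raw scores rather than normalized probabilities. That is a reading of the log above, not something the course states. If it holds, a softmax turns the scores into values that sum to 1. The sketch below applies it to the ten listed scores only, so the result is relative to those ten classes and not to the model's full output.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)  # subtract the maximum to avoid overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Top-10 scores copied from the log above.
top10 = [16.9098377, 15.7053862, 15.6501904, 15.2576942, 14.2999191,
         14.2940111, 13.7427340, 12.8577442, 12.7613630, 12.4419489]

probs = softmax(top10)
for score, p in zip(top10, probs):
    print("%.7f -> %.4f" % (score, p))
```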

Exercise 2: mo.py with parameters, for example --input_shape [1,300,300,3]

mo.py --input_model public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb --reverse_input_channels --input_shape [1,300,300,3] --input=image_tensor --transformations_config=$mo_dir/extensions/front/tf/ssd_v2_support.json --tensorflow_object_detection_api_pipeline_config=public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/pipeline.config --output=detection_classes,detection_scores,detection_boxes,num_detections --model_name ssd-mobilenet
Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/dc2-user/****200/03/exercise-2/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/frozen_inference_graph.pb
        - Path for generated IR:        /home/dc2-user/****200/03/exercise-2/.
        - IR output name:       ssd-mobilenet
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model
        - Input layers:         image_tensor
        - Output layers:        detection_classes,detection_scores,detection_boxes,num_detections
        - Input shapes:         [1,300,300,3]
        - Mean values:  Not specified
        - Scale values:         Not specified
        - Scale factor:         Not specified
        - Precision of IR:      FP32
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       True
TensorFlow specific parameters:
        - Input model in text protobuf format:  False
        - Path to model dump for TensorBoard:   None
        - List of shared libraries with TensorFlow custom layers implementation:        None
        - Update the configuration file with input/output node names:   None
        - Use configuration file used to generate the model with Object Detection API:  /home/dc2-user/****200/03/exercise-2/public/ssd_mobilenet_v2_coco/ssd_mobilenet_v2_coco_2018_03_29/pipeline.config
        - Use the config file:  None
Model Optimizer version:
[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/dc2-user/****200/03/exercise-2/./ssd-mobilenet.xml
[ SUCCESS ] BIN file: /home/dc2-user/****200/03/exercise-2/./ssd-mobilenet.bin
[ SUCCESS ] Total execution time: 98.30 seconds.
[ SUCCESS ] Memory consumed: 694 MB.
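
The --input_shape value [1,300,300,3] is given in the TensorFlow layout [N,H,W,C]. According to the help text, the Model Optimizer converts it to the (N,C,H,W) layout required by the Inference Engine. The reordering of the dimensions can be sketched as follows; the function name nhwc_to_nchw is hypothetical and only shows the index permutation, not the Model Optimizer's implementation.

```python
def nhwc_to_nchw(shape):
    """Reorder a 4D shape from [N,H,W,C] to [N,C,H,W]."""
    if len(shape) != 4:
        raise ValueError("expected a 4D shape")
    n, h, w, c = shape
    return [n, c, h, w]

# The shape passed to --input_shape in the command above.
print(nhwc_to_nchw([1, 300, 300, 3]))  # -> [1, 3, 300, 300]
```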

Exercise 3

Batch size 4 (--batch=4)

mo.py  --input_model mobilenetv2-7.onnx --mean_values=data[123.675,116.28,103.53] --scale_values=data[58.624,57.12,57.375] --reverse_input_channels --output_dir $lab_dir --batch=4 --model_name mobilenetv2-7-batch-4
Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/dc2-user/****200/03/exercise-3/mobilenetv2-7.onnx
        - Path for generated IR:        /home/dc2-user/****200/03/exercise-3/
        - IR output name:       mobilenetv2-7-batch-4
        - Log level:    ERROR
        - Batch:        4
        - Input layers:         Not specified, inherited from the model
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         Not specified, inherited from the model
        - Mean values:  data[123.675,116.28,103.53]
        - Scale values:         data[58.624,57.12,57.375]
        - Scale factor:         Not specified
        - Precision of IR:      FP32
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       True
ONNX specific parameters:
Model Optimizer version:

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/dc2-user/****200/03/exercise-3/mobilenetv2-7-batch-4.xml
[ SUCCESS ] BIN file: /home/dc2-user/****200/03/exercise-3/mobilenetv2-7-batch-4.bin
[ SUCCESS ] Total execution time: 20.22 seconds.
[ SUCCESS ] Memory consumed: 161 MB.
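
The effect of --batch=4 on the input shape can be sketched as replacing the first dimension. The starting shape [1,3,224,224] is an assumption: the 224x224 size comes from the resize warning in the classification log, while the 3 channels and the NCHW layout are not shown in these notes. The function name with_batch is hypothetical.

```python
def with_batch(shape, batch):
    """Return a copy of shape whose first (batch) dimension is set to batch."""
    if batch < 1:
        raise ValueError("batch size must be positive")
    return [batch] + list(shape[1:])

# Assumed original input shape of the model (see the lead-in).
original = [1, 3, 224, 224]
print(with_batch(original, 4))  # -> [4, 3, 224, 224]
```

An IR converted this way would expect four images per inference request instead of one.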

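Cutting the model: removing the first 5 layers

Passing the model's sixth layer, "mobilenetv20_features_conv0_fwd", as the Model Optimizer's --input argument cuts off the first 5 layers of the model: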

mo.py --input_model mobilenetv2-7.onnx --mean_values=data[123.675,116.28,103.53] --scale_values=data[58.624,57.12,57.375] --reverse_input_channels --output_dir $lab_dir --input="mobilenetv20_features_conv0_fwd" --model_name mobilenetv2-7-no-head
Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/dc2-user/****200/03/exercise-3/mobilenetv2-7.onnx
        - Path for generated IR:        /home/dc2-user/****200/03/exercise-3/
        - IR output name:       mobilenetv2-7-no-head
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model
        - Input layers:         mobilenetv20_features_conv0_fwd
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         Not specified, inherited from the model
        - Mean values:  data[123.675,116.28,103.53]
        - Scale values:         data[58.624,57.12,57.375]
        - Scale factor:         Not specified
        - Precision of IR:      FP32
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       True
ONNX specific parameters:
Model Optimizer version:

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/dc2-user/****200/03/exercise-3/mobilenetv2-7-no-head.xml
[ SUCCESS ] BIN file: /home/dc2-user/****200/03/exercise-3/mobilenetv2-7-no-head.bin
[ SUCCESS ] Total execution time: 19.55 seconds.
[ SUCCESS ] Memory consumed: 161 MB.
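
The idea of cutting at a named layer can be sketched on a simple linear list of layer names. Only "mobilenetv20_features_conv0_fwd" comes from these notes; the other names are hypothetical placeholders, and a real model is a graph, not a flat list.

```python
def cut_before(layers, new_input):
    """Keep new_input and everything after it; drop the layers in front of it."""
    index = layers.index(new_input)  # raises ValueError if the name is unknown
    return layers[index:]

# Hypothetical layer list: five placeholder layers, then the sixth layer from the notes.
layers = ["pre_0", "pre_1", "pre_2", "pre_3", "pre_4",
          "mobilenetv20_features_conv0_fwd", "rest_of_network"]

kept = cut_before(layers, "mobilenetv20_features_conv0_fwd")
print(len(layers) - len(kept), "layers removed; new first layer:", kept[0])
```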

Converting the model to FP16 precision with the Model Optimizer

mo.py  --input_model mobilenetv2-7.onnx --mean_values=data[123.675,116.28,103.53] --scale_values=data[58.624,57.12,57.375] --reverse_input_channels --output_dir $lab_dir --data_type=FP16 --model_name mobilenetv2-7-FP16
Model Optimizer arguments:
Common parameters:
        - Path to the Input Model:      /home/dc2-user/****200/03/exercise-3/mobilenetv2-7.onnx
        - Path for generated IR:        /home/dc2-user/****200/03/exercise-3/
        - IR output name:       mobilenetv2-7-FP16
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model
        - Input layers:         Not specified, inherited from the model
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         Not specified, inherited from the model
        - Mean values:  data[123.675,116.28,103.53]
        - Scale values:         data[58.624,57.12,57.375]
        - Scale factor:         Not specified
        - Precision of IR:      FP16
        - Enable fusing:        True
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False
        - Reverse input channels:       True
ONNX specific parameters:
Model Optimizer version:

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /home/dc2-user/****200/03/exercise-3/mobilenetv2-7-FP16.xml
[ SUCCESS ] BIN file: /home/dc2-user/****200/03/exercise-3/mobilenetv2-7-FP16.bin
[ SUCCESS ] Total execution time: 19.92 seconds.
[ SUCCESS ] Memory consumed: 161 MB.
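
What storing a value as FP16 means can be sketched with Python's struct module, which supports the IEEE 754 half-precision format through the "e" format code. The sample numbers are arbitrary values reused from the commands above, not actual model weights, and the function name to_fp16 is hypothetical.

```python
import struct

def to_fp16(value):
    """Round a Python float to half precision and convert it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

# Bytes per value: half precision versus single precision.
print("bytes per value:", struct.calcsize("<e"), "vs", struct.calcsize("<f"))

# Arbitrary sample values; the absolute rounding error is shown for each.
for v in (58.624, 57.12, 57.375):
    r = to_fp16(v)
    print(v, "->", r, "error", abs(v - r))
```

Each value takes half the storage of FP32, at the cost of a small rounding error. Whether that error matters for a given model has to be checked by comparing inference results.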