Keras (TensorFlow) Model Compression

Original article: How to compress your Keras model x5 smaller with TensorFlow model optimization

This tutorial demonstrates how to shrink a Keras model to about one fifth of its size with TensorFlow model optimization, which matters most when deploying to resource-constrained environments.

According to the official TensorFlow model optimization documentation, weight pruning means eliminating unnecessary values in weight tensors. Selected neural network parameters are set to zero, which removes what we estimate to be unnecessary connections between the layers of the network. This is done during training so that the network can adapt to the changes.
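
To make the idea concrete, here is a minimal sketch in plain Python. It assumes that "unnecessary" is judged by magnitude, so the weights closest to zero are the ones removed; the function name `prune_by_magnitude` and the sample values are hypothetical and only serve as illustration.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to 0.0.

    Assumption: magnitude is the pruning criterion. `sparsity` is the
    fraction of weights to zero out, between 0 and 1.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(round(sparsity * len(weights)))
    # Indices ordered by ascending magnitude; the first n_prune get zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


# Hypothetical weights of a tiny layer.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.1, -0.9]
pruned = prune_by_magnitude(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0, 0.0, -0.9]
```

In real training the pruning is applied repeatedly while the remaining weights keep being updated, which is what lets the network adapt.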

Here is a breakdown of how you can adopt this technique.

  1. Train the Keras model until it reaches an acceptable accuracy, as usual.
  2. Make the Keras layers or the whole model ready to be pruned.
  3. Create a pruning schedule and train the model for more epochs.
  4. Export the pruned model by stripping the pruning wrappers from it.
  5. Convert the Keras model to TensorFlow Lite, with optional quantization.
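
The pruning schedule in step 3 decides how much of the network is zeroed at each training step. The sketch below shows one plausible shape for it, a polynomial ramp from an initial to a final sparsity. The TensorFlow model optimization toolkit ships a schedule of this kind, but the function here is a hand-written illustration and not the library's code; its name, the default `power=3`, and the example step counts are assumptions.

```python
def polynomial_sparsity(step, initial_sparsity, final_sparsity,
                        begin_step, end_step, power=3):
    """Target sparsity at a given training step (illustrative only).

    Sparsity rises quickly at first and flattens out towards `end_step`,
    giving the network time to recover as pruning becomes more aggressive.
    """
    # Fraction of the pruning window that has elapsed, clamped to [0, 1].
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(1.0, max(0.0, progress))
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power


# Hypothetical schedule: go from 50% to 90% sparsity over 1000 steps.
steps = (0, 250, 500, 1000)
schedule = [polynomial_sparsity(s, 0.5, 0.9, 0, 1000) for s in steps]
for s, sparsity in zip(steps, schedule):
    print(f"step {s:4d}: target sparsity {sparsity:.2f}")
```

Starting from a moderate sparsity and increasing it gradually is gentler on accuracy than zeroing most weights in one shot.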

Here is what you get: a model 5 times smaller.

Size of the pruned model before compression: 12.52 Mb
Size of the pruned model after compression: 2.51 Mb
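
The gap between the two numbers comes from the zeros: a pruned model stored as-is takes the same space as a dense one, and only a compressor can exploit the repeated zero bytes. The sketch below reproduces the effect on synthetic data. It assumes a generic lossless compressor (`zlib` from the Python standard library) and random 32-bit float weights, so the exact ratio will differ from the figures above.

```python
import random
import struct
import zlib

random.seed(0)
n = 100_000

# Synthetic "dense" weights, plus a copy with the 80% smallest magnitudes zeroed.
dense = [random.gauss(0.0, 1.0) for _ in range(n)]
threshold = sorted(abs(w) for w in dense)[int(0.8 * n)]
sparse = [w if abs(w) >= threshold else 0.0 for w in dense]


def sizes(values):
    """Return (raw_bytes, compressed_bytes) for values stored as float32."""
    raw = struct.pack(f"{len(values)}f", *values)
    return len(raw), len(zlib.compress(raw))


dense_raw, dense_zip = sizes(dense)
sparse_raw, sparse_zip = sizes(sparse)

print(f"dense : {dense_raw} bytes raw, {dense_zip} bytes compressed")
print(f"sparse: {sparse_raw} bytes raw, {sparse_zip} bytes compressed")
```

The raw sizes are identical, which is why pruning alone does not shrink the file; the saving only appears once the file is compressed.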

The source code for this post is available on my GitHub and can be run in a Google Colab notebook.