qai_hub.submit_quantize_job

submit_quantize_job(model, calibration_data, weights_dtype=QuantizeDtype.INT8, activations_dtype=QuantizeDtype.INT8, name=None, options='')

Submits a quantize job. The input model must be an ONNX model. On a completed job, the resulting target model is a quantized ONNX model in QDQ (quantize/dequantize) format.

Parameters:
  • model (Union[Model, TopLevelTracedModule, MLModel, ModelProto, bytes, str, Path]) – Model to quantize. The model must be an ONNX model.

  • calibration_data (Union[Dataset, Mapping[str, List[ndarray]], str]) – Data, Dataset, or Dataset ID used to calibrate quantization parameters.

  • name (Optional[str]) – Optional name for the job. Job names need not be unique.

  • weights_dtype (QuantizeDtype) – The data type to which weights will be quantized.

  • activations_dtype (QuantizeDtype) – The data type to which activations will be quantized.

  • options (str) – CLI-like flag options. See Quantize Options.

Returns:

job – Returns the quantize job.

Return type:

QuantizeJob

Examples

Submit an ONNX model for quantization:

import numpy as np
import qai_hub as hub

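# Path to the unquantized ONNX model.
model_file = "mobilenet_v2.onnx"
# Calibration data maps each model input name to a list of sample arrays.
calibration_data = {"t.1": [np.random.randn(1, 3, 224, 224).astype(np.float32)]}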
job = hub.submit_quantize_job(
    model_file,
    calibration_data,
    weights_dtype=hub.QuantizeDtype.INT8,
    activations_dtype=hub.QuantizeDtype.INT8,
    name="mobilenet",
)
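
Build calibration data with several samples per input:

The example above passes a single random sample. As a minimal sketch of the Mapping[str, List[ndarray]] form that calibration_data accepts, the helper below builds a list of samples for each input name. The helper name make_calibration_data, the sample count, and the seed are illustrative assumptions and not part of qai_hub. The input name "t.1" and its shape are taken from the example above. Random values are placeholders only; calibration normally uses samples representative of real inputs.

```python
import numpy as np


def make_calibration_data(input_shapes, num_samples=8, seed=0):
    """Return {input_name: [ndarray, ...]} with num_samples float32 arrays per input.

    Hypothetical helper for illustration; it is not part of the qai_hub API.
    """
    rng = np.random.default_rng(seed)
    return {
        name: [
            rng.standard_normal(shape).astype(np.float32)
            for _ in range(num_samples)
        ]
        for name, shape in input_shapes.items()
    }


# Input name and shape follow the example above.
calibration_data = make_calibration_data({"t.1": (1, 3, 224, 224)}, num_samples=8)
print(len(calibration_data["t.1"]), calibration_data["t.1"][0].shape)
```

The resulting dictionary can be passed as the calibration_data argument of submit_quantize_job in place of the single-sample dictionary.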