qai_hub.submit_quantize_job
- submit_quantize_job(model, calibration_data, weights_dtype=QuantizeDtype.INT8, activations_dtype=QuantizeDtype.INT8, name=None, options='')
Submits a quantize job. The input model must be ONNX. The resulting target model on a completed job will be a quantized ONNX model in QDQ format.
- Parameters:
  - model (Union[Model, TopLevelTracedModule, MLModel, ModelProto, bytes, str, Path]) – Model to quantize. The model must be a PyTorch model or an ONNX model.
  - calibration_data (Union[Dataset, Mapping[str, List[ndarray]], str]) – Data, Dataset, or Dataset ID used to calibrate quantization parameters (see the examples below for passing an uploaded Dataset).
  - name (Optional[str]) – Optional name for the job. Job names need not be unique.
  - weights_dtype (QuantizeDtype) – The data type to which weights will be quantized.
  - activations_dtype (QuantizeDtype) – The data type to which activations will be quantized.
  - options (str) – CLI-like flag options. See Quantize Options.
- Returns:
job – Returns the quantize job.
- Return type:
  QuantizeJob
Examples
Submit an ONNX model for quantization:
import numpy as np
import qai_hub as hub

model_file = "mobilenet_v2.onnx"
calibration_data = {"t.1": [np.random.randn(1, 3, 224, 224).astype(np.float32)]}

job = hub.submit_quantize_job(
    model_file,
    calibration_data,
    weights_dtype=hub.QuantizeDtype.INT8,
    activations_dtype=hub.QuantizeDtype.INT8,
    name="mobilenet",
)
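Once the job completes, the quantized QDQ ONNX model can be retrieved from the job. The sketch below is a minimal continuation of the example above; it assumes QuantizeJob exposes wait() and get_target_model(), and that the returned Model supports download(), in the same way other AI Hub job types do. Verify these accessors against your installed qai_hub version.

# Continuing from the example above; these calls are assumptions based on
# how other AI Hub job types expose their results.
job.wait()                                   # block until the quantize job finishes
quantized_model = job.get_target_model()     # assumed accessor for the QDQ ONNX target model
quantized_model.download("mobilenet_v2_quantized.onnx")  # assumed download helper

Calibration data can also be supplied as a previously uploaded Dataset (or its dataset ID) instead of an in-memory mapping, which lets the same calibration set be reused across jobs. A sketch assuming the input tensor name "t.1" from the example above matches the model's input:

import numpy as np
import qai_hub as hub

# Upload a small calibration set once; the tensor name "t.1" is assumed to
# match the ONNX model's input name.
samples = {"t.1": [np.random.randn(1, 3, 224, 224).astype(np.float32) for _ in range(10)]}
calibration_dataset = hub.upload_dataset(samples)

job = hub.submit_quantize_job(
    "mobilenet_v2.onnx",
    calibration_dataset,  # a Dataset object; its dataset ID string also works
    weights_dtype=hub.QuantizeDtype.INT8,
    activations_dtype=hub.QuantizeDtype.INT8,
    name="mobilenet-dataset-calibration",
)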
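Both examples above quantize weights and activations to INT8; other QuantizeDtype members can be substituted where the target runtime supports them.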