qai_hub.submit_inference_job
- submit_inference_job(model, device, inputs, name=None, options='', retry=True)
Submits an inference job.
- Parameters:
  - model (Union[Model, MLModel, bytes, str, Path]) – Model to run inference with. Must be one of the following: (1) a Model object from a compile job via qai_hub.CompileJob.get_target_model(), (2) any TargetModel, (3) a path to any TargetModel.
  - device (Union[Device, List[Device]]) – Devices on which to run the job.
  - inputs (Union[Dataset, Mapping[str, List[ndarray]], str]) – If a Dataset, its schema must match the model. For example, if the model is a target model from a compile job submitted with input_shapes=dict(a=(1, 2), b=(1, 3)), the dataset must also be created with dict(a=<list_of_np_array>, b=<list_of_np_array>). See qai_hub.submit_compile_job() for details. If a Dict, it is uploaded as a new Dataset, equivalent to calling qai_hub.upload_dataset() with an arbitrary name; note that dicts are ordered in Python 3.7+ and we rely on that order to match the schema (see the sketch after this list). If a str, it is a path to a Dataset in h5 format.
  - name (Optional[str]) – Optional name for the job. Job names need not be unique.
  - options (str) – CLI-like flag options. See Profile Options.
  - retry (bool) – If job creation fails due to rate limiting, keep retrying periodically until creation succeeds.
- Returns:
job – The inference job, or a list of inference jobs when multiple devices are specified.
- Return type:
InferenceJob | List[InferenceJob]
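When a list of devices is passed, a list of jobs is returned, one per device. A minimal sketch (the device names are illustrative):

import qai_hub as hub
import numpy as np

input_tensor = np.random.random((1, 3, 227, 227)).astype(np.float32)

# Submitting to multiple devices returns one InferenceJob per device.
jobs = hub.submit_inference_job(
    "squeeze_net.tflite",
    device=[hub.Device("Samsung Galaxy S23"), hub.Device("Samsung Galaxy S22")],
    inputs=dict(image=[input_tensor]),
)
assert isinstance(jobs, list)  # one InferenceJob per device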
Examples
Submit a TFLite model for inference on a Samsung Galaxy S23:
import qai_hub as hub
import numpy as np

# TFLite model path
tflite_model = "squeeze_net.tflite"

# Set up input data
input_tensor = np.random.random((1, 3, 227, 227)).astype(np.float32)

# Submit inference job
job = hub.submit_inference_job(
    tflite_model,
    device=hub.Device("Samsung Galaxy S23"),
    name="squeeze_net (1, 3, 227, 227)",
    inputs=dict(image=[input_tensor]),
)

# Load the output data into a dictionary of numpy arrays
output_tensors = job.download_output_data()
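A sketch of the compile-job path from the model parameter above, assuming a source model at "model.pt" (a placeholder) and the input_shapes convention of qai_hub.submit_compile_job():

import qai_hub as hub
import numpy as np

# Compile a source model first ("model.pt" and its shape are placeholders;
# see qai_hub.submit_compile_job() for accepted source formats).
compile_job = hub.submit_compile_job(
    "model.pt",
    device=hub.Device("Samsung Galaxy S23"),
    input_shapes=dict(image=(1, 3, 224, 224)),
)

# Run inference with the compiled target model; the dict key must match
# the input_shapes schema used at compile time.
inference_job = hub.submit_inference_job(
    compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23"),
    inputs=dict(image=[np.random.random((1, 3, 224, 224)).astype(np.float32)]),
)
output_tensors = inference_job.download_output_data()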
For more examples, see Running Inference.