Getting started


We recommend using Miniconda to manage your Python versions and environments.

Step 1: Python environment

Install Miniconda on your machine.

Windows: When the installation finishes, open Anaconda Prompt from the Start menu.
macOS/Linux: When the installation finishes, open a new shell window.

Set up an environment for Qualcomm® AI Hub:

conda create python=3.8 -n qai_hub
conda activate qai_hub


If any of these steps fails with an SSL: CERTIFICATE_VERIFY_FAILED error, an SSL interception and traffic inspection tool is most likely active on your machine or network. Ask your IT department how to set up certificates for Python pip and the Python Requests library; they should provide you with a new certificate file. You also need to add this certificate to Miniconda before creating your environment, which can be done with conda config --set ssl_verify <path to certificate>.
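
Before pointing conda or pip at a certificate file, it can help to confirm that Python is able to read it. The following is a minimal sketch using only the standard library; the helper name ca_bundle_loads and the example path are hypothetical, and a successful load only shows the file is a readable CA bundle, not that it is the right one for your network.

```python
import ssl


def ca_bundle_loads(cert_path: str) -> bool:
    """Return True if Python's ssl module can load cert_path as a CA bundle."""
    try:
        # Raises FileNotFoundError for a missing file and ssl.SSLError for
        # unreadable content; both are subclasses of OSError.
        ssl.create_default_context(cafile=cert_path)
    except OSError:
        return False
    return True


# Hypothetical path: replace with the file provided by your IT department.
print(ca_bundle_loads("/path/to/corporate-ca.pem"))
```

If this prints False, the path or the file format is worth checking before changing any conda or pip settings.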

Step 2: Install Python client


On Windows for ARM, AI Hub supports only the non-ARM 32-bit and 64-bit versions of Python. Installation will fail when using native ARM versions of Python.
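
To see which kind of interpreter is active before installing, a short check along the following lines may help. This is a sketch, not part of the AI Hub client: the function name is hypothetical, and it assumes that a native ARM build of Python on Windows reports its machine type as "ARM64" (or "ARM") through platform.machine().

```python
import platform


def is_native_arm_windows(system: str, machine: str) -> bool:
    """Return True for a native ARM build of Python running on Windows.

    Assumption: native ARM builds report "ARM64" or "ARM" as the machine type.
    """
    return system == "Windows" and machine.upper() in {"ARM64", "ARM"}


if is_native_arm_windows(platform.system(), platform.machine()):
    print("Native ARM Python detected; installing qai-hub is expected to fail.")
else:
    print("Interpreter architecture:", platform.machine())
```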

pip3 install qai-hub

Step 3: Sign in

Go to Qualcomm® AI Hub and sign in with your Qualcomm ID to view information about jobs you create.

Once signed in, navigate to Account -> Settings -> API Token. This page provides an API token that you can use to configure your client.

Step 4: Configure API Token

Next, configure the client with your API token using the following command in your terminal:

qai-hub configure --api_token INSERT_API_TOKEN
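
If you want to confirm offline that the command stored a token, a check like the one below is one option. It is a sketch under a stated assumption: that the client keeps its configuration in ~/.qai_hub/client.ini with an api_token entry in an [api] section. That location and layout are not confirmed by this guide, so verify them for your client version. The helper name has_api_token is hypothetical.

```python
import configparser
from pathlib import Path

# Assumed location of the client configuration file (not confirmed by this guide).
DEFAULT_CONFIG = Path.home() / ".qai_hub" / "client.ini"


def has_api_token(config_path: Path = DEFAULT_CONFIG) -> bool:
    """Return True if config_path holds a non-empty api_token in an [api] section."""
    # Interpolation is disabled so a '%' character in the token is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # a missing file is skipped without raising
    return bool(parser.get("api", "api_token", fallback="").strip())
```

Calling has_api_token() with no arguments checks the assumed default location; it only tells you a token is present, not that the token is valid.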

You can check that your API token is installed correctly by fetching a list of available devices. To do that, type the following in a Python terminal:

import qai_hub as hub

hub.get_devices()
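
The device list can be long, so a small summary is sometimes easier to read. The sketch below groups devices by OS version. It assumes that hub.get_devices() returns objects with an os attribute; the helper count_devices_by_os and the main function are hypothetical names introduced for illustration.

```python
from collections import Counter


def count_devices_by_os(devices) -> Counter:
    """Count devices per OS version, reading each device's `os` attribute."""
    return Counter(getattr(device, "os", "unknown") for device in devices)


def main() -> None:
    # Imported here so the helper above can be used without the client installed.
    import qai_hub as hub

    devices = hub.get_devices()
    print(f"{len(devices)} devices available")
    for os_version, count in sorted(count_devices_by_os(devices).items()):
        print(f"  OS {os_version}: {count}")
```

Call main() once your API token is configured; an authentication error at that point suggests the token was not installed correctly.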

If you run into any issues, please contact us.

Quick example (PyTorch)

Once you have set up your Qualcomm® AI Hub environment, the next step is to submit a profiling job. First, install the dependencies of this example:

pip3 install "qai-hub[torch]"


If any of the snippets fail with an API authentication error, it means that you do not have a valid API token installed. Please see Installation to learn how to set this up.

Submit an analysis of the MobileNet v2 network:

import numpy as np
import requests
import torch
from PIL import Image
from torchvision.models import mobilenet_v2

import qai_hub as hub

# Using pre-trained MobileNet, switched to inference mode before tracing
torch_model = mobilenet_v2(pretrained=True)
torch_model.eval()

# Step 1: Trace model
input_shape = (1, 3, 224, 224)
example_input = torch.rand(input_shape)
traced_torch_model = torch.jit.trace(torch_model, example_input)

# Step 2: Compile model
device = hub.Device("Samsung Galaxy S24 (Family)")
compile_job = hub.submit_compile_job(
    model=traced_torch_model,
    device=device,
    input_specs=dict(image=input_shape),  # "image" names the model input
)
assert isinstance(compile_job, hub.CompileJob)

# Step 3: Profile on cloud-hosted device
target_model = compile_job.get_target_model()
assert isinstance(target_model, hub.Model)
profile_job = hub.submit_profile_job(
    model=target_model,
    device=device,
)

# Step 4: Run inference on cloud-hosted device
sample_image_url = ""  # set to the URL of a sample input image
response = requests.get(sample_image_url, stream=True)
response.raw.decode_content = True
image = Image.open(response.raw).resize((224, 224))
# Scale to [0, 1], reorder HWC -> CHW, and add a batch dimension
input_array = np.expand_dims(
    np.transpose(np.array(image, dtype=np.float32) / 255.0, (2, 0, 1)), axis=0
)

# Run inference using the on-device model on the input image
inference_job = hub.submit_inference_job(
    model=target_model,
    device=device,
    inputs=dict(image=[input_array]),
)
assert isinstance(inference_job, hub.InferenceJob)

# Download inference output as dict[str, list[np.ndarray]], where
# str is the name of the output and
# list[np.ndarray] is the output as a batch of NumPy arrays
on_device_output = inference_job.download_output_data()
assert isinstance(on_device_output, dict)

# Step 5: Post-processing the on-device output
output_name = list(on_device_output.keys())[0]
out = on_device_output[output_name][0]
on_device_probabilities = np.exp(out) / np.sum(np.exp(out), axis=1)

# Read the class labels for imagenet
sample_classes = ""
response = requests.get(sample_classes, stream=True)
response.raw.decode_content = True
categories = [str(s.strip()) for s in response.raw]

# Print top five predictions for the on-device model
print("Top-5 On-Device predictions:")
top5_classes = np.argsort(on_device_probabilities[0], axis=0)[-5:]
for c in reversed(top5_classes):
    print(f"{c} {categories[c]:20s} {on_device_probabilities[0][c]:>6.1%}")

# Step 6: Download model
model = compile_job.download_target_model()

This will submit a compile job and then a profile job, printing the URLs of both jobs. Finally, the code performs an inference job on some sample data. View all your jobs at /jobs/.
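
Step 5 of the example turns the raw model output into probabilities with a softmax and then picks the five most likely classes. The sketch below shows the same idea using only the standard library, so it runs without NumPy or the AI Hub client. It is not the code from the example: it subtracts the largest logit before exponentiating, a common variant that avoids overflow for large values, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities: Sequence[float], k: int = 5) -> List[Tuple[int, float]]:
    """Return (class_index, probability) pairs, highest probability first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for a four-class model, for illustration only
for index, probability in top_k(softmax([0.5, 2.0, -1.0, 1.0]), k=2):
    print(f"class {index}: {probability:.1%}")
```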

You can programmatically query the status of the job:

status = profile_job.get_status()
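
If you prefer not to block on a download call, the status query can be wrapped in a simple polling loop. The sketch below is one way to do that. It assumes the object returned by get_status() exposes a boolean finished attribute, which this guide does not confirm; the function name and the default intervals are hypothetical.

```python
import time


def wait_until_finished(job, poll_seconds: float = 10.0, timeout_seconds: float = 3600.0):
    """Poll job.get_status() until it reports finished, then return that status.

    Assumption: the status object has a boolean `finished` attribute.
    Raises TimeoutError if the job is still running when the timeout expires.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = job.get_status()
        if status.finished:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(poll_seconds)
```

With the example above, wait_until_finished(profile_job) would return the final status object, which you can inspect before downloading any results.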

You can access the results of the job using the snippet below. There are three main parts:

  • Profile: Results of the profiling in JSON format.

  • Target Model: Optimized model ready for deployment.

  • Results: Folder containing all the artifacts of the job (including logs).

Note that these are blocking API calls that wait until the job finishes:

# Download profile results as JSON (blocking call)
profile = profile_job.download_profile()

# Download an optimized model (blocking call)
model = compile_job.download_target_model()

# Download results to current directory (blocking call)
profile_job.download_results(".")
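
Once the profile has been downloaded, you will usually want a single headline number from it. The sketch below extracts an estimated inference time. The key names execution_summary and estimated_inference_time, and the unit of microseconds, are assumptions about the JSON layout that this guide does not confirm; print the profile first and adjust the keys if your output differs. The function name is hypothetical.

```python
from typing import Optional


def inference_time_ms(profile: dict) -> Optional[float]:
    """Return the estimated inference time in milliseconds, or None if absent.

    Assumption: profile["execution_summary"]["estimated_inference_time"]
    holds a duration in microseconds.
    """
    micros = profile.get("execution_summary", {}).get("estimated_inference_time")
    return None if micros is None else micros / 1000.0


# Made-up profile data, for illustration only
print(inference_time_ms({"execution_summary": {"estimated_inference_time": 2500}}))
```

Returning None instead of raising keeps the helper usable on profiles whose layout differs from the assumed one.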

For more information, continue on to Profiling Models or refer to the API documentation.