Linking
Link jobs combine multiple models into a single deployable asset containing multiple graphs that share weights. Weights are shared automatically based on name and content. A common use case is supporting different instantiations of the same network, such as different input shapes. Partial or no weight sharing is also possible.
The input models to a link job must be compiled through Qualcomm® AI Hub and target the same device.
Note
Link jobs are exclusive to QNN context binaries.
Variable input size example
In this example, we have a segmentation network that we would like to execute in either landscape or portrait mode on a mobile phone. We assume the model has been trained to support both input sizes.
To create this model, we start by compiling two separate QNN context binaries, one for each input shape. We then link these two assets together into a single context binary.
This example requires the torchvision package. First, we compile the single-graph context binaries:
from typing import Tuple

import torch
import torchvision

import qai_hub as hub


# Using pre-trained FCN ResNet-50 semantic segmentation model
# We wrap the model in a module to convert the output dictionary containing
# both the primary output and auxiliary output, to return only the primary.
class FCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torchvision.models.segmentation.fcn_resnet50(pretrained=True)

    def forward(self, x):
        return self.model(x)["out"]


torch_model = FCN()
torch_model.eval()

input_shape_landscape: Tuple[int, ...] = (1, 3, 256, 384)
input_shape_portrait: Tuple[int, ...] = (1, 3, 384, 256)

# Trace model
#
# The traced model will be the same for both input shapes, so the asset can be
# re-used for both compile jobs. For models where different input shapes
# trigger different code paths, link jobs will still work, but both inputs will
# need to be traced separately.
example_input = torch.rand(input_shape_landscape)
pt_model = torch.jit.trace(torch_model, example_input)

src_model = hub.upload_model(pt_model)

device = hub.Device("Samsung Galaxy S24 (Family)")

# Compile both models
compile_job_landscape = hub.submit_compile_job(
    src_model,
    name="FCN Landscape",
    device=device,
    options="--target_runtime qnn_context_binary --qnn_graph_name landscape",
    input_specs=dict(image=input_shape_landscape),
)
assert isinstance(compile_job_landscape, hub.CompileJob)

compile_job_portrait = hub.submit_compile_job(
    src_model,
    name="FCN Portrait",
    device=device,
    options="--target_runtime qnn_context_binary --qnn_graph_name portrait",
    input_specs=dict(image=input_shape_portrait),
)
assert isinstance(compile_job_portrait, hub.CompileJob)

model_landscape = compile_job_landscape.get_target_model()
model_portrait = compile_job_portrait.get_target_model()
assert isinstance(model_landscape, hub.Model)
assert isinstance(model_portrait, hub.Model)
Now we are ready to call submit_link_job() to link the models together:
# Link the models
link_job = hub.submit_link_job(
    [model_landscape, model_portrait],
    name="FCN Landscape+Portrait",
)
assert isinstance(link_job, hub.LinkJob)

linked_model = link_job.get_target_model()
assert isinstance(linked_model, hub.Model)
The resulting model is a context binary that contains multiple graphs. To profile or perform inference on such a model, we need to specify which graph to use with --qnn_options context_enable_graphs=<graph_name>. See Qualcomm® AI Engine Direct Options for more information.
The available graph names in a context binary are listed in the Metadata section of the model's Qualcomm® AI Hub page. In the example above, we deliberately set the graph names during compilation using the --qnn_graph_name option. To profile the model above:
# Profile the portrait graph
profile_job = hub.submit_profile_job(
    linked_model,
    name="FCN Portrait",
    device=device,
    options="--qnn_options context_enable_graphs=portrait",
)
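Inference works the same way. The sketch below is illustrative rather than part of the original example: it assumes NumPy is available, reuses the input_shape_portrait and device defined above, and feeds a random array under the same image input name used at compile time.

import numpy as np

# Run inference using the portrait graph of the linked context binary.
# The random input is a stand-in for a real preprocessed image.
sample_image = np.random.rand(*input_shape_portrait).astype(np.float32)

inference_job = hub.submit_inference_job(
    linked_model,
    name="FCN Portrait Inference",
    device=device,
    inputs=dict(image=[sample_image]),
    options="--qnn_options context_enable_graphs=portrait",
)
assert isinstance(inference_job, hub.InferenceJob)
output = inference_job.download_output_data()

Selecting the landscape graph instead only requires swapping the graph name in --qnn_options and providing an input of the matching shape.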