Creating a C++ Model Wrapper
Overview
This guide explains how to integrate a PyTorch model into Neuralyzer by creating a C++ wrapper. There are two approaches:
- Runtime JSON model — define the model’s I/O in a JSON file (no C++ needed)
- Compiled C++ model — create a `ModelBase` subclass with custom logic
Most models can use the JSON approach. The compiled approach is only needed when the model requires custom pre/post-processing logic in `forward()` that cannot be expressed through the standard encoder/decoder pipeline.
Prerequisites
- A trained PyTorch model exported as either:
  - AOT Inductor (`.pt2`) — recommended; see the AOT Inductor Tutorial
  - TorchScript (`.pt`) — via `torch.jit.trace()` or `torch.jit.script()`
- Knowledge of the model’s input and output tensor shapes
Approach 1: Runtime JSON Model (Recommended)
For models with straightforward I/O (images in, masks/points/lines out), define a JSON specification file and register it at runtime. No recompilation needed.
Step 1: Determine Model I/O
Identify each input and output tensor:
- Name — a unique identifier (e.g. `"image"`, `"heatmap"`)
- Shape — dimensions excluding the batch axis (e.g. `[3, 256, 256]`)
- Data type — which Neuralyzer data type maps to this tensor
- Encoder/Decoder — which conversion strategy to use
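While gathering shapes, it can help to sanity-check each slot’s per-sample element count. The sketch below uses a hypothetical `SlotSpec` struct and `elementsPerSample` helper (not part of the Neuralyzer API) and only the C++ standard library:

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical mirror of one slot's spec, for back-of-envelope sizing only.
struct SlotSpec {
    std::string name;
    std::vector<int64_t> shape; // per-sample dims, batch axis excluded
};

// Product of the per-sample dimensions = elements per sample.
int64_t elementsPerSample(SlotSpec const & slot) {
    return std::accumulate(slot.shape.begin(), slot.shape.end(),
                           int64_t{1}, std::multiplies<int64_t>());
}

// elementsPerSample({"image", {3, 256, 256}}) == 196608 elements per frame
```

Multiplying by the batch size and element width then gives the buffer size one batch will occupy.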
Step 2: Write the JSON Spec
Create a `.json` file alongside your model weights:

```json
{
  "model_id": "whisker_detector",
  "display_name": "Whisker Detector",
  "description": "Detects whisker positions from video frames",
  "weights_path": "whisker_detector.pt2",
  "backend": "auto",
  "preferred_batch_size": 0,
  "max_batch_size": 0,
  "inputs": [
    {
      "name": "image",
      "shape": [3, 256, 256],
      "description": "Input video frame (RGB)",
      "recommended_encoder": "ImageEncoder",
      "is_static": false
    }
  ],
  "outputs": [
    {
      "name": "whisker_mask",
      "shape": [1, 256, 256],
      "description": "Predicted whisker probability map",
      "recommended_decoder": "TensorToLine2D"
    }
  ]
}
```

JSON Schema Reference
Top-level fields:
| Field | Type | Required | Description |
|---|---|---|---|
| `model_id` | string | Yes | Unique identifier for registration |
| `display_name` | string | Yes | Shown in the UI model selector |
| `description` | string | No | Tooltip text in the UI |
| `weights_path` | string | No | Path to model file. Relative paths resolve against the JSON file’s directory. |
| `backend` | string | No | `"auto"` (default), `"torchscript"`, or `"aotinductor"` |
| `preferred_batch_size` | int | No | Default batch size in the UI (0 = model decides) |
| `max_batch_size` | int | No | Maximum batch size (0 = unlimited) |
| `inputs` | array | Yes | Input slot specifications |
| `outputs` | array | Yes | Output slot specifications |
Slot fields:
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | Yes | Slot identifier; must match the model’s expected input order |
| `shape` | array of int | Yes | Tensor shape excluding the batch dimension |
| `description` | string | No | Human-readable description |
| `recommended_encoder` | string | No | `"ImageEncoder"`, `"Point2DEncoder"`, `"Mask2DEncoder"`, `"Line2DEncoder"` |
| `recommended_decoder` | string | No | `"TensorToPoint2D"`, `"TensorToMask2D"`, `"TensorToLine2D"` |
| `is_static` | bool | No | If true, this is a memory input set once by the user |
| `is_boolean_mask` | bool | No | If true, values are 0/1 flags |
| `sequence_dim` | int | No | Axis index for frame sequences (-1 = none) |
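To illustrate how the Required column can be checked before registration, here is a stdlib-only sketch. `missingRequiredFields` is a hypothetical helper operating on an already-parsed key set; it is not the registry’s actual validation logic:

```cpp
#include <set>
#include <string>
#include <vector>

// Sketch: report which required top-level keys are absent from a parsed
// spec object. Required key names are taken from the schema table above.
std::vector<std::string> missingRequiredFields(std::set<std::string> const & keys) {
    std::vector<std::string> missing;
    for (std::string const req : {"model_id", "display_name", "inputs", "outputs"})
        if (keys.find(req) == keys.end())
            missing.push_back(req);
    return missing;
}
```

Running this on the key set of a candidate JSON object before registration gives an early, user-friendly error instead of a failure deep inside model loading.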
Step 3: Register in Neuralyzer
The JSON model can be registered programmatically:

```cpp
auto result = dl::ModelRegistry::instance().registerFromJson("path/to/spec.json");
```

The model will then appear in the Deep Learning widget’s model selector.
Approach 2: Compiled C++ Model
For models requiring custom logic (e.g., multi-step inference, output-to-input feedback loops, custom tensor manipulation), create a `ModelBase` subclass.
Step 1: Create the Header
Create a header in `src/DeepLearning/models_v2/your_model/`:

```cpp
#ifndef WHISKERTOOLBOX_YOUR_MODEL_HPP
#define WHISKERTOOLBOX_YOUR_MODEL_HPP

#include "models_v2/ModelBase.hpp"

#include <filesystem>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

namespace dl {

// Forward declare to keep the execution backend out of the header
class ModelExecution;

class YourModel : public ModelBase {
public:
    YourModel();
    ~YourModel() override;

    // Non-copyable, movable
    YourModel(YourModel const &) = delete;
    YourModel & operator=(YourModel const &) = delete;
    YourModel(YourModel &&) noexcept;
    YourModel & operator=(YourModel &&) noexcept;

    [[nodiscard]] std::string modelId() const override;
    [[nodiscard]] std::string displayName() const override;
    [[nodiscard]] std::string description() const override;

    [[nodiscard]] std::vector<TensorSlotDescriptor> inputSlots() const override;
    [[nodiscard]] std::vector<TensorSlotDescriptor> outputSlots() const override;

    void loadWeights(std::filesystem::path const & path) override;
    [[nodiscard]] bool isReady() const override;

    [[nodiscard]] int preferredBatchSize() const override;
    [[nodiscard]] int maxBatchSize() const override;

    std::unordered_map<std::string, torch::Tensor>
    forward(std::unordered_map<std::string, torch::Tensor> const & inputs) override;

private:
    std::unique_ptr<ModelExecution> _execution;
};

} // namespace dl

#endif
```

Step 2: Implement the Model
```cpp
#include "YourModel.hpp"

#include "models_v2/ModelExecution.hpp"
#include "device/DeviceManager.hpp"
#include "registry/ModelRegistry.hpp"

#include <stdexcept>

namespace dl {

// ── Self-registration ──
DL_REGISTER_MODEL(YourModel);

// ── Metadata ──
std::string YourModel::modelId() const { return "your_model"; }
std::string YourModel::displayName() const { return "Your Model"; }
std::string YourModel::description() const {
    return "Description of what your model does.";
}

int YourModel::preferredBatchSize() const { return 0; }
int YourModel::maxBatchSize() const { return 0; }

// ── Slot Descriptors ──
std::vector<TensorSlotDescriptor> YourModel::inputSlots() const {
    return {
        TensorSlotDescriptor{
            .name = "image",
            .shape = {3, 256, 256},
            .description = "Input video frame",
            .recommended_encoder = "ImageEncoder",
        },
    };
}

std::vector<TensorSlotDescriptor> YourModel::outputSlots() const {
    return {
        TensorSlotDescriptor{
            .name = "output_mask",
            .shape = {1, 256, 256},
            .description = "Predicted segmentation mask",
            .recommended_decoder = "TensorToMask2D",
        },
    };
}

// ── Weight Loading ──
void YourModel::loadWeights(std::filesystem::path const & path) {
    _execution = std::make_unique<ModelExecution>();
    _execution->load(path); // auto-detects backend from extension
}

bool YourModel::isReady() const {
    return _execution && _execution->isLoaded();
}

// ── Inference ──
std::unordered_map<std::string, torch::Tensor>
YourModel::forward(
    std::unordered_map<std::string, torch::Tensor> const & inputs)
{
    // Validate required inputs
    if (inputs.find("image") == inputs.end()) {
        throw std::runtime_error("Missing required input: image");
    }

    // Move inputs to the correct device
    auto & dm = DeviceManager::instance();
    std::vector<torch::Tensor> ordered_inputs;
    ordered_inputs.push_back(dm.toDevice(inputs.at("image")));

    // Execute inference
    auto outputs = _execution->execute(ordered_inputs);

    // Map outputs to named slots
    std::unordered_map<std::string, torch::Tensor> result;
    if (!outputs.empty()) {
        result["output_mask"] = outputs[0];
    }
    return result;
}

} // namespace dl
```

Step 3: Key Design Decisions
Slot Descriptors
When defining `inputSlots()` and `outputSlots()`, consider:
- `recommended_encoder`/`recommended_decoder` — the UI pre-selects these, but the user can override them. Choose the most common use case.
- `is_static = true` — use for memory inputs that the user sets once (e.g., reference frames). The UI renders these as a separate “Memory Inputs” section.
- `is_boolean_mask = true` — use for 0/1 flag tensors that indicate which memory slots are active. The UI renders checkboxes instead of data source selectors.
- `sequence_dim` — set to a non-negative axis index if the model expects multiple frames stacked along that dimension. The `SlotAssembler` will automatically arrange static input entries along this axis.
- Batch size — have `preferredBatchSize()` return 1 if your model has an output→input feedback loop (like NeuroSAM). Leave it at 0 for models that can process arbitrary batch sizes.
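To build intuition for `sequence_dim`, the stdlib-only sketch below shows the memory layout it implies when the sequence axis is the leading one: each frame occupies a contiguous slice of the stacked tensor. `stackAlongLeadingAxis` is a hypothetical illustration, not the `SlotAssembler`’s actual implementation:

```cpp
#include <vector>

// Sketch: with sequence_dim = 0 and per-frame shape {C, H, W}, frame t
// ends up in the contiguous slice [t * frame_size, (t + 1) * frame_size)
// of the stacked buffer. (Illustrative only; SlotAssembler may differ.)
std::vector<float> stackAlongLeadingAxis(
        std::vector<std::vector<float>> const & frames) {
    std::vector<float> stacked;
    for (auto const & frame : frames)
        stacked.insert(stacked.end(), frame.begin(), frame.end());
    return stacked;
}
```

For a non-leading `sequence_dim`, the frames are interleaved rather than concatenated, which is why the assembler, not the model author, handles the arrangement.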
Using ModelExecution
`ModelExecution` auto-detects the inference backend from the weight file extension:
| Extension | Backend | API |
|---|---|---|
| `.pt` | TorchScript | `torch::jit::load()` |
| `.pt2` | AOT Inductor | `AOTIModelPackageLoader` |
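The extension-based dispatch in the table can be sketched with the standard library alone. `backendForWeights` is a hypothetical helper; `ModelExecution`’s real detection logic may differ in details:

```cpp
#include <filesystem>
#include <string>

// Sketch of extension-based backend auto-detection (illustrative only).
std::string backendForWeights(std::filesystem::path const & weights) {
    auto const ext = weights.extension().string();
    if (ext == ".pt2") return "aotinductor";  // AOTIModelPackageLoader
    if (ext == ".pt")  return "torchscript";  // torch::jit::load()
    return "unknown";                         // caller must decide
}
```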
You can also force a specific backend:

```cpp
_execution = std::make_unique<ModelExecution>(BackendType::AOTInductor);
```

Device Management
Always use `DeviceManager` for device placement:

```cpp
auto & dm = DeviceManager::instance();
auto device_tensor = dm.toDevice(cpu_tensor); // moves to CUDA if available
```

Never create your own `torch::Device` objects or hardcode `torch::kCUDA`.
Step 4: Register with CMake
Add your source files to `src/DeepLearning/CMakeLists.txt`:

```cmake
set(DEEP_LEARNING_SOURCES
    # ... existing sources ...
    models_v2/your_model/YourModel.hpp
    models_v2/your_model/YourModel.cpp
)
```

The `DL_REGISTER_MODEL` macro handles self-registration at static initialization time. Your model will appear in the Deep Learning widget’s model selector after rebuilding.
Step 5: Write Tests
Create `tests/DeepLearning/models_v2/YourModel.test.cpp`:

```cpp
#include <catch2/catch_test_macros.hpp>

#include "models_v2/your_model/YourModel.hpp"
#include "registry/ModelRegistry.hpp"

#include <algorithm>
#include <cstdint>
#include <vector>

TEST_CASE("YourModel metadata", "[YourModel]") {
    dl::YourModel model;
    CHECK(model.modelId() == "your_model");
    CHECK(model.displayName() == "Your Model");
}

TEST_CASE("YourModel input slots", "[YourModel]") {
    dl::YourModel model;
    auto inputs = model.inputSlots();
    REQUIRE(inputs.size() == 1);
    CHECK(inputs[0].name == "image");
    CHECK(inputs[0].shape == std::vector<int64_t>{3, 256, 256});
}

TEST_CASE("YourModel is not ready without weights", "[YourModel]") {
    dl::YourModel model;
    CHECK_FALSE(model.isReady());
}

TEST_CASE("YourModel registered in ModelRegistry", "[YourModel]") {
    auto & registry = dl::ModelRegistry::instance();
    auto models = registry.availableModels();
    CHECK(std::find(models.begin(), models.end(), "your_model") != models.end());
}
```

Choosing Between Approaches
| Criterion | JSON Runtime Model | Compiled C++ Model |
|---|---|---|
| Recompilation needed | No | Yes |
| Custom pre/post-processing | No | Yes |
| Output→input feedback loops | No | Yes |
| Multiple inference steps | No | Yes |
| Simplicity | Simplest | More work |
| Typical use case | Standard image→mask/point/line models | Models like NeuroSAM with memory feedback |
Model Export
Both approaches require the model to be exported from Python. See:
- AOT Inductor Export Tutorial — recommended for new models
- TorchScript export: `torch.jit.trace(model, example_inputs).save("model.pt")`