Migrate from XRT to VRT
This guide maps XRT (Xilinx Runtime) concepts to their VRT (V80 Runtime) equivalents. It is intended for developers familiar with host code written against XRT for Alveo U200/U250/U280/U55C boards, and provides the reference needed to become productive on the Alveo V80 with VRT.
Note
VRT targets the AMD Alveo V80 exclusively. The API surface is smaller and more opinionated than XRT, so there is less to learn.
Quick Reference
| XRT | VRT | Header |
|---|---|---|
|  |  |  |
|  |  |  |
|  |  |  |
|  | (integrated into |  |
| xclbin | vbin |  |
|  |  |  |
|  |  |  |
|  |  |  |
| xbutil | v80-smi | CLI tool |
|  | Built into vbin (platform auto-detected) | – |
|  |  |  |
| N/A |  |  |
Architecture: What Changed
XRT talks to the kernel driver directly via ioctls. VRT is one component
of the broader SLASH platform, which inserts a daemon (vrtd)
between the application and the driver:
XRT: App --> libxrt_core --> xocl/xclmgmt (kernel driver) --> FPGA
SLASH: App --> libvrt --> vrtd (daemon) --> slash (kernel driver) --> FPGA
The daemon multiplexes device access across processes, manages DMA buffer lifetimes, and handles FPGA programming. From the user’s perspective, this layering is transparent: the application code interacts with the same style of Device/Kernel/Buffer objects regardless of how requests are dispatched underneath.
What this means in practice:
- The vrtd daemon must be running before the application starts (it is a systemd service).
- Multi-process access to the same device works without coordination from the user side.
- There is no equivalent of xbmgmt; management operations go through vrtd or v80-smi.
Includes
XRT:
#include <xrt/xrt_device.h>
#include <xrt/xrt_kernel.h>
#include <xrt/xrt_bo.h>
VRT:
#include <vrt/device.hpp>
#include <vrt/kernel.hpp>
#include <vrt/buffer.hpp>
Device Management
Opening a Device
XRT opens a device by index and loads an xclbin:
auto device = xrt::device(0);
auto uuid = device.load_xclbin("design.xclbin");
VRT opens a device by PCIe BDF and programs a vbin in one step:
vrt::Device device("d8:00", "design.vbin");
The constructor extracts the vbin archive, programs the FPGA, and parses kernel metadata. To skip programming (when the device is already loaded):
vrt::Device device("d8:00", "design.vbin", false);
Note
BDF format: VRT uses board-level BB:DD or DDDD:BB:DD –
no function suffix. Copy the address directly from v80-smi list
output.
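As a rough illustration of the address format described in the note, the sketch below checks a string before it is handed to the vrt::Device constructor. The helper name is_board_bdf is hypothetical and not part of VRT. It assumes the fields are written in hexadecimal, as in the "d8:00" example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of VRT): returns true when `bdf` looks like a
// board-level address, BB:DD or DDDD:BB:DD, in hexadecimal and without a
// ".F" function suffix.
inline bool is_board_bdf(const std::string& bdf) {
    static const std::regex pattern(
        R"(^([0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}$)");
    return std::regex_match(bdf, pattern);
}
```

A check like this catches the common slip of pasting a full lspci-style address such as "d8:00.0", which carries a function suffix that VRT does not expect.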
Binary Format: xclbin vs vbin
|  | xclbin | vbin |
|---|---|---|
| Format | Custom Xilinx container | tar archive |
| Contents | Bitstream, metadata, clock info | PDI, system_map.xml, (optional) emu/sim executables |
| Kernel metadata | Embedded XML sections |  |
| Platform variants | Separate files or | Single file; platform (hw/emu/sim) embedded in metadata |
VRT auto-detects whether a vbin targets hardware, emulation, or
simulation. There is no XCL_EMULATION_MODE environment variable –
the platform is a property of the vbin itself.
You can inspect a vbin without a device:
v80-smi inspect design.vbin
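Because a vbin is described above as a tar archive, a host program can do a cheap sanity check on a file before passing it to VRT. The sketch below is only an illustration under a stated assumption: that the vbin is an uncompressed POSIX or GNU tar, whose first header block carries the string "ustar" at byte offset 257. The helper name looks_like_ustar is hypothetical. If the archive is compressed or uses an older tar flavor, this check reports false even for a valid file.

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper (not part of VRT): returns true when the file at `path`
// has the "ustar" magic at offset 257, as POSIX and GNU tar headers do.
inline bool looks_like_ustar(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;  // file missing or unreadable
    }
    char magic[5] = {};
    in.seekg(257);
    in.read(magic, sizeof(magic));
    // gcount() is 0 if the seek or read failed (for example, a short file).
    return in.gcount() == 5 && std::memcmp(magic, "ustar", 5) == 0;
}
```

For authoritative information about a vbin, v80-smi inspect remains the tool to use.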
Kernel Execution
Getting a Kernel Handle
XRT:
auto kernel = xrt::kernel(device, uuid, "my_kernel");
VRT:
vrt::Kernel kernel(device, "my_kernel_0");
Note
Naming convention: VRT kernel names include the instance suffix
from the design (e.g., "vadd_0"), matching what appears in
system_map.xml.
Setting Arguments and Launching
XRT exposes two launch styles: a one-call form that takes the arguments
inline, and a staged form that sets arguments individually before
starting. VRT mirrors the same two styles directly on the Kernel
object.
XRT – one call to set arguments and start, then wait:
auto run = kernel(bo_in, bo_out, size); // set args + start
run.wait();
XRT – staged:
auto run = xrt::run(kernel);
run.set_arg(0, bo_in);
run.set_arg(1, bo_out);
run.set_arg(2, size);
run.start();
run.wait();
VRT – one call, blocking (call sets the args, starts the kernel,
and waits for completion):
kernel.call(buffer_in, buffer_out, size);
VRT – one call, non-blocking (start with arguments sets them and
starts execution without blocking):
kernel.start(buffer_in, buffer_out, size);
// ... do other work ...
kernel.wait();
VRT – staged with setArg + start / call:
kernel.setArg(0, buffer_in);
kernel.setArg(1, buffer_out);
kernel.setArg(2, size);
kernel.start(); // non-blocking; pair with kernel.wait()
// or
kernel.call(); // blocking equivalent of start() + wait()
Arguments can also be set by name:
kernel.setArg("input", buffer_in);
kernel.setArg("output", buffer_out);
kernel.setArg("size", 1024);
kernel.call();
| Style | Sets args | Starts | Waits |
|---|---|---|---|
| call(args...) | yes | yes | yes |
| start(args...) | yes | yes | no |
| setArg(...) + call() | yes | yes | yes |
| setArg(...) + start() | yes | yes | no |
Note
Buffer arguments are resolved automatically. When you pass a
vrt::Buffer<T> to call, start, or setArg, VRT extracts
the physical address. No need to call .address() or similar.
Reading Output Registers
XRT: Read kernel outputs via xrt::bo or register access.
VRT: Read directly from kernel registers by offset:
uint32_t result = kernel.read(0x18);
Buffer Management
Creating Buffers
XRT allocates buffer objects with a memory group:
auto bo = xrt::bo(device, size_bytes, kernel.group_id(0));
VRT uses typed, element-counted buffers. Memory placement comes from kernel metadata:
vrt::Buffer<float> buf(device, num_elements, kernel.argMemoryConfig("input"));
argMemoryConfig() returns a MemoryConfig that encodes the correct
memory type (DDR, HBM, or HBM_VNOC) and HBM port for that kernel
argument – the VRT equivalent of XRT’s group_id().
| XRT | VRT |
|---|---|
| xrt::bo | vrt::Buffer<T> |
| Size in bytes | Size in elements (byte size = element count × sizeof(T)) |
| Untyped | Typed |
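When porting code that computes sizes in bytes, the byte count has to be converted to an element count for the typed buffer. The sketch below shows one way to do that conversion safely. The helper name elements_from_bytes is hypothetical and not part of VRT.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not part of VRT): convert an XRT-style byte size into
// the element count expected by a typed, element-counted buffer of T.
// Throws if the byte size is not a whole number of elements.
template <typename T>
constexpr std::size_t elements_from_bytes(std::size_t size_bytes) {
    if (size_bytes % sizeof(T) != 0) {
        throw std::invalid_argument("byte size is not a multiple of sizeof(T)");
    }
    return size_bytes / sizeof(T);
}

// Compile-time check: 1024 floats' worth of bytes maps back to 1024 elements.
static_assert(elements_from_bytes<float>(1024 * sizeof(float)) == 1024,
              "round trip through bytes must preserve the element count");
```

The result can be passed as the element-count argument of the vrt::Buffer constructor shown above, in place of the size_bytes value that the XRT code passed to xrt::bo.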
Writing Data to a Buffer
XRT:
auto host_ptr = bo.map<float*>();
for (int i = 0; i < n; i++) host_ptr[i] = i;
VRT buffers are directly subscriptable:
for (int i = 0; i < n; i++) buf[i] = static_cast<float>(i);
You can also get a raw pointer if needed:
float* ptr = buf.get();
Synchronizing Buffers
XRT:
bo.sync(XCL_BO_SYNC_BO_TO_DEVICE);
// ... run kernel ...
bo.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
VRT:
buf.sync(vrt::SyncType::HOST_TO_DEVICE);
// ... run kernel ...
buf.sync(vrt::SyncType::DEVICE_TO_HOST);
Memory Types
VRT exposes three memory types through MemoryRangeType:
| VRT Memory Type | Description |
|---|---|
| DDR | DDR memory |
| HBM | HBM with explicit port (0-63) |
| HBM_VNOC | HBM via Virtual Network-on-Chip (auto-distributed) |
When using kernel.argMemoryConfig(), the correct type and port are
selected automatically from the design metadata. This is the recommended
approach.
CLI: xbutil vs v80-smi
| Task | XRT (xbutil) | VRT (v80-smi) |
|---|---|---|
| List devices |  | v80-smi list |
| Detailed device info |  |  |
| Program device |  |  |
| Reset device |  |  |
| Validate device |  |  |
| Inspect binary |  | v80-smi inspect design.vbin |
| Query loaded design | – |  |
| JSON output | – | Add |
| Version |  |  |
CMake Integration
XRT:
find_package(XRT REQUIRED)
target_link_libraries(myapp PRIVATE XRT::xrt_coreutil)
VRT:
find_package(vrt REQUIRED CONFIG)
target_link_libraries(myapp PRIVATE vrt::vrt)
Full example:
cmake_minimum_required(VERSION 3.20)
project(my_v80_app LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
find_package(vrt REQUIRED CONFIG)
add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE vrt::vrt)
See Use CMake Modules for the full CMake module reference.
Multi-Device
XRT:
auto dev0 = xrt::device(0);
auto dev1 = xrt::device(1);
VRT uses BDF strings instead of indices:
vrt::Device fpga0("e2:00", "design.vbin");
vrt::Device fpga1("21:00", "design.vbin");
Each device is fully independent – separate kernels, buffers, and frequencies:
vrt::Kernel k0(fpga0, "vadd_0");
vrt::Kernel k1(fpga1, "vadd_0");
vrt::Buffer<int> buf0(fpga0, 1024, k0.argMemoryConfig("in"));
vrt::Buffer<int> buf1(fpga1, 1024, k1.argMemoryConfig("in"));
Use v80-smi list to discover available board addresses. The format
is BB:DD (no function suffix) – copy the address directly from the
command output. See Use Multiple Boards for more detail.
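Since boards are identified by strings rather than indices, a multi-board application often takes its board list from the command line or a configuration value. The sketch below splits a comma-separated list of addresses. The helper name split_bdf_list and the comma-separated input format are illustrative assumptions, not VRT conventions.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of VRT): split a comma-separated list such as
// "e2:00,21:00" into individual board addresses, skipping empty entries.
inline std::vector<std::string> split_bdf_list(const std::string& list) {
    std::vector<std::string> out;
    std::istringstream in(list);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (!item.empty()) {
            out.push_back(item);
        }
    }
    return out;
}
```

Each resulting string can then be passed to a vrt::Device constructor in the same way as the literal addresses in the example above.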
Clock Frequency Control
VRT exposes runtime clock frequency control, which has no direct XRT equivalent:
std::cout << "Current: " << device.getFrequency() << " Hz\n";
std::cout << "Max: " << device.getMaxFrequency() << " Hz\n";
device.setFrequency(300000000); // 300 MHz
See Set Clock Frequency for more detail.
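The frequency values in the snippet above are expressed in Hz, which makes literals such as 300000000 easy to mistype. The sketch below shows two small helpers for building and bounding a request. Both names are hypothetical and not part of VRT, and the sketch assumes the frequency values are unsigned integers in Hz.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helpers (not part of VRT).

// Convert a frequency in MHz to Hz.
constexpr std::uint64_t mhz_to_hz(std::uint64_t mhz) {
    return mhz * 1'000'000ULL;
}

// Limit a requested frequency to the maximum reported for the device.
constexpr std::uint64_t clamp_frequency_hz(std::uint64_t requested_hz,
                                           std::uint64_t max_hz) {
    return std::min(requested_hz, max_hz);
}

// Intended use with the calls shown above (not compiled here):
//   device.setFrequency(
//       clamp_frequency_hz(mhz_to_hz(300), device.getMaxFrequency()));
```

Whether VRT itself rejects or clamps an out-of-range request is not stated here, so bounding the value on the host side is a defensive choice rather than a requirement.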
Emulation and Simulation
XRT: Set XCL_EMULATION_MODE=hw_emu or sw_emu and run with an
emulation xclbin.
VRT: Build the design for the target platform (hw, emu, or sim) and use the corresponding vbin. The platform is auto-detected – no environment variable needed:
// Same host code for all three. The vbin determines the platform.
vrt::Device device(bdf, "design_emu.vbin");
// Check which platform is active:
if (device.getPlatform() == vrt::Platform::EMULATION) {
    std::cout << "Running in emulation mode\n";
}
Platform values: vrt::Platform::HARDWARE, vrt::Platform::EMULATION,
vrt::Platform::SIMULATION.
See Platform Modes for further background.
Logging
VRT has a built-in logger with configurable verbosity:
#include <vrt/utils/logger.hpp>
vrt::utils::Logger::setLogLevel(vrt::utils::LogLevel::DEBUG);
Log levels: NONE, WARN, ERROR, INFO, DEBUG.
Complete Example: Vector Add
Here is a minimal vadd host program showing the full XRT-to-VRT translation.
XRT version:
#include <xrt/xrt_device.h>
#include <xrt/xrt_kernel.h>
#include <xrt/xrt_bo.h>
int main() {
    auto device = xrt::device(0);
    auto uuid = device.load_xclbin("vadd.xclbin");
    auto kernel = xrt::kernel(device, uuid, "vadd");

    auto bo_a = xrt::bo(device, 1024 * sizeof(int), kernel.group_id(0));
    auto bo_b = xrt::bo(device, 1024 * sizeof(int), kernel.group_id(1));
    auto bo_c = xrt::bo(device, 1024 * sizeof(int), kernel.group_id(2));

    auto a = bo_a.map<int*>();
    auto b = bo_b.map<int*>();
    for (int i = 0; i < 1024; i++) { a[i] = i; b[i] = i; }
    bo_a.sync(XCL_BO_SYNC_BO_TO_DEVICE);
    bo_b.sync(XCL_BO_SYNC_BO_TO_DEVICE);

    auto run = xrt::run(kernel);
    run.set_arg(0, bo_a);
    run.set_arg(1, bo_b);
    run.set_arg(2, bo_c);
    run.set_arg(3, 1024);
    run.start();
    run.wait();

    bo_c.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
    auto c = bo_c.map<int*>();
    // verify c[i] == 2*i
}
VRT version:
#include <vrt/device.hpp>
#include <vrt/kernel.hpp>
#include <vrt/buffer.hpp>
int main() {
    vrt::Device device("d8:00", "vadd.vbin");
    vrt::Kernel vadd(device, "vadd_0");

    vrt::Buffer<int> a(device, 1024, vadd.argMemoryConfig("a"));
    vrt::Buffer<int> b(device, 1024, vadd.argMemoryConfig("b"));
    vrt::Buffer<int> c(device, 1024, vadd.argMemoryConfig("c"));

    for (int i = 0; i < 1024; i++) { a[i] = i; b[i] = i; }
    a.sync(vrt::SyncType::HOST_TO_DEVICE);
    b.sync(vrt::SyncType::HOST_TO_DEVICE);

    // One-call blocking form (set args + start + wait):
    vadd.call(a, b, c, 1024);

    // Equivalent staged form:
    // vadd.setArg("a", a);
    // vadd.setArg("b", b);
    // vadd.setArg("c", c);
    // vadd.setArg("size", 1024);
    // vadd.start(); // non-blocking
    // vadd.wait();

    c.sync(vrt::SyncType::DEVICE_TO_HOST);
    // verify c[i] == 2*i
}
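Both versions end with a "verify" comment. One possible way to fill it in is sketched below as a plain function over raw pointers, so the same check works with the pointers returned by bo.map<int*>() in XRT and by buf.get() in VRT. The helper name count_mismatches is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper (not part of XRT or VRT): count the positions where the
// device result c[i] differs from the expected host-side sum a[i] + b[i].
inline std::size_t count_mismatches(const int* a, const int* b, const int* c,
                                    std::size_t n) {
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (c[i] != a[i] + b[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

A return value of zero means every element matched. Run the check only after the output buffer has been synchronized back to the host.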
Key differences at a glance:
- Device by BDF, not index. Programming and binary loading are combined into the constructor.
- No UUID. Kernel lookup by name only.
- No separate xrt::run object. call/start/setArg/wait live on Kernel directly.
- Buffers are typed and element-counted. Memory placement via argMemoryConfig().
- Explicit sync() with enum direction instead of XCL_BO_SYNC_* macros.
- Device and buffer cleanup is automatic via RAII, same as XRT.