Use Multiple Boards
This guide shows how to control multiple V80 boards from a single application
using separate vrt::Device instances.
Prerequisites
Two or more V80 boards installed and visible via
v80-smi list.The SLASH stack is installed and
vrtdis running.A vrtbin file built for the target design.
See Getting Started for an introduction to the VRT workflow.
Identifying Your Boards
Run v80-smi list to discover the BDF address of each board:
v80-smi list
Each board is listed with a unique BDF (Bus:Device.Function). Note the BDF for every board you intend to use.
Opening Multiple Devices
Create a separate vrt::Device for each board. Both can load the same
vrtbin:
vrt::Device fpga0("e2:00.0", "my_design.vrtbin", false, vrt::ProgramType::FLASH);
vrt::Device fpga1("21:00.0", "my_design.vrtbin", false, vrt::ProgramType::FLASH);
Note
Replace the BDF strings with the addresses reported by v80-smi list on
your system. The addresses above are examples.
Each Device manages its own connection to the vrtd daemon and its own
FPGA state.
Per-Device Kernels and Buffers
Kernels and buffers are scoped to the device that created them. Create separate instances for each board:
// Kernels on fpga0
vrt::Kernel increment0(fpga0, "increment_0");
vrt::Kernel accumulate0(fpga0, "accumulate_0");
// Same kernel names, but on fpga1
vrt::Kernel increment1(fpga1, "increment_0");
vrt::Kernel accumulate1(fpga1, "accumulate_0");
// Buffers — one per device
vrt::Buffer<float> buffer0(fpga0, size, increment0.argMemoryConfig("in"));
vrt::Buffer<float> buffer1(fpga1, size, increment1.argMemoryConfig("in"));
You cannot pass a buffer allocated on one device to a kernel on another.
Running Kernels Independently
Each device operates independently. You can run kernels on different boards sequentially or concurrently:
buffer0.sync(vrt::SyncType::HOST_TO_DEVICE);
buffer1.sync(vrt::SyncType::HOST_TO_DEVICE);
// Run on fpga0
increment0.start(size, buffer0.getPhysAddr());
accumulate0.start(size);
increment0.wait();
accumulate0.wait();
// Run on fpga1
increment1.start(size, buffer1.getPhysAddr());
accumulate1.start(size);
increment1.wait();
accumulate1.wait();
To maximise throughput, start kernels on both boards before waiting:
increment0.start(size, buffer0.getPhysAddr());
increment1.start(size, buffer1.getPhysAddr());
// Both boards are now running in parallel
increment0.wait();
increment1.wait();
Each device can also have its own clock frequency:
fpga0.setFrequency(200000000);
fpga1.setFrequency(200000000);
Cleanup
Call cleanup() on each device when done:
fpga0.cleanup();
fpga1.cleanup();
Complete Example
Example 03_multiple_boards demonstrates this pattern using the
increment and accumulate kernels from example 00_axilite.
cd examples/03_multiple_boards
cmake -B build -S . -G Ninja -DSLASH_USE_REPO=ON
cmake --build build
Note
This example reuses the vrtbin from example 00_axilite. Build that
example first to produce the vrtbin file.
Next Steps
vrt::Device — full Device API reference.
Buffers and Memory — buffer management in depth.
Set Clock Frequency — frequency tuning per device.