Architecture
Overview
Every board support module in axi-pcie-core follows the same structural
pattern. The per-board entity <Board>Core instantiates three top-level
subsystems:
- a PCIe PHY wrapper (board-specific IP delivered as a pre-built .dcp checkpoint),
- AxiPcieReg, the BAR0 AXI-to-AXI-Lite bridge and 15-slave register crossbar, and
- AxiPcieDma, the AXI-Stream data-plane DMA engine.
All DMA and register logic runs synchronously on a single 250 MHz system clock
(DMA_CLK_FREQ_C) sourced from the PCIe PHY’s recovered reference clock.
Application logic on a separate appClk domain is decoupled via
surf.AxiLiteAsync bridges inside AxiPcieReg.
AXI-Lite Register Tree
AxiPcieReg bridges the AXI4 BAR0 register bus from the PCIe PHY down to a
15-slave AXI-Lite crossbar. The bridge uses surf.AxiToAxiLite; write
transactions are pre-filtered by AxiPcieRegWriteMux, which separates PCIe
Intercommunication Protocol (PIP) writes from normal register writes before
the bridge.
The diagram below shows the BAR0 AXI-Lite crossbar fan-out. Base addresses
are sourced from the AXI_CROSSBAR_MASTERS_CONFIG_C constant in
AxiPcieReg.vhd:
Host
| BAR0 (AXI4, 256-bit, 64-bit addr)
v
AxiPcieReg (AXI4 -> AXI-Lite bridge, 15-slave crossbar)
|
+--[0x0000_0000] DMA control (AxiStreamDmaV2 descriptor engine)
+--[0x0001_0000] PCIe PHY CSR (per-board PHY status / config)
+--[0x0002_0000] AxiVersion (build info, device DNA, ICAP)
+--[0x0002_4000] Sysmon (XADC: die temperature / voltage)
+--[0x0002_8000] GPU async (AxiGpuAsyncCore, appClk domain)
+--[0x0003_0000] BPI flash (AxiMicronMt28ewReg, boot PROM)
+--[0x0004_0000] SPI flash 0 (AxiMicronN25QCore, chip-select 0)
+--[0x0005_0000] SPI flash 1 (AxiMicronN25QCore, chip-select 1)
+--[0x0006_0000] IB stream monitor (AXIS traffic monitor, inbound)
+--[0x0006_8000] OB stream monitor (AXIS traffic monitor, outbound)
+--[0x0007_0000] I2C / I2C mux (AxiLiteMasterProxy -> board I2C)
+--[0x0010_0000] App region 1 (appClk domain, 1 MB)
+--[0x0020_0000] App region 2 (appClk domain, 2 MB)
+--[0x0040_0000] App region 3 (appClk domain, 4 MB)
+--[0x0080_0000] App region 4 (appClk domain, 8 MB)
The four application region slaves (indices 11–14) are merged by a second
AXI-Lite crossbar inside AxiPcieReg and exposed as a single
appReadMaster / appWriteMaster pair that crosses to the application
clock domain via surf.AxiLiteAsync. Unimplemented or unused crossbar
slots respond with AXI_RESP_DECERR.
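The address map above can be sketched as a host-side lookup helper. This is a minimal illustration, not code from the repository. The base offsets are copied from the diagram, while `decode_bar0`, `BAR0_REGIONS`, and `BAR0_END` are hypothetical names. The diagram gives sizes only for the application regions, so the sketch assumes each region runs up to the next base address. The real crossbar may instead answer with AXI_RESP_DECERR inside those gaps.

```python
from bisect import bisect_right

# Base offsets copied from the BAR0 fan-out diagram; list position = crossbar index.
BAR0_REGIONS = [
    (0x0000_0000, "DMA control"),
    (0x0001_0000, "PCIe PHY CSR"),
    (0x0002_0000, "AxiVersion"),
    (0x0002_4000, "Sysmon"),
    (0x0002_8000, "GPU async"),
    (0x0003_0000, "BPI flash"),
    (0x0004_0000, "SPI flash 0"),
    (0x0005_0000, "SPI flash 1"),
    (0x0006_0000, "IB stream monitor"),
    (0x0006_8000, "OB stream monitor"),
    (0x0007_0000, "I2C / I2C mux"),
    (0x0010_0000, "App region 1"),
    (0x0020_0000, "App region 2"),
    (0x0040_0000, "App region 3"),
    (0x0080_0000, "App region 4"),
]

# ASSUMPTION: App region 4 is 8 MB as labelled, so the mapped window ends at 16 MB.
BAR0_END = 0x0100_0000

# Indices 11-14 are the application regions that cross to the appClk domain.
FIRST_APP_INDEX = 11

_BASES = [base for base, _ in BAR0_REGIONS]


def decode_bar0(offset):
    """Map a BAR0 byte offset to (index, name, offset_in_region, is_app).

    Returns None when the offset lies outside the assumed 16 MB window.
    ASSUMPTION: each region extends up to the next base address.
    """
    if not 0 <= offset < BAR0_END:
        return None
    # Find the last region whose base address is <= offset.
    idx = bisect_right(_BASES, offset) - 1
    base, name = BAR0_REGIONS[idx]
    return idx, name, offset - base, idx >= FIRST_APP_INDEX
```

For example, `decode_bar0(0x0002_4010)` resolves to crossbar index 3 (Sysmon) at region offset 0x10, and any offset at or above 0x0010_0000 is flagged as belonging to the application clock domain.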
AXI-Stream DMA Channels
AxiPcieDma instantiates surf’s AxiStreamDmaV2 engine and
AxiPcieCrossbar. AxiPcieCrossbar presents DMA_SIZE_G + 2 AXI4
slave ports feeding a single AXI4 master that connects to the PCIe PHY:
one descriptor port, up to eight DMA lane ports, and one user general-purpose
port. Because the crossbar has a fixed budget of 10 slave ports, DMA_SIZE_G
is bounded to 8.
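The port arithmetic behind that bound can be written out as a small check. This is an illustrative sketch only. `crossbar_slave_ports` and the two constants are hypothetical names, and the lower bound of one lane is an assumption that the text above does not state.

```python
MAX_CROSSBAR_SLAVES = 10  # fixed slave-port budget described above
FIXED_PORTS = 2           # one descriptor port + one user general-purpose port


def crossbar_slave_ports(dma_size):
    """Return the number of AXI4 slave ports needed for dma_size DMA lanes.

    Raises ValueError when dma_size would exceed the 10-port budget.
    ASSUMPTION: at least one DMA lane is required.
    """
    max_lanes = MAX_CROSSBAR_SLAVES - FIXED_PORTS  # 10 - 2 = 8
    if not 1 <= dma_size <= max_lanes:
        raise ValueError(f"DMA_SIZE_G must be in 1..{max_lanes}, got {dma_size}")
    return dma_size + FIXED_PORTS
```

With all eight lanes enabled the function returns 10, which exactly fills the budget; a ninth lane would need an eleventh slave port and is rejected.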
Each DMA lane exposes two AXI-Stream channels to the application:
- Outbound (OB): data flowing from the host into the FPGA application. The DMA engine reads host DMA buffers over PCIe and drives dmaObMasters toward the application.
- Inbound (IB): data flowing from the FPGA application to the host. The application drives dmaIbMasters into the DMA engine, which writes the data into host DMA buffers over PCIe.
The IB/OB FIFO depth, descriptor back-pressure, and the DMA IRQ path are
described in the PCIe DMA Model page.