These platforms support the PCI DMA interface, but it is mostly a false front. There are no mapping registers in the bus interface, so scatterlists cannot be combined and virtual addresses cannot be used. There is no bounce buffer support, so mapping of high-memory addresses cannot be done. The mapping functions on the ARM architecture can sleep, which is not the case for the other platforms.
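For reference, the "mapping functions" the quote refers to are the generic streaming DMA calls. A minimal, hypothetical driver fragment is sketched below; the device pointer and buffer are placeholders, not code from any of the systems above.

#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* Hedged sketch of a streaming DMA mapping; `dev` and `buf` are placeholders.
 * On the platforms quoted above this is a thin wrapper: there are no mapping
 * registers or bounce buffers behind it, so `buf` must already be in
 * DMA-addressable (non-highmem) memory. */
static int example_map_for_device(struct device *dev, void *buf, size_t len)
{
        dma_addr_t handle;

        handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dev, handle))
                return -ENOMEM;

        /* ... program the device with `handle`, wait for the transfer ... */

        dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
        return 0;
}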
Our x86 and BlueField ports currently only support applications running on the same platform as FlexTOE. Hence, context queues always use shared memory rather than DMA. The corresponding DMA pipeline stage executes the payload copies in software using shared memory, rather than leveraging a DMA engine.
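As a rough illustration only (not FlexTOE's actual code; the descriptor and stage names below are made up), the software DMA stage reduces to a CPU copy within the shared-memory region:

#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one payload copy; FlexTOE's real data
 * structures differ. */
struct copy_desc {
        void       *dst;   /* slot in the shared-memory context queue */
        const void *src;   /* payload buffer, also in shared memory */
        size_t      len;
};

/* Software "DMA" stage: because application and FlexTOE share memory on the
 * x86/BlueField ports, the payload copy is performed by the CPU instead of
 * being offloaded to a DMA engine. */
static void dma_stage_sw(const struct copy_desc *d)
{
        memcpy(d->dst, d->src, d->len);
}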
The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access.
This is used, for example, by drm "prime" multi-GPU support, but is of course not limited to GPU use cases.
The three main components of this are: (1) dma-buf, representing a sg_table and exposed to userspace as a file descriptor to allow passing between devices, (2) fence, which provides a mechanism to signal when one device has finished access, and (3) reservation, which manages the shared or exclusive fence(s) associated with the buffer.
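A minimal sketch of the importer side of this API, assuming a dma-buf fd handed in from userspace and an importing struct device *dev (error handling trimmed):

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Import a buffer another driver exported as a dma-buf and map it for DMA.
 * `fd` is the file descriptor received from userspace, `dev` the importing
 * device. Error handling (IS_ERR checks) is trimmed for brevity. */
static struct sg_table *import_dmabuf(int fd, struct device *dev,
                                      struct dma_buf **dmabuf_out,
                                      struct dma_buf_attachment **attach_out)
{
        struct dma_buf *dmabuf = dma_buf_get(fd);                /* takes a ref */
        struct dma_buf_attachment *attach = dma_buf_attach(dmabuf, dev);
        struct sg_table *sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);

        *dmabuf_out = dmabuf;
        *attach_out = attach;
        return sgt;          /* device addresses for DMA live in this sg_table */
}

/* Teardown mirrors the setup:
 *   dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
 *   dma_buf_detach(dmabuf, attach);
 *   dma_buf_put(dmabuf);
 */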
Linux kernel to support Mellanox BlueField SoCs
Linux kernel v5.4
Current: Mellanox BlueField OFED kernel 5.2, Fastswap 4.11, Leap 4.4.125
https://github.com/Mellanox/bluefield-linux/blob/master/Documentation/crypto/async-tx-api.txt
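The linked async-tx document covers the kernel's offload API for copies/XOR. Loosely adapted from its examples, an offloaded page copy looks like the sketch below; `dest` and `src` are placeholder struct page pointers, and the API falls back to a synchronous CPU copy if no capable DMA channel is available.

#include <linux/async_tx.h>

/* Offload a page-to-page copy through the async_tx API.
 * `dest` and `src` are placeholder struct page pointers owned by the caller. */
static void example_async_copy(struct page *dest, struct page *src, size_t len)
{
        struct async_submit_ctl submit;
        struct dma_async_tx_descriptor *tx;

        init_async_submit(&submit, ASYNC_TX_ACK, NULL, NULL, NULL, NULL);
        tx = async_memcpy(dest, src, 0, 0, len, &submit);

        async_tx_issue_pending_all();   /* kick the DMA engine(s) */
        async_tx_quiesce(&tx);          /* wait for the copy to complete */
}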
ib_sg_dma_address
ib_sg_dma_len
ib_dma_sync_single_for_cpu
ib_dma_sync_single_for_device
ib_dma_alloc_coherent
ib_dma_free_coherent
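A minimal sketch of how these wrappers fit together, assuming `ibdev` is an ib_device from the verbs context and `sgt` is an sg_table already mapped with ib_dma_map_sg(). Note that ib_sg_dma_address()/ib_sg_dma_len() are thin wrappers around sg_dma_address()/sg_dma_len() and were removed from newer kernels, so this targets the older kernels listed above.

#include <rdma/ib_verbs.h>
#include <linux/scatterlist.h>

/* Coherent allocation plus scatterlist accessors; `ibdev` is the ib_device of
 * the RDMA port (e.g. the BlueField/mlx5 port) and `sgt` was already mapped
 * with ib_dma_map_sg(). */
static void example_ib_dma(struct ib_device *ibdev, struct sg_table *sgt)
{
        struct scatterlist *sg;
        void *cpu_buf;
        u64 dma_addr;
        int i;

        /* Consistent buffer shared between CPU and HCA (coherent memory needs
         * no explicit syncs). */
        cpu_buf = ib_dma_alloc_coherent(ibdev, PAGE_SIZE, &dma_addr, GFP_KERNEL);
        if (!cpu_buf)
                return;

        /* Read back bus address/length of each mapped segment, e.g. to fill
         * the sge list of a work request. */
        for_each_sg(sgt->sgl, sg, sgt->nents, i)
                pr_info("seg %d: addr=%llx len=%u\n", i,
                        (unsigned long long)ib_sg_dma_address(ibdev, sg),
                        ib_sg_dma_len(ibdev, sg));

        /* For streaming mappings, bracket CPU accesses with
         * ib_dma_sync_single_for_cpu() / ib_dma_sync_single_for_device(). */

        ib_dma_free_coherent(ibdev, PAGE_SIZE, cpu_buf, dma_addr);
}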