Adds information on:
- Log 2 of interface data widths in bits
- Interface type (0 - Axi MemoryMap, 1 - AXI Stream, 2 - FIFO ) .
Lets the driver discover interface widths and interface type settings,
this will deprecate the corresponding device tree properties.
This is useful in case of parametrized projects where the width of
the datapath is changing. This change will allow the use of a generic
device tree node.
Updated version to 4.3.a
In some cases, the Vivado version can contain other characters than just
numbers. One such example is after applying the patch of AR# 71948,
which makes `version -short` return something like `2018.3_AR71948`.
This patch changes the version check to ignore anything after the first
two components of the version.
Let the measured transfer length to be cleared at the end of each
transfer, other case in cyclic mode the counter will overflow and will
not present any useful information.
Once xfer_request is set the DMA must accept samples in the same clock
cycle if the fifo_wr_en signal is asserted.
If the req_valid asserts faster than the ID gets synchronized over the
the xfer request asserts without being ready to accept data.
This can lead to overflow assertion when using a FIFO like interface.
This patch addresses the following issue:
In case of transfers with multiple segments, if TLAST asserts on the last
beat of a non-last segment while more descriptors are queued up,
the completions for the queued segments may be missed causing timeout in
processes that wait for transfer completions.
This patch addresses the following issue:
In 2D mode when consecutive partial transfers occur, and the latter is
very short, will interfere with the completion mechanism of the first
transfer leading to uncompleted segments and unreported partial
transfers.
The tb_base.v verilog files does not contain a full module definition,
just some plain test code. In general the files is sourced inside the
test bench main module. As is, defining a timescale in these files will
generate an error, because timescale directive can not be inside a
module.
Delete all the timescale directive from these files.
The interrupt controller from Microblaze based projects requires that
all its inputs have attributes which define the sensitivity of the
interrupt line. Other case it defaults to EDGE_RISING which is not the
case for DMAC, leading to incorrect interrupt reporting and handling in
case of such projects.
Current implementation does not supports updated versions of Vivado
e.g. 2017.4.1 or 2018.2.1
This fix ignores the update number from the version checking.
The DMAC has the requirement that the length of the transfer is aligned to
the widest interface width. E.g. if the widest interface is 256 bit or 32
bytes the length of the transfer needs to be a multiple of 32.
This restriction can be relaxed for the memory mapped interfaces. This is
done by partially ignoring data of a beat from/to the MM interface.
For write access the stb bits are used to mask out bytes that do not
contain valid data.
For read access a full beat is read but part of the data is discarded. This
works fine as long as the read access is side effect free. I.e. this method
should not be used to access data from memory mapped peripherals like a
FIFO.
This means that for example the length alignment requirement of a DMA
configured for a 64-bit memory and a 16-bit streaming interface is now only
2 bytes instead of 8 bytes as before.
Note that the address alignment requirement is not affected by this. The
address still needs to be aligned to the width of the MM interface that it
belongs to.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
FPGAs support different widths for the read and write port of the block
SRAM cells. The DMAC can make use of this feature when the source and
destination interface have a different width to up-size/down-size the data
bus.
Using memory cells with asymmetric port width consumes the same amount of
SRAM cells, but allows to bypass the re-size blocks inside the DMAC that
are otherwise used for up- and down-sizing. This reduces overall resource
usage and can improve timing.
If the ratio between the destination and source port is too larger to be
handled by SRAM alone the SRAM block will be configured to do partial up-
or down-sizing and a resize block will be inserted to take care of the
remaining up-/down-sizing. E.g. if a 256-bit interface is connected to a
32-bit interface the SRAM will be used to do an initial resizing of 256 bit
to 64 bit and a resize block will be used to do the remaining resizing from
64 bit to 32 bit.
Currently this feature is disabled for Intel FPGAs since Quartus does not
properly infer a block RAM with different read and write port widths from
the current ad_asym_mem module. Once that has been resolved support for
asymmetric memories can also be enabled in the DMAC.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The handling of the src_data_valid_bytes signal and its related signal is
tightly coupled to the behavior of the resize_src module. The code that
handles it makes assumptions about the internal behavior of the resize_src
module.
Move the handling of the src_data_valid_bytes signal when upsizing the data
bus into the resize_src module so that all the code that is related is in
the same place and the code outside of the module does not have to care
about the internals.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMA_LENGTH_ALIGN LSBs of all length For the most part the tools are
able to deduce this using constant propagation.
But this propagation does not work across the asynchronous meta data FIFO
in the burst memory module.
Add a DMA_LENGTH_ALIGN parameter to the burst_memory module which is used
to explicitly keep the LSBs of length registers on the destination side
fixed at 1'b1. This reduces resource use and improves timing by allowing
better constant propagation and unused logic elimination.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This simplifies the burst length in the response manager significantly
while not costing much extra resources in the burst memory.
This change will also enable other future improvements.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
One of the major features of the DMAC is being able to handle non matching
interface widths for the destination and source side.
Currently the test benches only support the case where the width for the
source and the destination side are the same. Extend them so that it is
possible to also test and verify setups where the width is not the same.
To accomplish this each byte memory location is treated as if it contained
the lower 8 bytes of its address. And then the written/read data is
compared to the expected data based on that.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add support for Vivado's simulator. By default the run script is using
the Icarus simulator.
If the user want to switch to another simulator, it can be explicitly
specify the required simulator tool in the SIMULATOR variable.
Currently, beside Icarus, Modelsim (SIMULATOR="modelsim") and Vivado's
xsim (SIMULATOR="xsim") is supported.
For consistent simulation behavior it is recommended to annotate all source
files with a timescale. Add it to those where it is currently missing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
If the req_valid asserts faster than the ID gets synchronized over we
assert the xfer request without being ready to accept data.
This can lead to overflow assertion when using a FIFO like interface.
Data mover/ src axis changes
Request rewind ID if TLAST received during non-last burst
Consume (ignore) descriptors until last segment received
Block descriptors towards destination until last segment received
Request generator changes
Rewind the burst ID if rewind request received
Consume (ignore) descriptors until last segment received
If TLAST happened on last segment replay next transfer (in progress or
completed) with the adjusted ID
Create completion requests for ignored segments
Response generator changes
Track requests
Complete segments which got ignored
Length of partial transfers are stored in a queue for SW reads.
The presence of partial transfer is indicated by a status bit.
The reporting can be enabled by a control bit.
The progress of any transfer can be followed by a debug register.
Drive the descriptor from the source side to destination
so we can abort consecutive transfers in case TLAST asserts.
For AXIS count the length of the burst and pass that value to the
destination instead the programmed one. This is useful when the
streams aborts early by asserting the TLAST. We want to notify the
destination with the right number of beats received.
For FIFO source interface reuse the same logic due the small footprint
even if the stream does not got interrupted in that case.
For MM source interface wire the burst length from the request side to
destination.
Reduce the width of ID signals to avoid size mismatches in Arria 10 SoC
projects where the ID width of the hard IP is 4.
The width of ID that reaches the slave can be increased by the interconnect if
multiple masters access the slave so we end up with mismatches.
Since these signals are unused it is safe to reduce them to minimum width and
let the interconnect zero-extend them as required.
The buffers inside the interconnect are sized based on maximum burst sizes
the masters can produce.
For AXI4 the max burst size is 128 but for these projects for the
default burst size of 128 bytes the DMACs are creating only burst of 8 or
16 beats depending on the bus width (128bits and 64 bits respectively).
These burst sizes can fit in the AXI3 protocol where the max burst
length is 16. Therefore the interconnect will be reduced.
The observed reduction is around 4 Mb of block RAM per project.
Another benefit is a better timing closure,
since these buffers reside in the DDR3 clock domain.
The round function from tcl is a rounding to nearest. Using it in address
width calculation produces incorrect values.
e.g.
round(log(0xAF000000)/log(2)) will produce 31 instead of 32
The fix is to replace the rounding function with ceiling that guarantees
rounding up.
In MM2S applications like video DMA it is useful to mark the end of the stream
with the TLAST.
The change enables the generation of the TLAST on the last beat of the
last row of the 2d transfer.