A larger store-and-forward memory provides better protection against worst
case memory interface latencies by being able to store more data before
over-/underflowing.
Based on empirical testing it was found that using a size of 4 bursts can
still result in underflows/overflows under certain conditions. These do not
happen when using a size of 8 bursts.
This change does not significantly increase resource consumption. Both on
Intel and Xilinx the block RAM has a minimum depth of 512 entries. With a
default burst length of 16 beats that allows for up to 32 bursts without
requiring additional block RAM.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The label for the store-and-forward memory size configuration option at the
moment is just "FIFO Size" and while the store-and-forward memory uses a
FIFO that is just a implementation detail.
Change the label to "Store-and-Forward Memory Size". This is more
descriptive as it references the function not the implementation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For correct operation the store-and-forward memory size must be a
power-of-two in the range of 2 to 32.
This is simple enough so we can list all values and let the IP Integrator
and QSYS perform proper validation of the parameter.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This comment hasn't been true in a long long time. It does not have any
relation to the code around it anymore.
So just remove it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit e6aacd2f56 ("axi_dmac: Better support debug IDs when ID_WIDTH !=
3") managed to get the order of the IDs in the debug register wrong.
Restore the original order.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Split the register map code into a separate sub-module instead of having it
as part of the top-level axi_dmac.v file.
This makes it easier to component test the register map behavior
independently from the DMA transfer logic.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the source and destination bus widths don't match a resize block is
inserted on the side of the narrower bus. This resize block can contain
partial data.
To ensure that there is no residual partial data is left in the resize
block after a transfer shutdown the resize block is reset when the DMA is
disabled.
Currently this is implemented by tying the reset signal of the resize block
to the enable signal of the DMA. This enable signal is only a indicator
though that the DMA should shutdown. For a proper shutdown outstanding
transactions still need to be completed.
The data that is in the resize block might be required to complete those
transactions. So performing the reset when the enable signal goes low can
lead to a situation where the DMA tries to complete a transaction but can't
do it because the data required to do so has been erased by resetting the
resize block. This leads to a dead lock and the system has to be rebooted
to recover from it.
To solve this use the sync_id signal to reset the resize block. The sync_id
signal will only be asserted when both the destination and source side
module have indicated that they are ready to be reset and there are no more
pending transactions.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The MAX_BYTES_PER_BURST option allows to configure the maximum bytes that
are part of a burst. This can be an arbitrary value.
At the same time there is a limit of how many bytes can be supported by the
memory buses. A AXI3 interface supports a maximum of 16 beats per burst
and a AXI4 interface supports a maximum of 256 beats per burst.
At the moment the it is possible to specify a MAX_BYTES_PER_BURST value
that exceeds what can be supported by the AXI memory-mapped bus. If that is
the case undefined behavior will occur and the DMAC will function
incorrectly.
To avoid this make sure that the MAX_BYTES_PER_BURST value does not exceed
the maximum that can be supported by the interfaces.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The width of the AXI burst length field depends on the AXI standard
version. For AXI3 the width is 4 bits allowing a maximum burst length of 16
beats, for AXI4 it is 8 bits wide allowing a maximum burst length of 256
beats.
At the moment the width of the length signals are determined by type of the
source AXI interface, even if the source interface type is not AXI. This
means if the source interface is set to AXI3 and the destination interface
is set to AXI4 the internal width of the signal for all interfaces will be
4 bits. This leads to a truncation of the destination bus length field,
which is supposed to be 8 bits.
If burst are generated that are longer than 16 beats the upper bits of the
length signal will be truncated. The result of this will be that the
external AXI slave interface (e.g. the DDR memory) and the internal logic
in the DMA disagree about burst length. The DMA will eventually lock up
when its internal buffers are full.
To avoid this issue have different configuration parameters for the source
and destination interface that configure the AXI bus length field width.
This way one of the interfaces can be configured for AXI3 and the other for
AXI4 without interfering with each other.
Fixes: commit 495d2f3056 ("axi_dmac: Propagate awlen/arlen width through the core")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the DMAC is used in async clock domains the data FIFO instantiate
an ad_mem component to handle properly the clock crossing.
For Intel, this mode is used only in FMCJESDADC1 designs but without this
an error could appear in other projects too if the user reconfigures the core.
The set_false_path constraint targeted to the *ram* cells of the dmac
matched several intra clock domain paths where the timing analysis got
ignored resulting in intermitent data integrity issues.
Exposed AXI3 interface on the Intel version of the IP for UI and feature consistency.
Some of the signals that are defined as optional in the AMBA standard
are marked as mandatory in Qsys in case of AXI3. Because of this such signals
were added to the interface of the DMAC and driven with default values.
For Xilinx in order to keep existing behavior the newly added signals
are hidden from the interface.
New parameters are added to define the width of the AXI transaction IDs;
these are hidden from the UI; We can add them to the UI if the fixed size
of the IDs will cause port incompatibility issues.
The primary use-case of the DMA controller is in non-2D mode. Make this the
default, since allows projects to instantiate the controller with the
default configuration without having to explicitly disable 2D support.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the individual IP core dependencies are tracked inside the
library Makefile for Xilinx IPs and the project Makefiles only reference
the IP cores.
For Altera on the other hand the individual dependencies are tracked inside
the project Makefile. This leads to a lot of duplicated lists and also
means that the project Makefiles need to be regenerated when one of the IP
cores changes their files.
Change the Altera projects to a similar scheme than the Xilinx projects.
The projects themselves only reference the library as a whole as their
dependency while the library Makefile references the individual source
dependencies.
Since on Altera there is no target that has to be generated create a dummy
target called ".timestamp_altera" who's only purpose is to have a timestamp
that is greater or equal to the timestamp of all of the IP core files. This
means the project Makefile can have a dependency on this file and make sure
that the project will be rebuild if any of the files in the library
changes.
This patch contains quite a bit of churn, but hopefully it reduces the
amount of churn in the future when modifying Altera IP cores.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The include files are currently only implicitly added to the component file
list. Do it explicitly as this will make sure that they show up in the
generated Makefile dependency list.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This reduces the amount of boilerplate code that is present in these
Makefiles by a lot.
It also makes it possible to update the Makefile rules in future without
having to re-generate all the Makefiles.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Bundle the TLAST signal in with the other AXIS slave signals to enable
easier connection between AXIS devices that use TLAST
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com>
Add some limit TLAST support for the streaming AXI source interface. An
asserted TLAST signal marks the end of a packet and the following data beat
is the first beat for the next packet.
Currently the DMAC does not support for completing a transfer before all
requested bytes have been transferred. So the way this limited TLAST
support is implemented is by filling the remainder of the buffer with 0x00.
While the DMAC is busy filling the buffer with zeros back-pressure is
asserted on the external streaming AXI interface by keeping TREADY
de-asserted.
The end of a buffer is marked by a transfer that has the last bit set in
the FLAGS control register.
In the future we might add support for transfer completion before all
requested bytes have been transferred.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The commit 6900c have added an additional register stage into the fifo read
data path, but the control signals (ready/valid/underflow) were not realigned
to the data. This can cause data lose or duplicated samples in some case.
Realign the control signals to the data.
The first attempt (f3daf0) faild miserably. When the data_req signal
from the device had more than 1 cycle of deassert state, because of the
added latency of the data stream, the device got 'zeros' too.
In this fix, the DMA will hold the valid data on the bus, between two
consecutive data request. The bus is reseted just after all the data
were sent out.
Reset the fifo_rd_data if the DMA does not have an active transfer.
Becasue all the DAC device cores are transfering the data from the FIFO
interface to the data interface without any validation signal, DMA needs to put
the data bus into a known state, to prevent the device core to send the
last known data again and again.
The current layout of the debug ID register assumes that the ID_WIDTH is 3.
Change things so that the padding 0 width depends on the ID_WIDTH
parameter so that we end up with the same register layout regardless of the
value of ID_WIDTH.
Also split things into two registers, this allows for an ID_WIDTH up to 8
(which should hopefully be enough for all practical applications).
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The AXI specification that the minimum address space size is 4k, make sure
the axi_dmac adheres to this.
Internally the register space is still 2k. This means the upper and lower
2k of the axi4lite register space will map to the same internal registers.
Software must not rely on this and only access the lower 2k to enable
compatibility in case the internal space grows in the future.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Terminate the m_axi_list signal of the data mover instance in the
src_axi_stream module. This avoids a warning about the port being
unconnected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the axi_dmac_hw.tcl script does not create interfaces if they
are not used in the current configuration. This has the disadvantage that
the ports belonging to these interfaces are not included in the generated
HDL wrapper. Which will generate a fair bunch of warnings when synthesizing
the HDL.
Instead always generate all interfaces, but disable those that are not used
in the current configuration. This will make sure that the ports belonging
to these interfaces are properly tied-off in the generate wrapper HDL.
This reduces the amount of false positive warnings generated and makes it
easier to spot actual issues.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC currently doesn't support transfers where the length is not a
multiple of the bus width. When generating the wstrb signal we do pretend
though that we do and dynamically generate it based on the LSBs of the
transfer length.
Given that the other parts of the DMA don't support such transfers this is
unnecessary though. So remove it for now and replace it with a constant
expression where wstrb is always fully asserted.
The generated logic for the wstrb signal was quite terrible, so this
improves the timing of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the read side of the src_response interface is not used. This
leads to warnings about signals that have a value assigned but are never
read.
To avoid this just comment out all signals that are related to the
src_response interface for now.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the right hand side expression of assignments is not wider
than the target signal. This avoids warnings about implicit truncations.
None of these changes affect the behaviour, just fixes some warnings about
implicit signal truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The external s_axi_{awaddr,araddr} signals that are connect to the core
have their width set according to the specified size of the register map.
If the s_axi_{awaddr,araddr} signal of the core is wider (as it currently
is for many cores) the MSBs of those signals are left unconnected, which
generates a warning.
To avoid this make sure that the signal width matches the declared register
map size.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dmac can issue up to FIFO_SIZE read and write requests in parallel.
This is done in order to maximize throughput and compensate for for
latency.
Set the {read,write}IssuingCapability properties accordingly on the AXI
master interfaces. Otherwise qsys might decide to insert bridges that
artificially limit the number of requests, which in turn might affect
performance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>