A converter typically only supports a specific subset of framer
configurations.
Add a configuration parameter to select a specific converter part number.
Based on the selected part a mode validation will be performed and if the
selected framer configuration is not supported by the part an error will be
generated.
This helps to catch invalid configurations early on rather than having to
first build the bitstream and then notice that it does not work.
When using "Generic" for the part configuration parameter no validation
will be done and any framer configuration can be selected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The exact layout of the input data into the DAC transport layer core
depends on the framer configuration. The number of input channels is
always equal to the NUM_CHANNELS parameter, but the number of samples per
channel per beat depends on the ratio of number of lanes, number of
channels and bits per sample.
It is possible to compute this manually, but this might require in-depth
knowledge about how the JESD204 framer works. Add read-only parameters that
display the number of samples per channel per beat as well as the total
width of the channel data signal.
This information can also be queried in QSys scripts and used to
automatically configure the input pipeline. E.g. like the upack core:
set NUM_OF_CHANNELS [get_instance_parameter_value jesd204_transport NUM_CHANNELS]
set CHANNEL_DATA_WIDTH [get_instance_parameter_value jesd204_transport CHANNEL_DATA_WIDTH]
add_instance util_dac_upack util_upack
set_instance_parameter_values util_dac_upack [list \
CHANNEL_DATA_WIDTH $CHANNEL_DATA_WIDTH \
NUM_OF_CHANNELS $NUM_OF_CHANNELS \
]
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For a specific set of L, M and NP framer configuration parameters there is
an infinite set of possible values for the S and F configuration parameters
as long as S and F are integer and the following relationship is met
S / F = (L * 8) / (M * NP)
Typically the preferred framer configuration is the one with the lowest
latency. The lowest latency is achieved when S is minimal.
Automatically compute and select this value for S instead of having the
user to manually provide a value.
Since some converters allow modes where S is not minimal provide a manual
overwrite to specify S manually in case somebody wants to use such a mode.
For completeness also add a read-only OCTETS_PER_FRAME (F) parameter that
can be used to verify and check which value for F was chosen.
There is no manual overwrite for F since if L, M, NP and S are set to a
fixed value there is only a single possible value for F, which is computed
automatically.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ad_ip_jesd204_tpl_dac currently only supports JESD204 modes that have
both N and N' set to 16.
Newer DACs like the AD9172 support modes where N and N' are not equal to
16. Add support for these modes.
The width of the internal channel data path is set to N, only processing as
many bits as necessary. At the framer the data is up-sized to N' bits with
tail bits inserted as necessary. This data is then passed to the link
layer.
The width at the DMA interface is kept at 16 bits per sample regardless of
the configuration of either N or N'. This is done to keep the interface
consistent with the existing infrastructure it will connect to like upack
and DMA. The data is expected to the LSB aligned, the unused MSBs will be
ignored.
Same is true for the test-pattern data registers. These register keep their
existing 16-bit layout, but unused MSBs will be ignored by the core.
The PN generators are modified to create only N bits of data per sample.
Note that while the core can now support modes with N' = 12 there is still
the restriction that requires the number of frames per beat to be an even
number. Which means that not all modes with N' = 12 can be supported yet.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The current framer implementation is limited in that it only supports N'=16
and either S=1 or F=1.
Rework the framer implementation to be more flexible and support more
framer setting combinations.
The new framer implementation performs the mapping in two steps. First it
groups samples into frames, as there might be more than one frame per beat.
In the second step the frames are distributed onto the lanes.
Note that this still results in a single input bit being mapped onto a
single output bit and no combinatorial logic is involved. The two step
implementation just makes it (hopefully) easier to follow.
The only restriction that remains is that number of frames per beat must be
integer. This means that F must be either 1, 2 or 4. Supporting partial
frames would result in partial sample sets being consumed at the input,
which is not supported by input pipeline.
The new framer has provisions for handling values for the number of octets
per beat other than 4, but this is not exposed as a configuration option
yet since the link layer can only handle 4 octets per beat. Making the
octets per beat configurable is something for future iterations of the
core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ad_ip_jesd204_tpl_dac currently is only instantiated as a submodule by
other cores like the axi_ad9144 or axi_ad9152. These cores typically only
support one specific framer configuration.
In an effort to allow more framer configurations to be used the core is
re-worked, so it can be instantiated standalone.
As part of this effort provide GUI integration for Xilinx IP Integrator
where users can instantiate and configure the core.
For this group the configuration parameters by function, provide
descriptive label and a list of allowed values for parameter validation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ad_ip_jesd204_tpl_dac currently is only instantiated as a submodule by
other cores like the axi_ad9144 or axi_ad9152. These cores typically only
support one specific framer configuration.
In an effort to allow more framer configurations to be used the core is
re-worked, so it can be instantiated standalone.
As part of this effort provide GUI integration for Intel Platform Designer
(previously known as Qsys) where users can instantiate and configure the
core.
For this group the configuration parameters by function, provide
descriptive label and a list of allowed values for parameter validation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The framer module is purely combinational at this point and the clk signal
is unused.
This is a leftover of commit commit 5af80e79b3 ("ad_ip_jesd204_tpl_dac:
Drop extra pipeline stage from the framer").
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit 5d044b9fd3 ("ad_ip_jesd204_tpl_dac: Share PN sequence generator
between all channels") add a new file to the ad_ip_jesd204_tpl_dac, but
neglected to update the hw.tcl for the axi_ad9144 and axi_ad9152 which use
this file.
The result is that Intel project using these cores currently do not build.
Fix it by adding the missing file to the file list.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Reduce the width of ID signals to avoid size mismatches in Arria 10 SoC
projects where the ID width of the hard IP is 4.
The width of ID that reaches the slave can be increased by the interconnect if
multiple masters access the slave so we end up with mismatches.
Since these signals are unused it is safe to reduce them to minimum width and
let the interconnect zero-extend them as required.
The buffers inside the interconnect are sized based on maximum burst sizes
the masters can produce.
For AXI4 the max burst size is 128 but for these projects for the
default burst size of 128 bytes the DMACs are creating only burst of 8 or
16 beats depending on the bus width (128bits and 64 bits respectively).
These burst sizes can fit in the AXI3 protocol where the max burst
length is 16. Therefore the interconnect will be reduced.
The observed reduction is around 4 Mb of block RAM per project.
Another benefit is a better timing closure,
since these buffers reside in the DDR3 clock domain.
This improvement will solve a couple of [DRC REQP-1839] warning:
"The RAMB36E1 has an input control pin * which is driven by a register * that has
an active asynchronous set or reset. This may cause corruption of the memory
contents and/or read values when the set/reset is asserted and is not analyzed
by the default static timing analysis. It is suggested to eliminate the use of
a set/reset to registers driving this RAMB pin or else use a synchronous reset
in which the assertion of the reset is timed by default."
The frame synchronization between axi_hdmi_tx and axi_dmac is based
on the DMA(2D streaming) last signal. The last signal will be used as
an end of frame signal marking the beginning of the future frame to be
transferred by the DMA.
Only after both HDMI and DMA are ready for a "new frame" data will be
requested from the DMA.
The datarate and CDC between the axi_dmac and axi_hdmi_tx cores
will be handled by axi_hdmi_tx's DMA interface based on a backpressure
mechanism.
Add a interface definition for the link interface that combines the valid,
ready and data signals into a AXI streaming interface.
This allows to connect the interface to the JESD204 link layer peripheral
in one go without having to manually connect each signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some modes produce only one sample per channel per beat, e.g. when M=2*L.
In this case the pattern output needs to alternate between the two patterns
from beat to beat.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All channels have a copy of the same logic to generate the PN sequences.
Sharing the PN sequence generator among all channels slightly reduces the
resource utilization of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Only the N (where N is the size of the PN sequence) MSB bits of the reset
state of the PN generator should be set to 1. All other bits should be
initialized following the PN generator sequence.
Otherwise the first set of samples contain an incorrect PN sequence.
This does not increase the complexity of the PN generator, all reset values
are still constant.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the inputs to the framer are registered. And the framer itself does not
have any combinatorial logic, it just re-orders the wire numbering of the
individual bits.
Currently the framer module adds a output register stage, but since there
is no logic in the framer this just means that these registers are directly
connected to the output of the previous register stage.
Remove the extra pipeline register. This slightly reduces utilization and
pipeline delay of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Remove unused register from the ad_ip_jesd204_tpl_dac_channel module.
Commit commit 92f0e809b5 ("jesd204/ad_ip_jesd204_tpl_dac: Updates for
ad_dds phase acc wrapper") removed all users of those registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All parameters are DAC related since this is a peripheral that handles
DACs. Having DAC as a prefix on some of the parameter names is a bit
redundant, so remove them.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use a relative path for all IP local files. This is the common style
throughout the HDL repository and also makes it easier to move the
directory around.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
- connect unused GPIO inputs to loopback
- connect unconnected inputs to zero
- complete interface for system_wrapper instantiated in all system_top
fixes incomplet portlist WARNING [Synth 8-350]
fixes undriven inputs WARNING [Synth 8-3295]
The change excludes the generated system.v and Xilinx files.
This patch will fix the following critical warning, generated by Quartus:
"Critical Warning (18061): Ignored Power-Up Level option on the following
registers
Critical Warning (18010): Register ad_rst:i_core_rst_reg|rst_sync will power
up to High File: ad_rst.v Line: 50"
For a proper reset synchronization, the asynchronous reset signal should
be connected to the reset pins of the two synchronizer flop, and the
data input of the first flop should be connected to VCC.
In the first stage we're synchronizing just the reset de-assertion, avoiding
the scenario when different parts of the design are reseting at different time,
causing unwanted behaviours.
In the second stage we're synchronizing the reset assertion.
The module expects an ACTIVE_HIGH input reset signal, and provides an ACTIVE_LOW
(rstn) and an ACTIVE_HIGH (rst) synchronized reset output signal.
Assign a unique value to each lane's error count register and verify that
the correct value is returned for the right lane.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The RX register map testbench currently fails because the expected value
for the version register was not updated, when the version was incremented.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The loopback testbench currently fails, because the cfg_links_disable signal is not connected to the RX side of the link.
Fix this.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In case when the SYSREF is connected to an FPGA IO which has a limitation
on the IOB register IN_FF clock line and the required ref clock is high
we can't use the IOB registers.
e.g. the max clock rate on zcu102 HP IO FF is 312MHz but ref clock is 375MHz;
If IOB is used in this case a pulse width violation is reported.
This change makes the IOB placement selectable in such case or
for targets which don't require class 1 operation.
The round function from tcl is a rounding to nearest. Using it in address
width calculation produces incorrect values.
e.g.
round(log(0xAF000000)/log(2)) will produce 31 instead of 32
The fix is to replace the rounding function with ceiling that guarantees
rounding up.
- remove reset logic
- add wait for dac valid logic
- rewrite sine concatenation on wires for different path width to
suppress warnings
- use computed atan LUT tables
The CORDIC has a selectable width range for phase and data of 8-24.
Regarding the width of phase and data, the wider they are the smaller
the precision loss when shifting but with the cost of more FPGA
utilization. The user must decide between precision and utilization.
The DDS_WD parameter is independent of CORDIC(CORDIC_DW) or
Polynomial(16bit), letting the user chose the output width.
Here we encounter two scenarios:
* DDS_DW < DDS data width - in this case, a fair rounding will be
implemented corresponding to the truncated bits
* DDS_DW > DDS data width - DDS out data left shift to get the
corresponding concatenation bits.
Update for the parametrized ad_mul module. This will scale
a selectable sine width in a multiplication module.
Rename the data and phase width parameters for legibility.
When the tool calculates the X value for different phase widths, we
get rounding errors for every width in the interval [8;24].
Depending on the width thess errors cause overflows or smaller amplitudes
of the sine waves.
The error is not linear nor proportional with the phase. To fix the issue
a simple aproximation was chosen.
Perform the shifting operation before addition/subtraction in a
rotation stage. In the previous method, the result of the arithmetic
operation was shifted and the outcome was presented to the next stage.
In this way, data connections will be reduced between pipeline stages
Add parameters:
- to select the sine generator (polynomial/CORDIC)
- to select the CORDIC data width(default 16)
Suppress the warnings generated when the DDS is disabled.
https://en.wikipedia.org/wiki/CORDIC
Configurable in/out data width (14,16,18,20);
The HDL implementation requires pipelines, resulting in a
data_width + 2 clock cycles delay between the phase input data and the
sine data. For this reason, a ddata (delay data) was propagated through
the pipeline stages to help in future use scenarios
Typically only one of the character error conditions is true at a time. And
even if multiple errors were present at the same time we'd only want to
count one error per character.
For each character track whether at least one of the monitored error
conditions is true. Then count the number of characters for which at least
one error condition occurred. And finally add that sum to the total numbers
of errors.
This results in a slightly better utilization.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the link is explicitly disabled through the control interface reset
the error statistics counter.
There is usually little benefit to preserving until after the link has been
disabled. If software is interested in the values it can read them before
disabling the link. Having them reset makes the behavior consistent with
all other internal state of the jesd204 RX peripheral, which is reset when
the link is disabled.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In MM2S applications like video DMA it is useful to mark the end of the stream
with the TLAST.
The change enables the generation of the TLAST on the last beat of the
last row of the 2d transfer.
The index on MSB of addresses was set to 31,
but the width of address in the axi_dmac depends on a parameter.
The mismatch causes issues in the Xilinx simulator which does not extends the
narrower width signal with zeros, instead the wider signal gets 'Z' on its MSBs.
When the address was incremented with the stride it became 'X' due the uninitialized
MSBs.
Vivado recognises .h files as C header files,
the expected extension for Verilog Header is .vh
This causes issues in simulating block designs since these files
won't be exported for the simulation even if they are
part of the simulation fileset.
When creating a block design targeted for simulation, in the testbench
it is useful to know the parameters of the sub components (e.g DMAC)
Xilinx's way to pass the parameters to the testbench in case of it's AXI
verification IP is through package files. We will do the same for the DMAC.
The package file can be generated from template files (ttcl).
These will be added only to the simulation file set of the project and
won't affect synthesis.
This change adds a diagnostic interface to the DMAC core.
The interface exposes internal information about the core,
information which can't be exposed through AXI registers
due the latency and update rate.
Such information is the fullness of the internal buffer.
For this is exposed in bursts and is driven from the destination
clock domain, as this is reflected in its name.
The signal has a fixed size and is dimensioned by
taking in account the supported maximum number of bursts of 128.
This change adds the TLAST signal to the AXI streaming interface
of the source side for Intel targets.
Xilinx based designs already have this since the tlast is part of the
interface definition.
In order to make the signal optional and let the tool connect a
default value to the it, the USE_TLAST_SRC/DEST parameter is
added to the configuration UI. This conditions the tlast port on
the interface of the DMAC.
Xilinx handles the optional signals much better so the parameter
is not required there.
In its current implementation the DMAC requires that the length of a
transfer is aligned to the widest interface. E.g. if the widest interface
is 128 bits wide the length of the transfer needs to be a multiple of 16
bytes.
If the requested length is not aligned to the interface width it will be
rounded up.
This works fine as long as both interfaces have the same width. If they
have different widths it is possible that the length is rounded up to
different values on the source and destination side. In that case the DMA
will deadlock because the transfer lengths don't match and either not enough
of too much data is delivered from the source to the destination side.
Currently it is up to software to make sure that such an invalid
configuration is not possible.
Also enforce this requirement in the DMAC itself by setting the LSBs of the
transfer length to a fixed 1 so that the length is always aligned to the
widest interface.
Software can also use this to discover the length alignment requirement, by
first writing a zero to the length register and then reading the register
back. The LSBs of the read back value will be non-zero indicating the
alignment requirement.
In a similar way the stride needs to be aligned to the width of its
respective interface, so the generated addresses stay aligned. Enforce this
in the same way by keeping the LSBs cleared.
Increment the minor version number to reflect these changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The transfer abort logic in the src_axi_stream module is making some
assumptions about the internal timings of the data mover module.
Move this logic inside the data mover module. This will make it easier to
update the internal logic without having to update other modules.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The only two users of the data mover module both implement the same
sync-transfer-start logic. Move this into the data mover module to avoid
the duplicated code.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
With the recent rework there is now a fair amount of dead code in the
datamover module that is no longer used. Remove it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Data is gated on the source side interface and not let into the pipeline if
there is no space available inside the store and forward memory.
This means whenever data is let into the pipeline space is available and
backpressure wont be asserted. Remove the backpressure signals altogether
to simplify the design.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the source side of the DMAC can issue requests for up to
2*FIFO_SIZE-1 bursts even though there is only room for FIFO_SIZE bursts in
the store and forward memory.
This can problematic for memory mapped buses. If the data is not read fast
enough from the DMAC back-pressure will propagate through the whole system
memory subsystem and can cause significant performance penalty or even a
deadlock halting the whole system.
To avoid this make sure that not more that than what fits into the
store-and-forward memory is requested by throttling the request ID based
on how much room is available in the memory.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The second destination side register slice was put in place to provide
additional slack on some of the datapath control signals. It looks as if
this is no longer required for the latest version of the DMA controller.
All timing paths have sufficient margin.
So remove this extra slice register which just takes up resources and adds
pipeline latency.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently both the source side and the destination side interfaces employ a
beat counter to identify the last beat in a burst.
The burst memory already has an internal last signal on the destination
side. Exporting it allows the destination side interfaces to use it instead
of having to generate their own signal. This allows to eliminate the beat
counters on the destination side and simplify the data path logic.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the destination side request ID is synchronized response ID from
the source side. This signal is effectively the same as the synchronized
src ID inside the burst memory. The only difference is that they might not
increment in the exact same clock cycle.
Exporting the request ID from the burst memory means we can remove the extra
synchronizer block.
This has the added bonus that the request ID will increment in the same
clock cycle as when the data becomes available from the memory.
This means we can assume that when there is a outstanding burst request
indicated via the ID that data is available from the memory and vice versa
when data is available from the memory that there is a outstanding burst
request.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the DMAC uses a simple FIFO as the store-and-forward buffer. The
FIFO handshaking is beat based whereas the remainder of the DMAC is burst
based. This means that additional control signals have to be combined with
the FIFO handshaking signal to generate the external handshaking signals.
Re-work the store-and-forward buffer to utilize a BRAM that is subdivided
into N segments. Where N is the maximum number of bursts that can be stored
in the buffer and each segment has the size of the maximum burst length.
Each segment stores the data associated with one burst and even when the
burst is shorter than the maximum burst length the next burst will be
stored in the next segment.
The new store-and-forward buffer takes care of generating all the
handshaking signals. This means handshaking is generated in a central place
and does not have to be combined from multiple data-paths simplifying the
overall logic.
The new store-and-forward buffer also takes care of data width up- and
down-sizing in case that the source and sink modules have a different data
width. This tighter integration will allow future enhancements like using
asymmetric memory.
This re-work lays the foundation of future enhancements to the DMA like
support for un-aligned transfers and early transfer abort which would have
been much more difficult to implement with the previous architecture.
In addition it significantly reduces the resource utilization of the
store-and-forward buffer and allows for better timing due to reduced
combinatorial path lengths.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
There is an implicit dependency between the outgoing data stream and the
incoming response stream. The AXI specification requires that the
corresponding response is not sent before the last beat of data has been
received.
We can take advantage of this and remove the currently explicit dependency
between the data and response paths. This slightly simplifies the design.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For the AXI streaming interfaces we need to make sure that the handshaking
rules for the external interface are met. Hence we can't just disable the
DMA and have to wait for any pending beats to complete.
For the FIFO interfaces on the other hand no such requirements exist. All
handshaking is for the internal pipeline which will be reset as a whole so
it is OK to violate the handshaking without causing any undefined behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For the memory-mapped AXI read interface the slave asserts rlast for the
last beat in a burst.
This means we don't have to count the number of beats to know when the
burst is completed but instead can use rlast. This slightly reduces the
amount of resources needed for the MM-AXI source module and given that the
beat_counter is often the bottleneck timing wise this should also improve
the timing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the DMA is disabled it should gracefully shutdown and eventually end
up in an idle state. All outstanding AXI MM requests need to complete
before the DMA is fully disabled.
Add testbenches that test this for both AXI MM read and write behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC allows a transfer to be aborted. When a transfer is aborted the
DMAC shuts down as fast as possible while still completing any pending
transactions as required by the protocol specifications of the port. E.g.
for AXI-MM this means to complete all outstanding bursts.
Once the DMAC has entered an idle state a special synchronization signal is
send to all modules. This synchronization signal instructs them to flush
the pipeline and remove any stale data and metadata associated with the
aborted transfer. Once all data has been flushed the DMAC enters the
shutdown state and is ready for the next transfer.
In addition each module has a reset that resets the modules state and is
used at system startup to bring them into a consistent state.
Re-work the shutdown process to instead of flushing the pipeline re-use the
startup reset signal also for shutdown.
To manage the reset signal generation introduce the reset manager module.
It contains a state machine that will assert the reset signals in the
correct order and for the appropriate duration in case of a transfer
shutdown.
The reset signal is asserted in all domains until it has been asserted for
at least 4 clock cycles in the slowest domain. This ensures that the reset
signal is not de-asserted in the faster domains before the slower domains
have had a chance to process the reset signal.
In addition the reset signal is de-asserted in the opposite direction of
the data flow. This ensures that the data sink is ready to receive data
before the data source can start sending data. This simplifies the internal
handshaking.
This approach has multiple advantages.
* Issuing a reset and removing all state takes less time than
explicitly flushing one sample per clock cycle at a time.
* It simplifies the logic in the faster clock domains at the expense of
more complicated logic in the slower control clock domain. This allows
for higher fMax on the data paths.
* Less signals to synchronize from the control domain to the data domains
The implementation of the pause mode has also slightly changed. Pause is
now a simple disable of the data domains. When the transfer is resumed
after a pause the data domains are re-enabled and continue at their
previous state.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the transfer logic, including the 2d module, into its own sub-module.
This allows testing of the full transfer logic independently of the
register map logic.
The top-level module now only instantiates the register map and transfer
module, but does not have any logic on its own.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The timing exceptions for the debug paths are currently a bit to broad and
can include paths that should not have an exception.
All the debug signals are coming from the i_request_arb instance, so
include that in the match to avoid false positives.
For most projects this wont have been a problem since there is usually a
fair amount of slack on the paths that were affected by this. But in
projects with high utilization this might result in undefined behavior.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fix the read side of the CDC data FIFO. The read address generation did not
function correctly.
Redesign the read side of the FIFO, and make sure that it becomes empty after
the DMA transfer ends; and never get stock in a cyclic mode.
The dac_last signal is not used anywhere in the module. Remove it and its
synchronization registers.
Fixes the following warnings:
[Synth 8-6014] Unused sequential element dac_dlast_reg was removed. ["axi_dacfifo_rd.v":372]
[Synth 8-6014] Unused sequential element dac_dlast_m1_reg was removed. ["axi_dacfifo_rd.v":373]
[Synth 8-6014] Unused sequential element dac_dlast_m2_reg was removed. ["axi_dacfifo_rd.v":374]
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit bfc8ec28c3 ("util_axis_fifo: instantiate block ram in async mode")
added the read-enable (reb) signal to the ad_mem block.
It didn't update the ad_mem instance in axi_dacfifo_address_buffer.v. This
results in the read-enable of the address_buffer being tied to 0.
Fix this by connecting the same signal that increments the read address.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC implementation guarantees that the expression `dma_valid &
dma_xfer_req` is always identical to just dma_valid.
When generating the util_dacfifo dma_wren_s signal the optimizer doesn't know
of this though and hence will route both signals into the LUT that drives
the write enable for the BRAMs.
Simplify the expression by removing dma_xfer_req from it. Considering this
can be a fairly high fan-out net and is typically the bottleneck for the
util_dacfifo timing this helps to improve the timing.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some parts of the util_cdc library rely on dead logic elimination to remove
unused logic. Unfortunately with newer Vivado versions this results in
warnings about unused sequential elements being removed. Like:
WARNING: [Synth 8-6014] Unused sequential element cdc_sync_stage1_reg was removed.
To avoid this encase the logic in generate blocks that makes sure they are
not generated when not needed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This reverts commit 4b1d9fc86b "axi_dmac: Modified in order to avoid
vivado crash".
Vivado no longer crashes and this structure is much more efficient when it
comes to resource usage and timing. The intention here is to create a 1-bit
memory that is N entries deep and not a N bit signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The burst_count signal and its derived signals are not used until the
burst_count has been explicitly initialized by loading a transfer. There is
no need to have a reset.
This reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The data path register of the 2d_transfer module are qualified by the
corresponding valid signal. Their content is not used until they have been
explicitly initialized. There is no need to reset them explicitly.
This reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
There is no need to reset the data path in the address generator. The
values of the register on the data path are not used until they have been
explicitly initialized. Removing the reset simplifies the structure and
reduces the fan-out of the reset signal.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Xilinx tools don't allow to use $clog2() when computing the value of a
localparam, even though it is valid Verilog.
For this reason a parameter was used for BYTES_PER_BURST_WIDTH so far. But
that generates warnings from both Quartus and Vivado since the parameter is
not part of the parameter list.
Fix this by changing it to a localparam and computing the log2() manually.
The upper limit for the burst length is known to be 4k, so values larger
than that don't have to be supported.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
A larger store-and-forward memory provides better protection against worst
case memory interface latencies by being able to store more data before
over-/underflowing.
Based on empirical testing it was found that using a size of 4 bursts can
still result in underflows/overflows under certain conditions. These do not
happen when using a size of 8 bursts.
This change does not significantly increase resource consumption. Both on
Intel and Xilinx the block RAM has a minimum depth of 512 entries. With a
default burst length of 16 beats that allows for up to 32 bursts without
requiring additional block RAM.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The label for the store-and-forward memory size configuration option at the
moment is just "FIFO Size" and while the store-and-forward memory uses a
FIFO that is just a implementation detail.
Change the label to "Store-and-Forward Memory Size". This is more
descriptive as it references the function not the implementation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For correct operation the store-and-forward memory size must be a
power-of-two in the range of 2 to 32.
This is simple enough so we can list all values and let the IP Integrator
and QSYS perform proper validation of the parameter.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This comment hasn't been true in a long long time. It does not have any
relation to the code around it anymore.
So just remove it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
* jesd204: Add RX error statistics
Added 32 bit error counter per lane, register 0x308 + lane*0x20
On the control part added register 0x244 for performing counter reset and counter mask
Bit 0 resets the counter when set to 1
Bit 8 masks the disparity errors, when set to 1
Bit 9 masks the not in table errors when set to 1
Bit 10 masks the unexpected k errors, when set to 1
Unexpected K errors are counted when a character other than k28 is detected. The counter doesn't add errors when in CGS phase
Incremented version number
Commit e6aacd2f56 ("axi_dmac: Better support debug IDs when ID_WIDTH !=
3") managed to get the order of the IDs in the debug register wrong.
Restore the original order.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The cfg_links_disable register will mask the SYNC lines, disabled links
will always have a de-asserted SYNC (logic state HIGH).
The FSM will stay in CGS as long as there is one active link with an
asserted SYNC (logic state LOW).
Update the test bench to generate the SYNC signals in different clock
edges, so it can test all the possible scenarios.
A multi-link is a link where multiple converter devices are connected to a
single logic device (FPGA). All links involved in a multi-link are synchronous
and established at the same time. For a TX link this means that the FPGA receives
multiple SYNC signals, one for each link. The state machine of the TX link
peripheral must combine those SYNC signals into a single SYNC signal that is
asserted when either of the external SYNC signals is asserted.
Dynamic multi-link support must allow to select to which converter devices on
the multi-link the SYNC signal is propagated too. This is useful when depending
on the use case profile some converter devices are supposed to be disabled.
Add the cfg_links_disable[0x081] register for multi-link control and
propagate its value to the TX FSM.
A multi-link is a link where multiple converter devices are connected to a
single logic device (FPGA). All links involved in a multi-link are synchronous
and established at the same time. For a RX link this means that the SYNC signal
needs to be propagated from the FPGA to each converter.
Dynamic multi-link support must allow to select to which converter devices on
the multi-link the SYNC signal is propagated too. This is useful when depending
on the usecase profile some converter devices are supposed to be disabled.
Add the cfg_links_disable[0x081] register for multi-link control and
propagate its value to the RX FSM.
Split the register map code into a separate sub-module instead of having it
as part of the top-level axi_dmac.v file.
This makes it easier to component test the register map behavior
independently from the DMA transfer logic.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In DUAL mode half of the data ports are unused and the unused inputs need
to be connected to dummy signals.
Completely hide the unused ports in DUAL mode to remove that requirement.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the axi_ad9144 core is configured for DUAL mode two of the four
channels are unused. But there is still some residual logic left for those
unused channels that can't be removed by the optimizer.
Completely disable the unused channels by reducing the channel and lane
count. This slightly reduces utilization.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Replace the axi_ad9152 implementation with the new generic JESD204
interface DAC core. The replacement is functionally equivalent.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Replace the axi_ad9144 implementation with the new generic JESD204
interface DAC core. The replacement is functionally equivalent.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For most of the DACs that use JESD204 as the data transport the digital
interface is very similar. They are mainly differentiated by number of
JESD204 lanes, number of converter channels and number of bits per sample.
Currently for each supported converter there exists a converter specific
core which has the converter specific requirements hard-coded.
Introduce a new generic core that has the number of lanes, number of
channels and bits per sample as synthesis-time configurable parameters. It
can be used as a drop-in replacement for the existing converter specific
cores.
This has the advantage of a shared and reduced code base. Code improvements
will automatically be available for all converters and don't have to be
manually ported to each core individually.
It also makes it very easy to introduce support for new converters that
follow the existing schema.
Since the JESD204 framer is now procedurally generated it is also very
easy to support board or application specific requirements where the lane
to converter ratio differs from the default (E.g. use 2 lanes/2 converters
instead of 4 lanes/2 converters).
This new core is primarily based on the existing axi_ad9144.
For the time being the core is not user instantiatable and will only be
used as a based to re-implement the converter specific cores. It will be
extended in the future to allow user instantiation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Replace the axi_ad9680 implementation with the new generic JESD204
interface ADC core. The replacement is functionally equivalent.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Replace the axi_ad9250 implementation with the new generic JESD204
interface ADC core. The replacement is functionally equivalent, except that
the converter clock ratio is now correctly reported as 2 rather than 1 as
before.
Also the adc_rst output port is removed. It is not used in any design. The
current guidelines for the reset for the JESD204 subsystem is to use an
external reset generator.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Replace the axi_ad6676 implementation with the new generic JESD204
interface ADC core. The replacement is functionally equivalent, except that
the converter clock ratio is now correctly reported as 2 rather than 1 as
before.
Also the adc_rst output port is removed. It is not used in any design. The
current guidelines for the reset for the JESD204 subsystem is to use an
external reset generator.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For most of the ADCs that use JESD204 as the data transport the digitial
interface is very similar. They are mainly differentiated by number of
JESD204 lanes, number of converter channels and number of bits per sample.
Currently for each supported converter there exists a converter specific
core which has the converter specific requirements hard-coded.
Introduce a new generic core that has the number of lanes, number of
channels and bits per sample as synthesis-time configurable parameters. It
can be used as a drop-in replacement for the existing converter specific
cores.
This has the advantage of a shared and reduced code base. Code improvements
will automatically be available for all converters and don't have to be
manually ported to each core individually.
It also makes it very easy to introduce support for new converters that
follow the existing schema.
Since the JESD204 deframer is now procedurally generated it is also very
easy to support board or application specific requirements where the lane
to converter ratio differs from the default (E.g. use 2 lanes/2 converters
instead of 4 lanes/2 converters).
This new core is primarily based on the existing axi_ad9680.
For the time being the core is not user instantiatable and will only be
used as a based to re-implement the converter specific cores. It will be
extended in the future to allow user instantiation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ADC DMA will never underflow and unsurprisingly the adc_dunf signal is
never used anywhere. It is very unlikely it will ever be used, so remove
it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DAC DMA will never overflow and unsurprisingly the dac_dovf signal is
never used anywhere. It is very unlikely it will ever be used, so remove
it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some designs choose to swap the positive and negative side of the of the
JESD204 lanes. One reason for this would be because it can simplify the
PCB layout. The polarity is in most cases also only applied to a subset of
the used lanes.
Add support for this to the util_adxcvr module. This done by adding new
parameter to the modules that allows to specify a per lane polarity
inversion. Each bit in the parameter corresponds to one lane. If the bit is
set the polarity is inverted for his lane. E.g. setting the parameter to
0xc will invert the 3rd and 4th lane.
The setting is forwarded to the Xilinx transceiver for the corresponding
lane.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some designs choose to swap the positive and negative side of the of the
JESD204 lanes. One reason for this would be because it can simplify the
PCB layout. The polarity is in most cases also only applied to a subset of
the used lanes.
Add support for this to the adi_jesd204 and jesd204_phy for Altera modules.
This done by adding new parameter to the modules that allows to specify a
per lane polarity inversion. Each bit in the parameter corresponds to one
lane. If the bit is set the polarity is inverted for his lane. E.g. setting
the parameter to 0xc will invert the 3rd and 4th lane.
The setting is forwarded depending on whether soft or hard PCS is used to
either the soft PCS module or the transceiver block itself.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add a parameter to the soft_pcs_loopback_tb that allows to test whether the
soft PCS modules work correctly when the lane polarity is inverted.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some designs choose to swap the positive and negative side of the of the
JESD204 lanes. One reason for this would be because it can simplify the
PCB layout.
To support this add a parameter to the jesd204_soft_pcs_tx module that
allows to specify whether the lane polarity is inverted or not.
The way the polarity inversion is implemented is for free since it just
inverts the output mapping of the 8b10b encoder LUT tables.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some designs choose to swap the positive and negative side of the of the
JESD204 lanes. One reason for this would be because it can simplify the
PCB layout.
To support this add a parameter to the jesd204_soft_pcs_rx module that
allows to specify whether the lane polarity is inverted or not.
The way the polarity inversion is implemented it is for free since it will
only invert the input mapping of the 8b10b decoder LUT tables.
The pattern align module does not care whether the polarity is inverted or
not since the pattern align symbols look the same in both cases.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the source and destination bus widths don't match a resize block is
inserted on the side of the narrower bus. This resize block can contain
partial data.
To ensure that there is no residual partial data is left in the resize
block after a transfer shutdown the resize block is reset when the DMA is
disabled.
Currently this is implemented by tying the reset signal of the resize block
to the enable signal of the DMA. This enable signal is only a indicator
though that the DMA should shutdown. For a proper shutdown outstanding
transactions still need to be completed.
The data that is in the resize block might be required to complete those
transactions. So performing the reset when the enable signal goes low can
lead to a situation where the DMA tries to complete a transaction but can't
do it because the data required to do so has been erased by resetting the
resize block. This leads to a dead lock and the system has to be rebooted
to recover from it.
To solve this use the sync_id signal to reset the resize block. The sync_id
signal will only be asserted when both the destination and source side
module have indicated that they are ready to be reset and there are no more
pending transactions.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The MAX_BYTES_PER_BURST option allows to configure the maximum bytes that
are part of a burst. This can be an arbitrary value.
At the same time there is a limit of how many bytes can be supported by the
memory buses. A AXI3 interface supports a maximum of 16 beats per burst
and a AXI4 interface supports a maximum of 256 beats per burst.
At the moment the it is possible to specify a MAX_BYTES_PER_BURST value
that exceeds what can be supported by the AXI memory-mapped bus. If that is
the case undefined behavior will occur and the DMAC will function
incorrectly.
To avoid this make sure that the MAX_BYTES_PER_BURST value does not exceed
the maximum that can be supported by the interfaces.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The width of the AXI burst length field depends on the AXI standard
version. For AXI3 the width is 4 bits allowing a maximum burst length of 16
beats, for AXI4 it is 8 bits wide allowing a maximum burst length of 256
beats.
At the moment the width of the length signals are determined by type of the
source AXI interface, even if the source interface type is not AXI. This
means if the source interface is set to AXI3 and the destination interface
is set to AXI4 the internal width of the signal for all interfaces will be
4 bits. This leads to a truncation of the destination bus length field,
which is supposed to be 8 bits.
If burst are generated that are longer than 16 beats the upper bits of the
length signal will be truncated. The result of this will be that the
external AXI slave interface (e.g. the DDR memory) and the internal logic
in the DMA disagree about burst length. The DMA will eventually lock up
when its internal buffers are full.
To avoid this issue have different configuration parameters for the source
and destination interface that configure the AXI bus length field width.
This way one of the interfaces can be configured for AXI3 and the other for
AXI4 without interfering with each other.
Fixes: commit 495d2f3056 ("axi_dmac: Propagate awlen/arlen width through the core")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This commit fixes the following warning from the IP packaging flow:
"[IP_Flow 19-801] The last file in file group "Synthesis" should be an HDL file:
"axi_dmac_constr.ttcl". During generation the IP Flow uses the last file to
determine library and other information when generating the top wrapper file.
If possible, please make sure that non-HDL files are located earlier in the list
of files for this file group."
Having the ttcl or other non HDL file at the end of the file group causes issues
when the project preferred language is set to VHDL. Since the synthesis file group
is set to "xilinx_anylanguagesynthesis" the tool tries to guess the type of wrapper
to be generated for that IP based on the last file from the file group.
If the file is non HDL then he defaults to the preferred language (this case VHDL)
Due some issue when the tool tries to create a VHDL wrapper for an IP that has
a Verilog top file with boolean parameters set from the IP packager he fails.
After we reorder the files after each non HDL file addition
he will create a correct Verilog wrapper for it with all parameters
which can be integrated in a VHDL system top file without issues.
Fixes the following warning:
[BD 41-1731] Type mismatch between connected pins: /util_fmcomms11_xcvr/tx_out_clk_0(clk) and /axi_ad9162_core/tx_clk(undef)
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the DMAC is used in async clock domains the data FIFO instantiate
an ad_mem component to handle properly the clock crossing.
For Intel, this mode is used only in FMCJESDADC1 designs but without this
an error could appear in other projects too if the user reconfigures the core.
The set_false_path constraint targeted to the *ram* cells of the dmac
matched several intra clock domain paths where the timing analysis got
ignored resulting in intermitent data integrity issues.
Exposed AXI3 interface on the Intel version of the IP for UI and feature consistency.
Some of the signals that are defined as optional in the AMBA standard
are marked as mandatory in Qsys in case of AXI3. Because of this such signals
were added to the interface of the DMAC and driven with default values.
For Xilinx in order to keep existing behavior the newly added signals
are hidden from the interface.
New parameters are added to define the width of the AXI transaction IDs;
these are hidden from the UI; We can add them to the UI if the fixed size
of the IDs will cause port incompatibility issues.
Fix the following warnings that are generated by Quartus:
Warning (10230): Verilog HDL assignment warning at ad_sysref_gen.v(68): truncated value with size 32 to match size of target (8)
No functional changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fix the following warnings that are generated by Quartus:
Warning (10036): Verilog HDL or VHDL warning at ad_datafmt.v(69): object "sign_s" assigned a value but never read
Move the sign_s and signext_s signals into the generate block in which
they are used.
No functional changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fix the following warnings that are generated by Quartus:
Warning (10236): Verilog HDL Implicit Net warning at util_dacfifo.v(257): created implicit net for "dac_mem_ren_s"
Warning (10230): Verilog HDL assignment warning at util_dacfifo.v(166): truncated value with size 32 to match size of target (10)
Warning (10230): Verilog HDL assignment warning at util_dacfifo.v(266): truncated value with size 32 to match size of target (10)
Warning (10230): Verilog HDL assignment warning at util_dacfifo.v(268): truncated value with size 32 to match size of target (10)
No functional changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The primary use-case of the DMA controller is in non-2D mode. Make this the
default, since allows projects to instantiate the controller with the
default configuration without having to explicitly disable 2D support.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the file names must have the same name as its module. Change all the
files, which did not respect this rule.
Update all the make files and Tcl scripts.
Most of the cores are fully covered by the generic constraint files. When
the constraints where moved from the core specific to the generic
constraint files some empty core constraints files where left around. These
don't do anything, so remove them.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The standard Makefile output is very noisy and it can be difficult to
filter the interesting information from this noise.
In quiet mode the standard Makefile output will be suppressed and instead a
short human readable description of the current task is shown.
E.g.
> make adv7511.zed
Building axi_clkgen library [library/axi_clkgen/axi_clkgen_ip.log] ... OK
Building axi_hdmi_tx library [library/axi_hdmi_tx/axi_hdmi_tx_ip.log] ... OK
Building axi_i2s_adi library [library/axi_i2s_adi/axi_i2s_adi_ip.log] ... OK
Building axi_spdif_tx library [library/axi_spdif_tx/axi_spdif_tx_ip.log] ... OK
Building util_i2c_mixer library [library/util_i2c_mixer/util_i2c_mixer_ip.log] ... OK
Building adv7511_zed project [projects/adv7511/zed/adv7511_zed_vivado.log] ... OK
Quiet mode is enabled by default since it generates a more human readable
output. It can be disabled by passing VERBOSE=1 to make or setting the
VERBOSE environment variable to 1 before calling make.
E.g.
> make adv7511.zed VERBOSE=1
make[1]: Entering directory 'library/axi_clkgen'
rm -rf *.cache *.data *.xpr *.log component.xml *.jou xgui
*.ip_user_files *.srcs *.hw *.sim .Xil .timestamp_altera
vivado -mode batch -source axi_clkgen_ip.tcl >> axi_clkgen_ip.log 2>&1
...
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the individual IP core dependencies are tracked inside the
library Makefile for Xilinx IPs and the project Makefiles only reference
the IP cores.
For Altera on the other hand the individual dependencies are tracked inside
the project Makefile. This leads to a lot of duplicated lists and also
means that the project Makefiles need to be regenerated when one of the IP
cores changes their files.
Change the Altera projects to a similar scheme than the Xilinx projects.
The projects themselves only reference the library as a whole as their
dependency while the library Makefile references the individual source
dependencies.
Since on Altera there is no target that has to be generated create a dummy
target called ".timestamp_altera" who's only purpose is to have a timestamp
that is greater or equal to the timestamp of all of the IP core files. This
means the project Makefile can have a dependency on this file and make sure
that the project will be rebuild if any of the files in the library
changes.
This patch contains quite a bit of churn, but hopefully it reduces the
amount of churn in the future when modifying Altera IP cores.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The include files are currently only implicitly added to the component file
list. Do it explicitly as this will make sure that they show up in the
generated Makefile dependency list.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Some IP core have files in their file list for common modules that are not
used by the IP itself. Remove those.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DC filter implementation in library/common/dc_filter.v is Xilinx
specific as it uses the Xilinx DSP48 hard-macro. There is a matching Altera
specific implementation in library/altera/common/dc_filter.v.
Move the Xilinx specific implementation from the generic common folder to
the Xilinx specific common folder in library/xilinx/common/ since that is
where all other Xilinx specific common modules reside.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the IP component dependency in the Makefile system is the Vivado
project file. The project file is only a intermediary product in producing
the IP component definition file.
If building the component definition file fails or the process is aborted
half way through it is possible that the Vivado project file for the IP
component exists, but the IP component definition file does not.
In this case there will be no attempt to build the IP component definition
file when building a project that has a dependency on the IP component.
Building the project will fail in this case.
To avoid this update the Makefile rules so that the IP component definition
file is used as the dependency. In this case the IP component will be
re-build if the component definition file does not exist, even if the
project file exists.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This reduces the amount of boilerplate code that is present in these
Makefiles by a lot.
It also makes it possible to update the Makefile rules in future without
having to re-generate all the Makefiles.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The library Makefiles for share most of their code. The only difference is
the list of project dependencies.
Create a file that has the common parts and can be included
by the library Makefiles.
This drastically reduces the size of the library Makefiles and also allows
to change the Makefile implementation without having to re-generate all
Makefiles.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In cases when a shallow FIFO is requested the synthesizer infers distributed RAM
instead of block RAMs. This can be an issue when the clocks of the FIFO are
asynchronous since a timing path is created though the LUTs which implement the
memory, resulting in timing failures. Ignoring timing through the path is not a
solution since would lead to metastability.
This does not happens with block RAMs.
The solution is to use the ad_mem (block RAM) in case of async clocks and letting
the synthesizer do it's job in case of sync clocks for optimal resource utilization.
Add a parameter to the control the clock source option of the MMCM. If
the MMCM has only one clock source the CLKSEL pin will be tied to VDD.
The previous version added a redundant path between the CLKSEL port and
register map.
This module upscale an n*sample_width data bus into a 16 or 32*n data
bus. The samples are right aligned and supports offset binary or two's
complement data format.
The up_xfer_cntrl and up_xfer_status modules have its own constraints files
in library/xilinx/common. Each IP which has an instance of these
modules, have to use these constraints files.
The following IPs were modified:
- axi_adc_decimate
- axi_adc_trigger
- axi_dac_interpolate
- axi_logic_analyzer
A couple of new parameters and new ports are missing in several
up_[adc|dac]_[common|channel] instance, and generates warnings. The rule of
thumb is to use full instantiations, defining all the existing parameter and
ports of the module.
Fix all the instantiation of up_[adc|dac]_[common|channel], by defining all its
parameters and ports.
The up_rstn is driven by s_axi_resetn, which is generated by a
Processor System Reset module. (connected to port peripheral_aresetn)
Therefor using this reset signal as an asynchronous reset is redundant,
and a bad design practice at the same time. Asynchronous reset should be
used if it's inevitable.
Vivado sometimes generates semi-valid or invalid warnings and critical warnings.
In the past these messages were silenced, by changing its message severity.
These setups were scattered in multiple scripts. This commit is an attempt
to centralize it and make it more maintainable and easier to review it.
The cores that handle the JESD204 ADC cores do not feature IQ correction
logic. The Q_OR_I_N parameter for the channel modules is unused, so remove
it.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The cores that handle the JESD204 ADC converters do not feature any direct
IO and subsequently no IO-delay blocks either. Remove the unused
IO_DELAY_GROUP parameter.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Bundle the TLAST signal in with the other AXIS slave signals to enable
easier connection between AXIS devices that use TLAST
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com>
Add some limit TLAST support for the streaming AXI source interface. An
asserted TLAST signal marks the end of a packet and the following data beat
is the first beat for the next packet.
Currently the DMAC does not support for completing a transfer before all
requested bytes have been transferred. So the way this limited TLAST
support is implemented is by filling the remainder of the buffer with 0x00.
While the DMAC is busy filling the buffer with zeros back-pressure is
asserted on the external streaming AXI interface by keeping TREADY
de-asserted.
The end of a buffer is marked by a transfer that has the last bit set in
the FLAGS control register.
In the future we might add support for transfer completion before all
requested bytes have been transferred.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit ff50963c7f ("axi_ad9361- altera/xilinx reconcile- may be broken-
do not use") inverted the polarity of the TX feedback clock.
This exposed some issues in the existing drivers which can cause the
interface tuning to fail randomly under certain conditions.
To keep backwards compatibility with existing drivers restore the previous
behavior.
A separate fix will be applied to the drivers that resolves the issue that
has been exposed by the polarity inversion. So that interface calibration
works reliably under all conditions.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Assigning the value of a local parameter(localparam) to a parameter
will end up with a conflict(not highlighted by the tool). In this
case, the parameter type was defined as a string instead of an
integer. Furthermore, this scenario leads to an undesired choice
between primitive types.
The dma_last_beats is used by the Avalon Memory Mapped interface
controller, to define the last burst length.
Its value get stable after the last valid data of the DMA interface, and staying
stable until the positive edge of the DMA's xfer_req.
No need to condition the transfer of this register to avalon clock
domain.
The XFER_END state defines the end of a transaction, when the entire
data set is written or read to/from the DDRx memory.
A transaction can contain multiple Avalon bursts. Make sure that the FSM
goes back into staging phase at the end of each burst; also define a
signals which indicate the end of each burst for control.
The period_count should be updated once per clock cycle. This is not
enforced with the current implementation, which probably leads to
period_count being decremented on both m_axis_aclk edges.
A problem observed due to this is that the m_axis_tlast output is not
asserted or is asserted for a too short time for the consumer to
detect it.
Fix by letting the decrement (and thus the m_axis_tlast toggling)
happen only on the rising edge of the m_axis_aclk clock.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
The commit 6900c have added an additional register stage into the fifo read
data path, but the control signals (ready/valid/underflow) were not realigned
to the data. This can cause data lose or duplicated samples in some case.
Realign the control signals to the data.
The util_adxcvr supports GTX2, GTH3 and GTH4. The transceiver is selected
using the XCVR_TYPE parameter.
The axi_adxcvr on the other hand only has a configuration parameter to
indicate whether a GTX or GTH transceiver is used (GTH_OR_GTX_N). Since
there are some minor differences between GTH3 and GTH4 that software needs
to know about rename the GTH_OR_GTX_N to XCVR_TYPE and match use the same
semantics as util_adxcvr.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The dac_xfer_req should indicate one single thing, that the FIFO is in
read phase. Should not be affected by any signals, which indicates data
validity on any interface. (e.g. dac_valid)
This signal is not used by the device core, its main purpose is to
indicate the state of the interface for a posible intermediat processing
module.
Fix the reset of the dma_mem_waddr (write address register of the CDC
FIFO on DMA's clock domain). This solves the occasional invalid read backs after
multiple re-initialization of the PL_DDR_FIFO.
+ Build both the read and write logic around an FSM
+ Consistent naming of registers and wires
+ Add support for burst lenghts higher than one, current burst lenght
is 64
+ Fix all the bugs, and make it work (first bring up with
adrv9371x/a10soc)
If the ADI_HDL_DIR or ADI_PHDL_DIR are set on Windows platforms, an
invalid TCL character (e.g. backslash) may be used as a file separator,
causing issues with the build / library scripts.
Normalize the paths before using them as global TCL variables.
The first attempt (f3daf0) faild miserably. When the data_req signal
from the device had more than 1 cycle of deassert state, because of the
added latency of the data stream, the device got 'zeros' too.
In this fix, the DMA will hold the valid data on the bus, between two
consecutive data request. The bus is reseted just after all the data
were sent out.
The grey coder/decoder function was limited to 10 bits, and this
resulted an unwanted limitation of the FIFO size. Using this
module, the coder/decoder data width can be adjusted to the current
address width.
Reset the fifo_rd_data if the DMA does not have an active transfer.
Becasue all the DAC device cores are transfering the data from the FIFO
interface to the data interface without any validation signal, DMA needs to put
the data bus into a known state, to prevent the device core to send the
last known data again and again.
Add missing timing exceptions on paths between the DMA and DDR clock
domains. All these paths are properly synchronized using CDC in the HDL,
but are missing timing exceptions in the XDC file. This can lead to timing
errors when building a design using the axi_adc_fifo.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Export the reset signal for the link clock domain. This can be used by
external logic that is in the link clock domain to reset itself.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Enabling the phase alignment mode of the FPLL seems to break manual
re-calibration, which is required when changing the lane rates. The
calibration seems to select the wrong VCO frequency band and the PLL no
longer locks.
Disable phase alignment mode for now, this has a negative effects on
deterministic latency, but it is better than not working at all.
Waiting for feedback from Altera/Intel on how to make manual re-calibration
work in phase alignment mode.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
To be able to check the FPLL re-configuration arbitration status from
software enable the avmm_busy flag in the register map.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DEGLITCH state of the RX state machine is a workaround for misbehaving
PHYs. It is an internal state and an implementation detail and it does not
really make sense to report through the status interface.
Rework things so that DEGLITCH state is reported as part of the CGS state
on the external status interface.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The current layout of the debug ID register assumes that the ID_WIDTH is 3.
Change things so that the padding 0 width depends on the ID_WIDTH
parameter so that we end up with the same register layout regardless of the
value of ID_WIDTH.
Also split things into two registers, this allows for an ID_WIDTH up to 8
(which should hopefully be enough for all practical applications).
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Redesign the axi_dacfifo, to increase the supported datarates.
Major modifications:
+ The FIFO consist of two module: WRITE and READ. The axi_dacfifo_dac
was deprecated.
+ Both the AXI write and AXI read transaction are controlled by two
FSM, to increase redability of the code.
+ Support all the possible burst lengths [0..225], handles the last
fractional burst on both sides correctly.
+ Common reset architecture throughout the design, all the internal
registers and memories are reset on the posedge of dma_xfer_req
+ Delete all Altera related sources, for Altera projects
avl_dacfifo should be used.
WIP: foobar
[WIP]axi_dacfifo: Update
axi_dacfifo: Few minor updates, almost working state
Add a wrapper module for Altera/Intel platforms that instantiates and
connects all the components required to for a JESD204 link.
The following components are created:
* Transceiver for each lane
* Transceiver lane PLL (TX only)
* Transceiver reset controller
* Link PLL
* JESD204 link layer processing
* JESD204 link layer processing control interface
* axi_adxcvr link management peripheral
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add a wrapper that instantiates the Arria10 Native PHY and configures it
for JESD204 operation. The datapath width is set to 4 octets per beat.
The maximum lane rate that is achievable with hard-logic PCS included in
the PHY is below the requirements of the JESD204 for some of the PHY speed
grades. For projects that require a lane rate that is higher than what the
hard-logic PCS can support a soft-logic PCS module can be instantiated. The
external interface of the jesd204_phy is identical regardless of whether
soft- or hard-logic PCS is used.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add soft logic PCS that performs 8b10b encoding for TX and character
pattern alignment and 8b10b decoding for RX.
The modules are intended to be used in combination with a transceiver that
does not have these features implemented in hard logic PCS.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add Qsys IP scripts as well as SDC constraint files for the ADI JESD204
peripherals. This allows them to be instantiated and used on Altera/Intel
platforms.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The Xilinx tools are quite forgiving when it comes to required signals on
standard interfaces, which is why it was possible to define a AXI streaming
interface without the required valid signal.
The Altera tools are more strict and wont allow this. Add a dummy valid
signal to the TX data interface to make the tools happy. For now the signal
does not do anything, in the future it might be used to detect an underflow
condition on the data interface and report this through the status
interface.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the ILAS memory for the receive register map uses a shift
register with variable tap output for storing the ILAS information. This
maps very efficiently onto the primitives found in Xilinx FPGAs. But there
is no equivalent primitive in Altera FPAGs resulting in increased
utilization from having to implement the structure in pure logic.
Change the ILAS memory so it uses a simple dual port RAM for storing the
data. This has slightly increased utilization on Xilinx platforms (but
still good enough) and highly decreased utilization on Altera platforms.
One side effect of this change is that since the RAM output is synchronous
reading the ILAS memory registers will take one extra clock cycle.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add a set of helper functions for the CDC library that creates the correct
constraints for the CDC blocks. This makes it easier to specify the
constraints in the individual user's SDC files.
This only works for Altera where full scripting capabilities are available
in the SDC files.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Allow to specify additional properties when defining a IP parameter. The
properties take the form of a list of key value pairs. E.g.
ad_ip_parameter ... { \
DISPLAY_NAME "Name" \
DISPLAY_HINT "radio" \
}
This helps to reduce the amount of boilerplate when additional properties
need to be specified for a parameter.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
TCL files can be helpful to automate certain tasks like creating timing
constraints. Add handling for them to the ad_ip_addfile function.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the reset for the link clock domain is generated internally in
the axi_jesd204_{rx,tx} peripheral. The reset is controlled by through the
register map.
Add an additional external reset for link clock domain. The link clock
domain is kept in reset if either the internal reset or the external reset
is asserted.
This for example allows the fabric to keep the domain in reset if the clock
is not yet stable.
The status of the external reset can be queried from the register map.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
dma_raddr is only incremented if it is less than dma_waddr_rel_s.
dma_waddr_rel_s is always less or equal to adc_waddr_rel << RATIO and
adc_waddr_rel is less than DMA_ADDR_LIMIT >> RATIO.
By induction we can conclude that this means that dma_raddr will always be
less then DMA_ADDR_LIMIT and the check for this will always evaluate to
false can be removed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When the DMA clock to ADC data rate ratio exceeds a certain threshold it is
possible that an erroneous dma_waddr_rel toggle event is generated. This
causes the last address of the previous DMA transfer to be transferred to
the DMA domain. And the DMA side will start reading from the FIFO even
though data is not available yet.
This results in data corruption with the current transfer containing data
from the previous transfer.
The root cause here is that the toggle signal CDC synchronizer register are
reset in the DMA when a new transfer starts, but not in the ADC domain,
causing a potential mismatch and the incorrect toggle event. To fix this
remove the reset from the DMA side. This is OK since the registers are
self-resetting if the reset signal is asserted for more than 3 clock cycles.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
AND logic means that all enabled triggers need to evaluate to true, others
are don't care. Fix the logic to behave accordingly.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
At the moment the drain signal is always asserted when the controller is
enabled. This breaks backpressure and data is lost. The drain signal should
only be asserted when the controller gets disabled until the last beat of
the current DMA transfer.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The 'PRIMITIVE_SUBGROUP == flop' filter only works on 7-Series. Replace it
with 'IS_SEQUENTIAL' which works on both 7-Series and UltraScale.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In its default configuration the ram_2port module as a read latency of 2
clock cycles. Both the read address as well as the output data are
registered.
This is not the behavior that is expected from the alt_mem_asym module and
causes incorrect behavior and data corruption in the util_adc_fifo.
Disable the data output register to get a read latency of 1 clock cycle.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Name all CDC blocks following the patter i_cdc_${signal_name}. This makes
it clear what is going on.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use the CDC sync_bits helper to synchronize the asynchronous external SYNC~
signal into the link clock domain, rather than open-coding this operation.
This makes it more explicit what is going on.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Which events will be exposed as IRQs and at what level of granularity will
need some additional through. Remove the two existing IRQ events for now
again. This will be added back later.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The up_cfg_ilas_data signal is a two dimensional array. There are 4
register entries for each lane. Model it as such rather than compressing it
down to a one dimensional array. This makes accessing the individual
entries a bit more straight forward and the code clearer.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ilas_cfg_static.v is part of the jesd204_tx_static_config module.
Somehow a copy of that file made it into the jesd204_tx module where it is
completely unused. Remove the duplicated file.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
We know that NUM_OF_LANES will never exceed 255, but the tools don't know
and generate a warning about implicit signal truncation. Make the
truncation explicit to indicate that this is intentional.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
+ update to verilog-2001 coding standard
+ define RATIO outside the generate block
+ $clog2 macro is not supported by some tools, define function
locally
The CIC filter introduces different amplifications depending on the
decimation ratio. By adding a multiplier in the decimation chain
the amplification can be compensated
The ADI transport layer peripherals expect the first octet to be in the
LSBs and the last octet to be in the MSBs. The Altera JESD204 core orders
the octets the other way around though, first octet in the MSBs and last
octet in the LSBS.
Currently this is handled by having each transport layer peripheral swap
the octets around when it is connected to the Altera JESD204 core.
Change this so that rather than having to do the data swizzling in every in
every transport layer peripheral perform it at the input/output of the link
layer peripheral inside the generated block.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the TX lane mapping is implemented by having to connect tx_phy_s_* to
the tx_ip_s_* and the tx_phy_d_* to the tx_ip_d_* signals in the system
qsys file in the desired order.
Re-work things so that instead the lane mapping is provided through the
TX_LANE_MAP parameter. The parameter specifies in which order logical lanes
are mapped onto the physical lanes.
The appropriate connections are than made inside the core according to this
parameter rather than having to manually connect the signals externally.
In order to generate a 1-to-1 mapping the TX_LANE_MAP parameter can be left
empty.
This change slightly reduces the boiler-plate code that is necessary to
setup the transceiver.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The CIC filter introduces different amplifications depending on the
interpolation ratio. By adding a multiplier in the interpolation chain
the amplification can be compensated
+ Add a HDL parameter for the PPS receiver module :
PPS_RECEIVER_ENABLE. By default the module is disabled.
+ Add the CMOS_OR_LVDS_N and PPS_RECEIVER_ENABLE into the CONFIG
register
+ Define a pps_status read only register, which will be asserted, if the free
running counter reach a certain fixed threshold. (2^28) The register can
be deasserted by an incomming PPS only.
Make sure that the right hand side expression of assignments is not wider
than the target signal. This avoids warnings about implicit truncations.
None of these changes affect the behaviour, just fixes some warnings about
implicit signal truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The AXI specification that the minimum address space size is 4k, make sure
the axi_dmac adheres to this.
Internally the register space is still 2k. This means the upper and lower
2k of the axi4lite register space will map to the same internal registers.
Software must not rely on this and only access the lower 2k to enable
compatibility in case the internal space grows in the future.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Terminate the m_axi_list signal of the data mover instance in the
src_axi_stream module. This avoids a warning about the port being
unconnected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the axi_dmac_hw.tcl script does not create interfaces if they
are not used in the current configuration. This has the disadvantage that
the ports belonging to these interfaces are not included in the generated
HDL wrapper. Which will generate a fair bunch of warnings when synthesizing
the HDL.
Instead always generate all interfaces, but disable those that are not used
in the current configuration. This will make sure that the ports belonging
to these interfaces are properly tied-off in the generate wrapper HDL.
This reduces the amount of false positive warnings generated and makes it
easier to spot actual issues.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The DMAC currently doesn't support transfers where the length is not a
multiple of the bus width. When generating the wstrb signal we do pretend
though that we do and dynamically generate it based on the LSBs of the
transfer length.
Given that the other parts of the DMA don't support such transfers this is
unnecessary though. So remove it for now and replace it with a constant
expression where wstrb is always fully asserted.
The generated logic for the wstrb signal was quite terrible, so this
improves the timing of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the read side of the src_response interface is not used. This
leads to warnings about signals that have a value assigned but are never
read.
To avoid this just comment out all signals that are related to the
src_response interface for now.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the right hand side expression of assignments is not wider
than the target signal. This avoids warnings about implicit truncations.
None of these changes affect the behaviour, just fixes some warnings about
implicit signal truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The external s_axi_{awaddr,araddr} signals that are connect to the core
have their width set according to the specified size of the register map.
If the s_axi_{awaddr,araddr} signal of the core is wider (as it currently
is for many cores) the MSBs of those signals are left unconnected, which
generates a warning.
To avoid this make sure that the signal width matches the declared register
map size.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the util_upack_hw.tcl script does not create interfaces if they
are not used in the current configuration. This has the disadvantage that
the ports belonging to these interfaces are not included in the generated
HDL wrapper. Which will generate a fair bunch of warnings when synthesizing
the HDL.
Instead always generate all interfaces, but disable those that are not used
in the current configuration. This will make sure that the ports belonging
to these interfaces are properly tied-off in the generate wrapper HDL.
This reduces the amount of false positive warnings generated and makes it
easier to spot actual issues.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the util_cpack_hw.tcl script does not create interfaces if they
are not used in the current configuration. This has the disadvantage that
the ports belonging to these interfaces are not included in the generated
HDL wrapper. Which will generate a fair bunch of warnings when synthesizing
the HDL.
Instead always generate all interfaces, but disable those that are not used
in the current configuration. This will make sure that the ports belonging
to these interfaces are properly tied-off in the generate wrapper HDL.
This reduces the amount of false positive warnings generated and makes it
easier to spot actual issues.
While we are at it also use a loop to create the interfaces since they all
follow the same pattern.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the axi_ad9144_hw.tcl script does not create interfaces if they
are not used in the current configuration. This has the disadvantage that
the ports belonging to these interfaces are not included in the generated
HDL wrapper. Which will generate a fair bunch of warnings when synthesizing
the HDL.
Instead always generate all interfaces, but disable those that are not used
in the current configuration. This will make sure that the ports belonging
to these interfaces are properly tied-off in the generate wrapper HDL.
This reduces the amount of false positive warnings generated and makes it
easier to spot actual issues.
While we are at it also use a loop to create the interfaces since they all
follow the same pattern.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The width of a ternary operator expression is the width of the wider of the
two selectable expression. This means the right side expression of the
tx_data assigment is always 256 bits. This generates an implicit truncating
warning if the tx_data signal itself is only 128 bits.
To avoid this slightly reformulate the expression to yield the correct
width depending on the configuration.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The width of a ternary operator expression is the width of the wider of the
two selectable expression. This means depending on the selected DMA_RATIO
the right side expression of the dma_waddr_rel_s assignment can be up to
three bits wider than the dma_waddr_rel_s signal. This generates an
implicit truncation warning.
Slightly reformulate the expression without the use of the ternary operator
to avoid this warning.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The signal is called adc_clk and not adc_clock. None of the designs is
currently using the signal, so this hasn't been an issue other that it
generates a warning.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Comments starting with the word altera are interpreted by the Altera tools
to be synthesis attribute assignments. In this case this is just a generic
comment though which results in a warning that the synthesis attribute is
unknown.
Slightly reword the comment to avoid this. This is not pretty, but better
than having the false positive warning show up in the log.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The PLL frequency must be half of the lane rate and the core clock rate
must be lane rate divided by 40. There is no other option, otherwise things
wont work.
Instead of having to manually specify PLL and core clock frequency derive
them in the transceiver script. This reduces the risk of accidental
misconfiguration.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The clock bridge expects the clock rate to be specified in Hz, but
$m_coreclk_frequency is in MHz. Do the appropriate conversion.
Nothing seems to rely on the clock bridge reporting the correct frequency
at the moment, so this is only a cosmetic change.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ad_pps_receiver is instantiated at the top of core.
The rcounter is placed into adc/dac_common registers space, at the
address 0x30 (word aligned).
The interrupt mask is placed into adc/dac_common, at the address 0x04
(word aligned). Because the core has an instance of both modules, the
interrupt masks are OR-ed together.
Add a module to receive 1PPS signal from a GPS module. The module has a
free running counter, which runs on the device's interface clock. The
counter value is latched into a register each time when a 1PPS arrives.
An interrupt signal is also generated in every 1PPS.
Add a check to RX register map to confirm that the ILAS memory registers
return the correct values after the ILAS data has been received.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The MSB of the d_count signal is used as a overflow marker to stop the
counter from incrementing in the monitored clock domain. It is not exported
through the register map and truncated when assigned to the up_d_count
signal.
Make the truncation explicit to make it clear that this is not a mistake
and to avoid warnings about implicit truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The generic Altera clock monitor constraints expect the instance to be
called i_clock_mon. Adjust the code accordingly.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In this particular case the behaviour is the same with non-blocking and
blocking assignments, but that could change if the code is modified in the
future. To avoid any potentially issue due to this consistently use
non-blocking assignments.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dmac can issue up to FIFO_SIZE read and write requests in parallel.
This is done in order to maximize throughput and compensate for for
latency.
Set the {read,write}IssuingCapability properties accordingly on the AXI
master interfaces. Otherwise qsys might decide to insert bridges that
artificially limit the number of requests, which in turn might affect
performance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Qsys allows to query to query the clock domain that is associated with a
clock input of a peripheral. This allows to automatically detect whether
the different clocks of the DMAC are asynchronous and CDC logic needs to be
inserted or not.
Auto-detection has the advantages that the configuration parameters don't
need to be set manually and the optional configuration will be choose
automatically. There is also less chance of error of leaving the settings
in a wrong configuration when e.g. the clock domains change.
In case the auto-detection should ever fail configuration options that
provide a manual overwrite are added as well.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Group configuration parameters by function, provide human readable labels
as well as specify the allowed ranges for each parameter.
This prevents accidental misconfiguration and also makes it easier to
inspect (or change) the configuration in the Qsys GUI.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In this particular case the behaviour is the same with non-blocking and
blocking assignments, but that could change if the code is modified in the
future. To avoid any potentially issue due to this consistently use
non-blocking assignments.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use the ad_ip_intf_s_axi helper function to create the axi4lite slave
interface for memory mapped peripherals. This slightly reduces the amount
of boilerplate code in the peripheral's *hw.tcl
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The address width of the AXI interface depends on the size of the register
and can differ from peripheral to peripheral. Add a parameter to the
function that allows to specify the address width, this allows to use the
function for more peripherals.
Keep the current value of 16 bits as the default if the parameter is not
specified.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_adxcvr register map only uses a single 4k page, make this explicit.
This will allow for tighter packaging in the limited available total
register map space.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This partially reverts commit a8ade15173.
Remove the nonsensical Makefile dependencies that got added by accident.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The MSB of the d_count signal is used as a overflow marker to stop the
counter from incrementing in the monitored clock domain. It is not exported
through the register map and truncated when assigned to the up_d_count
signal.
Make the truncation explicit to make it clear that this is not a mistake
and to avoid warnings about implicit truncation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The generic Altera clock monitor constraints expect the instance to be
called i_clock_mon. Adjust the code accordingly.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In this particular case the behaviour is the same with non-blocking and
blocking assignments, but that could change if the code is modified in the
future. To avoid any potentially issue due to this consistently use
non-blocking assignments.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dmac can issue up to FIFO_SIZE read and write requests in parallel.
This is done in order to maximize throughput and compensate for for
latency.
Set the {read,write}IssuingCapability properties accordingly on the AXI
master interfaces. Otherwise qsys might decide to insert bridges that
artificially limit the number of requests, which in turn might affect
performance.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The SYNC signal that gets reported through the status interface should be
the output (second stage) of the synchronizer circuit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure the core_cfg_transfer_en signal is declared before they are used.
Strictly speaking the current code is correct and synthesis correctly, but
declaring the signals make the intentions of the code more explicit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Be more standard compliant and assign names to generate for-blocks. This is
required for Altera/Intel support.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure the req_gen_valid and req_gen_ready signals are declared before
they are used. Strictly speaking the current code is correct and synthesis
correctly, but declaring the signals make the intentions of the code more
explicit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
In some cases, the 'core_ilas_config_data' registers will be infered as
FDRE, instead of FDSE. Therefor a max delay definition, which are using
the S pin as its endpoint, it can become invalid, nonexistent.
Generalize the path, using the register itself as endpoint.
Increase the width of wvalid_counter, should be equal with awlen width.
The wvalid_counter needs to count from zero to the required burst
length. The maximum burst length is 255, so the width of the counter
have to be 8 bits. axi_last_beats will get the last axi burst length.
The fifo will ask for a new data from the DDR, if the current
level is lower than the high threshold. This will prevent overflow.
By deleting the lower threshold, we can avoid ocassional underflows,
when the DAC rate is closer to the max DDRx rate.
All verilog file are using the Verilog-2001 standard to define
and/or declare ports. Definin a port width with a local parameter
is a bad practive, when this standard is used. Some simulators
will crash. Try to avoid it.
Fix the dma_ready mux in top module, and the dma_ready_out reset
logic in axi_dacfifo_wr module. Also, both write and read addresses
of the async CDC fifo (inside the axi_dacfifo_wr) should be reset
before a dma transaction starts.
If the streaming bit is set, after the trigger condition is met
data will be continuosly captured by the DMA. The streaming bit
must be set to 0 to reset triggering.
If the streaming bit is set, after the trigger condition is met,
data will be continuosly captured by the DMA. The streaming bit
must be set to 0 to reset triggering.
In non-streaming mode we want direction changes to be applied immediately.
The current code has a typo and checks the wrong signal. overwrite_data
holds the configured output value of the pin, whereas overwrite_enable
configures whether the pin is in streaming or manual mode.
For correct operation the later signal should be used to decide whether a
direction change should be applied. Otherwise the direction change will
only be applied if the output value of the pin is set to logic high.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
When using non-broadcast access to the GT DRP registers lane filtering is
done on both sides. The ready and data signals are filtered in the in the
axi_adxcvr module and the enable signal is filtered in the util_adxcvr
module. This works fine as long as both sides use the same transceiver IDs.
E.g. channel 0 of the axi_adxcvr module is connected to channel 0 of the
util_adxcvr module.
But this is not always the case. E.g. on the ADRV9371 platform there are
two RX axi_adxcvr modules (RX and RX_OS) connected to the same util_adxcvr.
The first axi_adxcvr uses lane 0 and 1 of the util_adxcvr, the second uses
lane 2 and 3.
Non-broadcast access for the first RX axi_adxcvr module works fine, but
always generates a timeout for the second axi_adxcvr module. This is
because lane 0/1 of the axi_adxcvr module is connected to lane 2/3 of the
util_adxcvr and when ID based filtering is done both can't match at the
same time.
To avoid this perform the filtering for all the signals in the axi_adxcvr
module. This makes sure that the same base ID is used.
This also removes the sel signal from the transceiver interfaces since it
is no longer used on the util_adxcvr side.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Always explicitly specify the signal width for constants to avoid warnings
about signal width mismatch.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The buffer delay should be 0 in the default configuration. The current
value of 0xb must have slipped in by accident.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Use a single standalone counter that counts the number of beats since the
release of the SYNC~ signal, rather than re-using the LMFC counter plus a
dedicated multi-frame counter.
This is slightly simpler in terms of logic and also easier for software to
interpret the data.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
There are currently two sysref related events. One the sysref captured
event which is generated when an external sysref edge has been observed.
The other is the sysref alignment error event which is generated when a
sysref edge is observed that has a different alignment from previously
observed sysref edges.
Capture those events in the register map. This is useful for error
diagnostic. The events are sticky and write-1-to-clear.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The internal LMFC offset signals are in beats, whereas the register map is
in octets. Add the proper alignment padding to the register map to
translate between the two.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For SYSREF handling there are now three possible modes.
1) Disabled. In this mode the LMFC is generated internally and all external
SYSREF edges are ignored. This mode should be used for subclass 0 when no
external sysref is available.
2) Continuous SYSREF. An external SYSREF signal is required and the LMFC is
aligned to the SYSREF signal. The SYSREF signal is continuously monitored
and if a edge unaligned to the previous edges is detected the LMFC is
re-aligned to the new edge.
3) Oneshot SYSREF. Oneshot SYSREF mode is similar to continuous SYSREF mode
except only the first edge is captured and all further edges are ignored,
re-alignment will not happen.
Both in continuous and oneshot signal at least one external sysref edge is
required before an LMFC is generated. All events that require an LMFC will
be delayed until a SYSREF edge has been captured. This is done to avoid
accidental re-alignment.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
If the output pin is not defined as a clock, some of the Vivado IPI
propagation TCL will error out.
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com>
By adding support for partial avalon transfers (data width < bus width),
valid data set size (DMA transfer length) will be dependent on the DMA bus
width only.
+ avl_write_transfer_done_s is a redundant net
+ specify the net state explicitly on if statements
+ to define the edge of avl_mem_fetch_wr_address signal,
its register and its second sync register should be used
The ad_mem_asym memory read interface has a 3 clock cycle delay, from the
moment of the address change until a valid data arrives on the bus;
because the dac_xfer_out is going to validate the outgoing samples (in conjunction
with the DAC VALID, which is free a running signal), this module will compensate
this delay, to prevent duplicated samples in the beginning of the
transaction.
+ all net names should have a *_s postfix
+ avl_burstcount is a constant 1, no need for an additional
register for it
+ all CDC should have two synchronization register, add
avl_last_beat_req_m2
The "'b0" constant will be translate as a 32 bit width vector by
ModelSim, and will throw a buswidth mismatch error. Tie the data_b
bus to zero, using its width parameter.
Currently the scripts use 'analog.com' as the vendor property for IP cores,
but 'ADI' for interfaces.
Make things consistent by using 'analog.com' for both interfaces as well
as IP cores.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the XML files are re-build when any of the scripts that are
used to generated it are modified.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the rules to generate the XML files are the same. Reduce the number of
rules by useing wildcard matching for the rule target.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The ADI JESD204 link layer cores are a implementation of the JESD204 link
layer. They are responsible for handling the control signals (like SYNC and
SYSREF) and controlling the link state machine as well as performing
per-lane (de-)scrambling and character replacement.
Architecturally the cores are separated into two components.
1) Protocol processing cores (jesd204_rx, jesd204_tx). These cores take
care of the JESD204 protocol handling. They have configuration and status
ports that allows to configure their behaviour and monitor the current
state. The processing cores run entirely in the lane_rate/40 clock domain.
They have a upstream and a downstream port that accept and generate raw PHY
level data and transport level payload data (which is which depends on the
direction of the core).
2) Configuration interface cores (axi_jesd204_rx, axi_jesd204_tx). The
configuration interface cores provide a register map interface that allow
access to the to the configuration and status interfaces of the processing
cores. The configuration cores are responsible for implementing the clock
domain crossing between the lane_rate/40 and register map clock domain.
These new cores are compatible to all ADI converter products using the
JESD204 interface.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The sync_data module can be used to continuously transfer multi-bit signals
like status signals safely from the source to the destination clock
domain. A transfer takes 2 source and 2 destination clock cycles. It is not
guaranteed that all transitions on the source side will be visible on the
target side if the signal is changing faster than this. Logic using this
block should be aware of it. The primary intention is for it to be used for
slowly changing status signals.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The event synchronizer can be used to safely transfer 1-bit 1-clock cycle
event signals from one clock domain to another.
For each event recorded in the source domain it is guaranteed that a event
will be generated in the target domain at a later point in time. It is
possible though that multiple events in the source domain will be coalesced
into a single event in the target domain if events are generated faster
than they can be transferred.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the CDC helper modules to a dedicated helper modules. This makes it
possible to reference them without having to use file paths that go outside
of the referencing project's directory.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the name of the newly created IP core is automatically inferred
from the top-level module. This works fine if there is only one top-level
IP. But for an IP core that is a collection of helper modules this fails.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the polarity of the reset signal is always set to negative.
Change this so that the polarity is selected on the suffix of the name. If
it ends with a 'n' or 'N' the polarity will be negative, otherwise it will
be positive.
This allows this function to be used with reset signals that have positive
polarity.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
This patch adds a helper function that allows to create multiple ports for
a single set of underlying signals. This is useful when the number of ports
is a configuration parameter. It sort of allows the emulation of port
arrays without having to have on set of input/output signals for each port,
instead the signals are shared by all ports.
The following snippet illustrates how this can for example be used to
generate multiple AXI-Streaming ports from a single set of signals.
<verilog>
module #(
parameter NUM_PORTS = 2
) (
input [NUM_PORTS*32-1:0] data,
input [NUM_PORTS-1:0] valid,
output [NUM_PORTS-1:0] ready,
);
...
endmodule
</verilog>
<tcl>
adi_add_multi_bus 8 "data" "slave" \
"xilinx.com:interface:axis_rtl:1.0" \
"xilinx.com:interface:axis:1.0" \
[list \
{ "data" "TDATA" 32} \
{ "valid" "TVALID" 1} \
{ "ready" "TREADY" 1} \
] \
"NUM_PORTS > {i})"
</tcl>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Commit 2f023437b4 ("adi_ip- remove adi_ip_constraints") changed the
default processing order of IP core constraint files from late to normal.
This is problematic because some IP core constraint files try to access
clocks that are that are generated by different files with the normal
processing order level. These clock may or may not be available to the IP
core constraint file depending on the (random) order in which the files
were processed.
To avoid this issue change the default processing order back to late.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The clock monitor reports the ratio of the clock frequencies of a known
reference clock and a monitored unknown clock. The frequency ratio is
reported in a 16.16 fixed-point format.
This means that it is possible to detect clocks that are 65535 times faster
than the reference clock. For a reference clock of 100 MHz that is 6.5 THz
and even if the reference clock is running at only 1 MHz it is still 65
GHz, a clock rate much faster than what we'd ever expect in a FPGA.
Add a configuration option to the clock monitor that allows to reduce the
number of integer bits of ratio. This allows to reduce the utilization
while still being able to cover all realistic clock frequencies.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently when the monitored clock stops the clock monitor retains the old
frequency ratio value and there is no way to detect that the clock has
stopped and the reported value is indistinguishable form a clock still
running at the right rate.
If a full iteration as elapsed on the monitoring side and there is no
indication that the counter on the monitored side has started running set
the reported clock ratio value to 0 to indicate that the clock has stopped.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the clock monitor features a hold register in the monitored clock
domain. This old register is used to store a instantaneous copy of the
counter register. The value in the old register is then transferred to the
monitoring domain. Since the counter is continuously counting it is not
possible to directly transfer it since that might result in inconsistent
data.
Instead stop the counter and hold the registers stable for a duration that
is long enough for the monitoring domain to correctly capture the value.
Once the value has been transferred the counter is reset and restarted for
the next iteration.
This allows to eliminate the hold register, which slightly reduces
utilization.
The externally visible behaviour is identical before and after the patch.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Make sure that the XML files are re-build when any of the scripts that are
used to generated it are modified.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the rules to generate the XML files are the same. Reduce the number of
rules by useing wildcard matching for the rule target.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Sort the entries in the library Makefile alphabetical. Keeping it ordered
makes it easier to track changes compared to randomly reshuffling it
every time a new entry is added.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
All the hdl (verilog and vhdl) source files were updated. If a file did not
have any license, it was added into it. Files, which were generated by
a tool (like Matlab) or were took over from other source (like opencores.org),
were unchanged.
New license looks as follows:
Copyright 2014 - 2017 (c) Analog Devices, Inc. All rights reserved.
Each core or library found in this collection may have its own licensing terms.
The user should keep this in in mind while exploring these cores.
Redistribution and use in source and binary forms,
with or without modification of this file, are permitted under the terms of either
(at the option of the user):
1. The GNU General Public License version 2 as published by the
Free Software Foundation, which can be found in the top level directory, or at:
https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
OR
2. An ADI specific BSD license as noted in the top level directory, or on-line at:
https://github.com/analogdevicesinc/hdl/blob/dev/LICENSE
There are devices which have a asynchronous data ready signal. (asynchronous
with the spi clock) The CDC stages can be enabled by setting up
the ASYNC_TRIG parameter.
In case of high precision devices with just a simple SPI interface
for control and data, the effective data rate can be significatly
lower than the SPI clock, and more importantly there isn't any relation
between the two clock domain.
The rate is defined by a SOT (start of transfer) generator, which
initiates a SPI transfer. Taking the fact that the generator runs
on system clock (100 MHz), and the device can require smaller rate (in kHz domain),
the 7 bit dac_datarate register is just too small.
Therefor increasing to 16 bit.
This core can be used in conjunction with the SPI_ENGINE, will work
as an offload module, forwarding a data stream to the SPI excecution,
received from a DMA.
Calculate the output clock frequencies based on the input clock frequencies
and the default divider settings and configure the output clock pins
accordingly. This allows connected peripherals to infer the frequency of
the clock.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Instead of having to manually specify the input clock period infer the
values from the block design. This means that less configuration parameters
need to be changed if the clock input frequency changes.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Add interface definition for the input and output clocks. This will allow
the tools to recognize them as clocks and enable things like clock
frequency propagation.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The secondary clock inputs and outputs of the axi_clkgen are rarely used.
Add enable parameters that need to be explicitly set before they are
available. This allows to hide the secondary clock pins when they are not
used in the block design.
There are currently no projects which use the secondary clock inputs or
outputs so there is no need to set these new parameters anywhere.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Vivado infers the type of floating point type parameters as integer if the
value can be expressed as an integer (i.e. decimal places are 0). To
correctly infer them as floating point parameters add types to the
parameter declaration.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Can not be multiple 'if' statements inside a generate block. If there are
multiple cases use if/esle statement, but always should be one single
if/else inside a generate.
When a mapping has multiple address segments we need to consider all of
them to calculate the required address width.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The address width needs to be large enough to be able to address the
largest possible address. This means the in addition to the address segment
range the specified offset also needs to be considered to calculate the
address width.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
up_rdata is qualified by the up_rack signal. There is no need to reset it
since by the time the signal is read the reset value has already been
overwritten anyway.
Also gate the up_rdata registers if no read operation is in progress. In
this case any changes would be ignored anyway.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_adc_trigger does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_adc_decimate does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_dac_interpolate does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The axi_logic_analyzer does not use the full width of the AXI interface
address. It only responds to register access in the first 32 registers.
Reduce the size of the AXI address to 7 bit accordingly. This allows the
scripts to correctly infer the internal register map size which will cause
the interconnect to filter out access to these unused register.
This slightly reduces utilization by getting rid of some pipeline
registers.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The AXI DMAC peripheral only uses 11-bit of the register map interface
address. Reducing the signal width to this value allows the scripts to
correctly infer the size of the register map.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Not all peripherals need the full address space. To be able to infer the
size of the address space of a peripheral allow the size of the AXI address
signals to be configurable rather than hardcoding its width to 32 bit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the register map range of a peripheral is hardcoded to 64k. Not
all peripherals need that much space though and reducing the size of the
address can reduce the amount of logic required, both in the interconnect
as well as in the peripheral.
Let adi_ip_properties() infer the size of the register map from the number
of bits of the address when creating the register map.
For backwards compatibility limit the register map size to 64k since
currently peripherals have a address width of 32 bits, event if they use
less.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the AXI address width of the DMA is always 32-bit. But not all
address spaces are so large that they require 32-bit to address all memory.
Extract the size of the address space that the DMA is connected too and
configure reduce the address size to the minimum required to address the
full address space.
This slightly reduces utilization.
If no mapped address space can be found the default of 32 bits is used for
the address.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The delay_clk is only used internally when the IODELAYs are enabled. This
means the port has no function when the IODELAYs are disabled so hide the
port in that case.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Typically when a port has a enablement dependency it also should have a
tie-off value to the port is connected to when disabled.
Make it possible to specify this tie-off value when calling
adi_set_ports_dependency().
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The output data mux is used to bypass the filter when it is not used. Which
setting is used for the mux depends on the 3-bit filter_mask signal.
Registering the control logic into a single bit signal reduces the amount
of routing resources required. Since changing the filter_mask settings is
asynchronous to the processing anyway the extra clock cycle delay
introduced by this change does not affect behaviour.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the processing pipeline of the axi_adc_decimate core to its own
sub-module. This makes it easier to simulate the processing independent of
the register map.
Also since the filter is two instances of the same logic, one for each
channel, let the new sub-module model one channel and instantiate it twice.
This allows to change the implementation without having to change the same
code twice.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The output data of the decimation block is 16-bit signed. Properly sign
extend the 12-bit input signal when the filter is bypassed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The minimum number of bits required for the adders in a CIC filter depends
on the decimation rate. Higher decimation factors require more bits. This
means for a multirate filter the size of the logic structures is determined
by the highest supported rate.
The current implementation of the filter always uses all bits of the
structure to compute the results, that means even when running with the
lowest decimation factor all the bits that are required for the highest
decimation factor are used. This will work fine as additional bits do not
affect the output of the filter.
This patch implements dynamic partial gating of the filter structure based
on the selected decimation factor. Bits that are not required for a certain
rates are gated and the carry bits are masked from propagating through the
adder chain. This results in significant power savings at smaller
decimation factors.
This means that the filter itself is now using more power the higher the
decimation rate. But this is offset by the reduced data output rate running
subsequent processing stages at a lower rate and reducing power consumption
there. This results in a more or less flat power profile regardless of
decimation factor.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Allow to split a CIC int or comb block into multiple stages and be able to
dynamically gate some of the stages. Also prevent carry propagation in
gated stages to keep the adder output constant.
This is useful for multi-rate filter where not all bits are needed all the
time.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The minimum decimation rate of the CIC block is five, this means data
arrives at the FIR filter at most every five clock cycles. The decimation
rate of the filter is two so the filter produces an output at most every
ten clock cycles. This allows for ten clock cycles to compute the result.
The current implementation of the filter uses a fully pipelined
architecture with one multiplier for each coefficient. Which then do work
for one clock cycle and sit idle for the next nine clock cycles.
Rework the filter to be sequential reducing the number of required
multipliers to one. In addition exploit the symmetric structure of the
filter to make use of the preadder reducing the required multiply
operations by two.
This significantly reduces the logic utilization of the filter as well as
moderately reduces power consumption.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The minimum decimation of the CIC block is 5. This means new data arrives
at the comb stages at most every 5 clock cycles. Rather than letting the
logic sit idle during those 4 extra cycles use it to sequentially process
the comb stages of the filter. This reduces the logic utilization of the
filter by quite a bit.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The output data mux is used to bypass the filter when it is not used. Which
setting is used for the mux depends on the 3-bit filter_mask signal.
Registering the control logic into a single bit signal reduces the amount
of routing resources required. Since changing the filter_mask settings is
asynchronous to the processing anyway the extra clock cycle delay
introduced by this change does not affect behaviour.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Re-implement the CIC using the basic building blocks from the util_cic
library.
This new implementation is structurally equivalent to the previous version,
but will be used as a platform for implementing changes that will improve
area and power consumtion of the filter
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Move the processing pipeline of the axi_adc_decimate core to its own
sub-module. This makes it easier to simulate the processing independent of
the register map.
The debug registers are useful during development but are rarely used in a
production design. Add a option that allows to disable them, this reduces
the resource utilization of the DMAC.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the BRAM and data registers in the util_axis_data are ungated
when the FIFO is ready to receive data. This good for high-performance
since it reduces the number of control signals. But it is bad from a power
point of view since it causes additional reads and writes.
Change the core gate the BRAM and data register if either the consumer is
not ready to accept data or the producer has no data to offer.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Currently the IDDRs are configured in SAME_EDGE_PIPELINED mode, but then
the negative data is delayed by an additional clock cycle. This is the same
behaviour as using the IDDR in SAME_EDGE mode.
Switching to SAME_EDGE mode removes extra pipelining registers while
maintaining the same behaviour.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The current implementation doesn't quite work right when the interface
clock is slower than the trigger clock and also causes timing issues.
Disable it temporarily until a proper CDC transfer is implemented.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The read and write interfaces of a AXI bus are independent other than that
they use the same clock. Yet when connecting a single read-only and a
single write-only interface to a Xilinx AXI interconnect it instantiates
arbitration logic between the two interfaces. This is dead logic and
unnecessarily utilizes the FPGAs resources.
Introduce a new helper module that takes a read-only and a write-only AXI
interface and combines them into a single read-write interface. The only
restriction here is that all three interfaces need to use the same clock.
This module is useful for systems which feature a read DMA and a write DMA.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
The register read logic is not that complicated that it needs two extra
pipeline stages. It can easily be condensed into a single combinatorial and
still meet timing with large margins.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Disable registers in the register map which are not needed for this core.
This reduces the utilization of the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Not all peripherals use the GPIO register settings, but the registers still
take up a fair amount of space in the register map. Add options to allow to
disable them when not needed. This helps to reduce the utilization for
peripherals where these features are not needed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Not all peripherals use the GPIO and START_CODE register settings, but the
registers still take up a fair amount of space in the register map. Add
options to allow to disable them when not needed. This helps to reduce the
utilization for peripherals where these features are not needed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Depending on whether the core is configured for AXI4 or AXI3 mode the width
of the awlen/arlen signal is either 8 or 4 bit. At the moment this is only
considered in top-level module and all other modules use 8 bit internally.
This causes warnings about truncated signals in AXI3 mode, to resolve this
forward the width of the signal through the core.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Declaring local parameters in the module parameter list is not valid
verilog. For some reasons Vivado accepts it nevertheless so the code has
worked so far. But this is not true for other tools, so move the local
parameter definitions inside the module body.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
For experimentation, to solve a constraint scoping issue, split up the
ad_axi_ip_constraint file into separate constraints file, in function
of there parent module.
It seems that in the latest version a constant of "0" is no longer a valid
enablement dependency and "false" has be used instead.
Not setting the enablement dependency correctly results in the AXI port to
be assumed to be read-write rather than just read or write. This will
generate unnecessary logic for example in interconnects to which the DMA
controller is connected.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>