SPI Engine: Add execution delay documentation (#1230)

* SPI Engine: Add execution delay documentation

Add documentation for the different delays on the command path and
data path, including communication between submodules and instruction
execution overhead.

Signed-off-by: Laez Barbosa <laez.barbosa@analog.com>
main
LBFFilho 2023-11-28 18:08:51 -03:00 committed by GitHub
parent 6a6e1f97f9
commit 0f87d845d3
3 changed files with 154 additions and 8 deletions


@ -14,8 +14,10 @@ SPI Engine
Offload Control Interface<offload-control-interface>
SPI Bus Interface<spi-bus-interface>
Instruction Set Specification<instruction-format>
Pipeline Delays<pipeline-delays>
Tutorial - PulSAR ADC<tutorial>
SPI Engine is a highly flexible and powerful SPI controller framework.
It consists of multiple sub-modules which communicate over well-defined
interfaces.
@ -65,7 +67,7 @@ Related IP Cores
--------------------------------------------------------------------------------
This list contains cores that are not part of the core SPI engine framework but
make use of its interfaces and are intend to be used together with the SPI engine
make use of its interfaces and are intended to be used together with the SPI engine
framework.
* :dokuwiki:`util-sigma-delta-spi <resources/fpga/peripherals/util_sigma_delta_spi>`:
@ -93,4 +95,5 @@ Additional Resources
--------------------------------------------------------------------------------
* :download:`Presentation: SPI Engine Design Philosophy <https://wiki.analog.com/_media/resources/fpga/peripherals/spi-engine3.pdf>`.
* :ref:`spi_engine pipeline-delays`
* :ref:`spi_engine tutorial`.


@ -62,15 +62,17 @@ SPI Engine execution module.
Before and after the update is performed the execution module is paused for the
specified delay. The length of the delay depends on the module clock frequency,
the setting of the prescaler register and the t parameter of the instruction.
This delay is inserted before and after the update of the chip-select signal,
so the total execution time of the chip-select
instruction is twice the delay, plus a fixed 2 clock cycles (fast clock, not prescaled)
for the internal logic.
the setting of the prescaler register and the parameter :math:`t` of the
instruction. This delay is inserted before and after the update of the
chip-select signal, so the total execution time of the chip-select instruction
is twice the delay, plus a fixed 2 clock cycles (fast clock, not
prescaled) that are added before the update for the internal logic.
.. math::
delay = t * \frac{(div + 1)*2}{f_{clk}}
delay_{before} = \frac{2 + t * (div + 1)*2}{f_{clk}}
delay_{after} = t * \frac{(div + 1)*2}{f_{clk}}
.. list-table::
:widths: 10 15 75
@ -128,7 +130,8 @@ Synchronize Instruction
The synchronize instruction generates a synchronization event on the SYNC output
stream. This can be used to monitor the progress of the command stream. The
synchronize instruction is also used by the :ref:`spi_engine interconnect`
module to identify the end of a transaction and re-start the arbitration process.
module to identify the end of a transaction and re-start the arbitration
process.
.. list-table::
:widths: 10 15 75


@ -0,0 +1,140 @@
.. _spi_engine pipeline-delays:
SPI Engine Pipeline Delays
================================================================================
The SPI Engine implementation imposes certain constraints on the timing of
different commands. Each instruction requires some number of cycles to execute,
which may depend on the instruction parameters. Additionally, there are delays
associated with the internal architecture of the SPI Engine, which become
relevant when the Offload functionality is not used.
.. _instruction_execution_times:
Instruction Execution
--------------------------------------------------------------------------------
Every instruction requires a minimum of 1 cycle for communication between the
offload module and the execution module. Additionally, the Chip Select, Sleep,
Transfer and Sync instructions require another cycle for checking the idle
condition, for a total fixed delay of 2 cycles.
The exact values, counted from the execution module, are:
.. list-table::
:widths: 10 80
:header-rows: 1
* - Instruction
- Cycles
* - Configuration Write
- 1 cycle.
* - Sync
- 2 cycles.
* - Chip-select
- :math:`2 + 2*t*((div+1)*2)` cycles, where :math:`t` is the chip-select
delay parameter of the instruction and :math:`div` is the prescaler register
value. The CS value change happens after the first
:math:`2 + t*((div+1)*2)` cycles.
* - Sleep
- :math:`2 + t*((div+1)*2)` cycles, where :math:`t` is the sleep delay
parameter of the instruction and :math:`div` is the prescaler register value.
* - Transfer
- 2 cycles, plus the transfer time.
Because these values are counted from the execution module, they can be used
directly to calculate the delays in the offload case (simply add up the
execution time of each instruction). For other cases, the detailed delays of the
architecture are needed.
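As an illustration of the offload-case calculation described above, the
following Python sketch adds up the per-instruction cycle counts from the table.
The function names, the example sequence and the 64-cycle transfer time are
assumptions made for illustration only; they are not taken from the SPI Engine
sources.

```python
def prescaled_delay(t, div):
    """Cycles of one programmed delay: t * ((div + 1) * 2)."""
    return t * ((div + 1) * 2)


def instruction_cycles(kind, t=0, div=0, transfer_cycles=0):
    """Execution-module cycles for one instruction, following the table above."""
    if kind == "config_write":
        return 1
    if kind == "sync":
        return 2
    if kind == "chip_select":
        # The delay is applied before and after the CS update.
        return 2 + 2 * prescaled_delay(t, div)
    if kind == "sleep":
        return 2 + prescaled_delay(t, div)
    if kind == "transfer":
        # transfer_cycles is supplied by the caller (assumption: known elsewhere).
        return 2 + transfer_cycles
    raise ValueError(f"unknown instruction kind: {kind}")


def sequence_cycles(sequence, div=0):
    """Total cycles for an offload sequence: the sum of its instruction times."""
    return sum(instruction_cycles(div=div, **instr) for instr in sequence)


# Hypothetical offload sequence: assert CS, transfer, deassert CS, sync.
sequence = [
    {"kind": "chip_select", "t": 1},
    {"kind": "transfer", "transfer_cycles": 64},  # assumed transfer time
    {"kind": "chip_select", "t": 1},
    {"kind": "sync"},
]
total = sequence_cycles(sequence, div=1)
print(total)
```

Dividing the resulting cycle count by the SPI Engine clock frequency gives the
execution time in seconds.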
.. _detailed_delays:
Detailed Delays
--------------------------------------------------------------------------------
This section lists the delays inside the SPI Engine architecture. To make use of
this information, one needs some degree of familiarity with the HDL
implementation (knowledge of the sub-modules and the way they communicate).
See also: :ref:`spi_engine control-interface`,
:ref:`spi_engine offload-control-interface`.
Offload Module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All the delays measured for this module are in terms of SPI Engine clock cycles.
* Trigger input to command valid output: 1 cycle, plus 1-2 cycles from a 2FF CDC
(0 if not asynchronous).
* Trigger in to sdo_data_valid: 1 cycle, plus 1-2 cycles from a 2FF CDC (0 if
not asynchronous).
* Maximum command throughput: 1 command per cycle.
* sdi_data_valid to offload_sdi_valid: 0 cycles.
Interconnect Module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All the delays measured for this module are in terms of SPI Engine clock cycles.
The interconnect will only accept one master at a time, and will wait until a
sync handshake back to the master is completed to free the channel.
* Command valid input to command valid output (s0/s1 to m): 1 cycle if idle, 0
if already "owned" by the source (s0 or s1).
* Sync valid from m side to s0/s1 sync valid (back to originating master): 0
cycles.
* Sync ready to idle (delay after finishing transaction response): 1 cycle.
* Thus, 2 cycles per command minimum if changing masters, 3 if accounting for
sync (this is the worst case).
* 1 cycle per command (can accept back to back) if from same master.
* Thus, :math:`2+N_{cmd}` minimum cycles per :math:`N_{cmd}` "burst" from same
source.
* s0/s1_sdo_valid to m_sdo_valid: 0 if already "owned" by the source (s0 or
s1). Otherwise has to wait until s0/s1 owns the channel.
* m_sdi_valid to s0/s1_sdi_valid: 0 if already "owned" by the sink (s0 or s1).
Otherwise has to wait until s0/s1 owns the channel.
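The command-path figures above can be turned into a small calculation. The
sketch below is only an illustration: the function names are hypothetical, and
summing consecutive bursts assumes that each burst ends with a completed sync
handshake before the next source is granted the channel.

```python
def burst_cycles(n_cmd):
    """Minimum interconnect cycles for n_cmd back-to-back commands from one source.

    1 cycle to take ownership when idle, 1 cycle per command, and 1 cycle from
    sync ready back to idle, i.e. 2 + n_cmd.
    """
    if n_cmd < 1:
        raise ValueError("a burst needs at least one command")
    return 2 + n_cmd


def interleaved_cycles(bursts):
    """Minimum cycles for consecutive bursts (assumption: no overlap between them)."""
    return sum(burst_cycles(n) for n in bursts)


# A single command costs 3 cycles (the worst case per command),
# while a 4-command burst from the same source costs 6 cycles in total.
print(burst_cycles(1), burst_cycles(4))
```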
Execution Module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All the delays measured for this module are in terms of SPI Engine clock cycles.
See above: :ref:`instruction_execution_times`
* Every instruction requires a minimum of 1 cycle for communication between the
Offload Module and the Execution Module. Additionally, the Chip Select, Sleep,
Transfer and Sync instructions require another cycle for checking the idle
condition, for a total fixed delay of 2 cycles.
* Chip Select, Sleep and Transfer have additional cycle requirements due to
intentional delays in execution. These are detailed in
:ref:`instruction_execution_times`.
* SDI data delay: 0 cycles (sdi_data_valid arrives in the same cycle in which
the Transfer instruction finishes and the next command is accepted).
AXI Module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* AXI transaction to take effect internally: 1 cycle (AXI clock).
* Meaning: if counting delay to other parts of the design (e.g. command fifo),
this is the AXI delay. Other AXI delays affect only AXI throughput, creating
backpressure for the AXI master.
* Throughput: 4 cycles (AXI clock) per transaction.
* Command FIFO delay: depends on parametrization:
* Synchronous, 1 deep: 1 clk (AXI clock = SPI Engine clock).
* Asynchronous, 1 deep: 1 (AXI clock), + 1-2 (SPI Engine clock) (2FF CDC)
input to output; + 1-2 (AXI clock) (2FF CDC) until ready to accept next.
* Asynchronous, true FIFO: 2 (AXI clock) (mem write + bin2gray addr), + 1-2
(SPI Engine clock) (2FF CDC), + 2 (SPI Engine clock) (gray2bin + valid).
* AXI transaction start to command valid (total for async FIFO case): 3 AXI
clock + 3-4 SPI Engine clock.
* SDO Data FIFO delay: same as Command FIFO.
* SDI Data FIFO delay: depends on parametrization:
* Synchronous, 1 deep: 1 clk (AXI clock = SPI Engine clock).
* Asynchronous, 1 deep: 1 (SPI Engine clock), + 1-2 (AXI clock) (2FF CDC)
input to output; + 1-2 (SPI Engine clock) (2FF CDC) until ready to accept
next.
* Asynchronous, true FIFO: 2 (SPI Engine clock) (mem write + bin2gray addr), +
1-2 (AXI clock) (2FF CDC), + 2 (AXI clock) (gray2bin + valid).
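For the asynchronous true FIFO case, the total from AXI transaction start to
command valid listed above (3 AXI clock cycles plus 3-4 SPI Engine clock cycles)
can be converted into a time range. The following Python sketch is a rough
estimate only: the function name and the example clock frequencies are
assumptions, and simply adding the periods of two clock domains ignores their
phase relationship.

```python
def axi_to_command_valid_ns(f_axi_hz, f_spi_hz):
    """Return (min_ns, max_ns) from AXI transaction start to command valid.

    Asynchronous true FIFO case:
      AXI clock:        1 (transaction takes effect) + 2 (mem write + bin2gray)
      SPI Engine clock: 1-2 (2FF CDC) + 2 (gray2bin + valid)
    """
    axi_cycles = 1 + 2
    spi_cycles_min = 1 + 2
    spi_cycles_max = 2 + 2
    t_axi_ns = 1e9 / f_axi_hz
    t_spi_ns = 1e9 / f_spi_hz
    fixed_ns = axi_cycles * t_axi_ns
    return (fixed_ns + spi_cycles_min * t_spi_ns,
            fixed_ns + spi_cycles_max * t_spi_ns)


# Hypothetical clocks: 100 MHz AXI, 160 MHz SPI Engine.
min_ns, max_ns = axi_to_command_valid_ns(100e6, 160e6)
print(f"{min_ns:.2f} ns to {max_ns:.2f} ns")
```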