|
The DSP32C, manufactured by Lucent Technologies, is the
second-generation successor to the DSP32 introduced in
1985. Its 32-bit floating-point operations meet increased
precision
and dynamic range requirements unattainable by
fixed-point digital signal processors. Low power CMOS
technology enables 80 MHz operation, 20 MIPS (million
instructions per second) and 40 MFLOPS (million floating
point operations per second). A 50 MHz 25 MFLOPS version
of the DSP32C is available at a lower price for most
boards. The DSP executes code independently of any other
processor. Adding more DSPs increases throughput through
parallel processing.
Instruction Set
The C-like syntax of the DSP32C's assembly language is
easier to learn than mnemonic instructions and results in
more readable code. Even if the Lucent C compiler is used
to generate code for the DSP, hand optimization is
sometimes required for time-critical algorithms. The
following is an example of a 32-bit multiply/accumulate
with store to memory:
*r1++=a0=a1 + *r2++ * *r3++
The execution of this instruction simply follows the
conventions of the C language: "Multiply the 32-bit
floating-point values stored in the memory locations
pointed to by registers r2 and r3. Add the result to the
contents of accumulator a1, store the result in
accumulator a0, and write the result to the 32-bit memory
location pointed to by register r1. Post-increment
pointer registers r1, r2, and r3."
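Because the instruction is also a legal C statement, its
behaviour can be tried on a host computer. The sketch below
is only an illustration: the function name mac_store and its
parameters are hypothetical, the pointers stand in for
registers r1, r2, and r3, and host C floats are 32-bit values
rather than the DSP's 40-bit accumulators, so rounding on the
real device may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side helper (not part of any Lucent tool).
 * Applies the statement from the text n times.
 * r1, r2, r3 play the role of the DSP pointer registers;
 * a0 and a1 play the role of the accumulators. */
float mac_store(float *r1, const float *r2, const float *r3,
                size_t n, float a1)
{
    float a0 = a1;

    assert(r1 != NULL && r2 != NULL && r3 != NULL);

    for (size_t i = 0; i < n; i++) {
        /* Multiply *r2 by *r3, add a1, keep the sum in a0,
         * write it to *r1, then post-increment all three
         * pointers. a1 itself is never modified. */
        *r1++ = a0 = a1 + *r2++ * *r3++;
    }
    return a0; /* last value placed in a0 */
}
```

Calling mac_store(out, x, y, 3, 10.0f) with x = {1, 2, 3} and
y = {4, 5, 6} leaves 14, 20, and 28 in out, since a1 stays at
10 for every step.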
Number Conversions
All computers which host boards featured in this brochure
use the IEEE 754 standard representation of floating
point numbers. One DSP32C instruction converts the DSP's
internal format to the single-precision version of this
standard. Another instruction performs the translation
back to DSP format. Each conversion is a single
instruction, so it takes 50 nsec at 80 MHz.
Other available format conversions cover 8-, 16-, and
24-bit two's complement integers, μ-law, and A-law.
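As a host-side illustration of what the 24-bit integer
conversion involves, the sketch below sign-extends a 24-bit
two's complement value and converts it to a float. It is not
the DSP instruction itself; the function name int24_to_float
and the convention of passing the value in the low 24 bits of
a 32-bit word are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raw holds a 24-bit two's complement
 * value in its low 24 bits; the upper 8 bits must be zero. */
float int24_to_float(uint32_t raw)
{
    int32_t v;

    assert((raw & 0xFF000000u) == 0u);

    v = (int32_t)raw;
    if (raw & 0x00800000u) {
        /* Bit 23 is the sign bit: subtract 2^24 to obtain
         * the negative value. */
        v -= 0x01000000;
    }
    /* Every 24-bit integer fits exactly in a single-precision
     * float, which has a 24-bit significand. */
    return (float)v;
}
```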
Memory
The 20 nsec static RAM used on all boards with the 50 MHz
DSP32C processor runs at zero wait-state. When the 80 MHz
DSP is used, 15 nsec SRAM is required for zero wait-state
operation. For some boards, specifically the A5, B1, G1
and all VME products, 15 nsec SRAMs are not currently
available. The user then has two choices:
1) Specify 74 MHz operation. Slowing the processor's
clock lets the 20 nsec SRAMs meet the DSP's timing specs.
2) Run at 80 MHz with one wait-state SRAM. Although the
DSP can access internal RAM at zero wait-state, each
external memory access slows the processor down by 25%.
For non-critical applications, this configuration can be
run at zero wait-state, provided the user periodically
runs the memory diagnostics to be sure no errors are
occurring. CAC does not guarantee that boards will run
successfully beyond their specs.
All memory can be addressed as 8, 16, 24, or 32-bit
words, with 32-bit data accessed at the same speed as
8-bit data. The DSP32C has 6144 bytes of zero wait-state
on-chip static RAM and a 16 Mbyte addressing space for
external memory. Two independent external memory speed
partitions, A and B, permit a mix of fast and slow
memories, usually 0 and 1 wait-state. Instructions,
tables, and data can be arbitrarily located anywhere in
memory.
Slower or buffered memory can be accommodated by the
automatic insertion of wait-states (12.5 nsec each at
80 MHz) during each memory access. One wait-state memory
will slow the DSP program by 25% each time an instruction
reads or writes the slower memory.
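The 25% figure follows from the clock arithmetic, which can
be sketched as below. The four-clocks-per-instruction value is
an inference from the 80 MHz and 20 MIPS figures quoted
earlier, and the function names are hypothetical. The sketch
also assumes that a wait-state lasts one clock period and that
an instruction makes one access to the slower memory.

```c
#include <assert.h>

/* Assumption: inferred from 80 MHz / 20 MIPS = 4 clocks. */
#define CLOCKS_PER_INSTRUCTION 4

/* Clock period in nanoseconds for a clock given in MHz. */
double clock_period_ns(double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return 1000.0 / clock_mhz;
}

/* Time for one instruction, in nanoseconds, when its memory
 * access needs the given number of wait-states. Each
 * wait-state is assumed to add one clock period. */
double instruction_time_ns(double clock_mhz, int wait_states)
{
    assert(wait_states >= 0);
    return (CLOCKS_PER_INSTRUCTION + wait_states)
           * clock_period_ns(clock_mhz);
}
```

At 80 MHz this gives 50 nsec with no wait-states and 62.5 nsec
with one, a ratio of 1.25.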
Serial I/O Unit
The SIO port is used to interface to analog I/O and other
DSPs, communicating via a serial data stream. Double
buffering makes back-to-back transfers possible, allowing
the DSP32C program to begin a second transfer before the
first has been completed. The three I/O modes are
skip-on-flag, interrupt, and DMA. Data widths can be 8,
16, 24, or 32 bits. Clocks and strobes can be generated
internally or provided externally. The maximum clock rate
is 40 Mbps for the 80 MHz DSP and 25 Mbps for the 50 MHz
DSP.
Parallel I/O Unit
The PIO data bus is used to communicate with the host
computer. The auto-increment direct memory access (DMA)
provides high bandwidth, non-intrusive data transfers.
The host can perform cycle-stealing accesses with minimal
effect on DSP processing, and these accesses do not
require the DSP to execute any code.
Interrupt Operation
The DSP32C provides a single-level interrupt facility with
four internal and two external sources. The interrupts
are prioritized and are individually maskable. The DSP
code is programmed with an interrupt vector table which
contains six pairs of branches to the appropriate
servicing routine. Before branching to the interrupt
service routine, the DSP32C saves the state of the
machine and the four accumulators. On return from the
interrupt service routine, the state of the DSP and the
accumulators is restored.
The possible sources of interrupts are two external
signals, SIO input buffer full, and SIO output buffer
empty. The two PIO-related interrupts are not used on
these boards.
Data Arithmetic Unit
The DAU is the primary execution unit, performing
multiply/accumulate operations for signal processing
algorithms. Using four 40-bit accumulators, it executes
20 million instructions per second of the form:
a = b + c * d
Since each instruction can perform both a multiply and an
addition, the maximum throughput is 40 MFLOPS, a
standard benchmark used to rate DSPs.
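A typical use of the a = b + c * d form is an FIR filter,
where each tap costs one such step. The sketch below is
ordinary host C, not DSP32C code, and the names fir_sample
and peak_mflops are hypothetical. It also shows the MFLOPS
arithmetic: two floating-point operations per instruction.

```c
#include <assert.h>
#include <stddef.h>

/* One output sample of an FIR filter, built only from
 * steps of the form a = b + c * d (here b is a itself). */
float fir_sample(const float *coeff, const float *x, size_t taps)
{
    float a = 0.0f;

    assert(coeff != NULL && x != NULL);

    for (size_t i = 0; i < taps; i++) {
        a = a + coeff[i] * x[i]; /* one multiply, one add */
    }
    return a;
}

/* Peak MFLOPS when every instruction does one multiply
 * and one addition. */
double peak_mflops(double mips)
{
    return 2.0 * mips;
}
```

With the figures from the text, peak_mflops(20.0) is 40 for
the 80 MHz part. The same arithmetic gives 12.5 MIPS for the
25 MFLOPS, 50 MHz version.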
Control Arithmetic Unit
The CAU executes 16- or 24-bit fixed-point arithmetic
instructions at the rate of 20 million per second. It
performs the integer arithmetic and logic operations
using a 24-bit program counter and 22 general purpose
registers.
|