CL-PS7500FE
System-on-a-Chip for Internet Appliance
17. FPA COPROCESSOR MACROCELL
The FPA is a floating-point accelerator for the ARM family of CPUs. It is designed to maximize the
performance/power, performance/cost, and performance/die size ratios while still providing balanced
floating-point and integer performance for ARM-based systems.
Typical performance in the range of 3–8 Mflops is expected at a clock frequency of 40 MHz; actual
performance depends on:
• the precision selected
• the system configuration
• the degree to which the floating-point code is scheduled and otherwise optimized
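As a rough sanity check, the quoted throughput can be converted into an average number of clock cycles per floating-point operation. The Python sketch below performs only that arithmetic. The function name is illustrative, and it assumes that "Mflops" means 10^6 floating-point operations per second.

```python
def cycles_per_flop(clock_hz: float, mflops: float) -> float:
    """Average clock cycles consumed per floating-point operation."""
    return clock_hz / (mflops * 1e6)


CLOCK_HZ = 40e6  # 40 MHz, the clock frequency quoted in the text

best = cycles_per_flop(CLOCK_HZ, 8.0)   # upper end of the quoted range
worst = cycles_per_flop(CLOCK_HZ, 3.0)  # lower end of the quoted range
print(f"{best:.1f} to {worst:.1f} cycles per operation")
# prints "5.0 to 13.3 cycles per operation"
```

These are averages implied by the headline figures, not per-instruction timings.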
The FPA in the CL-PS7500FE is an on-chip floating-point coprocessor connected to the ARM processor
core. It is a fully static design, and its low power consumption, especially in Standby mode, makes
it eminently suitable for portable and other power- and cost-sensitive applications. When used in
conjunction with its support code, the FPA fully implements the IEEE Standard for binary floating-point
arithmetic (ANSI/IEEE Std. 754-1985).
The design of the FPA is based on an 81-bit internal datapath, with autonomous load/store and arithmetic
units that can operate concurrently. Single, double, and extended precision IEEE formats are all
supported. The FPA achieves its high performance while remaining a low-cost and low-power solution by
employing RISC and other advanced design techniques. It is interfaced to the ARM CPU over a simple,
high-performance coprocessor bus. The ARM instruction pipeline is mirrored on the FPA so that
floating-point instructions can be executed directly with minimal communication overhead. Pipelining,
concurrent execution units, and speculative execution are all employed to improve performance without
having a great impact on power consumption.
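For readers unfamiliar with the IEEE formats mentioned above, the following Python sketch splits a value into its sign, biased exponent, and fraction fields for the single and double formats (8/23 and 11/52 exponent/fraction bits respectively). The `decode` helper and the `FORMATS` table are illustrative names, not part of any FPA software. The extended format and the FPA's 81-bit internal representation are not modelled, since their layout is not described in this section.

```python
import struct

# precision -> (exponent bits, fraction bits, float pack code, integer unpack code)
FORMATS = {
    "single": (8, 23, ">f", ">I"),
    "double": (11, 52, ">d", ">Q"),
}


def decode(value: float, precision: str):
    """Return (sign, biased exponent, fraction) of value in the given IEEE format."""
    exp_bits, frac_bits, float_code, int_code = FORMATS[precision]
    # Reinterpret the packed floating-point bytes as one unsigned integer.
    (raw,) = struct.unpack(int_code, struct.pack(float_code, value))
    sign = raw >> (exp_bits + frac_bits)
    exponent = (raw >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = raw & ((1 << frac_bits) - 1)
    return sign, exponent, fraction


print(decode(1.0, "single"))   # (0, 127, 0): exponent bias is 127
print(decode(-2.0, "double"))  # (1, 1024, 0): exponent bias is 1023
```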
A RISC approach has been taken in selecting between those floating-point instructions that are
candidates for implementation in the FPA and those handled by software support. The FPA instruction
repertoire includes only the basic operations plus compare, absolute value, round to integral value, and
floating-point to integer and integer to floating-point conversions. In addition, only normalized operands
and zeros are handled in hardware; operations on denormalized numbers, infinities, and NaNs are
handled by the support code. Only the inexact exception is dealt with by hardware; all other exceptions cause
the software support code to be called, whether or not the associated trap is enabled. This approach has
helped to minimize the die size while having a negligible effect on performance in most applications.
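The operand split described above can be illustrated with a small Python model. This is only a sketch: `handled_by` is a hypothetical helper, it considers double-precision operands only, and it looks solely at the operand class. It does not model the exceptions (other than inexact) that also cause the support code to be called.

```python
import math
import sys


def handled_by(x: float) -> str:
    """Classify a double-precision operand per the hardware/software split."""
    if math.isnan(x) or math.isinf(x):
        return "support code"  # NaNs and infinities
    if x != 0.0 and abs(x) < sys.float_info.min:
        return "support code"  # denormalized: nonzero, below the smallest normal
    return "hardware"          # normalized numbers and zeros


for operand in (1.5, 0.0, 5e-324, float("inf"), float("nan")):
    print(operand, "->", handled_by(operand))
```

Here `sys.float_info.min` is the smallest positive normalized double, so any nonzero magnitude below it is denormalized.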
ADVANCE DATA BOOK v2.0
June 1997