CL-PS7500FE
System-on-a-Chip for Internet Appliance
The labels 1, 2, 3, 4, and 5 indicate the cycles when these instructions are fetched on the CPD[31:0] bus,
while A, B, and C indicate the cycles when the floating-point instructions are issued to their respective
units in the FPA.
The first store multiple instruction (1) is issued (A) to the load/store unit, resulting in 12 words of data
being transferred on CPD[31:0] as shown by the shaded boxes on the timing diagram. Meanwhile, the
divide instruction (2) is issued (B) to the arithmetic unit (AU), which then begins execution speculatively; its
progress through the Prepare, Calculate, Align, and Round stages of the AU pipeline is shown by the
shaded boxes on the timing diagram.
The second SFM instruction (3) is issued (C) to the load/store unit as soon as that unit is ready. This second SFM
then executes while the AU is still busy with the divide instruction; the second set of shaded boxes on the
CPD[31:0] bus indicates the 12 words of data being transferred for the second SFM instruction. This
example shows how the divide instruction’s execution time can effectively be hidden behind other instructions.
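The overlap described above can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not a cycle-accurate model of the FPA: the cycle counts (12 cycles per SFM, 30 cycles for the divide) are hypothetical placeholders, the names serial_cycles and overlapped_cycles are invented for illustration, and the model assumes the divide has no data dependence on either SFM.

```python
# Toy model of two FPA units working concurrently.
# All cycle counts below are HYPOTHETICAL; consult the timing
# diagram for the real figures.

def serial_cycles(ops):
    """Total cycles if every instruction waited for the previous one."""
    return sum(cycles for _unit, cycles in ops)

def overlapped_cycles(ops):
    """Total cycles with in-order issue to independent units.

    Each unit handles one instruction at a time, and an instruction
    cannot issue earlier than one cycle after its predecessor issued.
    """
    unit_free = {}      # cycle at which each unit next becomes free
    earliest_issue = 0  # earliest cycle the next instruction may issue
    finish = 0
    for unit, cycles in ops:
        start = max(earliest_issue, unit_free.get(unit, 0))
        end = start + cycles
        unit_free[unit] = end
        earliest_issue = start + 1
        finish = max(finish, end)
    return finish

# Sequence from the text: SFM (1), divide (2), SFM (3).
ops = [
    ("load_store", 12),  # first SFM: 12 words, assumed 1 word/cycle
    ("arithmetic", 30),  # divide: assumed latency
    ("load_store", 12),  # second SFM
]

print("serial:    ", serial_cycles(ops))
print("overlapped:", overlapped_cycles(ops))
```

In this model both SFM instructions complete while the arithmetic unit is still working on the divide, so the overlapped total is set by the divide alone.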
NOTE: The concurrency between ARM integer unit execution and FPA execution can also be exploited. Contact
ARM Ltd. for further details on optimizing floating-point code for the FPA.
FLOATING-POINT INSTRUCTION SET
ADVANCE DATA BOOK v2.0
June 1997