Original poster |
Posted on 2020-1-9 11:10:10
DSP Notes 3
===================================================================
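The second group of registers controls indexed memory access.
They are 24 bits wide, but only the low 16 bits are significant when accessing memory (AG1 index registers I0~I3, AG2 index registers I4~I7, modify registers M0~M3, length registers L0~L1 for I0~I1, and L4~L5 for I4~I5).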
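A minimal sketch of the "only the low 16 bits count" rule above, written in Python rather than Kalimba assembly. The names `ADDRESS_MASK` and `effective_address` are made up for illustration and are not part of any Kalimba toolchain.

```python
# Hypothetical helper: models a 24-bit index register whose low 16 bits
# are the only bits used when it addresses memory.
ADDRESS_MASK = 0xFFFF  # low 16 bits


def effective_address(reg24: int) -> int:
    """Return the address actually used for a 24-bit register value."""
    return reg24 & ADDRESS_MASK


print(hex(effective_address(0x123456)))
```

So two register values that differ only in their top 8 bits would address the same word in this model.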
The third group of registers can be changed using stack instructions.
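rMAC0 Low 24 bits of rMAC
rMAC2 High 8 bits of rMAC
rMAC12 The middle 24 bits of rMAC (with sign extension into rMAC2 when popping)
rMACB0 Low 24 bits of rMACB
rMACB2 High 8 bits of rMACB
rMACB12 The middle 24 bits of rMACB (with sign extension into rMACB2 when popping)
DoLoopStart Loop start address
DoLoopEnd Loop end address
DivResult Division result
DivRemainder Division remainder
B0 Base address 0; when non-zero, it points to a circular buffer or a bit-reversed array
Base registers let a circular buffer sit at any address in memory; removing the placement restriction on circular buffers makes memory management easier
B1 Base address 1
B4 Base address 4
B5 Base address 5
FP Memory-mapped frame pointer
SP Memory-mapped stack pointer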
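To make the base-register idea concrete, here is a rough Python model of a post-modified index that wraps inside a circular buffer starting at an arbitrary base address. This is a generic modulo model, not verified against the hardware; the function name `post_modify` and the treatment of a zero length as plain linear addressing are assumptions for the sketch.

```python
def post_modify(index: int, modify: int, base: int, length: int) -> int:
    """Return the index after adding `modify`, wrapped into [base, base + length).

    Assumption: length == 0 means no circular buffer (linear addressing).
    """
    if length == 0:
        return index + modify
    return base + (index - base + modify) % length


# A buffer of 8 words at an arbitrary (unaligned) base address.
base, length = 0x1235, 8
idx = base + 7                 # last word of the buffer
idx = post_modify(idx, 1, base, length)
print(hex(idx))                # wraps back to the base address
```

Because the wrap is computed relative to `base`, the buffer does not need to start at any particular alignment, which is the point the note about B0 makes.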
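Stalls (CPU wait cycles; the CPU is in WAIT, not IDLE):
The PC holds the address of the current instruction. An internal instruction register holds the instruction currently being executed. Each instruction is fetched from memory and loaded into the instruction register one cycle later. This introduces a single level of pipelining into the program flow. However, forwarding hardware allows zero-overhead unconditional branches and prevents some pipeline hazards when memory reads occur. Stalls are imposed on certain instructions to shorten long combinatorial paths in the DSP. Some of these stalls are new in Kalimba Architecture 3 and do not occur on BlueCore5-Multimedia.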
Stall due to conditional branch, regardless of whether it is taken:
r0 = r0 - 1;
if POS jump dont_add_ten; // Stall here due to conditional branch
r0 = r0 + 10;
dont_add_ten:
Stall due to main-instruction memory read:
r0 = 100;
r1 = M[r0];
Stall due to modifying r10 before a do ... loop:
r10 = 100;
do loop; // Stall here due to modifying r10 before do..loop
r0 = r0 + 1;
loop:
Stall due to setting index register before a memory read:
I0 = 100;
r1 = r1 + r2, r0 = M[I0, 1]; //Stall here, since I0 is set up immediately before this memory read
Stall due to changing rLink before an rts:
POP rLink;
rts; // Stall here due to modifying rLink before rts
External wait signals from peripherals may introduce further stalls, for example:
Accessing memory mapped registers
Reading PM flash/ROM
Accessing DM flash/ROM
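The stall cases listed above can be collected into a small checker. The Python sketch below is a toy model only: the `Instr` record, its field names and `count_stall_events` are invented for illustration, it counts stall events rather than cycles (the notes do not give a cycle cost for each case), and it ignores external wait signals.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    text: str
    writes: frozenset = frozenset()       # registers this instruction modifies
    cond_branch: bool = False             # conditional jump, taken or not
    main_mem_read: bool = False           # memory read in the main instruction
    index_reads: frozenset = frozenset()  # index registers used for indexed reads
    is_do: bool = False                   # do ... loop instruction
    is_rts: bool = False                  # return from subroutine


def count_stall_events(program):
    """Count the stall events from the listed cases in a straight-line program."""
    events = 0
    prev_writes = frozenset()
    for ins in program:
        if ins.cond_branch:
            events += 1                   # conditional branch
        if ins.main_mem_read:
            events += 1                   # main-instruction memory read
        if ins.is_do and "r10" in prev_writes:
            events += 1                   # r10 modified just before do..loop
        if ins.index_reads & prev_writes:
            events += 1                   # index register set just before the read
        if ins.is_rts and "rLink" in prev_writes:
            events += 1                   # rLink modified just before rts
        prev_writes = ins.writes
    return events


program = [
    Instr("r10 = 100;", writes=frozenset({"r10"})),
    Instr("do loop;", is_do=True),
]
print(count_stall_events(program))
```

Putting an unrelated instruction between the register write and the dependent instruction makes the event disappear in this model, which is the trick used in the zero-overhead loop example later in the post.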
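Memory accesses can also cause stalls. A memory read sets up the memory bus ahead of its instruction, whereas a memory write happens at the end of the current instruction. This means that if the previous instruction performs a memory write and the current instruction reads from the same memory bank, the current instruction incurs a 1 clock cycle delay.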
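The write-then-read rule above boils down to a set intersection on memory banks. A tiny Python sketch, with the function name and the bank labels chosen here purely for illustration:

```python
def read_after_write_delay(prev_write_banks, cur_read_banks) -> int:
    """Return 1 if the current instruction reads a bank the previous one wrote, else 0."""
    return 1 if set(prev_write_banks) & set(cur_read_banks) else 0


print(read_after_write_delay({"DM1"}, {"DM1"}))  # same bank
print(read_after_write_delay({"DM1"}, {"DM2"}))  # different banks
```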
1 instruction can contain up to 3 memory accesses:
1 as part of the main instruction
1 as an AG1 indexed memory access
1 as an AG2 indexed memory access
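The clock cycles that execute an instruction must handle that instruction's memory writes as well as the next instruction's memory reads. To do this, the DSP fills in a scheduling table with a DM1 column and a DM2 column; each row is one clock cycle.
The scheduling table is filled in the following order: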
Next instruction’s main read
Next instruction’s AG1 read
Next instruction’s AG2 read
Current instruction’s main write
Current instruction’s AG1 write
Current instruction’s AG2 write
For example:
// I0, I4, I5 all point to DM1
// r2 points to DM2
r0 = r0 + r1, M[I0,1] = r2, // I0 uses AG1 (write)
M[I4,1] = r3; // I4 uses AG2 (write)
r4 = M[r2], // Main instruction (read)
r1 = M[I5,1], // I5 uses AG2 (read)
M[I1,1] = r0; // I1 uses AG1 (write)
Scheduling table for the first instruction:
Cycle  DM1             DM2
3      r1 = M[I5,1]    r4 = M[r2]
2      M[I0,1] = r2
1      M[I4,1] = r3
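The scheduling table is filled from the bottom up, and the access at the end of the statement is entered first. M[I0,1] = r2 and M[I4,1] = r3 both access DM1, so M[I4,1] = r3 is entered first. The next instruction has one access to DM1 and two to DM2, but only one DM1 access and one DM2 access can be scheduled in the same cycle; going by the fill order, reads are entered first, so r4 = M[r2] is entered.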
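Here is a rough Python sketch of filling such a table for the example above. It walks the accesses in the listed fill order (next instruction's reads, then the current instruction's writes) and puts each one in the first row whose bank column is still free. That "first free row" placement is my assumption, and `fill_schedule` is a made-up name; the rows come out in the same top-to-bottom order as the table shown, and the number of rows is the number of cycles.

```python
def fill_schedule(next_reads, current_writes):
    """Build scheduling rows from (text, bank) pairs given in main, AG1, AG2 order.

    Each row is a dict mapping a bank name ("DM1"/"DM2") to the access using it.
    """
    rows = []
    for text, bank in list(next_reads) + list(current_writes):
        for row in rows:
            if bank not in row:        # bank column still free in this row
                row[bank] = text
                break
        else:
            rows.append({bank: text})  # no free slot, so one more cycle
    return rows


rows = fill_schedule(
    next_reads=[("r4 = M[r2]", "DM2"), ("r1 = M[I5,1]", "DM1")],
    current_writes=[("M[I0,1] = r2", "DM1"), ("M[I4,1] = r3", "DM1")],
)
for row in rows:
    print(f'{row.get("DM1", ""):<14} | {row.get("DM2", "")}')
```

Moving one of the two DM1 writes to DM2 would let it share a row with the other write in this model, which is why spreading buffers across both banks tends to reduce the cycle count.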
Preventing stalls:
Zero overhead looping:
r10 = 10;
r1 = 4; // A stall would occur if this wasn’t here
do loop; // copies 10 words of data from
r0 = M[I0,1]; // address I0 to address I2.
M[I2,1] = r0; // Takes 22 cycles in total.
loop:
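For anyone who finds the loop above hard to read, this is a functional model of what it does, in Python: r10 supplies the iteration count, and each pass reads a word through I0 and writes it through I2, with both indexes post-incremented by 1. It models the data movement only, not the cycle count; `copy_words` and the list standing in for data memory are illustrative.

```python
def copy_words(dm, src, dst, count):
    """Copy `count` words inside the list `dm`, mimicking the do..loop body."""
    i0, i2 = src, dst
    for _ in range(count):   # r10 = count; do loop;
        r0 = dm[i0]          # r0 = M[I0,1];
        i0 += 1
        dm[i2] = r0          # M[I2,1] = r0;
        i2 += 1
    return i0, i2            # final index register values


dm = list(range(100, 130))   # stand-in for data memory
print(copy_words(dm, 0, 20, 10))
print(dm[20:30])
```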