Newcomer question: what do I need to learn to develop with the CSR8645?

qianhng posted on 2020-1-7 14:02:32

I want to develop with the CSR8645. Which tools do I need to download, and what development knowledge should I learn?

忙忙碌碌 posted on 2020-1-7 18:00:54

The chip's characteristics, the various related circuits, that sort of thing.

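qianhng posted on 2020-1-7 20:53:21

This post was last edited by qianhng on 2020-1-7 20:56

wsnyy posted on 2020-1-7 18:00
The chip's characteristics, the various related circuits, that sort of thing.
I can't find any register or instruction set information in the datasheet. Where should I look?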

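qianhng posted on 2020-1-8 09:12:54

This post was last edited by qianhng on 2020-1-8 11:18

DSP notes 1
===================================================================
The DSP documentation in ADK4.0.1, the Kalimba DSP Reference Guide, is a set of HTML files that describe the input and output parameters of the library functions: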
ADK4.0.1\doc\dsp\html\index.html
The Kalimba DSP particularly targets audio processing applications for BlueCore. The likely audio processing applications include:
Sub-Band Coding (SBC) encoding and decoding, as defined in the Bluetooth Advanced Audio Distribution Profile
MP3 encoding and decoding, as defined in ISO/IEC 11172-3, and the sample rate extensions defined in ISO/IEC 13818-3
Advanced Audio Coding (AAC) encoding and decoding, as defined in ISO/IEC 13818-7
Alternative voice/Hi-Fi CODECs
Echo and noise cancellation
Speech and music enhancement
So the audio processing CSR offers looks fairly simple: some codecs, noise suppression and echo/howling cancellation, and sound enhancement.
===================================================================
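General DSP characteristics found on the Internet:
(1) One multiply and one add can be completed in a single instruction cycle;
(2) Program and data spaces are separate, so instructions and data can be accessed at the same time;
(3) Fast on-chip RAM, usually accessible as two blocks simultaneously over independent data buses;
(4) Hardware support for low-overhead or zero-overhead loops and jumps;
(5) Fast interrupt handling and hardware I/O support;
(6) Multiple hardware address generators that operate in a single cycle;
(7) Multiple operations can be executed in parallel;
(8) Pipelining is supported, so instruction fetch, decode and execute can overlap.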
===================================================================

qianhng posted on 2020-1-8 11:16:21

DSP notes 2
===================================================================
ADK4.0.1\doc\support\adkdocs\dsp\kalimba\CS-202067-UG.pdf
Kalimba Architecture 3 describes the DSP's registers, instructions, peripherals and some example routines.
Key features:
24-bit fixed point DSP core
80MIPS performance (40MIPS on some devices), which can be divided down for power saving
1 program memory and 2 data memory banks, all 3 of which can be accessed simultaneously in a single cycle
Flash/ROM support for both data and code, with caches to improve code performance
Single-cycle 24 x 24-bit multiply with 2 56-bit accumulators
Single-cycle barrel shifter with 56-bit input and 56-bit or 24-bit output
12-cycle divide (performed in the background)
Majority of instructions can be conditional
Zero overhead ring buffer indexing
Zero overhead looping and unconditional branching
Bit reversed addressing capability, and bit reverse data function
Largely orthogonal instruction set, which is quick to learn and easy to write in the algebraic assembler language
Stack instructions: PUSH, POP, PUSHM, POPM, FP/SP Adjust, FP/SP Relative LOAD/STORE, and other instructions featuring overflow detection
Low-power internal architecture
8 hardware program breakpoints and 2 data breakpoints

The key features of the Kalimba DSP peripherals include:
Close integration with the on-chip MMU, giving access to features such as DACs and ADCs
8 low-overhead read/write ports (11 in selected CSR devices) to transfer streaming data to and from the BlueCore subsystem
2 memory-mapped windows into the MCU RAM for data exchange
3 windows for access to the flash/ROM data memory
Memory-mapped interface to the I/O address map
Multiple interrupt sources including 2 24-bit timers
Read, write and direction control access to external PIO lines

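On-chip RAM:
DM1: 24-bit data memory 1
DM2: 24-bit data memory 2
PM: 32-bit program memory, with 1024-word or 512-word direct-mapped cache for flash/ROM program space and 32-word loop cache

The MMU interface consists of multiple read and write ports, so data can be exchanged efficiently with the rest of the IC.
Up to 32 firmware-controlled programmable I/O lines can be driven. Any digital I/O can be read directly, but the pin direction can only be changed for digital outputs that the MCU has enabled (done through the VM application, if present).
Interrupts are delayed by the execution time of 10 to 20 instructions.
Address generators (AGs): the AGs form the data memory addresses for indexed memory reads and writes. Each AG has 4 associated address pointers (index registers). When an index register is used for a memory access, it is post-modified by the value in a specified modify register or by a 2-bit constant. With two independent AGs, the DSP can generate two addresses at once for dual indexed memory accesses. Length values can be associated with 4 index registers to implement automatic modulo addressing for circular buffers. A base address can also be associated, which removes the need for circular buffers to be aligned. When the appropriate mode bit is set in the rFlags register, the output of AG1 is bit-reversed before being driven onto the address bus. This allows efficient addressing in radix-2 FFT algorithms.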
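The addressing behaviour described above can be modelled in a few lines of Python. This is only a behavioural sketch under stated assumptions, not the hardware's exact implementation: the function names `post_modify` and `bit_reverse` are made up for illustration, a length of 0 is assumed to mean plain linear addressing, and the bit reversal is done over a caller-chosen number of bits.

```python
def post_modify(index, modify, base, length):
    """Model of an index register update after an indexed access.

    Assumption: length == 0 means linear addressing (no wrap).
    Otherwise the index wraps inside [base, base + length).
    """
    if length == 0:
        return index + modify
    return base + (index - base + modify) % length


def bit_reverse(addr, bits):
    """Reverse the low `bits` bits of addr (radix-2 FFT ordering)."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (addr & 1)
        addr >>= 1
    return out


# A 3-word circular buffer at an arbitrary (unaligned) base address.
base, length = 0x103, 3
idx = base
visited = []
for _ in range(5):
    visited.append(idx)
    idx = post_modify(idx, 1, base, length)
```

Here `visited` walks 0x103, 0x104, 0x105 and then wraps back to 0x103, and `bit_reverse(i, 3)` for i = 0..7 gives the usual 8-point FFT ordering 0, 4, 2, 6, 1, 5, 3, 7.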
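There are 3 banks of 16 registers:
Bank 1 holds the general-purpose registers, which can be used by almost all instructions.
24/56-bit (rMAC, rMACB); 24-bit (general registers R0 to R9, loop counter R10, rLink, rFlags)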
   INT_UM_FLAG 15 0
   INT_BR_FLAG 14 0
   INT_SV_FLAG 13 0
   INT_UD_FLAG 12 0
   INT_V_FLAG 11 0
   INT_C_FLAG 10 0
   INT_Z_FLAG 9 0
   INT_N_FLAG 8 0
   UM_FLAG 7 0
   BR_FLAG 6 0
   SV_FLAG 5 0
   UD_FLAG 4 0
   V_FLAG 3 0
   C_FLAG 2 0
   Z_FLAG 1 0
   N_FLAG 0 0
Condition                            Condition flag state  Condition code
Z (zero) / EQ (equal)                Z = 1                 0 0 0 0
NZ (not zero) / NE (not equal)       Z = 0                 0 0 0 1
C (ALU carry) / NB (not ALU borrow)  C = 1                 0 0 1 0
NC (not ALU carry) / B (ALU borrow)  C = 0                 0 0 1 1
NEG (negative)                       N = 1                 0 1 0 0
POS (positive)                       N = 0                 0 1 0 1
V (ALU overflow)                     V = 1                 0 1 1 0
NV (not ALU overflow)                V = 0                 0 1 1 1
HI (unsigned higher)                 C = 1 AND Z = 0       1 0 0 0
LS (unsigned lower or same)          C = 0 OR Z = 1        1 0 0 1
GE (signed greater than or equal)    N = V                 1 0 1 0
LT (signed less than)                N != V                1 0 1 1
GT (signed greater than)             Z = 0 AND N = V       1 1 0 0
LE (signed less than or equal)       Z = 1 OR N != V       1 1 0 1
USERDEF (user defined)               USERDEF = 1           1 1 1 0
Always true                          don’t care            1 1 1 1
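
To check one's reading of the condition table, it can be written out as a small Python lookup. This is a sketch only: `condition_true` is a hypothetical helper name, the flags are passed in as plain 0/1 integers, and the user-defined condition is modelled as a single `userdef` input.

```python
def condition_true(code, n=0, z=0, c=0, v=0, userdef=0):
    """Evaluate a 4-bit condition code against the N, Z, C, V flags."""
    table = {
        0b0000: z == 1,               # Z / EQ
        0b0001: z == 0,               # NZ / NE
        0b0010: c == 1,               # C / NB
        0b0011: c == 0,               # NC / B
        0b0100: n == 1,               # NEG
        0b0101: n == 0,               # POS
        0b0110: v == 1,               # V
        0b0111: v == 0,               # NV
        0b1000: c == 1 and z == 0,    # HI
        0b1001: c == 0 or z == 1,     # LS
        0b1010: n == v,               # GE
        0b1011: n != v,               # LT
        0b1100: z == 0 and n == v,    # GT
        0b1101: z == 1 or n != v,     # LE
        0b1110: userdef == 1,         # USERDEF
        0b1111: True,                 # always true
    }
    return table[code]


# GT holds when the result is non-zero and N equals V.
gt_example = condition_true(0b1100, n=1, z=0, v=1)
```

Each complementary pair in the table (EQ/NE, HI/LS, GE/LT, GT/LE) differs only in the lowest code bit, so for any flag state exactly one of the two evaluates true.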

rMAC(B) is the overall 56-bit register
rMAC(B)0 is a 24-bit register that forms the lower part of the rMAC(B) register (bits 23:0)
rMAC(B)1 is a 24-bit register that forms the middle part of the rMAC(B) register (bits 47:24)
rMAC(B)2 is an 8-bit register that forms the higher part of the rMAC(B) register (bits 55:48)
rMAC(B)12 is a 32-bit register that is a combination of rMAC(B)2 and rMAC(B)1 that forms part of the rMAC(B) register (bits 55:24)
The 24-bit rounded and saturated version of rMAC(B) is often referred to as rMAC(B)1
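
The bit ranges above can be illustrated by slicing a 56-bit value in Python. This is a minimal sketch: `split_rmac` is a made-up name, the accumulator is treated as a two's-complement 56-bit pattern, and the rounding and saturation mentioned above are not modelled.

```python
MASK56 = (1 << 56) - 1


def split_rmac(value):
    """Split a signed 56-bit accumulator value into its named fields."""
    raw = value & MASK56  # two's-complement bit pattern
    return {
        "rMAC0": raw & 0xFFFFFF,             # bits 23:0
        "rMAC1": (raw >> 24) & 0xFFFFFF,     # bits 47:24
        "rMAC2": (raw >> 48) & 0xFF,         # bits 55:48
        "rMAC12": (raw >> 24) & 0xFFFFFFFF,  # bits 55:24
    }


fields = split_rmac(0x123456789ABCDE)
```

For the pattern 0x12 345678 9ABCDE this yields rMAC2 = 0x12, rMAC1 = 0x345678, rMAC0 = 0x9ABCDE and rMAC12 = 0x12345678.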

qianhng posted on 2020-1-9 11:10:10

DSP notes 3
===================================================================
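Bank 2 registers control indexed memory accesses.
They are 24 bits wide, but only the low 16 bits are used when accessing memory (AG1 index registers I0 to I3, AG2 index registers I4 to I7, modify registers M0 to M3, lengths L0 and L1 for I0 and I1, lengths L4 and L5 for I4 and I5).
Bank 3 registers can be changed using the stack instructions:
rMAC0 The low 24 bits of rMAC
rMAC2 The high 8 bits of rMAC
rMAC12 The middle 24 bits of rMAC (with sign extension into rMAC2 when popping)
rMACB0 The low 24 bits of rMACB
rMACB2 The high 8 bits of rMACB
rMACB12 The middle 24 bits of rMACB (with sign extension into rMACB2 when popping)
DoLoopStart Loop start address
DoLoopEnd Loop end address
DivResult Division result
DivRemainder Division remainder
B0 Base address 0; when non-zero it points to a circular buffer or a bit-reversed array
The base registers let a circular buffer sit at any address in memory; removing the placement restriction on circular buffers makes memory management easier
B1 Base address 1
B4 Base address 4
B5 Base address 5
FP Memory-mapped frame pointer
SP Memory-mapped stack pointer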

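Stalls (CPU wait cycles; the core is in WAIT, not IDLE):
The PC holds the address of the current instruction, and an internal instruction register holds the instruction being executed. Each instruction is fetched from memory and loaded into the instruction register one cycle later, which introduces a single level of pipelining into the program flow. Forwarding hardware nevertheless allows zero-overhead unconditional branches and prevents some pipeline hazards when memory reads occur. Stalls are imposed on certain instructions to shorten long combinatorial paths in the DSP. Some of these stalls are new in Kalimba Architecture 3 and do not occur on BlueCore5-Multimedia.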
Stall due to conditional branch, regardless of whether it is taken:
r0 = r0 - 1;
if POS jump dont_add_ten; // Stall here due to conditional branch
r0 = r0 + 10;
dont_add_ten:
Stall due to main-instruction memory read:
r0 = 100; r1 = M;
Stall due to modifying r10 before a do ... loop:
r10 = 100;
do loop; // Stall here due to modifying r10 before do..loop
r0 = r0 + 1;
loop:
Stall due to setting index register before a memory read:
I0 = 100;
r1 = r1 + r2, r0 = M;//Stall here, since I0 is set up immediately before this memory read
Stall due to changing rLink before an rts:
POP rLink;
rts; // Stall here due to modifying rLink before rts

External wait signals from peripherals may introduce further stalls, for example:
Accessing memory mapped registers
Reading PM flash/ROM
Accessing DM flash/ROM

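Memory accesses can also cause stalls. A memory read sets up the memory bus before the instruction, whereas a memory write happens at the end of the current instruction. This means that if the previous instruction performs a memory write and the current instruction reads from the same memory bank, the current instruction incurs a 1 clock cycle delay.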
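
The write-then-read rule in the paragraph above can be expressed as a small counting model in Python. It is a sketch under simplifying assumptions: `count_write_read_stalls` is a hypothetical name, each instruction is reduced to the sets of banks it reads and writes, and only this one rule is modelled (not the other stall sources listed earlier).

```python
def count_write_read_stalls(program):
    """Count stalls caused by a write followed by a read of the same bank.

    program: list of (reads, writes) pairs, each a set of bank names
    such as {"DM1"} or {"DM1", "DM2"}.
    """
    stalls = 0
    for (_, prev_writes), (cur_reads, _) in zip(program, program[1:]):
        if prev_writes & cur_reads:  # same bank written, then read
            stalls += 1
    return stalls


program = [
    (set(), {"DM1"}),      # writes DM1
    ({"DM1"}, set()),      # reads DM1 right after: stall
    ({"DM2"}, {"DM1"}),    # reads DM2, writes DM1
    ({"DM2"}, set()),      # reads DM2: different bank, no stall
]
stalls = count_write_read_stalls(program)
```

In this made-up sequence only the second instruction is delayed, so the model reports one stall. Placing the buffers of back-to-back writes and reads in different banks is the obvious way to avoid the penalty.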

1 instruction can contain up to 3 memory accesses:
1 as part of the main instruction
1 as an AG1 indexed memory access
1 as an AG2 indexed memory access
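The clock cycles that execute this instruction must handle this instruction's memory writes and the next instruction's memory reads. To do so, the DSP fills in a schedule table with a DM1 column and a DM2 column; each row is one clock cycle.
The schedule table is filled in the following order: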
Next instruction’s main read
Next instruction’s AG1 read
Next instruction’s AG2 read
Current instruction’s main write
Current instruction’s AG1 write
Current instruction’s AG2 write

for example:
// I0, I4, I5 all point to DM1
// r2 points to DM2
r0 = r0 + r1, M = r2, // I0 uses AG1 (write)
M = r3; // I4 uses AG2 (write)
r4 = M,// Main instruction (read)
r1 = M, // I5 uses AG2 (read)
M = r0; // I1 uses AG1 (write)
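Schedule table for the first instruction:
Cycle  DM1      DM2
3      r1 = M   r4 = M
2      M = r2
1      M = r3
The table is filled from the bottom up, and the operation at the end of the statement is scheduled first. M = r2 and M = r3 both access DM1, so M = r3 is filled in first. The next instruction has 1 access to DM1 and 2 accesses to DM2; only one DM1 access and one DM2 access can be scheduled together, and according to the fill order reads go first, so r4 = M is filled in.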

Avoiding stalls:
Zero overhead looping:
r10 = 10;
r1 = 4;// A stall would occur if this wasn’t here
do loop; // copies 10 words of data from
r0 = M; // address I0 to address I2.
M = r0; // Takes 22 cycles in total.
loop:


我爱蓝牙网 - 52Bluetooth - the most popular Bluetooth technology forum - Archiver
