Embedded System Programming on ARM Cortex-M3/M4 Processor

Lecture Notes on Udemy

Qertile 郭泰爾
Feb 19, 2022

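Cortex-M3/M4 run the Thumb instruction set (Thumb-2, which mixes 16-bit and 32-bit encodings) rather than the 32-bit ARM instruction set,
yet they achieve performance comparable to the 32-bit instruction set.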

Other processors

ARM: Cortex-M series
Atmel (Microchip): AVR 8-, 16-, and 32-bit (Arduino series)
Texas Instruments: MSP430

TRM

Technical Reference Manual, the official documentation from ARM

Processor core, Processor, and Microcontroller

Processor = Processor core + processor peripherals (e.g. NVIC, SysTick)
Microcontroller = Processor + vendor peripherals (e.g. timers, I2C, USB)

e.g.
- Cortex-M4 core (made by ARM)
register set, ALU, barrel shifter, pipeline engine, instruction decoder …
- STM32F446 (made by ST)
Cortex-M4 (from ARM) plus other peripherals (from ST); the processor communicates with the peripherals through the bus matrix (I-bus, D-bus, System bus)

Printf output on ARM Cortex M3/M4/M7

Using SWD (ST-link debugger)
Using Semi-hosting (OpenOCD debugger)

  • using SWO pin (Serial Wire Output) of SWD interface(Serial Wire Debug)
SWO pin location
SWD inside M4 processor

SWD(Serial Wire Debug)

  • 2-wire protocol for accessing the ARM debug interface
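    - SWDIO: bidirectional data line, e.g. breakpoint information is sent to the CPU over this line
    - SWCLK: a clock driven by the host
  • An alternative to JTAG
  • Through SWD you can program the internal flash, access memory, set breakpoints, and halt/run the CPU
  • Combined with SWO, printf can be used for debugging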

SWO(Serial Wire Output)

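  • Used for SWV (Serial Wire Viewer)
  • The ITM (Instrumentation Trace Macrocell) contains a hardware FIFO
  • Data to be printed is written into the FIFO, sent out through the SWO pin, and captured by the ST-Link debug circuitry
  • Not every IDE supports SWO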
FIFO inside ITM unit
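
A minimal sketch of how one character might be pushed into the ITM FIFO, following the same pattern as the CMSIS `ITM_SendChar` helper. The names `itm_port_t` and `itm_send_char`, and the choice to pass the port by pointer, are assumptions made here so that the logic can also be exercised off-target. On a Cortex-M3/M4, stimulus port 0 sits at 0xE0000000, and tracing has to be enabled first (TRCENA in DEMCR, port 0 in the ITM trace enable register), which this sketch does not show.

```c
#include <stdint.h>

/* One ITM stimulus port, viewed either as a byte or as a word. */
typedef union {
    volatile uint8_t  u8;
    volatile uint32_t u32;
} itm_port_t;

/* Address of stimulus port 0 on a Cortex-M3/M4 with an ITM. */
#define ITM_STIM0_ADDR 0xE0000000u

/* Write one character to the given stimulus port (hypothetical helper). */
static void itm_send_char(itm_port_t *port, uint8_t ch)
{
    /* Reading the port returns non-zero when the FIFO can accept data. */
    while (port->u32 == 0u) {
        /* busy-wait until there is room in the FIFO */
    }
    port->u8 = ch;  /* a byte write queues one character for SWO */
}
```

On the target, the call would look like `itm_send_char((itm_port_t *)ITM_STIM0_ADDR, 'A');`. Routing `printf` through it is typically done by calling it from the C library's low-level write hook.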

JTAG

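  • An older debug mechanism, used with the ARM7/ARM9 families; needs 4 pins
  • On the Cortex-M series it is largely superseded by the SWD interface, which needs only two pins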

Operational mode of the processor

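The processor provides 2 operational modes:

  • Thread mode (user mode)
    All application code runs in this mode. After reset, the processor always starts in Thread mode.
  • Handler mode
    All exception and interrupt handlers run in this mode. When an exception or interrupt occurs while the application is running, the processor switches to Handler mode automatically.

The figure below shows the PSR (program status register) from the example program. Its exception number field (IPSR, PSR[8:0]) reads 10011 = 19.
According to the table below, exception number 19 is IRQ3 (19 - 16 = 3). A non-zero IPSR means the processor is in Handler mode.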

program status register
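
The decoding described above can be sketched in plain C. The function names are hypothetical; the sketch assumes the Cortex-M layout in which the exception number occupies PSR[8:0] and external interrupts start at exception number 16.

```c
#include <stdint.h>
#include <stdbool.h>

/* The exception number (IPSR) occupies bits [8:0] of the combined PSR. */
static uint32_t psr_exception_number(uint32_t psr)
{
    return psr & 0x1FFu;
}

/* Exception number 0 means Thread mode; anything else is Handler mode. */
static bool psr_in_handler_mode(uint32_t psr)
{
    return psr_exception_number(psr) != 0u;
}

/* IRQ0 is exception number 16. Returns -1 when no IRQ is being serviced. */
static int32_t psr_irq_number(uint32_t psr)
{
    uint32_t exc = psr_exception_number(psr);
    return (exc >= 16u) ? (int32_t)(exc - 16u) : -1;
}
```

With the value from the example, `psr_irq_number(0x13)` gives 3, matching IRQ3.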

Access levels of the processor

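The processor provides 2 access levels:

  • Privileged Access Level (PAL)
    Code can access all processor resources and registers. Code runs at PAL by default, and Handler mode always runs at PAL.
  • Non-Privileged Access Level (NPAL)
    In Thread mode, code can switch from PAL to NPAL through the CONTROL register, but it cannot switch back to PAL by itself. The switch back has to be made from Handler mode.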

Register set of the processor

Core Register in Cortex M4

Core register

R0-R12 are general-purpose registers, e.g. for data operations and for holding addresses.

  • SP (Stack Pointer, R13)
    PSP (Process Stack Pointer)
    MSP (Main Stack Pointer)
  • LR (Link Register, R14)
    Stores the return information for subroutines, function calls, and exceptions
  • PC (Program Counter, R15)
    Holds the address of the next instruction to execute. On reset, PC is loaded with the value stored at the reset vector (address 0x00000004). Bit[0] of that value is copied into the EPSR T-bit and must be 1.

Special Register

  • PSR (Program Status Register)
PSR (Program Status Register)
  1. APSR (Application Program Status Register)
    N: Negative
    Z: Zero
    C: Carry or borrow flag
  2. IPSR (Interrupt Program Status Register)
  3. EPSR (Execution Program Status Register)
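    T-bit (Thumb state bit, PSR[24]):
    If it is 1, the processor treats the next instruction as a Thumb ISA (Instruction Set Architecture) instruction
    If it is 0, the processor treats the next instruction as an ARM ISA instruction

ARM Cortex Mx processors support only the Thumb instruction set, so the T bit must stay at 1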

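  • Non-memory mapped register
    Has no address and is not part of the processor memory map, so it cannot be accessed through a pointer in C; it must be accessed with assembly instructions (e.g. MOV, MRS, MSR)
    e.g. GP register, SP, LR, PC, Special register
  • Memory mapped register
    Every register has its own address and can be read and written from C. There are two kinds:
    - Register of the processor: NVIC, MPU, SCB, DEBUG …
    - Register of the microcontroller: RTC, I2C, TIMER, CAN, USB …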

Inline assembly code in ARM GCC compiler

With the ARM GCC compiler, manipulating core registers with assembly from within a C program requires inline assembly code. The syntax is shown in the figure below.

example of inline assembly code syntax

LDR R0,[R1] // load the value at the address held in R1 into R0
STR R1,[R3] // store the value of R1 to the address held in R3

General form of an inline assembly statement
  • Constraint Modifier
    =: Write-only operand, usually used for all output operands
    +: Read-write operand, must be listed as an output operand
    &: Early-clobber operand, an output register that must not overlap any input register
  • Constraint Character
    r: general register r0-r15
    f: floating point register
    I: immediate value
    X: any operand

example of inline code

example: use inline assembly to implement swap
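
Since the figure is not reproduced here, the following is one possible sketch of a swap written with GCC extended inline assembly; it is not necessarily the code from the lecture. The function name `swap_u32` is hypothetical. The assembly path is compiled only for 32-bit ARM targets, and a plain C fallback is used elsewhere so the function behaves the same on a host machine.

```c
#include <stdint.h>

/* Swap two words in memory (hypothetical example function). */
static void swap_u32(uint32_t *a, uint32_t *b)
{
#if defined(__arm__) && defined(__GNUC__)
    uint32_t ta, tb;
    __asm volatile (
        "LDR %0, [%2]\n\t"   /* ta = *a */
        "LDR %1, [%3]\n\t"   /* tb = *b */
        "STR %0, [%3]\n\t"   /* *b = ta */
        "STR %1, [%2]\n\t"   /* *a = tb */
        : "=&r"(ta), "=&r"(tb)   /* outputs: write-only, early-clobber */
        : "r"(a), "r"(b)         /* inputs: the two addresses */
        : "memory");             /* clobber: memory is modified */
#else
    /* Fallback for non-ARM builds so the behaviour can be checked anywhere. */
    uint32_t t = *a;
    *a = *b;
    *b = t;
#endif
}
```

The `&` modifier matters here: both outputs are written before the inputs are used for the last time, so the compiler must not place them in the registers that hold `a` and `b`.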

Reset sequence of the processor

  1. PC = 0x00000000
  2. MSP = value @ 0x00000000 (the processor first initializes the Main Stack Pointer)
  3. PC = value @ 0x00000004 (the value here is the address of the reset handler)
  4. Jump to the reset handler and execute its instructions
  5. The reset handler calls main()
Flow chart of reset handler
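
The first steps of the reset sequence can be modeled on a host with an ordinary array standing in for the vector table. This is only an illustrative model; `reset_state_t` and `model_reset` are hypothetical names, and the addresses in the usage note are made-up example values.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t msp;    /* initial Main Stack Pointer */
    uint32_t pc;     /* address of the first reset handler instruction */
    bool     thumb;  /* value that ends up in the EPSR T-bit */
} reset_state_t;

/* Model what the processor fetches from the vector table at reset. */
static reset_state_t model_reset(const uint32_t *vector_table)
{
    reset_state_t s;
    uint32_t reset_vector;

    s.msp        = vector_table[0];          /* word at 0x00000000 */
    reset_vector = vector_table[1];          /* word at 0x00000004 */
    s.thumb      = (reset_vector & 1u) != 0u; /* bit[0] goes to the T-bit */
    s.pc         = reset_vector & ~1u;        /* execution address */
    return s;
}
```

For a table starting with `{0x20020000, 0x08000199}`, the model yields an MSP of 0x20020000, a T-bit of 1, and execution starting at 0x08000198.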

Privileged level and T bit

Privileged level

change privilege in Control Register
Use inline assembly to modify the CONTROL register to change the privilege level

At the unprivileged level, writes to special registers such as CONTROL are ignored; only privileged code, for example an exception handler, can modify them
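
A sketch of how the CONTROL register might be modified with inline assembly. The helper names are hypothetical. The bit positions (nPRIV = bit 0, SPSEL = bit 1) are those of the Cortex-M CONTROL register. The register access itself is compiled only for M-profile targets; the bit manipulation is kept in a pure function so it can be checked on any machine.

```c
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0)  /* 0 = privileged, 1 = unprivileged Thread mode */
#define CONTROL_SPSEL (1u << 1)  /* 0 = MSP, 1 = PSP in Thread mode */

/* Pure helper: CONTROL value that drops Thread mode to unprivileged. */
static uint32_t control_drop_privilege(uint32_t control)
{
    return control | CONTROL_NPRIV;
}

#if defined(__GNUC__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Read CONTROL with MRS (only assembled for Cortex-M builds). */
static inline uint32_t read_control(void)
{
    uint32_t value;
    __asm volatile ("MRS %0, CONTROL" : "=r"(value));
    return value;
}

/* Write CONTROL with MSR, then ISB so the change takes effect at once. */
static inline void write_control(uint32_t value)
{
    __asm volatile ("MSR CONTROL, %0\n\tISB" : : "r"(value) : "memory");
}

/* Switch Thread mode to the unprivileged access level. */
static inline void switch_to_unprivileged(void)
{
    write_control(control_drop_privilege(read_control()));
}
#endif
```

Once `switch_to_unprivileged()` has run, a later `write_control()` from Thread mode has no effect; getting back to the privileged level requires going through an exception handler.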

T bit

  • If the T bit of EPSR is set (1), the processor treats the next instruction as Thumb ISA. If it is reset (0), the processor treats the next instruction as ARM ISA
  • Cortex-Mx processors do not support the ARM ISA, so the T bit must be "1"
  • The LSB (bit[0]) of PC is linked to the T bit.
  • When a value is loaded into PC, bit[0] is loaded into the T bit
  • Hence, bit[0] of any address loaded into PC must be 1 (the address must be odd)

Bus interface and Memory map

Memory map of ARM Cortex Mx processor

Code Region

  • Different types of code memory are mapped here, e.g. embedded flash, ROM etc.
  • Processor by default fetches vector table information from here after reset.

SRAM Region

  • Connect to on-chip SRAM
  • The first 1MB is bit addressable
  • Can execute program code from here

Peripheral Region

  • Connect to on-chip peripheral. e.g. RTC, ADC etc.
  • The first 1MB is bit addressable (optional)
  • This is an eXecute Never (XN) region: code cannot be executed from here, and any attempt to do so triggers a fault exception

External RAM Region

  • Can execute program code from here

External Device Region

  • For external device and/or shared memory
  • This is an eXecute Never (XN) region.

Private Peripheral Bus (PPB) Region

  • For processor peripheral register. e.g. NVIC, Systick timer etc.
  • This is an eXecute Never (XN) region.

Bus protocols and bus interfaces

STM32F40xxx block diagram

Based on the Advanced Microcontroller Bus Architecture (AMBA) specification.
Used for on-chip communication inside an SoC (system on chip)

  • AHB-Lite (Advanced High-performance Bus)
    - High speed communication (up to 168 MHz on the STM32F40xxx)
    - Used for the main bus interfaces, e.g. PPB,
    S-code (system), D-code (data), I-code (instruction and vector table)
    - Some on-chip peripherals are reached through an AHB-APB bridge
  • APB (Advanced Peripheral Bus)
    - Low speed communication (up to 84/42 MHz on the STM32F40xxx)
    e.g. UART, SPI, I2C etc
    - Needs a bridge to connect to AHB, and through it to the processor

Bit banding

Example of alias address
  • An alias address (one word in the alias region) represents one specific bit in the bit-band region
  • A write through the alias address is a single atomic operation, so it cannot be interrupted by exceptions or interrupts.
  • A macro can calculate the alias address, which reduces the amount of assembly code
Formula to calculate alias address
Example of operating on an SRAM address by the alias method

For example, to change bit[7] of the byte at 0x20000200 using the alias method:
Alias_base = 0x22000000
Bit_band_memory_addr = 0x20000200
Bit_band_base = 0x20000000

alias address = Alias_base + 32 × (Bit_band_memory_addr - Bit_band_base) + 4 × bit = 0x22000000 + 32 × 0x200 + 4 × 7 = 0x2200401C. The target bit can then be changed by writing to the alias address.
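
The alias calculation for this example can be written as a small helper. The function and macro names are hypothetical; the base addresses are the SRAM bit-band values used in the example.

```c
#include <stdint.h>

#define SRAM_BITBAND_BASE 0x20000000u  /* start of the SRAM bit-band region */
#define SRAM_ALIAS_BASE   0x22000000u  /* start of the SRAM alias region */

/* Each bit of the bit-band region maps to one 32-bit word in the alias
 * region: 32 words per byte, 4 bytes per bit. */
static uint32_t bitband_alias_addr(uint32_t alias_base, uint32_t band_base,
                                   uint32_t byte_addr, uint32_t bit)
{
    return alias_base + (byte_addr - band_base) * 32u + bit * 4u;
}
```

On the target, writing 1 to `*(volatile uint32_t *)bitband_alias_addr(SRAM_ALIAS_BASE, SRAM_BITBAND_BASE, 0x20000200u, 7u)` would set bit 7 of the byte at 0x20000200, and writing 0 would clear it.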

Stack

Stack memory

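  • Part of main memory
  • Last In First Out (LIFO)
  • Stores processor registers, local variables, and context on an interrupt, exception, or function call

RAM

SRAM
  • Global data: stores global variables, static local variables, and even instructions
  • Heap: dynamic memory allocation (malloc, free …)
  • Stack: stores processor registers, local variables, and context on an interrupt, exception, or function call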

Stack Operation Models

Stack Operation Models

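The ARM Cortex Mx family uses a Full Descending stack

  • Full: the Stack Pointer points to the last item pushed
  • Empty: the Stack Pointer points to the next free location
  • Ascending: the stack grows toward higher memory addresses
  • Descending: the stack grows toward lower memory addresses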
Full Descending Stack in ARM Cortex Mx
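
To make the "full" and "descending" rules concrete, here is a small host-side model in which an array stands in for stack memory. It is an illustration only, and all names are hypothetical. Push decrements SP before storing; pop loads before incrementing.

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_WORDS 8u

typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t   sp;  /* index of the last pushed word; STACK_WORDS when empty */
} fd_stack_t;

/* An empty full-descending stack has SP just past the highest word. */
static void fd_init(fd_stack_t *s)
{
    s->sp = STACK_WORDS;
}

/* Push: decrement SP first, then store. Returns -1 on overflow. */
static int fd_push(fd_stack_t *s, uint32_t value)
{
    if (s->sp == 0u) {
        return -1;
    }
    s->sp -= 1u;
    s->mem[s->sp] = value;
    return 0;
}

/* Pop: load from SP first, then increment. Returns -1 if empty. */
static int fd_pop(fd_stack_t *s, uint32_t *out)
{
    if (s->sp == STACK_WORDS) {
        return -1;
    }
    *out = s->mem[s->sp];
    s->sp += 1u;
    return 0;
}
```

After two pushes, `sp` has moved down from 8 to 6 and always indexes valid data, which is exactly the "full" property.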

Banked stack design of the processor

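  • MSP (Main Stack Pointer)
    The default SP after reset. It is used by Thread mode (by default) and by all exception and interrupt handlers
  • PSP (Process Stack Pointer)
    The alternate SP, usable only in Thread mode. In an embedded OS it is typically the stack pointer for application tasks
  • After reset, SP takes the value of MSP by default
  • In Thread mode, setting the SPSEL bit in the CONTROL register makes SP refer to PSP
  • In Handler mode, SP is always MSP, and modifying SPSEL has no effect
  • Remember to initialize PSP before using it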
SPSEL in Thread Mode
this value located in XXX_FLASH.ld

exercise

implementation of change sp to psp
  • When SP is PSP, the processor automatically switches SP to MSP on exception entry and switches back to PSP when the exception returns

AAPCS (Procedure Call Standard for the Arm Architecture)

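  • Defines the responsibilities of the caller and the callee
  • A C compiler follows this standard automatically, so programmers writing pure C do not need to worry about it
  • R0, R1, R2, R3, R12, R14 (LR) are called "caller saved registers"
  • R4 - R11 are called "callee saved registers"
  • R0 - R3 pass arguments to the callee; any arguments beyond the first four are passed on the stack
  • R0 and R1 hold the return value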

Stacking during exception/interrupt

  • On exception entry (stacking) and exception return (unstacking), the processor automatically pushes the stack frame onto the stack and pops it back.
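
The stack frame can be described as a C struct. This sketch assumes the basic frame of eight words (no floating-point context), listed from the lowest address upward; the type and function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Basic exception stack frame, lowest address first (no FPU context). */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* LR of the interrupted code */
    uint32_t pc;    /* address to resume at */
    uint32_t xpsr;  /* program status at the time of the exception */
} exc_frame_t;

/* Read the return address out of a stacked frame. */
static uint32_t frame_return_address(const exc_frame_t *frame)
{
    return frame->pc;
}
```

The frame holds exactly the caller-saved registers plus PC and xPSR, which is why an exception handler can be written as an ordinary C function.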

Stack initialization tips

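  • Decide how much stack is needed
  • Know the stack model in use (FD, FA, ED, EA)
  • Decide where in memory the stack is placed (middle, end, or external memory)
  • When external SDRAM is used, stack setup is usually done in two stages: first initialize the internal RAM, then configure and initialize the external SDRAM in main.c
  • On ARM Cortex Mx processors, make sure the first entry of the vector table is the initial stack address (MSP); the startup code usually takes care of this
  • The linker script can define the boundaries of the stack, heap, and other regions in RAM
  • MSP is usually used in kernel mode and PSP in user mode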

Exception Model

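  • Exception: generated inside the processor
  • Interrupt: generated from outside the processor
  • When an exception or interrupt occurs, the processor switches to Handler mode
  • Cortex-M processors support 15 system exceptions (9 implemented, 6 reserved) plus up to 240 interrupts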

Exception type

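  • Reset, priority -3 (fixed)
    The highest-priority exception. It is triggered in the two cases below, and the processor enters the reset handler in privileged Thread mode
    - power up (power-on reset)
    - warm reset (reset button)
  • NMI (Non Maskable Interrupt), priority -2 (fixed)
    Always enabled and cannot be masked; can be triggered by software or by a peripheral
  • HardFault, priority -1 (fixed)
    Triggered when an error occurs during exception handling, or when a fault cannot be handled by any other exception (including when the corresponding exception is disabled)
  • MemManage
    Triggered by an access to a protected memory address while the MPU is enabled
    - MPU (Memory Protection Unit)
  • BusFault
    Rarely seen and rarely used
  • UsageFault
    Triggered in the following cases
    - undefined instruction
    - illegal unaligned access
    - invalid state on instruction execution
    - error on exception return
    - an unaligned address on word and halfword memory access
    - divide by zero
  • SVCall (Supervisor Call)
    An exception triggered by the SVC instruction; conceptually the same as a system call
    The SVC instruction encodes a service number; the processor enters Handler mode and the handler provides the service matching that number
  • PendSV
    Interrupt-driven, used to implement context switching
  • SysTick
    The SysTick timer is a processor peripheral, used to generate periodic exceptions and to implement context switching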

PPB (Private Peripheral Bus)

Peripheral of ARM Cortex Mx processor

The control registers of these peripherals are accessed through the 32-bit PPB

Address map of PPB

SCB Registers (System Control Block)

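  • Interrupts are configured in the NVIC, not here. The SCB handles exceptions, faults, …
  • Enable fault handlers and read the pending status of system exceptions
  • Trap the processor on divide-by-zero or unaligned data access
  • Control sleep and wake-up from sleep
  • Configure exception priorities
  • Control the SysTick timer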

NVIC (Nested Vectored Interrupt Controller)

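The ARM Cortex-M series supports up to 240 interrupts (0–239); how they are used (what each of the 240 is for) is decided by the chip manufacturer.

Implementation of NVIC
  • Each register type consists of 8 registers of 4 bytes, covering all 240 IRQs (the other 15 entries are 9 system exceptions and 6 reserved)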
Implementation of ISPR
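  • Interrupt
    Each interrupt has three state attributes:
    - Enable/Disable
    - Pending/Not pending
    Once interrupts are pending, the processor uses their priorities to decide which one is handled first; that interrupt then becomes active
    - Active/Not active: being serviced or not
  • NVIC_ISER (Interrupt Set Enable Registers, R/W)
    32 bits each, eight registers in total (covering interrupts 0–239)
    R: 0 = disabled, 1 = enabled
    W: 0 = no effect, 1 = enable the interrupt
  • NVIC_ICER (Interrupt Clear Enable Registers, R/W)
    32 bits each, eight registers in total (covering interrupts 0–239)
    R: 0 = disabled, 1 = enabled
    W: 0 = no effect, 1 = disable the interrupt
  • NVIC_ISPR (Interrupt Set Pending Registers, R/W)
    32 bits each, eight registers in total (covering interrupts 0–239)
    R: 0 = not pending, 1 = pending
    W: 0 = no effect, 1 = set pending
  • NVIC_ICPR (Interrupt Clear Pending Registers, R/W)
    32 bits each, eight registers in total (covering interrupts 0–239)
    R: 0 = not pending, 1 = pending
    W: 0 = no effect, 1 = clear pending
  • NVIC_IABR (Interrupt Active Bit Registers, RO)
    32 bits each, eight registers in total (covering interrupts 0–239)
    R: 0 = not active, 1 = active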
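
Because every NVIC register of this kind holds one bit per IRQ, finding the right register and bit for an IRQ number is a division and a remainder. The sketch below uses the Cortex-M3/M4 base addresses of these register banks; the function names are hypothetical.

```c
#include <stdint.h>

/* Base addresses of the NVIC register banks on Cortex-M3/M4. */
#define NVIC_ISER_BASE 0xE000E100u
#define NVIC_ICER_BASE 0xE000E180u
#define NVIC_ISPR_BASE 0xE000E200u
#define NVIC_ICPR_BASE 0xE000E280u
#define NVIC_IABR_BASE 0xE000E300u

/* Address of the 32-bit register in a bank that contains the IRQ's bit. */
static uint32_t nvic_reg_addr(uint32_t bank_base, uint32_t irq)
{
    return bank_base + 4u * (irq / 32u);
}

/* Mask selecting the IRQ's bit inside that register. */
static uint32_t nvic_bit_mask(uint32_t irq)
{
    return 1u << (irq % 32u);
}
```

On the target, enabling IRQ28 would be `*(volatile uint32_t *)nvic_reg_addr(NVIC_ISER_BASE, 28u) = nvic_bit_mask(28u);`. A plain write is enough because writing 0 to the other bits has no effect.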

Priority

On Cortex-M processors, a smaller number means a higher priority.

Interrupt Priority Register of ARM Cortex Mx processor

Priority level

The number of priority levels depends on how many bits of the Interrupt Priority Register the chip manufacturer implements
STM32F4x (ST): 16 levels, TM4C123Gx (TI): 8 levels

Interrupt Priority Register
  • 60 registers cover all 240 IRQs; each register has 4 bytes, one byte per IRQ
Implementation of priority level in Cortex Mx processor
  • Priorities of system exceptions are configured in the System Control Block, not in the NVIC
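
A sketch of how an IRQ number maps onto the priority registers and how a priority level is placed in the implemented upper bits of its byte. The function names are hypothetical; 0xE000E400 is the base of the NVIC priority registers on Cortex-M3/M4, and the number of implemented bits (4 on STM32F4x, 3 on TM4C123Gx) is passed in as a parameter.

```c
#include <stdint.h>

#define NVIC_IPR_BASE 0xE000E400u  /* first Interrupt Priority Register */

/* Index of the 32-bit priority register holding this IRQ (4 IRQs each). */
static uint32_t nvic_ipr_index(uint32_t irq)
{
    return irq / 4u;
}

/* Address of the priority byte that belongs to this IRQ. */
static uint32_t nvic_ipr_byte_addr(uint32_t irq)
{
    return NVIC_IPR_BASE + irq;
}

/* Shift a priority level into the implemented (upper) bits of the byte. */
static uint8_t nvic_encode_priority(uint32_t level, uint32_t implemented_bits)
{
    return (uint8_t)((level << (8u - implemented_bits)) & 0xFFu);
}
```

For example, level 3 on a device with 4 implemented bits is stored as 0x30; the lower 4 bits of the byte are ignored by the hardware.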

Pre-empt priority and sub-priority

Priority grouping
Case 1: Priority Group 0 (top), Case 2: Priority Group 5 (bottom)
  • Case 1
    Since only 4 bits are implemented, bits [7:4] are all pre-empt priority and the other bits are unused.
  • Case 2
    Since only 4 bits are implemented, bits [7:6] are pre-empt priority, bits [5:4] are sub-priority, and the other bits are unused.
  • If two interrupts with the same pre-empt priority and sub-priority request the processor at the same time, the IRQ with the lower number is served first (TIM2 in this case, IRQ28 < IRQ31)
TIM2 (IRQ28) will be processed first
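
The two cases can be reproduced with a small helper that splits a priority byte into its pre-empt and sub-priority fields. This is a sketch under stated assumptions: the group number is taken to be the PRIGROUP field value, where bits [7:PRIGROUP+1] are pre-empt priority and bits [PRIGROUP:0] are sub-priority, and unimplemented low bits are discarded. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t preempt;  /* pre-empt (group) priority */
    uint32_t sub;      /* sub-priority */
} prio_fields_t;

/* Split a priority byte for a given priority group and bit count. */
static prio_fields_t split_priority(uint8_t prio_byte, uint32_t prigroup,
                                    uint32_t implemented_bits)
{
    uint32_t value  = prio_byte;
    uint32_t unimpl = 8u - implemented_bits;  /* unused low bits */
    uint32_t split  = prigroup + 1u;          /* lowest pre-empt bit */
    prio_fields_t f;

    /* The split point cannot fall inside the unimplemented bits. */
    if (split < unimpl) {
        split = unimpl;
    }
    f.preempt = value >> split;
    f.sub     = (value & ((1u << split) - 1u)) >> unimpl;
    return f;
}
```

With 4 implemented bits, the byte 0xB0 is pre-empt priority 11 and no sub-priority in group 0, and pre-empt priority 2 with sub-priority 3 in group 5.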
Behavior of double pended interrupt

Exception/Interrupt entering and exiting

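On entering an interrupt service routine (ISR), the CPU generates an EXC_RETURN value that records which mode and stack to return to, and stores it in LR.

Before leaving the ISR, the saved LR value, i.e. EXC_RETURN, is written into PC (for example by a POP instruction), which triggers the exception return.

On exception return, the processor restores the stacked PC from the stack selected by EXC_RETURN and resumes at the interrupted address.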

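  • EXC_RETURN
    32 bits; in this simplified view only bit 2 and bit 3 matter, and the other bits are treated as reserved
    - bit 3 : 1 = return to Thread mode; 0 = return to Handler mode
    - bit 2 : 1 = return with PSP; 0 = return with MSP
  • EXC_RETURN Value:
    - 0xFFFFFFF1 : return to Handler mode; restore the pre-interrupt state from the main stack and use MSP as SP
    - 0xFFFFFFF9 : return to Thread mode; restore the pre-interrupt state from the main stack and use MSP as SP
    - 0xFFFFFFFD : return to Thread mode; restore the pre-interrupt state from the process stack and use PSP as SP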
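
The three EXC_RETURN values listed above can be decoded with two bit tests. In this sketch bit 3 selects the return mode and bit 2 selects the stack, which is the assignment consistent with all three values; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit 3: 1 = return to Thread mode, 0 = return to Handler mode. */
static bool exc_return_to_thread(uint32_t exc_return)
{
    return (exc_return & (1u << 3)) != 0u;
}

/* Bit 2: 1 = unstack from the process stack (PSP), 0 = main stack (MSP). */
static bool exc_return_uses_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0u;
}
```

A fault handler can apply the second test to its LR value to decide whether the stacked frame is found through MSP or PSP.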

Fault Handling

Fault exceptions in Cortex Mx processor

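A fault is an exception generated by the processor. Cortex Mx processors have up to 15 system exceptions, of which 9 are implemented and the rest are reserved.

HardFault, MemManage, BusFault, and UsageFault are the fault exceptions among them.

HardFault: enabled by default
1. Another fault occurs while its handler is not enabled; the fault is escalated to HardFault
2. A BusFault during a vector fetch is also escalated to HardFault
3. If halt-mode debug and the debug monitor are both disabled, executing a breakpoint in the program causes a HardFault
4. Executing an SVC instruction inside the SVC handler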

HardFault Status Register

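MemManage: disabled by default; can be enabled in the System Handler Control and State Register (SHCSR)
1. Accessing a memory region protected by the processor or the MPU
2. Accessing a "Privileged only" memory region from Non-Privileged Thread mode
3. Writing to a region the MPU marks as read-only
4. Executing code from the "peripheral" memory region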

System Handler Control and State Register (SHCSR)

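BusFault: disabled by default
1. A bus interface error raised while accessing a memory device
2. The processor's bus interface attempts to access an invalid or restricted memory location
3. The device is not ready to accept a memory transfer
4. May occur when accessing external memory (e.g. SDRAM) through a DRAM controller
5. Unprivileged access to the PPB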

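UsageFault: disabled by default
1. Executing an undefined instruction (e.g. ARM ISA)
2. Executing a floating-point instruction while the floating-point unit (FPU) is not enabled
3. Attempting to switch to ARM state to execute ARM ISA (T-bit = 0)
4. Attempting to return to Thread mode while an exception/interrupt is still active
5. Dividing by zero (only if the "divide by zero trap" is enabled)
6. Unaligned data access (only if the unaligned data access trap is enabled)
7. Unaligned data access by load/store multiple instructions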
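
Since MemManage, BusFault, and UsageFault are disabled by default, they have to be switched on in the SHCSR, and the two optional UsageFault traps in the Configuration and Control Register (CCR). The sketch below takes the register as a pointer so it can be tried off-target; the function names are hypothetical, while the addresses and bit positions are those of the Cortex-M3/M4 System Control Block.

```c
#include <stdint.h>

#define SHCSR_ADDR 0xE000ED24u  /* System Handler Control and State Register */
#define CCR_ADDR   0xE000ED14u  /* Configuration and Control Register */

#define SHCSR_MEMFAULTENA (1u << 16)
#define SHCSR_BUSFAULTENA (1u << 17)
#define SHCSR_USGFAULTENA (1u << 18)
#define CCR_UNALIGN_TRP   (1u << 3)
#define CCR_DIV_0_TRP     (1u << 4)

/* Enable the MemManage, BusFault, and UsageFault handlers. */
static void enable_configurable_faults(volatile uint32_t *shcsr)
{
    *shcsr |= SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA;
}

/* Enable the divide-by-zero and unaligned-access traps. */
static void enable_usage_traps(volatile uint32_t *ccr)
{
    *ccr |= CCR_DIV_0_TRP | CCR_UNALIGN_TRP;
}
```

On the target, the calls would be `enable_configurable_faults((volatile uint32_t *)SHCSR_ADDR);` and `enable_usage_traps((volatile uint32_t *)CCR_ADDR);`. Without them, each of these faults escalates to HardFault.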

interrupt handling

bus interfaces and bus matrix

memory architecture, bit banding, memory map

endianness

aligned and unaligned data transfer

bootloader and IAP (In Application Programming)
