In this article, we’ll begin the Kinetis Deep Dive by setting up an embedded build and debug toolchain. Then, we can test the toolchain on an evaluation board through GDB.
I’m going to use the first-person plural pronoun throughout this set of articles. “We” sounds much more inclusive than “I”, and if you’re reading these articles, I hope that you will participate in this process. Participation is not especially expensive. You will need access to a relatively modern computer and some familiarity with programming in C. There are plenty of great tutorials and study guides out there to bring you up to speed. In addition, you will need access to the evaluation board I’m using here. For the initial setup, I will be using the FRDM-K22F, which can be purchased for around $29.
Application
Before we begin the process of bootstrapping an embedded toolchain, it will be useful to consider the application we are working toward in this series. The primary focus of the initial articles is to explore the technology provided by the Kinetis parts and the Cortex-M4 platform. However, without a clear goal in mind, these articles could go on for quite a while with no practical outcome. So, to help scope these articles, I will describe a rather simple application (so simple as to be frivolous for the power of this chipset), and we will build firmware in support of this goal.
The application is a simple Pomodoro clock with an OLED display, a separate battery-backed RTC module, a Bluetooth Low Energy peripheral module, an audio amplifier, and a speaker. These, along with a few arcade-style buttons, are all of the peripherals with which our microcontroller board will interact. The goal is to create a simple clock that can perform the 25-minute and 5-minute countdown sequences that are part and parcel of this time management technique. Clearly, this project is not very practical, since one could buy a smartphone app or a timer for far less than the cost of the parts here. But these peripherals and this application allow us to explore some real embedded concepts without digging into the complexities of a real embedded project. Specifically, this project lets us explore the three main communication protocols found in many real-life embedded applications: SPI, UART, and I2C. Also, because the Kinetis part we are using has support for digital audio, I’m including an I2S audio amplifier.
For the exact parts I am using, I picked a single popular electronics hobby supplier, Adafruit. Here are the components I selected. Note that we won’t be using these components until much later in this project, so if you do intend to follow along, you can wait to purchase these until I post articles building the drivers for each. Note that these components are in kit form, and will require some light soldering. However, they each fit into a standard breadboard to make testing with our evaluation board quite easy.
Debug Tool
Now that we have an application in mind, we can consider setting up an embedded toolchain. The evaluation board has built-in support for OpenSDA via bootloader firmware provided by Segger. The first step will be to download and install the Segger J-Link Software and Documentation Pack for your platform as well as the OpenSDA bootloader firmware for the FRDM-K22F. Note that you will need access to a Windows PC to load the bootloader firmware. However, the rest of this bootstrapping process will work on Windows, Linux, or Mac OS once the bootloader for the Freedom board has been loaded. Please follow the instructions provided in those links to set up both the bootloader for the FRDM-K22F board and the J-Link software.
Toolchain Build Script
Now that we have debugging enabled on the evaluation board, we will need an embedded build and debug toolchain that can target this board. The GNU Compiler Collection fits this bill nicely. We will need to bootstrap this compiler as a cross-compiler that targets the ARMv7 architecture using the ARM Embedded ABI. Although the Kinetis part we are targeting has an FPU, we will be using the soft float ABI to start since this simplifies some work in task switching. We can circle back and consider implementing hard float support later on.
The following instructions assume that you have access to a Unix-like environment. Windows users can install Cygwin or potentially use the Windows 10 Bash Shell. I have tested the following under Windows with Cygwin, but I have not tested this with any other Windows features. I have also tested this on Linux and Mac OS.
We are going to build an automated script that will verify that our build environment has the tools we need to build the toolchain, download the source tarballs we need to build this toolchain, verify that they downloaded completely, and then extract, configure, and install each component needed in this toolchain. There are a lot of steps, which is why we’re building a script. Don’t do by hand that which can be automated, especially if it will be repeated often. Furthermore, it is good practice to ensure that a process as important as setting up a build toolchain is repeatable. Setting up a script like this is crucial when setting up a toolchain that an embedded engineering team may use, so it’s good to get into the practice.
We will build this script from the bottom up. This will be a bash script; bash is already present on most Unix systems and is easy to install where it is not. We’ll start with the preamble and a few basic functions for checking that directories and executables exist. For those who want to cheat, the complete script and usage instructions can be found here.
The first function, check_directory_exists, checks that the directory provided in the first argument exists. The next function, check_exe, checks that an executable provided by the first argument exists. We will use the first function to ensure that an environment variable, ARM_TOOLCHAIN_DIR, is defined by the user of this script.
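As a minimal sketch of what these two helpers might look like: the function names come from the text above, but the bodies below are assumptions, and they return a non-zero status rather than exiting so that the caller decides how to react.

```shell
# Sketch only: names are from the article, bodies are assumed.

# Succeeds if the directory named by $1 exists; otherwise reports an error.
check_directory_exists() {
    if [ -z "$1" ] || [ ! -d "$1" ]; then
        echo "error: directory '$1' does not exist" >&2
        return 1
    fi
}

# Succeeds if the executable named by $1 can be found on the PATH.
check_exe() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: required executable '$1' was not found" >&2
        return 1
    fi
}
```

A caller would typically follow each check with `|| exit 1` so that the script stops at the first missing prerequisite.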
We want the build process in the script to be restartable. If an error occurs, or if we need to stop and restart the script, it’s useful if it can reason about what has already been done and start where it left off. The remaining functions will provide rudimentary restart support, which should be good enough to ensure that this build process won’t be annoying if we have to tweak something in the middle. The first function, download_if_missing, will only download a tarball from a mirror if the tarball does not already exist in the current directory.
This function uses curl to perform the download. We use the truncated progress bar, follow redirects, and let curl decide the name of the output file using the URL.
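One way this function could be written is sketched below. The argument order (file name first, then URL) is an assumption; the curl options are the standard ones for a compact progress bar, following redirects, and naming the output after the remote file.

```shell
# Sketch only: the argument order is assumed.
# $1 = tarball file name, $2 = full URL of that tarball on a mirror.
download_if_missing() {
    if [ -f "$1" ]; then
        echo "$1 already downloaded; skipping"
        return 0
    fi
    # -#: compact progress bar, -L: follow redirects,
    # -O: name the output file after the last component of the URL path.
    curl -# -L -O "$2"
}
```

Because the check is only for the file's existence, a partially downloaded tarball will be skipped on the next run, which is why the verification steps that follow matter.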
Once we download files, we need to verify that they downloaded completely, and have some basic assurances that we downloaded the right file. To do this, we use whichever utilities the developers of each tarball suggest. This isn’t a particularly secure way of handling this problem, but setting up a secure keychain and maintaining some of the other sorts of policies we’d need for vendor branching is a bit beyond this article. Suffice it to say that if you are looking for a pretty good way to ensure that files have downloaded correctly and you don’t suspect a malicious man-in-the-middle, this approach will work fine. If you are building a toolchain for something mission critical, research provenance techniques.
The first thing we want to do is grab a few public keys we will need to verify the signatures of source tarballs. The following function grabs a public key by identifier from a given key server, and saves it to the GPG key ring.
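A minimal sketch of such a function follows. The name fetch_public_key and its argument order are hypothetical, since the text does not name the function; the gpg options themselves are standard.

```shell
# Sketch only: fetch_public_key is a hypothetical name.
# $1 = key server host name, $2 = key identifier.
fetch_public_key() {
    # --recv-keys imports the named key into the user's GPG key ring.
    gpg --keyserver "$1" --recv-keys "$2"
}
```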
Next, we can verify the signature of a tarball with the verify_signature function. This function uses the signature provided in the first argument to verify the file provided in the second argument.
If this process fails, both the tarball and the signature file should be deleted and the script should be run again. Most likely, the download failed.
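A sketch of verify_signature under the argument order described above might look like this. The error message is an assumption; `gpg --verify` with a detached signature file followed by the signed file is standard usage.

```shell
# Sketch only: the body is assumed.
# $1 = detached signature file, $2 = file that the signature covers.
verify_signature() {
    if ! gpg --verify "$1" "$2"; then
        echo "error: signature check failed; delete $2 and $1, then rerun" >&2
        return 1
    fi
}
```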
Next, we can verify the MD5 hash of a file using verify_md5_ugly.
As evidenced by the name of the function, MD5 hashes are all but worthless as a security measure. The algorithm suffers from major vulnerabilities, and it is possible to generate two different files with the same hash in a surprisingly small amount of time. Against a deliberate attacker, an MD5 hash offers little more assurance than a CRC, although it still catches accidental corruption. Unfortunately, many open-source projects still use MD5 hashes, even though they should know better.
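A possible shape for verify_md5_ugly is sketched below. The argument order is an assumption, and the fallback to `md5 -q` is there because Mac OS ships that tool instead of md5sum.

```shell
# Sketch only: the argument order is assumed.
# $1 = file to check, $2 = expected MD5 hash in lowercase hex.
verify_md5_ugly() {
    if command -v md5sum >/dev/null 2>&1; then
        actual=$(md5sum "$1" | cut -d ' ' -f 1)
    else
        actual=$(md5 -q "$1")    # BSD and Mac OS spelling
    fi
    if [ "$actual" != "$2" ]; then
        echo "error: MD5 mismatch for $1" >&2
        return 1
    fi
}
```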
To round out our verification functions, the verify_sha512 function compares a file against the SHA-512 hash provided.
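The SHA-512 check can follow the same pattern. This is a sketch with an assumed argument order; it uses sha512sum where available and falls back to `shasum -a 512`, which is what Mac OS provides.

```shell
# Sketch only: the argument order is assumed.
# $1 = file to check, $2 = expected SHA-512 hash in lowercase hex.
verify_sha512() {
    if command -v sha512sum >/dev/null 2>&1; then
        actual=$(sha512sum "$1" | cut -d ' ' -f 1)
    else
        actual=$(shasum -a 512 "$1" | cut -d ' ' -f 1)
    fi
    if [ "$actual" != "$2" ]; then
        echo "error: SHA-512 mismatch for $1" >&2
        return 1
    fi
}
```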
At this point, we have everything we need to download and verify tarballs. Once the tarballs are downloaded and verified, they need to be extracted, configured, built, optionally tested, and installed. For each of these steps, we will use dot files to track our progress. These dot files will be created after a given step completes successfully, and the next time the script is run, the existence of these dot files will allow the script to skip over work it has already performed.
The first function, extract_once, will extract a given tarball just once, and will enforce this extraction policy through the existence of a dot file that is created after the extraction successfully completes.
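The dot file technique can be sketched as follows. The arguments and the dot file naming scheme (`.<tag>.extracted` in the current directory) are assumptions made for illustration.

```shell
# Sketch only: arguments and dot file naming are assumed.
# $1 = tag used to name the dot file, $2 = tarball to extract.
extract_once() {
    if [ -f ".$1.extracted" ]; then
        echo "$1 already extracted; skipping"
        return 0
    fi
    # The dot file is created only if tar exits successfully.
    tar -xf "$2" && touch ".$1.extracted"
}
```

If the extraction fails partway through, no dot file is written, so the next run of the script retries the step.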
Next, configure_once runs the configure script provided with the tarball just once, enforcing this policy through the existence of a dot file that is created after the configure process successfully completes. This function takes four arguments: the tag name for the dot file, the workspace to configure, any additional configure options to pass to the configure script, and the name of the build directory to be used to build this package. This last argument is optional. If it exists, then the build directory will be created and configure will be run from that directory. If it does not exist, then configure will run from the package root directory.
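Here is a sketch of configure_once that follows the four arguments described above. It assumes the build directory sits directly inside the workspace, that the install prefix is ARM_TOOLCHAIN_DIR, and that the dot file is named `.<tag>.configured`.

```shell
# Sketch only: the body, prefix handling, and dot file name are assumed.
# $1 = tag, $2 = workspace (package root), $3 = extra configure options,
# $4 = optional build directory, created directly inside the workspace.
configure_once() {
    if [ -f ".$1.configured" ]; then
        return 0
    fi
    # A subshell keeps the caller's working directory unchanged.
    (
        cd "$2" || exit 1
        if [ -n "$4" ]; then
            mkdir -p "$4" && cd "$4" || exit 1
            # $3 is deliberately unquoted so it splits into separate options.
            ../configure --prefix="$ARM_TOOLCHAIN_DIR" $3
        else
            ./configure --prefix="$ARM_TOOLCHAIN_DIR" $3
        fi
    ) && touch ".$1.configured"
}
```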
The build_once function builds a package just once, enforcing this policy through the existence of a dot file that is created after the build process successfully completes. This function takes four arguments: the tag to be used to create the dot file, the workspace name, the build directory, and the build target to pass to make. Both the build directory and build target arguments are optional. If the build directory is not specified, build_once will build from the package root. If the build target argument is not specified, then build_once will make the default build target.
Of note in this function is the existence of a MAKE_OPTS variable. This variable can be set by the user or passed as a variable assignment when this script is invoked to pass some options to make. For instance, to speed up the build process, it is usually worth setting MAKE_OPTS to -j4 or -j8, depending upon the number of CPU cores you have. This will cut down the build time considerably.
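A sketch of build_once with MAKE_OPTS support might look like the following. The dot file name is an assumption, and the unquoted expansions are intentional.

```shell
# Sketch only: the body and dot file name are assumed.
# $1 = tag, $2 = workspace, $3 = optional build directory,
# $4 = optional make target.
build_once() {
    if [ -f ".$1.built" ]; then
        return 0
    fi
    (
        cd "$2/$3" || exit 1
        # MAKE_OPTS and $4 are unquoted so that empty values disappear and
        # a value such as "-j4" is passed to make as an ordinary argument.
        make $MAKE_OPTS $4
    ) && touch ".$1.built"
}
```

With this shape, invoking the toolchain script as `MAKE_OPTS=-j4 ./script-name` would parallelize every build step, where script-name stands for whatever the script is called.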
In general, testing is critical for success. This is often true when building certain source packages as well. The multiprecision libraries on which gcc depends can be finicky on certain platforms. We definitely want to make sure that these libraries have built correctly; otherwise gcc may crash or, even worse, generate bad code. The test_once function runs the test suite of a newly built package. As before, this test suite is run just once, and the policy is enforced through the existence of a dot file that is created after the test suite runs successfully. The function takes four arguments: the tag to be used to create the dot file, the workspace name, the build target that kicks off the test suite, and the build directory. The build directory argument is optional. If not specified, then the test suite will be run from the package root.
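The same pattern applies to test_once. In this sketch, which uses an assumed dot file name, a failing test suite leaves no dot file behind, so the failure is seen again on the next run instead of being silently skipped.

```shell
# Sketch only: the body and dot file name are assumed.
# $1 = tag, $2 = workspace, $3 = make target that runs the test suite,
# $4 = optional build directory.
test_once() {
    if [ -f ".$1.tested" ]; then
        return 0
    fi
    (
        cd "$2/$4" || exit 1
        make $MAKE_OPTS "$3"
    ) && touch ".$1.tested"
}
```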
Once a package has been extracted, configured, built, and tested, all that is left is to install it. The install_once function installs a package just once, using a dot file to enforce this policy. It takes four arguments: the tag name used to generate the dot file, the workspace, the optional build directory, and an optional install target name. If the build directory is not specified, installation is run from the package root. If the install target is not specified, then “install” is used as the target. The installation directory is specified by configure_once during the configuration step; each package is installed in ARM_TOOLCHAIN_DIR.
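A sketch of install_once, again with an assumed dot file name, shows how the default target can be supplied with a parameter expansion.

```shell
# Sketch only: the body and dot file name are assumed.
# $1 = tag, $2 = workspace, $3 = optional build directory,
# $4 = optional install target (defaults to "install").
install_once() {
    if [ -f ".$1.installed" ]; then
        return 0
    fi
    (
        cd "$2/$3" || exit 1
        make "${4:-install}"
    ) && touch ".$1.installed"
}
```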
Those are all of the functions we need to create the build toolchain. Now, we can use these functions to run our script. First, we do a little sanity checking and environment setup. We want to make sure that the ARM_TOOLCHAIN_DIR environment variable is set and points to a real directory. We want to add this directory to the executable and library path so we can use tools after they have been built. Finally, we want to check that all tools required by this script have been installed. Using the functions we defined, this is short work.
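The setup steps just described could be sketched as below. The wrapper function prepare_environment, the REQUIRED_TOOLS variable, and the default tool list are all hypothetical; the text only states that the directory is validated, the paths are extended, and the required tools are checked.

```shell
# Sketch only: prepare_environment and REQUIRED_TOOLS are hypothetical names.
prepare_environment() {
    if [ -z "$ARM_TOOLCHAIN_DIR" ] || [ ! -d "$ARM_TOOLCHAIN_DIR" ]; then
        echo "error: ARM_TOOLCHAIN_DIR must name an existing directory" >&2
        return 1
    fi
    # Make freshly installed tools and libraries visible to later steps.
    PATH="$ARM_TOOLCHAIN_DIR/bin:$PATH"
    LD_LIBRARY_PATH="$ARM_TOOLCHAIN_DIR/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
    # The default tool list is an assumption based on the steps in this article.
    for tool in ${REQUIRED_TOOLS:-curl gpg tar make gcc}; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "error: required tool '$tool' is not installed" >&2
            return 1
        fi
    done
}
```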
Next, we want to download each of the required packages. We will also download public keys and signature files. Finally, we verify each of the packages to ensure that they were downloaded correctly.
The last step in the script is to extract and build each package in the right order. Using the functions we defined above, this is a very easy process.
That’s it. Combining all of these into a script gives us a bit of automation that can build the ARM toolchain for our Kinetis part. This script is easily modified to track new versions of GCC or other dependencies, and it can be easily adapted to other chipsets. If we want someone else to be able to build our embedded project, this script can come in handy for ensuring that the exact same toolchain is available to everyone.
As a final note, from this point on, we’ll need to update our build environment to pull in the new executables for this toolchain.
Testing the Toolchain
The last thing we’ll do in this article is test that the toolchain works. Note that this test will only work if the Segger tool was downloaded and the Segger OpenSDA firmware was loaded onto the evaluation board.
First, we need to build a linker control file that will allow us to create firmware images for the K22 microcontroller on the evaluation board. The linker control file defines where important segments of the code should go. The linker uses this file to build a firmware image that can be executed by the microcontroller. Among the details we have to get right are the sizes of the memory regions and their locations, such as RAM and code flash. This information can be discovered by reading the reference manual for the part on the evaluation board, which can be found here. Be sure to download and save this reference manual, as it is pretty much the bible for how we proceed with building the embedded OS and drivers.
The first thing we need to define in the linker control file is the entry point for our firmware image. We are using Newlib for our C runtime, so the entry point is predefined by Newlib as _start. We will also provide default values for the stack and heap sizes, and also provide the option of creating an interrupt vector table in RAM.
ENTRY(_start)
SZ_STACK = DEFINED(__sz_stack__) ? __sz_stack__ : 0x0400;
SZ_HEAP = DEFINED(__sz_heap__) ? __sz_heap__ : 0x0400;
SZ_RAM_VECTOR_TABLE = DEFINED(__sz_ram_vec_tab__) ? 0x0400 : 0x0000;
By default, we provide a kilobyte each for the stack and the heap, which is what we will use for the kernel’s stack and heap. This stack will also be used by the interrupts, so we want it to be large enough to allow re-entry into the kernel by the interrupts later on.
Next, we need to define the memory areas for this MCU. According to the reference manual, the memory map looks something like this.
| Address Range | Description |
|---|---|
| 0x00000000 - 0x07FFFFFF | Program Flash / Constants |
| 0x08000000 - 0x1BFFFFFF | FlexBus / Reserved |
| 0x1C000000 - 0x1FFFFFFF | SRAM_L |
| 0x20000000 - 0x200FFFFF | SRAM_H |
| … | … |
The important things to notice in this memory map are the beginning of program flash, the end of SRAM_L, and the beginning of SRAM_H. In the Kinetis family of parts, the two halves of SRAM are anchored, but their sizes can vary. The end of SRAM_L is anchored at address 0x1FFFFFFF. The beginning of SRAM_H is anchored at 0x20000000. These two halves form a contiguous region, but that region is anchored at the middle. According to the reference manual, this part has 128KB of RAM, which is split evenly between these two regions. Therefore, with a little math, we know that SRAM_L begins at address 0x1FFF0000 and has a length of 64KB, or 0x00010000 bytes. We know that SRAM_H begins at address 0x20000000 and has a length of 0x00010000 bytes.
The interrupt vector table, by default, starts at address 0x00000000. There are 256 entries in the vector table, each of which is a 32-bit, or four-byte, address. Therefore, the total size of the interrupt vector table is 1024 bytes, or 0x400 bytes. Immediately after the vector table is the flash config, which is 16 bytes, or 0x10 bytes, in length. Then, the main program flash starts at offset 0x410. The total flash size on this chip is 512KB, or 0x00080000 bytes. Subtracting the sizes of the interrupt vector table and the flash config, we get a length for the main program flash of 0x0007FBF0 bytes.
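The arithmetic behind these origins and lengths is easy to double-check with shell arithmetic. The variable names below are made up for this check; the input sizes are the ones quoted from the reference manual above.

```shell
# Recompute the memory map values from the sizes quoted in the text.
sram_half=$((128 * 1024 / 2))             # 64KB in each SRAM half
sram_l_start=$((0x20000000 - sram_half))  # SRAM_L ends just below SRAM_H
vectors=$((256 * 4))                      # 256 four-byte vectors
flash_config=$((0x10))                    # 16-byte flash config field
flash_total=$((512 * 1024))               # 512KB of program flash
text_start=$((vectors + flash_config))
text_length=$((flash_total - text_start))

printf 'SRAM_L   origin: 0x%08X, length: 0x%08X\n' "$sram_l_start" "$sram_half"
printf 'mem_text origin: 0x%08X, length: 0x%08X\n' "$text_start" "$text_length"
```

This prints an SRAM_L origin of 0x1FFF0000 with length 0x00010000, and a mem_text origin of 0x00000410 with length 0x0007FBF0, matching the values derived above.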
Putting all of this together, here is the memory map for our linker control file:
MEMORY
{
mem_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000400
mem_flash_config (RX) : ORIGIN = 0x00000400, LENGTH = 0x00000010
mem_text (RX) : ORIGIN = 0x00000410, LENGTH = 0x0007FBF0
mem_data (RW) : ORIGIN = 0x1FFF0000, LENGTH = 0x00010000
mem_data_2 (RW) : ORIGIN = 0x20000000, LENGTH = 0x00010000
}
Object files place executable code, constants, and initialized data in different sections. The rest of the linker control file tells the linker where to place each of these sections in memory. We use the offsets provided in the memory map to organize these sections in a way that provides the linker with enough information to decide where each section goes.
SECTIONS
{
We’ll start with the interrupt vector table. We place all .isr_vector sections that are defined in object files into the .interrupts section. We then place this section in the mem_interrupts memory region.
.interrupts :
{
__VECTOR_TABLE = .;
. = ALIGN(4);
KEEP(*(.isr_vector))
. = ALIGN(4);
} > mem_interrupts
We also set the symbol __VECTOR_TABLE to the location where this interrupt vector table is located. We use 32-bit alignment for each of the vectors, and we use a glob pattern to pull all .isr_vector sections into this section.
Next, we define a section for flash configuration data. This will be placed in the mem_flash_config memory region.
.flash_config :
{
. = ALIGN(4);
KEEP(*(.FlashConfig))
. = ALIGN(4);
} > mem_flash_config
Next, we define a section to hold executable code. This will be placed in the mem_text region. There are a lot of glob patterns here. The most important one is .text, but we throw in a few of the other common sections that may be encountered depending upon toolchains and languages used.
.text :
{
. = ALIGN(4);
*(.text)
*(.text*)
*(.rodata)
*(.rodata*)
*(.glue_7)
*(.glue_7t)
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
} > mem_text
Next, we provide some sections that deal with exception unwinding and stack traces. These may be useful later.
.ARM.extab :
{
*(.ARM.extab* .gnu.linkonce.armextab.*)
} > mem_text
.ARM :
{
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
} > mem_text
The .ctors and .dtors sections glob together sections that are used by object constructors and destructors during program initialization and shutdown. We also include .preinit_array, .init_array, and .fini_array, which round out the initialization and shutdown function pointers needed to start up a C/C++ runtime environment.
.ctors :
{
__CTOR_LIST__ = .;
KEEP (*crtbegin.o(.ctors))
KEEP (*crtbegin?.o(.ctors))
KEEP (*(EXCLUDE_FILE(*crtend?.o *crtend.o) .ctors))
KEEP (*(SORT(.ctors.*)))
KEEP (*(.ctors))
__CTOR_END__ = .;
} > mem_text
.dtors :
{
__DTOR_LIST__ = .;
KEEP (*crtbegin.o(.dtors))
KEEP (*crtbegin?.o(.dtors))
KEEP (*(EXCLUDE_FILE(*crtend?.o *crtend.o) .dtors))
KEEP (*(SORT(.dtors.*)))
KEEP (*(.dtors))
__DTOR_END__ = .;
} > mem_text
.preinit_array :
{
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array*))
PROVIDE_HIDDEN (__preinit_array_end = .);
} > mem_text
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array*))
PROVIDE_HIDDEN (__init_array_end = .);
} > mem_text
.fini_array :
{
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP (*(SORT(.fini_array.*)))
KEEP (*(.fini_array*))
PROVIDE_HIDDEN (__fini_array_end = .);
} > mem_text
That wraps up the text data. We can now create a symbol that represents the end of text data, __etext, and a symbol that represents the beginning of ROM data, __DATA_ROM.
__etext = .;
__DATA_ROM = .;
If RAM interrupts are enabled, they will go at the beginning of mem_data. The .interrupts_ram section picks up any of these. We can then compute a few related symbols that can be used by the firmware to set up the RAM interrupt vector if defined.
.interrupts_ram :
{
. = ALIGN(4);
__VECTOR_RAM__ = .;
__interrupts_ram_start__ = .;
*(.interrupts_ram)
. += SZ_RAM_VECTOR_TABLE;
. = ALIGN(4);
__interrupts_ram_end__ = .;
} > mem_data
__VECTOR_RAM = DEFINED(__sz_ram_vec_tab__)
? __VECTOR_RAM__
: ORIGIN(mem_interrupts);
__RAM_VECTOR_TABLE_SIZE_BYTES = DEFINED(__sz_ram_vec_tab__)
? (__interrupts_ram_end__ - __interrupts_ram_start__)
: 0x0;
Next, we define a section to hold all initialized data.
.data : AT(__DATA_ROM)
{
. = ALIGN(4);
__DATA_RAM = .;
__data_start__ = .;
*(.data)
*(.data*)
. = ALIGN(4);
__data_end__ = .;
} > mem_data
__DATA_END = __DATA_ROM + (__data_end__ - __data_start__);
text_end = ORIGIN(mem_text) + LENGTH(mem_text);
ASSERT(__DATA_END <= text_end, "region mem_text overflowed with text and data")
The last assertion checks that the initialized data, whose load image is stored in flash immediately after the text, does not run past the end of the mem_text region.
Next, we define a section to hold the .bss data, or uninitialized data.
.bss :
{
. = ALIGN(4);
__START_BSS = .;
__bss_start__ = .;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
__bss_end__ = .;
__END_BSS = .;
} > mem_data
Finally, we place the heap and stack in the mem_data_2 region and make some simple assertions to make sure that our sizes and offsets are sane.
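/* Heap: symbol names other than __HeapLimit are assumed here. */
.heap :
{
. = ALIGN(8);
__end__ = .;
PROVIDE(end = .);
__HeapBase = .;
. += SZ_HEAP;
__HeapLimit = .;
} > mem_data_2
.stack :
{
. = ALIGN(8);
. += SZ_STACK;
} > mem_data_2
__StackTop = ORIGIN(mem_data_2) + LENGTH(mem_data_2);
__StackLimit = __StackTop - SZ_STACK;
PROVIDE(__stack = __StackTop);
ASSERT(__StackLimit >= __HeapLimit,
"region mem_data_2 overflowed with stack and heap")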
To close out the linker control file, we null out the .ARM.attributes sections so these do not appear in the firmware image. These are still available for debugging purposes.
.ARM.attributes 0 : { *(.ARM.attributes) }
}
We’ll name the linker control file k22_board.ld.
Dummy Interrupt Vector Table
To start our example code, we will need to define a dummy interrupt vector table. Since we just want to test that our toolchain works, we are going to stub out this table. In subsequent articles, we’ll add interrupt handlers to this table and turn on interrupts.
Example Program
With the linker control file completed and the dummy interrupt vector table defined, we can now build a simple example program to test that the build toolchain properly builds an image that we can load, and that the debugger allows us to step through the code. We’ll name this file main.c.
The other thing we will need to provide is a system exit function. When main exits, the _exit function is eventually called by the C runtime. Generally, this function is used to do any operating-system-specific work necessary to return control of the system back to the operating system. In our case, we are running on bare metal, and there is no OS to which to return. Instead, we’ll write an _exit function that just enters an infinite loop. We’ll name this file sysexit.c.
For simplicity’s sake, we’ll write a quick bash script for compiling our code. Later on, we’ll replace this with a GNU Make build script, but for now, we don’t need anything that elaborate. This script grabs all of the C source files in the current directory, and performs a single-shot compile using our new GCC compiler. We’ll call this script build.sh.
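A build script along these lines might look like the sketch below. The arm-none-eabi- tool prefix, the CPU and float flags, and the CC override are assumptions based on the toolchain described earlier; the file names come from this article. Depending on how Newlib was configured, additional flags may be needed to get a clean link.

```shell
# Sketch only: compiler name and flags are assumed, not the article's script.
build_firmware() {
    # CC can be overridden to point at a differently named cross-compiler.
    "${CC:-arm-none-eabi-gcc}" \
        -mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
        -g -O0 -Wall \
        -T k22_board.ld -Wl,-Map=test.map \
        -o test.elf ./*.c
}
```

In build.sh itself, the function would be followed by a single call to build_firmware. The -Wl,-Map option asks the linker to write the map file alongside the ELF image.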
After running this script, two files should be created. test.elf is the ELF file for our image, and test.map is a linker map.
At this point, we can plug the evaluation board into the computer’s USB port, and start up the Segger GDB server. How this is started is dependent upon the installation. But, if running from the command-line, the startup should look something like this:
Next, in a separate terminal window, we’ll want to start up the ARM debugger built as part of the toolchain.
From within GDB, we first want to load the firmware image file.
file test.elf
Now, we want to connect to the remote target.
target remote :2331
Now, we’ll want to halt the target CPU so we can load the firmware image.
monitor halt
load
At this point, the firmware image is loaded. We want to set a breakpoint in main just before the global variable z is set.
list main
break 5
Now, we’ll let the target run until it hits our breakpoint.
cont
To verify that our code is working, we will first print the value of z. It should be 0, since z is in the .bss section.
print z
Now, step over the instruction to set z.
next
Verify that z was set by printing z again.
print z
If z has been set to 7, then we have verified that the toolchain, the runtime library, and the debugger are all running correctly. That’s all for this article. Next time, we’ll begin building some task management functionality.