STM32 Baremetal Examples
If you’ve been reading the posts about STM32s that I’ve been writing, I owe you an apology. Usually when people write microcontroller tutorials,
UART is one of the first peripherals that they talk about, and I’ve gone far too long without mentioning it. It is such a fundamental peripheral that I vaguely thought I’d already written about it until I got a couple of comments asking about it, so thank you for those reminders!
UART stands for “Universal Asynchronous Receiver / Transmitter”, and it is a very simple serial communication interface. In its most basic form, it only uses two data signals: “Receive” (
RX) and “Transmit” (
TX). Since it is asynchronous (no clock signal), both devices need to use the same “baud rate”, which is basically the transmission frequency measured in Hertz. If you have a baud rate of 9600, then you expect a new bit every 1 / 9600 of a second. (But technically, your actual transmission frequency will be slightly lower than the baud rate, because the standard includes extra “control” bits which are sent in addition to the actual data.)
One of the most common uses of
UART is to transmit strings of text or binary data between devices. That, combined with the availability of cheap off-the-shelf
USB / UART bridges, makes it a popular way to add some interactivity and a working
printf(...) function to bare-metal applications.
And while a simple 2-wire
UART connection is reliable enough for most purposes, there is also an extended
USART standard which adds an optional “clock” line to synchronize the two devices’ timing; the extra “S” stands for “Synchronous”. The standards are otherwise very similar, so you might see
USART used interchangeably in some places. There are also a set of extra “flow control” signals, but I’m not going to talk about those or
USART functionality in this post.
I will cover a few basic ways to use the STM32
UART peripherals, though:
- Setting up the
UARTperipheral to send / receive data one byte at a time.
- Implementing the C standard library’s
printf(...)function to send text strings over
- Using interrupts to receive data as it arrives.
- Setting up a “ring buffer” to handle continuous data reception.
If any of that sounds interesting, keep reading! The target hardware will be either an
STM32L432KC “Nucleo-32” board or an
STM32F103C8 “pill” board; they cost around $11 or $2-5 respectively. The “Nucelo” boards are easier to use, because they include a debugger. If you use a “pill” board, you’ll also need an
ST-LINK debugger and a
USB / UART bridge such as a
CP2102 board. And these examples are all available in a GitHub repository, if you just want a quick reference.
Across the globe, people seem to enjoy decorating their homes, communities, and outdoor spaces with lights and ornaments during the winter holidays. Maybe it helps with the depressingly early sunsets for those of us who don’t live near the equator. Anyways, I thought it’d be fun to make some ornaments with multi-color addressable LEDs last year, and I figured I’d write about what worked and what didn’t.
I didn’t have many microcontrollers at the time because I was visiting family for the holidays, so I ended up coding the lighting patterns for a cheap little
STM32F103 “black pill” board which was in the bottom of my backpack. And it’s a convenient coincidence that I just started learning about the very similar
GD32VF103 chips with their fancy
RISC-V CPUs and nearly-identical peripheral layout, so this also seems like a good opportunity to write about how to cross-compile the same code for two different CPU architectures.
This was a fun and festive project, and it might not be a bad way to introduce people to embedded development since there are so many ways to drive these ubiquitous “NeoPixel” LEDs. Sorry that this post is a little bit late for the winter holidays – I’ve been traveling for the past few months – but maybe it’ll get you thinking about next year 🙂
I’ll talk about how I assembled the stars and what I might do differently next time, then I’ll review how to light them up with an
STM32F103, and how to adapt that code for a
GD32VF103. But you could also use a MicroPython or Arduino board to set the LED colors if you don’t want to muck around with peripheral registers.
I’ve written a little bit in the past about how to design a basic STM32 breakout board, and how to write simple software that runs on these kinds of microcontrollers. But let’s be honest: there’s still a bit of a gap between creating a small breakout board to blink an LED, and building hardware / software for a ‘real-world’ application. Personally, I would still want a couple of more experienced engineers to double-check any designs that I wanted to be reliable enough for other people to use, but building more complex applications is a great way to help yourself learn.
So in this post, I’m going to walk through the process of designing a small ‘gameboy’-style handheld with a GPS receiver and microSD card slot, for exploring the outdoors instead of video games. Don’t get me wrong, you could still write games to run on this if you wanted to, and that would be fun, but everyone and their dog has made a Cortex-M-based handheld game console by now; there are plenty of better guides for that, and many of those authors put a lot more time into their designs and firmware than I ever did.
The board design isn’t too complicated, but there are several different parts and it gets easier to make small-but-important mistakes as a design gets larger. It mostly uses peripherals that I’ve talked about previously, but there are a couple of new ones too. The display will be driven over SPI, the speaker uses a DAC, the GPS receiver talks over UART, the battery and light levels will be read using an ADC, and the buttons will be listened to using interrupts. But I haven’t written about the USB or SD card (“MMC”) peripherals, and those will need to go in a future post since I haven’t actually worked them out myself yet. Note that SD cards can technically use either SPI or SD/MMC to communicate, but the microcontroller that I picked has a dedicated SD/MMC peripheral, and I wanted to learn about it.
Anyways, if that sounds interesting, read on and let’s get started!
If you choose to pursue embedded development beyond the occasional toy project, it probably won’t take long before you want to design something which runs off of battery power. Many types of devices would not be useful if they had to be plugged into a wall outlet all the time, and power efficiency is one of the biggest advantages that microcontrollers still have over application processors like the one in a Raspberry Pi.
When you do move an application to battery power, you’ll quickly discover that it is very important for your device to be able to A) charge its battery and B) alert you when its battery is running low. Not knowing whether something has hours or seconds of life left can be really annoying, and trying to use a nearly-dead battery can cause strange behavior, especially if the battery’s power drops off slowly as it dies. Most lithium-based batteries also last longer if you avoid fully discharging them – there’s some good information about lithium battery aging in this article.
So in this post, I’m going to go over a very basic circuit to power an STM32 board off of a single lithium-ion battery and monitor its state of charge. I will also talk briefly about how to add a simple battery charger to your design, but you should always independently verify any circuitry which interacts with lithium batteries! This circuit seems to work to the best of my knowledge, but don’t take my word for it; it’s very important to double- and triple-check your li-po battery circuits, because they can easily become serious fire hazards if they are handled improperly. It’s also good practice to avoid leaving lithium-ion batteries unattended while they are charging, and you should try to get batteries with built-in protection circuitry to help mitigate bad situations like over-current, under-voltage, etc.
So with those brief and not comprehensive safety warnings out of the way, let’s get started! I’ll use an
STM32L4 chip for this example, but the ADC peripheral doesn’t seem to change much across STM32s. And here is a GitHub repository containing design files for a simple board which demonstrates the concepts described in this post.
It has been about nine months since ST released their new
STM32G0 line of microcontrollers to ordinary people like us, and recently they released some new chips in the same linup. It sounds like ST wants this new line of chips to compete with smaller 8-bit micros such as Microchip’s venerable AVR cores, and for that market, their first round of
STM32G071xB chips might be too expensive and/or too difficult to assemble on circuit boards with low dimensional tolerances.
Previously, your best bet for an STM32 to run a low-cost / low-complexity application was probably one of the cheaper
STM32L0 chips, which are offered in 16- and 20-pin TSSOP packages with pins spaced 0.65mm apart. They work great, but they can be difficult to use for rapid prototyping. It’s hard to mill or etch your own circuit board with tight enough tolerances, and it’s not very easy to solder the chips by hand. Plus, the aging
STM32F031F6 still costs $0.80 each at quantities of more than 10,000 or so, and that’s pretty expensive for the ultra-cheap microcontroller market.
STM32G031J6: an STM32 chip which comes in a standard SOIC-8 package with 32KB Flash, 8KB RAM, a 64MHz speed limit, and a $0.60 bulk price tag (closer to $1.20-1.40 each if you’re only buying a few). That all compares favorably to small 8-pin AVR chips, and it looks like they might also use a bit less power at the same clock speeds. Power consumption is a tricky topic because it can vary a lot depending on things like how your application uses the chip’s peripherals or what voltage the chip runs off of. But the
STM32G0 series claims to use less than 100uA/MHz, and that is significantly less than the 300uA/MHz indicated in the ATTiny datasheets. Also, these are 32-bit chips, so they have a larger address space and they can process more data per instruction than an 8-bit chip can.
Considering how easy STM32 chips are to work with, it seems like a no-brainer, right? So let’s see how easy it is to get set up with one of these chips and blink an LED.
I’ve written a few basic tutorials about bare-metal STM32 development in the past, and even though I’m still learning as I write them, I think that there’s enough groundwork to start covering some ‘real world’ scenarios now. I’d like to start with a very important technique for designing efficient applications: the Direct Memory Access (DMA) peripheral. DMA is important because it lets you move data from one area of memory to another without using CPU time. After you start a DMA transfer, your program will continue to run normally while the data is moved around ‘in the background’.
That’s the basic idea, but the devil is always in the details. So in this post, we’re going to review how the three main types of STM32 DMA peripherals work. Different STM32 chips can have similar peripherals which behave slightly differently, and usually more expensive / newer chips have more fully-featured peripherals. I think that this is how the peripherals are grouped, but I didn’t test every type of STM32 chip and corrections are always appreciated:
- ‘Type 1’ Simple DMA:
- ‘Type 2’ Double-buffered DMA:
- ‘Type 3’ DMA + DMA multiplexer:
Once we’ve reviewed the basics of how DMA works, I’ll go over how to use it in a few example applications to show how it works with different peripherals and devices. The required hardware for each example will be discussed later, but I’ll present code to:
- Generate an audio tone by sending a sine wave to the DAC peripheral at a specific frequency.
- Map an array of colors to a strip of
- Map a small region of on-chip RAM to a monochrome
- Map a a region of RAM to an
The key to these examples is that the communication with an external device will happen ‘in the background’ while your microcontroller’s CPU is doing other things. Most of the examples won’t even use interrupts; the data transmission is automatic once you start it. But be aware that DMA is not magic. Every DMA ‘channel’ or ‘stream’ shares a single data bus which is also used by the CPU for memory transfers, so there is a limit to how much data you can actually send at once. In practice this probably won’t be a problem unless you have multiple high-priority / high-speed DMA transfers with tight timing requirements, but it’s something to be aware of.
So let’s get started!
Rust is a fairly new language that has gotten to be very popular in recent years. And as the language matures, it has started to support a wider set of features, including compilation and linking for bare-metal targets. There is an excellent “Embedded Rust” ebook being written which covers the concepts that I’ll talk about here, but it’s still in-progress and there aren’t many turn-key code examples after the first couple of chapters.
The Rust language is less than 10 years old and still evolving, so some features which might change in the future are only available on the
nightly branch at the time of writing; this post is written for
rustc version 1.36. And the language’s documentation is very good, but it can also be a little bit scattered in these early days. For example, after I had written most of this post I found a more comprehensive “Discovery ebook” which covers hardware examples for an
STM32F3 “Discovery Kit” board. That looks like a terrific resource if you want to learn how to use the bare-metal Rust libraries from someone who actually knows what they’re talking about.
As a new Rustacean, I’ll admit that the syntax feels little bit frustrating at times. But that’s normal when you learn a new language, and Rust is definitely growing on me as I learn more about its aspirations for embedded development. Cargo looks promising for distributing things like register definitions, HALs, and BSPs. And there’s an automated
svd2rust utility for generating your own register access libraries from vendor-supplied SVD files, which is useful in a language that hasn’t had time to build up an extensive set of well-proven libraries. So in this post I’ll talk about how to generate a “Peripheral Access Crate” for a simple
STM32L031K6 chip, and how to use that crate to blink an LED.
The target hardware will be an STM32L031K6 Nucleo-32 board, but this should work with any
STM32L0x1 chip. I also tried the same process with an
STM32F042 board and the
STM32F0x2 SVD file, which worked fine. It’s amazing how easy it is to get started with a new chip compared to C, although you do still need to read the reference manuals to figure out which registers and bits you need to modify. This post will assume that you know a little bit about how to use Rust, but only the very basics – I’m still new to the language and I don’t think I would do a good job of explaining anything. The free Rust ebook is an excellent introduction, if you need a quick introduction.
Microcontrollers are just like any other kind of semiconductor product. As manufacturers learn from customer feedback and fabrication processes continue to advance, the products get better. One of the most visible metrics for gauging a chip’s general performance – and the basis of Moore’s Law – is how large each transistor is. Usually this is measured in nanometers, and as we enter 2019 the newest chips being made by companies like Samsung and Intel are optimistically billed as 7nm.
The venerable and popular STM32F1 series is more than a decade old now, and it is produced using a 130nm process. But ST’s newer lines of chips like the STM32L4 use a smaller 90nm process. Smaller transistors usually mean that chips can run at lower voltages, be more power-efficient, and run at faster clock speeds. So when ST moved to this smaller process, they introduced two types of new chips: faster ‘mainline’ chips like the
F7 lines which run at about 100-250MHz, and more efficient ‘low-power’ chips like the
L4 lines which have a variety of ‘sleep’ modes and can comfortably run off of 1.8V. They also have an
H7 line which uses an even smaller 40nm process and can run at 400MHz.
Now as 2018 fades into history, it looks like ST has decided that it’s time for a fresh line of ‘value-line’ chips, and we can order a shiny new STM32G0 from retailers like Digikey. At the time of writing there aren’t too many options, but it looks like they’re hoping to branch out and there are even some 8-pin variants on the table. I could be misreading things, but these look like a mix between the
L0 lines, with lower power consumption than
F0 chips and better performance than the
L0 chips. The STM32G071GB that I made a test board with has 128KB of Flash, 36KB of RAM, and a nice set of communication peripherals.
So what’s the catch? Well, this is still a fairly new chip, so “Just Google It” may not be an effective problem-solving tool. And it looks like ST made a few changes in this new iteration of chips to provide more GPIO pins in smaller packages, so the hardware design will look similar but slightly different from previous STM32 lines. Finally, with a new chip comes new challenges in getting an open-source programming and debugging toolchain working. So with all of that said, let’s learn how to migrate!
I haven’t written about STM32 chips in a little while, but I have been learning how to make fun displays and signage using colorful LEDs, specifically the WS2812B and SK6812 ‘Neopixels’. I talked about the single-wire communication standard that these LEDs use in a post about running them from an FPGA, and I mentioned there that it is a bit more difficult for microcontrollers to run the communication standard. If you don’t believe me, take a look at what the poor folks at Adafruit needed to do to get it working on a 16MHz AVR core. Yikes!
When you send colors, the
1 bits are fairly easy to encode but the
0 bits require that you reliably hold a pin high for just 250-400 nanoseconds. Too short and the LED will think that your
0 bit was a blip of noise, too long and it will think that your
0 is a
1. Using timer peripherals is a reasonable solution, but it requires a faster clock than 16MHz and we won’t be able to use interrupts because it takes about 20-30 clock cycles for the STM32 to jump to an interrupt handler. At 72MHz it takes my code about 300-400 nanoseconds to get to an interrupt handler, and that’s just not fast enough.
There are ways to make it faster, but this is also a good example of how difficult it can be to calculate how long your C code will take to execute ahead of time. Between compiler optimizations and hardware realities like Flash wait-states and pushing/popping functions, the easiest way to tell how long your code takes to run is often to simply run it and check.
Which brings us to the topic of this tutorial – we are going to write a simple program which uses an STM32 timer peripheral to draw colors to ‘Neopixel’ LEDs. Along the way, we will debug the program’s timing using Sigrok and Pulseview with an affordable 8-channel digital logic analyzer. These are available for $10-20 from most online retailers; try Sparkfun, or Amazon/eBay/Aliexpress/etc. I don’t know why Adafruit doesn’t sell these; maybe they don’t want to carry cheap generics in the same category as Salae analyzers. Anyways, go ahead and install Pulseview, brush up a bit on STM32 timers if you need to, and let’s get started!
As you start to re-use components like sensors and displays, you might start to get frustrated with how long it takes to set up new projects. Copying and cleaning code between old working examples and new ideas can be time-consuming and tedious. It’s much easier to simply copy a few portable library files around. While there are plenty of existing libraries for these sorts of peripherals and external devices, it’s good to learn how to write your own, and this is also a good way to demonstrate a few ‘gotchas’ that you should be aware of when using C++ in an embedded application.
In this tutorial, we will walk through setting up a couple of object-oriented classes to represent a common communication ‘I/O’ peripheral model:
For simplicity’s sake, I’ll only cover a class for a bank of GPIO pins to demonstrate the core requirements for using C++ in an embedded application, but you can also find similar classes for the I2C peripheral and an SSD1306 OLED display in the example Github repository’s reference implementation of the concepts presented in this tutorial.
In a previous tutorial, we walked through the process of setting up a hardware interrupt to run a function when a button was pressed or a dial was turned. Most chips have a broad range of hardware interrupts, including ones associated with communication and timer peripherals. There is also a basic ‘SysTick’ timer included in most ARM Cortex-M cores for providing a consistent timing without needing to count CPU cycles.
One good use of that evenly-spaced timing is to run a Real-Time Operating System (often called an ‘RTOS’) to process several tasks in parallel. As you add more parts to your projects, it will become awkward to communicate with them all using a simple ‘while’ loop. And while hardware interrupts can help, it’s usually good to do as little as possible inside of an interrupt handler function.
So let’s use FreeRTOS, an MIT-licensed RTOS, to run a couple of tasks at the same time. To demonstrate the concept, we’ll run two ‘toggle LED’ tasks with different delay timings to blink an LED in an irregular pattern.
This example will also add support for some faster microcontrollers; the STM32F103C8 and STM32F303K8, which are Cortex-M3 and -M4F cores respectively. The F103C8 is available on cheap ‘blue pill‘ and ‘black pill‘ boards, and ST sells a ‘Nucleo’ board with the F303K8. Both of those chips can run at up to 72MHz, and they can also get more done per cycle since they have larger instruction sets. And as usual, there is example code demonstrating these concepts on Github.
As you learn about more of your microcontroller’s peripherals and start to work with more types of sensors and actuators, you will probably want to add small displays to your projects. Previously, I wrote about creating a simple program to draw data to an SSD1331 OLED display, but while they look great, the small size and low resolution can be limiting. Fortunately, the larger (and slightly cheaper) ILI9341 TFT display module uses a nearly-identical SPI communication protocol, so this tutorial will build on that previous post by going over how to draw to a 2.2″ ILI9341 module using the STM32’s hardware SPI peripheral.
We’ll cover the basic steps of setting up the required GPIO pins, initializing the SPI peripheral, starting the display, and then finally drawing pixel colors to it. This tutorial won’t read any data from the display, so we can use the hardware peripheral’s MISO pin for other purposes and leave the TFT’s MISO pin disconnected. And as with my previous STM32 posts, example code will be provided for both the STM32F031K6 and STM32L031K6 ‘Nucleo’ boards.