Talking to Hardware
I’ve written a few basic tutorials about bare-metal STM32 development in the past, and even though I’m still learning as I write them, I think that there’s enough groundwork to start covering some ‘real world’ scenarios now. I’d like to start with a very important technique for designing efficient applications: the Direct Memory Access (DMA) peripheral. DMA is important because it lets you move data from one area of memory to another without using CPU time. After you start a DMA transfer, your program will continue to run normally while the data is moved around ‘in the background’.
That’s the basic idea, but the devil is always in the details. So in this post, we’re going to review how the three main types of STM32 DMA peripherals work. Different STM32 chips can have similar peripherals which behave slightly differently, and usually more expensive / newer chips have more fully-featured peripherals. I think that this is how the peripherals are grouped, but I didn’t test every type of STM32 chip and corrections are always appreciated:
- ‘Type 1’ Simple DMA:
- ‘Type 2’ Double-buffered DMA:
- ‘Type 3’ DMA + DMA multiplexer:
Once we’ve reviewed the basics of how DMA works, I’ll go over how to use it in a few example applications to show how it works with different peripherals and devices. The required hardware for each example will be discussed later, but I’ll present code to:
- Generate an audio tone by sending a sine wave to the DAC peripheral at a specific frequency.
- Map an array of colors to a strip of
- Map a small region of on-chip RAM to a monochrome
- Map a region of RAM to an
The key to these examples is that the communication with an external device will happen ‘in the background’ while your microcontroller’s CPU is doing other things. Most of the examples won’t even use interrupts; the data transmission is automatic once you start it. But be aware that DMA is not magic. Every DMA ‘channel’ or ‘stream’ shares a single data bus which is also used by the CPU for memory transfers, so there is a limit to how much data you can actually send at once. In practice this probably won’t be a problem unless you have multiple high-priority / high-speed DMA transfers with tight timing requirements, but it’s something to be aware of.
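To make the audio tone example a little more concrete, here is a rough sketch of the timing arithmetic behind it. It assumes that a timer triggers one DMA transfer to the DAC per sample, and that one sine period is stored as a fixed number of samples. The function names, the 48MHz clock, and the 32-sample table in the usage note are illustrative assumptions, not values from a specific chip or from the example code.

```c
#include <stdint.h>

/* Timer ticks between DAC samples, rounded to the nearest tick.
 * Assumes one sample is sent per timer update; `tone_hz` and
 * `samples_per_cycle` must both be nonzero. */
uint32_t dac_timer_period(uint32_t timer_clock_hz,
                          uint32_t tone_hz,
                          uint32_t samples_per_cycle) {
  uint32_t sample_rate_hz = tone_hz * samples_per_cycle;
  return (timer_clock_hz + sample_rate_hz / 2u) / sample_rate_hz;
}

/* Tone that a given timer period actually produces, in millihertz.
 * The result differs slightly from the request because the period
 * has to be a whole number of timer ticks. */
uint32_t actual_tone_millihz(uint32_t timer_clock_hz,
                             uint32_t period_ticks,
                             uint32_t samples_per_cycle) {
  uint64_t ticks_per_cycle = (uint64_t)period_ticks * samples_per_cycle;
  return (uint32_t)(((uint64_t)timer_clock_hz * 1000u) / ticks_per_cycle);
}
```

With a hypothetical 48MHz timer clock and a 32-sample sine table, `dac_timer_period(48000000, 440, 32)` gives 3409 ticks, and `actual_tone_millihz(48000000, 3409, 32)` reports 440011, or roughly 440.01Hz.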
So let’s get started!
Whenever I talk to someone about FPGAs, the conversation seems to follow a familiar routine. It is almost a catechism to say that ‘FPGAs are very interesting niche products that, sadly, rarely make sense in real-world applications’. I often hear that organizations with Money can afford to develop ASICs, while hobbyists are usually better served by today’s affordable and powerful microcontrollers except in some very specific circumstances like emulating old CPU architectures. I don’t have enough experience to know how accurate this is, but I do have a couple of projects that seem like they could benefit from an FPGA, so I decided to bite the bullet and learn the basics of how to use one.
I chose a popular $25 development board called the ‘Icestick’ to start with. It uses one of Lattice’s iCE40 chips, which is nice because there is an open-source toolchain called Icestorm available for building Verilog or VHDL code into an iCE40 bitstream. Most FPGA vendors (including Lattice) don’t provide a toolchain that you can build from source, but thanks to the hard work of Clifford Wolf and the other Icestorm contributors, I can’t use “maddeningly proprietary tools” as a reason not to learn about this anymore.
One thing that FPGAs can do much better than microcontrollers is running a lot of similar state machines in parallel. I’d eventually like to make a ‘video wall’ project using individually-addressable LEDs, but the common ‘Neopixel’ variants share a maximum data rate of about 800kbps. That’s probably too slow to send video to a display one pixel at a time, but it might be fast enough to send a few hundred ‘blocks’ of pixel data in parallel. As a small step towards that goal, I decided to try lighting up a single strip of WS2812B or SK6812 LEDs using Verilog. Here, I will try to describe what I learned.
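The data rate argument can be checked with some quick arithmetic. This is only a sketch: it assumes 24 bits per LED (the RGB-only parts; RGBW variants need more), treats 800kbps as exact, and takes the idle 'latch' time at the end of a frame as a parameter because it varies between parts. The function names are made up for illustration.

```c
#include <stdint.h>

#define BIT_RATE_HZ  800000u /* approximate data rate quoted above */
#define BITS_PER_LED 24u     /* assumed: 8 bits each of green, red, blue */

/* Microseconds needed to shift one frame out to a strip of `leds` pixels,
 * plus `latch_us` of idle time at the end of the frame. The latch time
 * is an assumption here; check the datasheet for the part in use. */
uint32_t strip_frame_us(uint32_t leds, uint32_t latch_us) {
  uint64_t bits = (uint64_t)leds * BITS_PER_LED;
  return (uint32_t)(bits * 1000000u / BIT_RATE_HZ) + latch_us;
}

/* Largest strip that can still be refreshed `fps` times per second
 * (`fps` must be nonzero). Returns 0 if the latch time alone is too long. */
uint32_t max_leds_at_fps(uint32_t fps, uint32_t latch_us) {
  uint32_t frame_us = 1000000u / fps;
  if (frame_us <= latch_us) {
    return 0u;
  }
  uint64_t bits = (uint64_t)(frame_us - latch_us) * BIT_RATE_HZ / 1000000u;
  return (uint32_t)(bits / BITS_PER_LED);
}
```

With an assumed 50 microsecond latch, `strip_frame_us(4096, 50)` comes out to 122930 microseconds, so a single chain of 64x64 pixels would manage only about 8 frames per second. `max_leds_at_fps(30, 50)` gives 1109, which suggests that splitting the wall into blocks of a few hundred pixels leaves some headroom at video frame rates.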
And while this post will walk through a working design, I’m sorry that it will not be a great tutorial on writing Verilog or VHDL; I will try to gloss over what I don’t understand, so I would encourage you to read a more comprehensive tutorial on the subject like Al Williams’ series of Verilog and Icestorm tutorials on Hackaday. Sorry about that, but I’m still learning and I don’t want to present misleading information. This tutorial’s code is available on Github as usual, but caveat emptor.
In previous tutorials, I covered how to use the STM32 line of microcontrollers to draw to small displays using the SPI communication standard. First with software functions and small ‘SSD1331’ OLED displays, and then with the faster SPI hardware peripheral and slightly larger ‘ILI9341’ TFT LCD displays. Both of those displays are great for cheaply displaying data or multimedia content, because they can show 16 bits of color per pixel and have enough space to present a moderate amount of information. But if you want to design a very low-power application, you might want a display which does not need to constantly drain energy to maintain an image.
Enter ‘E-Ink’ displays, sometimes called “Electrophoretic Displays”. As the name implies, they use the same basic operating principle as techniques like Gel Electrophoresis, which separates polarized molecules such as DNA based on their electric charge. Each pixel in one of these displays is a tiny hollow sphere filled with oppositely-charged ink molecules, which are pulled towards the top or bottom of the capsule to make the pixel light or dark. The ink remains in place even after power is removed; I think that it is suspended in a solid gel or something. Modern E-Ink modules sometimes have a third color such as red or yellow, but this post will only cover a humble monochrome display.
In a previous post, I wrote about designing a ‘breakout board’ for an SSD1331 OLED display with 96×64 pixels and 16 bits of color per pixel. With the hardware already put together, this post will cover writing a basic software driver for the displays. To keep things simple, we will talk to the display using software SPI functions instead of the STM32’s SPI hardware peripheral.
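For anyone unfamiliar with the idea, 'software SPI' just means toggling the clock and data pins by hand. Here is a minimal sketch of sending one byte, assuming the most significant bit goes first and the receiver samples the data line on the rising clock edge. The `sck_write` and `mosi_write` helpers are hypothetical stand-ins: on real hardware they would write GPIO registers, but here they model a receiver so that the logic can be checked on a PC.

```c
#include <stdint.h>

/* Simulated pin state (hypothetical names, not real registers). */
static int     sck_level  = 0;
static int     mosi_level = 0;
static uint8_t received   = 0; /* last 8 bits seen by the modelled receiver */

/* Stand-in for driving the data pin. */
static void mosi_write(int level) {
  mosi_level = level;
}

/* Stand-in for driving the clock pin. The modelled receiver shifts in
 * the data line's value on every low-to-high clock transition. */
static void sck_write(int level) {
  if (level && !sck_level) {
    received = (uint8_t)((received << 1) | (mosi_level & 1));
  }
  sck_level = level;
}

/* Shift one byte out, most significant bit first. The data pin is set
 * while the clock is low, then the clock is pulsed high and low again. */
void soft_spi_write_byte(uint8_t byte) {
  for (int bit = 7; bit >= 0; --bit) {
    mosi_write((byte >> bit) & 1);
    sck_write(1);
    sck_write(0);
  }
}
```

After calling `soft_spi_write_byte(0xA5)`, the modelled receiver's `received` variable holds 0xA5. A real driver would also need to handle the chip select and data/command pins, which this sketch leaves out.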
If you want to skip assembling your own boards, you can also buy a pre-made display such as this one sold by Adafruit. They have also written a library for these displays which works with several common types of microcontrollers, if you just want to use them without worrying about the display settings. But if you want to try understanding this sort of communication at a lower level, read on!
Since many small microcontrollers – including the STM32F031K6 discussed in this example – don’t have 12KB of RAM available to store a 96×64 display at 16 bits per pixel, I’ll use a framebuffer with just 4 bits per pixel in this example (3KB), and map those 16 values to a palette. This example builds on the first few “Bare Metal STM32 Programming” tutorials that I’ve been writing, so here is a Github repository with the entire example project (including supporting files) if you don’t want to read those.
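As a rough illustration of the 4 bits per pixel idea, the sketch below packs two pixels into each byte and looks colors up in a 16-entry palette. The nibble order (even pixels in the high nibble), the function names, and the palette values (assumed to be in RGB565 format) are all placeholders chosen for this sketch rather than the layout used in the example project.

```c
#include <stdint.h>

#define DISPLAY_W 96u
#define DISPLAY_H 64u
/* Two 4-bit pixels per byte: 96 * 64 / 2 = 3072 bytes (3KB). */
#define FB_BYTES  (DISPLAY_W * DISPLAY_H / 2u)

static uint8_t framebuffer[FB_BYTES];

/* 16-entry palette of 16-bit colors; placeholder values, assumed RGB565. */
static const uint16_t palette[16] = {
  0x0000, 0xFFFF, 0xF800, 0x07E0, 0x001F, 0xFFE0, 0x07FF, 0xF81F,
  0x8410, 0xC618, 0x8000, 0x0400, 0x0010, 0x8400, 0x0410, 0x8010
};

/* Store a palette index (0-15) for the pixel at (x, y).
 * No bounds checking: x must be below 96 and y below 64. */
void fb_set_pixel(uint32_t x, uint32_t y, uint8_t index) {
  uint32_t pixel = y * DISPLAY_W + x;
  uint8_t *cell = &framebuffer[pixel / 2u];
  if (pixel & 1u) {
    /* Odd pixels live in the low nibble. */
    *cell = (uint8_t)((*cell & 0xF0u) | (index & 0x0Fu));
  } else {
    /* Even pixels live in the high nibble. */
    *cell = (uint8_t)((*cell & 0x0Fu) | ((index & 0x0Fu) << 4));
  }
}

/* Look up the 16-bit color that would be sent to the display for (x, y). */
uint16_t fb_get_color(uint32_t x, uint32_t y) {
  uint32_t pixel = y * DISPLAY_W + x;
  uint8_t cell = framebuffer[pixel / 2u];
  uint8_t index = (pixel & 1u) ? (uint8_t)(cell & 0x0Fu) : (uint8_t)(cell >> 4);
  return palette[index];
}
```

For example, after `fb_set_pixel(0, 0, 2)` and `fb_set_pixel(1, 0, 4)`, the first byte of the framebuffer holds 0x24, and `fb_get_color(0, 0)` returns the palette's third entry. The full-color values only exist while each pixel is being sent to the display, which is how the buffer stays at 3KB instead of 12KB.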