I’ve written a few basic tutorials about bare-metal STM32 development in the past, and even though I’m still learning as I write them, I think that there’s enough groundwork to start covering some ‘real world’ scenarios now. I’d like to start with a very important technique for designing efficient applications: the Direct Memory Access (DMA) peripheral. DMA is important because it lets you move data from one area of memory to another without using CPU time. After you start a DMA transfer, your program will continue to run normally while the data is moved around ‘in the background’.
That’s the basic idea, but the devil is always in the details. So in this post, we’re going to review how the three main types of STM32 DMA peripherals work. Different STM32 chips can have similar peripherals which behave slightly differently, and usually more expensive / newer chips have more fully-featured peripherals. I think that this is how the peripherals are grouped, but I didn’t test every type of STM32 chip and corrections are always appreciated:
- ‘Type 1’ Simple DMA:
- ‘Type 2’ Double-buffered DMA:
- ‘Type 3’ DMA + DMA multiplexer:
Once we’ve reviewed the basics of how DMA works, I’ll go over how to use it in a few example applications to show how it works with different peripherals and devices. The required hardware for each example will be discussed later, but I’ll present code to:
- Generate an audio tone by sending a sine wave to the DAC peripheral at a specific frequency.
- Map an array of colors to a strip of
- Map a small region of on-chip RAM to a monochrome
- Map a a region of RAM to an
The key to these examples is that the communication with an external device will happen ‘in the background’ while your microcontroller’s CPU is doing other things. Most of the examples won’t even use interrupts; the data transmission is automatic once you start it. But be aware that DMA is not magic. Every DMA ‘channel’ or ‘stream’ shares a single data bus which is also used by the CPU for memory transfers, so there is a limit to how much data you can actually send at once. In practice this probably won’t be a problem unless you have multiple high-priority / high-speed DMA transfers with tight timing requirements, but it’s something to be aware of.
So let’s get started!
Recently, I wrote about trying to figure out a way to automatically produce visualizations of STM32 peripheral mappings from their datasheets. Unfortunately, it didn’t go too well; I had trouble parsing the output from the command-line
pdftotext utility, so I ended up having to manually clean up each peripheral table after it was half-parsed by a script.
I was thinking of trying to write a better parsing script, but before diving into that rabbit hole I took another look at open-source PDF-parsing programs and found Tabula. It is an MIT-licensed utility with one goal: extracting tables from PDF files. And it seems to work very well with ST’s datasheets.
Step 1: Download Tabula
Tabula runs on Java, so it’s simple to set up on just about any platform. They have good installation instructions on their website and GitHub readme file. If you are using Linux, you can download and unzip the
tabula-jar-<version>.zip version of the latest release to get a runnable JAR file.
Following the instructions in the
README.md file, once you unzip the file and
cd into its
tabula/ directory, you can start running a local server with the command that the readme file suggests:
java -Dfile.encoding=utf-8 -Xms256M -Xmx1024M -jar tabula.jar
Once it starts up, you should be able to navigate to http://127.0.0.1:8080 in a browser to access the Tabula UI. There is also a command-line version of the project, which will probably be better for setting up an automatic process. But for now, it’s nice to use the visual selection tools which the UI provides.
When you want to pick a microcontroller for a project, it can be useful to have a bunch of quick references to find the smallest/cheapest one that will fit the parts you want to use. Usually the information is freely available in datasheets, but those are usually PDF files with pinout/peripheral information spanning several pages.
So I’ve been playing with ways to partially automate the process of generating some nice reference images. I’m not exactly happy with what I have yet, but it’s not the sort of thing that merits spending a ton of time on, and I think I scripted away enough of the tedium that it isn’t too frustrating to make these sorts of tables. You wind up with reference images like this:
Honestly, this post would be sort of moot if I could find nice CSV files or spreadsheets of STM32 pinouts and peripherals (see below). Maybe other vendors distribute information in that format, but I couldn’t find many for ST’s chips. And that means that I ended up with a sort of convoluted process.
A fragile Python script runs the
pdftotext utility to convert an STM32 datasheet into a pinout table containing peripheral information. The automatic formatting doesn’t include table cell boundaries, so I couldn’t figure out how to put all of the peripherals associated with a pin on the same line quickly; someone needs to manually perform that step before running a second Python script which generates LaTeX files representing formatted peripheral tables. The PDF files generated by
pdflatex can then be imported into Inkscape, where the table columns can be connected to the pins on an image of the chip’s outline. So, let’s get started!
Note: I ended up finding a better way to parse STM32 datasheets using a project called Tabula – see this newer post for more information. My first attempt presented in this post ended up working pretty poorly, but that’s life for you. Sometimes it takes a few tries to find a good solution.
The ESP32 modules sold by Espressif are very popular in the IoT and embedded development space. They are very cheap, they are quite fast, they include radios and peripherals for WiFi and Bluetooth communication, and in some ways they even appear to bridge the gap between MCU and CPU. And Espressif provides pre-built modules with built-in antennas and external Flash memory, both of which appear to be required for general-purpose ‘IoT’ application development. They can be a bit power-hungry when they are using their wireless communication modules though, and I haven’t found much information on how to develop ESP32 applications without using the heavyweight (but very functional and well-written) “ESP-IDF” toolchain which is distributed by Espressif.
Usually, avoiding bulky and proprietary HALs is a worthwhile goal in and of itself. But Espressif has actually released their ESP-IDF toolchain under a very permissive Apache license, and it looks like a well-thought-out system with solid ongoing support. So if you are looking at starting a new project with the ESP32, I would personally recommend using the ESP-IDF to save time and effort. But sometimes it is nice to learn about how chips work at a deeper level, and ESP-IDF projects are often quite large, and they can take a long time to build depending on your environment.
The large code size also discourages what appears to be one use case that the chip was designed for: to load new instructions into RAM every time that it reboots from an external ‘socket’. The current crop of ESP32 modules use a SPI Flash chip as that ‘socket’, but if you put them in a factory or a field you might want to use Ethernet, or RS-232, or who knows what. I’m not sure how extensible the chip’s ROM bootloader actually is yet, but let’s take a look at what it takes to get a simple C program running on the ESP32 without using the ESP-IDF build system.
Unlike the STM32 and MSP430 microcontrollers which I have written about previously, there are not many software tools available for the ESP32 core. The ESP32’s dual-core architecture uses two ‘Xtensa LX6’ CPU cores which Espressif licenses from Cadence, and I haven’t seen them in any other mainstream microcontrollers. It looks like a core that is intended to be customized for the needs of an application as a step between general-purpose microcontrollers and something like an ASIC, so maybe it is more common in application-specific environments than general-purpose ones. In this case, it looks like the specific application which Espressif chose is wireless communication, and apparently a lot of the WiFi and Bluetooth code is burned directly into the ESP32’s ROM.
The ESP32 also looks more like a proper CPU than many microcontroller cores, with a few hundred kilobytes of on-chip RAM, a 240MHz top speed, an MMU, and support for up to 8 process IDs (2 privileged / 6 unprivileged) per core. People used to make do with much less, but since the ESP32 is complex and somewhat unique, Espressif provides the only toolchain that I know about which can build code for it. That means that while this tutorial will not use the full ESP-IDF development environment, it will still use Espressif’s ports of GCC and OpenOCD for compilation and debugging, as well as their
esptool utility for formatting and flashing the compiled code. The target hardware will be either the
ESP32-WROVER-KIT board which includes a JTAG debugging chip, or any of the smaller generic ESP32 dev boards (such as the
ESP32-DevKitC) combined with an FTDI
And like most of my previous tutorials, the software presented here is all open-source and you should be able to build and run it on the platform of your choice. It won’t have any colorful LEDs this time – sorry – but the code is available on GitHub. I’m also still learning about this chip and there is a lot that I don’t know, so corrections and comments are definitely appreciated. So if you’re still interested after those disclaimers, let’s get started by building and installing the toolchain!
The GCC toolchain is a nice way to get started with bare-metal ARM platforms because it is free, well-supported, and something that many C programmers are already familiar with. It is also constantly improving, and while most Linux distributions include a version of the
arm-none-eabi GCC toolchain in their default package managers, those versions are sometimes old enough that I run into bugs with things like multilib/multiarch support.
So after trying and failing a few times to build and install this toolchain, I thought I would share what eventually worked for me. If you are building this on a system without much free disk space, this did take about 16GB of disk space when I ran it – but you shouldn’t need to keep more than ~500MB of that. I ended up building it on an ext3-formatted 32GB USB drive, and that seemed to work fine. You will probably run into errors if you try to run this on a FAT-formatted drive like a new USB stick, though, because the build process will run
chmod and that will fail on a filesystem that doesn’t support file permissions. You might be able to make it work with an NTFS drive, but I digress.
Anyways, the first step is downloading the code, so let’s get started!
Step 1: Download the Source Code
The first step is to download the
arm-none-eabi-gcc source code from ARM’s website – you want the “Source Invariant” version. If you are using a common architecture such as
x86_64, you can also download a pre-built version of the latest release, but you probably want to build the thing yourself if you are reading this guide. If you want to make sure that the file you downloaded was not corrupted or tampered with, you can check it against the MD5 checksum listed under the file’s name on the download page:
> md5sum gcc-arm-none-eabi-8-2018-q4-major-src.tar.bz2 d6071d95064819d546fe06c49fb9d481 gcc-arm-none-eabi-8-2018-q4-major-src.tar.bz2
That looks like it matches as of the time of writing. The archive contains build scripts and code for the whole toolchain, so create a new directory to build the toolchain in – I’ll use
~/arm_gcc/ as an example. Once it finishes downloading, move the archive to that folder and extract it:
> mkdir ~/arm_gcc > cd ~/arm_gcc > mv [Your_Downloads_Folder]/gcc-arm-none-eabi-8-2018-q4-major-src.tar.bz2 . > tar -xvf gcc-arm-none-eabi-8-2018-q4-major-src.tar.bz2
The file that I downloaded was called
gcc-arm-none-eabi-8-2018-q4-major-src.tar.bz2, but yours will depend on where you are in time. The extracted archive will make a new directory with a similar name – take a look inside, and you’ll see a few scripts and a ‘how-to’ document alongside the source code:
> cd gcc-arm-none-eabi-8-2018-q4-major/ > ls build-common.sh build-toolchain.sh install-sources.sh python-config.sh release.txt build-prerequisites.sh How-to-build-toolchain.pdf license.txt readme.txt src
Whenever I talk to someone about FPGAs, the conversation seems to follow a familiar routine. It is almost a catechism to say that ‘FPGAs are very interesting niche products that, sadly, rarely make sense in real-world applications’. I often hear that organizations with Money can afford to develop ASICs, while hobbyists are usually better served by today’s affordable and powerful microcontrollers except in some very specific circumstances like emulating old CPU architectures. I don’t have enough experience to know how accurate this is, but I do have a couple of projects that seem like they could benefit from an FPGA, so I decided to bite the bullet and learn the basics of how to use one.
I chose a popular $25 development board called the ‘Icestick‘ to start with. It uses one of Lattice’s iCE40 chips, which is nice because there is an open-source toolchain called Icestorm available for building Verilog or VHDL code into an iCE40 bitstream. Most FPGA vendors (including Lattice) don’t provide a toolchain that you can build from source, but thanks to the hard work of Clifford Wolf and the other Icestorm contributors, I can’t use “maddeningly proprietary tools” as a reason not to learn about this anymore.
One thing that FPGAs can do much better than microcontrollers is running a lot of similar state machines in parallel. I’d eventually like to make a ‘video wall’ project using individually-addressable LEDs, but the common ‘Neopixel’ variants share a maximum data rate of about 800kbps. That’s probably too slow to send video to a display one pixel at a time, but it might be fast enough to send a few hundred ‘blocks’ of pixel data in parallel. As a small step towards that goal, I decided to try lighting up a single strip of WS2812B or SK6812 LEDs using Verilog. Here, I will try to describe what I learned.
And while this post will walk through a working design, I’m sorry that it will not be a great tutorial on writing Verilog or VHDL; I will try to gloss over what I don’t understand, so I would encourage you to read a more comprehensive tutorial on the subject like Al Williams’ series of Verilog and Icestorm tutorials on Hackaday. Sorry about that, but I’m still learning and I don’t want to present misleading information. This tutorial’s code is available on Github as usual, but caveat emptor.
It’s good to know what time it is, and once you work out how to use a RealTime Clock module, making a clock display seems like a natural next step. 7-segment LED displays used to be the go-to way to display time digitally before LCD and OLED screens got so cheap, and I thought it would be fun to make a more modern take on them with diffused multicolor LEDs for the actual lighting. I would like to write a post about drawing to traditional 7-segment digits using shift registers, but this project uses the easy-to-wire ‘Neopixels’ that we all know and love.
You’ll need a few materials for this project if you want to follow along:
- Access to a laser cutter and a 3D printer.
- About 6 x 12″ of 1/8″-thick ‘frosted’ acrylic, to diffuse the LEDs. (About 150 x 300mm, 3mm thick)
- Enough plywood to make a clock face for the front and back; I wound up using about 500 x 200mm each.
- 58 individually-addressable RGB LEDs. (28 sets of 2 for each segment of the 4 digits, and 2 more for the colon dots.)
- A DS3231 or similar realtime clock module.
- A microcontroller to run everything; I used an ATMega328 on an “Arduino” nano board to keep the code simple.
- Craft supplies: solder, soldering iron, glue, hot glue gun, wire, etc.
For the LEDs, I cut lengths off of a reel of LED strip that I ordered off Amazon; I think it used SK6812 LEDs and had 30 LEDs / meter. For the two dots of the colon, you can also buy small single-LED boards that are about 10mm in diameter and usually have WS2812B LEDs. I didn’t run into any problems stringing SK6812 and WS2812B LEDs together with Adafruit’s “Neopixel” library, but caveat emptor.