Blog for my various projects, experiments, and learnings

Getting Started with Bare Metal ESP32 Programming

The ESP32 modules sold by Espressif are very popular in the IoT and embedded development space. They are very cheap, they are quite fast, they include radios and peripherals for WiFi and Bluetooth communication, and in some ways they even appear to bridge the gap between MCU and CPU. And Espressif provides pre-built modules with built-in antennas and external Flash memory, both of which appear to be required for general-purpose ‘IoT’ application development. They can be a bit power-hungry when they are using their wireless communication modules though, and I haven’t found much information on how to develop ESP32 applications without using the heavyweight (but very functional and well-written) “ESP-IDF” toolchain which is distributed by Espressif.

Usually, avoiding bulky and proprietary HALs is a worthwhile goal in and of itself. But Espressif has actually released their ESP-IDF toolchain under a very permissive Apache license, and it looks like a well-thought-out system with solid ongoing support. So if you are looking at starting a new project with the ESP32, I would personally recommend using the ESP-IDF to save time and effort. But sometimes it is nice to learn about how chips work at a deeper level, and ESP-IDF projects are often quite large, and they can take a long time to build depending on your environment.

The large code size also discourages what appears to be one use case that the chip was designed for: to load new instructions into RAM from an external ‘socket’ every time that it reboots. The current crop of ESP32 modules use a SPI Flash chip as that ‘socket’, but if you put them in a factory or a field you might want to use Ethernet, or RS-232, or who knows what. I’m not sure how extensible the chip’s ROM bootloader actually is yet, but let’s take a look at what it takes to get a simple C program running on the ESP32 without using the ESP-IDF build system.

Unlike the STM32 and MSP430 microcontrollers which I have written about previously, there are not many software tools available for the ESP32 core. The ESP32’s dual-core architecture uses two ‘Xtensa LX6’ CPU cores which Espressif licenses from Cadence, and I haven’t seen them in any other mainstream microcontrollers. It looks like a core that is intended to be customized for the needs of an application as a step between general-purpose microcontrollers and something like an ASIC, so maybe it is more common in application-specific environments than general-purpose ones. In this case, it looks like the specific application which Espressif chose is wireless communication, and apparently a lot of the WiFi and Bluetooth code is burned directly into the ESP32’s ROM.

The ESP32 also looks more like a proper CPU than many microcontroller cores, with a few hundred kilobytes of on-chip RAM, a 240MHz top speed, an MMU, and support for up to 8 process IDs (2 privileged / 6 unprivileged) per core. People used to make do with much less, but since the ESP32 is complex and somewhat unique, Espressif provides the only toolchain that I know about which can build code for it. That means that while this tutorial will not use the full ESP-IDF development environment, it will still use Espressif’s ports of GCC and OpenOCD for compilation and debugging, as well as their esptool utility for formatting and flashing the compiled code. The target hardware will be either the ESP32-WROVER-KIT board which includes a JTAG debugging chip, or any of the smaller generic ESP32 dev boards (such as the ESP32-DevKitC) combined with an FTDI C232HM cable.

And like most of my previous tutorials, the software presented here is all open-source and you should be able to build and run it on the platform of your choice. It won’t have any colorful LEDs this time – sorry – but the code is available on GitHub. I’m also still learning about this chip and there is a lot that I don’t know, so corrections and comments are definitely appreciated. So if you’re still interested after those disclaimers, let’s get started by building and installing the toolchain!

ESP32 Toolchain Setup

Espressif provides good documentation for installing their toolchain, as well as for building it from source. I am impressed with their documentation, and the source build was fairly painless on an ARM architecture. So either follow the ‘Setup Toolchain’ steps for a pre-built Windows/Mac/Linux toolchain, or follow the steps to build it from source. Once it is installed, you should be able to check which version you have with:

> xtensa-esp32-elf-gcc --version
xtensa-esp32-elf-gcc (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a5) 5.2.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Your toolchain’s version will probably differ depending on when you are. Next, follow the steps to install the ESP-IDF package. We won’t be using its build system, but we will be looking at its ‘hello world’ example, using some utilities that it ships with, and inspecting its bootloader code to figure out how the system reaches a main method. Standing on the shoulders of giants…

I put the ESP-IDF files under ~/esp-idf/, but they can go anywhere; I’ll use $(IDF_PATH)/... to refer to files in the ESP-IDF source code.

Inspecting ESP-IDF

Our first question should be, ‘what happens when the system boots?’ This is sort of explained in section 3 of the ESP32 Technical Reference Manual, but it’s not very clear about a lot of the specifics. What memory address does the system boot to? How do I map my code to that address if the only re-writable nonvolatile memory is on an external SPI Flash chip? If some DRAM and IRAM memory spaces refer to the same physical memory, what do I put in the linker script? And so on.

Fortunately, the ESP-IDF has an Apache-licensed bootloader project, which we can look at to see what gets things started. The project is located under $(IDF_PATH)/components/bootloader/, and there is some supporting logic under $(IDF_PATH)/components/bootloader_support/. The actual bootloader only has one bootloader_start.c source file a few directories down, and you can see in that file that a function called call_start_cpu0 is the entry point of the whole program.

It sounds like the ESP32 has a ROM bootloader which cannot be overwritten, and that “first-stage” bootloader drops the PC into a preset memory address in the chip’s Instruction RAM. I’m not very clear on this, but I guess that the ROM bootloader also fetches a pre-set area of the external Flash memory to place into that RAM space. I’m not quite sure how it decides what that address is yet, but the documentation suggests that it is a setting in the Flash MMU peripheral (see the ‘IROM’ section). Whatever the case, the default ‘single-app’ projects place the application at 0x10000 and the bootloader at 0x1000.

So to get a ‘hello world’ C program going, we need to link our code such that it maps to the right memory space in IRAM, and then we need to upload that code to where the ESP32 expects its ‘second-stage’ bootloader to be on the SPI Flash chip. Finally, we will verify that everything worked by stepping through the program in GDB.

When you have the luxury of working source code, one easy way to look at memory addresses is to compile and inspect an ELF file, and in our case that means building a basic ESP-IDF project. The framework ships with a few examples, and the documentation goes over how to configure and build one, so go ahead and follow the steps to copy the ESP-IDF ‘hello world’ project and configure it. Once you get to the ‘Build and Flash’ step, you can stop and just run make. If the build seems slow, try make -j4 to use 4 threads.

Once the project finishes building, it will leave three .bin files in the build/ directory, and two .elf files. There is the bootloader which gets written to 0x1000 in the SPI Flash, a partition table at 0x8000 which tells the bootloader which images are available to boot, and since the default setting is to only build one application image, the ‘hello world’ application image gets written to 0x10000.
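Those three offsets are worth keeping in mind later, because a standalone image written to 0x1000 has to end before the partition table at 0x8000. Here is a minimal host-side sketch of that check in C. The helper names (image_fits, flash_region_of) are hypothetical and only use the three offsets quoted above; the real partition layout is whatever the partition table says it is.

```c
#include <assert.h>
#include <stdint.h>

/* Flash offsets quoted above for the default 'single-app' layout. */
#define FLASH_BOOTLOADER_OFFSET      0x1000u
#define FLASH_PARTITION_TABLE_OFFSET 0x8000u
#define FLASH_APP_OFFSET             0x10000u

/* Hypothetical region names, judged only by the start offsets above. */
enum flash_region {
  REGION_BELOW_BOOTLOADER,
  REGION_BOOTLOADER,
  REGION_PARTITION_TABLE,
  REGION_APPLICATION
};

/* Returns 1 if an image of `size` bytes written at `offset` ends at or
 * before `next_offset`, so it does not spill into the next region. */
int image_fits(uint32_t offset, uint32_t size, uint32_t next_offset) {
  assert(offset <= next_offset);
  return size <= (next_offset - offset);
}

/* Name the region that a flash offset falls into. */
enum flash_region flash_region_of(uint32_t offset) {
  if (offset >= FLASH_APP_OFFSET)             return REGION_APPLICATION;
  if (offset >= FLASH_PARTITION_TABLE_OFFSET) return REGION_PARTITION_TABLE;
  if (offset >= FLASH_BOOTLOADER_OFFSET)      return REGION_BOOTLOADER;
  return REGION_BELOW_BOOTLOADER;
}
```

For example, image_fits(FLASH_BOOTLOADER_OFFSET, size_of_bin, FLASH_PARTITION_TABLE_OFFSET) is true for any image up to 0x7000 bytes long.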

To find out where the chip boots to, let’s inspect the bootloader using an nm command: xtensa-esp32-elf-nm build/bootloader/bootloader.elf. It will spit out a list of where various parts of the program are placed in memory, like this:

...
400794c4 T bootloader_utility_get_selected_boot_partition
400795dc T bootloader_utility_load_boot_image
4007935c T bootloader_utility_load_partition_table
3fff0018 A _bss_end
3fff0000 A _bss_start
40078658 T __bswapsi2
400095e0 A cache_flash_mmu_set_rom
40009a14 A Cache_Flush_rom
40009ab8 A Cache_Read_Disable_rom
40009a84 A Cache_Read_Enable_rom
40080764 T call_start_cpu0
         U call_user_start_cpu0
4005cfec A crc32_le
3ff96350 A __ctype_ptr__
3fff0018 d current_read_mapping
3fff001c A _data_end
3fff0018 A _data_start
40079b64 t debug_log_hash
...

So we can see that the entry call, call_start_cpu0, is located a bit after 0x40080000 and the bootloader expects to call a call_user_start_cpu0 function which is defined elsewhere as the entry point to the user’s application. You can also look in the linker script under $(IDF_PATH)/components/bootloader/subproject/main/esp32.bootloader.ld to see how the bootloader’s memory is laid out and why; it looks like the ‘program text’ section starts at 0x40080400 to leave room for a vector table, but a word of warning: I think that the esptool formatting which we will talk about later might also relocate some of these memory segments.
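If you want to sift through that nm listing with a program instead of by eye, each line is simple enough to parse with sscanf. This is a rough sketch that runs on the host, not on the ESP32. It assumes the three-column layout shown above (address, type letter, name) and treats a line with no address and a 'U' type as an undefined symbol; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct nm_symbol {
  int      defined;    /* 0 for undefined ('U') symbols, which have no address */
  uint32_t address;    /* only meaningful when defined == 1 */
  char     type;       /* nm type letter, e.g. 'T', 'A', 'd', 'U' */
  char     name[128];
};

/* Parse one line of nm output. Returns 1 on success, 0 if the line
 * does not look like a symbol entry (for example the "..." markers). */
int parse_nm_line(const char *line, struct nm_symbol *out) {
  unsigned int addr = 0;
  assert(line != NULL && out != NULL);

  /* Defined symbol: "<hex address> <type> <name>" */
  if (sscanf(line, "%x %c %127s", &addr, &out->type, out->name) == 3) {
    out->defined = 1;
    out->address = (uint32_t)addr;
    return 1;
  }
  /* Undefined symbol: "<spaces> U <name>" */
  if (sscanf(line, " %c %127s", &out->type, out->name) == 2 &&
      out->type == 'U') {
    out->defined = 0;
    out->address = 0;
    return 1;
  }
  return 0;
}
```

Feeding it "40080764 T call_start_cpu0" yields the address 0x40080764 with type 'T', while the call_user_start_cpu0 line comes back with defined set to 0.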

Once the project builds, run make flash monitor and watch as the demo runs and prints to the board’s Serial output. That should verify that your toolchain is working, but the simple test program that I’m going to go over won’t have GPIO or UART outputs – remember when I said that you should use the ESP-IDF for any real project? So let’s also go over how to step through an ESP32 program using GDB.

OpenOCD and GDB

Before we go any further, you’ll need a way to debug the project – our example is only going to increment a variable forever, so we’ll need to be able to check what that value is to make sure that it works. So go ahead and install Espressif’s port of OpenOCD – they have instructions available here. You’ll also need to connect an appropriate JTAG interface to the ESP32 in order to use OpenOCD; like I mentioned in the first few paragraphs of this post, you can use a C232HM cable or breakout board with the .cfg files that ship with the ESP32 port of OpenOCD. If you have an ESP32-WROVER-KIT board (the one with an LCD on one side), it already has one of those chips built in. Otherwise, you’ll need to connect your own.

If you’re using a different development board, you can connect a C232HM cable to the following ESP32 pins:

ESP32 Pin #    JTAG Signal / Wire Color
14             TMS (Brown)
12             TDI (Yellow)
GND            GND (Black)
13             TCK (Orange)
15             TDO (Green)

Espressif’s OpenOCD port has a bin/ directory with the OpenOCD binary, and a share/ directory with scripts for connecting to the ESP32. Once you have everything connected properly, you should be able to open an OpenOCD interface with the command from the Espressif documentation linked above. Depending on how you install it, it might look something like:

bin/openocd -s share/openocd/scripts -f interface/ftdi/esp32_devkitj_v1.cfg -f board/esp-wroom-32.cfg
openocd -f esp32_devkitj_v1.cfg -f esp-wroom-32.cfg

If the connection succeeds, you should see something like this:

Open On-Chip Debugger 0.10.0-dev-ga859564 (2017-07-24-16:18)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
none separate
adapter speed: 20000 kHz
force hard breakpoints
Info : ftdi: if you experience problems at higher adapter clocks, try the command "ftdi_tdo_sample_edge falling"
Info : clock speed 20000 kHz
Info : JTAG tap: esp32.cpu0 tap/device found: 0x120034e5 (mfg: 0x272 (Tensilica), part: 0x2003, ver: 0x1)
Info : JTAG tap: esp32.cpu1 tap/device found: 0x120034e5 (mfg: 0x272 (Tensilica), part: 0x2003, ver: 0x1)

Once the connection is open, we can connect to GDB over local port 3333, and step through the ESP-IDF bootloader. At the time of writing, it looks like Espressif recommends creating a script with a handful of initialization commands, and passing that to GDB. So leave the OpenOCD connection open, and in a new window, create a new file in your project directory called gdbinit with the following commands:

target remote :3333
set remote hardware-watchpoint-limit 2
mon reset halt
flushregs
thb call_start_cpu0

Those commands connect to the target, reset/halt it, and set a breakpoint at the bootloader’s call_start_cpu0 method. Then you can start debugging the bootloader with:

xtensa-esp32-elf-gdb -x gdbinit build/bootloader/bootloader.elf

You should see GDB start, followed by the output of the initialization commands:

...
Reading symbols from build/bootloader/bootloader.elf...done.
0x400076dd in ?? ()
JTAG tap: esp32.cpu0 tap/device found: 0x120034e5 (mfg: 0x272 (Tensilica), part: 0x2003, ver: 0x1)
JTAG tap: esp32.cpu1 tap/device found: 0x120034e5 (mfg: 0x272 (Tensilica), part: 0x2003, ver: 0x1)
esp32: Debug controller was reset (pwrstat=0x5F, after clear 0x0F).
esp32: Core was reset (pwrstat=0x5F, after clear 0x0F).
Target halted. PRO_CPU: PC=0x5000004B (active)    APP_CPU: PC=0x00000000
esp32: target state: halted
esp32: Core was reset (pwrstat=0x1F, after clear 0x0F).
Target halted. PRO_CPU: PC=0x40000400 (active)    APP_CPU: PC=0x40000400
esp32: target state: halted
Hardware assisted breakpoint 1 at 0x40080764: file bootloader/subproject/main/bootloader_start.c, line 38.
(gdb)

You can debug the program normally from here; if you enter continue, the bootloader should halt at the start of its C entry method, and you can step through it with the usual commands like next, step, etc. Espressif has a guide for command-line GDB debugging which goes over some of those common GDB commands in more detail, if you need a refresher.

Writing a Standalone Application

Now that we know how to check whether a program is working without needing to use GPIO pins or a UART connection, let’s write a ‘hello world’ C program without using the ESP-IDF build system. The ESP32 doesn’t really have a concept of a vector table in Flash, because it doesn’t have any on-chip Flash memory. It looks like it expects you to relocate the interrupt vectors into RAM, but that is not required for a super-simple ‘hello world’ program so for the sake of simplicity I am going to gloss over interrupts for now.

We will need a linker script, but the Xtensa GCC toolchain looks about the same as the ARM Cortex-M/R GCC toolchain, so our linker script can look very similar to those of the STM32 chips which I have written about previously. The biggest difference is that the ESP32 has no on-chip Flash memory; instead, we refer to IRAM (Instruction RAM) and DRAM (Data RAM). I’m still learning, but I think the main difference is that you can execute code from IRAM:

/*
 * GNU linker script for Espressif ESP32
 */

/* Default entry point */
ENTRY( call_start_cpu0 );

/* Specify main memory areas */
MEMORY
{
  /* Use values from the ESP-IDF 'bootloader' component. */
  /* TODO: Use human-readable lengths */
  /* TODO: Use the full memory map - this is just a test */
  iram_seg ( RX )       : ORIGIN = 0x40080400, len = 0xFC00
  dram_seg ( RW )       : ORIGIN = 0x3FFF0000, len = 0x1000
}

/* Define output sections */
SECTIONS {
  /* The program code and other data goes into Instruction RAM */
  .iram.text :
  {
    . = ALIGN(16);
    KEEP(*(.entry.text))
    *(.text)
    *(.text*)
    KEEP (*(.init))
    KEEP (*(.fini))
    *(.rodata)
    *(.rodata*)

    . = ALIGN(4);
    _etext = .;
  } >iram_seg

  /* Initialized data goes into Data RAM */
  _sidata = .;
  .data : AT(_sidata)
  {
    . = ALIGN(4);
    _sdata = .;
    *(.data)
    *(.data*)

    . = ALIGN(4);
    _edata = .;
  } >dram_seg

  /* Uninitialized data also goes into Data RAM */
  .bss :
  {
    . = ALIGN(4);
    _sbss = .;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;
  } >dram_seg

  . = ALIGN(4);
  PROVIDE ( end = . );
  PROVIDE ( _end = . );
}

If you look at the memory map in the ESP32 Technical Reference Manual, you’ll notice that this only uses a subset of the available RAM on the chip. There are several different banks located at different memory addresses, and this simple test program doesn’t need very much memory, so for the sake of simplicity I only used the core ‘instruction’ and ‘data’ RAM segments used by the ESP-IDF bootloader. And we rely on that first-stage ROM bootloader which is permanently burned into the chip to pull our code out of the external SPI Flash chip and put it into the expected IRAM segment in the ESP32.
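One quick sanity check is to compare addresses from an nm listing against the two segments declared in that MEMORY block. The sketch below does that on the host. The origins and lengths are copied from the linker script above; the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mem_segment {
  const char *name;
  uint32_t    origin;
  uint32_t    length;
};

/* Values copied from the MEMORY block of the linker script above. */
const struct mem_segment IRAM_SEG = { "iram_seg", 0x40080400u, 0xFC00u };
const struct mem_segment DRAM_SEG = { "dram_seg", 0x3FFF0000u, 0x1000u };

/* Returns 1 if addr lies in [origin, origin + length). */
int segment_contains(const struct mem_segment *seg, uint32_t addr) {
  assert(seg != NULL);
  return addr >= seg->origin && (addr - seg->origin) < seg->length;
}
```

With the bootloader listing from earlier, call_start_cpu0 (0x40080764) lands in iram_seg and _bss_start (0x3fff0000) lands in dram_seg. Symbols such as Cache_Flush_rom (0x40009a14) and the bootloader_utility functions (around 0x40079xxx) fall outside both ranges, so the bootloader's own script presumably declares more memory than this cut-down MEMORY block does.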

We will also need a main.c file containing a program to compile. For the sake of simplicity, I also included some minimal startup logic to clear the bss segment with memset and copy the initialized data segment into RAM with memmove:

#include <string.h>

extern unsigned int _sbss, _ebss, _sidata, _sdata, _edata;

static volatile int sisyphus = 0;
int main( void ) {
  // Increment a variable.
  while ( 1 ) {
    ++sisyphus;
  }
  return 0;
}

// Startup logic; this is the application entry point.
void __attribute__( ( noreturn ) ) call_start_cpu0() {
  // Clear BSS.
  memset( &_sbss, 0, ( &_ebss - &_sbss ) * sizeof( _sbss ) );
  // Copy initialized data.
  memmove( &_sdata, &_sidata, ( &_edata - &_sdata ) * sizeof( _sdata ) );

  // Done, branch to main
  main();
  // (Should never be reached)
  while( 1 ) {}
}
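The size arithmetic in those memset and memmove calls is easy to get wrong: subtracting two unsigned int pointers gives a count of words, which then has to be multiplied by the word size to get bytes. The following host-side sketch repeats the same logic with ordinary arrays standing in for the linker symbols, so that it can be tried on a PC. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero the words in [start, end), like the BSS step in call_start_cpu0. */
void clear_words(unsigned int *start, unsigned int *end) {
  assert(start <= end);
  /* (end - start) counts unsigned ints; multiply to get a byte count. */
  memset(start, 0, (size_t)(end - start) * sizeof(*start));
}

/* Copy initial values from `load` into [start, end), like the .data step. */
void copy_words(unsigned int *start, unsigned int *end,
                const unsigned int *load) {
  assert(start <= end);
  memmove(start, load, (size_t)(end - start) * sizeof(*start));
}
```

Calling clear_words(&buf[1], &buf[3]) on a four-word array zeroes only the two middle words, which mirrors how &_sbss and &_ebss bracket the BSS section.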

And finally, we’ll need a Makefile to build the project. This looks like any other GCC Makefile:

# 'Bare metal' ESP32 application Makefile
# Use the xtensa-esp32-elf toolchain.
TOOLCHAIN = xtensa-esp32-elf-

CFLAGS_PLATFORM  = -mlongcalls -mtext-section-literals -fstrict-volatile-bitfields
ASFLAGS_PLATFORM = $(CFLAGS_PLATFORM)
LDFLAGS_PLATFORM = $(CFLAGS_PLATFORM)

###
# General project build
###
CC = $(TOOLCHAIN)gcc
LD = $(TOOLCHAIN)ld
OC = $(TOOLCHAIN)objcopy
OS = $(TOOLCHAIN)size

# Linker script location.
LDSCRIPT       = ./ld/esp32.ld
# Set C/LD/AS flags.
CFLAGS += $(INC) -Wall -Werror -std=gnu11 -nostdlib $(CFLAGS_PLATFORM) $(COPT)
# (Allow access to the same memory location w/ different data widths.)
CFLAGS += -fno-strict-aliasing
CFLAGS += -fdata-sections -ffunction-sections
#CFLAGS += -Os
CFLAGS += -Os -g
LDFLAGS += -nostdlib -T$(LDSCRIPT) -Wl,-Map=$@.map -Wl,--cref -Wl,--gc-sections
LDFLAGS += $(LDFLAGS_PLATFORM)
LDFLAGS += -lm -lc -lgcc
ASFLAGS += -c -O0 -Wall -fmessage-length=0
ASFLAGS += $(ASFLAGS_PLATFORM)

# Set C source files.
C_SRC += \
  ./src/main.c \

OBJS += $(C_SRC:.c=.o)

# Set the first rule in the file to 'make all'
.PHONY: all
all: main.elf

# Rules to build files.
# (Recipe lines must be indented with a tab character, not spaces.)
%.o: %.S
	$(CC) -x assembler-with-cpp $(ASFLAGS) $< -o $@

%.o: %.c
	$(CC) -c $(CFLAGS) $< -o $@

main.elf: $(OBJS)
	$(CC) $^ $(LDFLAGS) -o $@

# Target to clean build artifacts.
.PHONY: clean
clean:
	rm -f $(OBJS)
	rm -f ./main.bin
	rm -f ./main.elf ./main.elf.map

With those files written, you can build the program with make – it should spit out a main.elf file. You can use a GDB command similar to the one we used to inspect the bootloader, as long as you have an OpenOCD connection attached to the chip:

xtensa-esp32-elf-gdb -x gdbinit main.elf

Flashing a ‘Hello World’ Application

To actually upload our test program, we need to format it for the ESP32 and then store it in the SPI Flash chip connected to the actual ESP32 within the module. We can do that with Espressif’s esptool utility. It should be part of the ESP-IDF download, but you can find more detailed instructions in the project’s GitHub repository. To format the ELF file into a binary image:

esptool --chip esp32 elf2image --flash_mode="dio" --flash_freq "40m" --flash_size "4MB" -o main.bin main.elf

To flash a binary image to Flash address 0x1000 (where the ESP32 expects a ‘bootloader’ to be located):

esptool --chip esp32 --port /dev/ttyUSB0 --baud 115200 --before default_reset --after hard_reset write_flash -z --flash_mode dio --flash_freq 40m --flash_size detect 0x1000 main.bin

Note that you might need to specify a different port, depending on which system resource your ESP32 is connected to. And I think that this also depends on the partition table which we uploaded to 0x8000 in Flash as part of the test ‘hello world’ project. That sort of feels like cheating, but I haven’t quite figured out how the chip retrieves its code from Flash yet.
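One thing that can be checked on the host is the header that elf2image puts at the front of main.bin. As far as I understand esptool's image format, the file starts with a magic byte (0xE9), then a segment count, two bytes of Flash settings, and a little-endian 32-bit entry address. Treat that layout as an assumption to verify against the esptool documentation. The sketch below only reads those first eight bytes, and its struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic byte assumed to start every esptool application image. */
#define ESP_IMAGE_MAGIC 0xE9u

struct image_header_summary {
  uint8_t  segment_count;
  uint32_t entry_address;
};

/* Read the first 8 bytes of an image buffer.
 * Returns 1 if the buffer is long enough and the magic byte matches. */
int read_image_header(const uint8_t *buf, size_t len,
                      struct image_header_summary *out) {
  assert(buf != NULL && out != NULL);
  if (len < 8 || buf[0] != ESP_IMAGE_MAGIC) {
    return 0;
  }
  out->segment_count = buf[1];
  /* Bytes 2 and 3 hold the Flash mode and size/frequency; skipped here. */
  out->entry_address = (uint32_t)buf[4]
                     | ((uint32_t)buf[5] << 8)
                     | ((uint32_t)buf[6] << 16)
                     | ((uint32_t)buf[7] << 24);
  return 1;
}
```

If the assumed layout holds, the entry address read from main.bin should match the address that nm reports for call_start_cpu0 in main.elf.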

Anyways, once the image is flashed, you can use the same steps as above to connect with GDB – just replace build/bootloader/bootloader.elf with main.elf. You should be able to step through the program after it reaches its ‘main’ method, and observe that the poor sisyphus variable increments and overflows endlessly.
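A small caveat about that overflow: sisyphus is a signed int, and signed overflow is undefined behavior in C, even if it wraps in practice on most compilers. If you want the wraparound to be guaranteed, an unsigned counter is the safer choice, because unsigned arithmetic is defined to wrap modulo 2^32 for a 32-bit type. A minimal sketch, using a hypothetical bump helper instead of the original variable:

```c
#include <assert.h>
#include <stdint.h>

/* Increment with well-defined wraparound (unsigned arithmetic is modular). */
uint32_t bump(uint32_t value) {
  return value + 1u;
}

/* Demonstration: the counter wraps from its maximum back to zero. */
void demo_wraparound(void) {
  assert(bump(41u) == 42u);
  assert(bump(UINT32_MAX) == 0u);
}
```

In main.c the equivalent change would be to declare the counter as a volatile uint32_t.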

Conclusions

This isn’t actually useful for writing an ESP32 application; what are you going to do, write your own WiFi and Bluetooth drivers? But it is a fun learning exercise, and maybe it would be possible to access some of the ESP-IDF functionality from a ‘bare metal’ program like this, depending on how you built it.

Plus, I skipped some important things that I haven’t figured out yet, like where to put the interrupt vector table and what it should look like. But there you go – I hope this trivia about how the ESP32 works was interesting or helpful. And corrections are welcome as always; like I said, there’s a lot I still haven’t figured out. And you can find a repository with this example’s code on GitHub.

Comments (2):

  1. Mahyar

    July 1, 2019 at 4:41 am

    This is an excellent write up, and a great start to making a tiny SDK. Also, thanks for the source code.

    • Vivonomicon

      July 6, 2019 at 2:41 pm

      Thank you for the kind words, I’m glad it was helpful. Please do keep in mind that I am still learning about the ESP32 and I might be misunderstanding some parts of the boot process, but good luck!
