

1.2 The ELF file format

The authoritative source for the ELF file format is the Tool Interface Standard (TIS) Executable and Linking Format (ELF) Specification, available from https://refspecs.linuxfoundation.org/. Another valuable resource is the ‘elf(5)’ man page. ELF files may be examined with the ‘readelf(1)’ utility.

Every ELF file begins with a header, which comes in two parts. The first part is called the identification. It is a ‘16’ byte array whose purpose is to provide, among other things, the class of the file, which may be 32-bit or 64-bit, and the data encoding of the file, which may be little-endian or big-endian. The second part contains the rest of the ELF header, which is either ‘36’ or ‘48’ more bytes long, depending on whether the file is 32-bit or 64-bit. These bytes are interpreted as 16-bit, 32-bit, or 64-bit integers in the data encoding given by the identification. The main information in the second part is an offset to the program header table and an offset to the section header table, together with the number of entries in each table and the size in bytes of each table’s individual entries (all entries of a table have the same size).
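The header layout described above can be sketched in Python with the standard ‘struct’ module. This is only an illustration, not part of any particular tool: the function name ‘parse_elf_header’ and the dictionary keys are hypothetical, the field order is the one given in ‘elf(5)’, and the sample bytes are a synthetic 64-bit little-endian header built in memory rather than read from a real file.

```python
import struct

EI_NIDENT = 16  # size in bytes of the identification array


def parse_elf_header(data: bytes) -> dict:
    """Decode the identification, then the table-related header fields."""
    ident = data[:EI_NIDENT]
    if len(ident) < EI_NIDENT or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = ident[4], ident[5]  # EI_CLASS and EI_DATA bytes
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported class or data encoding")
    bits = 32 if ei_class == 1 else 64      # ELFCLASS32 = 1, ELFCLASS64 = 2
    endian = "<" if ei_data == 1 else ">"   # ELFDATA2LSB = 1, ELFDATA2MSB = 2
    # Field order: e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
    # e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx.
    # Only e_entry, e_phoff and e_shoff change width with the class.
    layout = "HHIIIIIHHHHHH" if bits == 32 else "HHIQQQIHHHHHH"
    rest = struct.Struct(endian + layout)   # 36 bytes (32-bit) or 48 bytes (64-bit)
    f = rest.unpack_from(data, EI_NIDENT)
    return {
        "bits": bits, "endian": endian, "rest_size": rest.size,
        "phoff": f[4], "shoff": f[5],
        "phentsize": f[8], "phnum": f[9],
        "shentsize": f[10], "shnum": f[11],
        "shstrndx": f[12],
    }


# Synthetic 64-bit little-endian header (all values are made up).
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
rest = struct.pack("<HHIQQQIHHHHHH",
                   2, 62, 1, 0x401000, 64, 4096, 0, 64, 56, 2, 64, 5, 4)
header = parse_elf_header(ident + rest)
print(header["bits"], header["phoff"], header["phnum"], header["shnum"])
```

Note that the header stores an entry count and an entry size for each table, so the total size of a table is the product of the two.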

What is the information in program header entries and section header entries? Each entry describes a region of the ELF file by means of an offset and a byte size. The regions described by program header entries are called segments, and the regions described by section header entries are called sections. The two tables are independent of each other: neither refers to the other, and the regions they describe may overlap. In this manner, we have two complementary views of the ELF file, one offered by segments and one offered by sections. In general, the loader uses segments to put together the memory of a process before handing execution to the program, while the linker and other development tools, such as debuggers, use sections.
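As a rough sketch of the two views, the following Python code walks a table of fixed-size entries and collects the (offset, size) pair of each region. It assumes a 64-bit little-endian file and the entry layouts given in ‘elf(5)’; the helper names ‘table_entries’, ‘segment_regions’ and ‘section_regions’ are hypothetical, and the image is synthetic, with one program header entry and one section header entry written at arbitrary offsets.

```python
import struct

# 64-bit little-endian entry layouts, in the field order given by elf(5).
# p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
PHDR64 = struct.Struct("<IIQQQQQQ")
# sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size,
# sh_link, sh_info, sh_addralign, sh_entsize
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def table_entries(data, table_offset, entry_size, count, layout):
    """Unpack `count` fixed-size entries starting at `table_offset`."""
    return [layout.unpack_from(data, table_offset + i * entry_size)
            for i in range(count)]


def segment_regions(data, phoff, phentsize, phnum):
    """Return (p_offset, p_filesz) for every program header entry."""
    return [(e[2], e[5])
            for e in table_entries(data, phoff, phentsize, phnum, PHDR64)]


def section_regions(data, shoff, shentsize, shnum):
    """Return (sh_offset, sh_size) for every section header entry."""
    return [(e[4], e[5])
            for e in table_entries(data, shoff, shentsize, shnum, SHDR64)]


# Synthetic image: one segment covering bytes 0..199 and one section
# covering bytes 180..199 (all values are made up for illustration).
image = bytearray(256)
PHDR64.pack_into(image, 64, 1, 5, 0, 0, 0, 200, 200, 4096)
SHDR64.pack_into(image, 128, 1, 1, 6, 0, 180, 20, 0, 0, 16, 0)

segments = segment_regions(image, 64, PHDR64.size, 1)
sections = section_regions(image, 128, SHDR64.size, 1)
print(segments, sections)
```

In this made-up image the section happens to lie inside the segment, but nothing in either table records that fact; each view has to be read on its own.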

Both program header entries and section header entries have a type, which indicates what the segment or section contains. Furthermore, the ELF header contains an index, called ‘e_shstrndx’, into the section header table. The section at that index is the section name string table. The standard way to store string tables in ELF files is as follows (for example):

  \0 f o o \0 b a r \0 b a z \0

Example 1.1: An example string table.

Every string table starts with a NUL byte; then an array of NUL-terminated strings follows. The encoding of the strings is not specified and is application-specific, but latin-1 should be a reasonable assumption. The section name string table is a special string table that is used to give textual names to sections. Every section header entry has an ‘sh_name’ field, which is a byte index into the section name string table. For example, with the above string table, an index of ‘1’ points to the ‘foo’ string, while an index of ‘6’ points to the ‘ar’ string.(1) There are other entities that may have textual names, such as symbols. Symbols are stored in sections with type ‘SHT_SYMTAB’ or ‘SHT_DYNSYM’. The section header entry of such a section contains, in its ‘sh_link’ field, an index into the section header table; the section header entry at that index describes the string table that holds the symbols’ names.
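The index lookup described above can be sketched as follows. The function name ‘strtab_get’ is hypothetical, and latin-1 is used as the default only because of the assumption stated above; the table is the one from Example 1.1.

```python
def strtab_get(table: bytes, index: int, encoding: str = "latin-1") -> str:
    """Return the NUL-terminated string that starts at byte `index`."""
    if not 0 <= index < len(table):
        raise IndexError("string table index out of range")
    end = table.find(b"\0", index)  # the terminating NUL of this string
    if end == -1:
        raise ValueError("string is not NUL-terminated")
    return table[index:end].decode(encoding)


# The string table from Example 1.1.
TABLE = b"\0foo\0bar\0baz\0"

names = [strtab_get(TABLE, i) for i in (0, 1, 5, 6, 9)]
print(names)  # ['', 'foo', 'bar', 'ar', 'baz']
```

Index ‘0’ always yields the empty string, because of the leading NUL byte. The lookup accepts any byte index, not only the start of a stored string, which is why index ‘6’ yields ‘ar’.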


Footnotes

(1)

Some tools, such as GNU ld, perform a shared-suffix optimization. This means that they will merge the entries for two strings such as ‘bar’ and ‘ar’ into a single entry. Thus one should be prepared to index into the middle of a string in a string table.

