I recently ended up building a configuration system for a piece of C++ software I’m maintaining. One feature I wanted was the ability to create new configuration items at arbitrary locations in my code without “registering” them somewhere centrally. The config system should just “pick them up” on startup.
One prominent example of a library using similar functionality is Google Test, Google’s testing framework. In Google Test, you can just write something like
TEST(TestSuiteName, TestName) {
  ... test body ...
}
somewhere in your code, and when Google Test’s main function runs, it picks up on your test and runs it.
How does one achieve that?
On a more abstract level, one needs to run a function with side effects before one’s main() runs. The side effect can then be used to register the test function or the config item in some central data structure.
So the question boils down to: How can we run some code before main()?
I’m not sure if this is the only trick C++ offers us, but it’s the trick Google Test uses: When you have some global variable in your code, you can (and should!) initialize it, i.e., assign a value that the global variable will have when your program starts. In its simplest form, this can look like this:
#include <cstdint>

namespace whatever {
  int64_t simpleGlobal = 42; // initialized to 42
}
For the global variable simpleGlobal, your compiler will usually choose static initialization.
This means that the compiler will add space for the variable (8 bytes in this case) to the .data segment of the resulting file (an ELF binary). When your OS loads an ELF file, the .data segment is loaded into memory as-is. So, if the compiler writes the value 42 into the space in the .data segment used for simpleGlobal, the variable is automatically initialized to the correct value.
However, static initialization is not our only option. We can also write something like this:
namespace whatever {
  int64_t someFunction(); // defined in some other translation unit

  int64_t dynamicGlobal = someFunction();
}
In this case, the initial value for dynamicGlobal is determined by executing someFunction(). If we make someFunction() complex enough, it is clear that the compiler cannot evaluate it at compile or link time, and instead it somehow needs to be executed before main(). There is no rule that someFunction() has to be side-effect free! Can we use that side effect to register our test / config item / whatever?
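To make the idea concrete, here is a minimal sketch of such a registration for config items. All names in it (configRegistry, registerConfigItem, the item names) are made up for illustration; this is not the actual config system. The registry lives in a function-local static so that it is guaranteed to exist whenever a registering initializer runs, regardless of the order in which globals get initialized.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical central registry. The function-local static is constructed
// on first use, so it does not depend on global initialization order.
std::map<std::string, std::int64_t>& configRegistry() {
    static std::map<std::string, std::int64_t> registry;
    return registry;
}

// Registers one item as a side effect. It returns a value only so that it
// can serve as the initializer of a global variable.
bool registerConfigItem(const std::string& name, std::int64_t defaultValue) {
    configRegistry()[name] = defaultValue;
    return true;
}

// These definitions could sit in any translation unit of the program.
// Their dynamic initialization performs the registration.
bool timeoutRegistered = registerConfigItem("timeout_ms", 500);
bool retriesRegistered = registerConfigItem("retries", 3);
```

If the initializers have run by the time main() starts, configRegistry() already contains both items there.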
The answer is: probably. The C++ standard states in basic.start.dynamic/5 (slightly simplified):
It is implementation-defined whether the dynamic initialization of a […] variable with static storage duration is sequenced before the first statement of main or is deferred. If it is deferred, it strongly happens before any […] odr-use of any […] function or […] variable defined in the same translation unit as the variable to be initialized.
So basically the standard explicitly says that you may not rely on the side effects of someFunction already being visible when main() starts. In fact, the side effects may never occur if you have no ODR-use of anything in the respective TU!
The fact that Google Test uses something very similar[1] makes one hopeful that this trick is pretty portable, even if it’s not guaranteed by the standard. Google Test advertises platform support according to Google’s “foundational C++” rules, which would mean pretty widespread support.
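As a rough illustration of how a TEST-like macro can be built on this trick, here is a simplified sketch. It is not Google Test’s actual code: MY_TEST, testRegistry, registerTest and runAllTests are hypothetical names. The macro declares a small struct per test whose static member variable is dynamically initialized by a registering call, and then opens the body of the test function.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct TestCase {
    std::string name;
    std::function<void()> body;
};

// Hypothetical registry, again behind a function-local static.
std::vector<TestCase>& testRegistry() {
    static std::vector<TestCase> tests;
    return tests;
}

bool registerTest(std::string name, std::function<void()> body) {
    testRegistry().push_back({std::move(name), std::move(body)});
    return true;
}

// MY_TEST is a made-up macro for this sketch, not Google Test's TEST.
// The static member registered_ is dynamically initialized, and its
// initializer registers the test as a side effect.
#define MY_TEST(suite, name)                                          \
    struct suite##_##name##_Test {                                    \
        static void run();                                            \
        static const bool registered_;                                \
    };                                                                \
    const bool suite##_##name##_Test::registered_ =                   \
        registerTest(#suite "." #name, &suite##_##name##_Test::run);  \
    void suite##_##name##_Test::run()

int testsRun = 0;

MY_TEST(Math, Addition) {
    if (1 + 1 == 2) ++testsRun;
}

MY_TEST(Math, Multiplication) {
    if (2 * 3 == 6) ++testsRun;
}

// What a test runner's main() could do with the registry.
void runAllTests() {
    for (const TestCase& test : testRegistry()) test.body();
}
```

Calling runAllTests() from main() would then execute every test that registered itself, without main() knowing any test by name.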
In this section I’m going to take a closer look at how the dynamic initialization is achieved in practice. The details of this will vary depending on compiler, standard library implementation, CPU architecture and probably also linker. I’m using GCC 14.2.0, glibc 2.40-1ubuntu3.1, libstdc++ 14.2.0-4ubuntu on an x86_64 system.
This is my example “project”, where the side effect of the dynamic initialization function is just setting a global variable called global to the value 23.
Let this be main.cpp:
#include <iostream>

int global = 0;

int main() {
  std::cout << "Global value: " << global << "\n";
}
And this is a second translation unit, other.cpp:
extern int global;

bool setGlobal() {
  global = 23;
  return true;
}

bool dummy = setGlobal();
If you build and execute this, it should output:
> ./example
Global value: 23
So setGlobal did indeed run before main(). How? Let’s fire up gdb and set a breakpoint in setGlobal():
gdb ./example
GNU gdb (Ubuntu 15.1-1ubuntu1~24.04) 15.1
…
Reading symbols from ./example...
(gdb) break setGlobal()
Breakpoint 1 at 0x1184: file …/other.cpp, line 4.
(gdb) r
Starting program: …/example
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, setGlobal () at …/other.cpp:4
4         global = 23;
(gdb) bt
#0  setGlobal () at ~/some_dir/other.cpp:4
#1  0x000055555555518f in __static_initialization_and_destruction_0 ()
    at ~/some_dir/other.cpp:8
#2  0x00005555555551a5 in _GLOBAL__sub_I__Z9setGlobalv ()
    at ~/some_dir/other.cpp:8
#3  0x00007ffff782a4f4 in call_init (argc=1, argv=0x7fffffffde28, env=<optimized out>)
    at ../csu/libc-start.c:145
#4  __libc_start_main_impl (main=0x5555555551a7 <main()>, argc=1, argv=0x7fffffffde28,
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffde18) at ../csu/libc-start.c:347
#5  0x00005555555550a5 in _start ()
(gdb)
We see that the chain of calls reaching our setGlobal() is:
_start()
__libc_start_main_impl()
call_init()
_GLOBAL__sub_I__Z9setGlobalv()
__static_initialization_and_destruction_0()
For the two stack frames directly below setGlobal() (__static_initialization_and_destruction_0() and _GLOBAL__sub_I__Z9setGlobalv()), gdb claims that these originate from our own code (other.cpp:8). This tells us that these two stack frames do not correspond to functions in libraries, but that they are somehow generated by the compiler for our variable initialization.
The two frames below that (call_init() and __libc_start_main_impl()) point to csu/libc-start.c, which is part of the GNU C Library.
Below that, there is a mysterious _start, for which gdb shows no source location at all.
_start
Let’s start with that _start stack frame. Even though gdb doesn’t tell us so, this function actually also comes from the GNU C Library.
In our built executable, it looks like this (as generated by objdump -Crawd ./example):
1 0000000000001080 <_start>:
2 1080: f3 0f 1e fa endbr64
3 1084: 31 ed xor %ebp,%ebp
4 1086: 49 89 d1 mov %rdx,%r9
5 1089: 5e pop %rsi
6 108a: 48 89 e2 mov %rsp,%rdx
7 108d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
8 1091: 50 push %rax
9 1092: 54 push %rsp
10 1093: 45 31 c0 xor %r8d,%r8d
11 1096: 31 c9 xor %ecx,%ecx
12 1098: 48 8d 3d 08 01 00 00 lea 0x108(%rip),%rdi # 11a7 <main>
13 109f: ff 15 3b 2f 00 00 call *0x2f3b(%rip) # 3fe0 <__libc_start_main@GLIBC_2.34>
14 10a5: f4 hlt
15 10a6: 66 2e 0f 1f 84 00 00 00 00 00 cs nopw 0x0(%rax,%rax,1)
We see the call to __libc_start_main (the function gdb showed us as __libc_start_main_impl) in line 13. Don’t be confused by the reference to main in the line above that: that’s not a call to main(). That instruction (lea, load effective address) loads the address of the main() function into the %rdi register, which is used for passing the first argument to the function called in the following call instruction. Thus, __libc_start_main takes the address of the main() function to execute as a parameter.
__libc_start_main
To find __libc_start_main (resp. __libc_start_main_impl) in libc-start.c, one needs to chase a couple of #defines and then ends up here:
/* Note: The init and fini parameters are no longer used. fini is
   completely unused, init is still called if not NULL, but the
   current startup code always passes NULL. (In the future, it would
   be possible to use fini to pass a version code if init is NULL, to
   indicate the link-time glibc without introducing a hard
   incompatibility for new programs with older glibc versions.)

   For dynamically linked executables, the dynamic segment is used to
   locate constructors and destructors. For statically linked
   executables, the relevant symbols are access directly. */
STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                 int argc, char **argv,
#ifdef LIBC_START_MAIN_AUXVEC_ARG
                 ElfW(auxv_t) *auxvec,
#endif
                 __typeof (main) init,
                 void (*fini) (void),
                 void (*rtld_fini) (void), void *stack_end)
{
…
The comment is interesting: It says that historically, the init parameter was used to pass initialization routines[2], but today “the dynamic segment” is used to locate “constructors and destructors”.
The body of __libc_start_main does a lot of initialization and then calls call_init():
…
  /* Call the initializer of the program, if any. */
#ifdef SHARED
…
  if (init != NULL)
    /* This is a legacy program which supplied its own init
       routine. */
    (*init) (argc, argv, __environ MAIN_AUXVEC_PARAM);
  else
    /* This is a current program. Use the dynamic segment to find
       constructors. */
    call_init (argc, argv, __environ);
…
#else /* !SHARED */
  call_init (argc, argv, __environ);
…
#endif
call_init()
Here is a slightly shortened (and reformatted…) version of call_init(), also from libc-start.c:
 1  /* Initialization for dynamic executables. Find the main executable
 2     link map and run its init functions. */
 3  static void
 4  call_init (int argc, char **argv, char **env)
 5  {
 6    /* Obtain the main map of the executable. */
 7    struct link_map *l = GL(dl_ns)[LM_ID_BASE]._ns_loaded;
 8
 9    …
10
11    ElfW(Dyn) *init_array = l->l_info[DT_INIT_ARRAY];
12    if (init_array != NULL) {
13      unsigned int jm = l->l_info[DT_INIT_ARRAYSZ]->d_un.d_val / sizeof (ElfW(Addr));
14
15      ElfW(Addr) *addrs = (void *) (init_array->d_un.d_ptr + l->l_addr);
16      for (unsigned int j = 0; j < jm; ++j) {
17        ((dl_init_t) addrs[j]) (argc, argv, env);
18      }
19    }
20  }
Lines 7 and 11 access the “link map” of the executable. The “link map” is a data structure created by the dynamic linker when the executable is loaded into memory. It basically contains information about where in memory the different sections of the loaded ELF file ended up. You can find it here in glibc.
Line 11 uses the DT_INIT_ARRAY tag to look up, in the link map, the address of the .init_array section of the ELF file (where it was loaded into memory), and line 13 uses the DT_INIT_ARRAYSZ tag (SZ for “size”…) to determine the number of entries in the .init_array section.
The content of the .init_array section is a list of addresses of functions to be called for initialization, and calling them is what lines 15 to 18 do. Each of these initialization functions gets argc, argv and the environment (env, an array of environment strings) passed in.
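The loop in call_init() can be mimicked in plain C++ to show its shape. This is only a sketch under simplifying assumptions: fakeInitArray stands in for the .init_array section, and callInitSketch, initFirst, initSecond and initLog are hypothetical names. Real startup code finds the table via the link map instead of a named array.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Same shape as the init functions: they receive argc, argv and env.
using InitFn = void (*)(int, char**, char**);

// Records which init functions ran, and in which order.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

void initFirst(int, char**, char**) { initLog().push_back("first"); }
void initSecond(int, char**, char**) { initLog().push_back("second"); }

// Stand-in for the contents of .init_array: a table of function addresses.
const InitFn fakeInitArray[] = {initFirst, initSecond};

// Mirrors the loop in call_init(): derive the entry count from the size
// in bytes, then call every entry in order.
void callInitSketch(int argc, char** argv, char** env) {
    const std::size_t count = sizeof(fakeInitArray) / sizeof(fakeInitArray[0]);
    for (std::size_t j = 0; j < count; ++j) {
        fakeInitArray[j](argc, argv, env);
    }
}
```

The entries are called strictly in table order, which is why the order of addresses in .init_array matters.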
Let’s look at the contents of the .init_array in our example:
> objdump -s -j .init_array ./example

./example: file format elf64-x86-64

Contents of section .init_array:
 3d98 60110000 00000000 98110000 00000000 `...............
The .init_array section apparently contains two 64-bit values, i.e., two addresses. Converting from little-endian, this tells us that two functions at addresses 0x1160 and 0x1198 should be called.
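The little-endian conversion can be double-checked with a few lines of code. The helper decodeLittleEndian64 is a name made up for this sketch; the byte values are copied from the objdump output.

```cpp
#include <cstddef>
#include <cstdint>

// Decodes 8 bytes stored least-significant byte first into a 64-bit value.
std::uint64_t decodeLittleEndian64(const unsigned char* bytes) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        value |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
    }
    return value;
}

// The 16 bytes of .init_array as shown by objdump.
const unsigned char initArrayBytes[16] = {
    0x60, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,  // first entry
    0x98, 0x11, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,  // second entry
};
```

Decoding the first eight bytes yields 0x1160, and decoding the second eight bytes (initArrayBytes + 8) yields 0x1198.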
Let’s have a quick look at the next function on our call stack, _GLOBAL__sub_I__Z9setGlobalv:
> objdump -Cdwr ./example
…
0000000000001198 <_GLOBAL__sub_I__Z9setGlobalv>:
    1198: f3 0f 1e fa endbr64
    119c: 55 push %rbp
…
It’s at address 0x1198! So now we know how it ends up being called from call_init().
The other function, at 0x1160, is an (indirect) call to register_tm_clones, which is a setup function for some inner mechanics of the (not yet working, I think?) GNU Transactional Memory Library. We can ignore that.
_GLOBAL__sub_I__Z9setGlobalv and __static_initialization_and_destruction_0()
Let’s take a closer look at _GLOBAL__sub_I__Z9setGlobalv:
0000000000001198 <_GLOBAL__sub_I__Z9setGlobalv>:
    1198: f3 0f 1e fa endbr64
    119c: 55 push %rbp
    119d: 48 89 e5 mov %rsp,%rbp
    11a0: e8 dd ff ff ff call 1182 <__static_initialization_and_destruction_0()>
    11a5: 5d pop %rbp
    11a6: c3 ret
That’s simple enough. Nothing going on here, just forwarding the call to __static_initialization_and_destruction_0().
And that function looks like this:
0000000000001182 <__static_initialization_and_destruction_0()>:
    1182: f3 0f 1e fa endbr64
    1186: 55 push %rbp
    1187: 48 89 e5 mov %rsp,%rbp
    118a: e8 da ff ff ff call 1169 <setGlobal()>
    118f: 88 05 bc 2f 00 00 mov %al,0x2fbc(%rip) # 4151 <dummy>
    1195: 90 nop
    1196: 5d pop %rbp
    1197: c3 ret
Again not much going on, basically only the call to setGlobal().
I assume that these two compiler-generated functions will look more complicated if you have multiple functions running for dynamic initialization.
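One thing the language does pin down for the multi-initializer case: within a single translation unit, namespace-scope variables like these are dynamically initialized in the order of their definitions. The following sketch (with hypothetical names initOrder, record and stepOne to stepThree) makes that order observable. How GCC lays out the generated functions for it is something I have not verified.

```cpp
#include <string>
#include <vector>

// Records the order in which the initializers below run.
std::vector<std::string>& initOrder() {
    static std::vector<std::string> order;
    return order;
}

bool record(const char* label) {
    initOrder().push_back(label);
    return true;
}

// Three dynamically initialized globals in one translation unit. Within
// a single TU they are initialized in the order of their definitions.
bool stepOne = record("one");
bool stepTwo = record("two");
bool stepThree = record("three");
```

No such ordering exists between different translation units, which is one reason the log sits behind a function-local static rather than in a plain global.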
So, can we (ab)use the side effects of dynamic initialization, e.g. for “plugin registration” the way Google Test does?
My understanding is that if we use it in the way we want to use it (say a TU which is never called into from main(), but which contains a call like Google Test’s TEST(…) {…}), the standard gives us zero guarantees about when these side effects run, or whether they run at all. If the compiler (or rather: linker) detects that no symbols from a translation unit are ODR-used at all, it may optimize everything from that TU away, I think.
Aside from optimizing away the call, the other wrench the compiler could throw into our works is deferred dynamic initialization, i.e., deferring the calls until the first ODR-use of anything in the same TU, instead of running them in the _start phase. However, I fail to see how this could be achieved without massive overhead. The only way I see would be to generate a function that runs before every ODR-use of any symbol of a TU and which makes sure that all dynamic initialization from the TU has run. That seems insane.
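To illustrate what such a scheme would cost, here is a hand-written emulation of deferred initialization. This is purely a sketch of the idea, not something I know any compiler to generate, and all names (ensureTuInitialized, lazyGlobal, and so on) are hypothetical. Every function that touches the TU’s state first has to pass through a guard.

```cpp
int guardChecks = 0;      // how often the guard was consulted
int initializerRuns = 0;  // how often the initialization actually ran
int lazyGlobal = 0;

// Emulated "make sure this TU is initialized" guard. The function-local
// static ensures the lambda runs exactly once, on the first call.
void ensureTuInitialized() {
    ++guardChecks;
    static const bool done = [] {
        ++initializerRuns;
        lazyGlobal = 23;
        return true;
    }();
    (void)done;
}

// Every use of the TU's state has to go through the guard first.
int readLazyGlobal() {
    ensureTuInitialized();
    return lazyGlobal;
}

void bumpLazyGlobal() {
    ensureTuInitialized();
    ++lazyGlobal;
}
```

The initialization runs only once, but the guard check is paid on every single access, which is the overhead described above.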
So, in conclusion: While removing all dynamic initialization of a TU when the linker can prove that nothing in the TU is ever ODR-used may seem realistic, such an optimization would probably break a lot of things, Google Test among them. So my guess is that it’s pretty safe to depend on this behavior.
[1] Actually, they use dynamic initialization of a static class member variable instead of a namespace-scope variable, see this code.
[2] I did not check, but I assume that this was essentially the .init section of the ELF file containing the executable… As far as I can tell, glibc hasn’t used the .init section since 1999.