Understanding the C++ Compilation Process

Ankit Dhamsaniya

The compilation process preprocesses human-readable C++ source code and translates it into machine code that the computer can execute.

The Build Pipeline: Preprocess, Compile, and Link

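Building a C++ program involves several stages. Preprocessing, compilation, and linking form the core pipeline; optimization is optional and happens mainly during compilation, and execution follows once the build is complete.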

  1. Preprocessing
  2. Compilation
  3. Linking
  4. Optimization
  5. Execution

Preprocessing scans the source code for preprocessor directives such as #include and #define, and for conditional compilation directives such as #if and #endif. The contents of each included header file (e.g., #include <iostream>, #include <string>) are copied into the source code, and macros defined with #define are replaced by their expansions. The preprocessed result is called a translation unit.

Compilation translates the preprocessed source code into assembly code and then into object code. In this step the compiler checks syntax and semantics and reports warnings and errors. The output is an object file (.o, or .obj on Windows) containing machine code.

The linking step combines the object files from the compilation stage into an executable (for example, an .exe file on Windows). Static libraries are copied into the executable in this step, while dynamic libraries are recorded as dependencies to be loaded at run time. The linker resolves references between symbols defined in different object files, so that each use of a variable or function is bound to its definition.

Optimization is an optional step that produces smaller or faster code at the expense of compilation time. It is usually requested through compiler flags and performed mainly during compilation.

Execution means running the executable: the operating system loads it into memory, and the program's own code begins at the main() function.
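
As a rough illustration of the stages above, the sketch below annotates a tiny source file with commands that stop after each stage. The file names pipeline_demo.cpp and main.o, the GREETING macro, and the greeting() function are hypothetical, and the commands assume the g++ driver; other toolchains use different flags.

```cpp
// pipeline_demo.cpp (hypothetical file name)
//
// Assuming the g++ driver, each stage can be run on its own:
//   g++ -E pipeline_demo.cpp -o pipeline_demo.ii      preprocess only
//   g++ -S pipeline_demo.cpp -o pipeline_demo.s       stop after generating assembly
//   g++ -c pipeline_demo.cpp -o pipeline_demo.o       stop after generating the object file
//   g++ -O2 -c pipeline_demo.cpp -o pipeline_demo.o   same, with optimization enabled
//   g++ pipeline_demo.o main.o -o pipeline_demo       link object files into an executable
#include <string>   // preprocessing: the header's contents are pasted in here

#define GREETING "hello"   // preprocessing: every later GREETING becomes "hello"

// Compilation: this function is checked and turned into machine code in the
// object file. Linking: a main() defined in another object file (main.o,
// assumed here) can call greeting() once the linker connects the two.
std::string greeting() {
    return std::string(GREETING) + ", world";
}
```

Inspecting the .ii file produced by the first command is a simple way to see how much text a single #include adds before compilation even starts.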

How Source Files Import and Export Symbols

C++ source code files import and export symbols via declaration and definition.

A declaration tells the compiler that a symbol (variable, function, or class) exists and what its type is. No storage is allocated at this stage. Declarations usually live in header files (.h), which are included by every source file (.cpp) that needs to use, that is, import, the symbol.

A definition provides the implementation of the symbol, so this is where storage is allocated or code is generated. Definitions usually live in a source file (.cpp) that includes the matching header at the top; the object file built from that source file exports the symbol.

Figure 1 shows C++ source code to explain importing and exporting symbols. To distinguish overloaded functions, C++ compilers perform symbol mangling (also called name mangling), which encodes a function's parameter types and enclosing scope into its symbol name. On Unix-like systems, the command nm -C program_name.o lists the symbols an object file imports and exports, with -C demangling the names into readable form.

Figure 1: Symbol import and export
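
The figure's image is not reproduced in this text, so the following is a hypothetical sketch of the idea rather than the original figure. The file names geometry.h, geometry.cpp, and report.cpp and every identifier are assumptions; the three files are shown in one block, with the #include lines commented out, so that the block also compiles as a single file.

```cpp
// ---- geometry.h (hypothetical header): declarations only ----
int area(int width, int height);   // function declaration
int area(int side);                // overload; name mangling keeps the two apart
extern int shapes_created;         // variable declaration, no storage allocated

// ---- geometry.cpp: definitions, exported by geometry.o ----
// #include "geometry.h"
int shapes_created = 0;            // storage is allocated here

int area(int width, int height) {
    ++shapes_created;
    return width * height;
}

int area(int side) {
    return area(side, side);       // forwards to the two-argument overload
}

// ---- report.cpp: only uses the symbols, so report.o imports them ----
// #include "geometry.h"
int total_area() {
    return area(2, 3) + area(4);
}
```

If the files were compiled separately, nm -C report.o would be expected to list both area overloads as undefined (marked U) and total_area() as defined (marked T), while nm -C geometry.o would list the area overloads as defined. Without -C, GCC and Clang on Unix-like systems show mangled names such as _Z4areaii for area(int, int).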

Managing Header Files and Guards

Including the same header file more than once in a single source file, often indirectly through other headers, pastes its contents in repeatedly during preprocessing. Any class or other definition it contains is then redefined, and the compiler reports an error during compilation. Header guards (also called include guards) prevent this: the directives #ifndef, #define, and #endif wrap the header so that its contents are processed only once per source file. The syntax is shown in Figure 2.

Figure 2: Header guard syntax in C++
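
The figure's image is not reproduced in this text; a minimal sketch of the guard pattern might look like the following. The header name point.h, the guard macro POINT_H, and the Point type are hypothetical. The second half repeats the header text to simulate a second #include of the same file.

```cpp
// ---- point.h (hypothetical header) ----
#ifndef POINT_H   // true only the first time this text is processed
#define POINT_H   // mark the header as already included

struct Point {
    int x;
    int y;
};

// Sum of the absolute values of both coordinates.
inline int manhattan_length(const Point& p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}

#endif // POINT_H

// ---- second inclusion, simulated by repeating the header text ----
#ifndef POINT_H   // false now, so everything up to #endif is skipped
#define POINT_H
struct Point { int x; int y; };   // without the guard: a redefinition error
#endif // POINT_H
```

The guard macro only needs to be unique within the project, which is why it is commonly derived from the file name.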

Pass by Value and Constness in C++ Functions

In C++, declaring a function parameter with the const keyword protects it from modification: the function cannot alter the parameter's value inside its body, and the compiler rejects any attempt to do so.
Passing a parameter by value also leaves the caller's original untouched. The function receives a copy of the argument, and any change made to that copy is visible only inside the function body.
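
A short sketch of both ideas follows; the function names are made up for illustration.

```cpp
// Pass by value: `count` is a private copy of the caller's argument.
int increment_copy(int count) {
    count += 1;          // changes the copy only
    return count;
}

// const parameter: the body may read `value` but not modify it.
int doubled(const int value) {
    // value *= 2;       // would not compile: assignment to a const parameter
    return value * 2;
}

// Shows that the caller's variable survives the call unchanged.
int caller_value_after_call() {
    int original = 10;
    increment_copy(original);   // return value ignored on purpose
    return original;            // the caller's variable was never touched
}
```

Here caller_value_after_call() returns 10 even though increment_copy() added 1 to its own copy.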

The Role of the Preprocessor in C++ Compilation

In C++, every line that begins with a ‘#’ sign is a preprocessor directive. These directives instruct the preprocessor to perform file inclusion, macro expansion, and conditional compilation, which help organize code and tailor it to different build configurations.
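
The three kinds of directive can be sketched in a few lines. The macro names MAX_RETRIES, SQUARE, and ENABLE_VERBOSE and both functions are hypothetical; ENABLE_VERBOSE is assumed to be left undefined unless passed on the command line, for example with -DENABLE_VERBOSE under GCC or Clang.

```cpp
#include <string>               // file inclusion

#define MAX_RETRIES 3           // object-like macro
#define SQUARE(x) ((x) * (x))   // function-like macro; parentheses guard precedence

// Conditional compilation: only one of the two definitions reaches the compiler.
#ifdef ENABLE_VERBOSE
inline std::string mode() { return "verbose"; }
#else
inline std::string mode() { return "quiet"; }
#endif

// Macro expansion: the preprocessor rewrites the body to
// ((3 + 1) * (3 + 1)) before the compiler sees it.
inline int retry_budget() {
    return SQUARE(MAX_RETRIES + 1);
}
```

Dropping the inner parentheses from SQUARE would expand the same call to 3 + 1 * 3 + 1, which is why function-like macros are usually parenthesized this way.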

Symbol Management and Linking in C++

The linking process combines object files and resolves symbols across different translation units. Every name in C++ has one of three kinds of linkage:

  1. Internal - Visible only within the current translation unit
  2. External - Visible to other translation units
  3. No linkage - Visible only within the scope where it is declared, such as a local variable

The extern keyword is often used to declare a symbol with external linkage, so that any file in the project can access it. A symbol declared extern must be defined in exactly one of the project's source files. At namespace scope, the const keyword gives a variable internal linkage by default, as does the static keyword.
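
The three kinds of linkage might appear together as in the sketch below. The file name limiter.cpp and all identifiers are hypothetical.

```cpp
// ---- limiter.cpp (hypothetical translation unit) ----
int request_count = 0;          // external linkage: other files may declare
                                //   extern int request_count;  and use it
const int kMaxRequests = 2;     // internal linkage: namespace-scope const
static int rejected = 0;        // internal linkage: static at namespace scope

bool handle_request() {
    int next = request_count + 1;   // local variable: no linkage
    if (next > kMaxRequests) {
        ++rejected;
        return false;
    }
    request_count = next;
    return true;
}

// External linkage: the only way other files can read `rejected`.
int rejected_requests() {
    return rejected;
}
```

Because kMaxRequests and rejected have internal linkage, another source file can define its own variables with the same names without causing a link error.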

Best Practices for Header File Management

  1. Use forward declarations to reduce compile-time dependencies
  2. Split declarations into multiple purpose-specific header files to save build time and increase modularity
  3. Use include guards to prevent repeated definitions when a header is included more than once
  4. Include dependencies explicitly: each file should include the headers it uses directly rather than rely on headers pulled in by other headers
  5. Make each header self-contained so that the order of header inclusion in a source file does not matter
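
The first practice can be sketched as follows. The file names car.h, engine.h, and car.cpp and the Car and Engine classes are hypothetical; the files are shown in one block, with the #include lines commented out, so that the block also compiles as a single file.

```cpp
// ---- car.h (hypothetical header) ----
#ifndef CAR_H
#define CAR_H

class Engine;   // forward declaration: enough for pointers and references

class Car {
public:
    explicit Car(const Engine* engine) : engine_(engine) {}
    int power() const;   // defined in car.cpp, where Engine is complete
private:
    const Engine* engine_;
};

#endif // CAR_H

// ---- engine.h (hypothetical header) ----
#ifndef ENGINE_H
#define ENGINE_H

class Engine {
public:
    explicit Engine(int horsepower) : horsepower_(horsepower) {}
    int horsepower() const { return horsepower_; }
private:
    int horsepower_;
};

#endif // ENGINE_H

// ---- car.cpp ----
// #include "car.h"
// #include "engine.h"   // the full definition is needed only here
int Car::power() const {
    return engine_->horsepower();
}
```

With this layout, files that include only car.h do not need to be recompiled when engine.h changes, since car.h never includes it.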