One man's look at C and C++

This page is a collection of incomplete notes on C and C++ by Dan Polansky, to complement other wiki resources. It is published with the hope that someone else will find some of the notes useful as well.

For comprehensive coverage, there are:
 * C Programming: a decent if imperfect book
 * C++ Programming: a book that would need a lot of work to be really useful

More resources to learn from are in the Further reading section.

C-style casting
At a glance:

See also C_Programming/Simple_math.

Links:
 * 3.12 Type Casts, The GNU C Reference Manual, gnu.org

C++-style casting
At a glance:

Above, const cast and dynamic cast are not covered.

Const cast is used to remove constness and more.

Dynamic cast is for pointers and references to instances of polymorphic classes. It makes it possible to treat an object that is, on the face of it, available as a broad class as an instance of a narrower class. A class is polymorphic if it has at least one virtual method. A failed dynamic cast of a pointer returns null; a failed dynamic cast of a reference throws std::bad_cast.

See also C++_Programming/Programming_Languages/C++/Code/Statements/Variables/Type_Casting.

Links:
 * Casting Operators, microsoft.com
 * Working Draft, Standard for Programming Language C++, 28 February 2011, open-std.org, search for "5.4 Explicit type conversion (cast notation)"

Zero initialization
What follows is tentative. Some variables and attributes are automatically initialized to zero. These are above all those with static storage duration. These objects are stored neither on the stack nor on the heap. Static storage duration is not to be confused with static variable, declared so using the keyword static; a non-static global variable has static storage duration. Static storage duration stands in contrast with automatic storage duration (stack) and dynamic storage duration (heap).

Objects with static storage duration include the following:
 * Global variables
 * Static variables outside of functions, global to a compilation unit
 * Static variables inside of functions
 * Static class attributes

The above may hold only for plain old data (POD); to be clarified.

The relevant part of the C++03 standard is section 3.6.2 Initialization of non-local objects.

Links:
 * Working Draft, Standard for Programming Language C++, 28 February 2011, open-std.org, search for "3.6.2 Initialization of non-local variables"
 * C++, wikipedia.org
 * zero initialization and static initialization of local scope static variable, stackoverflow.com
 * Storage Duration, cs.kent.edu/~durand

Plain old data
As a first approximation, plain old data (POD) is C data as opposed to C++ data. This includes ints, floats, pointers, C structs, and arrays. An array of non-POD types is not plain old data. However, certain classes are plain old data as well. A struct or class that has a programmer-defined constructor, a destructor, or a virtual function is not plain old data; some other cases are not plain old data as well. C++11 offers the std::is_pod template to find out, requiring #include <type_traits>; the template has been deprecated since C++20.

Links:
 * C++ Language Note: POD Types by Walter E. Brown, 1999, fnal.gov
 * Working Draft, Standard for Programming Language C++, 28 February 2011, open-std.org, search for "9 Classes", and there paragraph 9 starting with "is a class that is both a trivial class and a standard-layout class"
 * What are POD types in C++?, stackoverflow.com

Constructor call order
In sum, the body of the base class constructor is executed first. A constructor of a derived class calls the constructors of its base classes before it starts executing its own body.

Thus, if class AppleTree derives from Tree, which derives from Plant, which derives from Thing, the order of executing the class constructor bodies upon instantiation of AppleTree will be this:
 * Thing
 * Plant
 * Tree
 * AppleTree

Note that the constructor invocation order is the opposite of the above. Thus, a contrast needs to be made between constructor invocation order and constructor body execution order. AppleTree is invoked first, and the very first thing it does is invoke Tree constructor, which invokes Plant constructor as the very first thing, which invokes Thing constructor as the first thing. Thing constructor handles its initializer list and executes its body, and then returns control to Plant constructor. Plant constructor handles its initializer list and then its body, and returns control to Tree constructor; etc.

For multiple inheritance, the picture is a little more complex: the multiple base classes are taken in the order of derivation, from left to right.

Links:
 * Order of calling constructors/destructors in inheritance, stackoverflow.com
 * Constructors (C++) # Order of Construction, Visual Studio 2015, msdn.microsoft.com
 * Order of constructor & destructor calls in Bruce Eckel, Thinking in C++, linuxtopia.org

Template compilation
Template definitions have to be included (usually via #include) in the compilation units that use them, and thus are often placed into header files. The result may be that the compiler creates a duplication of template executable code in .o files of the separate compilation units that use a particular template instantiation, a duplication that is only removed by the linker. Some compiler suites use compilation strategies to avoid that duplication.

A technique called explicit instantiation can be used by the programmer to eliminate that duplication, generally resulting in reduced compilation time. It consists in marking one or more compilation units for inclusion of a particular template instantiation while marking other compilation units that use that instantiation for exclusion and mere referencing of the code of that instantiation. The latter is ensured by adding the keyword extern to the explicit instantiation syntax. Thus, the compilation units in which an instantiation is marked for exclusion depend on other compilation units for this to work at the linking time.

Possible good location for this material: C++ Programming/Templates.

Links:
 * An Introduction to C++ Templates by Kais Dukes, 2002, codeguru.com
 * Template compilation models in Bruce Eckel, Thinking in C++, linuxtopia.org
 * Working Draft, Standard for Programming Language C++, 28 February 2011, open-std.org - section 14.7.2 Explicit instantiation
 * Moving Templates Out of Header Files, 1997, drdobbs.com
 * 7.5 Where's the Template? in Using the GNU Compiler Collection (GCC), gcc.gnu.org
 * Explicit Instantiation, Visual Studio 2015, msdn.microsoft.com
 * Chapter 7 Compiling Templates in Sun Studio 12: C++ User's Guide, docs.oracle.com
 * C++
 * Template (C++)

Interpreter
There exist approximate interpreters of C and C++, allowing interactive entry as if in a command shell.

Cint is one such interpreter, licensed under the X11/MIT license. It can currently be downloaded from hanno.jp. Limitations of Cint are documented at cern.ch.

To use Cint, run it, and on its command line enter . (period) and press enter. Thereafter, you can have an interactive session:

 cint.exe> mystr = "Hello" // ; is optional and the type is auto
 (const char* 0x3ecdb8)"Hello"
 cint.exe> mystr[0]
 (char 72)'H'
 cint.exe> *(mystr+1)
 (char 101)'e'
 cint.exe> sizeof(mystr)
 (const int)4
 cint.exe> float pi = 3.141592
 cint.exe> *(int*)&pi
 (int)1078530008
 cint.exe> printf("pi cast %x\n", *(int*)&pi)
 pi cast 40490fd8
 (const int)17
 cint.exe> puts("Hey")
 Hey
 (const int)0
 cint.exe> int myarray[] = {1,2,3}
 cint.exe> myarray[1]
 (int)2
 cint.exe> int nested[2][2] = {{1,2},{3,4}}
 cint.exe> nested[1][1]
 (int)4
 cint.exe> for (int i=0;i<2;i++) { printf("%i\n",i); }
 0
 1
 cint.exe> sin(4.0) // Some math from math.h but not all of it
 (const double)(-7.56802495307928200e-001)
 cint.exe> strlen("abc") // Some string functions
 (const unsigned int)3

Links:
 * download location, hanno.jp
 * Cint limitations, cern.ch

Naming conventions
Different naming conventions for variables, types, etc. are used by different organizations.

See also C++ Programming/Weblinks.

Links:
 * GCC Coding Conventions, gcc.gnu.org
 * Google C++ Style Guide, google.github.io
 * LLVM Coding Standards, llvm.org
 * Joint Strike Fighter Air Vehicle C++ Coding Standards, stroustrup.com
 * Mozilla Coding Style, developer.mozilla.org

Brace conventions
Some put the opening brace ({) on a separate line, some don't:
 * The K&R style seems to place the opening brace on a separate line for functions but not for control structures: Indentation style.
 * GCC Coding Conventions do not seem to be explicit about this, but code examples they provide place the opening brace of a function definition on a separate line. An example file is fixed-bit.c at github.com.
 * Google C++ Style Guide says that "The open curly brace is always on the end of the last line of the function declaration, not the start of the next line."
 * LLVM Coding Standards do not seem to be explicit about this, but code examples they provide do not place the opening brace of a function definition on a separate line. Example file: LLLexer.cpp at github.com.
 * Joint Strike Fighter Air Vehicle C++ Coding Standards place the braces on separate lines as per AV Rule 60.
 * Mozilla Coding Style indicates that Firefox uses Google Coding style for C++ code. As an example, nsBrowserApp.cpp at dxr.mozilla.org uses multiple styles.
 * Linux code base: fork.c at github.com places the opening brace on a new line for functions but on the same line for control structures such as if.
 * Python C implementation: ceval.c at github.com seems to use the K&R style: opening brace on a new line for functions but not for control structures. This seems to match PEP 7 (Style Guide for C Code).
 * C Style and Coding Standards for SunOS from 1993 at cis.upenn.edu seem to use the K&R style.

Links:
 * See the links in the Naming conventions section.
 * Indentation style

Arrays vs. pointers
Arrays and pointers can be used interchangeably in some contexts but not in all contexts.

In a variable declaration (global, local or local static), there is a marked difference in the memory allocated. Thus, if one declares e.g. "int *ip", the memory required for the pointer is allocated (e.g. 4 or 8 bytes); by contrast, e.g. "int ia[1000]" leads to an allocation of memory for all 1000 items, e.g. 4000 bytes. (The exact number of bytes depends on the targeted platform and compiler.)

In the use of the square bracket operator ([]), a pointer variable and an array variable seem to be interchangeable. Thus, if we have declarations "int *ip" and "int ia[1000]", and we assign "ip = &ia[0]" (let the pointer point to the first element of the array), it holds "ip[1] == ia[1]" and the syntax is valid. Moreover, there is a referential identity: "&ip[1] == &ia[1]".

In the use of the dereference operator (*), a pointer variable and an array variable seem to be interchangeable as well. Thus, if we have declarations "int *ip" and "int ia[1000]", and we assign "ip = &ia[0]", it holds "*ip == *ia"; we can treat the array variable name as a pointer to the first element of the array.

Above, we assigned "ip = &ia[0]", but we could have as well assigned "ip = ia".

Links:
 * C Programming/Pointers and arrays
 * C Pointers and Arrays, w3schools.com
 * c++ - What is array-to-pointer conversion aka. decay?, stackoverflow.com

Multidimensional arrays
Statically dimensioned multidimensional arrays appear intuitive enough:

 int intarr[2][5];
 for (int rowno=0; rowno < 2; rowno++)
   for (int colno=0; colno < 5; colno++) {
     intarr[rowno][colno] = rowno * 10 + colno;
     printf("%i ", *(*(intarr + rowno) + colno));
   }
 printf("\n");

The declaration "intarr[2][5]" reserves a consecutive area of bytes of the length 2 * 5 * sizeof(int). The way to read this declaration is this:
 * A 2-element array of 5-element arrays of ints.

The above reading is confirmed by the following decay:

 int intarr[2][5];
 int (*intap)[5]; // Pointer to array of 5 elements.
 intap = intarr;  // No compilation warning in gcc

One may work with dynamically allocated multidimensional quasi-arrays, declared and element-accessed as follows:

 int **p;
 ...
 p[rowno][colno] = 42;

However, these have a very different layout in memory. Each row is a contiguous block of bytes representing the array values; however, the core data block of the quasi-array is a block of pointers to rows.

If one wishes to have a dynamically allocated/sized multidimensional array that is contiguous in memory, one can calculate the cell index manually:

 int rowcount = 2;
 int colcount = 5;
 int *p = malloc(rowcount * colcount * sizeof(int));
 ...
 p[colno + colcount * rowno] = 42;

Whether the above is a common idiom is unclear.

Links:
 * Multidimensional Arrays (GNU C Language Manual), gnu.org

C type declarations
The syntax of C type declarations leaves a lot to be desired, as recognized by designers of some of the languages that came later.

For example, take this:
 * int *a, b;

One may wonder why this is not usually written as "int* a, b", that is, why the asterisk (*) type information is visually attached to the variable name. But there is a good heuristic reason: it is only a that is a pointer to int, while b is an int.

Some combinations of pointer and arrays:
 * int *a1[10];    // A 10-element array of pointers to int
 * int (*a2)[10];  // A pointer to a 10-element array of ints

Further reading:
 * typedef - How do you read C declarations?, stackoverflow.com

C increment and decrement
There are preincrement and postincrement operators: "++i" and "i++". In some contexts, they are equivalent to "i += 1".

One classical idiomatic use is in loop control:
 for (int i=0; i < 10; i++)
   printf("%i", i);

Another idiomatic use is this, where p and q are pointers:
 * *p++ = *q++

It copies the value pointed to by q to the location pointed to by p, and then increments both pointers.

A similar case obtains for decrement: "--i" and "i--".

Links:
 * 3.3 Incrementing and Decrementing, gnu.org

C memory management
Functions involved in dynamic memory allocation include malloc, calloc, and realloc; the sizeof operator is typically used to calculate the sizes passed to them. The function used for deallocation is free.

Example 1:
 * int *array = malloc(100 * sizeof(int));

The above can be interpreted as follows: allocate a space for an array of int-typed values, for 100 elements.

Example 2:
 * char **arguments = (char **) malloc(argc * sizeof(char*));

The above can be interpreted as follows: allocate a space for an array of pointers to C strings by considering the array length (argc) and the array element size (the size of a pointer to char).

Example 1 does not cast the result of malloc, unlike example 2. Casting the result is not necessary in C, but is necessary in C++.

For more, see the links below.

References:

Loops (C)
Loops supported by C by example:

Inclusion
The preprocessor directive for inclusion of header files (or other files) is "#include".

A classic example that includes standard input-output functions, e.g. printf:
 * #include <stdio.h>

Example with project-specific header file:
 * #include "myproject.h"

The choice of the angle-bracket form (<stdio.h>) vs. the quoted form ("myproject.h") impacts the preprocessor's search strategy/algorithm for the files to be included. The precise strategy/algorithm is implementation-defined, as per the C standard.

The usual practice is to include standard library header files via the angle-bracket form, while including project-specific header files via the quoted form.

A classic C++ example:
 * #include <iostream>

There is no .h extension, since the header file itself has no such extension.

For gcc, one can obtain the include path via "echo | gcc -E -Wp,-v -", but it is to be clarified how include path relates to the algorithms used for the two forms, the angle-bracket form and the quoted form.

Links:
 * , wikipedia.org
 * C Programming/Preprocessor directives and macros
 * c++ - What is the difference between #include <filename> and #include "filename"?, stackoverflow.com
 * When can you omit the file extension in an #include directive?, stackoverflow.com
 * Search Path (The C Preprocessor), gcc.gnu.org
 * Source file inclusion, cppreference.com

References (C++)
C++ references are like pointers but with key differences:
 * One does not need to dereference the reference using the asterisk operator (*) or the arrow operator (->) to access its value.
 * A C++ reference cannot be declared without being bound/initialized, except as a parameter in a function declaration, where the binding is guaranteed to take place upon function call.
 * Once a C++ reference is bound/assigned, it cannot be rebound/reassigned.

For more, the linked Wikipedia article is relatively in-depth, although not authoritative.

A glance:

Links:
 * , wikipedia.org
 * References, C++ FAQ, isocpp.org
 * Reference declaration, en.cppreference.com
 * Reference initialization, en.cppreference.com

Trying it online
One can write, compile and run small C programs online.

Links:
 * C (gcc) – Try It Online, tio.run
 * C (clang) – Try It Online, tio.run

MinGW on Windows
If one wants to use free-as-in-freedom tooling on Windows, there is MinGW (Minimalist GNU for Windows), based on gcc, the GNU compiler suite. MinGW is an alternative to Cygwin.

Links:
 * , wikipedia.org
 * mingw, osdn.net
 * mingw, sourceforge.net -- is moving to osdn.net, but as of 1 Mar 2024, there is still a lot of download activity from the files section

Compilation with gcc
A quick cheat sheet on compilation with gcc (from Wikibooks):
 * gcc test.c
   Compiles test.c, links it with requisite libraries, and creates a resulting executable, which is a.out or a.exe.
 * gcc test.c -o test
   Compiles test.c, links it with requisite libraries, and creates test executable.
 * gcc -c mod.c -o mod.o
   Compiles mod.c into mod.o object file. Linking is prevented by -c option.
 * gcc test.c -o test mod.o
   Compiles test.c, links it with mod.o, and creates test executable as a result.
 * g++ test.cpp -o test
   Compiles and links C++ source code.

Links:
 * Guide to Unix/Commands/SW Development
 * GCC online documentation, gcc.gnu.org
 * GCC and Make - Compiling, Linking and Building C/C++ Applications, ntu.edu.sg

Debugging with gdb
(In part from Wikibooks) Part of the GNU project, gdb helps debug C/C++ programs and those in some other languages. A program to be debugged needs to be compiled with the -g flag. Then one can start gdb via "gdb myprog", which enters the gdb shell without running the program. Once in the gdb shell, one can control gdb by typing commands:
 * break main
   Sets a breakpoint at function main.
 * b main
   Shorthand for the above. In general, prefixes of commands can be used as long as they are unambiguous.
 * break MySource.c:145
   Sets a breakpoint at line 145 of source code MySource.c.
 * break MySource.c:145 if myvar > 3
   Sets a conditional breakpoint.
 * watch myvar
   Creates a watchpoint, which stops the execution when the variable or expression is changed.
 * watch myvar if myvar > 3
   Creates a conditional watchpoint.
 * info break
   Lists breakpoints.
 * info watch
   Lists watchpoints.
 * run
   Runs the program. Initially, the program is not running. A program can be run multiple times from the same gdb session.
 * continue
   Continues execution until the next breakpoint or until the end.
 * step
   Executes one step of the program, entering into function calls if applicable.
 * next
   Executes one step of the program without nesting into function calls.
 * quit
   Quits the debugger.
 * print myvar
   Outputs the value of the variable.
 * print (myvar * 3) << 4
   Outputs the value of an expression that it calculates.
 * disp myvar
   Adds the variable to a list of expressions to be output on each step.
 * set myvar = 1
   Changes the value of a variable.
 * set myvar = 1 << 4
   Changes the value of a variable, supporting evaluation of an expression.
 * where
   Outputs the call stack.
 * help breakpoints
   Outputs help on the subject of breakpoints, including commands that deal with breakpoints.

Links:
 * Guide to Unix/Commands/SW Development
 * gdb, freebsd.org
 * Debugging with GDB, sourceware.org
 * GDB wiki, sourceware.org
 * GDB Quick Reference, csl.mtu.edu
 * GDB to LLDB Command Map, lldb.llvm.org
 * GNU Debugger, en.wikipedia.org

Dev-C++ on Windows
Dev-C++ is a graphical environment for Windows that is a frontend for the underlying compilers and debuggers, coming bundled with MinGW compiler suite and GDB debugger.

Links:
 * , wikipedia.org
 * Home - Dev-C++ Official Website, bloodshed.net

Software written in C
The following are some of the notable examples of software written in C (not in C++). Software is counted as written in C even if a relatively small portion of it is written in another language, e.g. C++, assembly, Unix shell.

List:
 * Unix
 * Linux kernel
 * Many of the GNU tools used with Linux kernel
 * Windows kernel
 * OS X kernel
 * Git, the file versioning system
 * CPython, the standard implementation of Python
 * Ruby
 * Doom
 * Gcc prior to 2012
 * GTK, a free GUI toolkit
 * GIMP, a free raster image manipulation program (think Photoshop)

One can obtain a longer list e.g. by querying Github, which shows for each project which languages it is written in.

Links:
 * What are some known programs written in C?, quora.com
 * Wikipedia: Category:Free software programmed in C, wikipedia.org

Software written in C++
Bjarne Stroustrup (the author of C++) has an extensive list of examples. Let us randomly pick some from Bjarne:
 * Adobe Illustrator and multiple other Adobe products
 * Firefox browser
 * Chrome browser
 * Parts of Google web search engine
 * MySQL database engine
 * Qt graphical toolkit
 * KDE desktop for Linux (uses Qt)
 * OpenOffice/LibreOffice document suites
 * Doom III engine
 * Clang and LLVM

Other:
 * SAP Basis is probably written in C++ since it uses regex from C++ Boost library; verification/tracing pending. (SAP R/3 and SAP ERP applications are written in ABAP; we are talking about what the engine that runs ABAP is written in.)
 * Core of Java OpenJDK

Links:
 * C++ Applications, stroustrup.com

Competition
Some of the competition of C and C++ in the same performance league:
 * Rust
 * Carbon, a new (2022) language from Google intended to replace C++ but built for interoperability with it.
 * Pascal, in which key software for the Mac was written in the early days, and which survived for some time as Delphi.
 * Fortran, apparently still used in scientific computing

If one does not need that performance league but is still performance conscious, one may switch to languages running on managed just-in-time compiling runtimes such as Java (on JVM) and C# (on .NET).

If one is even less performance conscious, accepts lack of explicit typing, loves skipping the compilation cycle, batteries included and ease of installing other batteries (libraries/modules/packages), one can go for the popular Python.

Popularity
C and C++ have been among the top languages in the Tiobe index for a long time. If Tiobe counted them together as one bucket, they would have been number one.

Links:
 * TIOBE Index, tiobe.com

Criticism of C++
C++ is not without critics. One classic example is a rant by Linus Torvalds, the creator of the Linux kernel, given in the links (warning, profanity ahead).

Points of criticism:
 * Complexity in terms of number of language features and subfeatures, especially as contrasted to C, a near-subset of it. It is hard to be confident that one knows the complete standard C++ language and the standard library.
 * C++ makes it much harder to see what calls and the volume of calls lie behind an innocent-looking construction, e.g. variable initialization. For C, it is generally much easier to imagine what sort of assembly is produced by the compiler.
 * Long compilation times for some features.
 * Hard-to-decode compilation error messages for some features.

Links:
 * Linus Torvalds on C++, harmful.cat-v.org
 * , wikipedia.org
 * , wikipedia.org

Gcc having switched to C++
Despite the criticism noted in section Criticism of C++, gcc--the GNU compiler suite--switched from C to C++ as its implementation language. One can peruse the links below for a rationale.

Google is not switching away from C++ to C, but rather spearheaded a replacement language Carbon.

Links:
 * GCC's move to C++, 13 March 2013, lwn.net
 * Write gcc in C++, Ian Lance Taylor, Google, 17 Jun 2008

Name mangling in C++
A C++ compiler typically produces mangled names/labels based on names/identifiers used in the human-legible source code. When one disassembles an object file created by a C++ compiler, one typically sees the mangled names unless one instructs the disassembler to show the unmangled names.

See also Visual C++ name mangling.

Links:
 * , wikipedia.org
 * GCC C++ Name mangling reference, stackoverflow.com

Traps and oddities
What follows are traps/oddities/pitfalls that may trick the unwary, to be expanded.

1) In "int *p, q", q is declared to be an int, not a pointer to int.

2) In "if (i = 0)", an inexperienced author could have meant "if (i == 0)".

3) In "if (flags & FLAG != 0)", the operator "!=" has higher precedence than "&", so the condition is parsed as "flags & (FLAG != 0)", possibly tricking the unwary.

4) One sometimes forgets to use "break" in a switch statement; without a break, there is a fallthrough. (Arguably, this is a design/usability defect, which some languages correct by introducing a fallthrough keyword.)

5) One sometimes forgets to use parentheses in a macro definition to ensure the intended evaluation order.

6) One may wrongly think of a macro as a function and forget that use of a macro can evaluate an argument twice, leading to a double side effect, e.g. in invocation of max(i++, 10).

Links:
 * C programming Traps and Gotchas - Hackerspace / Software Development, talk.dallasmakerspace.org
 * C Traps and Pitfalls* by Andrew Koenig, cs.tufts.edu
 * My takeways from "C Traps and Pitfalls" by Tomek Osinski, osinstom.github.io

Standards
Both C and C++ are standardized by ISO. One has to know which C++ standard one's project uses. One has to be conscious of the differences and understand that not every project uses the latest C++ version/standard. While this pertains to C in principle as well, it is much more true of C++, since the C standard underwent only a few minor changes over the decades.

Links:
 * History of C++, en.cppreference.com
 * Programming languages — C Committee Draft — April 12, 2011 (n1570), open-std.org
 * Working Draft, Standard for Programming Language C++ (n324), 28 February 2011, open-std.org
 * Working Draft, Standard for Programming Language C++ (n4849), 2020-01-14, open-std.org