
Internet of Tomohiro



Introduction to GCC and the C language for Nim users

This article explains GCC and part of the C programming language for Nim users who want to use C libraries with Nim. There are tools that help import C libraries into Nim.

Some knowledge of C and GCC may help you when you use these tools. How GCC works is relevant to how Nim uses C libraries, because Nim generates C code and calls a C compiler to build an executable file.

If you want to learn more about GCC, please read the GCC online documentation.

If you want to learn more about ld (the linker used by GCC in most environments), which is part of binutils, please read Linker (ld) in Documentation for binutils.

Hello world

This is hello world code in C:

#include <stdio.h>

int main() {
  printf("Hello world\n");
  
  return 0;
}

#include is a preprocessor directive in C. Lines that start with # are not comments but preprocessor directives, which are processed by the preprocessor before compilation. stdio.h is the name of a header file in the C standard library, and #include <stdio.h> reads that file and inserts its content at that point. You need to include stdio.h when you use the printf function.

int main() { ... } defines the function main. The main function is the first function called when the program runs. Unlike in Nim or Python, code that is executed at runtime must be written inside functions. printf is a function in the C standard library that writes text to standard output.

Save the above code to a file named hello.c. The following command compiles it with GCC and produces an executable file: a.out on Linux, a.exe on Windows.

On Linux:

$ gcc hello.c
$ ./a.out
Hello world

On Windows:

>gcc hello.c

>a.exe
Hello world

The executable file name is a.out or a.exe by default. You can specify the executable file name with the -o <filename> option:

On Linux:

$ gcc -o hello hello.c
$ ./hello
Hello world

On Windows:

>gcc -o hello hello.c

>hello.exe
Hello world

On Windows, gcc automatically adds the .exe extension to the output executable file name if you pass a name without an extension to -o.

Define a function

C libraries consist of multiple functions. Let's define a simple C function.

#include <stdio.h>

int square(int x) {
  return x * x;
}

int main() {
  int sq = square(3);
  
  printf("3 * 3 = %d\n", sq);
  
  return 0;
}

int square(int x) is a function that takes one int value and returns an int value. Unlike in Nim, the return type is written before the function name and each parameter type is written before the parameter name. In the main function, square is called with the int literal 3 and the return value is stored in the local variable sq. Unlike in Nim, the variable type is written before the variable name. Then the value of sq is output with the printf function. The value of sq is converted to the string "9", and %d in the string literal given to printf is replaced with that string.

Save this code as test.c, then compile and run it:

$ gcc -o test test.c
$ ./test
3 * 3 = 9

Multiple files

Many C projects consist of multiple *.c files. Here I move the square function into a new file, square.c.

square.c:

int square(int x) {
  return x * x;
}

test.c:

#include <stdio.h>

int main() {
  int sq = square(3);
  
  printf("3 * 3 = %d\n", sq);
  
  return 0;
}

Unlike with Nim, you need to specify all *.c files when compiling multiple *.c files. Compilation succeeds but shows a warning (GCC 14 and later report this as an error by default):

$ gcc -o test test.c square.c
test.c: In function ‘main’:
test.c:4:12: warning: implicit declaration of function ‘square’ [-Wimplicit-function-declaration]
    4 |   int sq = square(3);
      |            ^~~~~~
$ ./test
3 * 3 = 9

Declaring a function that is defined in another *.c file in each *.c file that uses it fixes the warning:

square.c:

int square(int x) {
  return x * x;
}

test.c:

#include <stdio.h>

// Declare square to tell types of arguments and return type to the compiler.
int square(int x);

int main() {
  int sq = square(3);
  
  printf("3 * 3 = %d\n", sq);
  
  return 0;
}

Now they compile without a warning:

$ gcc -o test test.c square.c
$ ./test
3 * 3 = 9

But why does using a function from another *.c file without declaring it cause a warning? Because without a declaration the compiler does not know the parameter types and return type, so it cannot detect a call with the wrong arguments.

For example:

square.c:

int square(int x) {
  return x * x;
}

test.c:

#include <stdio.h>

int main() {
  // What if you forget to add an argument?
  int sq = square();
  
  printf("3 * 3 = %d\n", sq);
  
  return 0;
}

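In this case GCC shows a warning but no compile error, and the resulting executable file prints the wrong output. The printed value is unpredictable, because calling a function with a missing argument is undefined behavior.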

$ gcc -o test test.c square.c
test.c: In function ‘main’:
test.c:6:12: warning: implicit declaration of function ‘square’ [-Wimplicit-function-declaration]
    6 |   int sq = square();
      |            ^~~~~~
$ ./test
3 * 3 = 1

Define and use struct type

The next example defines a new type Vector3 and a function vector3Dot that uses the Vector3 type in vector3.c. testvec3.c uses Vector3 variables and calls the vector3Dot function. Vector3 is like an object type in Nim with three float32 fields x, y, and z:

vector3.c:

typedef struct {
  float x, y, z;
} Vector3;

float vector3Dot(Vector3 v0, Vector3 v1) {
  return v0.x * v1.x + v0.y * v1.y + v0.z * v1.z;
}

testvec3.c:

#include <stdio.h>

float vector3Dot(Vector3 v0, Vector3 v1);

int main() {
  Vector3 v0 = {-1.0f, 0.0f, 1.0f};
  Vector3 v1 = {0.0f, 1.0f, 2.0f};
  
  printf("%f\n", vector3Dot(v0, v1));
  
  return 0;
}

Compiling them results in errors:

$ gcc -o testvec3 testvec3.c vector3.c
testvec3.c:3:18: error: unknown type name ‘Vector3’
    3 | float vector3Dot(Vector3 v0, Vector3 v1);
      |                  ^~~~~~~
testvec3.c:3:30: error: unknown type name ‘Vector3’
    3 | float vector3Dot(Vector3 v0, Vector3 v1);
      |                              ^~~~~~~

When you use the Vector3 type in other *.c files, it also needs to be defined in every *.c file that uses it.

testvec3.c:

#include <stdio.h>

typedef struct {
  float x, y, z;
} Vector3;

float vector3Dot(Vector3 v0, Vector3 v1);

int main() {
  Vector3 v0 = {-1.0f, 0.0f, 1.0f};
  Vector3 v1 = {0.0f, 1.0f, 2.0f};
  
  printf("%f\n", vector3Dot(v0, v1));
  
  return 0;
}

Now, it compiles and works.

$ ./testvec3
2.000000

Write and use header file

But when you have many *.c files that use this type and function, do you have to copy the declarations into every *.c file? In C, header files solve that problem.

vector3.h:

// This #ifndef ... is an include guard
#ifndef VECTOR3_H
#define VECTOR3_H

typedef struct {
  float x, y, z;
} Vector3;

float vector3Dot(Vector3 v0, Vector3 v1);

#endif

vector3.c:

#include "vector3.h"

float vector3Dot(Vector3 v0, Vector3 v1) {
  return v0.x * v1.x + v0.y * v1.y + v0.z * v1.z;
}

testvec3.c:

#include <stdio.h>
#include "vector3.h"

int main() {
  Vector3 v0 = {-1.0f, 0.0f, 1.0f};
  Vector3 v1 = {0.0f, 1.0f, 2.0f};
  
  printf("%f\n", vector3Dot(v0, v1));
  
  return 0;
}

#include "filename.h" inserts the content of filename.h at that point.

#include "filename.h" searches for the file relative to the directory of the current file first. #include <filename.h> is used to include a header from the standard library or from the system include directories.

#ifndef VECTOR3_H
#define VECTOR3_H

...

#endif

The lines above in vector3.h form an include guard, which prevents a header file from being included multiple times. Suppose header files X.h and Y.h both include Z.h, and foo.c includes both X.h and Y.h. If Z.h had no include guard, its content would be included twice in foo.c, which causes redefinition errors.

#pragma once is also used for the same purpose. It is not part of the C standard, but most compilers, including GCC, support it.

Usually you don't need to change the GCC command options when you use a header file.

$ gcc -o testvec3 testvec3.c vector3.c
$ ./testvec3
2.000000

But if a C file includes header files from another directory, you need to add the -I/path/to/directory option:

$ gcc -I/path/to/directory -o testvec3 testvec3.c vector3.c

Compiling C files separately

So far, the examples have been compiled with a single GCC call, but in most C projects GCC is called once per *.c file. If there are many C files, compiling all of them takes a long time, and you would not want to do that every time you fix a compile error. You can save time by calling GCC for each C file to generate an object file. An object file contains the machine code output by the compiler. Once all C files have been compiled to object files, call GCC again to link the object files into an executable file or library.

Then, when you change a line of code in one C file, you recompile only that file and link the new object file with the existing object files. Tools like make (driven by a Makefile) automatically detect which C files need to be recompiled by comparing the timestamp of each C file with that of the corresponding object file. When a C file needs to be recompiled, the tool calls GCC to compile it and links the new object file with the existing ones to generate the executable file or library.

Here, let's compile each C file manually to see how it works:

$ gcc -c -o vector3.o vector3.c
$ gcc -c -o testvec3.o testvec3.c
$ gcc -o testvec3 testvec3.o vector3.o
$ ./testvec3
2.000000

The -c option asks GCC to compile the source file but not link it. GCC then generates an object file with the name given by the -o option. The last GCC command links the object files (testvec3.o and vector3.o) and generates the executable file testvec3. Object files usually have the .o extension; MS Visual Studio uses the .obj extension.

When compiling the simple one-line Nim program echo "Hello", Nim calls GCC for each generated C file:

$ nim c -r --listcmd hello.nim
Hint: used config file '/etc/nim/nim.cfg' [Conf]
Hint: used config file '/etc/nim/config.nims' [Conf]
.........................................................
CC: stdlib_digitsutils.nim: x86_64-pc-linux-gnu-gcc -c  -w -fmax-errors=3   -I/usr/lib/nim -I/tmp/tmp/testc -o /tmp/nimcache/d/hello/stdlib_digitsutils.nim.c.o /tmp/nimcache/d/hello/stdlib_digitsutils.nim.c
CC: stdlib_dollars.nim: x86_64-pc-linux-gnu-gcc -c  -w -fmax-errors=3   -I/usr/lib/nim -I/tmp/tmp/testc -o /tmp/nimcache/d/hello/stdlib_dollars.nim.c.o /tmp/nimcache/d/hello/stdlib_dollars.nim.c
CC: stdlib_io.nim: x86_64-pc-linux-gnu-gcc -c  -w -fmax-errors=3   -I/usr/lib/nim -I/tmp/tmp/testc -o /tmp/nimcache/d/hello/stdlib_io.nim.c.o /tmp/nimcache/d/hello/stdlib_io.nim.c
CC: stdlib_system.nim: x86_64-pc-linux-gnu-gcc -c  -w -fmax-errors=3   -I/usr/lib/nim -I/tmp/tmp/testc -o /tmp/nimcache/d/hello/stdlib_system.nim.c.o /tmp/nimcache/d/hello/stdlib_system.nim.c
CC: hello.nim: x86_64-pc-linux-gnu-gcc -c  -w -fmax-errors=3   -I/usr/lib/nim -I/tmp/tmp/testc -o /tmp/nimcache/d/hello/@mhello.nim.c.o /tmp/nimcache/d/hello/@mhello.nim.c
Hint: x86_64-pc-linux-gnu-gcc   -o /tmp/tmp/testc/hello  /tmp/nimcache/d/hello/stdlib_digitsutils.nim.c.o /tmp/nimcache/d/hello/stdlib_dollars.nim.c.o /tmp/nimcache/d/hello/stdlib_io.nim.c.o /tmp/nimcache/d/hello/stdlib_system.nim.c.o /tmp/nimcache/d/hello/@mhello.nim.c.o    -ldl [Link]
Hint: gc: refc; opt: none (DEBUG BUILD, `-d:release` generates faster code)
26628 lines; 0.781s; 31.645MiB peakmem; proj: /tmp/tmp/testc/hello.nim; out: /tmp/tmp/testc/hello [SuccessX]
Hint: /tmp/tmp/testc/hello  [Exec]
Hello

The --listcmd option above shows the GCC commands that Nim calls to compile the C files. x86_64-pc-linux-gnu-gcc in the output is the name of my GCC executable. Nim implicitly imports the system module, and the system module imports other modules (digitsutils, dollars, io). Each module is compiled by Nim into a C file, and each generated C file is compiled by GCC into an object file.

If you want to know what the GCC options in the above output mean, look them up in the GCC online documentation.

Then I changed echo "Hello" to echo "Hello Nim!" and compiled again:

$ nim c -r --listcmd hello.nim
Hint: used config file '/etc/nim/nim.cfg' [Conf]
Hint: used config file '/etc/nim/config.nims' [Conf]
.........................................................
CC: hello.nim: x86_64-pc-linux-gnu-gcc -c  -w -fmax-errors=3 -I/usr/lib/nim -I/tmp/tmp/testc -o /tmp/nimcache/d/hello/@mhello.nim.c.o /tmp/nimcache/d/hello/@mhello.nim.c
Hint: x86_64-pc-linux-gnu-gcc -o /tmp/tmp/testc/hello /tmp/nimcache/d/hello/stdlib_digitsutils.nim.c.o /tmp/nimcache/d/hello/stdlib_dollars.nim.c.o /tmp/nimcache/d/hello/stdlib_io.nim.c.o /tmp/nimcache/d/hello/stdlib_system.nim.c.o /tmp/nimcache/d/hello/@mhello.nim.c.o -ldl [Link]
Hint: gc: refc; opt: none (DEBUG BUILD, `-d:release` generates faster code)
26628 lines; 0.374s; 31.598MiB peakmem; proj: /tmp/tmp/testc/hello.nim; out: /tmp/tmp/testc/hello [SuccessX]
Hint: /tmp/tmp/testc/hello  [Exec]
Hello Nim!

This time, the C files corresponding to the system module and the other stdlib modules were not compiled by GCC again; only @mhello.nim.c was compiled.

Opaque type

If you want to hide all members of a struct type from other C files, you can use an opaque struct type. Then only one specific C file can access the members of the struct type, and changing its members does not require changing user code. Opaque types are often used in C libraries.

vector3.h:

#ifndef VECTOR3_H
#define VECTOR3_H

// Forward declaration of Vector3
typedef struct Vector3 Vector3;

// createVector3 and freeVector3 functions are added
// because other c files cannot declare Vector3 variable.
Vector3* createVector3(float x, float y, float z);
void freeVector3(Vector3* v);

float vector3Dot(const Vector3* v0, const Vector3* v1);

#endif

vector3.c:

#include <stdlib.h>
#include "vector3.h"

// Members of Vector3 are defined only in this c file.
struct Vector3{
  float x, y, z;
};

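Vector3* createVector3(float x, float y, float z) {
  Vector3* ret = malloc(sizeof(Vector3));
  // malloc returns NULL when the allocation fails; pass that on to the caller.
  if (ret == NULL) {
    return NULL;
  }
  ret->x = x;
  ret->y = y;
  ret->z = z;
  
  return ret;
}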

void freeVector3(Vector3* v) {
  free(v);
}

float vector3Dot(const Vector3* v0, const Vector3* v1) {
  return v0->x * v1->x + v0->y * v1->y + v0->z * v1->z;
}

testvec3.c:

#include <stdio.h>
#include "vector3.h"

int main() {
  // You cannot declare Vector3 variable but can declare pointer to Vector3.
  Vector3* v0 = createVector3(-1.0f, 0.0f, 1.0f);
  Vector3* v1 = createVector3(0.0f, 1.0f, 2.0f);
  
  printf("%f\n", vector3Dot(v0, v1));
  
  freeVector3(v1);
  freeVector3(v0);
  
  return 0;
}

Compile them:

$ gcc -c -o testvec3.o testvec3.c
$ gcc -c -o vector3.o vector3.c
$ gcc -o testvec3 testvec3.o vector3.o
$ ./testvec3
2.000000

In vector3.h, the Vector3 type is only forward declared; its members are defined only in vector3.c. So testvec3.c can neither access the members of Vector3 nor declare a variable of type Vector3, because it knows nothing about the members. testvec3.c can only declare pointers to Vector3 and use functions that take or return a pointer to Vector3. vector3.c therefore needs to provide all the functions required to use the Vector3 type without accessing its members.


by Tomohiro
