section7_pointers

Fábio Gaspar edited this page Jan 17, 2019 · 1 revision

Introduction to pointers

A pointer is a variable. What makes it special is that its value is a memory address.

...+---+---+---+---+---+---+---+---+...
   |   |   |   |   |   |   |   |   |
   | P |   |   |   | A |   |   |   |
   |   |   |   |   |   |   |   |   |
...+-+-+---+---+---+-+-+---+---+---+...
     |               ^
     |               |
     +---------------+

The diagram above illustrates an example. The main box is an abstraction of RAM, and each cell represents a unit of memory. The box labeled P is a pointer. Its value is the memory address where the variable A is stored. Assume, for instance, that the variable A is located at address 0x100; then the value of the variable P is 0x100. It's worth highlighting that P is also a variable, like any other! It just has a special meaning and properties.

Thus, a pointer is really simple, despite being extremely useful and powerful.

Pointer declaration

To declare a variable you must specify a type and a name. The same holds for a pointer, with a small addition.

int *ptr;

The asterisk preceding the variable name is what makes it a pointer. This notation is intended as a mnemonic. As we will see next, * is also the dereferencing operator. In other words, using the first example, dereferencing the variable P gives you access to the variable A, which P points to. Thus, the declaration int *ptr can be read this way: *ptr is an int. Dereferencing the pointer yields an integer value.

For this reason, the style int *ptr makes a bit more sense than other styles such as int* ptr or int * ptr. All are valid, but the first is the one that makes the most sense.

Moreover, remember that you can declare several variables in a single statement. The example below might give you some insight into why putting the asterisk right before the variable name makes more sense.

int* a, b, c;

The code above might give the impression that int* is a data type that applies to every variable in the statement, which is not true! The variable a is a pointer to int, BUT the variables b and c are plain ints!

int *a, b, c, *d;

This style makes much more sense: we clearly see that a and d are pointers and the remaining variables are integers.

Getting the addresses of variables

So, you know how to declare a pointer. But how do you initialize it? Given the purpose of pointer variables, we need a mechanism to get the addresses of variables. For that, you use the & operator.

int a;

int *ptr = &a;

In the example above, ptr is now referencing the variable a.

Dereferencing pointers

We already gave a short introduction to dereferencing with the * operator. In the example below, which continues the one above, we modify the variable a through the pointer ptr. Dereferencing lets you not only read the value but also update it by assigning a new one. Both cases are covered below.

int a = 10, *ptr = &a;

/* printing the initial value through variable 'a' and pointer 'ptr' */
printf("Initial value: %d\n", a);
printf("Initial value through pointer: %d\n", *ptr);

/* updating the value of 'a' through 'ptr' */
*ptr = 20;

printf("New value: %d\n", a);
printf("New value accessed through pointer: %d\n", *ptr);

The output is:

Initial value: 10
Initial value through pointer: 10
New value: 20
New value accessed through pointer: 20

Accessing invalid memory addresses

One problem with pointers in C programs arises when you dereference a pointer that points to an invalid memory location. This commonly happens when you forget to initialize a pointer, when you iterate over an array through a pointer and go past its bounds (arrays are covered in the next section), and in other situations.

An invalid access typically results in a segmentation fault (the SIGSEGV signal). When a process attempts to access memory it does not own, the hardware detects the access and the operating system kills the process (your program stops running).

Let's change the previous example and leave ptr uninitialized. The value of ptr is then garbage (strictly speaking, using it is undefined behavior in C), and it is very unlikely to be a valid memory address for the program.

int a = 10, *ptr;

/* printing the initial value through variable 'a' and pointer 'ptr' */
printf("Initial value: %d\n", a);
printf("Initial value through pointer: %d\n", *ptr);

And the result is a beautiful segmentation fault!

Initial value: 10
Segmentation fault (core dumped)

This is a common problem, and it is sometimes difficult to debug. The lazy, classic way of debugging in C is to add printf statements to isolate the origin of the problem. In a simple program like the one above, there is only one statement where the segmentation fault can occur, but in larger projects printf statements can help locate the source of the problem.

However, there are much more elegant tools for debugging memory problems. One of the best known is valgrind. For the program above, compile with the -g flag, which adds debugging information to the executable, and then run the executable under valgrind.

gcc -g demo.c
valgrind ./a.out

This is the generated output:

==8533== Memcheck, a memory error detector
==8533== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==8533== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==8533== Command: ./a.out
==8533==
Initial value: 10
==8533== Use of uninitialised value of size 8
==8533==    at 0x109162: main (demo.c:8)
==8533==
==8533== Invalid read of size 4
==8533==    at 0x109162: main (demo.c:8)
==8533==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==8533==
==8533==
==8533== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==8533==  Access not within mapped region at address 0x0
==8533==    at 0x109162: main (demo.c:8)
==8533==  If you believe this happened as a result of a stack
==8533==  overflow in your program's main thread (unlikely but
==8533==  possible), you can try to increase the size of the
==8533==  main thread stack using the --main-stacksize= flag.
==8533==  The main thread stack size used in this run was 8388608.
==8533==
==8533== HEAP SUMMARY:
==8533==     in use at exit: 0 bytes in 0 blocks
==8533==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==8533==
==8533== All heap blocks were freed -- no leaks are possible
==8533==
==8533== For counts of detected and suppressed errors, rerun with: -v
==8533== Use --track-origins=yes to see where uninitialised values come from
==8533== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

Notice how it tells you explicitly where the problem originates (demo.c:8)! Especially when you use dynamic memory, which hasn't been covered yet, this is a must-use tool!