section9_introdunction_to_structs
Assume you want to store and manage data describing persons. You are interested in each person's age, name, and social identifier. Declaring separate variables for those properties for each person is obviously impractical. You are probably thinking of using arrays, and that is a good initial thought. However, a single array can't store values of mixed types.
A possible workaround is defining three arrays:
- One for storing ages
- One for storing names
- One for storing the social identifier
Moreover, you can enforce that a person A is stored at the same index in each array. If person A is at index 2, then you can get the age, name and id from the arrays at index 2. In the example below: ages[2], names[2] and social_id[2].
unsigned int ages[10];
char names[10][50];
unsigned long social_id[10];
While this is a possible solution, it is not the most practical, especially as the number of properties describing the entity grows.
This kind of problem is common, and fortunately C has a proper way to deal with it.
In C you can declare a structure, which is simply a list of declarations enclosed in braces. As a result, you can group different types of data in a single variable.
struct Person {
unsigned int age;
char name[50];
unsigned long social_id;
};
In the code above, you can see a structure definition. It creates a structure called Person, and every struct Person has the members age, name and social_id.
More generically, the structure syntax is:
struct <structure tag> {
...
};
You use the struct keyword, followed by an optional name, the structure tag, which is useful for re-using the kind of structure being declared. Finally, between braces, you list the variables, which are called members.
With a structure declaration we are defining a new type, something we haven't done before! It's important to understand that we haven't yet declared a variable of type struct Person; we are just defining the type, the template or shape of the structure.
Once you define the schema of a structure and provide a structure tag, you can declare variables that follow the structure shape.
For example:
struct Person p;
The variable p follows the Person schema, therefore it has the members age, name and social_id.
You can also declare a structure and immediately declare variables after the closing brace. The example below declares the Person structure as before, but also declares two variables, Maria and Joao.
struct Person {
unsigned int age;
char name[50];
unsigned long social_id;
} Maria, Joao;
An automatic structure (more commonly called an untagged or anonymous structure) is a structure declaration that doesn't have a structure tag. For instance, if you know up front that you only need a few variables following some schema and re-usability is not a concern, you can omit the structure tag and declare the variables right after the closing brace.
struct {
int x;
int y;
} p1, p2, p3;
In the example, the structure declaration doesn't have a tag, therefore I can't refer to it later. However, three variables, p1, p2 and p3, are declared, and each one has the members x and y.
Typically, when you define a new structure type you want that definition to be visible everywhere. In such cases, you declare the structure outside any function or in a header file, which we haven't covered yet.
You can also declare structures inside functions, but then the definition is only visible inside the function where the declaration is present. This is useful for automatic structures.
int main() {
struct Person {
unsigned int age;
char name[50];
unsigned long social_id;
} Maria, Joao;
}
void foo() {
struct Person p; /* error: the definition of struct Person is not visible here */
}
In this example, the program won't compile because the Person type is not known inside the foo function. Therefore, the structure definition should be placed outside any function so that every function after it can see it.
struct Person {
unsigned int age;
char name[50];
unsigned long social_id;
};
int main() {
struct Person Maria, Joao;
}
void foo() {
struct Person p;
}
We have seen how to declare structures. In this section we cover how to initialize structures with constant values, pretty much like you saw with arrays.
Considering the automatic structure for storing point coordinates, you can use constant initialization as follows:
struct {
int x;
int y;
} pt = {1,-5};
This is also possible when declaring tagged structures.
struct Person {
unsigned int age;
char name[50];
unsigned long social_id;
};
struct Person p = {20, "Helena", 123467};
Some final notes regarding constant struct initialization:
- The values must appear in the same order as the structure members are declared.
- You don't have to initialize ALL members. You can initialize only the first N members. In ANSI C (C89) you can't initialize members by name; that flexibility only arrived with the designated initializers of C99. Notice that if you only initialize the first N members, the remaining ones, if any, are automatically initialized with zero values. These zero values have different practical results for different data types.
Data type | Default value |
---|---|
Integers | 0 |
Floating | 0.0 |
Pointers | NULL |
Chars | '\0' |
Note: If you don't use constant initialization, the members of a local variable aren't initialized with zero values. They contain junk.
At this point, you should know the different ways to declare and initialize structures. The next step is to understand how you can read and write structure members. That is possible with the . operator.
struct {
int x;
int y;
} pt;
pt.x = 5;
pt.y = 0;
printf("x: %d \t y: %d\n", pt.x, pt.y);
Inside a structure you can have members of any type, including other structures you define.
Let's say we want to represent a rectangle with two point coordinates. The points are opposite corners, which is enough to know the height, width and position in the Cartesian plane. We can re-use the coordinate structure shown previously.
struct Point {
int x, y;
};
struct Rectangle {
struct Point p1, p2;
};
struct Rectangle rect;
rect.p1.x = 0;
rect.p1.y = 0;
rect.p2.x = 5;
rect.p2.y = 5;
The rect variable is of type Rectangle, which has two members: p1 and p2. Both are of type Point, which also has two members: the integers x and y.
You can access the points' individual coordinates by chaining the . operator. You take the rect variable and access the point p1 or p2, for instance rect.p1. The result is a structure of type Point, which has members of its own, so you apply the operator again: rect.p1.x.
Just like any other type in C, you can have pointers to structures.
struct Point {
int x, y;
};
struct Point p1;
struct Point *ptr = &p1;
The variable ptr is a pointer to struct Point. Recall that the * operator dereferences the pointer variable, thus in this case *ptr is, for practical purposes, a struct Point structure. Therefore, in order to access the members through a pointer you can write (*ptr).x. The parentheses are required because the precedence of the structure member operator . is higher than that of the pointer dereference operator *. Writing *ptr.x would mean that ptr is a structure, that you are accessing its member x, and that x is a pointer you are dereferencing. The example below shows a case where that reading is the intended one.
#include <stdio.h>
#include <stddef.h>
int main() {
int demo = 10;
struct {
int *ptr;
struct {
char a;
int b;
} hello_darkness_my_old_friend;
} spaghetti = {NULL, {'A', 123}};
spaghetti.ptr = &demo;
printf("%d\n", *spaghetti.ptr);
}
10
Dereferencing pointers to structures in order to access members is very common, and the syntax presented above is not the cleanest. Thankfully, C has an operator to make this task easier, ->. The following statements are equivalent.
struct Point *pt;
(*pt).x;
pt->x;
Both . and -> associate from left to right. Moreover, these two operators, together with () for function calls and [] for subscripting, are at the top of the precedence hierarchy. The example below is a common mistake. You might think you are incrementing the pt pointer, but instead you are incrementing the x member. The two statements are equivalent.
++pt->x;
++(pt->x);
If you do want to increment the pointer before accessing the member, the correct way is (++pt)->x.
The following are legal operations with structures:
- Copying and assigning as a unit
- Taking the address with the & operator
- Accessing members
Accessing members and taking the address were covered previously. Thus, the final topic to cover is structure copies.
struct Student {
char name[50];
char school[100];
int age;
};
struct Student S1 = {"Beatriz Pinto", "Faculdade Engenharia Universidade do Porto", 21};
In the example a simple structure Student
is declared and a variable of that type is initialized.
If you declare a new variable and assign S1 to it, you are creating a copy of S1. That means that when you edit S1 you aren't affecting the new variable. This is not new, it works exactly as it does for the primitive types covered before, but it's worth mentioning again, especially if you have experience with other languages where assignment often copies a reference and not the value.
struct Student S2 = S1;
printf("Student 1 Name: %s\n", S1.name);
printf("Student 2 Name: %s\n", S2.name);
strcpy(S1.name, "Francisco"); /* arrays can't be assigned with =; strcpy requires <string.h> */
printf("Student 1 Name: %s\n", S1.name);
printf("Student 2 Name: %s\n", S2.name);
Just like primitive types in C, structures are passed by value in function arguments. That means that if you want a function to modify a structure passed as an argument, you are forced to use pointers.
In fact, structures should almost always be passed by pointer. If they are big, creating a copy of the structure for the function call might hurt performance. Therefore, you generally see functions expecting pointers to structures and not the structures themselves. If you are writing a function that receives a structure through a pointer but doesn't modify it, it's always good practice to add the const qualifier.
You can also return a structure declared inside a function: a copy of it is created and returned. This means you can have functions designed for instantiating structures, which is handy. Notice that in this case you can't return a pointer to the local structure, unless you use dynamic memory allocation.
The following example creates a structure inside the function and returns its address. You might think this is a good practice for the sake of performance. Why create a copy when the structure already exists, right? However, don't forget that once the function returns, all automatic data created inside of it, its arguments, and other details not relevant here, are destroyed! As a result, the address you are returning will be pointing to junk, and there's a high chance you get an invalid memory access error.
#include <stdio.h>
struct Demo {
int a, b, c;
};
struct Demo* create_demo(int a, int b, int c) {
struct Demo d = {a, b, c};
return &d;
}
int main() {
struct Demo *d = create_demo(1, 245, -100);
printf("%d %d %d\n", d->a, d->b, d->c);
}
The example above has undefined behavior and will most likely result in a segmentation fault, the error you know and love!
Soon...