Erin's CS stuff

C-style Strings

We use C-style strings to store language.

What is language? What makes "pizza" different from 🍕 or even "PZA"? What's the difference between "Erin Keith" and "EK"?

If we wanted to store a user's first and last initials in our program, we could use an array of type char and size 2, because each initial is exactly one letter. On the other hand, if we wanted to store a user's first and last names, how many characters would we need? Turns out, there's not really a way to know in advance!

Syntax

Syntactically, C-style strings are just character arrays with one very important difference: they must include a null character [1]! The null character is written with an escape sequence: \0. It signifies the end of a string, since we cannot know in advance how long it will be.

Speaking of which, since we cannot know how many letters are in a name or phrase, the accepted approach is to declare a character array of an arbitrarily large size, as in the declaration below.

#define STRING_CAPACITY 100

int main(){
	char c_style_string[STRING_CAPACITY];	// room for up to 99 letters plus the null character

	return 0;
}

Unlike other arrays, C-style strings can be incorporated directly into Formatted IO using a %s conversion specifier. When reading a C-style string this way, scanf does not need the & before the array name (I'll explain why soon!).


Outside of Formatted IO, accessing the letter elements of a C-style string works the same as with regular arrays, with one exception: the control condition. We still use an array's bestie, the for loop, but instead of stopping at the size (or capacity), we stop at the null character \0, which marks the end of the C-style string, as demonstrated below.

for(int index = 0; c_style_string[index] != '\0'; index++){
  // loop body
}

It is very important to note that, syntactically, C-style strings are just a special case (or subset) of regular char arrays, the only difference being the use of the null character \0 to signify the end of the string within the array.

Behavior

The biggest difference between C-style strings and regular char arrays is at an abstract level. For example, C-style strings would be appropriate for storing the make of a vehicle like Honda or Ford, since those are names and have different lengths. However, they would be the incorrect tool for storing a license plate number like 123-HUL, since that's more like a code with letters in it and should always have the same number of characters.

Additionally, while Array Behavior also applies to C-style strings, there are some differences to cover.

scanf

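When used with %s, the scanf function skips any leading blank space and then stops at the first "blank space" character (space, tab, or endline) it encounters. This means it only allows us to store one word at a time. The \0 (null character) is automatically stored after the last non-blank character read. Be careful: scanf does not know the capacity of the array, so a word longer than the capacity will be written past the end of the array unless a maximum width is given, as in %99s.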

fgets

The fgets function stops after the first endline character it encounters. This means it allows us to store multiple words at a time. Since it also takes the capacity as an argument, it reads at most capacity - 1 characters, and the \0 (null character) is automatically stored after the endline character or in the last element, whichever comes first. Note that when the endline character is read, it is stored in the array too. There are some additional differences in syntax that we'll go over in class, but you can get a head start here: https://en.cppreference.com/w/c/io/fgets.

Each of these tools works differently, so it's important to choose the right one for the job!


Creating New Strings

Finally, it's crucial to keep track of the null character \0 when you're creating your own C-style strings. For example, when creating a copy of a string in another character array, you'll have to add the null character \0 to the new string in the program after all of the characters have been copied.

Notes

1. Not to be confused with the null pointer or the endline character \n.