String types
Strings in Zig are slices “over” arrays of characters. As we know from Arrays and Slices, slices don’t own the data - they point to the data in an array.
So, strings, just like any other slice, are pointers. To be more precise, in Zig, all slices are a kind of pointer called a fat pointer; they contain the address of the first element and the number of elements from that address onward. We access that element count in Zig using .len.
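To make the fat pointer idea concrete, here is a minimal sketch that reads both halves of a slice. The helper name describe is made up for this illustration; .ptr and .len are the standard slice fields.

```zig
const std = @import("std");

// Hypothetical helper: prints the two parts of a fat pointer.
fn describe(s: []const u8) void {
    // .ptr is the address of the first element, .len is the element count.
    std.debug.print("first byte: {c}, len: {}\n", .{ s.ptr[0], s.len });
}

pub fn main() void {
    const s_hello: []const u8 = "Hi everyone!";
    describe(s_hello);
    // A sub-slice shares the same memory: only the start and the count change.
    describe(s_hello[3..]);
}
```

The second call shows that slicing copies no bytes; it only produces a new address and a new count.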
Byte slice string
A moment ago, we saw a string declared like this:
    const s_hello: []const u8 = "Hi everyone!";
As its type says, it’s a slice of bytes. In memory, it looks like this:
    index | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11  | 12
    byte  | 'H' | 'i' | ' ' | 'e' | 'v' | 'e' | 'r' | 'y' | 'o' | 'n' | 'e' | '!' | 0
    fig. Byte slice
It has .len, and Zig stores a null terminator (0) in memory right after the last byte. That null is not part of the slice, so we can’t reach it through the slice: any index beyond .len - 1 is out of bounds, even if all we want is to compare against 0 in a while loop to find where the string ends.
string_literal.zig
    const std = @import("std");
    const print = std.debug.print;

    pub fn main() void {
        const s_a: []const u8 = "Hi everyone!";
        print("\nLength: {}\n", .{s_a.len});
        var n_i: usize = 0;
        var c_x: u8 = s_a[n_i];
        // compare if the char is null
        while (c_x != 0) : ({
            n_i += 1;
            c_x = s_a[n_i];
        }) {
            print("{c}", .{c_x});
        }
        print("\n", .{});
    }

    $ zig run string_literal.zig

    Length: 12
    Hi everyone!thread 705195 panic: index out of bounds: index 12, len 12
    string_literal.zig:12:18: 0x113e93e in main (string_literal.zig)
            c_x = s_a[n_i];
                     ^
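For contrast, a plain byte slice can be walked safely by letting .len set the limit instead of looking for a null. This is a minimal sketch of two ways to do that, reusing the variable names from the example above:

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    const s_a: []const u8 = "Hi everyone!";

    // A for loop visits exactly s_a.len bytes and never touches the terminator.
    for (s_a) |c_x| {
        print("{c}", .{c_x});
    }
    print("\n", .{});

    // The same walk with an explicit index, bounded by .len instead of a null check.
    var n_i: usize = 0;
    while (n_i < s_a.len) : (n_i += 1) {
        print("{c}", .{s_a[n_i]});
    }
    print("\n", .{});
}
```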
Byte slice string with sentinel
However, if we define a string like this:
    const s_hello: [:0]const u8 = "Hi everyone!";
That one is a slice with a sentinel. Zig also adds a null at the end here, but this time the type records it, so we can read that position and check for it with a while loop.
    index | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11  | 12
    byte  | 'H' | 'i' | ' ' | 'e' | 'v' | 'e' | 'r' | 'y' | 'o' | 'n' | 'e' | '!' | 0
It has .len, and we can also check if a byte is null.
string_literal_sentinel.zig
    const std = @import("std");
    const print = std.debug.print;

    pub fn main() void {
        const s_a: [:0]const u8 = "Hi everyone!";
        print("\nLength: {}\n", .{s_a.len});
        var n_i: usize = 0;
        var c_x: u8 = s_a[n_i];
        // compare if the char is null
        while (c_x != 0) : ({
            n_i += 1;
            c_x = s_a[n_i];
        }) {
            print("{c}", .{c_x});
        }
        print("\n", .{});
    }

    $ zig run string_literal_sentinel.zig

    Length: 12
    Hi everyone!
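The difference from the plain slice comes down to one extra legal index. The following sketch reads the sentinel directly and then coerces the value to a plain byte slice; the name s_plain is only for illustration.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    const s_a: [:0]const u8 = "Hi everyone!";

    // Index .len is allowed on a sentinel-terminated slice and holds the sentinel.
    print("byte at index {}: {}\n", .{ s_a.len, s_a[s_a.len] });

    // The sentinel version coerces to a plain byte slice; the length stays the same.
    const s_plain: []const u8 = s_a;
    print("plain length: {}\n", .{s_plain.len});
}
```

The coercion only goes one way without extra work: a plain []const u8 carries no promise that a null follows its last byte.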
Sentinel-terminated pointer string
If we define a string like this:
    const s_hello: [*:0]const u8 = "Hi everyone!";
We’re declaring a pointer to bytes terminated with a sentinel. The only way to detect the end of the string is by checking where the null is. There’s no .len in this case, because this isn’t a slice at all: it’s a many-item pointer that carries only an address. This kind of string is useful when interacting with C functions that expect null-terminated strings.
    index | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11  | 12
    byte  | 'H' | 'i' | ' ' | 'e' | 'v' | 'e' | 'r' | 'y' | 'o' | 'n' | 'e' | '!' | 0
string_ptr.zig
    const std = @import("std");
    const print = std.debug.print;

    pub fn main() void {
        const s_a: [*:0]const u8 = "Hi everyone!";
        var n_i: usize = 0;
        var c_x: u8 = s_a[n_i];
        // compare if the char is null
        while (c_x != 0) : ({
            n_i += 1;
            c_x = s_a[n_i];
        }) {
            print("{c}", .{c_x});
        }
        // accessing .len causes a compile error
        // print("\nLength: {}\n", .{s_a.len});
        print("\nLength: {}\n", .{n_i});
    }

    $ zig run string_ptr.zig

    Hi everyone!
    Length: 12
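The manual loop above is what the standard library does for us. As a short sketch, std.mem.len counts the bytes before the sentinel and std.mem.span turns the pointer into a slice that has .len again; the name s_b is only for illustration.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    const s_a: [*:0]const u8 = "Hi everyone!";

    // std.mem.len walks the bytes until it finds the sentinel.
    print("Length: {}\n", .{std.mem.len(s_a)});

    // std.mem.span does the same walk and returns a slice that carries .len.
    const s_b: [:0]const u8 = std.mem.span(s_a);
    print("{s} ({} bytes)\n", .{ s_b, s_b.len });
}
```

Both calls scan the whole string, so when the length is needed more than once it is usually worth keeping the slice around.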
Array “string”
An array can also be considered a string in a way, since it’s a sequence of characters. However, in Zig, it’s not treated as a string in the strict sense, because it’s not null-terminated, unless you explicitly add the null yourself.
    const s_hello = [_]u8{ 'H', 'i', ' ', 'e', 'v', 'e', 'r', 'y', 'o', 'n', 'e', '!' };
    index | 0   | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11
    byte  | 'H' | 'i' | ' ' | 'e' | 'v' | 'e' | 'r' | 'y' | 'o' | 'n' | 'e' | '!'
It has .len, but no null at the end. Still, you can build a slice over that array, because, as we already know, a slice is like a window covering part or all of an array.
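As a small sketch of that window idea, the code below builds one slice over the whole array and one over part of it. The names s_all and s_part are made up for the example.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    const s_hello = [_]u8{ 'H', 'i', ' ', 'e', 'v', 'e', 'r', 'y', 'o', 'n', 'e', '!' };

    // A slice over the whole array.
    const s_all: []const u8 = &s_hello;
    // A slice over part of it: index 3 up to, but not including, index 11.
    const s_part: []const u8 = s_hello[3..11];

    print("{s} ({} bytes)\n", .{ s_all, s_all.len });
    print("{s} ({} bytes)\n", .{ s_part, s_part.len });
}
```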
So, is there always an array underneath every string?
Yes, in Zig, every string has an array underneath.
Since strings are slices (pointers with a start and length) or null-terminated pointers, they must point to an array, even if the programmer only declares a slice or a pointer:
    const s_hello: [*:0]const u8 = "Hello";
In that case, Zig internally creates an array like:
    const s_hello = [_]u8{ 'H', 'e', 'l', 'l', 'o', 0 };
All these strings we’ve been defining are constants, so Zig knows them at compile time.
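We can ask the compiler about that hidden array. The sketch below checks the type of an unannotated literal at compile time and then dereferences it to get the array itself; the name a_hello is only for illustration.

```zig
const std = @import("std");

pub fn main() void {
    const s_hello = "Hello";

    // With no annotation, a literal is a pointer to a sentinel-terminated array.
    comptime std.debug.assert(@TypeOf(s_hello) == *const [5:0]u8);

    // Dereferencing the pointer gives the array: 5 bytes, with the sentinel after them.
    const a_hello = s_hello.*;
    std.debug.print("{s}: len {}, sentinel {}\n", .{ s_hello, a_hello.len, a_hello[a_hello.len] });
}
```

The slice and pointer types used earlier ([]const u8, [:0]const u8 and [*:0]const u8) are all coercions from this one pointer-to-array type.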
And if you want to define a “variable” string?
Well, you can’t exactly do that.
If you do this:
    var s_x = "Hi everyone!";
And don’t change anything, the compiler will give you an error saying you’re not modifying the declared variable, so you should declare it as a constant.
If you try something like:
    s_x[0] = 'h';
You’ll get an error:
    cannot assign to constant
    s_x[0] = 'h';
    ~~~^~~
Because in reality, as we saw in Arrays and Slices, the underlying array isn’t variable: you’re declaring the slice as a variable, but not its contents.
What you can do is point the same variable at a different array, as long as the length matches (the inferred type of s_x is *const [12:0]u8, so the length is part of the type):
    s_x = "Hello to all";
That, you can do.
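If the goal really is to change a byte, one option is to copy the literal into an array of our own and modify the copy. This is a sketch of that approach, not the only way to do it; the name a_x is made up for the example.

```zig
const std = @import("std");
const print = std.debug.print;

pub fn main() void {
    // s_x is a variable pointer to constant bytes: it can be re-pointed...
    var s_x = "Hi everyone!";
    s_x = "Hello to all"; // same length (12), so the types match
    print("{s}\n", .{s_x});

    // ...but to change a byte we need our own mutable copy of the array.
    var a_x = "Hi everyone!".*;
    a_x[0] = 'h';
    print("{s}\n", .{&a_x});
}
```

Here the .* dereference copies the literal's bytes into a_x, so the write goes to the copy and the literal itself stays constant.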
So, strings in Zig are pointers to arrays of characters, and when declared as literals, they’re known at compile time.