Buffer overflows are bad. We all know that. It's important to make sure that when you are filling a buffer, you don't exceed its size. Some people, especially when looking for a solution that works well with strings, look at strncpy()
and think that since it takes a size_t
parameter that it will do what they want, and their code will be secure because it's protected against buffer overflows. Those people are wrong.
Let's consider a simple program that greets the user with their name, or with a default if no name is supplied:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char name[] = "world";
if (argc == 2) {
strncpy(name, argv[1], sizeof(name));
}
printf("hello, %s\n", name);
}
It seems to work well enough:
$ ./hello
hello, world
$ ./hello Jakob
hello, Jakob
Since the call to strncpy()
passes the size of the destination, there is a guarantee that we don't overflow the buffer. But let's test that:
$ ./hello "Jakob Kaivo"
hello, Jakob
What's going on here? I thought we couldn't overflow the buffer?
True, strncpy()
does prevent you from writing too much to the buffer. But, and this is the thing everyone seems to miss, it does not guarantee that the resulting string will be NULL terminated. So, when our input string is longer than the source buffer, it will just copy the first n
bytes, and now everything that expects a string will happily go reading off into infinity or SIGSEGV
, whichever comes first.
Why doesn't strncpy()
always NULL terminate the destination string? Because, despite the name, it's not actually meant for copying one string to another. It's meant to copy a string into a fixed size buffer, every byte of which needs to be filled. For example, ancient UNIX directory entries allocated exactly 14 bytes for file names, so it made sense when writing a directory entry to use strncpy()
for the representation in memory. The file name isn't a string, it's a fixed width buffer, and all 14 bytes have to be written to disk, so any bytes you aren't actively using need to be set to 0, which is what strncpy()
encounters a NULL terminator before it copies n
bytes. If this is not the actual origin of the function, it's for something remarkably similar.
So how do we fix this? A naive solution is to use tell strncpy()
that your buffer is one byte smaller than it actually is:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char name[] = "world";
if (argc == 2) {
strncpy(name, argv[1], sizeof(name) - 1);
}
printf("hello, %s\n", name);
}
It seems to work:
$ ./hello "Jakob Kaivo"
hello, Jakob
Unfortunately, this only works if the buffer contains a NULL terminator in the final byte to begin with. This is the case when the buffer is declared as an array of unspecified size initialized with a string literal, but falls apart if have something like this:
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char name[5] = "world";
if (argc == 2) {
strncpy(name, argv[1], sizeof(name) - 1);
}
printf("hello, %s\n", name);
}
This initializes name
to the array { 'w', 'o', 'r', 'l', 'd' }
, with a distinct lack of NULL terminator. When we try running it, we get more garbage:
$ ./hello Jakob
hello, JakodH
Similar results can occur if the buffer is not initialized to begin with (often the case with dynamic memory returned by malloc()
).
You could manually terminate the string yourself, but my preferred method is the minimal case of my earlier post: use snprintf()
. Like strncpy()
, snprintf()
takes a size parameter, but unlike strncpy()
, snprintf()
guarantees that the resulting string will be NULL terminated:
#include <stdio.h>
int main(int argc, char *argv[])
{
char name[] = "world";
if (argc == 2) {
snprintf(name, sizeof(name), "%s", argv[1]);
}
printf("hello, %s\n", name);
}
This gives us the results we want:
$ ./hello "Jakob Kaivo"
hello, Jakob
Note: Be sure to use "%s"
as your format string, lest you open yourself up to a format string vulnerability (consider the case where someone runs ./hello %s
).