Then strncpy() will happily 0-fill all of that buffer space, which of course is completely pointless and just a waste of precious cycles. This is because it treats the buffer as a "record", and insists to initialize all of it.
This is the "flip side" of the logic that prevents it from 0-terminating if the string fills the buffer; since the buffer is a record with a known size, and not really a C string, it doesn't need to be terminated, right? Heh.
Also, POSIX reserves the namespace of functions whose names start with "str". I'm not 100% sure how well this applies to C++, but it seems prudent to avoid defining your own function with a name like that.
>> Then strncpy() will happily 0-fill all of that buffer space
Yes. And that feels like a really poor choice of the function name in my opinion. It should have been strcpy_pad_with_zeros() rather than strncpy(). To reflect the fact that it is going to waste precious cache lines and memory writes.
"Stop using strncpy already!"... or you might as well write an article called "Stop using null-terminated strings already!"... or since many commenters here have already pointed that out, and the author is talking about C++ anyway, perhaps "Stop fearing std::string already!"
Here's a better idea: Stop using the str* functions completely. Start using buffers with explicit length fields attached, and start using the mem* functions.
Each has its uses. The standard C and C++ library use C-strings for many things, so if you want to use, say, fopen you'd have to have a conversion routine set up (at additional memory and cpu cost). Just yesterday the ex-protobuf guy explained that in his new format (whose name I forgot) he uses null terminated strings exactly for that, even though he also stores the length of the string already.
But then, it's just C strings vs. Pascal strings all over again. It's your job as an engineer to use the right tool at the right time.
On the other end, it sucks that strlcpy is not as portable as the n counterparts. Last time I checked, the gnu libc still didn't have it for stupid bikeshed reasons.
> the gnu libc still didn't have it for stupid bikeshed reasons
Yep, still doesn't. Not so much for bikeshed reasons, but more for the Gnu people's stupid NIH reasons. OpenBSD offered it to them, reply was along the lines of "we don't see the need for it".
Null terminated strings are one of the worst failures of design in programming languages of all time. In C you're sort of stuck with turd polishing. It's never going to be good or right, it's always going to be partly broken in some way.
If I were writing a new code-base in C I'd standardize on using bstring but that too doesn't solve every problem.
So this safe function cannot be used with dynamically allocated buffers without casting them? That means it takes away the benefit of not passing in the size explicitly. It also does not deal with wide chars.
I think storing sizes explicitly is the better solution. If what you are dealing with are actually character strings, then you might store the byte size along with the char count.
If you are only dealing with ASCII (are you sure?), you could use memcpy and avoid the strncpy zero fill behavior.
char buffer[5] = {0};
strncpy(buffer, "This is a long string", sizeof buffer - sizeof buffer[0]);
(edit: sizeof, not sizeof()! edit2: bugfix!)
But okay, maybe your compiler won't let you do that (it should). Oh. We've already touched on what Microsoft won't let you do .. maybe Microsoft won't let you do that. (In which case the answer should be: don't use Microsoft).
But anyway .. then the author says this:
>We are programmers, are we not? If the functions we are given to deal with strings are difficult to use correctly then we should write new ones.
Umm: NO! Learn to use your tools properly and stop re-inventing the wheel to fit your misunderstanding of the world! BILLIONS of lines of code out there use strncpy() and other n-fn variants, and guess what: even in safety-critical, life-threatening, embedded environments!
This article should really be titled: "If you are going to use strncpy(), and you're scared of it, THEN DON'T DEPLOY WITHOUT FULL CODE COVERAGE TESTING!"
Testing, even full coverage testing (a perversion of a good idea if ever I saw one), cannot and will not catch all errors. It is not a substitute for better programming techniques, and certainly not a reason to avoid using them.
Maybe, but it will catch bugs like this if the test is written for it, and anyone writing C in the 21st century professionally, with safety in mind, is a) writing tests, and b) getting 100% coverage.
There actually is a good reason not to use parens with sizeof, and it is in fact a means of introducing subtle bugs to your code if you don't know the difference:
sizeof some_struct; //computed at *compile* time
sizeof(some_struct); // also computed at *compile* time, but looks like its a runtime call
Anything that looks like something but isn't actually that thing, in my opinion - especially with a language like C - is room for programmer enlightement ..
Just FYI - code:
sizeof buffer - sizeof(buffer[0])
is looking unconventional. Couple of reasons... You generally want to avoid +1 / -1 in your code. You don't want to write sizeof() without parenthesis, - it reduces readability.
I like the consistency of using parenthesis always. Just like some people insist of putting them in logical statements even if they do not affect the order of evaluation in that particular case.
It does matter. sizeof is computed at compile time, it is not computed at runtime. something() is a runtime invocation .. sizeof() 'looks' like that, but isn't.
2. sizeof is not necessarily done at compile time. C99 allows variable-sized automatic arrays, forcing it to store and later look up the value at runtime if you use sizeof.
Except you just failed to null-terminate your string, good job at proving the truth of the article :)
#include <stdio.h>
#include <string.h>
int main(void)
{
char buffer[5] = {0};
strncpy(buffer, "This is a long string", sizeof(buffer));
printf("%s\n", buffer);
}
$ ./test
This ^`gh?
Doesn't it bother you that your 5 character long buffer that is supposed to be terminated by a null happens to have 5 characters in it "[This ]" instead of 4 visible "[This]"?
Your memory just happens to have a null 6 bytes after the pointer.
This is an excellent post to show everyone who says that C is easy. Or that only sucky programmers creates segfaults and all you need is some discipline. Not even very experienced developers can be expected to keep track of all the nuances of the various ??str??cpy functions and their various faults. The solution is to stop using C already (unless you really, really have to).
You know, I can't remember a time, when I've got a segfault just because of poorly written strxxx code. It just doesn't happen this often in real life. In practice, it only happens in relatively complex code that interfaces to other languages, and in these cases usually you are facing a lot more trouble with memory management, so tracking of string sizes is peanuts in this case.
Real danger of C strings is that they create vulnerabilities in the network code. But unfortunately this is precisely the place, where latency is important, where everything is in C and where a solution to stop using C just wouldn't do.
C is easy. Keeping track of str* function nuances is as simple as opening the man page whenever you use a function you haven't used in a while (just a keypress away in vim, as easy as hitting F1 in an IDE). I fully intend to use C more. C makes things like static analysis and valgrind possible; you don't get those with Ruby (which I also use extensively). Loose typing comes with its own set of tricky invisible security issues. Use C for great glory.
The solution is to stop using C already (unless you really, really have to).
Use C for great glory.
Both of you are taking an extreme position.
C isn't very difficult. However, there are a lot of nuances that you have to remember while using it. Mostly this is a matter of experience. It eventually becomes something akin to muscle memory to match every malloc() with a free(), and to always have an acute awareness of buffer sizes.
When speed is critical, using C can often be a decisive advantage. Nothing else can match the raw horsepower that an effective C programmer can bring to bear on a problem. This implies that if you don't know C, then you're handicapping yourself as a developer. Consider that most of the really excellent programmers are also excellent C programmers: cperciva, guido, antirez, tlb, rtm, zed, etc. Either all of the excellent programmers are also delusional for using C, or you're missing something important by not knowing how to be effective with C.
On the flip side, C doesn't make sense for most use cases. For example, writing a distributed system in C when you could use Go, or writing Dropbox in C when you could use Python. Most people underestimate just how fast modern CPUs are, along with underestimating how much speed you can get from effectively profiling your app written in a high-level language like Lisp. Good profiling is more valuable than changing to a lower-level language, and it's also easier in the long run.
I agree totally. My final sentence, referring to great glory, was tongue in cheek. The software I write for Nitrogen Logic includes C backends, Ruby webapps, and C extensions to Ruby, each used where appropriate.
Seems bit odd to recommend his own function when the only complaint about strlcpy is that "it isn’t natively available in VC++.". I don't see how rolling your own is an improvement over battle-tested external function.
After every few days of programming in Go I start to miss C a little bit, probably more out of nostalgia than anything else,- until 5 minutes later I think about something like this.
Am I missing something obvious? I don't profess to know C++, but surely the proposed solution only works when writing to an array? There is no way for the compiler to know how much space has been allocated for arbitrary pointers.
Some malloc implementations do provide functions for determining the size of an allocated block, but that is also very problematic and certainly doesn't work with sizeof.
There's nothing wrong with strncpy if you know what you are doing. The problem is that far too few people know what they are doing and don't how to check if what they're doing is correct or not.
That said, if the article's code would have appeared on Stack Overflow, I wouldn't be able to prevent myself from also saying:
The author's suggested replacement, the C++ template function (!) strcpy_safe(), is not so hot in my opinion.
It uses strncpy(), which is record-oriented and thus will always fill all its n bytes. If you do:
Then strncpy() will happily 0-fill all of that buffer space, which of course is completely pointless and just a waste of precious cycles. This is because it treats the buffer as a "record", and insists to initialize all of it.This is the "flip side" of the logic that prevents it from 0-terminating if the string fills the buffer; since the buffer is a record with a known size, and not really a C string, it doesn't need to be terminated, right? Heh.
Also, POSIX reserves the namespace of functions whose names start with "str". I'm not 100% sure how well this applies to C++, but it seems prudent to avoid defining your own function with a name like that.