
Great article. My personal take on this is that C programs are so damn reliable because there is nothing under the hood: the building blocks are so simple and transparent that you can follow the thread of execution with minimal mental overhead.

That means that when you lay out your program the most important parts (memory map and failure modes) are clearly visible.

IF you are a good programmer.

And that's the reason there is an obfuscated C contest: if a C programmer sets his or her mind on being deliberately hard to understand, that same power can be used against any future reader of the code. Incompetence goes a long way towards explaining some of C's bad reputation. You can write bad code in any language, but none give you as much rope to hang yourself with as C (and of course, C++).



    the building blocks are so simple and transparent
    that you can follow the thread of execution with 
    minimal mental overhead.
I do not agree.

I've seen plenty of code that does weird things with pointers, like passing around a reference to a struct's member, then retrieving the struct by decrementing a value from the pointer and casting. Or XOR-ing pointers in doubly-linked lists for compression. And these are just simple examples.

I've seen code where I was like "WTF was this guy thinking?".

My biggest problem with C is that error handling is non-standard. In case of errors, some functions return 0. Some return -1. Some return a value > 0. Some return a value in an out parameter. Some put an error in errno. Some reset errno on each call. Some do not reset errno.

Also, the Glibc documentation is so incomplete on so many important issues that it isn't even funny.

Yes, kernel hackers can surely write good code after years of experience with buggy code that they had to debug.

But for the rest of the code, written by mere mortals, I basically get a headache every time I have to take a peek at code somebody else wrote.


> I've seen plenty of code where I was like "WTF was this guy thinking?".

Yes, that happens. But I've seen that in COBOL, Perl, Pascal, Java, PHP and in Ruby as well.

> In case of errors, some functions return 0. Some return -1.

That's not a feature of the language.


     That's not a feature of the language.
Well, yes, but it's kind of nice when you've got exceptions with stack traces attached.

Some people don't like exceptions, but I do.


It is indeed "kind of nice". But the question at hand is whether it's a requirement for writing reliable software. I tend to agree with the posts here that argue that it's not. It saves time for developers, but it doesn't meaningfully improve the quality of the end product.

Serious C projects tend to come up with this stuff on their own, often with better adapted implementations than the "plain stack trace" you see in higher level environments. Check out the kernel's use of BUG/WARN for a great example of how runtime stack introspection can work in C.


gdb: where


No, but C could use more consistent error handling semantics, rather than conflating return values and error codes. Worse still: a combination of a return code and a global error code.


"Worse still: a combination of a return code and a global error code."

That's not the worst that exists in C :-) Let me quote a dietlibc developer from http://www.koders.com/c/fid1639C203A2255EB1FA11DC6A68D74FEB2...

    /* Oh boy, this interface sucks so badly, there are no  words for it.
    * Not one, not two, but _three_ error signalling methods!  (*h_errnop
    * nonzero?  return value nonzero?  *RESULT zero?)  The glibc goons
    * really outdid themselves with this one. */


And a comment written by a 13-year-old proves what exactly?


But first, functions returning multiple values. Baby steps. :-)


> In case of errors, some functions return 0. Some return -1.

That's not a feature of the language.

The inconsistency is a natural, expected, unavoidable result of the language forcing, er, strongly encouraging, the use of an unsuitable error-reporting mechanism ("find some value in the range of the function's return type that isn't in the range of the function, and use it to indicate an error"). This wouldn't be an issue with exceptions or with tuples / multi-value return like some languages allow.


> like passing around a reference to a struct's member, then to retrieve the struct decrementing a value from the pointer + casting.

that's not weird, that's a pretty standard way to enqueue structures on singly/doubly linked lists... it's made somewhat prettier by offsetof/CONTAINING_RECORD though


Yeah, but why?

I mean, can't you pass a reference to the whole structure instead? I'd prefer a void* pointer to the whole thing, with a normal cast later, instead of seeing pointer arithmetic.

I'm not a C developer, I just play around -- I've seen this practice used in libev, for example: passing around extra context along with the file handle in event callbacks.

That seems really ugly to me, as they could have added an extra parameter for whatever context you wanted to be passed around.


It's a relatively common pattern to have a "collection" data structure (like a list or hash table) use link structs embedded inside other structs to simplify memory management. When you append to a linked list in Java, you always allocate a new Link object and then update the various pointers. Using this pattern in C, the object you want to put in a list contains a list_link_t structure, and the list library takes as arguments pointers to these structures. This may sound like an argument about convenience, but the implications of this are very significant: if you have to allocate memory, the operation can fail. So in the C version (unlike the Java one) you can control exactly in which contexts something can fail.

For example, if you want to trigger some operation after some number of seconds elapses, you can preallocate the structure and just fill it in when the timeout fires. Timeouts are usually delivered via signals or other contexts where you have no way to return failure, so it's important that it be possible to always handle that case correctly without the possibility of failing.


This pattern is sometimes called an intrusive data structure. For example, see boost::intrusive in the C++ world. It saves allocation, gives better locality, and allows various optimizations such as the ability to remove an object from a doubly-linked list in constant time.

Another way to think about all the offsetof() stuff is that it's emulating multiple inheritance in C. You can think of structures as inheriting the "trait" of being a participant in a container; the "pointer-arithmetic-and-cast" idiom to move from a container entry to the corresponding object is isomorphic to downcasting from the trait to the object that contains it.

Interestingly it is not possible to express this pattern in a generic way within Java's type system.


True - and the important point here is that this is a pattern. Like any language, to be truly fluent in C you have to understand the common idioms as well as the syntax, grammar and vocabulary.


the purpose of offsetof/CONTAINING_RECORD is so that you don't "see" the pointer arithmetic, it's safely ensconced in a macro ;)

one advantage is it produces a generic linked list API. you can write routines to traverse, add, and remove elements from the list without caring about the structure of data stored in the list. if you use offsetof, you can also have the list data for a structure at any position inside of the structure instead of the beginning. some systems do that so they can store header information at the beginning.

you can also have elements enqueued on multiple lists. you might say that if you're doing that, you have bigger problems, but sometimes Shakespeare got to get paid.


Check out the Linux kernel's linked list implementation for an example of it being done right. The actual workings are hidden behind macros that you can look at (quite simple, easy to grasp), so it's very clean to use.


> I've seen code where I was like "WTF was this guy thinking?".

You can write obfuscated code in any language. The point about C is that the mental model is very simple. There's no magic happening anywhere, so if you can parse the language, you can figure out what's happening line-by-line pretty easily.


This is one of the biggest reasons Linus Torvalds refuses to rewrite the Linux kernel in C++ even though he is repeatedly pushed to do so... in C++ a whole bunch of things outside the file (templates, operator overloading) can make it so that what you're looking at doesn't do what you think.

In C, what you see is what you get. :)


For definitions of "repeatedly pushed to do so" that mean "asked on occasion by language trolls on the mailing lists who aren't even kernel devs". ;)


    #define int double


There is a special place in hell for you.

Had me laughing though :)


Remember that if you want to use C++ in kernel development, you'll only get a small subset of C++ because there's no runtime system to rely on. Exceptions are one example of this.

Without exceptions, the advantages of C++ are not that great compared to the hassle needed to get it running in kernel mode: dealing with name mangling, static/global constructors, etc., and fighting with compilers.


> like passing around a reference to a struct

I thought that references are a feature of C++, not C. Personally, I never really got references... They are just a kind of magical pointer that programmers can forget about, but they make the code much less readable and can interact in funny ways...


Btw, I corrected my statement -- I was referring to "a struct's member" from which later you can retrieve the actual struct that owns that value.

Of course C has references because C has pointers. References in C++ are just constant pointers.


> Of course C has references because C has pointers. References in C++ are just constant pointers.

This is incorrect. You cannot have a reference to nullptr, for example. Pointers and references are different beasts, nowhere in the C standard does it refer to pointers as references. The underlying representation in compilers does not imply equivalence.


Absolutely. Having learned Pascal before C, I really missed pass-by-reference for quite some time. Efficiency-wise, a reference is just a hidden pointer, BUT it is nice to know that the reference CANNOT be null. The caller of a routine expecting a reference must have actual data to pass, or the routine never gets called.

C is a very handy portable assembler, though.


You might like to read "Moron Why C is NOT Assembly"

http://james-iry.blogspot.com/2010/09/moron-why-c-is-not-ass...


I've seen code that does things like "Foo &foo = *((Foo *)0);" (possibly split among multiple statements). It seemed to work fine, but I suppose it's really undefined behavior?


Dereferencing a pointer to 0 seemed to work fine?


I think what happened was that the reference was passed to a function that (1) under most conditions (I think there was a fast-path added after the function had been around a while) accessed it directly, and (2) under all conditions turned it back into a pointer for another call. The way that function was called in this particular case was outside of those "most conditions", so the only thing that was done with the invalid reference was to turn it back into a pointer and then null-check it. And so while making the reference probably counts as "dereferencing" the pointer as far as language rules go, the memory that it pointed to was never actually accessed.

(Why yes, that does sound like something badly in need of refactoring. And illegal reliance on implementation details.)


On AIX, page 0 is mapped and readable, so dereferencing null works just fine. :)


Actually, the C standard does say:

  A pointer type describes an object whose value
  provides a reference to an entity...


References are a C++ creation with a precise definition. This is what this quote is talking about, considering C does not have references.


You don't need to do wacky things with pointers to get into trouble in C:

  int i;

  /* Iterates over everything except the last n elements of array... right? */
  for (i = 0; i < length - innocent_little_function(); i++)
      do_something_with(array[i]);


>You don't need to do wacky things...

A function call in a for loop's conditional doesn't fit your definition of wacky?


A function call in a for loop's conditional is practically C best practice. K&R do it on almost every page.


They usually only do it for builtins (e.g., strlen) and const functions, which makes this example a lot safer.


A bit suspicious, maybe? I was trying to suggest the unsigned issue without actually spelling it out, not anything to do with side effects.

I probably should have used sizeof, even though that doesn't make sense there.


> Incompetence goes a long way towards explaining some of C's bad reputation

Not just incompetence. Also bad language choice (usually due to legacy).

If programmers don't get enough time to properly test and review the code, which needs to be done very thoroughly in C, it's easy for even experienced developers to shoot themselves in the foot.

C is very good (let's say irreplaceable) for low-level hardware and OS code. This is code that needs to be verified and tested very well.

On the other hand, using C for run-of-the-mill business projects or higher-level stuff on a tight deadline can be a very bad idea. It results in a lot of overhead for programmers to think about the details of error handling, buffer sizes, pointers, memory allocation/deallocation and so on, especially getting it right for every function. It is a recipe for screwups.

In this case it is very useful to have garbage collection, bounds checking, built-in varlength string handling, and other "luxuries" that modern languages afford you.


Right. C with Lua is a nice combination serving a wider range of projects without sacrificing C's approach to low-level correctness.


I love Lua, but what stack would allow you to use it in a web project?


Mongrel2 has a Lua web framework called tir.


Precisely. These days, with the advent of easy-to-integrate JavaScript/Python/Ruby/Lua scripting languages, there is very little reason to write an entire application in C. My last two projects, I wrote the business logic in a scripting language, and all of the performance-bound stuff in C. There are a few gotchas the first time you do this - your bindings code needs to fit with the object model that you're using in C, you need to force all calls into the VM onto the same thread, etc. I ended up just writing my own IDL parser/bindings generator, but once you've done that once, you can use it for all of your future projects, and it really isn't that much work. The pay-off is huge - you can arbitrarily move a module of code between C and your scripting language depending on the optimal split for performance/ease of programming.


Very true. When you have managers shouting "get it out" and over-promising to clients, then any language is a bad choice, but particularly powerful languages that require more careful thought and testing. I love C++ and use it regularly, but when I need to "get stuff out quickly" I'll use something like Python as I can be more reckless with it.


I think that in other languages such as Object Pascal it is easier to be a good programmer. I've seen horrible code written in OP, yes, but I think the language itself helps a programmer be better. For one thing there are fewer ways to kill yourself than even in C++, and yet there is little difference in speed between them.

I think C is one of these "other languages" because "here be dragons".

I watch a lot of people bang out C++ code as if it's totally safe, and fail. I see a lot of people hammer out C# code and say, "to hell with you, you don't even have .NET!" And so on. But today, when a programmer sits down and writes a C program they must sit and think out what they're doing and why—with no abstractions like OO to make an easy solution.

There are so many ways to blow your head off in C without knowing you left the opportunity in the program, that it forces a competent programmer to think differently about how they code. And a newbie? Well, if they aren't scared stiff about blowing a hole in their system, they should be! ;D

And C doesn't change often, unlike other languages.

I'm no C programmer, but I've seen C code for years and translated it into whatever language I'm using at the time. I have tremendous respect for UNIX/Linux, and a great many C-powered programs. Thanks for your work on them, guys and gals.


> But today, when a programmer sits down and writes a C program they must sit and think out what they're doing and why

That's true for any programming language. Sadly, far too often, programmers are unable to afford taking the time needed to think about what they are doing or understanding what happens under the hood of the libraries they link against.


>Incompetence goes a long way towards explaining some of C's bad reputation

Incompetence is THE reason for C's bad reputation (if any).


Agreed. I'd also say that someone who is not a good programmer will find C quite unreliable, with bugs and issues popping up every now and then.


That is why in the "old days" C was used as a language for advanced courses in CS programs. It teaches you to think straight about your code. Nowadays people think it is easier just to write anything and catch exceptions later.


> the building blocks are so simple and transparent that you can follow the thread of execution with minimal mental overhead.

The first fundamental purpose of any programming language is to provide abstraction via functions. This implies that following the thread of execution is never easy and the blocks are never simple. It's pretty much a wash, with special mention for languages in the Hindley-Milner family.

The second fundamental purpose of any programming language is to provide specification abstraction via replaceable modules. This is where C fails. It is common practice in C culture to not specify interfaces in depth (we are all good programmers, aren't we?) and the implementation via manual virtual tables makes it painfully difficult to find the specific implementations in the code base.



