Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Could someone enlighten me on why malloc and free don't automatically zero memory by default?

Someone pointed me to MALLOC_PERTURB_ and I've just run a few test programs with it set - including a stage1 GCC compile, which granted may not be the best test - and it really doesn't dent performance by much. (edit: noticeably, at all, in fact)

People who prefer extreme performance over prudent security should be the ones forced to mess about with extra settings, anyway.



Some old IBM environments initialized fresh allocations to 0xDEADBEEF, which had the advantage that the result you got from using such memory would (usually) be obviously incorrect. The fact that it was done decades ago is pretty good evidence that it's not about the actual initialization cost: these things cost a lot more back then.

What changed is the paged memory model: modern systems don't actually tie an address to a page of physical RAM until the first time you try to use it (or something else on that page). Initializing the memory on malloc() would "waste" memory in some cases, where the allocation spans multiple pages and you don't end up using the whole thing. Some software assumes this, and would use quite a bit of extra RAM if malloc() automatically wiped memory. It would also tend to chew through your CPU cache, which mattered less in the past because any nontrivial operation already did that.

I personally don't think this is a good enough reason, but it is a little more than just a minor performance issue.

That all being said, while it would likely have helped slightly in this case, it would not solve the problem: active allocations would still be revealed.


> Some old IBM environments initialized fresh allocations to 0xDEADBEEF, which had the advantage that the result you got from using such memory would (usually) be obviously incorrect.

On BSDs, malloc.conf can still be configured to do that: on OpenBSD, junking (fills allocations with 0xdb and deallocations with 0xdf) is enabled by default on small allocations, "J" will enable it for all allocations. On FreeBSD, "J" will initialise all allocations with 0xa5 and deallocations with 0x5a.


> What changed is the paged memory model: modern systems don't actually tie an address to a page of physical RAM until the first time you try to use it (or something else on that page). Initializing the memory on malloc() would "waste" memory in some cases, where the allocation spans multiple pages and you don't end up using the whole thing. Some software assumes this, and would use quite a bit of extra RAM if malloc() automatically wiped memory. It would also tend to chew through your CPU cache, which mattered less in the past because any nontrivial operation already did that.

Maybe an alternative approach is to simply mark the pages to be lazily zeroed out when attached, in the Page Table Entries of the MMU. They wouldn't be zeroed out at the time of the call malloc(), but only when they are attached to a physical memory location (the first time you use it).


And it seems to me the OS should ensure the pages are zero'd out rather than user space (via malloc()) doing it, because it's still a security hole to let a process read data that it's not supposed to have access to (whether it's from another process or the kernel - it doesn't matter).


OS already zeroes out pages, obviously. But malloc doesn't usually request memory to the OS but takes a chunk from the already allocated heap.


Unsure, not my job. But I read stuff along those lines. A modern OS plays all sorts of games to delay doing work. Allocate a couple of megs of memory and the OS sets up some pointers in a page table. And yes it'll keep already zero'd pages handy. And mark pages as dirty to be scraped clean later.


It doesn't need to affect your CPU cache, because x64 processors have non-temporal writes (streaming stores) that bypass the cache.

The stuff about eagerly allocating pages is spot on though.

There is calloc which allocates and zeroes memory, but people don't use it as often as they should.


Parsers don't usually need to hold onto what they're parsing for a very long time, so unless they were running this parallel on a machine with 4k cores, I'd imagine it would be much more likely that a buffer overrun hits the middle of an already-freed allocation rather than going into an active one.

In terms of "wasting" memory, perhaps the kernel could detect that you are writing 0s to a COW 0 page and still not actually tie the page to physical RAM. (If you're overwriting non-0 data, well it's already in a physical page.)

I don't quite follow the details of the CPU cache issue and why that is more-than-minor.

I do think in this day and age we should be re-visiting this question seriously in our C standard libraries. If the performance issues are actually major problems for specific systems, the old behaviour could be kept, but after benchmarking to show that it really is a performance problem.


In terms of "wasting" memory, perhaps the kernel could detect that you are writing 0s to a COW 0 page and still not actually tie the page to physical RAM.

Writing to your COW zero page causes a page fault. Now, in theory you could disassemble the executing instruction and if it's some kind of zero write, just bump the instruction pointer and go back to userspace - but then the very next instruction in your loop that zeroes the next 8 bytes will cause the same page fault. And the next. And the next...

Taking a page fault for every 8 bytes in your allocation is completely infeasible. You'd be better off taking the hit of the additional memory usage.


How about this idea: free() zeros or unmaps all memory it allocated. This shouldn't fault. The OS zeros pages when mapping them into the process space (which it should do anyway). I think that solves the problem.


free() doesn't know what portion of the memory you allocated actually got written to. So for the model where a large, page-spanning buffer is allocated and only a small portion used, this approach causes many unnecessary page faults at free () time as it tries to zero out lots of memory that was never used or paged in at all.


Large buffers just get unmmaped so the OS can fix that problem.


An invariant you get from most kernels is that all new memory pages are zeroed when mapped into processes (normally through mmap or sbrk), so you only have the paging problem when initializing with a value other than zero.


Zeroing on malloc and/or free would not have prevented this type of error, since the information disclosure was due to an overflow into an adjacent allocated buffer.

However, zeroing on free is generally a useful defense-in-depth measure because can minimize the risk of some types of information disclosure vulnerabilities. If you use grsecurity, this feature is provided by grsecurity's PAX_MEMORY_SANITIZE [0].

[0]: https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity...


Zeroing on alloc/free probably wouldn't have helped much with this bug. Data in live allocations would still be leaked.


> Could someone enlighten me on why malloc and free don't automatically zero memory by default?

The computational cost of doing so, I suspect.


Just like why most filesystems don't zero deleted files.


Neither of these are good reasons: I already talked about MALLOC_PERTURB_ (man mallopt) in my post and my naive performance tests, and we rarely get bad security holes based on data from deleted files left on filesystems.


Unfortunately, people write microbenchmarks of malloc and free a lot (and not completely without reason: they do quite often show up high in profiles).

For example, binary-trees on the Benchmarks Game is basically malloc/free bound (or at least is supposed to be as Hans Boehm originally designed it). Likewise, most JavaScript benchmarks (V8 splay, for example) are heavily influenced by raw allocation performance. Many people choose browsers and programming languages based on relatively small differences in these results. All of the incentives align in favor of performance, not security, because performance is easy to measure and security is not.


You asked for a reason, not for a good reason.

malloc/free were designed around 1972. That was a time where performance was much more important and security concerns didn't really exists.

Modern systems, like Go, do zero-out newly allocated memory because they do consider a bit more security to be more important than a bit more performance.

But changing the defaults of malloc/free is not really an option and it would probably break stuff.

Especially on Linux, where, I believe, malloc returns uncommitted pages, which increases the perf advantage in some cases.

Security conscious programmers can use calloc() or write their own wrappers over malloc/free.


they aren't good reasons now. They were good reasons ~20 years ago.

language spec should probably now default to zeroing memory unless you specifically ask it not to....and maybe that should be a verbose option :)


Are these results hardware independent? Maybe it makes a difference on older machines, or different architectures.


I imagine clearing memory on free is more relevant than MALLOC_PERTURB_?


calloc zeroes memory on allocation.


Yes, I think the question was something like "why doesn't malloc call calloc?".


Always nice to have options. Not zeroing memory on allocation might save a few cpu cycles.


It's pretty much the definition of false economy. Would you rather save a few cycles or suffer debilitating security bugs at random intervals? Always use calloc unless a) there's a proven performance problem and b) you know for a fact that due to careful inspection/static analysis/black magic malloc is safe. Then use calloc anyway because why risk it?


It depends on the size of the chunk of allocated memory. If it is quite large, time spent zeroing it can be substantial. Then again, if you're allocating in performance critical path, you're doing it wrong anyways.


It takes time to do that.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: