Of course, it's basically a wrapper around malloc. All allocated memory is subject to this behaviour: malloc/calloc, forked "copy-on-write" memory, mmaps and statically allocated segments (especially the stack).
In a system with an MMU, all that allocating memory does is tell the kernel not to segfault you when you access some range of virtual addresses. To actually materialize a virtual page you have to access it.
I wasn't sure why that would be the case. To me, the naïve answer would be that they wouldn't behave the same, since calloc is required to return zeroed memory, which implies a memset or similar. Saving time at allocation but clearing on access would seem to require kernel-level access below that of the C library. So I took a look.
Interestingly, in glibc-2.15, the code for calloc (well, public_cALLOc) is longer than that for malloc. Most of that actually seems to be doing magic to figure out when it really needs to clear anything. So in any case where calloc is handing back reused memory, it clears that itself. Otherwise, things end up at either mmapping MAP_ANONYMOUS or sbrking, which simply allocates pages that all get cleared on access (well, on write) anyway for security reasons.
Honestly I was surprised how much indirection and optimization there was. I knew that there were some clever things being done and they try to take advantage of processor features. But this stuff really does everything possible to avoid even a single unnecessary cycle. I'm impressed.
So in the general case, calloc might do some extra work. But where you're just callocing heaps and heaps (tee hee) it's not likely to. Strange, I could have sworn I just stumbled upon a case where someone claimed doing that was noticeably slower, but now I can't find it... was going to be very curious how that was the case. But after this romp I'm tired.
>because calloc zeros the memory and therefore writes to each page.
One does not imply the other. Internally what the kernel can do is link the page address it gives you to the zero page and mark it as copy on write. Only when you actually write to it will it allocate an actual page to back it. Only if your libc implements calloc as malloc+memset would this be a problem. Does glibc do that?
In fact, copy-on-write is probably done for malloc as well. Even though the manpage implies different behavior (malloc doesn't guarantee setting the memory to 0, while calloc does), I don't think any sane kernel will give you someone else's free()'d memory. That would be a security leak.
>You won't get someone else's freed memory but you're quite likely to get your own back and in that case it won't necessarily be zeroed.
Surely the kernel never gives a process its own pages back. That would mean keeping an unneeded page around when it could just point to the zero page.
Reading your other comment, I assume what you mean is that it's a two-step process: the kernel always gives zero pages, but glibc's malloc implementation keeps some stock of pages and will hand them back from a malloc() after a free(). That way you're not guaranteed to get zeroed memory on every malloc(), since not all of it comes straight from the kernel.
The calloc() implementation has checks for that and will do the clearing when the memory is coming from the glibc stock rather than the kernel. But even then it only does the clearing when the page is already in the process's address space. So a process will always receive zero pages from the kernel, but the malloc() implementation is made more efficient by giving you back some of your own free()'d memory that, from the kernel's point of view, was never given back.
I didn't look at the Linux side in enough detail to see whether it always hands out a reference to the zero page, or whether it does/doesn't clear a fresh page on first reference depending on how it was allocated and by whom. I could see that saving time, but I could just as easily see zeroing a whole page being faster than the work of tracking and checking.
From the glibc side I believe you are exactly correct.