
In the bad old Linux days, if you read a huge amount of data off the disk (like when doing a backup), Linux would try to take all that data and jam it into the page cache. This would push out all the useful stuff in your cache, and sometimes even cause swapping, as Linux helpfully swapped things out to make room for more useless page cache.

One of the great things about `dd` is that you have a lot of control over how the input and output files are opened. You can bypass the page cache when reading by using iflag=direct, which stops this from happening.
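For instance, a cache-bypassing backup read might look like the following sketch (the device and file names are hypothetical, and iflag=direct requires bs to be a multiple of the device's sector size):

```shell
# Read the whole device through O_DIRECT so the backup doesn't
# evict everything else from the page cache; /dev/sdb and
# backup.img are placeholder names.
dd if=/dev/sdb of=backup.img iflag=direct bs=4M status=progress
```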



Moreover, flash drives (and all flash media) have a preferred page size, generally 4kB or 512kB. By choosing a block size that is a common multiple, like 1024kB (bs=1024kB), you keep your flash drive happy with enough backlog to write, so it can perform at its peak write speed without churning. That means faster writes and lower write amplification: a win-win.
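Sketched as a command (image and device names are hypothetical; bs=1M gives a 1 MiB buffer, a clean multiple of both 4kB and 512kB flash pages):

```shell
# Write the image in 1 MiB chunks so the controller always receives
# whole flash pages; install.img and /dev/sdX are placeholders.
dd if=install.img of=/dev/sdX bs=1M oflag=direct status=progress
```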


Man, this brings back memories of a few years ago when I tried to dd a Linux install image onto a USB drive.

It would take forever and always end up with an I/O error. I figured the new PC somehow had wonky USB ports or something. But when that happened on another "known good" box, I figured it was the flash drive. Yet Disk Utility on macOS worked fine.

Then I tried increasing the block size to 1M and everything went smoothly. It even took less time to write the whole image correctly than it had taken to error out before.


(In theory at least,) the kernel should take care of aggregating write blocks, and those will be plenty big enough by the time they reach the target drive, all thanks to the very same page cache GP is talking about, unless you specify "oflag=direct" to dd.

That being said, probably don't use too small a block size: it will eat up CPU in system call overhead and slow down the copy regardless of target media type.
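The overhead is easy to picture: the same 1 MiB costs very different numbers of system calls depending on bs. A sketch using /dev/zero and /dev/null so no real media is touched:

```shell
# 1 MiB in 512-byte units: 2048 read()/write() pairs.
dd if=/dev/zero of=/dev/null bs=512 count=2048 status=none
# The same 1 MiB as one unit: a single read()/write() pair.
dd if=/dev/zero of=/dev/null bs=1M count=1 status=none
```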


I don't know how the internals of "cp" and the related machinery interact with the target drive; however, if you don't provide bs=1024kB, dd writes in extremely small units (1 byte at a time, IIRC), which overwhelms the flash controller and creates high CPU load at the same time.

I've always used dd since it provides more direct control over the transfer stream and how it's transported. I sometimes call dd "direct-drive" because of these capabilities.


From the blog post:

"By specification, its default 512 block size has had to remain unchanged for decades. Today, this tiny size makes it CPU bound by default. A script that doesn’t specify a block size is very inefficient, and any script that picks the current optimal value may slowly become obsolete — or start obsolete if it’s copied from "


While I remembered the default wrong (because I never used the defaults, and I was too lazy to look it up while writing the comment), it's possible for a script to get a correct block size every time.

There are ways to get the block size of a device. Multiply it by 2 to 4 (or more), open the device directly, and keep it busy.
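On Linux, one such way is blockdev from util-linux; the device name is a placeholder and the 4x multiplier is an arbitrary choice along the lines described above:

```shell
# Query the device's physical sector size and derive a dd block
# size from it; /dev/sdX and install.img are hypothetical names.
phys=$(blockdev --getpbsz /dev/sdX)          # e.g. 4096
dd if=install.img of=/dev/sdX bs=$((phys * 4)) oflag=direct
```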

The blog post is oblivious to the nuances of the issue and to the usefulness of "dd" in general.


Please forgive the nit-picking, I'm not attacking this (excellent) article, or your entirely sensible inclination to dig up some "physical" number, but...

With modern SSDs, "sector/block size" is rapidly approaching the vagueness of the cylinder/head/sector addressing scheme used a couple of decades ago on venerable spinning magnetic disks.

That is, it definitely exists somewhere deep down, but software running on the host CPU that tries to address it won't necessarily end up addressing the same thing the user had in mind.

If you want a concrete example, look no further than the "SLC mode" cache: a drive will have a number of identical flash chips, but some of them (or even a dynamically allocated fraction of a chip) are run at a lower bits-per-cell count for higher speed and endurance. However, the erase and write block sizes of a chip are expressed in cells, not bits. That means the cache and the main storage of the very same SSD can have different block sizes (in bits/bytes).


> Please forgive the nit-picking, I'm not attacking this (excellent) article, or your entirely sensible inclination to dig up some "physical" number, but...

I don't think it's nitpicking. We're having a discussion here. We're technical people, and we tend to point out different aspects and perspectives of a problem and offer our opinions. That's something I love when it's done in a civilized manner.

Regarding the rest of your comment (I didn't quote it, so things don't get crowded), I kindly disagree.

The beauty of SSDs is that they have a controller that fits the definition of black magic, and all the flash is abstracted behind it, but not completely. Hard drives are almost in the same realm.

Running a simple "smartctl -a /dev/sdX" returns a line like the following:

    Sector Sizes:     512 bytes logical, 4096 bytes physical

This means I can hit it with 512-byte writes and it'll handle them fine, but the physical sector size is different: 4kB. I have another SSD, again from the same manufacturer, which reports:

    Sector Size:      512 bytes logical/physical

So I can dd to it and it'll handle things just fine, but the first one needs bs=4kB to minimize write amplification and maximize speed.
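If smartctl isn't handy, the same numbers are also exposed through sysfs on Linux (the device name below is hypothetical):

```shell
# Logical and physical sector sizes straight from the kernel;
# sdX is a placeholder for a real device name.
cat /sys/block/sdX/queue/logical_block_size    # e.g. 512
cat /sys/block/sdX/queue/physical_block_size   # e.g. 4096
```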

It's exactly the same with USB flash drives. Higher-end drives provide full SMART data (since they're bona fide SSDs), but lower-end ones are not that talkative. Nevertheless, a common-multiple block size (like 1024kB, because drives can also be composed of huge, 512kB cells) lets any drive divide the chunk into its physical flash sector sizes optimally and push the data through.

In the case of SLC/xLC hybrid drives, the controller does the math to minimize write amplification, but again, having a perfect multiple of the reported physical sector size makes the controller's work much easier and everything way smoother. Either the reported physical size is for the SLC part, which is what you're hitting in most cases, or the controller is already handling the multi-level logistics inside the flash array (while still thinking in terms of block sizes, since that's how it works on the bus side regardless of what happens inside).


It's great to have control over this, but I suspect most users never knew this was happening, had no idea dd could bypass that behavior, and didn't know which argument to pass to dd to accomplish it. It's like saying "what makes 3D printers so great is you can make anything!" when you'd be way better off with an industrially forged object than the 3D-printed one.


Right, and good luck doing injection or vacuum molding for a few copies of a niche design.

Edit: correcting for nitpicker


You can do vacuum forming at home pretty easily. We used to do it back in the day for one-off handheld versions of home game consoles.


That's cool, I didn't know you could do that. How do you make the mold? If (as I suspect) the answer is handcrafting, 3D printing still offers a considerable advantage in labor.


Oh gosh yes. Though for certain finishes, vacuum forming still has some upsides. What's neat is you can use 3D printing to make the mould to vacuum-form over.

This gives you the best of both worlds, in terms of the shapes/finishes that forming can achieve, but with the workflow benefits of 3D printing.

Similarly, the new hotness is compression/cold forging using 3D printed moulds and carbon fibre (and other materials). Very very cool.


For a few copies of a niche design you would do vacuum molding.


The main point wasn't the problems with 3D printers but actually looking at dd from a human-centered design perspective.


Pretty much, and understanding what is going on "under the hood", as it were, can be informative. Had the author done a 'cp myfile.foo /dev/sdb' on a UNIX system, they would have found they now had a regular file named '/dev/sdb' with the contents of myfile.foo, and their SD card would have remained untouched. But you would only know that if you realized that cp would check to see if the file existed in the destination, unlink it[1], and then create a new file to copy into.

The subtlety of opening the destination file first, and then writing into it, was what made dd 'special' (it would also open things in raw mode, so there wasn't any translation going on for, say, terminals), but that is lost on people. Bypassing the page cache, and thus not killing directory and file operations for other users of the system, is a level even below that. Only the few remaining folks who have done things "poorly" and incurred the wrath of the other users sitting in the same room really get a good feel for that :-). Fortunately, nearly everybody these days will never have to experience that social embarrassment. :-)

[1] Well unless you had noclobber set in which case it would error out.


Still the bad old days for copying files from iOS to Linux. iOS seems to make an internal copy on the device of everything you transfer before sending it, which leads to running out of free space just trying to copy things off :(


Amusingly, this would also occur with writes.

There was some heuristic in there that tried to prevent it, but it wasn't very good.


In my experience, Windows NT (now just Windows) is very fond of its file cache and large copies can blow up into memory paging as well.

Early Windows NT was awful with this, pegging the system with a cascade of disk IO at unpredictable times, often for ten seconds or more.

Can anyone suggest ways to avoid blowing the file cache on Windows with large copies? Is this even a problem anymore?


On macOS, you can also use the `--nocache` flag for the `ditto` command.

Please keep in mind that `ditto` is a file copy and archive utility, not a block copy utility like `dd` (which is also available on macOS).

An online man page for ditto: https://ss64.com/osx/ditto.html


I think using a small bs also determines the size of the cache you use, as it's the buffer.


GP is talking about the Linux kernel's buffer cache. Unless you tell the kernel to operate directly on the disk, your reads come from and your writes go to pages within the kernel's buffer cache. Using a small bs probably results in a buffer of only bs bytes inside dd's address space, but the buffer cache is completely different and resides in the kernel.

That is, without iflag=direct, dd will repeatedly ask the kernel to copy bs bytes from the kernel's buffer cache into dd's address space, and then ask the kernel to copy bs bytes from its address space into the kernel's buffer cache.
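You can watch that loop with strace (assuming strace is installed): among the startup syscalls, the data-copy phase shows up as alternating read() and write() calls of exactly bs bytes each:

```shell
# Trace a two-iteration copy; the copy loop appears as two
# read(..., 4096) and two write(..., 4096) calls near the end.
strace -e trace=read,write \
  dd if=/dev/zero of=/dev/null bs=4096 count=2 status=none
```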


I'd have thought dd always avoided the page cache. For what dd use case is that the desired behavior?


Linux absolutely still does this, FWIW; it's one of the reasons that swap is a net negative.



