
To avoid unnecessary cache writes, there's also a plugin that implements a rudimentary LRU. Basically, an object has to see some amount of traffic before it's allowed to be written to the cache. This is typically used in scenarios where it's OK to hit the parent caches, or origins, an extra time or a few. It can also be a very useful way to avoid heavy disk write load on SSDs (which can be sensitive to excessive write wear, of course). See

https://github.com/apache/trafficserver/tree/master/plugins/...
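The idea can be sketched in a few lines. This is a hypothetical illustration (the class name, `threshold` parameter, and unbounded counter are mine, not the plugin's actual code): only admit an object to the cache once it has been requested some number of times, so one-hit wonders fall through to the parent cache or origin instead of churning the disk.

```python
from collections import Counter

class HitCountAdmission:
    """Sketch of a 'see N requests before caching' admission filter.

    A real filter would bound its memory (e.g. with a probabilistic
    counter); a plain Counter keeps the sketch simple.
    """
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.counts = Counter()

    def should_cache(self, key):
        """Record one request for `key`; return True once it has
        been seen `threshold` times and may be written to cache."""
        self.counts[key] += 1
        return self.counts[key] >= self.threshold

f = HitCountAdmission(threshold=2)
f.should_cache("/index.html")  # first request: skip the cache write
f.should_cache("/index.html")  # second request: admit to cache
```

The cost of rejecting a write is just an extra upstream fetch or two, which is exactly the trade-off described above.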



I believe the term you're looking for is "cache admission policy." This is an adjunct to cache eviction; both are needed for success. I'm very curious what a highly efficient insertion policy paired with a trivial "eviction" policy (FIFO) would look like in practice.
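The split between the two policies can be made concrete with a small sketch (all names here are hypothetical, not from any real cache library): eviction stays trivially FIFO, while an injected `admit(candidate, victim)` callback decides whether a new key is worth displacing the FIFO head at all.

```python
from collections import deque

class FifoCacheWithAdmission:
    """Trivial FIFO eviction gated by a pluggable admission policy."""
    def __init__(self, capacity, admit):
        self.capacity = capacity
        self.admit = admit          # admit(candidate_key, victim_key) -> bool
        self.order = deque()        # insertion order; leftmost is the victim
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def put(self, key, value):
        if key in self.data:
            self.data[key] = value
            return
        if len(self.data) >= self.capacity:
            victim = self.order[0]
            if not self.admit(key, victim):
                return              # admission policy rejects the write
            self.order.popleft()
            del self.data[victim]
        self.order.append(key)
        self.data[key] = value
```

The interesting work all moves into `admit`; the eviction side needs no recency bookkeeping at all.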

Cache insertion research is generally focused on use cases like small associative hardware caches. There's very little applicable public research for larger software caching systems that I've found. Probably the best would be Gil Einziger. He appears to have arrived at it as an application of his work on extremely space/time-efficient counting of sets, http://www.graduate.technion.ac.il/Theses/Abstracts.asp?Id=2... and http://www.cs.technion.ac.il/~gilga/. Of notable mention is TinyLFU http://www.cs.technion.ac.il/~gilga/TinyLFU_PDP2014.pdf. Gil submitted it to Caffeine (a Java caching library) last summer, https://github.com/ben-manes/caffeine/pull/24. It got some traction over the winter and is now showing up in other places (https://issues.apache.org/jira/browse/CASSANDRA-10855). In fact, Ben Manes just had a guest post on High Scalability the other day: http://highscalability.com/blog/2016/1/25/design-of-a-modern....
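The core of TinyLFU's admission decision can be sketched briefly. This is a simplified illustration, not Caffeine's implementation: the real thing uses 4-bit saturating counters plus periodic aging ("reset"), while this version uses a plain count-min sketch with unbounded integer counters. The key idea survives: estimate frequencies probabilistically in tiny space, and admit a candidate only if it is estimated to be more popular than the eviction victim it would replace.

```python
import hashlib

class FrequencySketch:
    """Count-min sketch: depth rows of width counters, one hash per row.
    Estimates are upper bounds on the true count (collisions only inflate)."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        for i in range(self.depth):
            h = hashlib.blake2b(key.encode(), salt=bytes([i] * 4),
                                digest_size=8)
            yield i, int.from_bytes(h.digest(), "big") % self.width

    def increment(self, key):
        for i, j in self._indexes(key):
            self.table[i][j] += 1

    def estimate(self, key):
        return min(self.table[i][j] for i, j in self._indexes(key))

def admit(sketch, candidate, victim):
    """TinyLFU-style admission: let the candidate in only if its
    estimated frequency beats the would-be victim's."""
    return sketch.estimate(candidate) > sketch.estimate(victim)
```

A scan of cold, one-off keys never displaces a popular entry, which is what makes the policy scan-resistant.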

PS: If anyone is interested in these problems, We're Hiring.

edit: https://aws.amazon.com/careers/ or preferably drop me a line at my profile email or my username "at amazon.com" for a totally informal chat (I'm an IC, not a manager, recruiter, or salesperson)


TinyLFU+FIFO is in the simulator (https://github.com/ben-manes/caffeine/wiki/Simulator). However, you'd probably also want the window cache to correct the deficiency outlined in the updated paper (http://arxiv.org/pdf/1512.00727.pdf). A FIFO version with that isn't in the simulator, but would be easy to add.

multi1 trace:
- W-TinyLFU: 55.6%
- TinyLFU+Random: 53.8%
- TinyLFU+FIFO: 48.6%
- TinyLFU+LRU: 54.8%
- FIFO: 41.0%
- LRU: 46.7%

FIFO's poor choice of a victim has a big impact, so it doesn't seem promising. A random policy looks like an attractive fit, though.


Who's "we"?


Yeah, with SSDs I wonder how much that really helps performance vs. just having no cache. Most SSDs do a lot of caching internally, so a disk cache can often be self-defeating.


"It Depends." If youre doing "random" writes down to the block dev, like updating a filesystem, it can be very bad. You'll end up hitting the read/update/write cell issues and block other concurrent access. In general I'd worry (expect total throughput to go down, and tail latency way up) around a 10-20% write:read ratio. Conversely if youre doing sane sequential writes, say log structure merges with a 64-256KB chunk size, Id expect much less impact to your read latencies.


This is a read cache though. If you just don't have it, you do reads on the SSD, which are pretty darn quick...


Ah, I must have missed the thread intent. I thought you were responding to the utility of limiting/optimizing the cache write rate.


You got that right. It's a funky cache. ;-)



