Understanding Distributed Filesystems




14 November 2018

New York

All persistent data stores share some common mechanical components and operating system abstractions. Whether you're running on RAID arrays of spinning disks or on SSDs, the hardware affects the speed of your data access.

The organization of your data also plays a huge role. Small page sizes, hash indexes, and working sets are great for random access and simultaneously terrible for sequential access. Contiguous ordering is awesome for sequential access, but hurts random performance.
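As a rough illustration of that tradeoff, here is a minimal in-memory sketch. The key names and the helper functions `hash_range` and `sorted_range` are made up for this example, and real storage engines work on disk pages rather than Python lists. A hash index answers point lookups quickly but has to examine every key for a range query, while a sorted, contiguous layout answers a range query with one binary search and a sequential slice.

```python
import bisect

# Hypothetical data set: 1000 keys that sort lexicographically.
records = {f"user{i:04d}": i for i in range(1000)}

# Hash index: expected O(1) point lookups, but keys carry no useful order,
# so a range query has to examine every key.
hash_index = dict(records)

def hash_range(lo, hi):
    """Return keys in [lo, hi) by scanning the whole hash index."""
    return sorted(k for k in hash_index if lo <= k < hi)

# Contiguous sorted layout: a range query is one binary search followed by
# a sequential slice; point lookups cost O(log n) instead of O(1).
sorted_keys = sorted(records)

def sorted_range(lo, hi):
    """Return keys in [lo, hi) using binary search on the sorted keys."""
    start = bisect.bisect_left(sorted_keys, lo)
    end = bisect.bisect_left(sorted_keys, hi)
    return sorted_keys[start:end]
```

Both functions return the same keys. The difference is how much data each one has to touch to find them, which is what becomes disk seeks versus sequential reads on real hardware.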

We'll look at these tradeoffs and more, with reference to MySQL, MongoDB, Cassandra, Bigtable/HBase, Dremel, HDFS, S3, and many other popular technologies.

We'll check out the implementation of SSTables in NoSQL stores, linked lists for logical ordering in RDBMSs, and append logs.
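To give a flavor of how an append log and SSTables fit together, here is a toy in-memory sketch. The `TinyStore` class and its methods are hypothetical and do not reflect the design of any particular database. Real systems persist both structures to disk and add compaction, bloom filters, and tombstones for deletes. Here, writes are appended to a log, a flush freezes the log into an immutable run sorted by key, and reads check the newest data first.

```python
import bisect

class TinyStore:
    """Toy store: an append-only log plus immutable sorted runs (SSTables)."""

    def __init__(self):
        self.log = []        # append-only list of (key, value) writes
        self.sstables = []   # sorted runs as (keys, values), newest last

    def put(self, key, value):
        # Writes never modify existing data; they only append.
        self.log.append((key, value))

    def flush(self):
        # Keep the last write per key, sort by key, and store it as a new run.
        latest = dict(self.log)
        keys = sorted(latest)
        self.sstables.append((keys, [latest[k] for k in keys]))
        self.log = []

    def get(self, key):
        # Unflushed writes win, newest first.
        for k, v in reversed(self.log):
            if k == key:
                return v
        # Then search each sorted run, newest first, with binary search.
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None
```

A short usage example: after `put("a", 1)`, `flush()`, and `put("a", 3)`, a call to `get("a")` returns the newer value from the log without rewriting the older run.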

This meetup will arm you with an understanding of file storage and its impact on your choice of database.

Join us as we go down the rabbit hole, enjoy some pizza and beer, and come out of it a little wiser.

It was really tough to choose between this and Raft as the meetup topic. I thought it would be good to step away from ordering and consistency protocols for a moment.