Prev Next

Introduction to File Systems

UNIX File Interface

Establishing relationships:

open("foo"): "I’d like to use the file named foo."
close("foo"): "I’m finished with foo."

Reading and writing:

read(2): "I’d like to perform a read from file handle 2 at the current position."
write(2): "I’d like to perform a write from file handle 2 at the current position."

Positioning:

lseek(2, 100): "Please move my saved position for file handle 2 to position 100.

What’s Missing?

dup2 is really about manipulating a processes names for files and doesn’t have much to do with the file system.

Files Together: File Organization

Each file has to have a unique name. No problem!

Letter to Mom: LetterToMom.txt
Letter to Suzanna: LetterToSuzanna.txt
Letter to Chuchu: WoofWoofWagWag.txt
Another letter to Suzanna: AnotherLetterToSuzanna.txt
A third letter to Suzanna: LetterToSuzanna22Mar2012.txt

Flat name spaces were actually used by some early file systems but naming gets gross fast…

Hierarchical Implications

Big idea: don’t look at everything all at once. Allows users to store and examine related files together.

letters/Mom/Letter.txt
letters/Chuchu/WoofWoofWagWag.txt
letters/Suzanna/Letters/1.txt
letters/Suzanna/Letters/2.txt
letters/Suzanna/Letters/2.txt

Each file should be stored in one place. (Although we’ll discuss exceptions to this rule.)

Location Implications

Location requires navigation and relative navigation is useful, meaning that locations (directories) can include pointers to other locations (other directories).
Finally, location is only meaningful if it is tied to a files name, so hierarchical file systems implement name spaces, which require that a file’s name map to a single unique location within the file system.

Why Trees?

File systems usually require that files be organized into an acyclic graph with a single root, also known as a tree.

Why?

What is the name of the file in the diagram below?

OK, I picked a root. What is the name of the file now?

OK, I eliminated the cycles. What is the name of the file now?

Tree Naming

Trees produce a single canonical name for each file on the system as well as an infinite number of relative names:

Canonical name: /you/used/to/love/well
Relative name: /you/used/to/love/me/../well
Relative name: love/me/../../love/me/../well

File System Design Goals

Efficiently translate file names to file contents.
Allow files to move, grow, shrink and otherwise change.
Optimize access to single files.
Optimize access to multiple files, particularly related files.
Survive failures and maintain a consistent view of file names and contents.

Three of These Things Are All Like Each Other

The file systems we will discuss all support the following features:

Files, including some number of file attributes and permissions.
Names, organized into a hierarchical name space.

This is the file interface and feature set we are all used to. The difference lie in the implementations and what happens on disk.

Implementing Hierarchical File Systems

Broadly speaking, two types of disk blocks:

Data blocks: contain file data.
Index nodes (inodes): contain not file data.

One of These Things Is Not Like the Others

What makes file systems different?

On-disk layout. How does the file system decide where to put data and metadata blocks in order to optimize file access?
Data structures. What data structures does the file system use to translate names and locate file data?
Crash recovery. How does the file system prepare for and recover from crashes?

File System Challenges

File systems are really maintaining a large and complex data structure using disk blocks as storage.
This is hard because making changes potentially requires updating many different structures.

Example `write`

Say a process wants to write data to the end of a file. What does the file system have to do?

Find empty disk blocks to use and mark them as in use.
Associate those blocks with the file that is being written to.
Adjust the size of the file that is being written to.
Actually copy the data to the disk blocks being used.

From the perspective of a process all of these things need to happen synchronously.
In reality, many different asynchronous operations are involved touching many different disk blocks. (Each operation above modifies at least one disk block.)
This creates both a consistency and a performance problem!

What Happens On Disk?

Let’s consider the on-disk structures used by modern file systems.

Specifically we are going to investigate how file systems:

translate paths to file index nodes or inodes.
find data blocks associated with a given inode (file).
allocate and free inodes and data blocks.

We’re going to try and keep this at a relatively high level, but examples are used for concreteness and drawn from the Linux ext4 file system.

Sectors, Blocks, Extents

Sector: the smallest unit that the disk allows to be written, usually 256 bytes.
Block: the smallest unit that the file system actually writes, usually 4K bytes.
Extent: a set of contiguous blocks used to hold part of a file. Described by a start and end block.

Why would file systems not write chunks smaller than 4K?

Because contiguous writes are good for disk head scheduling and 4K is the page size which affects in-memory file caching.

Why would file systems want to write file data in even larger chunks?

Because contiguous writes are good for disk head scheduling and many files are larger than 4K!