File System Data Structures
Locating inodes
-
All inodes are created at format time at well-known locations.
-
inodes may not be located near file contents.
ext4
creates multiple blocks of inodes within the drive to reduce seek times between inodes and data. -
Fixed number of inodes for the file system. Can run out of inodes before we run out of data blocks!
ext4
creates approximately one inode per 16 KB of data blocks, but this can be configured at format time.
open
: Path Name Translation
open("/etc/default/keyboard")
must translate "/etc/default/keyboard"
to an inode number.
-
Get inode number for root directory. This is usually a fixed agreed-on inode number, like 2.
-
Open the directory with inode number 2. Look for
"etc"
. Find"etc"
with inode number 393218. -
Open the directory with inode number 393218. Look for
"default"
. Find"default"
with inode number 393247. -
Open the directory with inode number 393247. Look for
"keyboard"
. Find keyboard with inode number 394692. -
Open the file with inode number 394692.
read
/write
: Retrieving and Modifying Data Blocks
-
read/write(filehandle, 345)
must translate to a -
There are multiple ways of doing this.
Data Blocks: Linked List
One solution: organize data blocks into a linked list.
-
inode contains a pointer to the first data block.
-
Each data block contains a pointer to the previous and next data block.
-
Simple.
-
Small amount of information in inode.
-
Offset look ups are slow! O(n) in the size of the file.
Data Blocks: Flat Array
A second solution: store all data blocks in the inode in a single array allocated at file creation time.
-
Also simple.
-
Offset look ups are fast, O(1).
-
Small file size fixed at startup time.
-
Large portion of array may be unused.
Data Blocks: Multilevel Index
Observation: most files are small, but some can get very large.
-
some pointers to blocks, which we refer to as direct blocks.
-
some pointers to blocks containing pointers to blocks, which we refer to as indirect blocks.
-
some pointers to blocks containing pointers to blocks containing pointers to blocks, which we refer to as doubly indirect blocks.
-
etc…
-
Index scales with the size of the file.
-
Offset look ups are still fairly fast.
-
Small files stay small, but big files can get extremely large.