Files
File Systems To The Rescue
-
Requires reading and writing entire 512-byte blocks.
-
No notion of files, directories, etc.
-
Compared to the CPU and memory abstractions that we studied previously, more of the file abstraction is implemented in software.
-
This explains the plethora of available file systems: ext2, ext3, ext4, reiserfs, NTFS, jfs, lfs, xfs, etc.
-
This is probably why many systems people have a soft spot for file systems even if they seem a bit outdated these days.
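To make the block-only disk interface concrete, here is a minimal sketch of what updating a few bytes looks like when the device can only read and write whole 512-byte blocks. The `BlockDevice` class and `write_bytes` helper are hypothetical names invented for illustration; the "disk" is just a list of byte strings in memory.

```python
BLOCK_SIZE = 512  # block size mentioned in the text


class BlockDevice:
    """Hypothetical in-memory stand-in for a raw disk: whole-block I/O only."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, n):
        return self.blocks[n]

    def write_block(self, n, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("must write exactly one full block")
        self.blocks[n] = bytes(data)


def write_bytes(dev, offset, payload):
    """Update a byte range by read-modify-write of every block it touches."""
    pos = offset
    remaining = payload
    while remaining:
        block_no, start = divmod(pos, BLOCK_SIZE)
        chunk = remaining[:BLOCK_SIZE - start]   # part that fits in this block
        buf = bytearray(dev.read_block(block_no))
        buf[start:start + len(chunk)] = chunk    # patch the bytes in memory
        dev.write_block(block_no, buf)           # write the whole block back
        pos += len(chunk)
        remaining = remaining[len(chunk):]


dev = BlockDevice(4)
write_bytes(dev, 510, b"hello")  # straddles blocks 0 and 1
```

Even this five-byte write touches two full blocks, which is the kind of bookkeeping a file system hides from its users.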
What About Flash?
No moving parts! Great! We can eliminate a lot of the complexity of modern file systems. Yippee!
-
Have to erase an entire large chunk before we can rewrite it.
-
And it wears out faster than magnetic drives, and can wear unevenly if we are not careful.
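The erase-before-rewrite constraint can be sketched with a toy model. This is an illustration only: `FlashBlock`, its page count, and the `rewrite` helper are assumptions made up for this example, and real flash translation layers are far more sophisticated.

```python
class FlashBlock:
    """Toy erase block: each page can be written once per erase cycle."""

    def __init__(self, pages=4):
        self.pages = [None] * pages  # None means "erased"
        self.erase_count = 0         # rough proxy for wear

    def program(self, page_no, data):
        if self.pages[page_no] is not None:
            raise RuntimeError("page already written; erase the whole block first")
        self.pages[page_no] = data

    def erase(self):
        self.pages = [None] * len(self.pages)
        self.erase_count += 1

    def rewrite(self, page_no, data):
        """Changing one page means saving, erasing, and rewriting the whole block."""
        saved = list(self.pages)
        saved[page_no] = data
        self.erase()
        for i, contents in enumerate(saved):
            if contents is not None:
                self.program(i, contents)


block = FlashBlock()
block.program(0, b"a")
block.program(1, b"b")
block.rewrite(0, b"A")  # costs one erase of the entire block
```

If every rewrite lands on the same block, its `erase_count` grows much faster than that of its neighbors, which is the uneven wear the slide warns about.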
Clarifying the Concept of a File
Most of us are familiar with files, but the semantics of files come from a variety of sources that are worth separating:
-
Just a file: the minimum it takes to be a file.
-
About a file: what other useful information do most file systems typically store about files?
-
Files and processes: what additional properties does the UNIX file system interface introduce to allow user processes to manipulate files?
-
Files together: given multiple files, how do we organize them in a useful way?
Just a File: The Minimum
-
Reliably store data. (Duh.)
-
Be located! Usually via a name.
Basic File Expectations
-
File contents should not change unexpectedly.
-
File contents should change when requested and as requested.
These requirements seem simple, but many file systems do not meet them!
03 Mar 2012: Bug Report–Serious file system corruption and data loss caused to other NTFS drives by Windows 8 CP
Failures such as power outages and sudden ejects make file system design difficult and expose tradeoffs between durability and performance.
-
Memory: fast, transient. Disk: slow, stable.
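The durability versus performance tradeoff shows up directly in application code. The sketch below uses Python's standard `os.fsync` to ask the OS to push buffered data to the storage device before continuing; `durable_write` is a hypothetical helper name, and how strong the resulting guarantee is depends on the OS, file system, and hardware.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data, then ask the OS to move it from memory to stable storage."""
    with open(path, "wb") as f:
        f.write(data)         # lands in Python's user-space buffer
        f.flush()             # hands the bytes to the OS (still cached in memory)
        os.fsync(f.fileno())  # requests that the OS write them to the device


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example.txt")
    durable_write(path, b"important data\n")
    with open(path, "rb") as f:
        contents = f.read()
```

Skipping the `fsync` call is faster because the data stays in memory, but a power failure at that point could lose the write.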
About a File: File Metadata
-
When was the file created, last accessed, or last modified?
-
Who is allowed to do what to the file: read, write, rename, change other attributes, etc.?
-
Other file attributes?
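On UNIX-like systems much of this metadata can be inspected with the `stat` system call. The following sketch uses Python's standard `os.stat` wrapper on a temporary file; the variable names are illustrative only.

```python
import os
import stat
import tempfile
import time

# Create a small scratch file to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)
size = info.st_size                        # length in bytes
modified = time.ctime(info.st_mtime)       # last modification time
accessed = time.ctime(info.st_atime)       # last access time
# Note: on UNIX st_ctime is the last metadata change, not the creation time.
mode_string = stat.filemode(info.st_mode)  # permissions, e.g. '-rw-------'
owner_uid = info.st_uid                    # numeric owner ID

os.remove(path)
```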
Where to Store File Metadata?
An MP3 file contains audio data. But it also has attributes such as:
-
title
-
artist
-
date
-
In the file itself.
-
In another file.
-
In attributes associated with the file and maintained by the file system.
-
Example: MP3 ID3 tag, a data container stored within an MP3 file in a prescribed format.
-
Pro: travels along with the file from computer to computer.
-
Con: requires all programs that access the file to understand the format of the embedded metadata.
-
Example: iTunes database.
-
Pro: can be maintained separately by each application.
-
Con: does not move with the file, and the separate file must be kept in sync when the files it describes change.
-
Example: file attributes have been supported by a variety of file systems, most prominently BFS, the BeOS file system.
-
Pro: maintained by the file system so can be queried and queried quickly.
-
Con: does not move with the file, and creates compatibility problems with other file systems.
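As an illustration of the first option (metadata in the file itself), the sketch below builds and parses an ID3v1 tag, the older fixed-size variant of ID3 that occupies the last 128 bytes of an MP3 file. The helper names `make_tag` and `read_tag` are hypothetical, the "MP3" is fake filler data, and the newer ID3v2 format is not handled.

```python
import struct

# ID3v1 layout: "TAG", title, artist, album, year, comment, genre = 128 bytes.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def make_tag(title, artist, year):
    """Build an ID3v1 tag; struct pads short fields with NUL bytes."""
    def field(text, width):
        return text.encode("latin-1")[:width]
    return ID3V1.pack(b"TAG", field(title, 30), field(artist, 30),
                      b"", field(year, 4), b"", 255)


def read_tag(file_bytes):
    """Return the embedded attributes, or None if no ID3v1 tag is present."""
    if len(file_bytes) < ID3V1.size:
        return None
    tag, title, artist, _album, year, _comment, _genre = ID3V1.unpack(
        file_bytes[-ID3V1.size:])
    if tag != b"TAG":
        return None

    def clean(raw):
        return raw.rstrip(b"\x00 ").decode("latin-1")

    return {"title": clean(title), "artist": clean(artist), "date": clean(year)}


fake_mp3 = b"\xff\xfb" + bytes(100) + make_tag("Song", "Band", "2012")
tags = read_tag(fake_mp3)
```

The "con" from the list is visible here: any program that wants the title must know this exact byte layout.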
Processes and Files: UNIX Semantics
Many file systems provide an interface for establishing a relationship between a process and a file.
-
"I have the file open. I am using this file."
-
"I am finished using the file and will close it now."
-
Can improve performance: if the OS knows which files are actively being used, it can apply caching or read-ahead.
-
The file system may provide guarantees to processes based on this relationship, such as exclusive access.
-
Some file systems, particularly networked file systems, don’t even bother to establish these relationships. (What happens if a networked client opens a file exclusively and then dies?)
File Location: UNIX Semantics
UNIX semantics simplify reads and writes to files by storing the file position for processes.
-
This is a convenience, not a requirement: processes could be required to provide a position with every read and write.
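Both styles exist in practice. The sketch below contrasts `os.read`, which uses and advances the position the OS stores for the open file, with `os.pread`, which takes an explicit offset and leaves the stored position alone. `os.pread` is available on UNIX-like systems only, so this example assumes such a platform.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"abcdefghij")
os.lseek(fd, 0, os.SEEK_SET)   # rewind to the start of the file

first = os.read(fd, 3)         # reads at position 0, position becomes 3
second = os.read(fd, 3)        # reads at position 3, position becomes 6
explicit = os.pread(fd, 3, 7)  # reads at offset 7, stored position unchanged
third = os.read(fd, 3)         # continues from position 6

os.close(fd)
os.remove(path)
```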
UNIX File Interface
-
open("foo"): "I’d like to use the file named foo."
-
close("foo"): "I’m finished with foo."
-
read(2): "I’d like to perform a read from file handle 2 at the current position."
-
write(2): "I’d like to perform a write to file handle 2 at the current position."
-
lseek(2, 100): "Please move my saved position for file handle 2 to position 100."
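The calls above are simplified for the slides. A minimal sketch of the same sequence using Python's thin wrappers over the real system calls follows; note that the real `close` takes the file handle returned by `open` rather than a name, and that reads and writes also take a buffer or byte count. The file name `foo` comes from the slide; everything else is illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "foo")

    # open: returns a small integer file handle (file descriptor).
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

    os.write(fd, b"x" * 200)                  # write at the current position
    new_pos = os.lseek(fd, 100, os.SEEK_SET)  # move saved position to 100
    os.write(fd, b"HELLO")                    # overwrite bytes 100..104

    os.lseek(fd, 100, os.SEEK_SET)
    data = os.read(fd, 5)                     # read at the current position
    pos_after = os.lseek(fd, 0, os.SEEK_CUR)  # query position without moving

    os.close(fd)                              # close: done with this handle
```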