File systems made easy
A File System (FS) is anything that can store files. So yes, the above picture is a File System, although quite an ancient one now. These days everything is stored in a File System on disk, with help from your friendly neighbourhood Operating System (OS). So why care about File Systems? Understanding them can help you debug strange problems, such as data inconsistencies, and makes retrieving and manipulating data easier. Let's aim to understand the main concepts of the File System.
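We use them just about every single day... So what is a File? A File is a contiguous range of logical addresses. This means that all our files are really just a series of addresses, and at each address we store a portion of the data which makes up the file. It is important to note the difference between logical and physical addresses. For a file, a logical address is an offset from the start of the file, whereas a physical address is the location on the storage device where that piece of data actually lives. The File System translates one into the other, so the file looks contiguous to us even when its data is scattered across the disk. (The same terms are used for memory, where logical or virtual addresses are generated by the CPU and physical addresses correspond to particular cells in main memory (RAM).)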
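To make the idea of logical addresses a bit more concrete, here is a minimal Python sketch that reads a slice of a file by its byte offset. The helper name read_range and the throwaway demo file are made up for illustration. The point is only that we ask for "4 bytes starting at offset 6" and never need to know where those bytes physically live.

```python
import os
import tempfile


def read_range(path, offset, length):
    """Return `length` bytes starting at logical byte offset `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)         # jump to a logical address (an offset within the file)
        return f.read(length)  # the OS and File System locate the physical data for us


# Create a small throwaway file to demonstrate with (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello file system")
    demo_path = tmp.name

middle = read_range(demo_path, 6, 4)
print(middle)  # b'file'

os.remove(demo_path)  # clean up the demo file
```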
A file can be an executable file, a text file, a source file and so on. All of the contents of the file are stored in the address range. The data itself could be numeric (0-9), character (A-Z), or binary (0s and 1s). Every file also has attributes stored with it. I'm sure you have come across them, such as the file's creation date, name, or location.
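If you want to peek at these attributes yourself, Python's os.stat is one way to do it. This is just a sketch: the describe helper and the demo file are invented for this example, and it reports the last-modified time instead of a creation date because creation time is not available in the same way on every OS.

```python
import os
import tempfile
import time


def describe(path):
    """Collect a few attributes the File System stores alongside the file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "location": os.path.dirname(os.path.abspath(path)),
        "size_bytes": st.st_size,
        "modified": time.ctime(st.st_mtime),
        "permissions": oct(st.st_mode & 0o777),
    }


# Throwaway demo file containing exactly five bytes.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"12345")
    demo_path = tmp.name

info = describe(demo_path)
for key, value in info.items():
    print(f"{key}: {value}")

os.remove(demo_path)  # clean up the demo file
```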
Managing open files can be tricky. It's achieved by using an open-file table, which lists the files that are currently open. For each one it records a pointer to the last read position in the file, the number of times the file is currently open, the disk location of the file and, lastly, the access rights for that file. Some files may require file locking, especially when another user is trying to write to your file. For example in UNIX, the information in the open-file table, combined with file locks, helps to ensure that issues like race conditions do not occur.
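Here is a toy model of an open-file table in Python, tracking the pieces of information listed above. It is not how any real kernel implements it: the class names, the fake block number and the single shared position are all simplifying assumptions, purely to show the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class OpenFileEntry:
    disk_location: int   # hypothetical block number where the file starts
    access_rights: str   # e.g. "r" or "rw"
    position: int = 0    # last read position within the file
    open_count: int = 0  # how many times the file is currently open


class OpenFileTable:
    """Toy open-file table keyed by file name."""

    def __init__(self):
        self._entries = {}

    def open(self, name, disk_location, access_rights="r"):
        entry = self._entries.get(name)
        if entry is None:
            entry = OpenFileEntry(disk_location, access_rights)
            self._entries[name] = entry
        entry.open_count += 1
        return entry

    def close(self, name):
        entry = self._entries[name]
        entry.open_count -= 1
        if entry.open_count == 0:  # last close removes the entry
            del self._entries[name]

    def is_open(self, name):
        return name in self._entries


table = OpenFileTable()
table.open("notes.txt", disk_location=128)
entry = table.open("notes.txt", disk_location=128)
print(entry.open_count)            # 2
table.close("notes.txt")
table.close("notes.txt")
print(table.is_open("notes.txt"))  # False
```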
Moving up a layer - we are now looking at the directory itself. It is a structure used to collect file objects/inodes (whatever you like to call them), which contain information related to each file. These file objects usually point to the address range of the file stored on disk. A directory can also contain subdirectories. This is achieved by simply storing a directory node which points to the contents of the subdirectory. Most commonly the directory structure forms a Tree, reflecting the hierarchy of directories and files (top-level directory, second-level directory and so on). There are other storage structures available, but this is the most common.
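The tree shape is easy to see if you walk a directory yourself. The sketch below builds a small temporary directory (the names docs, drafts and notes.txt are made up) and turns it into a nested dictionary, where each subdirectory maps to its own contents and each file maps to its inode number.

```python
import os
import tempfile


def build_tree(path):
    """Return a nested dict: directories map to dicts, files map to inode numbers."""
    tree = {}
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                tree[entry.name] = build_tree(entry.path)  # recurse into the subdirectory
            else:
                tree[entry.name] = entry.inode()           # the file object's identifier
    return tree


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/docs/drafts/ and root/docs/notes.txt
    os.makedirs(os.path.join(root, "docs", "drafts"))
    open(os.path.join(root, "docs", "notes.txt"), "w").close()
    tree = build_tree(root)

print(tree)
```

The inode numbers will differ from machine to machine, so the exact output is not shown here.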
The storage device can be a Solid State Drive (SSD), a Hard Disk Drive (HDD), tape etc. These are used to store the File System. A disk can be sliced up into sections, which are called partitions. You may have heard of the term "Volume"; this is typically a single accessible storage area residing on a single partition of the disk. The volume is where the File System is located (along with the files).
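As a quick way to see a volume from a program's point of view, Python's shutil.disk_usage reports the capacity of whichever volume holds a given path. This is only a sketch, and using the current working directory as the path is an arbitrary choice for the example.

```python
import os
import shutil

path = os.getcwd()               # any path on the volume you are curious about
usage = shutil.disk_usage(path)  # named tuple with total, used and free, in bytes

gib = 1024 ** 3
print(f"volume holding {path}")
print(f"total: {usage.total / gib:.1f} GiB")
print(f"used:  {usage.used / gib:.1f} GiB")
print(f"free:  {usage.free / gib:.1f} GiB")
```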
Common operations of the File System are creating files, deleting files, renaming files, listing the contents of a directory and so forth. In UNIX, these translate to commands such as
"touch <file_name>",
"rm <file_name>",
"mv <old_name> <new_name>",
"ls". These commands are all implementable, so if you felt the need, it's possible to write your own program to do something like rename a file. Why would you want to do this? Well, it's just good practice for understanding how a File System works under the hood. How would you go about doing this? By updating the stored name of the File to reflect the change. On UNIX-like systems the name lives in the directory entry that points to the file rather than in the file's own attributes, and a program would normally ask the OS to make the change through a system call such as rename. There are many more FS operations, and some systems even have custom operations to fulfil unique user requirements.
File Systems can be complex, but they don't have to be. Sure, it is a broad topic, especially when diving into different File System architectures and performance-enhancing techniques. Hopefully this post piqued your interest. If so, definitely don't stop here, as there are many more resources dedicated to implementing, choosing, and optimising File Systems. Administrators, when trying to work out which File Systems are the most performant for your OS, try analysing benchmarks to compare speed and trade-offs.