The Linux Filesystem
Background
Your hard drive is divided into sections called partitions.
These are not physical divisions, but defined by software. At
the beginning of the hard drive is a special area called the
partition table, which lists the beginning and end of every
partition on the hard drive. In Windows, different partitions
are commonly given different drive letters, for example C: for
the system partition and D: for the user and documents
partition. Despite how Windows presents them, they are both
commonly on the same physical hard drive. In Linux the physical
hard drive is commonly identified as /dev/sda and partitions on
that hard drive as /dev/sda1, /dev/sda2, etc.
Most partitions contain a filesystem to make file and directory management robust and user-friendly. At the beginning of the partition is usually some info necessary to navigate the rest of the filesystem. Common filesystems include FAT32, NTFS (the default in Windows), and EXT4 (the default in many Linux distros). Note that Windows cannot read EXT4 filesystems without third-party tools, so if you want a dual-boot setup, your Windows partition must be NTFS, your Linux partition will typically be EXT4, and any partition you want to access from both operating systems should be NTFS. Really the only common partition that does not have a filesystem is a Linux swap partition, which the kernel uses as overflow space on your hard drive when RAM fills up (Windows has an equivalent called the pagefile).
Directory Structure
The Linux filesystem starts at /, which is called the "root" of
the filesystem. Every file, directory, and external storage
device is somewhere under /. That is to say, you can get to
anywhere on the filesystem by starting at / and going through
sub-directories. Once you get used to this concept, I think it
makes more sense than Windows and its drive letters. Any path
that begins with / is called an absolute path because it means
the same thing no matter what directory you are currently in.
Any path that does not start with / is called a relative path
because it is interpreted relative to your current directory.
There are two special entries that exist in every directory.
The entry . points to the directory it is in, and the entry ..
points to the parent directory of the directory it is in. You
will see these two entries in every directory no matter what
filesystem you are on. They only show up with ls -a since their
names start with a dot, which makes them count as "hidden".
Additionally, the shell provides one more shortcut: ~ expands
to your home directory, which is normally /home/currentuser/.
Finally, you should note that any path ending in the path
delimiter (i.e. /) is specifically a directory rather than a
file.
The above shortcuts are very convenient when working in the terminal.
To help you understand them, note that all of the following paths
refer to the same file:
/home/username/.ssh/config
~/.ssh/config
~/.ssh/../.ssh/config
~/.ssh/./config
./config (if your pwd is ~/.ssh/)
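As a quick check, the sketch below resolves several of those spellings and confirms that they land on the same file. It assumes the realpath command from GNU coreutils is installed, and it uses a scratch directory (the variable name demo_home is made up for this example) in place of your real home directory, so ~ is spelled out as "$demo_home".

```shell
# Create a scratch directory to stand in for /home/username (hypothetical name).
demo_home="$(mktemp -d)"
mkdir -p "$demo_home/.ssh"
echo "Host example" > "$demo_home/.ssh/config"

# Make .ssh the current directory so that ./config is meaningful.
cd "$demo_home/.ssh"

# realpath prints the canonical absolute path for each spelling.
abs_path="$(realpath "$demo_home/.ssh/config")"
via_parent="$(realpath "$demo_home/.ssh/../.ssh/config")"
via_dot="$(realpath "$demo_home/.ssh/./config")"
relative="$(realpath ./config)"

# All four lines printed here should be identical.
printf '%s\n' "$abs_path" "$via_parent" "$via_dot" "$relative"
```

The relative spelling only works because of the cd beforehand; run from any other directory, ./config would point somewhere else.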
There are quite a few directories in / by default, and it's
nice to know what their general purposes are. Although the
organization is just convention and not enforced, following it
makes everything easier.
/bin/ - programs needed before /usr/ is mounted
/boot/ - boot loader files
/dev/ - byte-level interface to physical devices
/etc/ - mostly config files
/home/ - user directories
/lib/ - shared libraries and kernel modules
/media/ - mount point for removable media (managed by distro)
/mnt/ - like /media/ but user-managed
/opt/ - manually installed software (not via package manager)
/proc/ - provides info about kernel and system
/root/ - home directory for the root user (it's not in /home/root/)
/run/ - files describing the state of running processes (also in /var/run/)
/sbin/ - same as /bin/ but these need root privileges
/srv/ - files made available to remote clients through services
/tmp/ - temporary files (erased on reboot)
/usr/ - multi-user programs
/usr/bin/ - general system-wide programs
/usr/sbin/ - same as /usr/bin/ but these need root privileges
/usr/local/bin/ - user-created system-wide programs
/usr/local/sbin/ - same as /usr/local/bin/ but these need root privileges
/var/ - temporary or state files (not erased on reboot)
For anyone used to Windows, I'd like to note that file extensions carry significantly less meaning than you've been led to believe. Most programs in Linux, including the operating system, do not care about file extensions. There are exceptions to this, but in general file extensions are purely intended as a convenient organization tool for the user. Changing an extension does nothing to change the actual data contained in the file.
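To see this for yourself, here is a minimal sketch that renames a text file so it ends in .jpg and compares a checksum of the contents before and after. The file names and the scratch directory are invented for the example; cksum is a standard POSIX utility.

```shell
# Work in a scratch directory so no real files are touched.
workdir="$(mktemp -d)"
printf 'plain text, whatever the name says\n' > "$workdir/notes.txt"

# Checksum of the contents before the rename (reading from stdin omits the file name).
before="$(cksum < "$workdir/notes.txt")"

# Change only the extension.
mv "$workdir/notes.txt" "$workdir/notes.jpg"

# Checksum of the contents after the rename.
after="$(cksum < "$workdir/notes.jpg")"

echo "before: $before"
echo "after:  $after"
```

Both lines should show the same checksum and size, since renaming touches only the directory entry and not the data.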
Ownership and Permissions
Every file and directory has an owner and related permissions.
The owner of a file is denoted by user:group, which means every
file (or directory) effectively has two owners. The first is a
user, and the second is a group of users. In some Linux distros
like Ubuntu, for every user there is a group with the same name
whose only member is that user. Thus files and directories can
be owned by laptopdude:laptopdude for example. On the other
hand, if I wanted to let only certain people access one of my
directories, I could create a group, let's call it friends, and
change the ownership of my directory to laptopdude:friends.
Together with permissions, this allows a Linux user
fine-grained file access control. You can modify the ownership
of files using the chown command, although handing a file to a
different user generally requires root privileges.
There are three permissions in Linux: read, write, and execute,
commonly referred to as rwx respectively. Anyone with read
permission can open and view the contents of a file. Anyone
with write permission can modify and save a file. Anyone with
execute permission can run the file as a program. For
directories, read permission allows the user to list the
contents of the directory, while execute permission allows the
user to enter the directory and access the files inside it.
These permissions can be specified for three different
categories of users. First is the user owner of the file,
second is the group owner of the file, and third is other
users, commonly referred to as ugo respectively. Therefore
every file has 9 permissions that can allow or deny users
access. They are commonly shown using a series of 9 characters
where each group of 3 corresponds (in order) to one of ugo. For
example, rwxrwxrwx allows anyone to do anything to that file,
while --------- allows no one to do anything. Many files in
your home directory are commonly rw-r--r--, which allows you
(the owner) to read and write them, but only allows other users
to read them. You can change the permissions of files that you
own using the chmod command.
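The following sketch walks a scratch file through a few permission changes with chmod and prints the resulting 9-character string each time. The helper name show_mode and the file name are made up for this example. Note that ls -l puts one extra character in front of the 9 permission characters to indicate the file type (- for a regular file, d for a directory).

```shell
# Work on a scratch file so nothing important changes.
workdir="$(mktemp -d)"
touch "$workdir/report.txt"

# Hypothetical helper: print the type character plus the 9 permission characters.
show_mode() { ls -ld "$1" | cut -c1-10; }

# Owner may read and write; group and others may only read.
chmod u=rw,go=r "$workdir/report.txt"
mode_default="$(show_mode "$workdir/report.txt")"

# Additionally let the owner execute the file.
chmod u+x "$workdir/report.txt"
mode_exec="$(show_mode "$workdir/report.txt")"

# Take read access away from group and others.
chmod go-r "$workdir/report.txt"
mode_private="$(show_mode "$workdir/report.txt")"

printf '%s\n' "$mode_default" "$mode_exec" "$mode_private"
```

The letters before = , + and - are the ugo categories described above, so u+x reads as "add execute for the user owner".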
Finally, Linux has "hidden" files. However, unlike in Windows,
there is no "Hidden" flag. Instead, any file or directory that
starts with . in its name is classified as "hidden", meaning
certain programs will not display it by default. You can view
hidden files with ls -a. One example is the ~/.ssh directory.
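A small sketch of the difference, using a scratch directory and two invented file names:

```shell
# Create one ordinary file and one whose name starts with a dot.
workdir="$(mktemp -d)"
touch "$workdir/visible.txt" "$workdir/.hidden.txt"

# Plain ls skips names that start with a dot.
plain="$(ls "$workdir")"

# ls -a lists everything, including the . and .. entries.
all="$(ls -a "$workdir")"

echo "ls:"
echo "$plain"
echo "ls -a:"
echo "$all"
```

Nothing about .hidden.txt is special apart from its name; renaming it without the leading dot would make it show up in plain ls again.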
Modifying the Filesystem
In Windows when you plug in a USB drive, it receives a new drive
letter and appears in My Computer. In Linux, when you mount a
USB drive, you attach its filesystem to a directory somewhere on
your existing filesystem, called the mountpoint. For example, in
Ubuntu your drive would likely end up mounted at
/media/username/drivename. You can then read and write any file
or folder on the drive by browsing to the aforementioned
mountpoint. In this way even external media (including CDs and
network drives) ends up as a sub-path of your root filesystem.
You can mount and unmount drives manually using the mount and
umount commands, or set up mounts that happen at boot by editing
/etc/fstab.
There is a special class of files called pseudofiles that look
like regular files but behave differently. For example, you will
likely find an entry at /dev/sda, which is a byte-level
interface to your main hard drive. You could "edit" it in order
to write bytes directly to your hard drive (warning: DO NOT EDIT
ANYTHING IN /dev). Likewise, reading or printing the file would
give you the raw data on the hard drive. Other pseudofiles can
do other interesting things. For example, you likely have one at
/sys/class/backlight/vendor/brightness (the vendor part depends
on your hardware) that controls your screen brightness. Writing
a number to that pseudofile will actually change your screen
brightness.
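Since experimenting on /dev/sda is dangerous, here is a harmless sketch that uses two other pseudofiles as stand-ins (my choice, not something taken from the text above): /dev/zero, which produces an endless stream of zero bytes when read, and /dev/null, which discards everything written to it. It assumes your head supports the -c option, as the GNU and BusyBox versions do.

```shell
# Read exactly 4 bytes from the device file and print them as hexadecimal.
bytes="$(head -c 4 /dev/zero | od -An -tx1 | tr -d ' \n')"
echo "bytes read from /dev/zero: $bytes"

# Anything written to /dev/null is thrown away by the kernel.
echo "this text vanishes" > /dev/null

# Reading /dev/null back yields nothing at all.
null_size="$(wc -c < /dev/null | tr -d ' ')"
echo "bytes read from /dev/null: $null_size"
```

Neither "file" stores anything on disk; the kernel answers the read and write requests itself, which is the same mechanism behind /dev/sda and the brightness file.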
The last cool thing you can do with the Linux filesystem is
symbolic linking. This is significantly different from a Windows
shortcut. When you create a symbolic link, the link is treated
as if it were the target of the link. For example, I could
create a symbolic link at ~/link that points to
~/documents/school/senior/paper.txt. Any operation you perform
on the path ~/link, such as opening it in a text editor, will be
carried out on the target instead. Since this is silently
carried out by the operating system, it can be very handy for
scripting and development. You can view symbolic links and
their targets with ls -l and create them with ln -s.
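Here is a minimal sketch of that behavior. It uses a scratch directory and a shortened, made-up path in place of the real home directory so that it is safe to run anywhere.

```shell
# Build a small directory tree with one real file in it.
workdir="$(mktemp -d)"
mkdir -p "$workdir/documents/school"
echo "first draft" > "$workdir/documents/school/paper.txt"

# Create the symbolic link: ln -s TARGET LINKNAME.
ln -s "$workdir/documents/school/paper.txt" "$workdir/link"

# Appending through the link actually modifies the target file.
echo "second draft" >> "$workdir/link"

# readlink prints where the link points; cat shows the target's contents.
target="$(readlink "$workdir/link")"
contents="$(cat "$workdir/documents/school/paper.txt")"

echo "link points to: $target"
echo "$contents"
ls -l "$workdir/link"
```

Deleting the link with rm removes only the link itself, while deleting the target leaves a dangling link behind that points at nothing.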
Live Media
If your computer has no operating system and you would like to install one, you have to boot your computer from Live Media. This could be a CD, DVD, or USB drive that contains an operating system. You insert the media, start the computer, press a function key to get to the Boot Menu, and then tell the computer to boot from the Live Media. The computer then reads the data on the Live Media, and boots the operating system it contains. This allows you to debug problems on a computer that crashes when trying to boot from an internal hard drive. When running an OS from Live Media, your computer's internal hard drives appear as mountable external storage. When booting from Linux Live Media, the root of your filesystem is located on the Live Media itself. From Live Media you can read/write data from your computer's hard drives, modify partitions, debug fatal problems, or install a new operating system. A Linux Live USB is a great thing to have in case your computer crashes and refuses to boot.
Conclusion
Although it can take some getting used to, I really think the Linux filesystem makes more sense than the Windows implementation. It is much more flexible and easier to work with when coding. Additionally, Linux users should be happy to know that EXT4 practically never requires defragmenting. Now you're ready to start learning about and working on the command line.