
In a recent post, we discussed why hierarchies are a problematic way of organizing files and how we can squash our files into a hierarchy when we have to use one anyway. This week, we’re going to take off in a different direction. You see, typical “hierarchical” filesystems aren’t quite 100% hierarchical: they allow creating different kinds of links to other areas in the filesystem, which means you can bypass the Single True Path requirement and access one file or folder from multiple places in the hierarchy.

It’s important to recognize that links are not a panacea; while they can solve specific problems with hierarchies quite effectively, they add new problems of their own, in particular increased complexity and the potential for broken links. In order to make sure we understand links well enough to take full advantage of them without getting ourselves into trouble, we’ll spend two weeks covering links. We’ll start out this week by looking at what links are, why we’d want to use them, and what kinds of links are available. Next week, we’ll discuss problems to avoid, specific use cases, and my recommendations for how to use links most effectively.

In an attempt to strike a balance between writing things in unnecessarily verbose ways and making this series accessible to people with less technical knowledge, I have added a filesystem glossary in the navigation bar (over on the left on a large screen, or under the “toggle menu” button at the top of the page on a small screen), which you can refer to if you run into terms you aren’t familiar with.

What is a link?

For our purposes, a link is an identifier that isn’t interesting in itself but refers to something else that is. On the Web, a hyperlink refers to another website that you might want to visit while looking at the current content; the URL itself isn’t very helpful, but when you click on the link, you get taken to the website where you can learn useful information. Similarly, a phone number can be considered a kind of link: if you punch the number into a phone keypad, you can talk to the person or institution the phone number refers to.

When focusing on filesystems, a link is a special type of file that refers to a different file or folder. (This is a slightly simplified definition, but it will do for the time being.) Shortcut and alias are other terms that are often used to refer to (usually specific types of) filesystem links.

Imagine you’re in the folder C:\Users\YourName\FavoriteLinks. You can put links in this folder to C:\Users\Alice\Photos\Family, D:\PublicFiles, C:\Users\Soren\Writing\Blogs\my-blog-post.docx, and so on. All of these files would then be accessible both from their original locations (the paths I just mentioned) and by double-clicking them in C:\Users\YourName\FavoriteLinks.

You can create as many links to a file as you want. This means you can have a file show up in multiple places and identified by different names, allowing you to bypass some of the problems with hierarchies. It also means if you have a file or folder with a really obnoxious path, say C:\Utilities\My-Dumb-Utility\My-Dumb-Utility\2.0\Files\2018\11\27\Documents\User, you can drop a link to that file or folder somewhere convenient (say, on your desktop) and access it instantly.

Because links are separate files rather than the actual files they refer to, they have some drawbacks. The drawbacks differ for each type of link, so we’ll describe them alongside each type.

For more ideas on why you’d want to use links, check out the common use cases for links section in the post after this one.

Note: The types of links available differ depending on the operating system you’re using. As is typical for Control-Alt-Backspace, I try to include as much information as I can for all three of the most popular operating systems today: Microsoft Windows, Mac OS, and Linux.

Warning: In this series of articles, I typically use the term “filesystem” to refer generally to one’s collection of files and the organization thereof (e.g., “My filesystem contains hundreds of bootleg Family Guy episodes”). The following sections will use the term in a more technical manner, referring to a particular software component that manages the organization of files (e.g., “The FAT32 filesystem cannot deal with files larger than 4 gigabytes”). Apologies in advance for the confusion.

Windows shortcuts

Availability
Microsoft Windows (starting with Windows 95); any filesystem.
Implementation
A special file in a proprietary format with the file extension .lnk (the extension is in virtually all circumstances hidden from the user by Windows Explorer, even if the show file extensions option is on). The file contains the path to the file or folder the shortcut targets, along with additional options.
Creation
Choose one of the following mechanisms (there are probably others; these are just the methods I use myself):
  • Right-click on a file and choose “create shortcut” (then move the shortcut to an appropriate location).
  • Hold down the Alt key while dragging the file to the location you want to create the shortcut.
  • Drag the file with the right mouse button, then choose “create shortcut” on releasing the mouse button.
  • Right-click anywhere in Windows Explorer and create a new file of type Shortcut. Enter or browse for the path of the file you want to create a shortcut to.
Benefits
  • Very easy to create.
  • Familiar to any Windows user. The icons you click on to start programs, for example, are Windows shortcuts.
  • Support additional options, like a keyboard shortcut, parameters that are sent to the linked program when it’s run, or a custom icon.
  • Work even on very basic filesystems like FAT32 where other types of links are unavailable.
Drawbacks
  • Windows Explorer is the only program that understands Windows shortcuts. This often goes unnoticed since most Windows programs use Windows Explorer to browse for files. But if you try to use a shortcut at a command line or in certain programs, it will just show up as a .lnk file and you won’t be able to follow it or do anything useful with it. Similarly, the shortcut is generally unreadable and useless if you try to use it from a Mac or Linux computer or from a web interface such as Dropbox.
  • Following a Windows shortcut actually takes you to the folder or file it links to. That is, if you’re in C:\Users\Me\Documents and you follow a shortcut named MyShortcut that points to a folder, you will now be in the folder that MyShortcut targets (say, E:\SomewhereEntirelyDifferent). With a symlink, you would be in a folder called C:\Users\Me\Documents\MyShortcut containing the contents of E:\SomewhereEntirelyDifferent, which is often more useful – most of the time you don’t care where the item actually is and you’d rather call it by a different name that makes more sense in the context.
  • If the target is moved, the shortcut will no longer work. Windows will typically offer to attempt to locate the new target; this is certainly helpful, but it is fairly unreliable.

Mac OS aliases

Disclaimer: While I do have experience with Mac OS, my Mac knowledge is significantly weaker than my Windows or Linux knowledge. Much of this information comes from Wikipedia rather than from experience. Please let me know if I’ve gotten something wrong.

Availability
Mac OS (starting with System 7); any filesystem.
Implementation
A special file in a proprietary format, containing a variety of information about the file it points to that assists in re-locating the file if it moves.
Creating
  • Right-click or control-click a file and choose “Make Alias”.
  • Select a file and press Command+L (Control+Command+A in recent versions of Mac OS).
  • Hold down the Command and Option keys while dragging a file to the location where you want to create the alias. (Holding Option alone makes a copy instead.)
Benefits
  • Very easy to create.
  • Familiar to most Mac users.
  • Fairly fault-tolerant: in contrast to Windows shortcuts and symlinks, the Mac can usually locate the target file even after it has been moved or renamed, because aliases store more information about the file than just the name.
Drawbacks
  • Aliases work only in Finder (just as Windows shortcuts work only in Windows Explorer) and will not be recognized by the command line, by certain programs, or if accessed from another operating system or a web interface.
  • As with Windows shortcuts, following an alias actually takes you to the file it links to, so you can’t hide where the target is behind the name of the alias.
  • Because of the additional data stored for fault tolerance, aliases, especially those pointing to folders, can be much larger than other kinds of links – occasionally as large as 5MB! This is typically not an issue with the size of modern hard drives, but if you start wanting to automatically create links to hundreds or thousands of files or folders, it might turn into a serious annoyance.
  • While the search algorithms to locate targets that have been moved generally work well, there are no guarantees and they can fail occasionally. In addition, sometimes the correct target can actually be ambiguous – for instance, if you move the original file and then create another one at the original location with the same name, should the alias point to the moved file or the new file which was intended to replace the one that moved? (OS 10.2 and higher – which includes any version of Mac OS you’re using nowadays – choose the new file; older versions chose the moved file.)

Symbolic links

Availability
All major operating systems: in Linux from the earliest days, in Mac OS since OS X, and in Windows since Windows Vista (though oddly, only since the Windows 10 Creators Update have you been able to create them without administrator privileges). Most major filesystems, including Windows’ NTFS, Mac OS’s HFS+, and all common Linux filesystems (ext2/3/4, ZFS, btrfs, etc.). Notably, however, this does not include the simple FAT and FAT32 filesystems commonly used for low-cost removable media like flash drives and SD cards.
Implementation
Unlike shortcuts or aliases, symlinks are implemented at the filesystem level, so they’re independent of the software that’s accessing them. A symlink is a file that generally contains the path to its target as text and has a special attribute set that indicates it’s a symlink.
Creating
  • Windows requires a shell extension (basically an add-on for Windows Explorer) to create symlinks in Windows Explorer.
  • Mac OS requires a script or third-party add-on to create symlinks in Finder.
  • Most Linux file browsers can create symlinks natively.
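  • In Linux and Mac OS, you can use the command ln -s to create a symlink at a command prompt (the target comes first, then the name of the link); in Windows, you can use mklink (or mklink /d if linking to a folder), which takes the name of the link first and the target second.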
  • For more details on creating symlinks in Windows, check out this fabulously detailed How-To Geek article.
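
If you’d rather script this than type commands, here is a minimal Python sketch of the same idea, using the standard library’s os.symlink and os.readlink. The file names original.txt and shortcut.txt are made up for illustration, and on Windows the call may fail unless you have the privileges discussed above, so treat this as a sketch rather than a recipe.

```python
import os
import tempfile


def make_symlink_demo(base_dir):
    """Create a file and a symlink to it; return (link_path, stored_target)."""
    # Hypothetical file names, chosen only for this example.
    target = os.path.join(base_dir, "original.txt")
    link = os.path.join(base_dir, "shortcut.txt")
    with open(target, "w") as f:
        f.write("hello from the original\n")

    # Roughly what `ln -s original.txt shortcut.txt` does.
    os.symlink(target, link)

    # readlink returns the path stored in the link without opening the target.
    return link, os.readlink(link)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        link, stored = make_symlink_demo(tmp)
        print("the link stores:", stored)
        with open(link) as f:  # opening the link transparently follows it
            print("read through the link:", f.read().strip())
        print("is it a symlink?", os.path.islink(link))
```

Note that reading the link’s own contents takes a special call (readlink); ordinary file operations just see the target.
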
Benefits
  • Since symbolic links are part of the filesystem rather than a specific program, they will almost always work in any program for any purpose. (A program can specifically choose to ignore or not follow symlinks, but generally only programs that have a good reason to do this will do it.)
  • Unlike Windows shortcuts or Mac OS aliases, symlinks are almost 100% transparent to the user. Going back to our earlier example, if you’re in C:\Users\Me\Documents and you follow a symlink MyShortcut you find in this folder, you will now be in a folder called C:\Users\Me\Documents\MyShortcut, with no indication that MyShortcut is actually a link pointing to E:\SomewhereEntirelyDifferent. With shortcuts or aliases, when you open MyShortcut, you’ll be immediately plopped into E:\SomewhereEntirelyDifferent.
  • Symlinks generally have good cross-platform compatibility. Since they’re a filesystem feature, usually any computer that can read a given filesystem will also understand its symlinks. This isn’t guaranteed, but it’s far more likely than with Windows shortcuts or Mac OS aliases.
  • While shortcuts and aliases support only absolute paths, symlinks also support relative paths (have a peek at the glossary if you’re not sure what these are). That is, you can create a symlink to, say, ../otherFolder/file.txt. These are highly useful in many cases because they won’t break if you move the parent folder around or you access the symlinks on a different computer where your files are stored at I:\Users\Soren instead of C:\Users\Soren.
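
To make the relative-path benefit concrete, the following Python sketch builds a small folder (the names project, data, and links are invented for the example), creates one relative and one absolute symlink to the same file, and then moves the whole folder. It assumes a system where os.symlink is permitted.

```python
import os
import tempfile


def relative_link_survives_move(base_dir):
    """Move a folder containing two symlinks and report which still resolve."""
    project = os.path.join(base_dir, "project")  # hypothetical layout
    os.makedirs(os.path.join(project, "data"))
    os.makedirs(os.path.join(project, "links"))
    target = os.path.join(project, "data", "file.txt")
    with open(target, "w") as f:
        f.write("payload")

    # Relative target: interpreted relative to the folder holding the link.
    os.symlink(os.path.join("..", "data", "file.txt"),
               os.path.join(project, "links", "rel"))
    # Absolute target: pinned to where the project lives right now.
    os.symlink(target, os.path.join(project, "links", "abs"))

    # Move the parent folder, links and target together.
    moved = os.path.join(base_dir, "moved-project")
    os.rename(project, moved)

    # os.path.exists follows symlinks, so it is False for a broken link.
    return {
        "relative": os.path.exists(os.path.join(moved, "links", "rel")),
        "absolute": os.path.exists(os.path.join(moved, "links", "abs")),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(relative_link_survives_move(tmp))
```

The relative link keeps working because the link and its target moved together; the absolute link still points at the old location, which no longer exists.
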
Drawbacks
  • Symlinks will still break if the original file is deleted or moved. This is somewhat mitigated by relative paths, though those have the disadvantage that they can break under some circumstances if the symlink is moved. Further, there is no service to help reconnect broken symlinks on any operating system I’m aware of, which makes the situation even worse than for Windows shortcuts or Mac OS aliases.
  • Symlinks don’t support certain fancy features that Windows shortcuts have, like the ability to choose a specific icon or run a program with specific options.
  • Symlinks don’t work on very simple filesystems like FAT32, which means you can’t create a symlink on most removable devices like flash drives or SD cards. (You can still create a symlink from a modern filesystem to something on these devices, though.)
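
Although nothing will reconnect broken symlinks for you, finding them is straightforward. Here is a small Python sketch (the function name find_broken_symlinks is my own, not a standard tool) that walks a folder and lists every symlink whose target is missing.

```python
import os
import tempfile


def find_broken_symlinks(root):
    """Return the paths of symlinks under root whose targets do not exist."""
    broken = []
    # os.walk does not descend into symlinked folders by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink inspects the link itself; exists follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        kept = os.path.join(tmp, "kept.txt")
        doomed = os.path.join(tmp, "doomed.txt")
        for path in (kept, doomed):
            with open(path, "w") as f:
                f.write("data")
        os.symlink(kept, os.path.join(tmp, "to-kept"))
        os.symlink(doomed, os.path.join(tmp, "to-doomed"))
        os.remove(doomed)  # the second link is now dangling
        print(find_broken_symlinks(tmp))
```

Deciding what a broken link should point to is still up to you; the script only tells you where to look.
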

Hard links

Availability
All major operating systems: in Linux from the earliest days, in Mac OS since OS X, and in Windows since Windows Vista (in a limited form since Windows 2000). All modern filesystems; as with symlinks, this doesn’t include FAT32.
Creating
Without add-ons, hard links can only be created from the command line on all major operating systems. On Linux and Mac OS, use ln; on Windows, use mklink /h (starting in Windows Vista). The Windows shell extension mentioned in the symlinks section can also create hard links.
Implementation
The other types of links listed here are all variations on one basic idea: a link is a file that contains the path to another file, and when you open the link, you’re redirected to the target file. The link is just a pointer to the target file – it isn’t actually the file. This has familiar consequences; for instance, if you delete the target file, the link will stop working.


Hard links are completely different. With a hard link, the link is functionally identical to the original file. It’s essentially like having multiple copies of the file, except that when you change one copy all the copies change in sync. If you delete a hard link, any other hard links remain in place and act exactly as they always did. When zero hard links remain, the file is deleted.

Under the hood: While hard links may sound magical and exotic, there’s nothing remotely special or unfamiliar about them – every file you see and work with is a hard link! A hard link is a record maintained by the filesystem that associates a filename with information about where that file can be found on the disk. Creating a new “link” consists of duplicating the record, so two filenames refer to the same data on disk.
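
The behavior described here is easy to observe for yourself. The Python sketch below (file names are invented; it assumes a filesystem that supports hard links) creates a second name for a file with os.link, edits the file through one name, and deletes the other.

```python
import os
import tempfile


def hard_link_demo(base_dir):
    """Give one file two names and record what happens along the way."""
    first = os.path.join(base_dir, "first-name.txt")   # hypothetical names
    second = os.path.join(base_dir, "second-name.txt")
    with open(first, "w") as f:
        f.write("version 1")

    os.link(first, second)  # roughly `ln first-name.txt second-name.txt`

    facts = {
        # Both names lead to the same underlying file.
        "same_file": os.path.samefile(first, second),
        # The filesystem counts how many names the file has.
        "link_count": os.stat(first).st_nlink,
    }

    # Change the file through one name...
    with open(second, "w") as f:
        f.write("version 2")
    # ...and the change is visible through the other.
    with open(first) as f:
        facts["seen_via_first"] = f.read()

    # Deleting one name leaves the data reachable through the remaining one.
    os.remove(first)
    with open(second) as f:
        facts["after_delete"] = f.read()
    facts["link_count_after"] = os.stat(second).st_nlink
    return facts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in hard_link_demo(tmp).items():
            print(key, "=", value)
```

Neither name is “the original” once the link exists; the link count simply drops from 2 to 1 when either name is removed.
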

Benefits
  • Like symlinks, hard links are generally understood by any system that can read the filesystem at all.
  • Since hard links aren’t really “links” at all in the traditional sense but just files that exist in multiple places, they’re the most transparent to users.
  • There is no way for a hard link to become “broken” (aside from the entire filesystem becoming corrupt due to a bug or hardware error, in which case you have much bigger problems than broken links). If the hard link exists, the file is accessible through it.
Drawbacks
  • Hard links typically cannot be created to folders.
  • Hard links cannot link to files on other disks or partitions – so if I have two hard drives, a C drive and a D drive, I can’t create a hard link on the C drive to a file on the D drive. This should not come as a surprise if you understand how hard links are implemented (see the “Under the hood” sidebar above), but it’s a major limitation nonetheless.
  • Hard links often do not copy the way you would expect. If you have two hard-linked files and you copy them to a different location, you will typically end up with two duplicate, non-linked files in the new location. Some better copying tools like Linux’s rsync have options to preserve hard links, but you must take extra care to make sure you always use one of these tools so as not to lose the links. And it is very difficult to notice if you forgot, since a hard link looks the same as any other file!
  • Like symlinks, hard links are not supported on FAT32 filesystems, used by most cheap removable devices like flash drives and SD cards.
  • Most filesystems limit how many hard links you can create to a given file. The limit is usually at least a thousand, so this is rarely cause for concern, but there are no such limitations on any other kind of link.
  • Since hard links to a file are no different than any other file, it is possible to accidentally delete your last hard link to a file and permanently delete the file without realizing what you’re doing. With symlinks, Windows shortcuts, or aliases, you can tell whether you’re working with an “actual copy” of the file or a “link”, which means you can confidently delete links while reorganizing without worrying about actually losing anything.
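
Two of these drawbacks, silent un-linking on copy and the difficulty of spotting hard links at all, can be demonstrated in a few lines. This Python sketch is only an illustration: copy_breaks_hard_links and hard_link_groups are names I made up, and shutil.copytree stands in for any copying tool that isn’t hard-link aware.

```python
import os
import shutil
import tempfile


def copy_breaks_hard_links(base_dir):
    """Copy a folder holding two hard-linked names; report linkage before and after."""
    src = os.path.join(base_dir, "src")
    dst = os.path.join(base_dir, "dst")
    os.makedirs(src)
    a = os.path.join(src, "a.txt")
    b = os.path.join(src, "b.txt")
    with open(a, "w") as f:
        f.write("shared")
    os.link(a, b)

    shutil.copytree(src, dst)  # an ordinary copy that copies each name separately

    linked_before = os.path.samefile(a, b)
    linked_after = os.path.samefile(os.path.join(dst, "a.txt"),
                                    os.path.join(dst, "b.txt"))
    return linked_before, linked_after


def hard_link_groups(root):
    """Group files under root that share storage (link count above 1)."""
    groups = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_nlink > 1:
                # Device plus inode number identifies the underlying file.
                groups.setdefault((st.st_dev, st.st_ino), []).append(path)
    return [sorted(paths) for paths in groups.values()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(copy_breaks_hard_links(tmp))
        print(hard_link_groups(os.path.join(tmp, "src")))
        print(hard_link_groups(os.path.join(tmp, "dst")))
```

Running something like hard_link_groups before and after a big copy is one way to notice that links were lost, since nothing in a normal file listing will tell you.
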

Under the hood: Why can’t you create a hard link to a folder? Some filesystems actually do allow this, but most forbid it because of painful experience! If hard links to folders are allowed, you or a program can accidentally create a folder that contains a hard link to itself or a parent folder, causing programs to enter an infinite loop as they attempt to walk through all the directories in the filesystem. Symlinks have this problem too, but programs that automatically walk through directories, such as backup programs or virus scanners, can avoid the issue by choosing not to follow symlinks or by noting the target that each symlink points to and skipping that target if they reach it again. Since hard links are indistinguishable from any other file or folder and don’t have a single “target” path, the only practical way to avoid the issue is to ensure the filesystem doesn’t have any of these loops in it.
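
Here is a rough Python sketch of how a directory-walking program might protect itself from symlink loops. Rather than recording each symlink’s target path, this variation remembers the identity (device and inode number) of every folder it has entered, which amounts to the same “skip it if we’ve been here before” idea. It assumes a POSIX-like system where those numbers are meaningful, and walk_following_links is a name I invented.

```python
import os
import tempfile


def walk_following_links(root):
    """List folders under root, following symlinks but entering each folder once."""
    seen = set()
    visited = []
    stack = [root]
    while stack:
        folder = stack.pop()
        st = os.stat(folder)  # follows symlinks, so we see the real folder
        identity = (st.st_dev, st.st_ino)
        if identity in seen:
            continue  # already walked this folder via another path: skip the loop
        seen.add(identity)
        visited.append(folder)
        with os.scandir(folder) as entries:
            for entry in entries:
                # True for real folders and for symlinks that point to folders.
                if entry.is_dir(follow_symlinks=True):
                    stack.append(entry.path)
    return visited


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inner = os.path.join(tmp, "a", "b")
        os.makedirs(inner)
        # A symlink inside a/b that points back up to a: a classic loop.
        os.symlink(os.path.join(tmp, "a"), os.path.join(inner, "loop"))
        for folder in walk_following_links(tmp):
            print(folder)
```

Without the seen set, this walk would descend through a/b/loop/b/loop/... until the operating system refused to resolve the ever-longer path.
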

For completeness, Windows supports several other strange and rarely used types of links:

  • NTFS Junctions: A junction is basically a limited version of a symlink that can only point to folders on the local computer. It may occasionally have certain minor advantages over a symlink, but unless you know you need one, these are probably not worth concerning yourself with. The one time you might want to look into junctions is if you are on a computer where you don’t have administrator privileges and you’re still using a version of Windows before the Windows 10 Creators Update; in this case, you’ll have permission to create junctions but not symlinks, so you can get at least some of the functionality of symlinks by using junctions.

  • Folder Shortcuts: These are an undocumented type of link to a folder (as the name suggests) created by placing special target.lnk and desktop.ini files in an ordinary folder. When viewed in Windows Explorer, the folder instead displays the contents of the folder referenced by target.lnk. These were quite useful before the introduction of symlinks in Windows, but are seldom used nowadays except internally by Windows.

  • Shell objects: These act kind of like files and folders, but are created by putting things in the Windows Registry. Some built-in “folders” like “Documents” and “Administrative Tools” are actually shell objects.

You can safely skip this section if you are easily bored by technical details; however, it may help you understand the different types of links better, so it’s probably worth it!

In What is a link?, I said that a link is an identifier that isn’t interesting in itself but refers to something else that is. In computer science, the technique of using a link to refer to something else is called indirection. (Accessing the thing the link refers to is called following or dereferencing the link.) Indirection is a very powerful technique; just like a phone number lets you talk to someone who isn’t in the same room as you, indirection lets you work with an object that you don’t immediately have at hand. In fact, indirection is so critical to managing the complexity of a system that no useful technology could exist without it. Filesystems use multiple layers of indirection even before we start talking about links:

  1. You typically use software that represents the contents of the filesystem in a graphical manner, with icons of files and folders and information about them. Each icon refers to a file with a given path.
  2. That software identifies the path of the file you click on, which is a string of letters that refers to a series of folders and a filename (e.g., C:\Users\Soren\file.txt), and asks the filesystem to find the file this path refers to.
  3. The path, including the filename, refers to an amorphous “file” object deeper inside the filesystem.
  4. The “file” refers to a series of numbered blocks on the disk.
  5. The blocks themselves refer to a series of bits on the surface of a disk or stored in circuitry, which when physically read and interpreted appropriately comprise the actual data.

(Can you imagine if you didn’t have any indirection in your filesystem and had to keep track of the numbers of individual bits for every file you wanted to access yourself? – “Open the file at bits 57749390 to 57800042 on my hard drive, please.” – “Oh, I added sixteen characters to this file, so I need to make the file 128 bits longer…but there’s another file after this one already, so I’ll have to either delete that file or find another place for this file and then delete this one….”)

Enough with the computer science lesson, though; we came to talk about how links to files work, not software and filesystems! There are several different types of filesystem links, which are implemented in different ways, but all of them involve adding another layer of indirection or operating at an existing layer.

  • Windows shortcuts and Mac OS aliases operate during step 1 above: the software you use to browse the filesystem looks at the file you chose, sees it’s a shortcut, and passes the target of the shortcut to the filesystem rather than the path of the shortcut itself.
  • Symbolic links add an extra step after step 3: after it finds the file, the filesystem checks to see if the file is a symbolic link; if it is, the filesystem checks what path the symbolic link refers to and feeds that path back into step 2.
  • Hard links operate at step 3: multiple filenames refer to the same file object.
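
If it helps to see the layers side by side, here is a deliberately tiny toy model in Python. It is not how any real filesystem is implemented; ToyFilesystem, browser_open, and the .lnk handling are all invented for illustration. The names table plays the role of step 3, the symlink loop feeds a path back into step 2, and the shortcut is handled entirely by the “file browser” function at step 1.

```python
class ToyFilesystem:
    """A toy model: paths map to file objects, which hold the data."""

    def __init__(self):
        self.names = {}   # path -> file id (several paths may share one id)
        self.files = {}   # file id -> {"symlink": bool, "data": str}
        self._next_id = 1

    def create(self, path, data, symlink=False):
        file_id = self._next_id
        self._next_id += 1
        self.files[file_id] = {"symlink": symlink, "data": data}
        self.names[path] = file_id

    def hard_link(self, existing, new):
        # Hard link: a second name for the same file object.
        self.names[new] = self.names[existing]

    def symlink(self, target, link_path):
        # Symlink: a separate file whose data is the target's path.
        self.create(link_path, target, symlink=True)

    def unlink(self, path):
        file_id = self.names.pop(path)
        if file_id not in self.names.values():
            del self.files[file_id]  # last name gone: the file is deleted

    def read(self, path, max_hops=8):
        for _ in range(max_hops):
            obj = self.files[self.names[path]]  # path -> file object
            if not obj["symlink"]:
                return obj["data"]
            path = obj["data"]  # symlink: feed the stored path back in
        raise RuntimeError("too many levels of symbolic links")


def browser_open(fs, path):
    """A toy file browser that understands shortcut files; the filesystem does not."""
    data = fs.read(path)
    if path.endswith(".lnk"):
        return fs.read(data)  # the application itself redirects to the target
    return data


if __name__ == "__main__":
    fs = ToyFilesystem()
    fs.create("/docs/report.txt", "quarterly numbers")
    fs.hard_link("/docs/report.txt", "/desk/report.txt")
    fs.symlink("/docs/report.txt", "/desk/sym")
    fs.create("/desk/report.lnk", "/docs/report.txt")

    print(fs.read("/desk/sym"))                   # followed by the filesystem
    print(fs.read("/desk/report.lnk"))            # filesystem sees only a path as text
    print(browser_open(fs, "/desk/report.lnk"))   # followed by the browser

    fs.unlink("/docs/report.txt")
    print(fs.read("/desk/report.txt"))            # the hard link still works
```

After the unlink at the end, the symlink and the shortcut would both fail (they still hold the old path), while the hard link keeps working because it never depended on that path in the first place.
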

Now that we’ve covered what links are good for, what types of links we can use, and how they work, next week we’ll pick up with how to use them effectively. And if you were overwhelmed by all the information about the different types of links, don’t worry – we’ll also cover which types of links you should use in typical situations.