Filesystem Links: Practical Advice for Design


In the previous post in this series, we discussed what links are, why you would want to use them, and what kinds of links are available. This week we’ll look at how you can take advantage of links in your filesystem design, and we’ll examine some common pitfalls to avoid.

First of all, last week we talked extensively about four main types of links: Windows shortcuts, Mac OS aliases, symbolic links (symlinks), and hard links.

Here’s the million-dollar question we didn’t answer last week: which one of those should you actually use?

Choices to eliminate

Hard links have some intriguing features, like their inability to become broken and their similarity to actually putting a file in multiple places. These features make them excellent choices for solving certain very specific problems, such as automatically backing up different revisions of files. However, they have too many drawbacks to be a solid choice for general-purpose organization. They tend to make keeping track of an organizational structure more confusing, they can behave unexpectedly when you copy them, and you may run into things you want to hard link but can’t (folders, items on other disks). So unless you know you need them, hard links are usually not the right choice.
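To make those hard-link quirks concrete, here is a minimal Python sketch. The file names are hypothetical and it assumes a filesystem that supports hard links. It shows that both names refer to the same file, that an edit through one name shows up under the other, that a plain copy becomes an unrelated file, and that deleting one name leaves the data reachable under the other.

```python
import os
import shutil
import tempfile
from pathlib import Path


def demo_hard_link(workdir):
    """Create a file plus a hard link to it and report how the pair behaves."""
    workdir = Path(workdir)
    first_name = workdir / "notes.txt"          # hypothetical file names
    second_name = workdir / "notes-again.txt"
    first_name.write_text("draft 1\n")

    # os.link adds a second name for the same file data; neither is "the target".
    os.link(first_name, second_name)
    result = {
        "same_file": os.path.samefile(first_name, second_name),
        "name_count": first_name.stat().st_nlink,
    }

    # An edit made through one name is visible through the other.
    with open(second_name, "a") as handle:
        handle.write("draft 2\n")
    result["edit_shared"] = first_name.read_text() == "draft 1\ndraft 2\n"

    # A plain copy is a brand-new file, no longer tied to the other names.
    copy_name = workdir / "notes-copy.txt"
    shutil.copy(second_name, copy_name)
    result["copy_is_same_file"] = os.path.samefile(first_name, copy_name)

    # Removing one name does not break the other: the data stays reachable.
    first_name.unlink()
    result["survives_delete"] = second_name.read_text() == "draft 1\ndraft 2\n"
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for key, value in demo_hard_link(scratch).items():
            print(f"{key}: {value}")
```

The copy step is the one that tends to surprise people: after copying, you have two independent files where you may have expected three names for one.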

Next, eliminate from consideration the operating-system-specific shortcut for any operating system you aren’t actually using. If you’re using Windows, Mac OS aliases will be of no use to you at all even if you manage to create them somehow; similarly, if you’re using a Mac, Windows shortcuts will be useless. If you’re using Linux, both of these will be useless, leaving you with symlinks as the single clear choice.

Remaining choices

If you’re not using Linux, you have a choice to make: your operating system’s built-in shortcut type (hereafter referred to as an “OS shortcut” for brevity), or a symlink?

The best choice usually depends on what you’re using the link for. I offer recommendations under each suggested use case in the common use cases for links section, but here are some points to consider if you have a different use case:

  • OS shortcuts often create less trouble if you expect to move the files they point to, since your operating system offers a service to help re-locate the files. This is a bigger advantage on a Mac than on Windows, but it’s still an advantage on Windows. (However, if a set of files is moving frequently, you obviously don’t yet know how those files should be organized, so it may be better not to create a bunch of links to them in the first place. Even if the links don’t break, you may find you’ve created links that no longer make sense.)

  • OS shortcuts are your only option if you want to create a link on a device with a FAT32 filesystem, such as a flash drive or SD card.

  • Symlinks will be recognized by more programs than OS shortcuts. In particular, OS shortcuts are all but useless at the command line, whereas symlinks work great there too.

  • Symlinks will typically work with multiple operating systems. If you’re creating shortcuts on a network drive that will be accessed by computers running different operating systems, symlinks are a no-brainer.

  • Symlinks allow you to access the target under a different name, rather than taking you directly to the target.

Calling any use of links “dangerous” is probably something of an overstatement; it’s not as if your computer is going to blow up in your face if you use a link the wrong way. However, recall that in the second paragraph of the previous post, I pointed out that while links can solve some problems, they also add new problems of their own. If you start using links willy-nilly, the organizational problems are likely to catch up with you sooner or later and might be enough to turn you off using links ever again! Let’s look at some things you shouldn’t do with links.

Links are a way to augment the features of a filesystem that is already well-organized: to make it quicker to navigate through, avoid having to do searches, or make it possible to access something from two places when it’s intrinsically difficult to figure out which of those places is the right one.

If you start throwing in links all over the place because you have trouble finding which location your files are supposed to be in, but the reason you can’t find your files is that they’re poorly organized to begin with, you will cause nothing but pain. First of all, it will become even less clear where your files are supposed to be, since many of them are now located in multiple places. Then at some point you may want to reorganize your files, and since they aren’t organized at all to begin with, you won’t be able to keep track of what links might break when you move the targets. Even if you’re using Mac OS aliases and they never break when you move files, if you delete a file its links will still break, and then you’ll wonder where it went when you run into the link.
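If you suspect some links have already gone stale, a short script can list them for you. This is a rough Python sketch that only understands symlinks (it cannot inspect Windows shortcuts or Mac OS aliases), and the function name is my own invention for illustration.

```python
import os
import tempfile
from pathlib import Path


def find_broken_symlinks(root):
    """Return a sorted list of symlinks under root whose targets no longer exist."""
    broken = []
    # os.walk does not descend into symlinked folders unless asked to,
    # so a link that points back up the tree cannot cause an endless loop.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows the link,
            # so it is False when the target has been moved or deleted.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        scratch = Path(scratch)
        (scratch / "real.txt").write_text("hello\n")
        os.symlink("real.txt", scratch / "good-link")
        os.symlink("deleted.txt", scratch / "stale-link")
        for path in find_broken_symlinks(scratch):
            print("broken:", path)
```

Running something like this before a reorganization at least tells you which links were already broken, so you don’t blame the move for them afterward.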

While it’s valuable to think about what kinds of links you might want to use as you plan a new organizational system, you should create them only as part of that new system or on top of a system that’s already well-organized.

Creating links that point in both directions between related areas is a special case of “Don’t use links to substitute for good organization” that many people fall into when they first learn about links.

Here’s an example of the kind of hierarchy you badly want to avoid (-> /Me/xyz after a name in this diagram means the file or folder is a link to /Me/xyz):

/Me/
   |- Jobs
      |- Applications/
         |- Initech/
            |- Portfolio -> /Me/Portfolio/Initech/
         |- XYZ Corp/
            |- Code samples.txt -> /Me/Portfolio/Code samples.txt
            |- References.docx
            |- Resume.pdf
      |- Work Photos/
         |- Portraits/ -> /Me/Photos/Portraits/
         |- Workplaces/
            |- Initech/
            |- XYZ Corp/
   |- Photos/
      |- Portraits/
      |- Workplaces -> /Me/Jobs/Work Photos/Workplaces/
   |- Portfolio/
      |- Code samples.txt
      |- Work history.pdf
      |- XYZ Corp Resume.pdf -> /Me/Jobs/Applications/XYZ Corp/Resume.pdf

Here’s the problem: there are related areas (/Me/Portfolio and /Me/Jobs/Applications, for example), but there is no rule about which of the two locations holds the real item. Some links go from Portfolio to Applications and some go from Applications to Portfolio, so if you were to draw out a diagram, the arrows would point randomly in opposite directions. Since items are linked from one area to the other, you can probably still find things in this hierarchy. However, the hierarchy is very hard to reason about, which means changing the organization is going to be a real pain, and you may well end up having more trouble remembering where you put things than you would have if you hadn’t used links.

Making either Portfolio or Applications into a virtual view is one good way to avoid this, since it keeps you continually aware of which direction your arrows are pointing.
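If you want to check which direction your arrows actually point, one possible approach is to count symlinks by the top-level area they live in and the top-level area they point to. The Python sketch below does that under a few assumptions: it only sees symlinks, it treats each folder directly under the root as an “area”, and the function names are hypothetical.

```python
import os
import tempfile
from collections import Counter
from pathlib import Path


def area_of(path, root):
    """Name of the top-level folder under root that contains path."""
    first = os.path.relpath(path, root).split(os.sep)[0]
    return "(outside)" if first == os.pardir else first


def link_directions(root):
    """Count symlinks by (area holding the link, area holding its target)."""
    root = os.path.realpath(root)
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.realpath(path)  # where the link finally leads
                counts[(area_of(path, root), area_of(target, root))] += 1
    return counts


def two_way_pairs(counts):
    """Pairs of areas that have links pointing in both directions."""
    return sorted({tuple(sorted(pair)) for pair in counts
                   if pair[0] != pair[1] and (pair[1], pair[0]) in counts})


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        me = Path(scratch)
        (me / "Jobs" / "Applications").mkdir(parents=True)
        (me / "Portfolio").mkdir()
        (me / "Portfolio" / "Code samples.txt").write_text("samples\n")
        (me / "Jobs" / "Applications" / "Resume.pdf").write_text("resume\n")
        os.symlink(os.path.join("..", "..", "Portfolio", "Code samples.txt"),
                   me / "Jobs" / "Applications" / "Code samples.txt")
        os.symlink(os.path.join("..", "Jobs", "Applications", "Resume.pdf"),
                   me / "Portfolio" / "Resume.pdf")
        print(two_way_pairs(link_directions(me)))
```

Any pair this reports is a candidate for cleanup: pick one side to hold the real items and make the other side the view.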

Note: If you were using hard links, you wouldn’t have to worry about which location should be the target, since there is no “target” with a hard link – the file is just in two places. However, you’d still have to worry about making your hierarchy hard to reason about. In addition, the technical problems with hard links that make them generally unsuitable for these kinds of purposes have already been described in the first section of this article.

Links are a very useful tool. However, every link adds a little extra burden to your organizational system:

  • the link could break and have to be fixed (although this is unlikely if you’re using aliases)
  • every additional link to a file makes it harder to remember where the file really is
  • every additional file, link or not, makes your filesystem just a little bit more complex and thus harder to understand

So before you create a link, pause and ask yourself, “Is this link creating any value?” If your answer isn’t an obvious yes, stop and think about whether organizing your hierarchy better or coming up with some other solution would be more beneficial than creating a link.

Getting to folders quickly

The Depth Principle and the Single-Question Principle suggest that you should create lots of folders because it improves your organization. But at times this results in painfully long paths. For instance, I keep information about articles I might want to write on this blog in this folder:

/home/soren/cabinet/Me/Writing/Nonfiction/Essays/Blogs/cab/articles/proposed/

That’s a lot of folders to click through!

Fortunately, it’s pretty easy to save yourself from clicking through all those folders: just create a link from a more accessible location. I have a folder called current right in /home/soren that contains links to the folders I’m currently accessing most frequently. I have one called cab-writing linking to the cab folder, so all I really have to do is open that folder, then choose articles and proposed. Obviously, if I specifically used the proposed folder a whole lot, I could create a link directly to that folder.
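If you’d rather script this than click through a file browser, here is a minimal Python sketch of the idea. It uses symlinks because they are easy to create from the standard library; the function name is hypothetical, and the paths in the demo are shortened stand-ins built in a temporary folder.

```python
import os
import tempfile
from pathlib import Path


def add_quick_link(current_dir, name, target):
    """Create (or refresh) a symlink called name inside current_dir."""
    current_dir = Path(current_dir)
    current_dir.mkdir(parents=True, exist_ok=True)
    link = current_dir / name
    if link.is_symlink():
        link.unlink()  # replace an older link of the same name
    elif link.exists():
        # Never overwrite a real file or folder that happens to share the name.
        raise FileExistsError(f"{link} exists and is not a link")
    # target_is_directory only matters on Windows; elsewhere it is ignored.
    link.symlink_to(target, target_is_directory=True)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        home = Path(scratch)
        deep = home / "cabinet" / "Me" / "Writing" / "Blogs" / "cab"
        (deep / "articles" / "proposed").mkdir(parents=True)
        link = add_quick_link(home / "current", "cab-writing", deep)
        print(link, "->", os.readlink(link))
        print(sorted(p.name for p in (link / "articles").iterdir()))
```

Because the link is refreshed when it already exists, you can rerun the same call whenever the folder you’re working in most often changes.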

You can keep the links on your desktop, in a special folder like my current folder, or anywhere that’s useful to you. If you have a large project that has its own area in your filesystem, you might keep a folder called links in that area and fill it with links to other areas of your filesystem that are important when working on that project.

In a file browser like Windows Explorer or Finder, you can “pin” or “favorite” a folder by dragging and dropping it to the left pane. This makes it very easy to access that folder. If you have a ton of related folders, you may find it helpful to create a folder of links and then pin that folder to avoid cluttering up your pinned list.

Advanced tip: If you frequently use the command line on Linux or Mac OS, check out the tool z, which helps you jump to frequently accessed folders in much the same way that the pinned or recently used menu in Windows Explorer or Finder does.

OS shortcuts are the favorite for this use case, since you often want to be taken directly to the target folder rather than accessing it under a different name. If you do prefer the behavior of a different name, or if you want the link to work on multiple operating systems, symlinks are a good choice too.

Relocating program folders

Some programs insist on storing data in a very particular location. Let’s suppose you use a program called AwfulSoftware, which lets you work with a document scanner called the Whizbang. AwfulSoftware, for no good reason, refuses to scan your documents to any location except C:\Users\YourUsername\Documents\My Awfulness. However, you really want to keep your scans in a folder on a network drive, L:\Shared\DigitizationProject\Scans. You can close the program, delete the My Awfulness folder (after moving its contents to the new location, of course), and then create a symlink to the new location named My Awfulness. Now, when AwfulSoftware saves its files to My Awfulness, it will actually be putting them in the DigitizationProject\Scans folder without knowing the difference.

Note that in nearly every situation, you must use a symlink for this to work. OS shortcuts don’t fool programs in this way, and hard links can’t be created to folders.

While it’s more common to have this issue with a folder, you can do this for a single file as well.
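The move-then-link steps can be sketched in Python as follows. Treat this as an illustration under assumptions rather than a finished tool: the function name is hypothetical, the program must already be closed, and on Windows creating symlinks may require administrator rights or Developer Mode.

```python
import shutil
import tempfile
from pathlib import Path


def relocate_folder(old_location, new_location):
    """Move a folder's contents elsewhere and leave a symlink in its place."""
    old = Path(old_location)
    new = Path(new_location)
    if old.is_symlink():
        raise ValueError(f"{old} is already a link")
    new.mkdir(parents=True, exist_ok=True)
    for item in old.iterdir():
        destination = new / item.name
        if destination.exists():
            # Stop rather than silently merging or overwriting anything.
            raise FileExistsError(f"{destination} already exists")
        shutil.move(str(item), str(destination))
    old.rmdir()  # only succeeds because the folder is now empty
    old.symlink_to(new, target_is_directory=True)
    return old


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        base = Path(scratch)
        awful = base / "Documents" / "My Awfulness"
        awful.mkdir(parents=True)
        (awful / "scan-001.txt").write_text("page 1\n")
        scans = base / "Shared" / "DigitizationProject" / "Scans"
        relocate_folder(awful, scans)
        # The program keeps writing to the old path; the file lands in Scans.
        (awful / "scan-002.txt").write_text("page 2\n")
        print(sorted(p.name for p in scans.iterdir()))
```

The contents are moved before the original folder is removed, so an error partway through leaves your files in one place or the other rather than deleted.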

Virtual Views

Sometimes you want to be able to access a series of folders or files in multiple ways. One way you can accomplish this in a hierarchical filesystem is by organizing the files one way in a hierarchy, then creating one or more other folders of links to the same files, organized or named differently. I call these folders of links Virtual Views.

For creating multiple hierarchies

I maintain archives of all the files I created for my college classes (I refer to these surprisingly often). The files are organized chronologically and then by class, like this:

years/
   |- 2013-Fall
      |- intro-cs
      |- great-con
      |- psychology
      |- voice
   |- 2014-Fall
      |- great-con
      |- hardware-design
      |- linear-algebra
      |- latin
      |- voice
   |- 2014-Interim
   |- 2014-Spring
   |- 2015-Fall
   |- ...

This is a very straightforward, logical way of organizing the files. When I want to go back and look at what I did during a particular semester, it works great.

However, at other times, I want to find a file from a particular class, and I don’t remember off the top of my head which of the 12 terms in my college career I took the class in. So rather than scanning through all of the folders every time, I established a Virtual View called classes-vv, which makes the overall area look like this:

ClassFiles/
|- classes-vv/
   |- GCON113_Greeks-and-Hebrews -> ../years/2013-Fall/great-con/
   |- GCON115_Romans-and-Christians -> ../years/2014-Interim/
   |- MUSPF152_voice_sem1 -> ../years/2013-Fall/voice
   |- MUSPF152_voice_sem3 -> ../years/2014-Fall/voice
   |- ...
|- years/
   |- ... (see previous snippet)

By creating this view, I’ve eliminated the need to answer the question “What semester?” before “What class within this semester?”, when I don’t know the answer to the first question. Instead, I can just answer one question, “What class number and name?” Yet I retain the ability to browse by “What semester?” if I know the answer.

For creating subsets

Another handy way to use a Virtual View is to show a subset of files. For instance, you might have taken 500 pictures on your last vacation, but you’ve selected 50 of the best. You don’t want to delete the other 450, and you don’t really want to copy them either, since they’re the very same files, and if you edit one you want both of them to change. What you can do is link to the photos you want to select from a separate folder. My photo folders often end up looking like this:

My Interesting Event/
   |- edited/
      |- ...photos that I thought needed editing
         (before editing a photo, I copy it from raw/
          so as to preserve the original in case I screw up)
   |- raw/
      |- ...photos the way they came off the camera
   |- selections-vv/
      |- a.jpg -> ../raw/5.jpg
      |- b.jpg -> ../raw/23.jpg
      |- c.jpg -> ../edited/25.jpg
      |- d.jpg -> ../edited/37.jpg
      |- ...

Tip: When I’m working on some photos, I actually just copy them into the different folders, then when I’m done editing and selecting, I use a tool that deduplicates files by turning the extra copies into links. This saves me from having to think about the details of the linking until I’m done doing the actual work. I’ll discuss deduplication later in this series.

Implementation notes

Symlinks are generally the best kind of link for implementing Virtual Views, since the main idea is to access the same files under two different schemas and a symlink presents the folder under a new name rather than jumping you to the target folder. The ability to use relative paths (as in the examples above) can also make symlinks more resilient: if you decide to move the entire ClassFiles/ folder, all the links will continue to work, even if you copy the folder to a different computer, which is not the case with Mac OS aliases. However, if you need or want to use OS shortcuts, those work acceptably too.
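To see why relative paths help, here is a small Python sketch that creates a relative symlink and then moves the whole folder somewhere else. The helper name is hypothetical, and the folder names are borrowed from the class-files example purely for illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def relative_link(link_path, target_path):
    """Create a symlink whose stored target is relative to the link's folder."""
    link_path = Path(link_path)
    # Work out how to get from the link's folder to the target, e.g. ../years/...
    rel_target = os.path.relpath(target_path, start=link_path.parent)
    link_path.symlink_to(rel_target,
                         target_is_directory=os.path.isdir(target_path))
    return rel_target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        base = Path(scratch) / "ClassFiles"
        voice = base / "years" / "2013-Fall" / "voice"
        voice.mkdir(parents=True)
        (voice / "notes.txt").write_text("warm-ups\n")
        (base / "classes-vv").mkdir()
        link = base / "classes-vv" / "MUSPF152_voice_sem1"
        print("stored target:", relative_link(link, voice))

        # Move the entire ClassFiles folder; the link still resolves afterward
        # because it never mentioned where ClassFiles itself lives.
        moved = Path(scratch) / "archive" / "ClassFiles"
        moved.parent.mkdir()
        shutil.move(str(base), str(moved))
        moved_link = moved / "classes-vv" / "MUSPF152_voice_sem1"
        print("still works:", (moved_link / "notes.txt").exists())
```

An absolute target would have kept pointing at the old location after the move, which is exactly the breakage relative paths avoid.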

If the way you want to construct a Virtual View follows a clear pattern, you may be able to script the creation of the view and eliminate some manual work creating links. We’ll talk about that when we get to filesystem scripting later in this series.
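As a small preview of what such a script might look like, here is a hedged Python sketch that builds a classes-vv folder from a years/ hierarchy. The link naming pattern (class folder name, underscore, term name) is an assumption made for illustration; it is not the course-number naming shown earlier, which can’t be derived from the folder names alone.

```python
import os
import tempfile
from pathlib import Path


def build_class_view(base):
    """Link every years/<term>/<class> folder into classes-vv/<class>_<term>."""
    base = Path(base)
    view = base / "classes-vv"
    view.mkdir(exist_ok=True)
    created = []
    for term in sorted(p for p in (base / "years").iterdir() if p.is_dir()):
        for course in sorted(p for p in term.iterdir() if p.is_dir()):
            link = view / f"{course.name}_{term.name}"
            if link.is_symlink() or link.exists():
                continue  # leave existing links (or real items) alone
            # Relative target, so the whole area can be moved without breakage.
            target = Path("..") / "years" / term.name / course.name
            link.symlink_to(target, target_is_directory=True)
            created.append(link.name)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        base = Path(scratch) / "ClassFiles"
        for term, course in [("2013-Fall", "voice"), ("2013-Fall", "intro-cs"),
                             ("2014-Fall", "voice")]:
            (base / "years" / term / course).mkdir(parents=True)
        for name in build_class_view(base):
            print(name, "->", os.readlink(base / "classes-vv" / name))
```

Because existing links are skipped, the script can be rerun after each new term to add only the links that are missing.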

Warning: It’s generally best not to create Virtual Views to access folders that are still changing and evolving rapidly, because creating all the links for a Virtual View is fairly labor-intensive and having the links break if you want to reorganize things can be a real pain.