In the previous post, we focused on the mechanics of naming, things like what characters we should be using and how we can make files sort appropriately. This week, we’ll look at how we can use these tools to create personalized naming conventions that make files easier to find and process automatically.
What exactly is a naming convention?
Put simply, a naming convention is a rule about how things should be named. Here are some examples from everyday life:
- Many cities number or letter their streets in order, perhaps with First Street downtown and increasing numbers further towards the edge of the city. Others designate north-south roads “Avenues” and east-west roads “Streets” (or something similar). This way, if you’re reasonably familiar with the city, when you hear the number or name, you can deduce roughly where the street is and which direction it goes even if you’ve never been there before.
- When you walk into a large office or government building, you typically expect the floors to be numbered starting from 1 (perhaps with a ground floor, basement, or similar in there as well). You expect the rooms to be numbered sequentially as well, with the first digit or digits representing the floor number. So if you’re trying to get to room 525, you know to go to the fifth floor, look at the numbers on some rooms near you to see which way the numbers go, and walk towards 25.
- When looking at software available for download, you expect it will list a version number somewhere, and the largest version number will be the newest version.
Why would I want to use naming conventions?
Imagine that, instead of using version numbers, software developers assigned arbitrary names to each version. Maybe they call the first release “Tommy G,” the second release “Purple Sphinx,” the third release “zxwwb,” and so on. In order to determine if you have a newer version than someone else, you now have to go find a table of all the releases of the software and scan through it. And I bet you prefer navigating in an unfamiliar city that numbers its streets rather than giving them random names (although with the advent of GPS, this particular convention is admittedly less important than it once was).
Simply put, naming conventions reduce cognitive load – when the names are consistent, you have to know fewer details of your system and spend less time figuring out what your names mean. Further, it’s a lot easier to tell a computer what to do with files that follow a standard convention than with haphazardly named files. A good naming convention can make the difference between a script you can write in 10 minutes and a problem you can’t solve at all, because you can’t describe the transformation you want to a computer and changing all the files manually would take too long.
How do I create naming conventions?
You tell me! Seriously, there aren’t any rules. At a basic level, just look at your filesystem with fresh eyes, recognizing that you could benefit from naming conventions, and you may start to see some opportunities.
Once you have an idea, rename existing files and folders as necessary. (But if that’s a lot of files and folders, see the tip under “Standard word separators and case” below.) If the convention is simple enough, you can probably stop there; when you make additions or changes, just follow what’s there already. If the convention is more complicated, like the photo naming convention I lay out in the case study, you may want to create a text file or Word document explaining how it works and keep it in the folder that uses the naming convention. I call mine conventions.txt (creative, right?).
In the rest of this post, I’ll suggest some sample conventions to help get you thinking, starting with simple ideas and ending with a complex system for managing digital photos.
Using leading zeroes
Last week I mentioned that you can force numbers in filenames to sort correctly by placing leading zeroes in front of them. This suggests a naming convention when you’re creating numbered files. When you start adding the files, consider how many files you’re likely to end up with: fewer than 10? fewer than 100? fewer than 1000? Then create your very first file with enough leading zeroes to accommodate this number of files. For instance, if you think you might be scanning more than 999 but fewer than 9,999 pages to PNG files, give the first scanned page a four-digit number, with three leading zeroes in front of the 1. This way, you’ll remember to name the next ones in accordance with the convention and you’ll never have to worry about sorting.
Similarly, always using
for your files is a type of convention.
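As a rough illustration of the leading-zero convention, here is a minimal Python sketch. The function name `padded_name` and the bare-number filename style are assumptions made for the example, not something prescribed above; the only idea taken from the text is that the number of digits is chosen up front from the expected number of files.

```python
def padded_name(index: int, expected_max: int, ext: str = "png") -> str:
    """Return a zero-padded filename wide enough for expected_max files.

    Hypothetical helper: the width is the number of digits in expected_max,
    so every name has the same length and sorts correctly as text.
    """
    width = len(str(expected_max))
    return f"{index:0{width}d}.{ext}"


# Planning for up to 9,999 scanned pages gives four-digit names.
names = [padded_name(i, expected_max=9999) for i in (1, 2, 10, 250)]
print(names)

# Without padding, plain text sorting puts page 10 before page 2.
print(sorted(["10.png", "2.png", "1.png"]))
```

Deciding the width once, before the first file exists, is the whole convention; the code only makes it hard to forget.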
For some domains, you may want to develop domain-wide naming conventions. For instance, many media player programs expect music to be stored in a specific folder structure: each artist has a folder, which contains a folder for each album by that artist, which contains the songs on that album. You might want to name the files consistently as well – say, the track number followed by a hyphen and the name of the song.
Check out the case study for another example.
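The music layout described above (artist folder, album folder, then track number, hyphen, and song title) can be sketched in a few lines of Python. The function name `track_path`, the two-digit track number, and the `mp3` extension are assumptions for illustration; your media player may expect something slightly different.

```python
from pathlib import PurePosixPath


def track_path(artist: str, album: str, track_no: int, title: str,
               ext: str = "mp3") -> PurePosixPath:
    """Build artist/album/NN-title.ext for one song (hypothetical helper)."""
    # Two digits is an assumption: enough for albums of up to 99 tracks.
    filename = f"{track_no:02d}-{title}.{ext}"
    return PurePosixPath(artist) / album / filename


print(track_path("Some Artist", "Some Album", 3, "Some Song"))
```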
Here’s a personal convention I’ve never seen discussed anywhere else. Remember when, back in the Separation Principle, I suggested that it’s best to avoid mixing files and folders within a single folder? To help yourself remember that a particular folder should contain only other folders, you can name it differently than folders that contain files. For example:
- lowercase-hyphenated-names for folders that contain files
- PascalCaseNames for folders that contain only other folders (I call such folders “Hierarchy Directories”, “directory” being a more technical term for “folder”).
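One benefit of encoding this rule in the folder name is that a script can check it. The following is a minimal Python sketch, assuming that any folder whose name looks like PascalCase is meant to be a hierarchy directory; the regular expression and the function name `misplaced_files` are illustrative choices, not part of the convention itself.

```python
import re
from pathlib import Path

# Assumption: "PascalCase" means one or more words, each starting with a
# capital letter, with no hyphens, underscores, or spaces.
PASCAL_CASE = re.compile(r"(?:[A-Z][a-z0-9]*)+")


def misplaced_files(root: Path) -> list[Path]:
    """Return files that sit directly inside a PascalCase folder under root."""
    problems = []
    for folder in [root, *root.rglob("*")]:
        if folder.is_dir() and PASCAL_CASE.fullmatch(folder.name):
            # A hierarchy directory should contain only other folders.
            problems.extend(p for p in folder.iterdir() if p.is_file())
    return sorted(problems)


if __name__ == "__main__":
    for path in misplaced_files(Path(".")):
        print("file inside a hierarchy directory:", path)
```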
Standard word separators and case
Do you like consistency and neatness? You can get a surprising amount of mileage out of making sure all your files use the same word separators and capitalization rules. This one can apply across all your files. If you like the lowercase-hyphenated style, for instance, make it a personal convention that whenever you create a new file, you name it this way. If you like using bACKWARDS cASE wITH tWO sPACES between each word, well, I wouldn’t recommend that style, but it’s still a lot neater-looking than inconsistent style!
Tip: If you have a lot of filenames to clean up, rather than spend a weekend renaming all your existing files, start by following the convention on new files, then rename files that violate the convention the first time you reorganize or otherwise touch those files. It may take a year or two to get through all of them, but you’ll be done before you know it and you won’t notice the effort nearly as much.
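If you settle on the lowercase-hyphenated style, a short script can tell you which names still violate it and suggest a replacement when you get around to touching them. This is a minimal Python sketch under one particular reading of the style (lowercase letters and digits, single hyphens between words, an optional extension); the names `follows_convention` and `suggest_name` are made up for the example.

```python
import re

# Assumed definition of the style: words of lowercase letters/digits joined
# by single hyphens, optionally followed by one extension.
LOWER_HYPHENATED = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+)?")


def follows_convention(name: str) -> bool:
    """True if the whole filename matches the lowercase-hyphenated style."""
    return LOWER_HYPHENATED.fullmatch(name) is not None


def suggest_name(name: str) -> str:
    """Propose a lowercase-hyphenated version of a filename."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = name, ""
    # Collapse every run of other characters (spaces, underscores, ...) to one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return f"{slug}.{ext.lower()}" if ext else slug


for name in ["Tax Return  2017.PDF", "meeting-notes.txt"]:
    if not follows_convention(name):
        print(f"{name!r} -> {suggest_name(name)!r}")
```

The script only suggests names; actually renaming is left to you, which fits the rename-as-you-go approach in the tip.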
Case study: my photos
To consider a much more complicated naming convention as an example of how far you can go, I put my digital photos in a hierarchy that looks something like this:
Documentation/
|- 2017-05-12_owatonna-apartment-condition
Events/
|- 2017
   |- 2017-08-13_sorens-new-apartment
      |- ...photos
   |- 2017-12-16_christmas-party
Items/
|- 2013-05-30_writing-desk
As you can see, the Events category is slightly different from the others because it contains subfolders for each year, but in general we have a top-level category folder and then a folder underneath for each picture-taking session or event, which has a date and a name. The date and the name are separated by an underscore, while the words within each part are separated by a hyphen, as I recommended in the previous post.
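Because each session folder name is a date and a name joined by a single underscore, a script can split it apart without guessing. Here is a minimal Python sketch; the `Session` type and the function name `parse_session_folder` are assumptions for illustration.

```python
from datetime import date
from typing import NamedTuple


class Session(NamedTuple):
    day: date
    name: str


def parse_session_folder(folder_name: str) -> Session:
    """Split 'YYYY-MM-DD_some-name' into a date and a name."""
    # The underscore separates the two parts; hyphens only appear inside a part.
    date_part, _, name_part = folder_name.partition("_")
    return Session(date.fromisoformat(date_part), name_part)


print(parse_session_folder("2017-08-13_sorens-new-apartment"))
```

Since the date comes first and is written year-month-day, sorting the folder names as plain text also sorts them chronologically.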
The names of the files themselves follow a specific format as well. Here’s an example name:
- d indicates this photo is identified by a date and time (images like screenshots that don’t have any date metadata associated with them get an i for Index and a sequential number instead).
- 20170813 is the date the picture was taken, extracted from the EXIF info that most digital cameras include in the picture file.
- 194929 is the time the picture was taken, down to the second.
- We don’t see a sequence number in this picture; if it were present, it would look something like -4 after the date. If the same camera took more than one picture in a single second and they all went in the same folder, we would need a sequence number to distinguish them.
- ilk is a code identifying the camera the picture was taken with (in this case, my iPhone 5’s built-in camera). Again, this can be inferred from the EXIF info included by the camera.
- evt indicates this picture belongs to the Events category (see the hierarchy above).
- sorens-new-apartment identifies the event folder this picture belongs to.
- o indicates this photo is original and unmodified, exactly as it came off the camera. Edited pictures get e instead, and pictures downloaded from somewhere else get their own flag.
That’s a lot of information I (or a computer) can get about the photo just from looking at the filename. Further, I have enough information in the name that I could dump all my pictures in one folder and they would still have unique names which sort in chronological order – unlike the default names cameras assign, where trying to merge photos taken on different cameras produces an enormous mess.
Sure, I can get all this information from the EXIF data, but what’s easier, opening a photo viewer and clicking into the properties screen, or looking at the filename, which absolutely every program displays prominently? So I include the pieces of information I am personally most likely to want to know right in my filenames to reduce the number of times I have to dive into the properties. You might care about a different subset of information about your photos. That’s the great thing – since filenames are not specific to any program or any program’s idea of how things should be organized, you’re free to use them to organize your files in whatever way works best for you, not whatever way works best for some software company.
Note: Isn’t it a lot of effort to name all these files with such a complicated scheme? Nope! Except for the flags at the end, I never actually enter or change any of this information myself. A script can extract all the information from the files themselves and rename them accordingly.
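To give a feel for what the naming step of such a script might look like, here is a minimal Python sketch. It is not my actual script: it assumes the date, camera code, category, and event have already been extracted, and the underscore separators, the `.jpg` extension, and the function name `build_photo_name` are all assumptions made for the example.

```python
from datetime import datetime
from typing import Optional


def build_photo_name(taken: datetime, camera: str, category: str, event: str,
                     flag: str = "o", sequence: Optional[int] = None,
                     ext: str = "jpg") -> str:
    """Assemble a filename from the fields described in the case study.

    Assumption: fields are joined with underscores; only the field order
    and the '-N' sequence suffix are taken from the description.
    """
    stamp = f"d{taken:%Y%m%d}_{taken:%H%M%S}"
    if sequence is not None:
        # Only needed when one camera took several pictures in the same second.
        stamp += f"-{sequence}"
    return "_".join([stamp, camera, category, event, flag]) + f".{ext}"


print(build_photo_name(datetime(2017, 8, 13, 19, 49, 29),
                       "ilk", "evt", "sorens-new-apartment"))
```

Reading the date and camera out of the EXIF data would need an image library on top of this, which is left out of the sketch.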