It’s easy to laugh at programmers when they claim one of the hardest problems is naming things, especially in such an obviously tongue-in-cheek epigram. But while variable names in code might not matter to the computer, good names can absolutely make the difference between easy comprehension and complete bewilderment when you return to your old code months later.
Filenames have exactly the same effect. Have consistently good file and folder names in combination with a well-designed hierarchy and finding things is a cinch. Have bad file and folder names and the best you can do is hope to stumble on the right file by scanning through scores of them or using a search function which may or may not look in the right place. The principles in this post and the following post will assist you in coming up with names that help you rather than hinder you.
In this post, we’ll focus on mechanics: what characters and phrases you should use in filenames and how you can manipulate them to get useful effects. In the next post, we’ll look at developing high-level naming conventions.
First, let’s talk about individual characters. Most filesystems define some special characters you cannot use in filenames. In addition, you should not use certain other characters in filenames under any circumstances. Finally, you may prefer not to use some characters. Let’s look at these in turn.
- Linux fundamentally cannot handle file or folder names containing a forward slash (/), as forward slashes are reserved for separating folder names in paths.
- HFS+, the standard Mac OS filesystem, cannot handle file or folder names containing a colon (:), as colons are reserved for separating folder names in paths. (Mac OS typically displays paths with a slash separating folders, but underneath they’re actually stored as colons.)
- Windows cannot handle any of the following characters: < > : " / \ | ? *. In addition, it cannot handle unprintable characters (more on that in a minute).
Under the hood:
Windows also can’t handle files with certain reserved names (such as CON, PRN, AUX, NUL, COM1, and LPT1), or with any of these names in a different case, or with any of these names followed by a filename extension. These are names of hardware devices which pretend to be files. In the DOS days, one would use these special files to do fancy things like getting a list of all files in the current folder and sending it directly to the printer, or running copy con file.txt (write whatever is typed on the keyboard to a new file called file.txt). It’s unlikely that you’ll ever want to give a file one of these names in particular, but this is too wonderful a piece of obscure Windows trivia for me to gloss over it entirely!
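If you ever want a script to catch these names before a file reaches a Windows machine, a check might look roughly like the sketch below. The list used here (CON, PRN, AUX, NUL, COM1 through COM9, LPT1 through LPT9) is the commonly documented set of reserved device names, and the function name is my own invention for illustration.

```python
# Commonly documented Windows device names: CON, PRN, AUX, NUL,
# COM1-COM9, and LPT1-LPT9.
RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"}
RESERVED_NAMES |= {f"COM{i}" for i in range(1, 10)}
RESERVED_NAMES |= {f"LPT{i}" for i in range(1, 10)}


def is_reserved_windows_name(filename: str) -> bool:
    """Return True if the bare filename collides with a reserved device name."""
    # Only the part before the first dot matters, so "nul.txt" counts as "nul".
    stem = filename.split(".", 1)[0]
    # The comparison is case-insensitive: "con", "Con", and "CON" all collide.
    return stem.upper() in RESERVED_NAMES


if __name__ == "__main__":
    for candidate in ["con.txt", "Aux", "console.txt", "COM10"]:
        print(candidate, is_reserved_windows_name(candidate))
```

Note that a name like console.txt is fine: only an exact match on the part before the extension is a problem.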
Windows forbids many of these horrible offenses against reason, while Mac OS and Linux do not. No matter what operating system you’re using, including any of these characters in a filename is a dreadful idea. You will probably never try to use one of these characters, but it’s good to be aware they’re bad ideas in case you’re tempted to try:
- Whitespace other than normal spaces (tabs, carriage returns, or their less-well-known siblings, vertical tabs and form feeds). Filenames containing these characters can actually cause security vulnerabilities!
- Untrimmed whitespace – that is, spaces at the very beginning or end of a filename. Most programs remove this space before saving a file, so it’s difficult to get a filename with untrimmed whitespace, but if you do end up with one, programs that don’t handle this whitespace correctly may end up crashing or encountering another security vulnerability.
- Unprintable characters (with ASCII values 0-31). These include backspaces, escape characters, and so on. Like untrimmed whitespace, it’s difficult to get a filename containing these, but if you end up with one, it will be difficult to identify or correctly print the name of the file. Unprintable characters in filenames can also cause security vulnerabilities (gee, are you seeing a theme here?).
- A hyphen (-) at the very beginning of a filename: Certain programs can end up interpreting this file as an option rather than a filename. For instance, in Linux or Mac OS, if you try to delete all the files in a folder at the command line and there’s a file called -rf in the folder, it’s possible to end up deleting the folders in the same location which you didn’t select! Don’t expose yourself to bugs in other people’s programs.
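The checks in this list are simple enough to automate. Here is a minimal sketch of what that could look like; the function name and the wording of the messages are made up for this example, and it only looks at a bare filename, not a full path.

```python
def filename_problems(name: str) -> list:
    """Return a list of human-readable problems found in a bare filename."""
    problems = []
    # ASCII 0-31 covers the unprintable characters as well as tabs,
    # carriage returns, line feeds, vertical tabs, and form feeds.
    if any(ord(ch) < 32 for ch in name):
        problems.append("contains a control character (ASCII 0-31)")
    # Untrimmed whitespace at the very beginning or end of the name.
    if name != name.strip():
        problems.append("starts or ends with whitespace")
    # A leading hyphen can be mistaken for a command-line option.
    if name.startswith("-"):
        problems.append("starts with a hyphen")
    return problems


if __name__ == "__main__":
    for candidate in ["report.txt", "-rf", " notes.txt", "a\tb.txt"]:
        print(repr(candidate), filename_problems(candidate))
```

An empty list means the name passed all three checks, which is not the same as saying it is a good name.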
So we can’t use certain characters in filenames. Is that a huge burden?
Actually, no – in most cases it’s best to use an even smaller set of characters. The simpler your filenames are, the easier it is to be consistent and the easier it is to write scripts that work with filenames. The best plan depends on how strict you want to be with yourself and how much you care about scriptability. Here are a couple of suggested plans, with the least restrictive first:
- Don’t use any characters forbidden by Windows – even if you’re not using Windows. Why? At some point in your life, you’ll probably want to open one of your files on Windows or send a file to someone who’s using Windows. It’s easy to get into trouble – or get someone else into trouble without their consent – if your filenames use characters Windows forbids. A lot of these characters can also get confusing, particularly in scripts.
- Don’t use any special characters except the hyphen (-), the period (.), and the underscore (_). That is, use just whatever letters your language uses, numbers, spaces, and the three characters above. This makes your files a lot easier to work with on the command line or in scripts; it makes sure your files are compatible with any operating system or program you might toss them at, while eliminating the need to memorize a string of funny characters that Windows doesn’t support; and it makes your filenames clean and consistent.
- In addition to the previous plan, don’t use any spaces. Some Windows and Mac users will probably shout at me, “What!? How could I not use spaces in my filenames!?” But most Linux users have been customarily avoiding spaces in their filenames since the beginning of time and have been getting along just fine! I’ll admit that eliminating spaces is almost entirely good for computers rather than for humans. That said, when you design a system to work well for the computer, it often makes life easier for humans as well because the system ends up working better.
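To make the three plans concrete, here is one way they could be expressed as a check in Python. Treat it as a sketch under assumptions: the function name and the plan numbers are mine, and I am assuming the three permitted special characters in the second plan are the hyphen, the period, and the underscore.

```python
# Characters Windows refuses in filenames.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')
# Assumption: the three special characters allowed by the stricter plans.
ALLOWED_SPECIAL = "-._"


def allowed_by_plan(name: str, plan: int) -> bool:
    """Check a bare filename against plan 1 (least strict) to 3 (most strict)."""
    if not name:
        return False
    if plan == 1:
        # Nothing Windows forbids, including unprintable characters.
        return not any(ch in WINDOWS_FORBIDDEN or ord(ch) < 32 for ch in name)
    if plan == 2:
        # str.isalnum() accepts letters and digits from any language.
        return all(ch.isalnum() or ch in ALLOWED_SPECIAL + " " for ch in name)
    if plan == 3:
        # Same as plan 2, but spaces are no longer allowed.
        return all(ch.isalnum() or ch in ALLOWED_SPECIAL for ch in name)
    raise ValueError("plan must be 1, 2, or 3")


if __name__ == "__main__":
    for candidate in ["what?.txt", "my file (v2).txt", "my file.txt", "my-file_v2.txt"]:
        print(candidate, [allowed_by_plan(candidate, p) for p in (1, 2, 3)])
```

Each plan accepts a subset of what the previous one accepts, so a name that passes plan 3 passes all of them.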
Under the hood: What’s with the hate for spaces?
The shell languages that underlie operating systems separate parameters by spaces. For instance, the Linux/Mac OS shell command to move a file from the source location to the destination location is
mv source destination
What happens if we have spaces in the filenames themselves? We have to do extra work by quoting or escaping the names and give ourselves more opportunities to make mistakes:
mv "source file" "destination file"
mv source\ file destination\ file
If we don’t do this, the shell will consider all four words separate parameters and try to move the three files source, file, and destination to the destination location file. While this requirement is most annoying when you’re typing shell commands yourself or writing scripts that work with them, it’s not uncommon for even popular programs used by millions of people to contain bugs related to handling spaces in filenames.
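You can watch this word-splitting happen without touching any files. Python’s standard shlex module splits a command line roughly the way a POSIX shell would, so the sketch below simply feeds it the commands from this section and looks at the pieces that come out.

```python
import shlex

# Without quoting, the four filename words become four separate parameters.
unquoted = shlex.split("mv source file destination file")

# With quotes or backslash escapes, each filename stays in one piece.
quoted = shlex.split('mv "source file" "destination file"')
escaped = shlex.split(r"mv source\ file destination\ file")

# When a script builds a command line, shlex.quote adds the quoting for you.
safe_argument = shlex.quote("source file")

if __name__ == "__main__":
    print(unquoted)       # ['mv', 'source', 'file', 'destination', 'file']
    print(quoted)         # ['mv', 'source file', 'destination file']
    print(escaped)        # ['mv', 'source file', 'destination file']
    print(safe_argument)  # 'source file'
```

This only imitates the shell’s splitting rules; it is not a substitute for testing a real script against filenames with spaces.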
Now we know what characters we should use. But we probably care most about how we should string them together into meaningful words and phrases.
Separating words in filenames
Let’s suppose that you’ve decided not to use spaces in your filenames (which I generally recommend). How do you separate words? You have three main options:
- PascalCase: This convention takes its name from an early programming language that frequently used this style for variable names. You capitalize the first letter of each word and leave the remainder lowercase. (The related camelCase, named after the hump created in the middle, is like PascalCase but leaves the first letter of the first word lowercase.)
- hyphen-separation: Use a hyphen between each word. This method is usually paired with leaving all the letters lowercase, but it doesn’t have to be.
- underscore_separation: Use an underscore between each word. Again, the name is usually written entirely in lowercase, but it doesn’t have to be.
For the most part, which method to use is a matter of personal preference. Consistency is much more important than the convention you choose.
Tip: Most veteran Linux users prefer hyphens to underscores, for the simple reason that hyphens can be typed without pressing the shift key! This may sound trivial and silly, but most people type a lot of filenames in their lifetimes, so I think it’s worth considering.
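If you inherit files that mix these conventions, converting between them is mostly mechanical. The following is a rough sketch with helper names of my own choosing; it works on the name without its extension, and it does not try to handle runs of capitals such as acronyms.

```python
import re


def split_words(name: str) -> list:
    """Split a PascalCase, camelCase, hyphenated, or underscored name into lowercase words."""
    # Put a space wherever a lowercase letter or digit is followed by a capital.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Hyphens, underscores, and whitespace all count as word separators.
    return [word.lower() for word in re.split(r"[-_\s]+", spaced) if word]


def to_pascal_case(name: str) -> str:
    return "".join(word.capitalize() for word in split_words(name))


def to_hyphens(name: str) -> str:
    return "-".join(split_words(name))


def to_underscores(name: str) -> str:
    return "_".join(split_words(name))


if __name__ == "__main__":
    print(to_hyphens("QuarterlySalesReport"))
    print(to_pascal_case("quarterly-sales-report"))
    print(to_underscores("quarterly sales report"))
```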
Separating parts of a filename
Sometimes you want a filename to contain multiple pieces of information. For instance, you might want the names of your digital photos to include both the date they were taken and a sequential number (e.g., the 61st photo taken on May 26, 2016). It’s helpful to have a standard way of separating the information; once again, the separator is particularly helpful for scripts, but it can make filenames easier to read for you, too.
Personally, I find the best method is to use hyphens to separate words and underscores to separate sections, so the date and the sequential number in our photo’s filename would be joined by an underscore. This does a good job of both visually and mechanically separating the components.
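Here is what “mechanically separating” can buy you in a script. The exact layout below (a date written with hyphens, an underscore, then a three-digit sequence number) is an assumption for the sake of the example rather than a required format, and both function names are hypothetical.

```python
from datetime import date


def photo_name(taken: date, sequence: int, ext: str = "jpg") -> str:
    """Build a name with hyphens inside the date and an underscore between sections."""
    return f"{taken.isoformat()}_{sequence:03d}.{ext}"


def parse_photo_name(filename: str):
    """Recover the date, sequence number, and extension from such a name."""
    stem, _, ext = filename.rpartition(".")
    # The underscore is the section separator, so one split is all we need.
    date_part, sequence_part = stem.split("_", 1)
    return date.fromisoformat(date_part), int(sequence_part), ext


if __name__ == "__main__":
    name = photo_name(date(2016, 5, 26), 61)
    print(name)
    print(parse_photo_name(name))
```

Because the underscore never appears inside a section, splitting on it is unambiguous, which is the whole point of reserving it.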
Designing for filename completion
In Windows Explorer or Finder, you can begin typing the name of a file or folder in a list to jump to it (try it!). Similarly, in most command-line interfaces and programming tools, you can begin typing a filename and press Tab or Ctrl-Space to automatically complete the rest of the name. These features introduce a couple of additional considerations for naming, more about efficiency than about organization. These tips particularly apply to the names of your main high-level folders, which you’ll end up navigating through frequently.
- Avoid names which begin with the same letters. Ideally, when you have a limited number of items, have every item begin with a different letter. For instance, rather than keeping compositions next to another folder whose name starts with the same letters, you might prefer to use essays. This way you only have to type one letter rather than up to seven to unambiguously identify the item you want.
- Particularly avoid names which form part of other names. If you have two folders named fine and finessed, for instance, it’s easy to type fi, have the computer select fine, see a folder was selected, and immediately press Enter out of habit thinking you got finessed, when you actually got stopped at fine and are now in the wrong folder. Part-of-another-name conflicts occur rarely but can be seriously obnoxious, particularly if you don’t recognize why you keep landing in the wrong folder.
- If you’re not worried about sort order or this tip would lead to a good sort order anyway, consider putting the part of the filename that’s most unique first.
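If you’re curious how completion-friendly an existing set of folder names is, you can measure it. This sketch (the function name is mine, and the folder names in the demo are made-up examples) reports how many leading characters you would have to type before each name becomes unambiguous, ignoring case.

```python
def keystrokes_needed(names):
    """Map each lowercased name to the number of leading characters that
    single it out, or None if it is entirely a prefix of another name."""
    lowered = [name.lower() for name in names]
    result = {}
    for name in lowered:
        others = [other for other in lowered if other != name]
        needed = None
        for length in range(1, len(name) + 1):
            prefix = name[:length]
            # Stop at the first prefix that no other name shares.
            if not any(other.startswith(prefix) for other in others):
                needed = length
                break
        result[name] = needed
    return result


if __name__ == "__main__":
    print(keystrokes_needed(["fine", "finessed", "essays"]))
    print(keystrokes_needed(["composers", "compositions"]))
```

A None in the output marks the part-of-another-name problem: no amount of typing singles that name out, so completion will always stop there first.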
Nearly all programs sort filenames alphabetically. This suggests the following tricks.
Forcing filenames to the top
This trick isn’t guaranteed to work everywhere because programs can choose how to sort special characters, but it has a high success rate. If you want to put a particular file or folder (or a small group of files or folders) at the top of a long list, start its name with an underscore (_). If the underscore doesn’t work, you can also try an exclamation point (!). You may have to hit refresh to re-sort the folder and confirm that it worked.
If you want to put a file at the bottom, try using a tilde (~).
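To see why the results depend on the program, here are two simple sort rules applied to the same made-up names. Neither is exactly what a file manager does (those usually apply their own, locale-aware rules), so take this as an illustration of how the rule changes the outcome rather than a prediction for your system.

```python
names = ["notes.txt", "_inbox", "Archive", "~old", "!urgent", "2016-report.txt"]

# Plain character-code order: "!" comes first and "~" last, but the
# underscore lands after the capital letters.
by_code_point = sorted(names)

# Case-insensitive order: the underscore now sorts ahead of every letter.
ignoring_case = sorted(names, key=str.casefold)

if __name__ == "__main__":
    print(by_code_point)
    print(ignoring_case)
```

In both orderings the exclamation point goes to the top and the tilde to the bottom, while the underscore moves around, which matches the advice to try the exclamation point if the underscore fails.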
Ever tried to find something in a folder where the contents were named with dates in the format DAY-MONTH-YEAR, or, even worse, MONTH-DAY-YEAR?
5-8-2011 7-1-2015 9-15-2013 11-2-2012 12-3-2018
Yuck! That’s not sorted at all! Fortunately, the international date format standard, ISO 8601, comes to the rescue: if you write the date in the format YEAR-MONTH-DAY, alphabetical order and chronological order are identical. Just make sure you include the zero if the month or day is less than 10:
2011-05-08 2012-11-02 2013-09-15 2015-07-01 2018-12-03
Now that’s better. I’ve gotten so used to YMD format that I even date my paper notes using it. As xkcd says, YMD is the correct way.
Tip: If you need times as well, add them after the date in 24-hour format, where HOUR goes from 00 to 23 and MINUTE from 00 to 59.
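If you already have a pile of names in MONTH-DAY-YEAR form, converting them is a one-liner per name. The sketch below uses the example dates from above and a helper name I made up; it relies only on Python’s standard datetime module.

```python
from datetime import datetime


def to_iso_date(month_day_year: str) -> str:
    """Convert a date such as 5-8-2011 (MONTH-DAY-YEAR) to 2011-05-08."""
    parsed = datetime.strptime(month_day_year, "%m-%d-%Y")
    # strftime adds the leading zeroes that make text sorting work.
    return parsed.strftime("%Y-%m-%d")


old_style = ["5-8-2011", "7-1-2015", "9-15-2013", "11-2-2012", "12-3-2018"]

# Sorted as plain text, the old names come out in no useful order...
old_sorted = sorted(old_style)
# ...while the converted names sort chronologically.
iso_sorted = sorted(to_iso_date(name) for name in old_style)

if __name__ == "__main__":
    print(old_sorted)
    print(iso_sorted)
```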
Getting numbers to sort correctly
How many times have you seen a list like this on a computer?
10.jpg 11.jpg 12.jpg 19.jpg 2.jpg 21.jpg 254.jpg 26.jpg 3.jpg
Obnoxious, but there’s an easy fix: left-pad the numbers with zeroes so every number is the same length. Here’s the result:
002.jpg 003.jpg 010.jpg 011.jpg 012.jpg 019.jpg 021.jpg 026.jpg 254.jpg
If you end up with a bunch of files that need leading zeroes added and there are enough that renaming them all would be cumbersome, you can change them with a script. Here’s an example in PowerShell: it will add zeroes to the beginning of filenames in the current folder that start with a number until the number is 3 digits wide (or whatever width you set).
Warning: If you want to use this snippet, please back up the folder before running the code. I have tested it on my computer, but there is no warranty!
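For readers who would rather not use PowerShell, the same idea can be sketched in Python. This is my own illustration under assumptions, not a port of any particular script: the function names are hypothetical, it only touches regular files whose names start with digits, and it refuses to overwrite an existing file. The same warning applies: back up the folder first.

```python
import re
from pathlib import Path


def padded_name(name: str, width: int = 3) -> str:
    """Left-pad the number at the start of a filename with zeroes."""
    match = re.match(r"\d+", name)
    if match is None:
        return name  # the name does not start with a number
    digits = match.group()
    return digits.zfill(width) + name[len(digits):]


def plan_renames(folder, width: int = 3):
    """List (old_path, new_path) pairs without changing anything on disk."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        new_name = padded_name(path.name, width)
        if path.is_file() and new_name != path.name:
            plan.append((path, path.with_name(new_name)))
    return plan


def apply_renames(plan):
    """Carry out the renames, stopping rather than overwriting a file."""
    for old_path, new_path in plan:
        if new_path.exists():
            raise FileExistsError(f"{new_path} already exists")
        old_path.rename(new_path)


if __name__ == "__main__":
    # Dry run: show what would change in the current folder.
    for old_path, new_path in plan_renames("."):
        print(old_path.name, "->", new_path.name)
```

Running the file as written only prints the planned changes; nothing is renamed until you pass the plan to apply_renames yourself.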
It’s best to avoid sorting files manually (by adding ordered numbers or letters at the beginning of the names). It’s labor-intensive to adjust the sort order when you change the files, and it’s often not as helpful as you might hope. However, if you have a specific requirement that your files or folders be listed in a particular order, here’s a trick that can help reduce the amount of effort required for changes: begin by numbering by tens, being sure to use leading zeroes as discussed in the previous section, “Getting numbers to sort correctly.” Then if you need to add a new file between existing files, you can simply use a number between the existing numbers.
Here’s an example of how that would work out:
010 First File.docx 020 Second File.docx 025 Inserted File.docx 030 Third File.docx 040 Fourth File.docx
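The arithmetic behind this trick is just “pick a free number between the neighbors.” A tiny sketch, with function names that are mine and a three-digit width assumed to match the example:

```python
def number_between(before: int, after: int) -> int:
    """Pick a number for a file inserted between two numbered neighbors."""
    if after - before < 2:
        # The gap is used up; at this point the files need renumbering.
        raise ValueError("no free number between the neighbors")
    return (before + after) // 2


def numbered_name(number: int, title: str, width: int = 3) -> str:
    """Prefix a title with a zero-padded number, as in the listing above."""
    return f"{number:0{width}d} {title}"


if __name__ == "__main__":
    slot = number_between(20, 30)
    print(numbered_name(slot, "Inserted File.docx"))
```

Numbering by tens leaves room for a few insertions in each gap before the ValueError case comes up and a renumbering pass is needed.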