For this month’s Geospatial Frequently Asked Question (G-FAQ), I turn my attention away from map making and toward a topic that few of us likely consider but perhaps more should: naming conventions. When I first started using computers, I remember the black DOS screen with a few white characters, one of them flashing, that you stared at when your PC started up (after waiting about 5 minutes, that is!). In those days of bright fluorescent colors, hair bands and Joe Montana’s 49ers, people were forced to think about naming conventions, as spaces were not recognized when you named your files in DOS. Further, navigating between multiple folders with long names and complex structure was a nightmare, as everything was a typed command rather than the point-and-click ease we have all grown used to in Windows.
The relative ease of current Windows operating environments versus DOS has also caused most of us to forget that naming rules still exist, even within Windows 7. But what exactly are these rules and what, if any, specific rules apply to naming geospatial data? In this G-FAQ, I explore this topic and offer my recommendations as related to this set of core questions:
- What naming rules apply to Microsoft Windows?
- What rules are there to follow in ArcGIS?
- What happens if I break these rules?
- What are the best practices I can use to name my files and folders so that everyone in the organization can use them more effectively?
Naming Rules in Windows
To delve into this G-FAQ, let’s start with the over-arching naming rules that Windows enforces:
- A file’s path name is limited to 260 characters. A path name contains the following elements in this stated order: drive letter, colon, backslash, folder name(s), file name and a terminating null character. The drive letter, colon, first backslash and null character are added by default, meaning that users have 256 characters to use between the folder(s) and file names. As an example, a file named test.shp on my desktop has a path name of C:\Users\Brock\Desktop\test.shp; so while the file name is 8 characters long, the path name is actually 31 characters long.
- The following special characters cannot appear in a folder or file name: / \ : * ? " < > |
Naming Rules in ArcGIS
When it comes to specific Windows geospatial applications, all of them must adhere to the rules put forth above, and many of them have their own naming intricacies that users should familiarize themselves with. As there are far too many applications to cover in this short piece, my focus will be on the most widely used geospatial application today, ArcGIS. Arc uses a wide variety of file formats, and by far the most restrictive formats with regard to naming are coverage and grid files, as these files must have names that are: (1) 13 or fewer characters; and (2) all lowercase.
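The two Windows rules and the coverage/grid rule described so far are simple enough to check before a file is ever written. What follows is only a minimal Python sketch: the function names are my own invention for illustration and are not part of Windows or ArcGIS, and the checks cover only the rules stated in this piece.

```python
MAX_PATH = 260                 # Windows limit, counting the terminating null
RESERVED_CHARS = '/\\:*?"<>|'  # characters Windows rejects in names


def fits_windows_limit(path: str) -> bool:
    """True if the path, plus its terminating null, fits within MAX_PATH."""
    return len(path) + 1 <= MAX_PATH


def has_reserved_chars(name: str) -> bool:
    """True if a single folder or file name contains a reserved character."""
    return any(ch in RESERVED_CHARS for ch in name)


def valid_coverage_or_grid_name(name: str) -> bool:
    """Rule as stated in the text: 13 or fewer characters, all lowercase."""
    return len(name) <= 13 and name == name.lower()


path = r"C:\Users\Brock\Desktop\test.shp"
print(len(path))                                 # 31, matching the count in the text
print(fits_windows_limit(path))                  # True
print(has_reserved_chars("roads?.shp"))          # True
print(valid_coverage_or_grid_name("Elevation"))  # False: contains an uppercase letter
```

Note that has_reserved_chars is meant for one folder or file name at a time; a full path will always trip it because of the colon and backslashes.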
When it comes to naming Esri shapefiles and GeoTIFF files (the most common formats we use here at Apollo Mapping), there are no specific ArcGIS rules to follow but there are definitely best practices that should be observed. When they are not observed, issues can abound; I will discuss a few of these issues in the next section. Here is a list of best practices we recommend you follow:
- Do not use spaces in names. If you want to indicate a space, use an underscore (i.e. _).
- Do not use characters other than letters, numbers and underscores in names.
- Do not start file names with a number.
- Do not put periods in a file name as they typically indicate a file extension follows.
- Find a balance between descriptive and oversimplified names. I will discuss this topic in more depth below but I tend to use file names that are 15 characters or fewer.
I should mention a quick side note on naming shapefiles, or rather changing the name of shapefiles. A shapefile is actually a group of individual files including, at minimum, a .SHP, .DBF, .SHX and .PRJ. If you want to change the name of a shapefile in Windows Explorer, then you need to change the file name of all the individual components of the shapefile – not just the .SHP file.
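The five best practices in this section, together with the side note on renaming every component of a shapefile, can be combined into one small helper. Treat this as a rough Python sketch under my own assumptions: name_problems and rename_shapefile are hypothetical names, the 15-character cap is my personal habit rather than an ArcGIS rule, and the extension list covers common shapefile components but may not be exhaustive.

```python
import re
from pathlib import Path

MAX_NAME_LEN = 15  # personal preference from the text, not an ArcGIS rule
ALLOWED = re.compile(r"[A-Za-z0-9_]+")
# Common shapefile component extensions (assumed list, may not be complete).
COMPONENT_EXTS = {".shp", ".shx", ".dbf", ".prj", ".cpg", ".sbn", ".sbx", ".shp.xml"}


def name_problems(stem: str) -> list[str]:
    """Check a file name (without its extension) against the best practices."""
    problems = []
    if " " in stem:
        problems.append("contains a space; use an underscore")
    if "." in stem:
        problems.append("contains a period")
    if not ALLOWED.fullmatch(stem):
        problems.append("uses characters other than letters, numbers and underscores")
    if stem[:1].isdigit():
        problems.append("starts with a number")
    if len(stem) > MAX_NAME_LEN:
        problems.append(f"longer than {MAX_NAME_LEN} characters")
    return problems


def rename_shapefile(folder: Path, old_stem: str, new_stem: str) -> list[Path]:
    """Rename every component of a shapefile, not just the .shp file."""
    problems = name_problems(new_stem)
    if problems:
        raise ValueError(f"{new_stem!r}: " + "; ".join(problems))
    # Collect (source, target) pairs first so nothing is renamed if a clash exists.
    pairs = []
    for part in sorted(folder.iterdir()):
        ext = part.name[len(old_stem):]
        if part.is_file() and part.name.startswith(old_stem) and ext.lower() in COMPONENT_EXTS:
            pairs.append((part, folder / (new_stem + ext)))
    for _, target in pairs:
        if target.exists():
            raise FileExistsError(f"{target.name} already exists")
    return [source.rename(target) for source, target in pairs]
```

Calling rename_shapefile(Path("C:/data"), "roads", "streets") would, under these assumptions, rename roads.shp, roads.dbf, roads.shx and roads.prj together and leave unrelated files alone; the folder here is only a placeholder.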
What Can Happen in ArcGIS With ‘Bad’ Names
ArcGIS does little to restrict users when it comes to the two naming rules for Windows put forth above. For example, you are able to use special characters in the name of a shapefile you create from a toolbox function; but when you try to actually run the function, ArcGIS fails giving you an Error 999999 with no explanation on how to solve the problem.
And if you try to create a shapefile with a path name that exceeds the 260-character limit put forth above, ArcGIS will fail with an Error 000210. Again, the help menu for this error offers little to no assistance in solving the issue. So in both cases, it is up to the user to understand the limitations of Windows to solve these issues.
ArcGIS will also allow you to violate all five of the best practices for naming put forth above. In each case, you can create shapefiles, TIFFs, etc. that will open and function properly but issues can still occur – and as Murphy’s Law states, these issues will always happen right before a major deadline. In my years of working with ArcGIS, here are the ‘oddities’ I have encountered and traced back to naming conventions:
- Lost data – this seems to happen most often when numbers are used to start file names and/or periods are used in a file name.
- Corrupt files – this seems to happen most when names get very long after running multiple toolbox functions on a set of files.
- Unstable files – by this I mean files that seem to open but might not work in toolbox functions; take, for example, this technical note written by Esri. I have seen periods used improperly in file names cause this as well.
- Incomplete production – by this I mean running toolbox functions but receiving no output. I have often solved this problem by choosing a simpler output name and moving all the source and output destinations to a folder in the root C:\ directory.
File and Folder Naming Best Practices
For the remainder of this G-FAQ, I put forth several recommendations that will help GIS professionals at large firms as well as GIS tinkerers keep their geospatial data organized. In large organizations, establishing file and folder naming conventions is an absolute must for improved efficiency, as you might have tens or even hundreds of people accessing the same set of data files.
- It is crucial to find a balance between identifying the information contained in a file and/or folder and creating obscenely long names. We suggest 15 characters or fewer for both file and folder names. Here are some ways to keep file and folder names short:
- Use industry standard abbreviations to convey additional information.
- Some companies choose a combination of letters and numbers in a particular order to organize data. If that is what you choose, be sure to store a master lookup table in a central location to decode the cryptic titles.
- If you are creating files as part of an analysis, develop a naming schema ahead of time, for instance adding a version number to the end of all file and folder names.
- When you pick a name, you should try to include details on the elements in the file/folder; the scale at which it was created; the agency that created it; and the date. This will help future users surf through your geospatial datasets more efficiently.
- Establish a set folder structure that is flexible enough to account for multiple project types but is also consistent enough to assure a common thread between all the folders and subfolders for a project. For instance, a folder with the project name or number is an obvious top-level choice; with subfolders below for project scope, background files, working files, etc.
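To make the coded-name and version-number recommendations above a little more concrete, here is one possible schema sketched in Python. Everything in it is an assumption for illustration only: the theme and agency codes, the name layout (theme, agency plus two-digit year, version) and the function names are invented, and scale is left out to keep the name short. A real organization would keep its lookup tables in one central, shared file.

```python
# Hypothetical lookup tables; codes and meanings are made up for this sketch.
THEMES = {"rd": "roads", "hy": "hydrography", "pc": "parcels"}
AGENCIES = {"usgs": "U.S. Geological Survey", "cty": "county GIS office"}
MAX_NAME_LEN = 15  # suggested cap from the text


def build_name(theme: str, agency: str, year: int, version: int) -> str:
    """Compose <theme>_<agency><yy>_v<version> from codes in the lookup tables."""
    if theme not in THEMES or agency not in AGENCIES:
        raise KeyError("unknown code; add it to the lookup table first")
    name = f"{theme}_{agency}{year % 100:02d}_v{version}"
    if len(name) > MAX_NAME_LEN:
        raise ValueError(f"{name} is longer than {MAX_NAME_LEN} characters")
    return name


def decode_name(name: str) -> dict:
    """Turn a coded name back into plain language using the lookup tables."""
    theme, middle, version = name.split("_")
    agency, yy = middle[:-2], middle[-2:]
    return {
        "theme": THEMES[theme],
        "agency": AGENCIES[agency],
        "year": yy,
        "version": int(version[1:]),
    }


name = build_name("rd", "usgs", 2012, 2)
print(name)               # rd_usgs12_v2
print(decode_name(name))  # theme, agency, two-digit year and version in plain language
```

The names this produces start with a letter and use only letters, numbers and underscores, so they also respect the best practices listed earlier in this G-FAQ.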
Do you have an idea for a future G-FAQ? If so, let me know by email at firstname.lastname@example.org.
Brock Adam McCarty