Developer Compendium: Naming

Introduction
Reveal intent
- Say what it is
- Say what it does
Use real words
Avoid conflict
- Technical conflicts
- Cognitive conflicts
  - Namespace-dependent conflicts
  - Framework-dependent conflicts
Avoid type suffixes
Avoid property prefixes and suffixes

Introduction

Whether it's your first child or your ten thousandth variable, naming things can be a painstaking endeavour.

Does their name rhyme with something rude? Do their initials spell a funny word? Is their name already deeply rooted in pop culture by fame or infamy, resulting in a lifetime of banter and casual mockery?
Will my colleagues know what this variable means? Will I know what this variable means 6 months from now?

Although naming children is a less common issue in the context of programming (albeit one I can personally relate to), the latter is something which, and I believe I can speak with a great degree of confidence here, every developer has struggled with at some point, and will likely continue to struggle with until they eventually keel over and perish from malnutrition because they forgot to break for lunch again.

[Image: meme panel showing an actor playing Pablo Escobar lost in thought in various locations, with the caption "when you try to choose a meaningful variable name"] Picking a name can be hard

The resources used for this article are:

Clean Code: A Handbook of Agile Software Craftsmanship - Robert C. Martin

Although the content of this article contains a lot of my own findings and opinions, it is inspired by this book, which I'd recommend to developers of any level who care about the quality of their code.

Reveal intent

Programming is difficult, and it shouldn't be made more difficult by obscuring key information. Gone are the days when programmers needed to shorthand everything to ensure their code would fit onto a 3.5 inch floppy disk (cue chuckles from anyone born after the millennium), so don't be afraid to be verbose. A variable holds a piece of data in memory, and a function performs...a function. You should be able to determine their intent at first glance.

Say what it is

Consider the following:

var i = 0;

What is i? Convention dictates that it could possibly refer to the index of an item within an array, but can you be certain without looking further into the code? If it's used in the context of a for loop, for example:

for (var i = 0; i < 10; i++) { ... }

then you can be quite confident that i is indeed an index. The former example, however, appears in the absence of any context, and could therefore be anything. Providing context to a variable doesn't excuse you from naming it properly though!

var i = 0; // Index of a product within an array of products

Is this better? Technically, yes - but not really. Someone reading this piece of code will know that the variable refers to the index of a specific product, but the comment shouldn't be necessary, and worse still, any code which references this variable will still be missing the context, so the reader would need to scroll up to the definition to determine its purpose from the comment.

The use of comments in code can be very divisive within the programmer community. Some will happily litter their code with comments with wanton abandon, whilst others will flat-out refuse to approve a PR until a comment is removed. I'm on the fence. It depends: sometimes a piece of code is complex enough to benefit from some additional context without needing to refer to separate documentation, but in this case, just name the variable better.

var productIndex = 0;
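
To show how the better name carries its context to every place it's used, here's a minimal sketch. The productNames list and its contents are made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data, invented for this example.
var productNames = new List<string> { "Widget", "Gadget", "Gizmo" };

// The name says what the value is, so no comment is needed here...
var productIndex = productNames.IndexOf("Gadget");

// ...or further down, where the declaration may have scrolled out of view.
if (productIndex >= 0)
{
    Console.WriteLine($"Found {productNames[productIndex]} at position {productIndex}");
}
```

Swap productIndex back to i in your head and the last few lines become noticeably harder to follow.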

Say what it does

Consider the following function signature:

public AcmeUser Get(Guid id) { ... }

Now, looking just at the function name Get, consider the following:

Get what?
Get how?

This signature is very simple, so these questions can easily be answered by looking at the signature as a whole. However, if you're looking at existing code which was written poorly, you'll need to spend time investigating what the hell it's supposed to be doing:

var u = _service.Get(id); ❓

and even if the variables were defined better, you may still need to look at the definition's return value and parameter list to know for sure. These are things you want to avoid. The code should be readable as-is, not readable once you've had to spend time investigating.

This example could be improved by clarifying the verb or action with an indication of what type of object you're acting upon, and how it is to be retrieved.

public AcmeUser GetUserById(Guid id) { ... }

The same rules apply to parameter names. In the example above we define a parameter for id - the function is named well so we can assume that it refers to the user's ID - but what if there's an overload to only find a user if they are part of the same tenant? We'd likely call this additional parameter tenantId to differentiate the two, but why not match the pattern and call the first parameter userId to avoid any guesswork?
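
As a rough sketch of how that pair of overloads might look, assuming an in-memory list in place of a real data store: the UserService class, the shape of AcmeUser and the sample data are all hypothetical and only exist to make the example runnable.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var tenantId = Guid.NewGuid();
var userId = Guid.NewGuid();

var userService = new UserService(new List<AcmeUser>
{
    new AcmeUser(userId, tenantId, "Ada"),
});

// Both calls can be read without opening the definition.
var user = userService.GetUserById(userId);
var userInOtherTenant = userService.GetUserById(userId, Guid.NewGuid());

Console.WriteLine(user?.Name ?? "not found");              // prints "Ada"
Console.WriteLine(userInOtherTenant?.Name ?? "not found"); // prints "not found"

// Hypothetical types, invented for this example.
public record AcmeUser(Guid Id, Guid TenantId, string Name);

public class UserService
{
    private readonly List<AcmeUser> _users;

    public UserService(List<AcmeUser> users) => _users = users;

    // Finds a user by their ID alone.
    public AcmeUser? GetUserById(Guid userId) =>
        _users.FirstOrDefault(user => user.Id == userId);

    // Finds a user by their ID, but only within the given tenant.
    public AcmeUser? GetUserById(Guid userId, Guid tenantId) =>
        _users.FirstOrDefault(user => user.Id == userId && user.TenantId == tenantId);
}
```

With both parameters named after what they identify, there's far less chance of passing the two Guid values in the wrong order without noticing.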

Use real words

Use proper words to name things and make sure you spell them correctly. If you're searching for a particular keyword across a solution, then you'd expect the thing you're trying to find to be spelled correctly. People don't habitually search for misspellings when performing searches. Furthermore, if you're working in a team, especially one which spans the globe, ensure there's an agreement on whether to name things using UK or US English (e.g. IsAuthorised versus IsAuthorized).

Keep in mind that most IDEs let you find and replace text strings as whole words, with or without matched casing. If you want to run a quick search within a file for all references to product because you want to make it clear that these are all actually a superAwesomeProduct, it will be much easier if they're all named properly and consistently.
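
To illustrate the point, here's a small sketch which mimics a case-sensitive "match whole word" search using a regular expression. The snippet being searched is invented for the example, and a real IDE search will of course differ in the details.

```csharp
using System;
using System.Text.RegularExpressions;

// A made-up fragment of source code to search through.
var source = @"
var prd = new Product();
var prod = new Product();
var product = new Product();
Save(product);
";

// \b marks a word boundary, so only the whole word "product" matches.
var wholeWordMatches = Regex.Matches(source, @"\bproduct\b");
Console.WriteLine(wholeWordMatches.Count); // prints 2

// The same pattern drives the rename; prd and prod are left behind.
var renamedSource = Regex.Replace(source, @"\bproduct\b", "superAwesomeProduct");
Console.WriteLine(renamedSource);
```

The abbreviated names slip through the search entirely, which is exactly the sort of thing you only discover later.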

var prd = new Product(); ❌
var prod = new Product(); ❌
var produtc = new Product(); ❌
var product = new Product(); ✅

Avoid abbreviations and acronyms where possible. These are great for helping your code look more concise, but at what cost? If your company starts to ship a new line of Super Awesome Products and you name your class SAP or SaProduct, will someone new to the project know what it is? Unlikely. They'll need to either rummage through some documentation (assuming there is any), ask around, or try to figure it out from context. Save everyone some time, reduce friction, and just call it SuperAwesomeProduct.

Like a lot of topics on programming there are always a few caveats, which result in inconsistencies, and mean that a "rule" isn't always strictly adhered to. For example, I've just said to avoid acronyms, but I frequently use them with LINQ expressions, for example:

var superAwesomeProducts = GetSuperAwesomeProducts();
var electricalProducts = superAwesomeProducts.Where(sap => sap.Category == Category.Electrical);

When I first started programming, a lot of my more experienced colleagues, along with many of the resources I saw online, used a random letter for the lambda expression variable, usually x for some reason, or m for model, which I was never a fan of. In my eyes, this alternative is an acceptable use of an abbreviation. The context is implied by the variable being filtered on, i.e. you know that you're iterating on a list of superAwesomeProducts, so it's safe to assume that sap refers to a singular superAwesomeProduct.

There are some exceptions to this, however. If you're shortening a standard or well-known term, which may be common across the codebase, or within the industry in general, it's perfectly acceptable to stick with those conventions. For example, if you instantiate a new HttpClient, it's common to define the variable simply as client - you certainly wouldn't call it a hypertextTransferProtocolClient because it doesn't provide any extra useful information, but you may call it httpClient if there are other instantiations of various "client"-like objects to allow you to differentiate between them.

Solutions can easily grow to hundreds of thousands of lines of code spread across thousands of files. It's not possible to keep track of where everything is, and it's a nightmare trying to find things, especially when trying to get your bearings on a new project. If your classes, functions and variables are all named using proper, meaningful words, it makes finding stuff a lot simpler.

Avoid conflict

Sage advice if you're not confrontational by nature, but in this case I'm referring to conflicts in your code. Compilers and IDEs will generally warn you if you're creating a technical conflict, but they won't flag a cognitive one. Allow me to elaborate...

Technical conflicts

If you declare two identically named variables within the same scope, the compiler will fail to build your application and present an error. As such, this type of conflict is relatively easy to find and resolve. If you're using a language without a separate compile step, such as JavaScript, the failure can be less obvious: redeclaring a variable with var is permitted and the second declaration silently replaces the first, so you'll simply find that something somewhere will break because the same name is being used in two potentially unrelated contexts. A lot of IDEs will still highlight the duplication though, even though nothing will physically prevent the application from running.

The guidance here is simple: don't give different things the same name as each other. If they are truly different, there will be something which differentiates them. For example:

var products = new List<SuperAwesomeProduct>();
var products = new List<CrapProduct>();

These objects are both collections of products, true, yet the thing which identifies them as different is their quality, which can be used to name the variables more clearly.

var superAwesomeProducts = new List<SuperAwesomeProduct>(); 🦸
var crapProducts = new List<CrapProduct>(); 💩

The same applies when using reserved keywords as variable names or parameters. For example, most programming languages define a primitive type of string. As such, you can't (or at least shouldn't if there's no physical restriction) name your variables string.

var string = "Hello, world!"; ❌

Some languages let you escape a keyword so that the compiler treats it as an ordinary identifier (C#, for example, uses an @ prefix). This bypasses any issues raised by the compiler, but should still be avoided as keywords aren't generally descriptive enough to accurately describe a variable's intent.

var @string = "Hello, world!"; ❌
var greeting = "Hello, world!"; ✅

Cognitive conflicts

If you've ever worked on a solution containing multiple projects, then chances are that you've encountered objects which have the same name across different assemblies, but with different meanings and definitions.

Cognitive complexity is a common term in software engineering, and is a measure of how difficult something is to comprehend. A cognitive conflict refers to a naming conflict which is essentially all in your head. Otherwise known as an ambiguity, it's not always easy to keep track of which object you're referencing because their names are reused, but they aren't technically conflicting from a compiler's point of view.

Take for example a common architectural pattern which splits an application into three distinct layers:

Presentation: an API which defines consumer endpoints and handles requests and responses
Business logic: a service layer containing all of the logic associated with managing resources
Data access: a layer which interacts directly with entities in the database

Each of these three layers needs to communicate with the others, and will be referencing the same resources in some way or another when passing data between them.

Namespace-dependent conflicts

Keeping with the same example of managing products, each layer will need to either define its own version of a Product class, or reference a common assembly. The former is more common here because the presentational layer will typically display only a subset of data, or data combined from multiple sources. This is where ambiguous naming practices can create a cognitive conflict.

// Presentation layer
public class Product {
    public string Name { get; set; }

    public string Description { get; set; }

    public float Price { get; set; }
}

// Business logic layer
public class Product {
    public Guid Id { get; set; }

    public string Sku { get; set; }

    public string Name { get; set; }

    public string Description { get; set; }

    public float Price { get; set; }

    public int StockCount { get; set; }
}

// Data access layer
public class Product {
    public Guid Id { get; set; }

    public string Sku { get; set; }

    public string Name { get; set; }

    public string Description { get; set; }

    public Supplier Supplier { get; set; }

    public float CostPrice { get; set; }

    public float RetailPrice { get; set; }
}

When referencing a product, you need to be sure that you're using the appropriate Product class from the correct assembly.

var product = new Product();

Which one is being referenced here? The three classes each define a product, but they are all distinct from one another, and referencing the wrong one can cause build or runtime errors, or create accidental circular dependencies from automatic imports. A common naming pattern for this particular project structure is to use a suffix:

Presentation: append "ViewModel", e.g. ProductViewModel
Business logic: append "Dto" (Data Transfer Object), e.g. ProductDto
Data access: the entity itself, e.g. Product
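
Here's a minimal sketch of how the three names might sit side by side once the suffixes are applied. The classes are trimmed down from the ones above for brevity, and ProductMapper is a hypothetical helper I've invented for the example rather than part of any framework.

```csharp
using System;

// The data access layer hands back an entity...
var product = new Product
{
    Id = Guid.NewGuid(),
    Sku = "ACME-001",
    Name = "Acme Widget 2.0",
    RetailPrice = 9.99f,
};

// ...which is mapped on its way up through the layers.
var productDto = ProductMapper.ToDto(product);
var productViewModel = ProductMapper.ToViewModel(productDto);

Console.WriteLine(productViewModel.Name); // prints "Acme Widget 2.0"

// Data access layer: the entity itself.
public class Product
{
    public Guid Id { get; set; }
    public string Sku { get; set; } = "";
    public string Name { get; set; } = "";
    public float RetailPrice { get; set; }
}

// Business logic layer.
public class ProductDto
{
    public Guid Id { get; set; }
    public string Sku { get; set; } = "";
    public string Name { get; set; } = "";
    public float Price { get; set; }
}

// Presentation layer.
public class ProductViewModel
{
    public string Name { get; set; } = "";
    public float Price { get; set; }
}

// Hypothetical mapping helper, invented for this example.
public static class ProductMapper
{
    public static ProductDto ToDto(Product product) => new ProductDto
    {
        Id = product.Id,
        Sku = product.Sku,
        Name = product.Name,
        Price = product.RetailPrice,
    };

    public static ProductViewModel ToViewModel(ProductDto productDto) => new ProductViewModel
    {
        Name = productDto.Name,
        Price = productDto.Price,
    };
}
```

Each variable and type now tells you which layer it belongs to at a glance, and an automatic import can't quietly pick the wrong Product.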

Framework-dependent conflicts

Similar to the self-inflicted ambiguity described above, these conflict issues can be further compounded if you define objects using names which are already standard as part of the framework you're developing in. For example, you may be working on a task which needs to manage user accounts within an application. Identity and authentication frameworks typically define their own User objects, so if you are creating your own abstraction locally, you need to be able to differentiate the identity version from your own (e.g. AcmeUser, ApplicationUser).
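
As a sketch of the idea, the example below puts a framework-style User next to a local AcmeUser in the same file. The Hypothetical.Identity namespace stands in for whichever identity framework you happen to use, and both classes are invented for illustration.

```csharp
using System;
using Acme.Accounts;
using Hypothetical.Identity;

// Both namespaces are imported, yet neither type name is ambiguous.
var identityUser = new User { UserName = "tom@example.com" };
var acmeUser = new AcmeUser
{
    IdentityUserName = identityUser.UserName,
    DisplayName = "Tom",
};

Console.WriteLine($"{acmeUser.DisplayName} signs in as {acmeUser.IdentityUserName}");

// Stand-in for a third-party identity framework (an assumption for this example).
namespace Hypothetical.Identity
{
    public class User
    {
        public string UserName { get; set; } = "";
    }
}

// Our own abstraction, named so it can't be mistaken for the framework's User.
namespace Acme.Accounts
{
    public class AcmeUser
    {
        public string IdentityUserName { get; set; } = "";
        public string DisplayName { get; set; } = "";
    }
}
```

Had the local class also been called User, every file importing both namespaces would need fully qualified names or aliases just to compile.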

Avoid type suffixes

Variables shouldn't declare their type within their names, as the type can be inferred from the value assigned. The prefixed form of this practice is commonly referred to as Hungarian Notation, which had its relevance in the past, but is no longer required in most modern programming landscapes.

var productNameString = "Acme Widget 2.0"; ❌
var sProductName = "Acme Widget 2.0"; ❌

The compiler (and your IDE) will infer that these variables are of type string, so just call it productName. Similarly, when storing a collection, appending List to a collection of products isn't necessary, when simply products will suffice. The plural is sufficient to imply a collection of multiple items.

If a word's plural is the same as its singular, then feel free to break one of the rules such as spelling or suffixes, for example:

var deer = new Deer(); // A singular deer

var deers = new List<Deer>(); // A collection of deer using a questionable plural
var deerList = new List<Deer>(); // A collection of deer using an appended type

Alternatively, another classification may be more suitable if it makes sense, for example:

var animals = new List<Deer>(); // Generic grouping (e.g. Deer may inherit from an Animal base class)
var herd = new List<Deer>(); // Proper way to refer to a group of deer
var cervids = new List<Deer>(); // Scientific classification for the deer family. I promise I didn't just Google this...

The last one was a joke. Try to avoid obscure specialist terms or foreign languages for variable names (e.g. Latin or Greek terms), unless of course your team of developers are classically trained and they'll know exactly what you mean. But even then...don't.

Similarly, many words will have plurals which don't simply add an "s", for example:

var media = new List<Medium>();
var indices = new List<Index>();
var formulae = new List<Formula>();

Again, use some of these sparingly as they could be misleading. For example, "media" in most contexts will make people think of journalism or broadcast media, rather than a list of guest speakers for a talk on channelling the spirits of long-lost relatives.

Avoid property prefixes and suffixes

When defining a class and its members, you may often see the type prefixed on each property, for example:

public class Product {
   public string ProductName { get; set; }

   public string ProductDescription { get; set; }

   public string ProductPrice { get; set; }
}

These prefixes don't really add anything. If you're looking at a product definition, you already have the context of the type, so it doesn't need to be repeated for every property. Additionally, your IDE's IntelliSense will be noisier, since typing "Product" might list all of the properties within Product, when all you really wanted was the class.
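
For comparison, here's a sketch of the same class with the prefixes dropped. I've assumed a float for the price to match the earlier layer examples, and the sample values are made up.

```csharp
using System;

var product = new Product
{
    Name = "Acme Widget 2.0",
    Description = "A hypothetical widget.",
    Price = 9.99f,
};

// The variable already supplies the context, so product.Name reads naturally.
Console.WriteLine(product.Name); // prints "Acme Widget 2.0"

public class Product
{
    public string Name { get; set; } = "";

    public string Description { get; set; } = "";

    public float Price { get; set; }
}
```

Compare product.Name with product.ProductName: the second version says the same thing twice.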

As with acronyms and abbreviations, there can be exceptions. Booleans and dates can often be seen with a prefix or a suffix. The former may be prefixed with a verb such as Is or Has as opposed to just the name by itself, and the latter can often be seen prefixed or suffixed with Date.

public bool IsActive { get; set; }

public DateTime DateCreated { get; set; }
public DateTime CreatedDate { get; set; }

In some cases a prefix or suffix can help clarity, for example:

Todo.Completed = // ... ❓

The property Completed is ambiguous without knowing the type. Does it imply that a task has been completed, or the date on which the task was completed? These can easily be clarified with a prefix or suffix:

public bool IsComplete { get; set; }
public DateTime CompletedOn { get; set; }

Any IDE worth its salt will infer types and display information about a property, and a compiler will usually warn you if you're trying to Assert.True on a DateTime, but it's better to be clear from the outset.
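
Putting the two properties together, one possible shape for the class is sketched below. Deriving IsComplete from CompletedOn is my own assumption for the example, as is the Title property; you may well prefer to store both values independently.

```csharp
using System;

var todo = new Todo { Title = "Rename ambiguous properties" };
Console.WriteLine(todo.IsComplete); // prints "False"

todo.CompletedOn = new DateTime(2024, 4, 26);
Console.WriteLine(todo.IsComplete); // prints "True"

public class Todo
{
    public string Title { get; set; } = "";

    // When the task was completed, or null if it is still outstanding.
    public DateTime? CompletedOn { get; set; }

    // Whether the task has been completed (derived from CompletedOn in this sketch).
    public bool IsComplete => CompletedOn.HasValue;
}
```

Neither name needs the type spelled out, yet there's no doubt about which one holds a date and which one answers a yes or no question.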

Conclusion

In summary, do everyone a favour and put a bit of thought into how you name things. As a bare minimum, try and put yourself in the shoes of someone who doesn't know your code at all, and ensure you can at least hazard an educated guess at the purpose of each variable or function without having to delve into the logic. Your colleagues (and your future self) will appreciate it!

Metadata

Title: Developer Compendium: Naming

Author: Tom Jones

Created: 26/04/2024

Tags: programming productivity