Single-use functions — you aren't gonna need them

New programmers are commonly taught to separate their code into many small functions, each with a descriptive name, a single purpose, and called from exactly one location. As they move into industry, they'll find many experienced programmers espousing the same views. In a code review, they might receive feedback that a piece of code is too long or complex to review, and would be more easily understood if it were broken up into different functions and scattered to the winds (presumably so the reviewer can ignore them). When they read the writing of industry experts, they'll see people proposing functions even as small as a few lines.

These people are wrong, and they should feel bad.

More seriously, I'd like to present an argument for vigorously inlining single-use functions, with the end result being a more readable and maintainable codebase.

"Self-documenting" code

One of the common arguments for single-use functions is to make the code "self-documenting". The belief is that by making functions small and giving them clear and descriptive names, it becomes possible to skim the code and understand what it's doing. Unfortunately, one of the laws of physics (or our current programming languages) is that function names are limited to a single identifier. If it ever becomes necessary to provide more than a short description of what the function does, the self-documenting programmer finds the need to dust off their // or # key and leave a comment.

def configureDependency(foo):
    ...  # A short implementation

def doSomethingUseful():
    foo = Foo()
    configureDependency(foo)
    return profit(foo)

There are many instances where this might occur, but the biggest one concerns why a piece of code was written. While it may be clear what configureDependency() does, the curious software archaeologist is frequently more interested in how or why we need to configure the dependency in the first place. The judicious application of a comment turns this mystery into code that's easily accessible to future developers (including the author!). However, this raises the question: why not write the easily understandable function name in a comment above the code in the first place?

# Configure our dependency, making it possible to `profit()` later on.
# NOTE: We previously tried the obvious approach of just `profit()`ing, but
# found that [something broke] when we did that.  With version [foo] of
# [dependency], it appears necessary to do [x,y,z] first.
def configureDependency(foo):
    ...  # A short implementation

Suddenly, we know why our code looks this way, and the potentially unintuitive behaviour has become obvious. Can we do better?

We've documented the necessity in one place, but any reader of doSomethingUseful() who is confused about why our dependency needs configuring still needs to do work to figure out the cause. Modern IDEs with the ability to hover over function calls or quickly jump-to-definition help here, but we've added more work - the developer has to be alert and looking for more information, and won't fall into the Pit of Success of being able to understand the code that will run.

Luckily, a quick inlining pass saves the poor soul.

def doSomethingUseful():
    foo = Foo()

    # Configure our dependency, making it possible to `profit()` later on.
    # NOTE: We previously tried the obvious approach of just `profit()`ing, but
    # found that [something broke] when we did that.  With version [foo] of
    # [dependency], it appears necessary to do [x,y,z] first.

    # ... Some short code configuring the dependency ...

    # ... A helpful comment about why/how we profit.

    # ... Some code that profits
    return profit

We've replaced the self-documenting aspects of our code with explicit documentation, handcrafted for the future reader or maintainers of this code. In the process, we saw that the separate "self-documenting" functions were no longer providing value and inlined them. As a result, it's now possible to read straight through doSomethingUseful(), skimming sections of code that aren't currently useful based on their comments, while being able to understand everything the code is doing.

Deep understanding of the codebase

I believe one of the most important traits in a software developer is the ability to understand the entire stack of software they're working with. Having a concept of what's going on under the hood is extremely important whenever abstractions leak. The more you know about layers under the code you're writing, the more likely you'll be able to write correct or efficient code, or to debug it when something goes wrong.

At the extreme end, an ability to debug assembly and understand the underlying operating system could come in handy when you're investigating mysterious data corruption. For a more banal example, understanding how browsers repaint could be key to finding the piece of JavaScript that was triggering a reflow and making your website unacceptably slow.

In a similar fashion, this ability to understand what's going on in the system applies to code that's "below" the layer you want to think about. While it's important to be able to think about code at a higher level, it's also frequently useful to be able to pierce the veil and reason about how the system is actually working. One of the benefits of rewriting the previous single-use function was our ability to understand the whole function, rather than being pushed towards ignoring its implementation.

Single-use functions can make it easier to skim unfamiliar code when you don't need a deep understanding of it, but make it harder to fully comprehend how the system works. Inlining these functions can add extra lines and visual noise when you're just trying to skim past on an initial reading, but makes it easier for you to see how all the parts come together when taking more than a cursory glance. Rather than treating half the codebase as magic, engineers are empowered to see precisely how things work, making it easier for them to debug, refactor, or add new features.

Not all functions matter

When trying to understand a file for the first time, it can sometimes be confusing where to start reading. Which functions are entry-points into the file, and which are helpers you can skim over? This becomes much easier to reason about in a codebase without pervasive single-use functions; you'll only see "important" helper functions (used several times, so worth thinking about), and entry-points. Given an arbitrary file, you'll likely be able to read top-down and have a good idea of what the code is doing and where its functionality may be used — a difficult task in a file full of single-use functions.
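
As a rough illustration of what such a file might look like, here is a minimal sketch. Every name in it is hypothetical; the point is only that the sole named helper is one used from more than one place, and everything else is an entry-point you can read top-down.

```python
# Hypothetical module: all names here are invented for illustration.

def normalizeEmail(raw):
    """Shared helper: used by both entry-points below, so it earns a name."""
    return raw.strip().lower()


def registerUser(users, rawEmail):
    """Entry-point: add a user to the set, rejecting duplicates."""
    email = normalizeEmail(rawEmail)

    # Reject duplicates up front so the caller gets a clear signal.
    if email in users:
        return False

    # Record the new user.
    users.add(email)
    return True


def isRegistered(users, rawEmail):
    """Entry-point: check whether an email has already been registered."""
    return normalizeEmail(rawEmail) in users
```

A reader opening this file sees three functions and can reasonably assume that each one matters: one is shared, and the other two are the ways in.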

Accidental duplication

Because single-use functions encourage splitting code into small pieces, it can be surprisingly common to have unnecessary code duplication. On top of being simply inefficient, this can lead to many more lines of code than if they were inlined, and complicate future refactoring that doesn't take the single-use aspect into account. A simple example is needing to branch on some property:

def efficientUseful(foo):
    if foo.cached_computation is None:
        foo.cached_computation = someComputation(foo)
    return useful(foo.cached_computation)

def efficientBoring(foo):
    if foo.cached_computation is None:
        foo.cached_computation = someComputation(foo)
    return boring(foo.cached_computation)

def doSomething():
    foo = Foo()
    # Let's use the efficient variant of these functions!
    a = efficientUseful(foo)
    b = efficientBoring(foo)
    return a + b

The above example is obviously rather contrived, but it's surprisingly common to see code like this in real software, particularly when these functions are separated by hundreds of lines (and dozens of other single-use functions). A quick inlining pass shows us the error in our ways, and we're able to clean this code up a bit:

def doSomething():
    foo = Foo()
    computation_result = someComputation(foo)
    a = useful(computation_result)
    b = boring(computation_result)
    return a + b

Suddenly, it's possible to see exactly what's happening! We need to run someComputation() on foo once, then perform two operations on the result. By removing these unnecessary single-use functions, we've reduced the lines of code to read, reduced the number of instructions to execute, and made it simpler to understand the function's behaviour.

In real codebases, these small pieces of duplicated code can be much more impactful (particularly as the codebase ages and becomes full of cruft). You may find that entire loops become unnecessary due to the way functions handle their arguments, that the set of network requests blocking your pageload weren't even necessary, or that half your code was actually dead. I've had times when I've been able to remove more than 75% of the code from a system simply by iteratively inlining single-use functions, cleaning up the suddenly-obvious dead and duplicated code, then repeating. The end result is a codebase that's substantially simpler to understand, debug, and extend.
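
To make the "entire loops become unnecessary" case concrete, here is a small sketch. The data shape and all function names are assumptions made up for this illustration; it is not taken from any real codebase.

```python
# Hypothetical example: the user records and function names are invented.

# Before: two single-use helpers, each looping over the users.
def keepActive(users):
    return [u for u in users if u["active"]]


def keepAdults(users):
    # Written defensively, without knowing the caller already filtered.
    return [u for u in users if u["active"] and u["age"] >= 18]


def countEligibleBefore(users):
    return len(keepAdults(keepActive(users)))


# After inlining both helpers, the repeated "active" check is obvious and
# the first loop can be deleted outright.
def countEligibleAfter(users):
    return sum(1 for u in users if u["active"] and u["age"] >= 18)
```

Both versions return the same count, but the second makes a single pass and has a third of the functions to read.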

Surprising code

Have you ever been surprised to see that some simple-looking code was allocating memory or making unexpected network requests? As you looked closer at this behaviour, you probably discovered a single-use function lurking in the shadows and doing something surprisingly different than what you expected. Single-use functions can have a pernicious influence on a codebase in this way due to the way many changes are made. Consider the case of this function one might spot in a free-to-play game:

def renderScene():
    rendered = ...  # math for graphics to render an output frame

    # We have ~5ms left in the frame, let's mine some cryptocurrencies!
    mineBitcoinsForTime(Milliseconds(5))
    return rendered

One of the benefits of functions (and the foremost peril of global variables) is the ability to do local reasoning — when you look at a function you ideally shouldn't need to consider how other functions interact with it. In this example, we know that we're rendering a scene, then using the rest of our frame budget on bitcoin mining. Is that really all that's happening?

def mineBitcoinsForTime(time):
    mined = ...  # code to mine dogecoin

    # That was hard work - how much money did we make?
    dollars = requests.get("https://best-cryptoexchange/...")

    # ... more complicated tax logic ...

Our simple frame rendering might actually take an arbitrary amount of time depending on how stable the cryptocurrency platform is!

Unfortunately, the lure of local reasoning means that when a programmer picks a function as the target for new functionality, it's common to consider only that function. While this can obviously lead to accidental duplication, it also takes strict discipline to keep functions from growing in scope and taking on behaviour that may surprise the future reader. The programmer who added the network request above probably never considered where mineBitcoinsForTime() was used, and likely didn't expect it would be called in such a time-limited context. If functions like this are inlined, the expanded context available to the programmer adding the feature makes surprising behaviour less likely to end up in the wrong place, and easier to spot and understand when it does.
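
Here is a rough sketch of what the inlined renderScene() might look like. The placeholder values are assumptions, and fetchExchangeRate() is a hypothetical stand-in for the requests.get(...) call in the example above (it plays the role of the library call, not of a new single-use helper), so that the sketch runs without touching the network.

```python
# Hypothetical sketch: every placeholder value and helper name below is an
# assumption made for illustration.

def fetchExchangeRate():
    """Stand-in for the blocking network request to the exchange."""
    return 2.0  # a real request could take an arbitrary amount of time


def renderScene():
    rendered = "frame"  # placeholder for the graphics math

    # We have ~5ms left in the frame, let's mine some cryptocurrencies!
    mined = 3  # placeholder for the time-boxed mining loop

    # That was hard work - how much money did we make?
    # Inlined, this blocking call sits visibly on the render path, so a
    # reviewer can ask whether it belongs in a frame-limited function.
    dollars = mined * fetchExchangeRate()

    # ... more complicated tax logic would use `dollars` here ...
    return rendered
```

Anyone adding the exchange-rate lookup here has to write it directly beneath the comment about the 5ms frame budget, which makes the mismatch much harder to miss.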

DRY vs YAGNI

One of the simplest arguments for small functions is the DRY (Don't Repeat Yourself) principle. When writing code, it's common to observe duplication and want to remove it. Unfortunately, this tendency to separate out functionality into reusable pieces can also lead to the premature application of DRY. This can manifest as splitting code out into single-use functions on the assumption that said functions will surely be useful elsewhere. Frequently, these other use cases never materialize, and the lonely function is left to hinder future programmers. Even worse, sometimes the original use case did not anticipate newly desired functionality, resulting in an impedance mismatch and an API that feels wrong. The unlucky programmer who sees it will be led down the path of least resistance to use a function whose author didn't know (and likely couldn't have known at the time) about their desired use case.

To properly separate functionality out into its own function for reuse, one should have a solid concept of what it'll be used for and be sure that this implementation is the one that should be used in other places. As any programmer knows, this is a much harder task than it sounds. The easiest way to succeed at this is to abandon one's hubris and not write a separate function until the other cases have emerged (preferably with corresponding code). Not writing a single-use function in anticipation that it'll soon be a multiple-use function is a fantastic way to avoid unnecessary complexity and prevent locking in a suboptimal API. This philosophy is generally aligned with the YAGNI (You Aren't Gonna Need It) principle.
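
One way this might play out is sketched below. All names are hypothetical, and the money formatting is just an arbitrary stand-in for "some logic that later turns out to be shared".

```python
# Hypothetical example: all names are invented for illustration.

# Step 1: only one use exists, so the formatting logic stays inline.
def invoiceLine(item, cents):
    return f"{item}: ${cents // 100}.{cents % 100:02d}"


# Step 2: a second, real use has appeared (refund notices). Only now do we
# extract a helper, and its signature is shaped by both known callers.
def formatCents(cents):
    return f"${cents // 100}.{cents % 100:02d}"


def invoiceLineShared(item, cents):
    return f"{item}: {formatCents(cents)}"


def refundNotice(cents):
    return f"Refunded {formatCents(cents)}"
```

Had the helper been extracted at step 1, it might plausibly have taken the item name as well, which the refund notice has no use for. Waiting for the second caller lets the real requirements pick the API.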

Go forth and inline

It can be difficult to visualize the benefits of avoiding single-use functions, particularly since this advice may go against much of what is taught in school and industry. As such, I'd strongly encourage you to try this approach and see the benefits for yourself — take a principled stand against the existence of single-use functions in your codebase! You may be astounded by how much easier it is to understand and work in your codebase, both for you and for engineers new to the team.