General vs. Specific Source Code

“Strangely, I still experience the same conflict”

For someone who likes logical puzzles, computer programming has a lot to offer. This was especially true for me as a teenager during the rise in popularity of personal computers. Learning to write programs in the BASIC programming language was like a dream come true. With a machine in front of me doing whatever it was instructed to do, I envisioned one day being able to build a robot like the one on “Lost in Space,” one of my favorite TV shows. After I had enjoyed hundreds of hours programming an Atari 8-bit computer, computer science was an obvious choice for my college major.

Of course, things don’t always go perfectly. With college programming assignments, I would focus on the current assignment, having the computer do that very specific task. Typically, I completed the assignment, and it worked well. But then the professor would often ask me to make that task repeatable with arbitrary data. I would feel cheated because I had done what was asked the first time. The next assignment would have me undo or redo part of my work. Why not ask for that to begin with? Why waste the students’ time?

At the time, I thought I was right to be annoyed. In retrospect, I tend to feel I was being naive or stubborn, and that I should have made the source code handle arbitrary data to begin with. Maybe I should have learned my lesson and done that extra work up front. But that extra generality can make code harder to understand, especially for a student. Between the extra work it takes to make code more general and the extra effort to understand that more general code, it probably made sense at the time to write my code specifically for the given task. So in a way, I was right.

Strangely, I still experience the same conflict between writing code in a specific way and writing it in a more general way. What do modern experts have to say about this? On the one hand, today’s test-driven development (TDD) practice seems to side with coding specifically to a given task.[1] It says not to write code that isn’t needed at the time. This is sometimes stated with the acronym YAGNI, which stands for “you aren’t gonna need it.”[2] The idea is even captured in a “code smell,” an indication of when to improve the structure of the code through refactoring (improving the design without changing the visible behavior). Martin Fowler calls this particular code smell “speculative generality” and recommends removing extra preparation for code features that are not yet required and may never be added.[3] Based on this, I might conclude that I was right after all.
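
As a small illustration, here is roughly what speculative generality can look like in Java. The classes and the format parameter are hypothetical, invented for this sketch rather than taken from Fowler’s examples:

    // Speculative generality: a hook added "just in case" it is needed.
    class ReportGenerator {
        // Only one format is supported today; the parameter exists
        // solely for an imagined future requirement.
        static String generate(String data, String format) {
            if (!"text".equals(format)) {
                throw new UnsupportedOperationException("someday, maybe");
            }
            return "REPORT: " + data;
        }
    }

    // The YAGNI-friendly version keeps only what the current task requires.
    public class SimpleReportGenerator {
        static String generate(String data) {
            return "REPORT: " + data;
        }

        public static void main(String[] args) {
            System.out.println(generate("quarterly sales"));
        }
    }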

On the other hand, TDD people say to keep each part of the code “well-factored” (simple enough, unassuming?) so that the code is easily adaptable. Wait – make it specific and yet general? At first glance, that sounds contradictory or even noncommittal, like something a politician might say. Or maybe a wise guru on top of a mountain. I admit, for the question of how specific or how general to make the code, the right answer does seem to be “it depends.”

A closer look at TDD shows that it actually makes sense. For some time, it has been generally accepted as best practice that a function should do only one thing. Granted, the interpretation of what is “one thing” varies, but there is still general agreement that a well-written function does one thing. Similarly, a class – the fundamental module in object-oriented programming that binds data with operations on that data – should cover one concept. Again, there is some variation in opinion about what counts as one concept. In this sense, yes, a piece of code should do something very specific.
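
To make that concrete, here is a minimal Java sketch (with invented names) contrasting a function that does two things with functions that each do one:

    import java.util.List;

    public class InvoiceReport {
        // Does two things at once: totals the amounts AND formats the result.
        static String totalAndFormat(List<Double> amounts) {
            double total = 0;
            for (double a : amounts) total += a;
            return String.format("Total: %.2f", total);
        }

        // Refactored so that each function does one thing.
        static double total(List<Double> amounts) {
            double total = 0;
            for (double a : amounts) total += a;
            return total;
        }

        static String format(double total) {
            return String.format("Total: %.2f", total);
        }

        public static void main(String[] args) {
            System.out.println(format(total(List.of(19.99, 5.00, 3.50))));
            // prints "Total: 28.49"
        }
    }

The split versions are each easier to test and to reuse, for example if a later assignment needs the total without the formatting.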

As for using specific data, proper object-oriented design practice says to program to interfaces, not implementations. The interface could be something labeled as such within the programming language, or it could be an abstract class (a relatively general class meant to be reused for defining more specific classes, not to be instantiated directly). For example, suppose a class needs to log data, and a logger object is accepted as a parameter to the class’s constructor. Rather than requiring a specific known logger type that depends on a database, file, or web service, accept a reference to an interface or abstract class, so the choice of implementation can vary as needed. Now the class is not tied to a specific kind of logger and might be more reusable. So, no, in general a class probably should not use specific data or even specific types, if avoidable. Maybe the general recommendation is to program specifically for the class logic but not for specific data or types.
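
A bare-bones Java sketch of that logger example might look like the following. The names (Logger, ConsoleLogger, OrderProcessor) are invented for illustration, not taken from any particular library:

    // The interface is all that client classes need to know about.
    interface Logger {
        void log(String message);
    }

    // One possible implementation; a database- or web-service-backed
    // logger could be swapped in without touching OrderProcessor.
    class ConsoleLogger implements Logger {
        public void log(String message) {
            System.out.println("[LOG] " + message);
        }
    }

    // Depends only on the Logger interface, not on any concrete
    // logger type, so it stays reusable.
    class OrderProcessor {
        private final Logger logger;

        OrderProcessor(Logger logger) { // implementation chosen by the caller
            this.logger = logger;
        }

        void process(String orderId) {
            logger.log("Processing order " + orderId);
            // ... order-handling logic would go here ...
        }
    }

    public class Demo {
        public static void main(String[] args) {
            new OrderProcessor(new ConsoleLogger()).process("A-1001");
        }
    }

Because OrderProcessor only ever sees the Logger interface, the choice of logging implementation belongs to whoever constructs it, which is exactly the flexibility described above.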

With many years of object-oriented software development under my belt, the resolution of this apparent conflict now seems obvious to me. Granted, things were not so simple when I was in college. Object-oriented programming was not as popular, and test-driven development had not been invented. Refactoring was not even a term yet, at least not one commonly applied to source code as it is today. But I suppose I shouldn’t be that surprised. Many things in life seem to call for that “happy medium” between two extremes. Often, it all boils down to what makes sense at the time instead of strictly following rules. This is ironic, considering we are taught to obey rules when relatively young. Whatever makes sense at the time.

References

[1] Martin, Robert C. “The Cycles of TDD.” https://blog.cleancoder.com/uncle-bob/2014/12/17/TheCyclesOfTDD.html

[2] “Test-driven development.” Wikipedia. https://en.wikipedia.org/wiki/Test-driven_development

[3] Fowler, Martin. Refactoring: Improving the Design of Existing Code. Reading, MA: Addison-Wesley, 1999, pp. 83–84. Newer edition described at https://refactoring.com.

(c) Copyright 2020 by Mike Ferrell
