Confusing definition with calling
Many comments are placed with the provider of functionality rather than the consumer. For example:
#if 'A' != 0x41
/* Hmmm... This source code isn't being compiled in ASCII.
* In order for data that flows over the network to make
* sense, we need to translate to/from ASCII.
This code says nothing about the use to which it will be put. It provides information about the environment. The comment, however, does not explain how that information is being provided. Rather, it explains the use to which that information will be put. But it’s quite possible that other uses will be found for that information. The comment should live in the client code; possibly in a function called
TranslateToASCII() which checks for
NOT_ASCII with a
#ifdef for example. But putting a comment explaining that we’re translating to ASCII in a function called
TranslateToASCII() seems a little redundant. Bin it.
Looking again at the same comment we can see that we haven’t really removed the need for the first line. That’s easy to deal with. Instead of some wooly name like
NOT_ASCII which could mean a variety of things, let’s be more explicit. Why not call it
CODE_NOT_COMPILED_IN_ASCII. No need for the first line of the comment then. As an aside, we should also reverse the logic and call it
CODE_COMPILED_IN_ASCII. Obviously the client code would need to change to reflect this. Avoid negatives where possible. And there’s no need for a
#define in this day and age. Use a free function like
IsCodeCompiledInASCII() at the least.
Bad third party interfaces
There’s often nothing you can do about third party code so you may sometimes need to do things like this:
/* affects include files on Solaris */
#define warrants a comment. The name of the
#define is cryptic and presumably not amenable to change. In this instance the comment is pretty dire but the fact remains that a useful comment would have been appropriate. The fundamental point though is that the comment shouldn’t be required. If someone had seen fit to use a decent name instead of
BSD_COMP we would have been OK. Long term solution: make sure your public interfaces are expressive and your clients won’t feel the need to turn to the dark side.
It’s Lint’s job
This type of comment is often found in the context of
#include <unistd.h> /* for getpid() */
Presumably it’s so a maintainer can come along and attempt to remove dependencies. If
getpid() is no longer called then the
#include can be removed. Assuming anyone else who decided they wanted to call something from
unistd.h had the foresight to update the comment. Regardless, this is a job for a static analyser. This kind of thing is amenable to automation and you should use the right tools for the job. Lint will pick up unused headers as well as a thousand other things so you should use it.
/* ------------------- DEFINITIONS -------------------------- */
is just pointless fluff. The language has features to delimit structure within your code where it makes sense to do so. Use those rather than pretty ASCII art. This one was found in the middle of a file where declarations and definitions were intermingled so it was wrong anyway.
/* maximum number of requests on a time limited test */
#define MAX_REQUESTS 50000
Many, many comments are there to explain a badly chosen name. Instead of
TIME_LIMITED_TEST_MAX_REQUESTS. The former is unclear. The latter does exactly what it says on the tin. And it does so wherever it’s used rather than forcing those who follow you to look at the symbol’s definition to find the explanatory comment. And no, it doesn’t mean more typing. Either use an IDE that supports native code completion or get a plugin that does.
Use the right primitive
int keepalive; /* non-zero if a keep-alive request */
Make it a
bool and the comment can vanish. Or maybe it’s the keep alive time in seconds. In which case it’s a really bad name accompanied by only half a comment.
Don’t use cryptic encodings
int verbosity = 0; /* no verbosity by default */
In the previous example a
bool seemed a natural choice based on the comment. Here, the comment is incomplete and ambiguous. It’s possible that a
bool would suffice but there may be differing levels of verbosity, from 0 to 3 for example. Use either
enums or alternatively, a
const int for each level, in this case. Again, expressive code removes the need for error prone commenting. You need to express yourself exactly, completely and minimally in this line of work. Code is designed to do that. Human language is not.
Don’t use globals
This a valid, if tired old line. The reasoning here is a little different to the conventional. A symbol at global scope does not have the advantage of a limited context in which to define itself. For example:
int tlimit = 0; /* time limit in secs */
If this was defined at local scope in a function called
WaitForNextCommand() then you could be pretty sure that, if the required command wasn’t received within
tlimit seconds, some kind of error recovery would be called for. The semantics of the symbol are fairly clear. When defined at global scope, that context is gone and the symbol could mean many different things. At local scope, simply changing the name to
timeLimitInSeconds would probably be enough to both remove the need for the comment and to make the code much clearer. At global scope you need to pretty much write an essay to describe it’s possible uses. The existing comment is both insufficient in it’s current context and superfluous in the correct context. Use the structural hierarchy of your code to limit the amount of explanation you need to do.
Don’t explain obvious code
/* simple little function to write an error string and exit */
static void err(char *s)
fprintf(stderr, "%s\n", s);
printf("Total of %ld requests completed\n" , done);
Both the example and the advice are obvious and require no explanation.
Represent the real world
This one’s a little specific but it highlights a general principle. It can be difficult to tease out that principle from the code in question at times but it’s worth bearing in mind. Sometimes we become bogged down in the detail of the code without thinking about how the internal bits relate to the real world.
/* avoid divide by zero */
printf("Requests per second: %.2f (#/sec) (mean)\n", (float) (done / timetaken));
Firstly, this will never produce an exactly correct figure since
timetaken is only an internally represented approximation of the real time taken to do something. The variable
timetaken can never be accurate. Obviously, if
timetaken are both ‘large’ then the result will be pretty close. So, the question is this; if
timetaken is zero, how many requests per second have we processed? The most sensible figure to present to the user is
done requests per second. This includes the case when
done is also zero. So, the code should be:
if (timetaken == 0)
printf("Requests per second: %.2f (#/sec) (mean)\n", (float) (done));
printf("Requests per second: %.2f (#/sec) (mean)\n", (float) (done / timetaken));
You might want to say ‘Approximate requests per second’ and you should change the condition to check for equality since ‘timetaken’ is not boolean. Regardless, producing a result for both zero and non-zero cases results in better code that more accurately reflects the problem domain and gives a more consistent user experience with a value being reported regardless of the mercurial nature of CPU clock cycles. It also removes the need for the comment since the code is much more explicit about what is happening. The comment had a clue in it. The word ‘avoid’ stinks of shoddy workmanship. If zero ‘happened’, deal with it.
Replace code block comment with function call
/* work out connection times */
lots more code here
If you have this kind of comment, it’s ripe for extracting the calculation to a separately called operation;
WorkOutConnectionTimes() in this instance.
/* calculating the sample variance: the sum of the squared deviations, divided by n-1 */
for (i = 0; i < requests; i++)
This is the most often cited ‘legitimate’ use for comments. Balderdash is what I say. The example above should obviously be in a separate function called
CalculateSampleVariance(). And anyone can look up the formula for variance if they don’t know it already. No need for a comment. Ditto for something more complex like the
DitterWibbleMeyerkovskiWizzySort(). Google is your friend here. If this is some complex algorithm that you’ve developed yourself then it will either be amenable to decomposition into smaller callable functions or you’ll have given it a name and written a paper that can be Googled. I’ve never come across anything that was too complex to understand purely by looking at the code and yet didn’t have any easily found external references explaining it.
Is this a ToDo or a rant? Either has no place in the codebase.
* XXX: what is better; this hideous cast of the compradre function; or
* the four warnings during compile ? dirkx just does not know and
* hates both/
/* calculate and output results in HTML */
static void output_html_results(void)
So what does it do? Does it calculate and output or just output? Either get the symbol name right or… No. There is no ‘or’.
Design by contract
DBC specs as commented annotations to a function signature are the closest real use I can think of. Some languages allow you to specify DBC structures explicitly and elegantly but none of the ‘mainstream’ languages do unfortunately. Asserts or exceptions, possibly wrapped as a
Precondition(<condition>) macro or function etc., are probably better but cannot be defined as part of the signature in C++. On balance, I’ll always use explicit checks for preconditions, invariants and postconditions within the code itself where they can be tested. The loss in expressiveness of the function signature doesn’t warrant the duplication and confusion caused by inaccurate comments.
Complex language features
Some code is difficult to understand because the language features being used are complex. For example, many C++ developers are unfamiliar with templates. This often leads to developers explaining what the language feature does in the code. The code is no place for language tutorials. If maintenance programmers do not understand a language feature then either they shouldn’t be touching the code or the language feature is inappropriate for the skill level of the team. Either way ‘tutorial via comments’ is giving just enough knowledge to be dangerous.
Detailed design commenting
Using comments as a form of pseudo code during coding is common and sensible if it works for you. For example:
//for each widget that's in the reject bin
//dock x dollars from operator's pay packet
What’s not acceptable is those comments remaining in place once the code has been written. Any temptation to leave them in can be refuted by applying one of the other comment ‘rules’. The best way to ensure they’re not left in for too long is to remove the // which leaves the code in a state that won’t compile.
The vast majority of comments are there to hide bad process, bad design and bad code; all of which can be avoided. The only comments that should remain in place are those that hide bad language features or bad third party interfaces. Essentially, anything that you have control over doesn’t need commenting.
Comments are a bad code smell, transparent to compilers, never executed, cannot be tested and can only be verified by manual inspection. If you’ve got the time and courage to mess with things that band-aid over bad code, resist all attempts at automation, create manual work, introduce significant risk to a project and provide no benefit, then comment merrily away. I think they’re evil myself.