Martin Robillard · Blog

APIs: What Should We Document (and Not Document)?

12 May 2014 by Martin P. Robillard

APIs are not always self-explanatory, so we need to complement them with documentation for better usability. In their book on framework design, Cwalina and Abrams offer the guideline:

Do: provide great documentation with all APIs.

But how do we do that?

The issue is taken up in a few articles. The default reference for Java is "How to Write Doc Comments for the Javadoc Tool". I also found "A Coder's Guide to Writing API Documentation". Both articles have substantial content and provide a lot of specific advice. However, they mostly focus on stylistic guidelines, such as starting the description of a method with a verb. The question of what kinds of topics to cover seems to remain open.

In 2011-2012, Walid Maalej and I undertook a large-scale study of the content of reference documentation for both the Java SDK 6 and the .NET Framework 4.0. This work involved manually assessing over 5,500 pieces of documentation, sampled so as to be representative. Seventeen people participated in this effort. For each piece of documentation (in both Java and .NET), we asked two assessors to independently rate whether it provided evidence of any of 12 types of knowledge, including:

The figure below shows the distributions of these types of knowledge across API elements in the Java SDK 6 and .NET 4.0.

Some of the interesting observations that emerged from a detailed analysis of our data include:

One especially interesting aspect of this project was that we studied the distribution of "non-information" in API reference documentation. We define non-information as a sentence (or sentence fragment) that is just uninformative boilerplate text. If you look at the figure above you will see that a lot of the Java and .NET documentation contains non-information. This is especially true of type members (fields and methods): 43% (Java) and 51% (.NET) of documentation units associated with members contain non-information. Why?

I think the prevalence of non-information is a symptom of the wish to check documentation for completeness. Tools like Checkstyle include customizable rules to flag any public type or type member that does not have associated documentation. Omissions are then easy to spot. However, what is the point of documenting a method setTooltipText with the phrase "sets the tooltip text"?
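To make the setTooltipText case concrete, here is a minimal sketch of a heuristic that flags a comment when it contains no word beyond those already in the element's name. This is not how the study identified non-information (that was done by human assessors), and it is not a Checkstyle rule. The class name NonInformationCheck, the stop-word list, and the crude plural stripping are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical heuristic: a comment that only restates the identifier is non-information. */
final class NonInformationCheck {

    // Illustrative filler words that carry no information on their own (an assumption).
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "of", "this", "to"));

    /** Splits camelCase and punctuation, lowercases, drops filler words, strips a trailing "s". */
    static Set<String> stems(String text) {
        String spaced = text.replaceAll("([a-z0-9])([A-Z])", "$1 $2");
        Set<String> result = new HashSet<>();
        for (String w : spaced.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (w.isEmpty() || STOP_WORDS.contains(w)) {
                continue;
            }
            // Crude stemming so that "sets" in a comment matches "set" in a name.
            result.add(w.length() > 3 && w.endsWith("s") ? w.substring(0, w.length() - 1) : w);
        }
        return result;
    }

    /** True if every meaningful word of the comment already appears in the identifier. */
    static boolean isNonInformation(String identifier, String comment) {
        return stems(identifier).containsAll(stems(comment));
    }
}
```

Under these assumptions, isNonInformation("setTooltipText", "Sets the tooltip text.") is true, whereas a comment that also says what happens when null is passed would not be flagged. An empty comment counts as non-information too, since it adds no words at all.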

For some, the benefits of systematic documentation outweigh the drawbacks of non-information:

The @return tag is required for every method that returns something other than void, even if it is redundant with the method description. (Whenever possible, find something non-redundant (ideally, more specific) to use for the tag comment.)

[Oracle]

However, the idea obviously has its detractors, notably Robert Martin:

It is just plain silly to have a rule that says that every function must have a javadoc [...]. Comments like this just clutter up the code, propagate lies, and lend to general confusion and disorganization.

[Robert C. Martin]

It seems to me we could have our cake and eat it too if there were simply an annotation, such as @SelfDoc, to declare that an element's name is self-documenting and that there is nothing to add (for a method, parameter, return type, etc.).
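As a rough sketch of what such an annotation might look like in Java: the targets, the runtime retention, the Tooltip class, and the SelfDocDemo helper below are all assumptions made for illustration, and no existing tool is known to recognize @SelfDoc. A real documentation checker would more likely read the annotation from source code, since Javadoc comments are not available through reflection.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical marker: the element's name says everything there is to say. */
@Documented
@Retention(RetentionPolicy.RUNTIME) // RUNTIME only so the demo below can use reflection
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.PARAMETER})
@interface SelfDoc {}

/** Illustrative class with one self-documenting method and one unmarked method. */
class Tooltip {
    private String text;

    @SelfDoc
    public void setTooltipText(String text) {
        this.text = text;
    }

    // Neither documented nor marked: a completeness checker could still flag this one.
    public String getTooltipText() {
        return text;
    }
}

final class SelfDocDemo {
    /** Returns true if a method with the given name in the type is marked @SelfDoc. */
    static boolean isDeclaredSelfDocumenting(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName) && m.isAnnotationPresent(SelfDoc.class)) {
                return true;
            }
        }
        return false;
    }
}
```

With this in place, a completeness rule could accept either a doc comment or @SelfDoc on each public member. An omission would then be a deliberate statement rather than an oversight, and there would be no pressure to write "sets the tooltip text".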