CITS5501 lab 7 (week 8) – reviews – solutions
It is strongly suggested you complete the recommended readings for weeks 1-7 before attempting this lab/workshop.
Code review is careful, systematic study of source code by people who are not the original author of the code. It’s analogous to having someone proofread a written essay or assignment before you submit it.
It has two main purposes:
Code review is widely practiced in open source projects like Apache and Mozilla, as well as in industry.
Most companies and large projects have coding style standards (for example, the Google Java Style). These can get pretty detailed, even to the point of specifying whitespace (how deep to indent) and where curly braces and parentheses should go. We won’t get into that much detail, but we stress that
If you’re interested in getting more details about how code review works, then there is more information available in the Pressman textbook, as well as a whole StackExchange site dedicated to code review.
For each of the following code samples, take the time to read through the code carefully, bearing in mind the “Code review instructions” at the end of this sheet and recording problems they find (and possibly suggesting concrete improvements).
After you’ve attempted the exercises, it’s recommended you drop in on one of the timetabled labs, to compare your answers with other students (or to discuss them with one of the lab facilitors).
dayOfYear
public static int dayOfYear(int month, int dayOfMonth, int year) {
if (month == 2) {
+= 31;
dayOfMonth } else if (month == 3) {
+= 59;
dayOfMonth } else if (month == 4) {
+= 90;
dayOfMonth } else if (month == 5) {
+= 31 + 28 + 31 + 30;
dayOfMonth } else if (month == 6) {
+= 31 + 28 + 31 + 30 + 31;
dayOfMonth } else if (month == 7) {
+= 31 + 28 + 31 + 30 + 31 + 30;
dayOfMonth } else if (month == 8) {
+= 31 + 28 + 31 + 30 + 31 + 30 + 31;
dayOfMonth } else if (month == 9) {
+= 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31;
dayOfMonth } else if (month == 10) {
+= 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30;
dayOfMonth } else if (month == 11) {
+= 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31;
dayOfMonth } else if (month == 12) {
+= 31 + 28 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31 + 31;
dayOfMonth }
return dayOfMonth;
}
What problems can you see with this code? What improvements would you suggest?
Does it violate any of the guidelines given in the “Code review instructions”?
Some problems with code sample 1:
isLeapYear(int year)
method.dayOfMonth
is re-used to store the result. It’s better for each variable to serve
one clear and distinct purpose.There are many possible ways to improve this code, but here’s one (quick) possibility:
/** Given a date, specified by providing a day of the month (bounds
* depend on the exact month, but should always be between 1 and 31
* inclusive), a month (from 1 to 12), and a 4-digit year,
* return what day of the year that date occurs on.
*/
public static int dayOfYear(int dayOfMonth, int month, int year) {
// For non-leap years: days in Jan-Nov.
int monthLength = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 31 };
int result = dayOfMonth;
// e.g. days in Feb have added to them the no. of days in Jan
// (array pos 0) added to them, only.
for int (i = 0; i < month - 1; i++) {
+= monthLength[i];
result }
// months past Feb are affected by leap years
if (isLeapYear(year) && month > 2) {
+= 1;
result }
return result;
}
It too could still be improved:
The Javadoc should explicitly document each parameter and the return value.
We could add to the Javadoc an explanation of what happens if out-of-bounds values are supplied (e.g. -1 or 0 for the month) – probably, throwing an exception would be most appropriate.
(Note that in this context – improving a piece of code – it’s perfectly fine to alter the specifications/design. But when writing tests for a method, the only thing you can test against is what the documentation says.)
Rather than using magic numbers for months – we could define a Java enum:
enum Month {
= 1,
JAN ,
FEB,
MAR// ... etc.
Then callers (and readers of the code) wouldn’t have to guess at what
month > 2
means (it’d be written
month > FEB
).
leap
Hopefully you spotted one problem with the previous code sample – it gives the wrong answer, when the year is a leap year!
So suppose a colleague writes the following method, intended to determine whether a year is a leap year. Again, look for problems in and potential improvements to this code, and drop in on one of the labs to compare solutions.
public static boolean leap(int y) {
// convert to string so we can access 1st, 2nd digit etc
String tmp = String.valueOf(y);
if (tmp.charAt(2) == '1' || tmp.charAt(2) == '3' || tmp.charAt(2) == 5 ||
.charAt(2) == '7' || tmp.charAt(2) == '9') {
tmpif (tmp.charAt(3)=='2'||tmp.charAt(3)=='6') return true; /*R1*/
else
return false; /*R2*/
}else{
if (tmp.charAt(2) == '0' && tmp.charAt(3) == '0') {
return false; /*R3*/
}
if (tmp.charAt(3)=='0'||tmp.charAt(3)=='4'||tmp.charAt(3)=='8')
return true; /*R4*/
}
return false; /*R5*/
}
Some problems with code sample 2:
Still no Javadoc comment!
Also, there’s no need for a one-letter parameter name: why not
call the y
paramater year
?
The year is converted into a String, simply so we can access particular digits. This is needlessly expensive (at worst), and an obscure way of doing things (at best).
The method could be named better. Method names are usually verb
phrases, like getDate
or isUpperCase
, while
variable and class names are usually noun phrases.
Comments have been added, but they are cryptic – there’s no information as to what “R1”, “R2” etc are. (Though presumably they’re some sort of reference to different bits for the algorithm for determining a leap year, https://en.wikipedia.org/wiki/Leap_year#Algorithm. Perhaps this one: https://docs.microsoft.com/en-us/office/troubleshoot/excel/determine-a-leap-year#how-to-determine-whether-a-year-is-a-leap-year.)
Don’t do this! If trying to implementing a specific, publicly available algorithm or data structure, give a citation so people know exactly what you’re using. e.g. “This implementation of a Red-Black Tree is based on Cormen, Leiserson, Rivest and Stein, Introduction to Algorithms, 3rd edn, chapter 13.”
Code formatting is very inconsistent, making it harder to understand what the method is doing (e.g. why is the “return” statement in line 6 on the same line?)
The method makes the assumption that the year is a 4-digit year
(if it’s only 2-digit, expressions like tmp.charAt(3)
in
line 13 will throw exceptions). This assumption should be clearly
documented (or better yet, not made at all – what if historians or
astronomers want to use this code for proleptic Gregorian years in the
1st century A.D.?)
It uses an extremely obscure algorithm, and gets many years wrong, anyway – try 1600 or 1552.
There are many possible ways to improve this code, but here’s one (quick) possibility, which makes improvements including the following:
/** Returns whether the proleptic Gregorian calendar year
* <code>year</code> is a leap year.
*
* Uses the algorithm described in *Calendrical Calculations* by
* Reingold & Dershowitz, chapter 3.
*/
public static boolean isLeapYear(int year) {
if (year % 4 != 0) {
return false;
} else if (year % 100 != 0) {
return true;
} else if (year % 400 != 0) {
return false;
} else {
return true;
}
}
If following a well-known algorithm (especially one this short), inline comments probably aren’t necessary (but Javadoc documentation still is).
In general, it’s expensive to spend the time of humans doing something a computer could do equally well – like formatting code correctly, or spotting errors that can be checked for mechanically. So before a human reviews your code, you should ensure it’s already been checked for basic errors by a computer.
Programs called code formatters, code beautifiers or pretty-printers apply consistent formatting to code. Some apply a single rigid, built-in style (e.g. “all indents are 4 spaces”) but most can be customised to reproduce a style the team prefers. One commonly-used formatter used in Java projects is Spotless, which can be integrated easily with the Microsoft Visual Studio Code editor, or the IntelliJ IDEA IDE. If you use either of those tools to write your Java code, you might like to experiment with it.
Linters analyse your source code looking for common programming errors, poor style, and programming language constructs which should be used with caution. They are performing a simple form of static code analysis (which we’ll talk more about later). Examples of linters for Java include Checkstyle, Spotbugs, and PMD. In your own time, you might like to try running one of these tools over code you have written, to see what sort of problems they can flag.
This exercise is to get you familiar with code coverage tools and some of their weaknesses.
A repository with Java code is provided at https://github.com/cits5501/week-8-lab-coverage-example
which contains a simple class for detecting whether a Java string is a
palindrome (Palindrome.java
), and tests for this class
using JUnit.
You can use GitPod to work with this code online. You will need a GitHub account; once you have one set up, follow the link below:
https://gitpod.io/new/#https://github.com/cits5501/week-8-lab-coverage-example
and select “New workspace”.
After a minute or two, you should get access to an online version of
the VS Code editor in which the source code for the repository is
available. Take a look at the build.xml
file; this file is
used by the Apache
Ant build automation tool. It defines several “targets” representing
tasks that can be run using the build file:
(Some of these tasks you can also do just using VS Code. For
instance, open PalindromeTest.java
, expand the “outline”
and “java projects” panes, and wait for the background tasks to finish.
A green triangle should appear next to the start of the
PalindromeTest
class, indicating that it can be clicked to
run the JUnit tests it contains.)
Click the “Terminal” pane, and type ant test
to run the
JUnit tests. The messages should indicate that four tests were run with
no failures.
Now run ant report
, and not only will the tests be run,
but the paths they take through the code of PalindromeTest
will be recorded by a tool called JaCoCo, and a report
produced.
You can view this by clicking “Go Live” in the VS Code taskbar (towards the right). This serves up the contents of the repository directory via HTTP, so it can be viewed in the browser. Click “report”, and by clicking through, you should be able to see a coverage report for our tests.
Read the JaCoco documentation on “Coverage Counters”, here, for more information about exactly what is being measured.
Exercises
Comment out one of the assert statements (e.g. the one containing
p.isPalindrome("redivider")
) and replace it with the bare
call to p.isPalindrome()
.
Does the coverage report change at all? Try doing the same for all tests, and check again. What weakness of coverage tools does this suggest?
Comment out all but one of the tests. Again, does the coverage report change?
Applying the ideas from ISP (but without necessarily doing a detailed analysis) – can you spot any test cases that are missing from the test suite?
Solution
The coverage report doesn’t change in any of the above cases.
This suggests we could have a very weak test suite, consisting of
just one test case – say, an empty string (""
) – and with
no assertions, and full coverage would still be reported.
If changes are made to the code, we could easily introduce errors that aren’t picked up by the test suite.
When we look at mutation testing, we will see ways of assessing the quality of our tests which do not depend on measuring coverage as reported by tools like JaCoCo.
As for a missing test case: an empty string (a degenerate or “border” case) would be a good thing to test for, as would single-letter strings, but this is not done.
Code review is a widely-used technique for improving software quality by human inspection. Code review can detect many kinds of problems in code, but in this workshop we looked at code samples which highlighted the following general principles:
These general principles help ensure that code is:
Safe from bugs. In general, code review uses human reviewers to find bugs. DRY code lets you fix a bug in only one place, without fear that it has propagated elsewhere. Commenting your assumptions clearly makes it less likely that another programmer will introduce a bug.
Easy to understand. Code review is really the only way to find obscure or confusing code, because other people are reading it and trying to understand it. Using judicious comments, avoiding magic numbers, keeping one purpose for each variable, using good names, and using whitespace well can all improve the understandability of code.
Ready for change. Code review helps here when it’s done by experienced software developers who can anticipate what might change and suggest ways to guard against it. DRY code is more ready for change, because a change only needs to be made in one place.
This document describes how and why to perform code reviews.
You can’t learn how to write without learning how to read – and programming is no exception. Code reviewing is widely used in software development, both in industry and in open source projects. Some companies like Google even make it a policy that all code must be reviewed before being committed to version control – this means that for every single line of code in Google’s repository, another software developer has read it, given feedback on it, and signed off on it.
Code review can be seen as a type of static quality assurance technique (it doesn’t require the code to be run), and has multiple benefits:
So code reviewing is not only a practically important skill that you will need in the real world, but also a learning opportunity.
In the exercises for this workshop, you can make comments about anything you think is relevant. But some guiding principles are that we want code to be:
So read the code with those principles in mind. What follows are some concrete examples of problems to look for – feel free to add others.
Bugs or potential bugs.
Unclear, messy code.
Misusing (or failing to use) essential design concepts.
public
or protected
)
implementation details that should be kept private
.Positive comments are also a good thing. Don’t be afraid to make comments about things you really like, for example:
The code samples, instructions and exercises in sections A–B and E are adapted from MIT’s course 6.005, “Software Construction”, reading 4, “Code Review”, available from the MIT website, which was collaboratively authored with contributiopn by Saman Amarasinghe, Adam Chlipala, Srini Devadas, Michael Ernst, Max Goldman, John Guttag, Daniel Jackson, Rob Miller, Martin Rinard, and Armando Solar-Lezama. This worksheet (as well as the original MIT content) is licensed under the CC BY-SA 4.0 license.