What makes an effective code review?

As developers, we’re all responsible for writing code in one way or another. We write it ourselves or help others write it better. Either way, code gets written and we all move on.

The rest of this short post focuses on code reviews and what (I think) makes them efficient and beneficial.

Code Reviews (CRs)

Code review is systematic examination (often as peer review) of computer source code intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers’ skills. – Wikipedia

It’s not a stretch to realize that there is a strong relationship between the overall quality of software and the skills of the developers building it. Invest in developer education and you should have better software coming out the other end (at the very least you’ll have different and likely more challenging problems – assuming you’re not bottlenecked on process). In my mind, code reviews are an essential part of any development process. Regardless of whether you do them face to face or over the ‘net, the code will always come out better and the reviewers better informed.

What a Code Review isn’t

A code review isn’t an exercise in rubber stamping. It’s an opportunity for both the developer and reviewer to reach consensus while verifying that the proposed implementation or code change under review satisfies both the functional and non-functional requirements of the system. If it isn’t satisfactory, the expectation should be that any required changes will be made and the code re-submitted for review. The reviewer should hold power over what gets committed and what doesn’t.

At the company I work for, code reviews are mandated for every non-trivial commit, regardless of whether you’re a junior or senior developer. In the past our reviews have typically involved two people (developer and reviewer) and been done at the developer’s desk. They’re scheduled ad hoc, with the reviewer generally being someone with in-depth knowledge of the area being modified. More recently, we’ve been trying out Crucible in support of offline CRs.

It goes without saying that any code submitted for review should be atomic. Nobody likes reviewing 50+ significantly changed files on a project you spent the past 3 weeks working on. Break that down into three separate commits and everyone involved will be much happier :).

What’s Worked

A lot of things have worked. We’ve gotten progressively better over the past few years with every project making noticeable improvements to our overall development process.

Obviously the good thing for us is that code is getting reviewed before it gets committed to the repository. We’ve taken to documenting common problems and have developed a quick checklist to help ensure a more thorough review. We’re generally successful in having more senior developers (or domain experts) review junior developers’ code, although it varies a bit depending on the stage of development we’re in.

Alongside the code review checklist, we’ve also developed a series of lessons we’ve learned the hard way. Unfortunately, software development is complex and you’re going to make mistakes along the way. It’s part of the learning process, and the only way to avoid repeating those mistakes is to make sure they’re well documented and kept front and center.

What Could Be Better

Scheduling of reviews has been a hot topic for us in the past. People are busy and it hasn’t always been possible to get the most appropriate person involved in a review. Part of the motivation for looking at a tool like Crucible is to enable offline code reviews at the convenience of the most appropriate reviewer. An offline CR is one in which the developer commits his/her change to a private branch and schedules a web-based review (using Crucible). Once the CR is complete, comments are sent back to the developer who then addresses them and either gets a follow-up in person review or schedules another offline one.

There are certainly benefits and drawbacks to both approaches. An offline CR loses a bit of the human touch and the emphasis on education could be diminished. Something that I’ve been experimenting with is doing an initial face-to-face review but committing to a private branch. The reviewer can then take a more in-depth look (at their leisure) at the code in a private branch before it’s actually merged into a team branch.

It’s important to have a process around code reviews. Something we’ve learned the hard way is that large commits (50+ files) don’t go over very well with reviewers (in-person or otherwise). They take a long time and diminishing returns start after you’ve been looking at code for longer than an hour or two. That being said, some large changes are unavoidable and have to be dealt with, but more often than not with a bit more decomposition you’ll have a handful of atomic commits that collectively address the same problem. It’s not surprising that most of the large change sets I’ve seen have come at the end of a particular feature or project. Unfortunately this usually leads to a series of conversations around whether we can afford to slip a completion date in order to ship code that’s up to par. Nobody likes to have these conversations.

The other thing to note is that it’s important to have documented outcomes and follow-up loops. This is more easily accomplished if you’re using a tool like Crucible, but could be as simple as a page of hand-written notes and a completed checklist.

What Makes a Good CR (my top 5)

  • Code submitted for review should be atomic and in a consistent state (passing tests, etc.)
  • The size of a change set should be consistent (maybe you set a goal of getting a review after X days or Y modified files).
  • Code should be reviewed by someone with a solid understanding of the area being modified.
  • A review should never be rushed. A reviewer should be free to prevent a commit from going through.
  • Code reviews must have documented outcomes that can and will be followed up on.

GluonJ Test Drive : Simple AOP

I admittedly don’t have a lot of experience with the various AOP frameworks that have gained popularity over the past few years.

I happened to be looking at Javassist a couple weeks back and noticed that they were in the process of developing a lightweight AOP framework on top of it called GluonJ. It’s still in fairly active development, with version 1.5b released last November.

I naively set out to solve what I thought was a pretty simple problem: tracking statistics about object serialization. Essentially I wanted to build something similar to JDBCSpy that tracked and exposed basic serialization information over JMX. Initially I’m thinking of basic things like the number of serialized objects and the number of bytes serialized. I spend my days working on a client/server application, and this type of information would be useful if gathered in a transparent and non-intrusive way.
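
To make that concrete, the sort of thing I have in mind is a small MBean exposing a couple of counters. This is just a rough sketch; the interface, class, and method names below are made up for illustration and aren’t part of JDBCSpy or GluonJ.

import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch -- not part of JDBCSpy or GluonJ. Per the standard MBean
// naming convention, the management interface is named after the implementation
// class plus "MBean"; a JMX client such as JConsole would see these two attributes.
public interface SerializationStatsMBean
{
    long getSerializedObjectCount();

    long getSerializedByteCount();
}

// (in a separate source file)
public class SerializationStats implements SerializationStatsMBean
{
    private final AtomicLong objectCount = new AtomicLong();
    private final AtomicLong byteCount = new AtomicLong();

    // Called once per serialized object with the size of its serialized form.
    public void record(long bytes)
    {
        objectCount.incrementAndGet();
        byteCount.addAndGet(bytes);
    }

    public long getSerializedObjectCount()
    {
        return objectCount.get();
    }

    public long getSerializedByteCount()
    {
        return byteCount.get();
    }
}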

Caveat – There might be (and probably are) better ways to accomplish this; I’m not dead set on using the AOP hammer to solve this problem, but it does present a nice excuse to try it out.

Attempt #1

My first attempt was to refine the definition of java.lang.Object and insert advice around writeObject().

This unfortunately didn’t make it very far as it doesn’t look like GluonJ allows you to mess with Object. You can refine any other class (even the final ones) but there’s no love for java.lang.Object.

Attempt #2

My second attempt was a bit more selective. I chose an object to monitor serializations for and went about trying to apply advice to its writeObject().

Unfortunately, I didn’t have much success applying any sort of advice (@Before, @After, @Around) to a private method. In hindsight, it would have been less than ideal anyway, as most Serializable classes don’t implement their own writeObject().
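
For reference, the private hook I was trying to target is the standard writeObject() method from Java serialization. A contrived example of a class that does define it (the Account class here is made up):

import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Made-up example of a Serializable class that defines its own private
// writeObject() hook -- the method I was (unsuccessfully) trying to advise.
public class Account implements Serializable
{
    private static final long serialVersionUID = 1L;

    private String owner;
    private long balance;

    // Invoked reflectively by ObjectOutputStream during serialization.
    private void writeObject(ObjectOutputStream out) throws IOException
    {
        out.defaultWriteObject(); // write the default field state
    }
}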

Attempt #3

After a couple of dead ends I decided to approach the problem from a slightly different angle. Rather than focusing on modifying the behaviour of the object being serialized, why not change the behaviour of the object actually responsible for performing the serialization? In this trivial example, that object is the ObjectOutputStream.

It worked and didn’t require very much code at all.

The Glue Code
This doesn’t really do much other than instruct the runtime that whenever ObjectOutputStream.writeObject() is invoked, we want to do a System.out.println() afterwards. Pretty simple, right?

@Glue
public class SerializationGlue
{
    /**
     * @param obj Object
     * @return Number of bytes in the serialized form of 'obj'
     */
    public static long calculateSize(Object obj)
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try
        {
            ObjectOutputStream out = new ObjectOutputStream(baos);
            out.writeObject(obj);
        }
        catch (Exception e)
        {
            e.printStackTrace();
            return -1;
        }

        return baos.size();
    }

    @After("{ System.out.println(`obj size=` + org.jordens.jdbcspy.testutil.gluonj.SerializationGlue.calculateSize(object)); }")
    Pointcut pc1 = Pcd.define("object", "$1")
                      .define("target", "$0")
                      .call("java.io.ObjectOutputStream#writeObject(Object)");

}

The Test Code

public class Runner
{
    public static void main(String[] args)
    {
        List<Long> values = new ArrayList<Long>()
        {
            public String toString()
            {
                return "List<Long>[" + size() + "]";
            }
        };

        for (long i = 0; i < 100000; i++)
        {
            values.add(i);
        }

        try
        {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(baos);
            out.writeObject(values);
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

}

How do I run this?
java -javaagent:gluonj-1.5b.jar=org.jordens.jdbcspy.testutil.gluonj.SerializationGlue -classpath classes/ org.jordens.jdbcspy.testutil.gluonj.Runner

would print:
obj size=1400181

Nothing too crazy there. One nicety in all of this is the relative ease with which you can enable/disable the runtime code weaving. Just leave off the -javaagent:XXX argument and you’re effectively back to normal.
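
To tie this back to the original goal, the same pointcut pattern could feed a stats collector (like the hypothetical SerializationStats sketched earlier) instead of printing, and those numbers could then be exposed over JMX. A rough, untested sketch:

@Glue
public class SerializationStatsGlue
{
    // Hypothetical shared collector from the earlier sketch.
    public static final SerializationStats STATS = new SerializationStats();

    public static void record(Object obj)
    {
        // Reuses calculateSize() from SerializationGlue above.
        STATS.record(SerializationGlue.calculateSize(obj));
    }

    @After("{ org.jordens.jdbcspy.testutil.gluonj.SerializationStatsGlue.record(object); }")
    Pointcut pc1 = Pcd.define("object", "$1")
                      .call("java.io.ObjectOutputStream#writeObject(Object)");
}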

Conclusions

GluonJ looks pretty cool, but obviously it’s still fairly new and lacks many of the capabilities available in a more mature offering like AspectJ. That being said, the documentation is pretty good and the learning curve was basically non-existent. If you’re just looking for something to play around with, GluonJ is probably the easier of the two to get started with.

The possibilities are certainly encouraging and I’ll definitely be keeping an eye on the progress of GluonJ as well as taking a more hands-on look at AspectJ.

Top 5 Attributes of Highly Effective Programmers

I’ve always found it interesting to read others’ thoughts on what makes us effective. If you haven’t read Ben’s post already, I’d recommend it.

Ben’s List

  • Humility
  • Love of Learning
  • Detail-orientedness
  • Adaptability
  • Passion

Personally, I think it’s a pretty good list. High level and applicable to most situations in life, not just programming.

If I could add a 6th, it would certainly be communication. As was mentioned in one of the comments, nothing can mess a team up faster than poor communication. We’ve all seen it and should be trying to avoid seeing it again.

It’s also worth noting that balance matters. I’ve known plenty of effective people who have been able to maintain a separation between their personal lives and the demands of the job.