
I work somewhere where programming is an important part of the job, but where code review is something nobody has ever heard of.

Being kind of enthusiastic about programming, I've seen a lot of questions about code review, here and mostly on Stack Overflow, but I have never experienced it myself in a professional context.

For a little more context, I work in epidemiology research. A team of data managers sets up databases (Oracle SQL) from the raw data. Then each researcher independently writes code (in R or SAS) to query the database and perform their analysis. Code written by one researcher is usually not used by another, though. A study is judged on its results, so as long as those look plausible, small errors can sneak through.

There is no code review to catch errors, neither in the data-management team nor among the researchers. I think it could be very beneficial for both teams, and I'd like to convince my boss to consider it.

Unfortunately, googling "manual code review guidelines" only turns up advice on improving an existing review process, not on setting one up from scratch.

How is code review usually introduced to teams that have never practiced it? Where can I find resources to show my boss the actual benefits, and a methodology for setting up code review?

1 Answer

The place to start is to identify the problem your colleagues actually care about, and to propose a way of correcting it that won't completely derail their productivity.

Problem Statement

We want to ensure our analyses are correct, and to increase our customers' confidence in those results.

Start with the most important concerns. If there is legal risk involved, expand that statement to cover it as succinctly as possible.

Options

I've found it best to present options, since blank whiteboards tend to get people arguing about very different things as if they were equivalent. So, here are some options:

  • Peer Review: The typical term for reviewing a finished analysis product, which includes the algorithm, the inputs, and the reported results. It ensures the final result is correct to the best of our knowledge.
  • Code Review: Reviewing the code produced during the analysis. It ensures the algorithm was implemented correctly, and it may uncover assumptions about the data.
  • Data Review: Reviewing the data confirms assumptions or identifies anomalies that violate them.
  • Automated Review: I'm not even sure this is possible in your field. Developers have the advantage of static analysis packages and automated tests to ensure a product is correct and stays correct. I included it here to get you thinking; a rough sketch of what it could look like follows below.

Let the team add more options, or argue that some of them are not required.
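Since your researchers already write R, even a tiny amount of automation can act as a first "automated review" pass before any human looks at the code. This is only a sketch, not a prescription: the packages (lintr for static analysis, testthat for sanity checks) are real, but the file paths and column names below are made up for illustration.

```r
# Minimal "automated review" sketch for an R analysis project.
# lintr and testthat are real packages; the paths and columns are hypothetical.
library(lintr)
library(testthat)

# Static analysis: flags syntax problems, suspicious constructs, and style
# issues in a researcher's script before anyone reads it.
lint("analysis/incidence_model.R")

# A sanity test on the extracted cohort, run before the real analysis.
test_that("cohort extract looks sane", {
  cohort <- read.csv("data/cohort_extract.csv")
  expect_true(all(cohort$age >= 0 & cohort$age <= 120, na.rm = TRUE))
  expect_false(any(duplicated(cohort$patient_id)))
})
```

Even a handful of checks like these gives the later human review something concrete to build on.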

Honestly, all of these may be required. The question is when they get done and who does them. There are specific things you would need to look for, so having a checklist is a good idea. It can grow over time, and it helps jog everyone's memory when review time comes.

When to do it

I would recommend updating your processes a little at a time. Start with the biggest pain point, and attempt to resolve it. Keep trying until you feel like you get it right.

Do what you can, and when something isn't working, be ready to drop it. At least you will have the data points to justify why it wasn't working.

The general concept is that there is a process, and the appropriate reviews need to be slotted into the right places in that process. I recommend starting with very little structure and adding just enough to solve the real problems your team faces. Everything needs to tie back to your problem statement.

Creating process is painful. It's best if you can guide the effort organically rather than coming up with some big, overblown process that takes a huge team just to feed it.