Ramping Up On Legacy Code

Question

When starting to work on a project with an existing code base, the first thing that needs to be done is usually to understand the application and the existing code. Let's assume that the existing code is legacy code in the sense of Michael Feathers' definition: "code with no tests".

I am sure that there are many different ways to handle this ramp-up phase. The most straightforward way is to go through the UI of the application (if there is one) and simultaneously debug the application to understand what is happening at the code level. This approach is very time-consuming, and it is very easy to forget what you learn in a debugging session. Furthermore, there is no real way to share what you learn during debugging with the rest of the team.

Understanding the downsides of this approach, I tried another approach on my most recent project. I wrote a kind of API layer that sits on top of the existing code base. This API covered pretty much everything a user would do in the UI.

To be more specific, let's assume that the existing application is a typical transactional application with orders, items and shopping carts. My API turned out to be something like this:

// Signatures only; each implementation delegates to the legacy code.
// Parameter types are assumed here for illustration.
public interface OrderAPI {
    Order createOrder(String customerName);
    boolean deleteOrder(long orderID);
    List<Order> getOrdersForCustomer(String customerName);
}

public interface OrderItemAPI {
    OrderItem createOrderItem(Order order);
    boolean deleteOrder(long orderID);
    List<OrderItem> getItemInOrder(Order order);
}

public interface ShoppingCartAPI {
    ShoppingCart createCart(Customer customer, Order order);
    boolean addItemToCart(ShoppingCart cart, OrderItem item);
    boolean removeItemFromCart(ShoppingCart cart, OrderItem item);
}

The methods in the API correspond to the actions that the user would perform at the UI level. Within these methods, the calls to the existing codebase are made.
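
As a rough sketch of what one such API class might look like: the `LegacyOrderManager` class and its method names below are hypothetical stand-ins (the real legacy code is not shown here), and orders are reduced to plain `int` IDs to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderFacadeSketch {

    // Hypothetical stand-in for a legacy class; names and behavior are assumptions.
    static class LegacyOrderManager {
        private final Map<Integer, String> customerByOrderId = new HashMap<>();
        private int nextId = 1;

        int insertOrderRecord(String customerName) {
            int id = nextId++;
            customerByOrderId.put(id, customerName);
            return id;
        }

        boolean purgeOrderRecord(int id) {
            return customerByOrderId.remove(id) != null;
        }

        List<Integer> findOrderIds(String customerName) {
            List<Integer> ids = new ArrayList<>();
            for (Map.Entry<Integer, String> e : customerByOrderId.entrySet()) {
                if (e.getValue().equals(customerName)) {
                    ids.add(e.getKey());
                }
            }
            return ids;
        }
    }

    // The API layer: one method per user-level action, each delegating to legacy calls.
    static class OrderAPI {
        private final LegacyOrderManager legacy;

        OrderAPI(LegacyOrderManager legacy) {
            this.legacy = legacy;
        }

        int createOrder(String customerName) {
            return legacy.insertOrderRecord(customerName);
        }

        boolean deleteOrder(int orderId) {
            return legacy.purgeOrderRecord(orderId);
        }

        List<Integer> getOrdersForCustomer(String customerName) {
            return legacy.findOrderIds(customerName);
        }
    }

    public static void main(String[] args) {
        OrderAPI api = new OrderAPI(new LegacyOrderManager());
        int id = api.createOrder("alice");
        System.out.println(api.getOrdersForCustomer("alice").contains(id)); // prints "true"
        System.out.println(api.deleteOrder(id));                            // prints "true"
        System.out.println(api.getOrdersForCustomer("alice").isEmpty());    // prints "true"
    }
}
```

Each API method stays thin on purpose: whatever is learned about the legacy calls ends up recorded in one readable place.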

Writing this API by itself, of course, doesn't mean much. So I have written tests (I guess they automatically become integration tests) to ensure that the API works well, which in turn shows that I have understood how the legacy code works.

After all this introduction, here is my question: Can you define (possibly in software engineering terms) what I have done? In taking this approach, I relied entirely on my intuition. At some point, I remember being extremely confused: working on the API, then on my tests, then fixing my API, then my tests again. I was no longer sure whether my main objective was to learn the existing code base or to come up with a stable API layer.

I would greatly appreciate any kind of explanation; I am sure that this must be a practice that other people have already experimented with. I just need the right guidance to point me to the appropriate discussions/resources.

Steve Jackson · Accepted Answer · 2011-10-19T19:36:58.227

The API is a Facade/Wrapper, right? Feathers might also call it creating a Seam. I call it "Getting the System under Test".

If possible, you now want to take the UI layer and integrate it so that it works through this new API. At that point you have real integration tests, and some confidence in the system's behavior from the API down. If you leave it as is, I would characterize your tests as Learning Tests, which can still be useful for hunting regressions.
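
A minimal sketch of routing the UI through the API might look like the following. The handler class, the `OrderApi` interface, and the recording test double are all hypothetical names chosen for illustration, not part of the asker's code.

```java
import java.util.ArrayList;
import java.util.List;

public class UiThroughApiSketch {

    // Hypothetical API boundary; in the question this role is played by OrderAPI.
    interface OrderApi {
        int createOrder(String customerName);
    }

    // Hypothetical UI handler: it talks only to the API, never to legacy classes directly.
    static class NewOrderButtonHandler {
        private final OrderApi api;

        NewOrderButtonHandler(OrderApi api) {
            this.api = api;
        }

        String onClick(String customerName) {
            int id = api.createOrder(customerName);
            return "Order " + id + " created for " + customerName;
        }
    }

    // Test double that records calls, so the UI logic can be checked without the legacy code.
    static class RecordingOrderApi implements OrderApi {
        final List<String> createdFor = new ArrayList<>();

        @Override
        public int createOrder(String customerName) {
            createdFor.add(customerName);
            return createdFor.size(); // the n-th call returns id n
        }
    }

    public static void main(String[] args) {
        RecordingOrderApi api = new RecordingOrderApi();
        NewOrderButtonHandler handler = new NewOrderButtonHandler(api);
        System.out.println(handler.onClick("alice")); // prints "Order 1 created for alice"
    }
}
```

Once the UI depends only on the interface, the same tests exercise the path that production users take.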

Learning Tests were featured in Clean Code, but I don't know who invented the term. Here are some links for more info:

score 2 · Answer 2 · answered Oct 19 '11 at 13:06

I like the idea of writing unit tests to familiarize yourself with the code, but I'm not sure about the API layer. Was the legacy code too difficult to write tests against without the wrappers? Do you plan to develop something that uses this API (other than the tests)?

I would not generally advocate writing much code that you did not expect to run in production. The person coming after you may be even more confused by this. If this is just part of your test suite, I'd just make sure that is very clear from the folder/project structure.

score 2 · Answer 3 · answered Oct 19 '11 at 19:56

Characterization tests is the term that Michael Feathers uses for tests that try to understand and characterize what an unknown code base does. It seems that, to get to these tests, you were forced to also create this API, a wrapper or facade (as mentioned by Steve Jackson) that tries to expose a coherent set of functionality at the right level of abstraction without going through the UI. It's hard to test without good separation and coherence.
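
To make the idea concrete, here is a small sketch of a characterization test in plain Java. `LegacyPricing` is a hypothetical stand-in for undocumented legacy code; the point is that the expected values are whatever the code is observed to return, not what a specification says it should return.

```java
public class CharacterizationSketch {

    // Hypothetical legacy routine whose rules nobody documented.
    static class LegacyPricing {
        static int totalInCents(int unitPriceCents, int quantity) {
            int total = unitPriceCents * quantity;
            if (quantity >= 10) {
                total -= total / 10; // undocumented bulk discount
            }
            return total;
        }
    }

    // Fails loudly if the observed behavior drifts from what was recorded.
    static void assertPinned(String label, int recorded, int actual) {
        if (recorded != actual) {
            throw new AssertionError(label + ": recorded " + recorded + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // In practice the recorded values come from running the code once
        // and copying the observed output into the test.
        assertPinned("single item", 250, LegacyPricing.totalInCents(250, 1));
        assertPinned("nine items", 2250, LegacyPricing.totalInCents(250, 9));
        assertPinned("ten items", 2250, LegacyPricing.totalInCents(250, 10));
        System.out.println("all characterizations hold");
    }
}
```

Note that the test pins a surprising fact (nine and ten items cost the same) without judging whether it is a bug; it only guarantees that later changes cannot alter it unnoticed.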

So to me, this API seems like necessary scaffolding for writing some characterization tests. It might be interesting to consider evolving it and incorporating it into the application itself, so that you don't have competing views of the same system, which might backfire and turn out to be a source of duplication (and maintenance headaches).

Other related terms that might be of interest include:

To build test data:

Big refactorings:

Strangler application

Design patterns:

Facade

Principles (on levels of abstraction):
