42

For my job, I work on multiple different scientific software projects, as well as general administrative tasks that go hand-in-hand with any 'office' job. Thus, any given working week could involve progressing none, or all, of those software projects.

Put simply, my problem is that I waste a lot of time (sometimes days) picking up from where I left off when I pick up one of those software projects after a large break from it. Usually this is because I have to re-learn:

  • How the particular function/class I'm looking at fits into the overall project
  • The general flow/algorithm with which the software works for our use-case
  • Which methods/classes etc. I have already written (i.e. to ensure DRY adherence in further work)

What are the most efficient and best ways in software development to avoid this 'dead'/'lag' time when you pick up development of a project again after a prolonged break from working on it?

PS - I am a scientist by trade, not a software developer so I apologise for the fundamental/basic nature of this question. Also, many of the projects are just developed by myself alone.

10 Answers

55

Let me speed you up by slowing you down.

Ever heard who the best person is to hire as a tutor? It’s not the teacher. It’s some student who just took the class. Who remembers the struggle.

Therefore I want you to refactor just before doing new work. Because now is when you really understand how confusing the code is. And how hard it is to add the new code for your new feature. Right now you see the problems better than ever.

Later, when you’ve added that new feature, you’ll be dumb again. You'll know all too well how it works. So you’ll have no idea how readable it is.

When working alone this is the best you can do.

If you want to do better, find some way to get someone else to look at your code. Do it soon after you write it, when you’re still willing to change it. They can see what it really needs when you can't, and will keep you from doing silly things just because you can.

Oh sure, tests, good names, whitespace, all that good stuff, are all still important. The sooner you do that stuff the better. But writing code changes your brain until you're not the code's intended audience anymore. It's intended for those that don't already know how it works. Test its readability against that.

Writers might recognize this pattern. When they proofread their own work sometimes it needs to spend some time in a drawer before they read it again. This is so they can see their typos for what they are, rather than only see what they meant to say.

That means those moments when you're kicking yourself for not making it easier the last time you touched it are just not going away. Your brain was code damaged back then. Now it's healed. Be quick and use that healed brain to fix the problem you see now, before it gets all damaged again.

candied_orange
30

Work as if you are on a team

Think of your future self as another team member, with your same skillset but no knowledge of your code. Follow all the processes you would normally follow for another team member.

  • Follow task and code management procedures, e.g. keep a repo, create feature/bugfix branches, and use brief, meaningful commit messages
  • Use the simplest design possible, and common design patterns when applicable.
  • Write code that is idiomatic for the language.
  • Stick to core libraries if at all possible, and use well-known, stable libraries when it's not.
  • Maintain some kind of documentation, but keep it brief (your future you won't want to have to wade through too much stuff). I find a Wiki works fairly well.
  • Write automated unit tests. These are a form of documentation. They will also tell your future self when you've mucked something up.
  • Follow the principle of least astonishment
  • For any sort of libraries or long-term dependencies (i.e. code that other modules depend on that will need maintenance), write SOLID code. It's worth the effort.
  • Don't be afraid to write Really Obvious Code, even if it seems like more typing. In general, code is read many more times than it is written.
  • Validate your input parameters, and use meaningful exception messages. Even if your inputs are always valid, the validation code helps express how the function is meant to be used to other programmers.
  • Call things by names that will mean something to anyone else without needing to know too much context.

Fun story about the last bullet. I once knew an engineer who named a database field object_state, because he was working on a generic data synchronization feature, and he needed a way to keep track of the state of each object. Made perfect sense in his head, but could mean literally anything to anyone else (all data is object state of some kind). Sometimes you have to think carefully to avoid assuming a mental context that won't be there for anyone else.
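To make the input-validation bullet above more concrete, here is a minimal sketch in C. C has no exceptions, so this version reports problems through a status code plus a readable message instead. All names (`mean_of`, `stats_status`, `stats_error_message`) are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical status codes for a small statistics helper. */
typedef enum {
    STATS_OK = 0,
    STATS_ERR_NULL_INPUT,
    STATS_ERR_EMPTY_INPUT
} stats_status;

/* Turn a status code into a message a future reader can act on. */
const char *stats_error_message(stats_status status)
{
    switch (status) {
    case STATS_OK:              return "ok";
    case STATS_ERR_NULL_INPUT:  return "mean_of: values and out_mean must not be NULL";
    case STATS_ERR_EMPTY_INPUT: return "mean_of: count must be at least 1";
    }
    return "mean_of: unknown status";
}

/* Compute the arithmetic mean of `count` samples.
 * The checks double as documentation of how the function is meant to be called. */
stats_status mean_of(const double *values, size_t count, double *out_mean)
{
    if (values == NULL || out_mean == NULL) return STATS_ERR_NULL_INPUT;
    if (count == 0) return STATS_ERR_EMPTY_INPUT;

    double sum = 0.0;
    for (size_t i = 0; i < count; i++) sum += values[i];
    *out_mean = sum / (double)count;
    return STATS_OK;
}
```

A caller would check the returned status and, on failure, print `stats_error_message(status)`. Even if the inputs are always valid in practice, the two checks tell the next reader what the function expects.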

When you work on a software team, the main point of source code is not so much to tell the computer what to do, but to tell your other team members what you are trying to do. As Martin Fowler puts it:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Your "future you" is another human. There are whole books written on how to help him to understand.

P.S. You will be very glad you did all this if your work takes off and you need to hire another developer.

John Wu
17

It is entirely normal and expected that you have to re-learn a codebase after you have not worked with it for a longer period of time. And the longer that period is, the larger the part that you need to re-learn.

What are the most efficient and best ways in software development to avoid this 'dead'/'lag' time when you pick up development of a project again after a prolonged break from working on it?

There are a number of practices that help reduce this re-learning time, and they can be summarized as "use good engineering practices":

  • Use good descriptive names
  • Use a consistent coding style
  • Make appropriate use of abstraction and encapsulation
  • Write documentation that helps a new person get up to speed (after a prolonged break, you yourself are also a new person to the codebase).
  • ... (I have probably missed a few)
13

Contrary to popular wisdom (and especially contrary to depiction in popular media!), successful programming is not a matter of innate genius or of extreme intelligence. It is predominantly a human factors issue.

All our high-level languages, type systems etc. are a testimony to this often-overlooked fact. None of them expands the power of a computer system above what would already be possible in machine language. Their fundamental purpose is to make it easier to comprehend code bases.

This is good news for you. You don't need special aptitude or rare mental faculties to be a competent programmer (although popular-media depictions of scientists of course presume this anyway :-). What you need to do is to follow well-known principles of good communication and documentation, especially to yourself.

The time when you create part of a system is the ideal time to document what the new part does and how it fits into the rest of the system. Any minute you take to explain in simple language what a part of the system does will be repaid many times when you revisit the system.

(In larger, collaborative projects, it is already repaid when someone else views the same code, which may be as soon as the same day. But working alone has the advantage that you can tailor your communication style exactly to your own needs. If you stick to the principle of "invent it - describe it", it is surprisingly easy to find the correct depth and breadth of description.)

Kilian Foth
11

As a complement to other answers, one of the most important things to document (i.e. write down), and often neglected, is design decisions: why you chose to do things the way you did.

Some decisions are "trivial" and apply overall: naming conventions, coding style, choice of programming language, object-orientation style ("full", "light", "mixed", "none", ...), functional style ("map/fold everywhere", lambdas, nothing), and a lot more...

The ones you need most are the specific ones that are less obvious from just reading the code. Some examples:

  • we copy this data instead of passing a pointer to allow for in-place modification / to ease multithreading
  • compatibility with quirks/bugs-turned-features in old versions of the code
  • this is a hack around such and such OS quirk/bug
  • this variable is left uninitialized on purpose to add entropy to RNG seed
  • IHV-specific Direct3d12/Vulkan code
  • here we use array and sequential lookup instead of (complicated structure+algorithm) for simplicity because (reasons)
  • in general, code written/added/obscured as performance optimization
  • why such version of such dependency/library
  • papers explaining/proving the algorithms you use
  • and so on, and so forth...

Update: For that kind of documentation, write down the rationale for the choice, if at all possible. It is soon forgotten, quite often.
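As a rough illustration of recording such a decision next to the code it affects, here is a small C sketch of the "array and sequential lookup" example from the list. The unit table, its values, and the function name `unit_to_si` are made up for this example. The point is the comment that states both the choice and its rationale.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical unit table, for illustration only. */
typedef struct {
    const char *symbol;
    double to_si;   /* multiply by this factor to convert to SI units */
} unit_entry;

/* DESIGN DECISION: plain array + sequential lookup instead of a hash table.
 * Rationale: the table has a handful of entries and is searched only while
 * parsing input, so a linear scan is fast enough and much easier to re-read.
 * Revisit if the table grows to hundreds of entries. */
static const unit_entry UNIT_TABLE[] = {
    { "m",  1.0    },
    { "cm", 0.01   },
    { "mm", 0.001  },
    { "km", 1000.0 },
};

/* Return the SI conversion factor for `symbol`, or 0.0 if it is unknown. */
double unit_to_si(const char *symbol)
{
    const size_t n = sizeof UNIT_TABLE / sizeof UNIT_TABLE[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(UNIT_TABLE[i].symbol, symbol) == 0) {
            return UNIT_TABLE[i].to_si;
        }
    }
    return 0.0;
}
```

Without the comment, a later reader (including you) may be tempted to "improve" the lookup without knowing it was a deliberate trade-off.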

Pablo H
4

There are lots of good tips here already. Perhaps I can add one more.

I have about two dozen embedded hobby projects with separate code bases. And in addition to the other tips here about documentation, refactoring, creating libraries of common code (which I highly recommend) I make my projects (code bases) as similar and simple as I possibly can. I use similar architecture, similar code structures, similar naming, similar styling, etc. And I refactor when I revisit a project, as others have suggested.

I value consistency so much that I sometimes do things that wouldn't be the best solution in a single project to maintain overall consistency across all my projects.

Others have mentioned it, but it's worth repeating: you can't eliminate the time required to reorient yourself when you need to work on a code base you haven't seen in a while. But you can use these tips to minimize it.

Finally, there's a very good book on this topic called "Code Simplicity". It's 70 pages, you can read it in one sitting, and I think you'll find it extremely helpful. The author has graciously made this book available for free: https://www.codesimplicity.com/book/

3

My advice would be to start by creating a library of reusable components that you can use in all your projects. This is especially true if you find yourself copying code between projects or re-implementing the same functionality. Try to keep this library well structured and make sure there are adequate comments and descriptions. This also applies to small code sections, such as filling an array with values, mapping a value from one range to another, or converting between radians and degrees. Move all of that to your helper library! This can help reduce the code size of the projects and lets you read descriptive function names, rather than trying to figure out what some block of math-heavy code does. It is also a good idea to write automated tests for your library so that all of these functions actually do what they should.
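A helper library along those lines might start out as small as the sketch below. The function names and the `HELPERS_PI` constant are assumptions made for illustration, not part of any existing library.

```c
#include <stddef.h>

/* Pi to double precision; defined here because standard C does not guarantee M_PI. */
#define HELPERS_PI 3.14159265358979323846

/* Convert an angle from degrees to radians. */
double deg_to_rad(double degrees)
{
    return degrees * (HELPERS_PI / 180.0);
}

/* Convert an angle from radians to degrees. */
double rad_to_deg(double radians)
{
    return radians * (180.0 / HELPERS_PI);
}

/* Linearly map `value` from the range [in_min, in_max] onto [out_min, out_max].
 * Precondition: in_min != in_max (otherwise this divides by zero). */
double map_range(double value, double in_min, double in_max,
                 double out_min, double out_max)
{
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min);
}

/* Set every element of `dest[0..count-1]` to `value`. */
void fill_array(double *dest, size_t count, double value)
{
    for (size_t i = 0; i < count; i++) {
        dest[i] = value;
    }
}
```

A call such as `map_range(reading, 0.0, 10.0, 0.0, 100.0)` then reads as what it does, and a few assertions per function are enough to start the automated tests mentioned above.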

Something else that may help is trying to keep to a plan-implement-release cycle. Have some way to track what you are planning to do for each project. Try to do as much implementation as possible in one go, to minimize the number of projects you switch between. And make some notes about what you have done, so you can go back and check what you did last time. This might just involve post-it notes, text files, or a complete issue tracking system.

JonasH
3

Others have made very good points, but it's worth saying that reorientation and "task-switching" time is to be expected after you've put the code down for a while, and moved mentally onto other things.

As guidance, I'd say a few weeks is sometimes enough to forget fine details about code. Six months is enough to look again at code and sometimes recall nothing of ever writing it!

None of the others have begun with this point, of acknowledging that such time is normal. They all start from the presumption you're doing something wrong that can be fixed, rather than acknowledging that even good, well-structured code would require time to recollect later.

One of the signs of good code is that it is not absurd or unnavigable the next time you return to it. It is not that you never forget anything about it.

I assume, being a scientist, that you're not a fool when it comes to managing and communicating information in writing, so you'd already be making appropriate notes and diagrams if these would help your recollection later.

Steve
2

What are the most efficient and best ways in software development to avoid this 'dead'/'lag' time when you pick up development of a project again after a prolonged break from working on it?

A few points to add:

Start documentation early

In my learner days, documentation would follow code development. It was often then, in the "describe the overall code", that I would realize how my documentation was either hard to understand or hard to explain.

Now I write my high level documentation clearly early on and steer my code to meet the explanation rather than the other way around.

A clear top level doc aids in subsequent maintenance for myself and others.

Side effect: I tend to use standards more often.
E.g. to format a date, use yyyy-mm-ddThh:mm:ss. Why 'T'? ISO 8601.
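In C, that format can be produced with the standard `strftime` function. The wrapper name `format_iso8601` below is hypothetical. The sketch writes no time zone suffix, which is an assumption about what the caller wants.

```c
#include <stddef.h>
#include <time.h>

/* Format a broken-down time as ISO 8601 "yyyy-mm-ddThh:mm:ss" (no time zone suffix).
 * Returns the number of characters written (excluding the terminating '\0'),
 * or 0 if `buffer` is too small to hold the result. */
size_t format_iso8601(const struct tm *when, char *buffer, size_t buffer_size)
{
    return strftime(buffer, buffer_size, "%Y-%m-%dT%H:%M:%S", when);
}
```

The result is 19 characters long, so a buffer of at least 20 bytes is needed.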

Corner cases

Comment them. Often code warrants a comment so the next maintainer (or my later self) does not "simplify" and re-introduce the bug. See below.

To facilitate code confidence and re-use, actively consider corner cases. Either handle all values (e.g. widely test) or document limitations.


Sample brittle C corner case failures I've experienced: (Try spotting the holes.)

for (p = 2; p*p <= x; p++) { if (x%p == 0) return not_a_prime; }

double cplx_sqrt_x = sqrt((sqrt(a*a + b*b) + a)/2);

int a; ... if (a < 0) a = -a; // Now use the absolute value.

Better.

// p*p may overflow when x is a large prime.
// Let the compiler optimize nearby x/p and x%p into efficient emitted code.
for (p = 2; p <= x/p; p++) { if (x%p == 0) return not_a_prime; }

// (sqrt(a*a + b*b) + a) may be negative due to finite-precision math.
double sum = (sqrt(a*a + b*b) + a)/2;
double cplx_sqrt_x = sum < 0 ? 0 : sqrt(sum);

// -a is a problem when a == INT_MIN.
int a; ... if (a > 0) a = -a; // Refactored code to use the negative absolute value.
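For reference, two of these fixes can be written as small self-contained functions. This is only a sketch: the names `is_prime` and `negative_abs` are mine, and the handling of values below 2 in `is_prime` is an assumption the fragments above do not spell out.

```c
#include <limits.h>
#include <stdbool.h>

/* Trial-division primality test that cannot overflow:
 * the condition p <= x / p replaces the overflow-prone p * p <= x. */
bool is_prime(int x)
{
    if (x < 2) {
        return false;               /* assumption: 0, 1 and negatives are not prime */
    }
    for (int p = 2; p <= x / p; p++) {
        if (x % p == 0) {
            return false;           /* found a divisor */
        }
    }
    return true;
}

/* Negative absolute value: well defined for every int, including INT_MIN,
 * whereas computing -a overflows when a == INT_MIN. */
int negative_abs(int a)
{
    return (a > 0) ? -a : a;
}
```

The comments carry the corner-case reasoning, so a later "simplification" back to `p * p <= x` or to a plain absolute value would have to be done knowingly.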

chux
0

I face a similar situation at my job: multiple software projects to maintain, plus a handful of data entry staff to manage. There are long pauses between coding days.

The thing that reduces the re-learning time most is consistency across multiple projects, i.e. the same overall coding style.

Examples for Schema:

Table Naming:

  • Standard Tables (tables whose data users cannot change, and that usually fill dropdown lists) are named in a specific way. Use a prefix such as STD_, for example, or whatever you like, but do stick to your convention.

  • Security Tables (user information, logs, history) are named in another specific way. Use a plural, for example, and name all other tables in the singular. Choose any one naming style to distinguish between security and data tables.

Column Naming:

  • Use underscores between names or not. Be consistent.

  • Use highly context-driven names or detailed names. As usual, choose one style and stick to it. For example, the Id column in the Student table can be either "Id" or "StudentId"; the Name column in the Employee table can be either "EmployeeName" or "Name".

  • When a column name must contain more information than is allowed or makes sense within a reasonable length, either show the most useful information or use short forms of names. If you use short forms, stick to the same short form for a given word everywhere (in all projects).

Linking Tables:

  • Either tell the database management system about your relationships or don't.

  • I find it easiest to remember and maintain when only the bottom-most level of tables (the user-facing tables whose data users can change) is linked with multiple tables, not any table above it. So the parent tables of user-facing tables each have only one parent of their own. It looks too restrictive to work, but it works for me.

Examples For Web Server Side Code:

  • Boilerplate code is put separately from Schema Based code which is put separately from Business Rules code.

(Boilerplate code is due to either intricacies of the language or the GUI. Schema Based code is code that must change if the schema changes. Business Rules code is for data validation and data transformation, such as categorization.)

This separation reduces the code that you must see and remember or re-learn when making a code change (adding a feature, removing a bug, increasing performance). You never have to change boilerplate code, and having it separated in some way (you can use regions in Visual Studio, for example) instead of spread across the entire codebase makes things very easy. Also, schema-based code changes only when the schema changes, which is not frequent in transactional databases. When the schema does change, the change to the schema-based code is straightforward.

  • For any data that you must maintain between postbacks, use hidden controls. This way you always know what is data newly entered by the user and what is old data from the database yet to be changed. Of course, do this only when you have to compare old data with new data, for example for data validation.

Consistency between projects automatically leads to refactoring and, after a while, to library making, but that is a (very welcome) effect, not the cause. What you do is sow: maintain consistency. The fruits will grow "on their own", so to speak.

Some software is large enough that multiple programmers work on it, and some software is small enough in scope or complexity, or has so few users, that code changes are not frequent.

If your workload is of the latter type then you can afford to have your own coding style without worrying about matching with other styles (because there is no other style, nobody else is working on the projects). You can optimize for your ease. Just be consistent in your style. Your successors will be thankful.

Drawback

An apparent drawback is that you have to change code irrelevant to the current code change in order to maintain consistency. It's a price you have to pay, and you can turn it into a benefit if you use that opportunity to do refactoring at the same time.

Atif