102

I was naughty... Too much "cowboy coding," not enough committing. Now, here I am with an enormous commit. Yes, I should have been committing all along, but it's too late now.

What is better?

  1. Do one very large commit listing all the things I changed
  2. Try to break it into smaller commits that likely won't compile, as files have multiple fixes, changes, additional method names, etc.
  3. Try to do partial reversions of files just for appropriate commits, then put the new changes back.

Note: as of right now I am the only programmer working on this project; the only person who will look at any of these commit comments is me, at least until we hire more programmers.

By the way: I am using SVN and Subclipse. I did create a new branch before doing any of these changes.

More information: I asked a separate question related to how I got into this situation in the first place: How to prepare for rewriting an application's glue

durron597

10 Answers

53

To answer, you have to ask yourself how you expect to use the results of these commits in the future. The most common reasons are:

  • To see what release a bug was introduced.
  • To see why a certain line of code is present.
  • To merge into another branch.
  • To be able to check out a previous version for troubleshooting an issue a customer or tester is seeing in that version.
  • To be able to include or exclude certain parts from a release.

The first two reasons can be served just as well with one big check-in, assuming you can create a check-in message that applies equally well to each line of changed code. If you're the only programmer, then smaller commits aren't going to make your merge any easier. If you don't plan on doing a release or testing with only part of your unsubmitted changes, then the last two reasons don't apply.

There are other reasons for making small commits, but they are for while you are in the middle of the changes, and that time is past. Those reasons are making it easier to back out a mistake or an experimental change, and making it easier to keep synced up with colleagues without huge scary merges.

From my understanding of your situation as you described it, it seems there's little to no benefit to splitting your commit at this point.

Karl Bielefeldt
41

I think whatever you do, try to avoid checking in code that you know won't compile.

If you think your third option is feasible, that might be a good way to do it, as long as you can ensure that your sequence of commits won't create an uncompilable system. Otherwise, just do the one big commit. It's ugly, but it's simple, quick, and gets it done. Going forward, commit more often.

18

The most important reason to make frequent, small, and meaningful commits is to aid understanding of the history of the code. In particular, it's very difficult to understand how code has changed if it's difficult to generate understandable diffs.

Option 1 obfuscates the history of changes you've made, but otherwise it won't cause any problems.

Option 2 obfuscates the history of changes you've made, possibly somewhat less than option 1, but it could cause other problems for yourself or others if they assume or otherwise conclude that the commits are distinct, e.g. that they can be merged into other branches independently. Unless there's a strong practical reason to prefer it over option 1, it is the worse choice.

Option 3 is best, all else being equal, but if, as you've described elsewhere, doing so would require "extreme" amounts of time or would incur other significant costs, you'll have to weigh those costs against the expected benefits of creating cleaner commits.

Based on the info you've provided, I'd opt for option 1. Maybe you should set up reminders prompting you to commit your changes?

Prototyping and Rewriting

Another consideration to keep in mind, especially in light of your note about being the sole programmer, and my suspicion that you're working on a relatively new codebase, is that it's probably good to develop different habits with respect to committing changes for when you're prototyping new code versus maintaining or extending existing code. There probably isn't a terribly sharp division between the two, but I think it's still a useful distinction.

When you're prototyping new code, commit whenever you want to save your changes, almost certainly in a branch, but perhaps in a separate project. Or maybe even just work outside version control altogether. You can instead focus on gathering evidence about the feasibility of various hypotheses or designs you're considering. I often write small prototypes using different tools, e.g. LINQPad instead of Visual Studio for C# code.

When you've validated a particular hypothesis or design, rewrite it in your main project, ideally in a branch, and make the small, meaningful commits that will best aid the understanding of others (including future you) as to the nature of the changes you're making.

Kenny Evitt
12

Although the only reasonable answer is to never break the trunk, sometimes that is not possible. For example, SVN can fail a commit if you commit too much at once (that may be an option or a feature; I am not sure). In such special cases, just check in in pieces. Since you are the only programmer, it is not going to disturb anyone.

Therefore, I would go for option 1. If not possible, then option 2.

Option 3 requires much effort, and it simply isn't worth it.

BЈовић
7

"Try to break it into smaller commits that likely won't compile, as files have multiple fixes, changes, additional method names, etc."

When I've found myself in a similar situation I used the following technique:

  1. Add only the code that is relevant to a particular feature: git add --patch
  2. Stash all other changes: git stash save --keep-index
  3. Run tests/try compiling
  4. Commit changes if everything is okay, if not go to 1

I'm not familiar with SVN, so I don't know if this is applicable to your specific situation, but the basis should be the same - isolate small parts of code and test them individually.

Milos
3

You're the only programmer; just do a single massive checkin detailing the important bits of what you did.

Are you likely to roll back "parts" of what you did? If not, then absolutely proceed with option 1.

There are a couple of reasons to check code into a version control system. And ALL of them, when you're the only developer, revolve around safety - namely, if you screw up, the machine dies or whatever, you can always get back to that last checkpoint.

A programmer coming into the project later is unlikely to want to roll back to a version that doesn't compile. So, IMHO, option 2 is lunacy.

Option 3 sounds like such a time sink that if I were your boss and saw you wasting hours doing this, I'd have a little talk with you about what your time is worth.

To reiterate: by checking in your code you are covering/saving your butt in case of failure on your machine. Everything else, on a one-man team, is window dressing.

ChrisLively
2

How about option 4: Back up your repo's current state in a temporary place, revert your repo to its original state, make a list of all the changes you did (you can still look at the temporary backup), then manually reimplement (and compile and test!) them as separate commits.

This should be easier, because you've already written the code, it's just a bit of figuring out which parts to copy and paste from your temporary backup.

When you have re-implemented every change cleanly, and thus ensured that commits are self-contained, small, and compile, you can delete the temporary backup, and everything will be almost exactly as (except for the time/date of commits) it would have been if you did it right from the start.
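
As a rough sketch of this option for an SVN working copy: the helper names backup_and_revert and reapply_files, their arguments, and the check command are hypothetical, and copying whole files back is a simplification (for a file that mixes several changes you would paste over only the relevant parts by hand before committing). The only SVN commands it relies on are svn revert -R and svn commit.

```shell
#!/bin/sh
# Hypothetical helpers; names and arguments are assumptions for illustration.

# Copy the whole working copy aside, then discard all local modifications.
# Usage: backup_and_revert WORKING_COPY BACKUP_DIR
backup_and_revert() {
    wc=$1
    backup=$2
    cp -R "$wc" "$backup"
    (cd "$wc" && svn revert -R . > /dev/null)
}

# Copy the named files back from the backup, run a check command (build,
# tests, ...), and commit only those files if the check succeeds.
# Files that are new to the repository would need "svn add" first.
# Usage: reapply_files WORKING_COPY BACKUP_DIR MESSAGE CHECK_COMMAND FILE...
reapply_files() {
    wc=$1
    backup=$2
    message=$3
    check=$4
    shift 4
    for f in "$@"; do
        cp "$backup/$f" "$wc/$f"
    done
    (cd "$wc" && sh -c "$check" && svn commit -q -m "$message" "$@")
}
```

With made-up paths, a session might start with backup_and_revert ./project ../project-backup, followed by one call per logical change, such as reapply_files ./project ../project-backup "Fix parser bug" "make test" src/Parser.java.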

Superbest
1

My rule is: No checkin without a serious code review. If I'm on my own, I'll have to review the code myself, but it will be reviewed. You seem to have an amount of code changes that someone else couldn't review, therefore you can't review it yourself (reviewing your own code is harder and requires more discipline, because you automatically make the wrong assumption that your own code is correct).

Everyone's totally unbreakable rule is: Never check in code that doesn't even build, and seriously avoid checking in code that doesn't work.

Solution: Copy your changed code, go back to the original point. Merge one change at a time into the original code, review it, test it, check it in. With the programming method you described, you are bound to find some serious bugs that way.

It's also a good way to train yourself in good programming habits.

1

I think you are worrying far too much. If you're the only programmer and you don't have a spec sheet to work against then it's entirely up to you what you do. I assume nobody is going to punish you for making a large commit so the only issues you're going to run into are technological, such as not being able to roll back individual file changes within the commit. Since you're solely in control of the repo at this moment that shouldn't be a huge concern either though.

There's a line to be drawn between pragmatism and dogmatism. What you did would not be a good idea in a team and probably shouldn't be repeated going forward, but in your situation I would just submit the large commit.

Roy
-1

The problem is not the long delay between commits but the fact that you keep the code in an uncompilable state for too long. First of all, what is your definition of 'long'? For some people, 5 minutes in a transitional state is too much, but others can polish their code for days without even trying to run the compiler.

In fact, it doesn't matter. What matters is that you lost control of your code; it became unmanageable. This is normal. It just means that you have technical debt and it's time to think about refactoring. So you are frustrated? Your code doesn't compile, and you don't even have unit tests? How can you think about refactoring in this case? No worries. Finish your "cowboy coding" and start cleaning it up. Ideally, try to write some tests. You don't have time for that? OK, start with small improvements.

Some changes don't require tests: renaming a variable to something more suitable, moving repeated code into a separate function, and so on. You will get a better idea of your code. After that you can make bigger changes. Again, try to write tests if possible. Make your code manageable.

After that you will see that the next change doesn't take you "too long" (whatever that means).

AlexT