58

Suppose we have two branches A and B which have been forked from master.

Both branches A and B make some changes and implement some unit tests. They pass all current and new tests, then are merged back into master. For simplicity, there are no merge conflicts.

Is it guaranteed that the resulting code on master will also pass the unit tests?

The reason I ask is that I often see GitHub run unit tests automatically once a pull request is made. If they pass, the code may be merged into master. However, master could still end up failing its tests if two pull requests break each other. I would have thought a better solution would be:

  1. When a pull request is made, run the unit tests to catch anything egregious.
  2. Have conversations, code reviews etc...
  3. Once the pull request is ready to be merged, do a test merge into master, run the unit tests, if all succeeds, commit the merge.

So you never actually commit broken code into master.
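
In git terms, step 3 might look like the sketch below (the branch name and test command are placeholders; the merge is committed only if the tests pass):

    git checkout master
    git pull                                # make sure master is current
    git merge --no-commit --no-ff feature   # stage the merge without committing it
    ./run-tests.sh && git commit --no-edit  # commit the merge only if tests pass
    # (if the tests fail: git merge --abort)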

kentrid
  • 721

11 Answers

156

No.

The simplest example I've seen: branch A cleans up unused imports in a file. Branch B adds code that actually uses some of those imports. Git merges automatically, since the changed lines are not the same, but the result no longer compiles and the unit tests cannot even run.
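
A minimal C# sketch of the shape of this failure (all names invented for illustration):

    // Shared ancestor: Helpers.cs
    using System.Linq;   // currently unused

    static class Helpers
    {
        public static int[] Numbers() => new[] { 3, 1, 2 };
    }

    // Branch A deletes the unused "using System.Linq;" line.
    // Branch B, on different lines, adds a method that needs it:
    //     public static int[] Sorted() => Numbers().OrderBy(n => n).ToArray();
    // Each branch compiles and passes its tests alone; the merged file keeps
    // B's new method but loses A's using-directive, so it no longer compiles.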

Glorfindel
  • 3,167
114

No. As a counterexample, consider: branch A adds a unit test that uses reflection to check for misspellings in an enum, and branch B adds a misspelling to that enum. Both branches pass, because a misspelling doesn't fail a build: in A the test doesn't fail because everything is spelled correctly, and in B there isn't a test to catch it. There won't be any merge conflicts, because the enum and its unit test live in separate areas. But the test will fail once the merge is complete.
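
A sketch of this scenario in C# with NUnit (the enum, the word list, and the test are all invented for illustration):

    using System;
    using System.Collections.Generic;
    using NUnit.Framework;

    public enum OrderStatus { Pending, Shipped, Delivered }

    public class SpellingTests
    {
        // Branch A adds this test: it reflects over the enum's member names
        // and checks each one against a list of correctly spelled words.
        [Test]
        public void EnumNamesAreSpelledCorrectly()
        {
            var dictionary = new HashSet<string> { "Pending", "Shipped", "Delivered", "Cancelled" };
            foreach (var name in Enum.GetNames(typeof(OrderStatus)))
                Assert.That(dictionary, Does.Contain(name));
        }
    }

    // Branch B independently adds a misspelled member:
    //     public enum OrderStatus { Pending, Shipped, Delivered, Canceld }
    // A passes (nothing is misspelled there); B passes (it has no such test).
    // The merged code contains both the test and the typo, and fails.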

Telastyn
  • 110,259
36

Here is an example which requires neither changes to existing tests, nor reflection, nor a failing build, so as not to give the wrong impression that such cases can only happen under artificial circumstances.

Assume the codebase contains a private function f which is currently not called anywhere (maybe it was in the past, but no one has deleted it). Now, in branch A, a call to f is added inside an existing function g during a refactoring. The unit tests for g show that everything works as intended and that g's behaviour hasn't changed.

At the same time, the devs working on B observed that with some modifications to the behaviour of f they could reuse it too, and since f is not used anywhere else from their perspective, they considered the modifications safe. Maybe they also added some tests covering the new version of f, but that does not really matter. So, as a result, in branch B the behaviour of function f is changed.

To illustrate this with a real example, assume f originally to be a function like this:

    // normalize a given angle in degrees to the interval [0, 360[
    double NormalizeAngle(double angleInDegrees)
    {
        // ....
    }

and in branch B, the behaviour gets changed, making NormalizeAngle deliver values from the interval [-180, 180[.
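
For illustration, the two versions of f might look like this (hypothetical implementations, not from the original answer):

    // Original f, as branch A still sees it: result in [0, 360[.
    double NormalizeAngle(double angleInDegrees)
    {
        double r = angleInDegrees % 360.0;
        return r < 0 ? r + 360.0 : r;
    }

    // Branch B's changed f: result in [-180, 180[.
    double NormalizeAngle(double angleInDegrees)
    {
        double r = (angleInDegrees + 180.0) % 360.0;
        if (r < 0) r += 360.0;
        return r - 180.0;
    }
    // Any test of g that feeds it an angle of, say, 270 now sees g built on
    // -90 instead of 270, so g's tests fail after the merge.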

So before the merge, no tests will fail; after the merge, the tests for function g will fail. Note that if f's signature had been changed in B, both branches A and B would still compile individually (assuming a compiled language environment), but after the merge not even the build would succeed. If the signature does not change, the compiler will not find the issue; only the unit tests will.

Doc Brown
  • 218,378
12

Approaching this from a different angle, there is a simple process to ensure that the tests continue passing after merging both branches: a branch must pass CI after being applied to the current target branch before being merged. So when branch A merges, the goalposts for the other branches move to "the main branch with A applied to it". To expedite this, most CI systems automatically trigger the CI pipeline for all other pending branches when a branch is merged.

Basically, the only way to consistently break this "soft" guarantee would be for the tests to behave differently depending on whether they are run on the source or the target branch.
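
GitHub's merge queue implements this process: CI runs again on the pull request as applied to the current target before the merge is finalized. A minimal GitHub Actions sketch (the test script is a placeholder):

    # Run the same tests on the PR itself and on the PR applied to the
    # current target branch (the merge queue's merge_group event).
    on:
      pull_request:
      merge_group:
    jobs:
      test:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: ./run-tests.sh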

l0b0
  • 11,547
9

If two individual branches pass unit tests, once they're merged, is the result also guaranteed to pass unit tests?

Taking the question at face value, it's very simple to create an example where one branch only tests part of its codebase and has a bug in the untested part. Its unit tests pass, but there is a bug in the code.

Therefore, any test from the second branch that does test this (so far untested) piece of code might pass in its own branch (where it does not have the bug), but not when the bug from the first branch is merged into it.

The reason I ask is that I often see GitHub run unit tests automatically on each pull request. If they pass, then the code is merged into master. Wouldn't it make more sense to unit test the resulting merge immediately before the actual merge is committed?

This, however, is a much more interesting question.

It makes sense to test the end result of a merge as well, but the way you're proposing to do it is not the best way.

First of all, your suggestion depends on the ability to auto-merge. Without auto-merging, your build process couldn't actually merge the two branches before testing the merged result.
When you can't auto-merge, you have to fall back on what is already the suggested course of action: merging master into your feature branch before making your PR.
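
That fallback is plain git (branch names and the test command are placeholders):

    git checkout feature
    git fetch origin
    git merge origin/master    # resolve any conflicts by hand
    ./run-tests.sh             # placeholder for your test command
    git push                   # the PR now contains the merged result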

But let's say you can auto-merge. If you test the merged, combined code, then a test failure can be caused by an issue on either the source branch (feature) or the target branch (master).
This means that your PR no longer reviews the feature branch itself, which makes it very hard to spot issues in specific feature branches.

The simplest example: if a bug does make it onto the master branch, then every PR will fail, except the PR that fixes the bug. While this may look appealing from the POV of someone who wants to keep the code clean, it causes other issues. Every developer will now be troubleshooting their own PR build failure, unaware that a single bug is causing all of the failures.

That's going to be quite inefficient, with multiple developers independently locating the same bug. So let's say you try to counter that by having developers confer before investigating a PR failure, so that not everyone tries to solve the same problem without coordinating.
But now you've created a point of distraction. Every time a developer makes a mistake in their own feature branch, they needlessly require other developers to confirm that they are not experiencing PR build failures themselves. That, too, is a waste of time and effort.

Wouldn't it make more sense to unit test the resulting merge immediately before the actual merge is committed?

This does happen, but it's not the PR that does it. In most CI configurations I've worked in, the master branch goes through the build process whenever a PR gets merged into it. After all, a merge is just a kind of commit, and in a CI/CD environment you should be building your master after every commit made to it.
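
In GitHub Actions terms, that post-merge build might look like the sketch below (the script name is a placeholder):

    # Build and test master on every push to it, including merge commits,
    # so integration breakage surfaces immediately after a merge.
    on:
      push:
        branches: [master]
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: ./build-and-test.sh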

This breaks the process down into steps where the source of a failure is easily identifiable.

  1. The feature branch PR tests the feature itself. If it fails, the feature branch is flawed. If it passes, the feature itself is considered finished.
  2. Post-merge, the master branch gets built. Any merge issues here will be reported as an issue in the master branch, not the feature branch, as it is not an issue with the feature itself, only with the integration of the feature into master.

In this answer I assumed you were working on a master branch instead of a separate dev branch.

The distinction between a master and a dev branch is irrelevant as far as git merging goes; but it does highlight why the existence of a separate dev branch next to master adds value: dev acts as an integration branch which catches any issues with integrating a feature into the main codebase before they make their way into the master branch.

Flater
  • 58,824
3

No.

The solution to your problem is to take advantage of the fact that git is distributed and run your automated tests against the product of the desired merge locally (ideally on a CI runner) before pushing that merge commit to the shared repository (ideally performed by that CI runner).

Why this isn't the default paradigm for this sort of thing is completely beyond my comprehension.

Iron Gremlin
  • 1,115
2

As the other answers stated: no, passing tests on 2 non-conflicting branches is not enough to guarantee there won't be failures after merging them. You have plenty of examples already.

Let me focus on the second part of the question: the proposed flow, how it may fail, and how it might be approached.

  1. When a pull request is made, run the unit tests to catch anything egregious.
  2. Have conversations, code reviews etc...
  3. Once the pull request is ready to be merged, do a test merge into master, run the unit tests, if all succeeds, commit the merge.

This is a nice, sound process, and perhaps a great one for a small project. It really does ensure no failures on master, and it's quite simple. There is one big issue with it: it doesn't scale. It doesn't scale at all.

With the process you drafted, you must serialize the merges, and this gets very costly very fast as the project grows.

For instance, if you have 3 pull requests, you need to test-merge the first one, run all the tests, and then update master. Only then can you start testing the test-merge of the second branch, and only after it's in can you start running the tests for the last branch.

This means that if your test suite takes 3 minutes, you can make at most 1 merge every 3 minutes. That's inconvenient, but feasible. However, if your test suite takes 10 minutes, you are limited to 6 merges per hour at best: 48 merges per 8-hour work day. A team of 20 people working under such a constraint would spend half their time babysitting the pull requests, and a typo fix could end up waiting half a day to be merged.

Worse yet, if your test suite takes several hours and you have many thousands of developers working on a single monorepo, producing tens or hundreds of commits per minute... well, you see the problem. Even running the continuous integration after every merge makes little sense in this case.

What is more scalable?

Focus on continuous integration and quick rollbacks instead of preventing all bad merges from happening. Also track the test failures, so that you can tell with high confidence whether a test failure is caused by the branch or by an already-broken master (a smart enough testing tool will annotate such failures as "already failing", and you may choose to allow merging despite them). You don't even need to run the integration after each merge (that is the simplest starting point, but it doesn't scale to the really huge projects); it may be every few minutes or every hour, depending on how many resources you want to throw at it.

If you detect no failures, everything is fine. If you detect a failure, you can run a binary search over the changes to determine which one caused the specific test to fail. This is relatively cheap, because usually you won't have half the tests failing, just a handful of them.
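
In git terms this binary search is built in as git bisect; a sketch, assuming a test script that exits non-zero on failure and a placeholder for the last known-good commit:

    git bisect start
    git bisect bad HEAD                  # current master fails the test
    git bisect good <last-green-sha>     # last commit known to pass
    git bisect run ./run-failing-test.sh # let git narrow down the culprit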

On top of that, leverage your build system to determine the set of builds and tests that may actually be affected by each change, and limit the required test suite to those. Also, as part of CI, run these selected tests immediately after the merge, in order to detect issues as quickly as feasible (separately from the full test suite running once in a while). The selection doesn't have to be watertight: if you miss a test that is actually affected, the "big" integration run will still catch it, just a while later, but most of the time you'll get the feedback quite fast.

The flow I described is loosely based on what Google does internally; I assume it's similar to what other big companies do as well. It's worth pointing out that no popular VCS supports monorepos as big as theirs, at least not in its vanilla version.

In Google's case, the VCS is Perforce-based, and it has a much stricter rule for conflicts: any other change in the same file is a conflict, no matter how close or far apart the changes are. This eliminates quite a few pitfalls, like the one with the removed imports: the change would have to be updated and rebuilt, and the issue would show up, similarly to the process you proposed. So this is one more counter-measure: tighten up the rules for what can be merged to master. While requiring "only fast-forward changes with passing tests" (i.e. your proposed rule) is infeasible at scale, "only changes that are fast-forward with regard to the affected files" can scale up relatively well.

Frax
  • 1,874
1

Interesting question. I gave it some thought and came up with the following situation, in which 2 branches that are independently correct result in a merge which breaks the build.

Suppose that in the 2 branches a function/procedure with the same name and signature is added to a class, but in different locations, or even in different files. The merge will result in a class with 2 identical functions/procedures and will therefore give a compile error.
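
A C# sketch of the shape of this (names invented; in C# the merged class fails to compile with error CS0111, duplicate member):

    class Rectangle
    {
        public int Width, Height;

        // Branch A adds this near the top of the class:
        public int Area() => Width * Height;

        // ... other members, untouched by either branch ...

        // Branch B adds this near the bottom, far enough away
        // that git sees no conflict:
        public int Area() => Height * Width;
        // After the merge the class has two members with the same
        // signature and no longer compiles (CS0111).
    }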

1

Another problem scenario: the original program performed a safety check both in a private function and in its calling code. Branch A removes the check from the function, and branch B removes it from the calling code. Each branch still performs the safety check once, which would for most purposes be an improvement over checking twice, but merging the changes causes the safety check to be omitted altogether. A test of the safety-check behavior would thus pass on both branches, and the merged code would appear to work fine as long as the checked condition never arises, but the code would no longer be protected against the condition the safety checks were supposed to guard.
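
A C# sketch with invented names:

    class Account { public decimal Balance; }

    // Original code: the guard is performed by both the caller and the callee.
    void Withdraw(Account account, decimal amount)
    {
        if (amount > account.Balance)                        // branch B removes this check
            throw new InvalidOperationException("overdraft");
        ApplyDebit(account, amount);
    }

    private void ApplyDebit(Account account, decimal amount)
    {
        if (amount > account.Balance)                        // branch A removes this check
            throw new InvalidOperationException("overdraft");
        account.Balance -= amount;
    }
    // Each branch still guards exactly once and passes the safety test;
    // the merged code guards zero times.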

supercat
  • 8,629
1

Let's look at how to prevent the problem. As mentioned in the question, CI (typically) reruns after every commit to the default branch, including merges. Some of the other answers already explain how things can break.

Some other answers suggest a dev branch, fast-forwarding the main branch only when CI is stable on dev. But this requires additional manual intervention and can become a hassle on a big project.

And then there is the Bors-ng tool:

Bors is a GitHub bot that prevents merge skew / semantic merge conflicts, so when a developer checks out the main branch, they can expect all of the tests to pass out-of-the-box.

Basically, accepted pull requests are merged with the main branch in a temporary branch. When CI passes, the main branch gets fast-forwarded to the temporary branch. It is quite feature-complete and supports reviewers, batching of builds, queuing (when it gets really busy), and (I believe) most CI providers.
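
Day-to-day use is a review comment on the approved PR; from memory (check the Bors-ng docs for the full command set):

    bors r+     # queue the PR; bors merges it into a staging branch,
                # runs CI there, and fast-forwards main on success
    bors try    # run the same staging build without merging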

Disclaimer: I have no affiliation with this product, just a happy user.

Tim
  • 121
1

Of course there is no guarantee. The examples are legion.

But.

It is not unreasonable to assume that unrelated, isolated changes are unlikely to break anything. Performance improvements in a backend algorithm are unlikely to change the database interface. This is the same assumption that underlies the paradigm of unreserved checkouts and parallel development, of which git is a prime example: hopefully the team communicates well and organizes work packages so that they do not conflict, or, if that is impossible, organizes conflicting work so that the arising problems are predictable and handled proactively. (Then, ideally, we know that a naive merge is broken.)