75

Whenever I need a division, for example in a condition check, I like to refactor the division into a multiplication, for example:

Original version:

if(newValue / oldValue >= SOME_CONSTANT)

New version:

if(newValue >= oldValue * SOME_CONSTANT)

Because I think it can avoid:

  1. Division by zero

  2. Overflow when oldValue is very small

Is that right? Is there any problem with this habit?

ocomfd
  • 5,750

10 Answers

76

Two common cases to consider:

Integer arithmetic

Obviously if you are using integer arithmetic (which truncates) you will get a different result. Here's a small example in C#:

public static void TestIntegerArithmetic()
{
    int newValue = 101;
    int oldValue = 10;
    int SOME_CONSTANT = 10;

    if(newValue / oldValue > SOME_CONSTANT)
    {
        Console.WriteLine("First comparison says it's bigger.");
    }
    else
    {
        Console.WriteLine("First comparison says it's not bigger.");
    }

    if(newValue > oldValue * SOME_CONSTANT)
    {
        Console.WriteLine("Second comparison says it's bigger.");
    }
    else
    {
        Console.WriteLine("Second comparison says it's not bigger.");
    }
}

Output:

First comparison says it's not bigger.
Second comparison says it's bigger.

Floating point arithmetic

Aside from the fact that division by zero behaves differently from multiplication (integer division by zero throws an exception, whereas floating-point division by zero yields infinity or NaN), division can also introduce slightly different rounding errors and lead to a different outcome. Simple example in C#:

public static void TestFloatingPoint()
{
    double newValue = 1;
    double oldValue = 3;
    double SOME_CONSTANT = 0.33333333333333335;

    if(newValue / oldValue >= SOME_CONSTANT)
    {
        Console.WriteLine("First comparison says it's bigger.");
    }
    else
    {
        Console.WriteLine("First comparison says it's not bigger.");
    }

    if(newValue >= oldValue * SOME_CONSTANT)
    {
        Console.WriteLine("Second comparison says it's bigger.");
    }
    else
    {
        Console.WriteLine("Second comparison says it's not bigger.");
    }
}

Output:

First comparison says it's not bigger.
Second comparison says it's bigger.

In case you don't believe me, here is a Fiddle which you can execute and see for yourself.

Other languages may behave differently; bear in mind, however, that C#, like many languages, implements the IEEE 754 floating point standard, so you should get the same results in other standards-compliant runtimes.

Conclusion

If you are working greenfield, you are probably OK.

If you are working on legacy code, and the application is a financial or other sensitive application that performs arithmetic and is required to provide consistent results, be very cautious when changing around operations. If you must, be sure that you have unit tests that will detect any subtle changes in the arithmetic.
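For instance, a boundary-value check along these lines (the method name and the sample values are mine, purely for illustration) will flag a refactoring that silently changes the result at a threshold:

public static void TestThresholdBoundaries()
{
    double oldValue = 3;
    double SOME_CONSTANT = 0.33333333333333335;

    // Sample values straddling the threshold oldValue * SOME_CONSTANT (about 1.0).
    double[] newValues = { 0.999999999999999, 1.0, 1.000000000000001 };

    foreach (double newValue in newValues)
    {
        bool original   = newValue / oldValue >= SOME_CONSTANT;
        bool refactored = newValue >= oldValue * SOME_CONSTANT;

        // A real test would assert equality; printing the mismatch is enough here.
        if (original != refactored)
        {
            Console.WriteLine($"Forms disagree at newValue = {newValue:R}");
        }
    }
}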

If you are just doing things like counting elements in an array or other general computational functions, you will probably be OK. I am not sure the multiplication method makes your code any clearer, though.

If you are implementing an algorithm to a specification, I would not change anything at all, not just because of the problem of rounding errors, but so that developers can review the code and map each expression back to the specification to ensure there are no implementation flaws.

John Wu
  • 26,955
25

I like your question as it potentially covers many ideas. On the whole, I suspect the answer is it depends, probably on the types involved and the possible range of values in your specific case.

My initial instinct is to reflect on the style, i.e. your new version is less clear to the reader of your code. I imagine I would have to think for a second or two (or perhaps longer) to determine the intention of your new version, whereas your old version is immediately clear. Readability is an important attribute of code, so there is a cost in your new version.

You are right that the new version avoids a division by zero. Certainly you do not need to add a guard (along the lines of if (oldValue != 0)). But does this make sense? Your old version reflects a ratio between two numbers. If the divisor is zero, then your ratio is undefined. This may be more meaningful in your situation, i.e. you should not produce a result in this case.

Protection against overflow is debatable. If you know that newValue is always larger than oldValue, then perhaps you could make that argument. However there may be cases where (oldValue * SOME_CONSTANT) will also overflow. So I don't see much gain here.
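To make that concrete, here is a small sketch (the values are mine, chosen for illustration) in which the multiplication wraps around under C#'s default unchecked integer arithmetic and the two forms disagree:

public static void TestMultiplicationOverflow()
{
    int newValue = 2000000000;
    int oldValue = 1000000000;
    int SOME_CONSTANT = 10;

    // newValue / oldValue == 2, so the original check is false.
    Console.WriteLine(newValue / oldValue >= SOME_CONSTANT);   // False

    // oldValue * SOME_CONSTANT overflows int and wraps to 1410065408,
    // so the refactored check silently becomes true.
    Console.WriteLine(newValue >= oldValue * SOME_CONSTANT);   // True
}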

There might be an argument that you get better performance because multiplication can be faster than division (on some processors). However, there would have to be many calculations such as these for this to be a significant gain, i.e. beware of premature optimisation.

Reflecting on all of the above, in general I don't think there's much to be gained with your new version as compared to the old version, particularly given the reduction in clarity. However, there may be specific cases where there is some benefit.

dave
  • 351
  • 2
  • 5
24

No.

I'd probably call that premature optimization, in a broad sense, regardless of whether you're optimizing for performance, as the phrase generally refers to, or anything else that can be optimized, such as edge-count, lines of code, or even more broadly, things like "design."

Implementing that sort of optimization as a standard operating procedure puts the semantics of your code at risk and potentially hides the edges. The edge cases you see fit to silently eliminate may need to be explicitly addressed anyway. And it is infinitely easier to debug problems around noisy edges (those that throw exceptions) than around those that fail silently.

And, in some cases, it's even advantageous to "de-optimize" for the sake of readability, clarity, or explicitness. In most cases, your users won't notice that you've saved a few lines of code or CPU cycles to avoid edge-case handling or exception handling. Awkward or silently failing code, on the other hand, will affect people -- your coworkers at the very least. (And also, therefore, the cost to build and maintain the software.)

Default to whatever is more "natural" and readable with respect to the application's domain and the specific problem. Keep it simple, explicit, and idiomatic. Optimize as is necessary for significant gains or to achieve a legitimate usability threshold.

Also note: Compilers often optimize division for you anyway -- when it's safe to do so.

svidgen
  • 15,252
13

Use whichever one is less buggy and makes more logical sense.

Division by a variable is usually a bad idea anyway, since the divisor can often be zero.
Division by a constant usually just depends on what the logical meaning is.

Here are some examples to show that it depends on the situation:

Division good:

if ((ptr2 - ptr1) >= n / 3)  // good: check if length of subarray is at least n/3
    ...

Multiplication bad:

if ((ptr2 - ptr1) * 3 >= n)  // bad: confusing!! what is the intention of this code?
    ...

Multiplication good:

if (j - i >= 2 * min_length)  // good: obviously checking for a minimum length
    ...

Division bad:

if ((j - i) / 2 >= min_length)  // bad: confusing!! what is the intention of this code?
    ...

Multiplication good:

if (new_length >= old_length * 1.5)  // good: is the new size at least 50% bigger?
    ...

Division bad:

if (new_length / old_length >= 2)  // bad: BUGGY!! will fail if old_length = 0!
    ...
user541686
  • 8,178
3

Doing anything “whenever possible” is very rarely a good idea.

Your number one priority should be correctness, followed by readability and maintainability. Blindly replacing division with multiplication whenever possible will often fail in the correctness department, sometimes only in rare and therefore hard to find cases.

Do what’s correct and most readable. If you have solid evidence that writing code in the most readable way causes a performance problem, then you can consider changing it. Care, maths and code reviews are your friends.

gnasher729
  • 49,096
1

Regarding the readability of the code, I think the multiplication is actually more readable in some cases. For example, if you must check whether newValue has increased by 5 percent or more over oldValue, then 1.05 * oldValue is the threshold against which to test newValue, and it is natural to write

    if (newValue >= 1.05 * oldValue)

But beware of negative numbers when you refactor things this way (either replacing the division with a multiplication, or replacing a multiplication with a division). The two conditions you considered are equivalent as long as oldValue is guaranteed to be positive; but suppose newValue is actually -13.5 and oldValue is -10.1. Then

newValue/oldValue >= 1.05

evaluates to true, but

newValue >= 1.05 * oldValue

evaluates to false.
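A minimal sketch of that pitfall in C# (using the values above):

public static void TestNegativeValues()
{
    double newValue = -13.5;
    double oldValue = -10.1;

    // Dividing by a negative oldValue flips the direction of the inequality,
    // so the two forms give opposite answers here.
    Console.WriteLine(newValue / oldValue >= 1.05);   // True:  1.3366... >= 1.05
    Console.WriteLine(newValue >= 1.05 * oldValue);   // False: -13.5 >= -10.605 fails
}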

David K
  • 485
1

Note the famous paper Division by Invariant Integers using Multiplication.

When the divisor is invariant, the compiler actually emits a multiplication, not a division. This happens even for divisors that are not powers of 2; power-of-2 divisions obviously become bit shifts and are therefore even faster.
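To give a flavour of the transformation, here is a sketch of the reciprocal trick for unsigned division by 3; the magic constant is the textbook value, not the output of any particular compiler:

public static void TestDivisionByInvariantInteger()
{
    // Multiply by ceil(2^33 / 3) and take the top bits; this equals n / 3
    // for every 32-bit unsigned n.
    const ulong magic = 0xAAAAAAABUL;   // 2863311531 == ceil(2^33 / 3)

    for (uint n = 0; n < 1000; n++)
    {
        uint viaMultiply = (uint)((n * magic) >> 33);
        if (viaMultiply != n / 3)
        {
            Console.WriteLine($"Mismatch at n = {n}");
        }
    }
}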

However, for non-invariant integers, it is your responsibility to optimize the code. Be sure before optimizing that you're really optimizing a genuine bottleneck, and that correctness is not sacrificed. Beware of integer overflow.

I care about micro-optimization, so I would probably take a look at the optimization possibilities.

Think also about the architectures your code runs on. ARM in particular can have extremely slow division: older ARM cores have no hardware divide instruction at all, so a division compiles to a library function call.

Also, on 32-bit architectures, 64-bit division is not optimized, as I found out.

juhist
  • 2,579
  • 12
  • 14
1

Picking up on your point 2, it will indeed prevent overflow for a very small oldValue. However if SOME_CONSTANT is also very small then your alternative method will end up with underflow, where the value cannot be accurately represented.

And conversely, what happens if oldValue is very large? You have the same problems, just the opposite way round.

If you want to avoid (or minimise) the risk of overflow/underflow, the best way is to check whether newValue is closer in magnitude to oldValue or to SOME_CONSTANT. You can then choose the appropriate divide operation, either

    if(newValue / oldValue >= SOME_CONSTANT)

or

    if(newValue / SOME_CONSTANT >= oldValue)

and the result will be most accurate.
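A rough sketch of that selection, assuming all three quantities are positive (a negative divisor would flip the inequality, as another answer notes); the helper name and the zero checks are mine:

public static bool ExceedsThreshold(double newValue, double oldValue, double someConstant)
{
    // Divide newValue by whichever operand is closer to it in magnitude,
    // so the quotient stays near 1 and overflow/underflow risk is reduced.
    double distToOld = Math.Abs(Math.Abs(newValue) - Math.Abs(oldValue));
    double distToConst = Math.Abs(Math.Abs(newValue) - Math.Abs(someConstant));

    if (distToOld <= distToConst)
    {
        if (oldValue == 0) throw new ArgumentException("oldValue must be non-zero", nameof(oldValue));
        return newValue / oldValue >= someConstant;
    }

    if (someConstant == 0) throw new ArgumentException("someConstant must be non-zero", nameof(someConstant));
    return newValue / someConstant >= oldValue;
}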

For divide-by-zero, in my experience this is almost never appropriate to be "solved" in the maths. If you have a divide-by-zero in your continuous checks, then almost certainly you have a situation which requires some analysis and any calculations based on this data are meaningless. An explicit divide-by-zero check is almost always the appropriate move. (Note that I do say "almost" in here, because I don't claim to be infallible. I'll just note that I don't remember seeing a good reason for this in 20 years of writing embedded software, and move on.)

However if you have a real risk of overflow/underflow in your application then this probably isn't the right solution. More likely, you should generally check the numerical stability of your algorithm, or perhaps simply move to a higher precision representation.

And if you don't have a proven risk of overflow/underflow, then you're worrying about nothing. That does mean you literally need to prove you need it, with numbers, in comments next to the code which explain to a maintainer why it's necessary. As a principal engineer reviewing other people's code, if I ran into someone taking extra effort over this, I personally would not accept anything less. This is kind of the opposite of premature optimisation, but it would generally have the same root cause - obsession with detail which makes no functional difference.

Graham
  • 2,062
0

Encapsulate the conditional arithmetic in meaningful methods and properties. Not only will good naming tell you what "A/B" means, but parameter checking and error handling can neatly hide in there too.

Importantly, as these methods are composed into more complex logic, the extrinsic complexity stays very manageable.
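For example, a sketch of the kind of wrapper this suggests (the name and the error policy are mine, not a prescription):

public static class GrowthCheck
{
    // The name states the intent, and the divide-by-zero policy lives in
    // exactly one place instead of at every call site.
    public static bool HasGrownByFactor(double newValue, double oldValue, double factor)
    {
        if (oldValue == 0)
        {
            throw new ArgumentException("Cannot compute growth from a zero baseline.", nameof(oldValue));
        }
        return newValue / oldValue >= factor;
    }
}

A call site then reads if (GrowthCheck.HasGrownByFactor(newValue, oldValue, SOME_CONSTANT)), which keeps the intent visible regardless of how the arithmetic is written inside.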

I'd say multiplication substitution seems a reasonable solution because the problem is ill-defined.

radarbob
  • 5,833
  • 20
  • 33
0

I think it would not be a good idea to replace multiplications with divisions, because the CPU's ALU (Arithmetic Logic Unit) executes algorithms, even though they are implemented in hardware. More sophisticated techniques are available in newer processors, and in general processors strive to operate on pairs of bits in parallel in order to minimize the clock cycles required. Multiplication algorithms can be parallelized quite effectively (though more transistors are required). Division algorithms cannot be parallelized as efficiently; the most efficient division algorithms are quite complex, and they generally require more clock cycles per bit.