79

On a recent project, I needed to convert from bytes to kibibytes (originally written as "kilobytes"). The code was straightforward enough:

var kBval = byteVal / 1024;

After writing that, I got the rest of the function working and moved on.

But later on, I started to wonder if I had just embedded a magic number within my code. Part of me says it was fine because the number is a fixed constant and should be readily understood. But another part of me thinks it would have been super clear if wrapped in a defined constant like BYTES_PER_KBYTE.

So are numbers that are well-known constants really all that magical or not?


Related questions:

"When is a number a magic number?" and "Is every number in the code considered a 'magic number'?" are similar, but they are much broader questions than what I'm asking. My question is focused on well-known constant numbers, which are not addressed in those questions.

"Eliminating Magic Numbers: When is it time to say 'No'?" is also related, but it is focused on refactoring as opposed to whether or not a constant number is a magic number.

6 Answers

103

Not all magic numbers are the same.

I think in that instance, that constant is OK. The problem with magic numbers is when they are magic, i.e. it is unclear what their origin is, why the value is what it is, or whether the value is correct or not.

Hiding 1024 behind BYTES_PER_KBYTE also means you don't see instantly if it is correct or not.

I would expect anyone to know immediately why the value is 1024. On the other hand, if you were converting bytes to megabytes, I would define the constant BYTES_PER_MBYTE or similar, because it isn't so obvious that the constant 1,048,576 is 1024^2, or even that it's correct.
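As a minimal sketch of that distinction, assuming the question's snippet is JavaScript (the function names here are illustrative and not from the question), the small factor can stay inline while the larger one gets a name and a visible derivation:

```javascript
// 1024 is left inline: most readers recognize it as bytes per kilobyte.
function bytesToKBytes(byteVal) {
  return byteVal / 1024;
}

// 1,048,576 is harder to verify at a glance, so it gets a name and is
// written as a product that shows where the value comes from.
const BYTES_PER_MBYTE = 1024 * 1024;

function bytesToMBytes(byteVal) {
  return byteVal / BYTES_PER_MBYTE;
}

console.log(bytesToKBytes(2048));    // 2
console.log(bytesToMBytes(3145728)); // 3
```

Writing the constant as 1024 * 1024 rather than 1048576 lets a reader check it without doing the arithmetic.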

The same goes for values that are dictated by requirements or standards, that are only used in one place. I find just putting the constant right in place with a comment to the relevant source to be easier to deal with than defining it elsewhere and having to chase both parts down, e.g.:

// Value must be less than 3.5 volts according to spec blah.
SomeTest = DataSample < 3.50

I find that better than:

SomeTest = DataSample < SOME_THRESHOLD_VALUE

Only when SOME_THRESHOLD_VALUE is used in multiple places does the tradeoff become worth it to define a constant, in my opinion.

whatsisname
44

There are two questions I ask when it comes to magic numbers.

Does the number have a name?

Names are useful because we can read the name and understand the purpose of the number behind it. Naming constants may increase readability if the name is easier to understand than the number it replaces and the constant name is concise.

Clearly, constants such as pi, e, et al. have meaningful names. A value such as 1024 could be BYTES_PER_KB but I would also expect that any developer would know what 1024 means. The intended audience for source code is professional programmers who should have the background to know various powers of two and why they are used.

Is it used in multiple locations?

While names are one strength of constants, another is reusability. If a value is likely to change, it can be changed in one place instead of needing to hunt it down in multiple locations.

Your Question

In the case of your question, I would use the number as-is.

Name: there is a name for that number, but it is nothing really useful. It does not represent a mathematical constant or value specified in any requirements document.

Locations: even if used in multiple locations, it will never change, negating this benefit.

27

This quote

It's not the number that's the problem, it's the magic.

as said by Jörg W Mittag answers this question quite well.

Some numbers simply aren't magical within a particular context. In the example provided in the question, the units of measure were specified by the variable names and the operation that was taking place was quite clear.

So 1024 isn't magical because the context makes it very clear that it's the appropriate, constant value to use for conversions.

Likewise, an example of:

var numDays = numHours / 24; 

is equally clear and not magical because it's well known that there are 24 hours in the day.

17

Other posters have mentioned that the conversion happening is 'obvious', but I disagree. The original question, at this point in time, includes:

kilobytes kibibytes

So already I know the author is or was confused. The Wikipedia page adds to the confusion:

1000 = kB kilobyte (metric)
1024 = KB kilobyte (JEDEC)
1024 = KiB kibibyte (IEC)

So "kilobyte" can be used to mean both a factor of 1000 and a factor of 1024, with the only difference in shorthand being the capitalization of the 'k'. On top of that, 1024 can mean kilobyte (JEDEC) or kibibyte (IEC). Why not shatter all of that confusion outright with a constant with a meaningful name? BTW, this thread has used "BYTES_PER_KBYTE" frequently, and that's no less ambiguous. KBYTE: is it KIBIBYTE or KILOBYTE? I'd prefer to ignore JEDEC and have BYTES_PER_KILOBYTE = 1000 and BYTES_PER_KIBIBYTE = 1024. No more confusion.
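A rough sketch of how the two proposed constants keep the units apart, in JavaScript (the constant names are the ones suggested above; the helper functions and the sample size are hypothetical):

```javascript
// Constant names follow the proposal above; the helpers are illustrative only.
const BYTES_PER_KILOBYTE = 1000; // metric (SI) kilobyte
const BYTES_PER_KIBIBYTE = 1024; // IEC kibibyte

function toKilobytes(bytes) {
  return bytes / BYTES_PER_KILOBYTE;
}

function toKibibytes(bytes) {
  return bytes / BYTES_PER_KIBIBYTE;
}

const fileSize = 512000; // hypothetical size in bytes
console.log(toKilobytes(fileSize)); // 512
console.log(toKibibytes(fileSize)); // 500
```

The same byte count yields two different figures, which is exactly the discrepancy a bare 1000 or 1024 would leave the reader to guess at.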

The reason why people like me, and many others out there, have 'militant' (to quote a commenter in here) opinions on naming magic numbers is all about documenting what you intend to do and removing ambiguity. And you actually picked a unit that has led to a lot of confusion.

If I see:

const int BYTES_PER_KIBIBYTE = 1024;
...
var kibibytes = bytes / BYTES_PER_KIBIBYTE;

Then it's immediately obvious what the author intended to do, and there's no ambiguity. I can check the constant in a matter of seconds (even if it's in another file), so even though it's not 'instant', it's close enough to instant.

In the end, it might be obvious when you're writing it, but it'll be less obvious when you come back to it later, and it may be even less obvious when someone else edits it. It takes 10 seconds to make a constant; it could take half an hour or more to debug an issue with units. The code isn't going to jump out at you and tell you the units are wrong: you're going to have to do the math yourself to figure that out, and you'll likely hunt down 10 different avenues before you check units.

Shaz
11

Defining a name as referring to a numeric value suggests that whenever a different value is needed in one place that uses that name, it will likely be needed in all. It also tends to suggest that changing the numeric value assigned to the name is a legitimate way of changing the value. Such an implication can be useful when it's true, and dangerous when it's false.

The fact that two different places use a particular literal value (e.g. 1024) will weakly suggest that changes which would prompt a programmer to change one are somewhat likely to inspire the programmer to want to change others, but that implication is much weaker than would apply if the programmer assigned a name to such a constant.

A major danger with something like #define BYTES_PER_KBYTE 1024 is that it might suggest to someone who encounters printf("File size is %1.1fkB",size*(1.0/BYTES_PER_KBYTE)); that a safe way to make the code use thousands of bytes would be to change the #define statement. Such a change could be disastrous, however, if e.g. some other unrelated code receives the size of an object in Kbytes and uses that constant when allocating a buffer for it.

It might be reasonable to use #define BYTES_PER_KBYTE_FOR_USAGE_REPORT 1024 and #define BYTES_PER_KBYTE_REPORTED_BY_FNOBULATOR 1024, assigning a different name for every different purpose served by the constant 1024, but that would result in many identifiers getting defined and used exactly once. Further, in many cases, it's easiest to understand what a value means if one sees the code where it's used, and it's easiest to figure out what code means if one sees the values of any constants used therein. If a numeric literal is only used once for a particular purpose, writing the literal at the place where it's used will often yield more understandable code than assigning a label to it in one place and using its value somewhere else.
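The per-purpose naming described above could look roughly like this in JavaScript (the two constant names come from this answer; the functions, and the idea that the second constant sizes a buffer, are assumptions made for illustration):

```javascript
// Two names for the same value, one per purpose, so that changing how sizes
// are displayed cannot silently change how buffers are sized.
const BYTES_PER_KBYTE_FOR_USAGE_REPORT = 1024;       // display only
const BYTES_PER_KBYTE_REPORTED_BY_FNOBULATOR = 1024; // must match the data source

// Hypothetical display helper, mirroring the printf example above.
function formatFileSize(sizeInBytes) {
  const kbytes = sizeInBytes / BYTES_PER_KBYTE_FOR_USAGE_REPORT;
  return "File size is " + kbytes.toFixed(1) + "kB";
}

// Hypothetical sizing helper: converts a reported size in Kbytes to bytes.
function bufferBytesFor(sizeInKBytes) {
  return sizeInKBytes * BYTES_PER_KBYTE_REPORTED_BY_FNOBULATOR;
}

console.log(formatFileSize(1536)); // File size is 1.5kB
console.log(bufferBytesFor(4));    // 4096
```

Switching the report to thousands of bytes would then touch only the first constant, at the cost of two identifiers that are each used once.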

supercat
7

I would lean towards using just the number; however, I think one important issue hasn't been brought up: the same number can mean different things in different contexts, and this can complicate refactoring.

1024 is also the number of KiB per MiB. Suppose we use 1024 to also represent that calculation somewhere, or in multiple places, and now we need to change it to calculate GiB instead. Changing the constant is easier than a global find/replace where you may accidentally change the wrong one in some places, or miss it in others.
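A small sketch of the same value carrying two meanings, in JavaScript (all names here are hypothetical):

```javascript
// Both constants are 1024, but they describe different conversions.
const BYTES_PER_KIB = 1024;
const KIB_PER_MIB = 1024;

function bytesToMiB(bytes) {
  // Each division names the conversion step it performs.
  return bytes / BYTES_PER_KIB / KIB_PER_MIB;
}

console.log(bytesToMiB(5242880)); // 5
```

With bare literals, a search for 1024 would match both steps (and any unrelated bit mask), so a refactoring would have to inspect every hit by hand.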

Or it could even be a bit mask introduced by a lazy programmer that needs to be updated one day.

It's a bit of a contrived example, but in some code bases this can cause issues when refactoring or updating for new requirements. For this particular case, though, I wouldn't consider the plain number to be really bad form, especially if you can enclose the calculation in a method for reuse. I would probably do it myself, but I consider the constant more 'correct'.

If you do use named constants, though, as supercat says, it is important to consider whether context matters too, and whether you need multiple names.

Nick P