-1

I just let my brain run wild and the results are again chaotic. It is kind of bothering me how in the very basics, programming itself can be pretty redundant.

Let's look at this snippet of code and analyze what it does

public function __construct($data = array(), $strict = false, $prefix = null) {
    if (!$data) {
        return;
    }

    foreach ($data as $key => $val) {
        if ($prefix) {
            if (($pos = strpos($key, $prefix)) === 0) {
                $key = substr($key, 0, $pos);
            } else {
                continue;
            }
        }

        if ($strict === false || property_exists($this, $key)) {
            $this->$key = $val;
        }
    }
}

When constructing this object we can provide an array to fill in it's properties, however there is some filtering we can do. If we enable the strict parameter it will only fill in keys it already has, and if we supply a prefix parameter it will only fill in properties that correspond to keys starting with a certain prefix.

See, however, that we have n*2 if-statements evaluated in the given loop at the very least. Meaning if we have 100 elements in the $data array there will be at least 200 if-statements evaluated. Wouldn't it be better if we could reduce that?

public function __construct($data = array(), $strict = false, $prefix = null) {
    if (!$data) {
        return;
    }

    if ($strict === false) {
        if ($prefix) {
            foreach ($data as $key => $val) {
                if (($pos = strpos($key, $prefix)) === 0) {
                    $this->{substr($key, 0, $pos)} = $val;
                }
            }
        } else {
            foreach ($data as $key => $val) {
                $this->$key = $val;
            } 
        }
    } else { 
        if ($prefix) {
            foreach ($data as $key => $val) {
                if (property_exists($this, $key) && ($pos = strpos($key, $prefix)) === 0) { 
                    $this->{substr($key, 0, $pos)} = $val;
                } 
            }
        } else { 
            foreach ($data as $key => $val) {
                if (property_exists($this, $key)) {
                    $this->$key = $val;
                } 
            }
        }
    }
}

As we can clearly see, readability has gone to hell but nevertheless we have reduced the amount of expressions evaluated from 200 to 2 in the best case scenario when $strict === false and prefix === null - that is a 99% increase in efficiency, theoretically.

Mostly everything I said so far is theoretical, so the question is - Is there any actual benefit to example 2 opposed to example 1 regarding program execution?

Given that I have not graduated in computer science I'm not as familiar on programming (and php) internals and how code is evaluated and executed on a deeper level, but I have read this and that on how processors can shortcut redundant if-statements but as I said, am not as familiar to answer the question myself.

Telastyn
  • 110,259
php_nub_qq
  • 2,234

4 Answers4

4

There is a great book called Clean Coder. It copes a lot of questions like this and contains some awesome tips for those fundamental issues you are talking about.

For your question, there is a rule called Beware of Optimizations!

The explaination starts with: Focus on writing comprehensible code. Optimized code is often anything but readable... and I personally agree on that.

If you don't need to optimize, don't do it. If you run into performance issues, or if it is probable that you would run into performance issues in the near futur, then you have no choice than to optimize your code for performance (even if that means making your code less readable).

Clean coder: http://clean-code-developer.com/Red-Grade.ashx#Beware_of_Optimizations!_2

3

This is not ideal code because there are two distinct operations being performed in this one function. There is copying a key only if it exists on the target (filter), called "$strict". Then there is matching the key to a prefix value and transforming the key to that value (filter + map). (Note that this latter one especially looks like it has flawed logic, but I used your code verbatim in my answer anyway.)

There should be 2 separate functions which perform their specific operation on ONE item. Then you could simplify the constructor code and even increase maintainability. I haven't used PHP in some time, so I'm not sure if this will run, but here goes.

private function choosePrefix($prefix, $key) {
    if (($pos = strpos($key, $prefix)) === 0) {
        return substr($key, 0, $pos);
    } else {
        return false;
    }
}

private function strictCopy($target, $key, $value) {
    if (property_exists($target, $key)) {
        $target->$key = $value;
    }
}

Note that separating things into functions adds a lot of overhead (well, not in all languages, but in this case). However, it has the advantage of describing the intention of what you are doing! You'll see below.

Secondly, you are treating the object as though it is a collection (dictionary). This feels perfectly natural in PHP, but in most other languages, it is common to represent collection data as a property of the class instead of key/values being intermixed with concrete properties.

$data = $this->data;
$data[$key] = $value;

Thirdly, what solves your specific gripe is choosing your processing functions before running the loop and then running the chosen functions on the data inside the loop. This is pretty common in functional languages, and I believe now available in PHP, if a bit awkwardly. Here is an example... but again, been a while since I've used PHP.

private function passThru($dummy, $key) { return $key; }
private function choosePrefix($prefix, $key) {
    if (($pos = strpos($key, $prefix)) === 0) {
        return substr($key, 0, $pos);
    } else {
        return false;
    }
}

private function strictFilter($target, $key) {
    if ($key !== false && property_exists($target, $key)) {
        return $key;
    } else {
        return false;
    }
}

public function __construct($data = array(), $strict = false, $prefix = null) {
    if (!$data) {
        return;
    }
    // ternary form of if/then/else
    $chooseFn = ($prefix ? 'choosePrefix' : 'passThru');
    $filterFn = ($strict ? 'strictFilter' : 'passThru');

    foreach ($data as $key => $val) {
        $key = $this->$chooseFn($prefix, $key);
        $key = $this->$filterFn($this, $key);
        if ($key !== false) {
            $this->$key = $value;
        }
    }
}

Now if we look at the constructor, without having to actually mentally work through the code, I can see that I might skip some keys, otherwise I will copy the key/value to this object. Giving the operations names by putting them in separate functions helps readability here.

1

In answer to this question from the comments:

We say that we sacrifice performance for maintainability and readability but why is this necessary, why is there no way to have the best out of both.

Code is a construct that is used to express a sequence of steps which represents a task to be carried out, or to express a set of expressions to calculate a value. In order for this logic to be of any use it has to be executed at some point, so the reality of this execution has to be taken into account. This is where optimization comes in.

Optimization means adding to or changing the logic of your task to account for details of the mechanism of execution, in order to speed things up. Often times this complicates the code, e.g. by breaking abstractions that are used by your code to get at and change implementation details.

Maintainable code has the properties of (among others) clarity and simplicity; removing these in your code, e.g. by breaking the abstractions that buy them, will cause the loss of some maintainability.

This doesn't necessarily mean that all optimization breaks maintainability. It can be possible to write maintainable code taking into account the particulars of the execution mechanism, or to apply different abstractions to do so. (Or to use abstractions properly in the first place.)

paul
  • 2,084
1

There needn't be any trade-off. I unfortunately don't know php well enough to write in it, so I'll use pseudo-code instead. Your algorithm can look like:

if($prefix) {
    $data = $data filtered to only include items prefixed with $prefix;
    $data = $data mapped to remove $prefix;
}

if($strict) {
    $data = $data filtered to only include items where $item.$key already exists;
}

foreach($item in $data) {
    add $item to $this;
}

This allows you to not repeat your logic or your checks, and in a language that provides you with declarative collection manipulation (like C#'s LINQ or Java 8's Stream API), would be very clean and clear.

Note that, realistically because checking booleans is so cheap, this would probably still be less performant than your version, but that's just because you chose an operation that's so cheap as to be pointless to optimize. Hopefully this addresses a more general point of performance vs cleanliness: often, it's illusive.

Ben Aaronson
  • 6,973