61

The plus sign, +, is used both for addition and for string concatenation, but its companion, the minus sign, -, is generally not used for trimming strings or for anything else besides subtraction. What could be the reasons or limitations behind that?

Consider the following example in JavaScript:

var a = "abcdefg";
var b = "efg";

a - b   // NaN
// but
a + b   // "abcdefgefg"
Lesmana

6 Answers

115

In short, there aren’t any particularly useful subtraction-like operations on strings that people have wanted to write algorithms with.

The + operator generally denotes the operation of an additive monoid, that is, an associative operation with an identity element:

  • A + (B + C) = (A + B) + C
  • A + 0 = 0 + A = A

It makes sense to use this operator for things like integer addition, string concatenation, and set union because they all have the same algebraic structure:

1 + (2 + 3) == (1 + 2) + 3
1 + 0 == 0 + 1 == 1

"a" + ("b" + "c") == ("a" + "b") + "c"
"a" + "" == "" + "a" == "a"

And we can use it to write handy algorithms like a concat function that works on a sequence of any “concatenable” things, e.g.:

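import functools, operator

def concat(sequence, identity):  # identity: 0 for integers, "" for strings
    return functools.reduce(operator.add, sequence, identity)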
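
As a rough illustration of the laws above, the following Python sketch folds a sequence with + starting from an explicit identity element. The name `mconcat` and its `identity` parameter are made up for this example; Python has no built-in notion of a monoid, so the caller has to supply the identity.

```python
from functools import reduce
from operator import add


def mconcat(sequence, identity):
    """Fold `sequence` with +, starting from `identity` (hypothetical helper)."""
    return reduce(add, sequence, identity)


# The same fold works for any type whose + is associative with an identity.
print(mconcat([1, 2, 3], 0))          # 6
print(mconcat(["a", "b", "c"], ""))   # abc
print(mconcat([[1], [2, 3]], []))     # [1, 2, 3]

# An empty sequence yields the identity itself.
assert mconcat([], "") == ""
```

Passing the identity explicitly is a design choice of this sketch: it keeps the function generic without guessing the element type from the data.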

When subtraction - gets involved, you usually talk about the structure of a group, which adds an inverse −A for every element A, so that:

  • A + −A = −A + A = 0

And while this makes sense for things like integer and floating-point subtraction, or even set difference, it doesn’t make so much sense for strings and lists. What is the inverse of "foo"?

There is a structure called a cancellative monoid, which doesn’t have inverses, but does have the cancellation property, so that:

  • A − A = 0
  • A − 0 = A
  • (A + B) − B = A

This is the structure you describe, where "ab" - "b" == "a", but "ab" - "c" is not defined. It’s just that we don’t have many useful algorithms that use this structure. I guess if you think of concatenation as serialisation, then subtraction could be used for some kind of parsing.
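
To make the cancellation laws concrete, here is a small Python sketch of one possible string "subtraction". The function name `subtract_suffix` is invented for illustration, and treating a - b as "strip b from the end of a" is an assumption; it is only one way to read the operation.

```python
def subtract_suffix(a: str, b: str) -> str:
    """Return x such that x + b == a; raise ValueError if no such x exists.

    Hypothetical helper: one possible reading of "a - b" for strings.
    """
    if not a.endswith(b):
        raise ValueError(f"{a!r} does not end with {b!r}")
    return a[:len(a) - len(b)]


# The three cancellation laws, with "" playing the role of 0.
assert subtract_suffix("foo", "foo") == ""              # A - A = 0
assert subtract_suffix("foo", "") == "foo"              # A - 0 = A
assert subtract_suffix("foo" + "bar", "bar") == "foo"   # (A + B) - B = A

# "ab" - "c" is undefined, so the sketch raises instead of guessing.
try:
    subtract_suffix("ab", "c")
except ValueError as err:
    print("undefined:", err)
```

Because concatenation is not commutative, a variant that strips a prefix instead would satisfy the mirrored laws equally well, which is part of why no single meaning stands out.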

Jon Purdy
36

Because concatenation of any two valid strings is always a valid operation, but the opposite is not true.

var a = "Hello";
var b = "World";

What should a - b be here? There's really no good way to answer that question, because the question itself isn't valid.

Mason Wheeler
27

Because the - operator for string manipulation does not have enough "semantic cohesion." Operators should only be overloaded when it is absolutely clear what the overload does with its operands, and string subtraction doesn't meet that bar.

Consequently, method calls are preferred:

public string Remove(string source, string toRemove)
public string Replace(string source, string oldValue, string newValue)

In the C# language, we use + for string concatenation because the form

var result = string1 + string2 + string3;

instead of

var result = string.Concat(string1, string2, string3);

is convenient and arguably easier to read, even though a function call is probably more "correct," from a semantic standpoint.

The + operator can really only mean one thing in this context. That is less true for -, because the notion of subtracting strings is ambiguous. By contrast, the call Replace(source, oldValue, newValue) with "" as newValue removes all doubt, and the same function can alter substrings, not just remove them.
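
The ambiguity can be sketched in a few lines of Python. The three function names below are hypothetical; each one is a plausible meaning of "s - t", and they disagree on the same input.

```python
def remove_first(s: str, t: str) -> str:
    """Drop only the first occurrence of t."""
    return s.replace(t, "", 1)


def remove_all(s: str, t: str) -> str:
    """Drop every occurrence of t."""
    return s.replace(t, "")


def remove_suffix(s: str, t: str) -> str:
    """Drop t only if s ends with it."""
    return s[:len(s) - len(t)] if s.endswith(t) else s


s, t = "ABABABABAB", "B"
print(remove_first(s, t))    # AABABABAB
print(remove_all(s, t))      # AAAAA
print(remove_suffix(s, t))   # ABABABABA
```

A named function tells the reader which of these is meant; a bare - would not.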

The problem, of course, is that the operator overload depends on the types passed to the operator, and if you pass a string where a number should have been, you may get a result you didn't expect. In addition, for many concatenations (e.g. in a loop), a StringBuilder object is preferable, since each use of + creates a brand-new string, and performance can suffer. So the + operator isn't even appropriate in all contexts.

There are operator overloads that have better semantic cohesiveness than the + operator does for string concatenation. Here's one that adds two complex numbers:

public static Complex operator +(Complex c1, Complex c2) 
{
    return new Complex(c1.real + c2.real, c1.imaginary + c2.imaginary);
}
Robert Harvey
8

The Groovy language does allow -:

println('ABC'-'B')

returns:

AC

And:

println( 'Hello' - 'World' )

returns:

Hello

And:

println('ABABABABAB' - 'B')

returns:

AABABABAB
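
For comparison, a language without string subtraction can imitate this through operator overloading. The Python sketch below uses a made-up `GStr` subclass and assumes that - means "remove the first occurrence of the right operand", which is consistent with the three outputs shown above but is not taken from the Groovy documentation.

```python
class GStr(str):
    """str subclass whose - removes the first occurrence of the right operand.

    Hypothetical class; it only mirrors the three outputs shown above.
    """

    def __sub__(self, other):
        # str.replace with count=1 removes at most one occurrence.
        return GStr(self.replace(str(other), "", 1))


print(GStr("ABC") - "B")           # AC
print(GStr("Hello") - "World")     # Hello
print(GStr("ABABABABAB") - "B")    # AABABABAB
```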
6

The plus sign probably contextually makes sense in more cases, but a counter-example (perhaps an exception that proves the rule) in Python is the set object, which provides for - but not +:

>>> set('abc') - set('bcd')
set(['a'])
>>> set('abc') + set('bcd')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'set' and 'set'

It doesn't make sense to use the + sign because the intention could be ambiguous - does it mean set intersection or union? Instead, it uses | for union and & for intersection:

>>> set('abc') | set('bcd')
set(['a', 'c', 'b', 'd'])
>>> set('abc') & set('bcd')
set(['c', 'b'])
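
The session above is from Python 2, where sets print as set([...]). A minimal sketch of the same operators as a Python 3 script, written with assertions so that it does not depend on how sets are displayed:

```python
a, b = set("abc"), set("bcd")

assert a - b == {"a"}                   # difference
assert a | b == {"a", "b", "c", "d"}    # union
assert a & b == {"b", "c"}              # intersection

# + is still not defined for sets.
try:
    a + b
except TypeError as err:
    print("TypeError:", err)
```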
Aaron Hall
3

"-" is used in some compound words (for example, "on-site") for joining the different parts into the same word. Why don't we use "-" for joining different strings together in programming languages? I think it would make perfect sense! To hell with this + nonsense!

However, let's try looking at this from a bit more abstract angle.

How would you define string algebra? What operations would you have, and what laws would hold for them? What would their relations be?

Remember, there must be absolutely no ambiguity! Every possible case must be well defined, even if that means saying an operation is not possible. The smaller your algebra is, the easier this is to do.

For example, what does it actually mean to add or subtract two strings?

If you add two strings (for example, let a = "aa" and b = "bb"), would you get aabb as the result of a + b?

How about b + a? Would that be bbaa? Why not aabb? What happens if you subtract aa from the result of your addition? Would your string have a concept of negative amount of aa in it?

Now go back to the beginning of this answer and substitute "space shuttle" for "string". To generalize: why is any operation defined or not defined for any type?

The point I'm trying to make is that nothing stops you from creating an algebra for anything. It may just be hard to find operations for it that are meaningful, let alone useful.

For strings, concatenating is pretty much the only sensible one I've ever come across. Doesn't matter what symbol is used to represent the operation.

Zavior