9

I'm tempted to create a final class CaseInsensitiveString implements CharSequence.

This would allow us to define variables and fields of this type, instead of using a regular String. We can also have e.g. a Map<CaseInsensitiveString, ?>, a Set<CaseInsensitiveString>, etc.

What are some of the pros and cons of this approach?

gnat

6 Answers

27

Case insensitivity is a property of the comparison, not of the object (*). Depending on the context, you will want to compare the same string either case-sensitively or case-insensitively.

(And you open a whole can of worms, because what counts as a case-insensitive comparison depends on the language (i is uppercased as İ in Turkish) and even on the context (depending on the word and the dialect, ß can be uppercased as SS or SZ in German).)
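To make the locale point concrete, here is a minimal sketch using Java's standard `String.toUpperCase(Locale)` and `String.toLowerCase(Locale)`. The class and helper names (`LocaleCaseDemo`, `upper`) are made up for illustration.

```java
import java.util.Locale;

public class LocaleCaseDemo {
    // Upper-cases text using the case rules of the given locale.
    static String upper(String text, Locale locale) {
        return text.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Locale-neutral rules: "i" maps to the plain Latin capital I.
        System.out.println(upper("i", Locale.ROOT).equals("I"));        // true

        // Turkish rules: "i" maps to the dotted capital I (U+0130).
        System.out.println(upper("i", turkish).equals("\u0130"));       // true

        // The German sharp s (U+00DF) expands to two letters.
        System.out.println(upper("\u00DF", Locale.ROOT).equals("SS"));  // true

        // Lower-casing under Turkish rules yields a dotless i (U+0131),
        // so the result no longer matches the ASCII spelling.
        System.out.println("TITLE".toLowerCase(turkish).equals("title")); // false
    }
}
```

The last line is the practical trap: the same two strings compare equal or unequal depending on which locale the case mapping ran under.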

(*) It can be a property of the object containing the string, but that is somewhat different from being a property of the string itself. You can have a class with no state except a string, where comparing two instances uses a case-insensitive comparison of that string. But that class won't be a general-purpose string: it won't provide the methods expected of a general-purpose string, and it will provide methods that a general-purpose string wouldn't have. Such a class won't be called CaseInsensitiveString but PascalIdentifier, or whatever describes its purpose. And BTW, the case-independent comparison algorithm will most probably be dictated by that purpose and be locale-independent.
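A rough sketch of such a purpose-specific type might look like the following. Only the name PascalIdentifier comes from the text above. The fields, the `name()` accessor, and the choice of `Locale.ROOT` lower-casing as the comparison rule are assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical domain type: an identifier in a case-insensitive language.
final class PascalIdentifier {
    private final String name;    // original spelling, kept for display
    private final String folded;  // locale-independent key used for comparison

    PascalIdentifier(String name) {
        this.name = name;
        // Assumption: folding with Locale.ROOT is adequate for identifiers.
        this.folded = name.toLowerCase(Locale.ROOT);
    }

    String name() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        // Compares only with its own type, never with a plain String.
        return o instanceof PascalIdentifier
                && folded.equals(((PascalIdentifier) o).folded);
    }

    @Override
    public int hashCode() {
        // Must agree with equals(), so it hashes the folded form.
        return folded.hashCode();
    }

    @Override
    public String toString() {
        return name;
    }
}
```

With this shape, `new PascalIdentifier("WriteLn")` and `new PascalIdentifier("writeln")` are equal and hash alike, while `toString()` still returns the spelling that was originally supplied.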

AProgrammer
7

Just off the top of my head:

Pros:

  • Makes a lot of code self-documenting, e.g.:
    • boolean userIsRegistered(CaseInsensitiveString username)
  • May streamline comparisons
  • May remove the potential for comparison bugs

Cons:

  • Might be a waste of time
    • people can just convert regular strings to lowercase if they need case-insensitive comparisons
  • Using it for front-end code will cause capitalization problems
    • For example, if you use CaseInsensitiveString to store a username, even though it makes sense to have case-insensitive back-end comparisons, the front-end code will display the user's name as "bob smith" or "BOB SMITH"
  • If your code base already uses regular strings, you will have to go back and change them or live with inconsistency
Maxpm
4

CaseInsensitiveString is not a bad idea, depending on your use case, as long as you don't expect it to work interchangeably with String.

You may convert a CaseInsensitiveString to a String, or vice-versa, and that's all you should do.

Problems arise if you try to do something like

class CaseInsensitiveString {
  private String value;

  @Override
  public boolean equals(Object o) {
    // .....
    if (o instanceof String) {
      // Accepting a plain String here is what breaks the equals() contract
      return value.equalsIgnoreCase((String) o);
    }
    return false;
  }
}

You are doomed to fail if you try to make your CaseInsensitiveString cooperate with normal String, because you will violate the symmetry and transitivity requirements of equals() (and other contracts).
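The symmetry problem can be seen directly in a small sketch. `MixedEquals` is a hypothetical stand-in that reproduces the equals() shown above. The same-type branch and the hashCode() are filled in as assumptions so that it compiles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical class reproducing the problematic equals() described above.
final class MixedEquals {
    private final String value;

    MixedEquals(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (o instanceof MixedEquals) {
            return value.equalsIgnoreCase(((MixedEquals) o).value);
        }
        if (o instanceof String) {
            // The one-sided interoperability attempt.
            return value.equalsIgnoreCase((String) o);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return value.toLowerCase(Locale.ROOT).hashCode();
    }
}

public class SymmetryDemo {
    public static void main(String[] args) {
        MixedEquals wrapped = new MixedEquals("Hello");
        String plain = "hello";

        System.out.println(wrapped.equals(plain)); // true
        System.out.println(plain.equals(wrapped)); // false: String knows nothing about MixedEquals

        // Collections rely on equals(), so the answer here depends on
        // which operand's equals() the implementation happens to call.
        List<Object> list = new ArrayList<>();
        list.add(wrapped);
        System.out.println(list.contains(plain));
    }
}
```

Because String.equals() cannot be changed, the only way to restore symmetry is to drop the `instanceof String` branch and convert explicitly between the two types.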

However, please ask yourself: in what cases do you really need a CaseInsensitiveString where String.CASE_INSENSITIVE_ORDER would not be suitable? I bet not many. I am sure there are cases where this special class is worth having, but ask yourself first.
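For reference, this is roughly what the String.CASE_INSENSITIVE_ORDER route looks like with the standard sorted collections. The class and method names are made up for the example. Note that, per its Javadoc, this comparator does not take locale into account.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class CaseInsensitiveCollections {
    // A sorted map whose keys are compared without regard to case.
    static Map<String, Integer> newCaseInsensitiveMap() {
        return new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        Map<String, Integer> logins = newCaseInsensitiveMap();
        logins.put("Bob", 1);
        logins.put("BOB", 2); // same key as "Bob": the value is replaced

        System.out.println(logins.size());     // 1
        System.out.println(logins.get("bob")); // 2

        Set<String> names = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        names.add("Alice");
        System.out.println(names.contains("ALICE")); // true
    }
}
```

The keys stay ordinary Strings, so the original capitalization is still available for display and nothing else in the code base needs a new type.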

Adrian Shum
2

Explicitly creating types in your domain/model is very good practice. As Maxpm said, it is self-documenting. Another big plus: people can't accidentally pass the wrong kind of input. The only downside is that it may scare off junior (and even some mid-level) programmers.

1

A CaseInsensitiveString class and its helpers add a lot of code, and they make everything less readable than the String.toLowerCase() method.

CaseInsensitiveString varName1 = new CaseInsensitiveString("HeLLo");
//... a lot of lines here
CaseInsensitiveString varName2 = new CaseInsensitiveString("Hello");
//... a lot of lines here
if (varName1.equals(varName2)) ...

is more complex, less self documenting, and less flexible than

String varName1 = "HeLLo";
//... a lot of lines here
String varName2 = "Hello";
//... a lot of lines here
if (varName1.toLowerCase().equals(varName2.toLowerCase())) ...
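
As a side note, the plain-String comparison can also be written with the standard String.equalsIgnoreCase(), which does not depend on the default locale the way the no-argument toLowerCase() does. A minimal sketch, with a made-up helper name:

```java
public class IgnoreCaseDemo {
    // Compares two strings while ignoring case differences.
    static boolean sameIgnoringCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        String varName1 = "HeLLo";
        String varName2 = "Hello";
        System.out.println(sameIgnoringCase(varName1, varName2)); // true
    }
}
```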
Ando
0

The most frequently used formats and languages on the web, such as XML and JavaScript, are case-sensitive. In terms of performance, it is always best to use the most appropriate function/property/object for each case.

If you are dealing with structured formats such as XML or JS, case sensitivity matters, and using the system libraries is much faster.

If you are dealing with data in a database, as mentioned above, database indexing should be used for case-sensitive/insensitive strings.

If you are handling data on the fly, it is important to weigh the cost of the necessary conversion for each string, since the strings will probably need to be compared or sorted at some point.