ICU Specifies different LDML Collation Settings. Some of them seem pretty interesting, especially the ones on case and accent,
- “Ignore accents”:
strength=primary- “Ignore accents” but take case into account:
strength=primary caseLevel=on- “Ignore case”:
strength=secondary- “Ignore punctuation” (completely):
strength=tertiary alternate=shifted- “Ignore punctuation” but distinguish among punctuation marks:
strength=quaternary alternate=shiftedpotentially a better method of doing what
You can also see these documented here. Are these ICU options and settings possible with PostgreSQL 10 ICU collation support?
CREATE COLLATION special (provider = icu, locale = 'en@strength=primary');
SELECT 'Å' LIKE 'A' COLLATE "special"; # returns false
I've also tried the CLDR BCP47
Starting with ICU 54, collation attributes can be specified via locale keywords as well, in the old locale extension syntax ("
el@colCaseFirst=upper") or in language tag syntax ("el-u-kf-upper"). Keywords and values are case-insensitive. See the LDML Collation spec, Collation Settings, and the data file listing the valid collation keywords and their values. (The deprecated attributes kh/colHiraganaQuaternary and vt/variableTop are not supported.)
For that, this looked right
CREATE COLLATION special (provider = icu, locale = 'en-ks-level1');
SELECT 'Å' LIKE 'A' COLLATE "special"; # returns false
Also tried en-u-ks-level1 That method seems to be what the docs go for,
CREATE COLLATION german_phonebook (provider = icu, locale = 'de-u-co-phonebk');