Questions tagged [international-components-unicode]

For questions about the International Components for Unicode (ICU) project, which provides collation in some databases.

International Components for Unicode (ICU) is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications.

Some databases use ICU for collation: comparing strings according to the conventions and standards of a particular language, region, or country. ICU's collation is based on the Unicode Collation Algorithm plus locale-specific comparison rules from the Common Locale Data Repository (CLDR), a comprehensive source for this type of data.
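
To illustrate why language-aware comparison differs from raw code point order, here is a minimal Python sketch. It does not use ICU or the Unicode Collation Algorithm; the helper `accent_insensitive_key` is a hypothetical name, and stripping accents via Unicode decomposition is only a crude stand-in for real collation rules. It also shows only one convention: in Swedish, for example, "ö" sorts after "z", which is the kind of locale-specific rule ICU takes from CLDR.

```python
import unicodedata


def accent_insensitive_key(text):
    """Hypothetical sort key: approximates an accent- and case-insensitive order.

    This is NOT the Unicode Collation Algorithm, only a rough illustration.
    """
    # Decompose characters (NFD) so that "ö" becomes "o" plus a combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only base characters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Casefold for case-insensitivity; the original text breaks ties.
    return (base.casefold(), text)


words = ["zebra", "öl", "Apple", "orange"]

# Plain sorted() compares code points, so "ö" (U+00F6) lands after "z".
codepoint_order = sorted(words)

# With the approximate key, "öl" sorts next to the other "o" words.
approx_order = sorted(words, key=accent_insensitive_key)

print(codepoint_order)
print(approx_order)
```

A database collation provider such as ICU replaces both steps with locale-tailored rules, so the result depends on the locale chosen rather than on a single fixed key function.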

12 questions
15
votes
1 answer

PostgreSQL nondeterministic collations are not supported for LIKE

I am using PostgreSQL v12. I created a collation like this: CREATE COLLATION ci (provider = icu, locale = 'tr_TR', deterministic = false); I used that collation in a table: create table testtable1 ( id serial primary key, name text …
8
votes
1 answer

Specify ICU collations as `ENCODING`, `LC_COLLATE`, and `LC_CTYPE` in Postgres 10

Postgres 10 gains the ability to use International Components for Unicode (ICU) collations rather than depending on host OS implementations. See More robust collations with ICU support in PostgreSQL 10 by Peter Eisentraut. So how exactly does one…
6
votes
1 answer

What is the meaning of "-x-icu" in PostgreSQL's "collates"?

I had this query: SELECT * FROM table ORDER BY label ASC; Since the labels are not in English, they didn't get sorted in the right order (ones beginning with "ö" were not at the bottom/end). I therefore tried: SELECT * FROM table ORDER BY label…
4
votes
2 answers

Numeric collation sorts by digits instead of value- postgres

I have a table of users with ids in the following form: user123@domain.com. When searching and sorting the users, I need user1@domain.com to be before user14@domain.com, but since 4 is “smaller” than @ it sorts the other way around. After looking…
3
votes
1 answer

Does PostgreSQL support ICU collation's options and settings?

ICU specifies different LDML Collation Settings. Some of them seem pretty interesting, especially the ones on case and accent, “Ignore accents”: strength=primary “Ignore accents” but take case into account: strength=primary caseLevel=on “Ignore…
3
votes
1 answer

Was my Postgres cluster built with the ICU libraries available for Postgres 10 and later?

I wonder if my installation of Postgres 10 Beta 2 has been built to include the new International Components for Unicode (ICU) collations. For background info, see More robust collations with ICU support in PostgreSQL 10 by Peter Eisentraut. I used…
Basil Bourque
2
votes
1 answer

Equivalent of utf8_general_ci in Postgres/ICU?

In MySQL there is a collation utf8_general_ci which provides case-insensitive comparisons in a variety of languages. For example, these are all 1 (true): SELECT 'ı' = 'I' COLLATE 'utf8_general_ci'; SELECT 'i' = 'I' COLLATE 'utf8_general_ci'; SELECT…
2
votes
2 answers

Identify the version of a collation from ICU in Postgres

Postgres 10 and later incorporates the International Components for Unicode (ICU) library for text-handling and other internationalization issues. Changes happen to human languages, such as sorting rules evolving. Therefore collation definitions…
2
votes
1 answer

International Components for Unicode (ICU) Date Formatter for MySQL?

According to the ICU - International Components for Unicode definition: ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications…
1
vote
1 answer

Postgres sorting order not respecting default C collation?

I have a postgres DB running in an AL2023 docker image: postgres=# select version(); select version(); version …
1
vote
1 answer

Collation for accent-insensitive comparison on Postgres?

In the PG 13 documentation, there are several examples of ICU collations for specialized purposes. It is also mentioned that ICU locales exist that allow creating collations to ignore accents, and that they can be found on…
ARX
0
votes
0 answers

Error Building PostgreSQL from Source: Undefined ICU References During make world-bin

I am trying to build PostgreSQL from source on my Linux machine but am encountering errors related to ICU during the make world-bin process. Below are the details of what I've done so far: Steps Taken: sudo yum install libicu-devel git clone…