1

I am managing a MariaDB 10.6 Server for a bunch of people. I also dump the databases for backup purposes. Recently, one of the databases started to make problems while dumping:

mysqldump: Couldn't execute 'show create table my_view': Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (utf8mb4_unicode_ci,COERCIBLE) for operation '=' (1267)

Everything in that database was created using the utf8mb4 character set and utf8mb4_unicode_ci collation. I checked the DB, tables, columns and view.

I then found out, that when I open a regular mariadb shell, the collation_connection differs:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE "collation_connection";
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
+----------------------+--------------------+
1 row in set (0.002 sec)

MariaDB [(none)]> SHOW VARIABLES LIKE "collation_connection"; +----------------------+--------------------+ | Variable_name | Value | +----------------------+--------------------+ | collation_connection | utf8mb4_general_ci | +----------------------+--------------------+ 1 row in set (0.003 sec)

My first question: Why does the global scope have a different collation_connection compared to the session scope? I always thought the session scope inherits from the global scope. I was not able to find in the docs, where this gets set.

Here are the relevant parts of my mariadb option file:

[mariadb]
character_set_server=utf8mb4
collation_server=utf8mb4_unicode_ci

[client] default-character-set=utf8mb4

[mariadb-dump] default-character-set=utf8mb4

My second question: How to prevent such issues? Do I have to tell all of my users to use the same collation to prevent issues like that?


Here is an example to reproduce it:

CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department VARCHAR(100),
    salary DECIMAL(10, 2),
    created_at TIMESTAMP
);

INSERT INTO employees (name, department, salary, created_at) VALUES ('Alice', 'HR', 50000, NOW()), ('Bob', 'Engineering', 70000, NOW()), ('Charlie', 'HR', 45000, NOW()), ('David', 'Engineering', 80000, NOW()), ('Eve', 'Marketing', 60000, NOW());

SET collation_connection = utf8mb4_unicode_ci;

CREATE VIEW hr_employees AS SELECT id, name, salary FROM employees WHERE department = 'HR' AND date_format(created_at, "%Y") IN ('2024');

SET collation_connection = utf8mb4_general_ci;

SHOW CREATE VIEW hr_employees; ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (utf8mb4_unicode_ci,COERCIBLE) for operation '='

Luuk
  • 933
  • 5
  • 13
Vince
  • 111
  • 1

1 Answers1

1

Change your VIEW to:

CREATE VIEW hr_employees AS
SELECT id, name, salary
FROM employees
WHERE department = 'HR'  
  AND date_format(created_at, "%Y" COLLATE utf8mb4_general_ci) IN ('2024') 
;

OR you could do, (but that changes the query!):

CREATE VIEW hr_employees AS
SELECT id, name, salary
FROM employees
WHERE department = 'HR' AND date_format(created_at, "%Y") = '2024' COLLATE utf8mb4_general_ci
;

see: DBFIDDLE

The error is cause by date_format(created_at, "%Y") IN ('2024').

Adding the COLLATE utf8mb4_general_ci solves this.

the only thing I could find about it, is this:

DATE_FORMAT() returns a string with a character set and collation given by character_set_connection and collation_connection so that it can return month and weekday names containing non-ASCII characters.

Luuk
  • 933
  • 5
  • 13