2

I have a database of members of an organization.

The data is really messy - the leader had a catastrophic medical issue, so I'm trying to make a sensible database out of Claris Filemaker Pro.

I'm trying to look for duplicates. There are some - apparently on purpose. For example, a couple could join 'once' but get two badges, so in the database we have one member ID (so no unique on it) and two different names. Not only that, libraries are in the database and several of them have the same email address, so no joy there.

I was trying to make a unique compound index of last_name, first_name, email_address, and member_id, but I found it allows duplicates.

I know NULL's aren't in indexes, but I was surprised that a compound index, if there is a single NULL in it, apparently allows duplicates of the non-null fields. The MySQL docs say:

A UNIQUE index permits multiple NULL values for columns that can contain NULL.

Sure, but what about the other 3 fields?

I didn't see a mention in the MariaDB docs.

Database:

CREATE TABLE `test` (
    `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `last_name` VARCHAR(46) NULL DEFAULT NULL COLLATE 'utf8mb3_general_ci',
    `first_name` VARCHAR(46) NULL DEFAULT NULL COLLATE 'utf8mb3_general_ci',
    `email_address` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8mb3_general_ci',
    `member_id` VARCHAR(40) NULL DEFAULT NULL COLLATE 'utf8mb3_general_ci',
    UNIQUE INDEX `index_dupecheck` (`last_name`, `first_name`, `email_address`, `member_id`) USING BTREE,
    INDEX `index_id` (`id`) USING BTREE
)
COMMENT='Test version'
COLLATE='utf8mb3_general_ci'
ENGINE=InnoDB
AUTO_INCREMENT=3
;

test

id last_name first_name email_address member_id
1 Bob Johnson bob@example.com \N
2 Bob Smith bsmith@example.com 1
3 Bob Johnson bob@example.com \N

I want to ensure that row 3 could not be inserted if row 1 is already there.

It sounds like this is due to NVL comparison to NVL being 'not equal'? (Postgres, but this sounds like the problem How can this unique index allow duplicate rows? )

except that syntax doesn't seem to work with MariaDB.

Is there another, better way of doing it? I could probably do the checks programmatically, but whoever put this data in, will probably put it in again, so I'd like to have a server side restriction for obvious reasons.

I'm also thinking of just saying member-ID has to be unique, and generate some way to link 'family' accounts together, or a 'smart number' like 123-A, 123-B for Alice and Bob (so no one's #1)

Thank you

J. Gwinner
  • 131
  • 7

1 Answers1

1

Except for obscure situations, you can consider MySQL == MariaDB. I would include NULLs in UNIQUE definitions. And, yes, 3 columns with any column(s) having NULL acts that way.

Do note that the PRIMARY KEY does not allow Nulls, but is otherwise like Unique.

It is up to you to decide if the business logic is happy with this to distinguish one user from another:

(`last_name`, `first_name`, `email_address`, `member_id`)

But, since that is clunky, etc, I would shift to id being the PRIMARY KEY. How is id assigned? Manually? Could it be AUTO_INCREMENT instead? This would allow the table to assign ids.

What needs to happen if customer without a member_id (IS NULL) gets a member_id. This will cause grief because of what id. For that reason, maybe you should not make it AUTO_INCREMENT.

Can Bob Smith get a second member_id? Etc. etc. You need to think through all the odd-ball cases. Then decide on what the PRIMARY KEY needs to be.

Rick James
  • 80,479
  • 5
  • 52
  • 119