17

I'm currently evaluating a potential new project that involves creating a DB for typical customer information (userid, pwd, first & last name, email, adress, telfnr ...). At this point, requirements are only roughly defined.

The customer DB is expected in the O(millions) of records. In order to calculate some back-of-the-envelope numbers for DB sizing and evaluate potential DB options & architectures, I'm looking for some de-facto standards for these kind of records. In particular, the std size of every field (first name, last name, address,...) or typical avg for a simple customer record would be great info.

With so many e-commerce websites out there, there should be some kind of typical config that can be reused and avoid re-inventing the wheel.

Any ideas?

---- edit ----

The answers seems to be steering towards adopting an standard customer record vs designing your own. I would like to stress that the focus of this question is to locate a reference for field sizing for a customer object, and avoid figuring that out on my own.(I've emphasized that part on the original text -now in bold-)

maasg
  • 560
  • 4
  • 14

7 Answers7

16

The nice thing about standards is that you have so many to choose from. - Andrew Stuart Tanenbaum

Things like this are very specific to a customer and the industry, anything generic will include everything and the kitchen sink. Especially EDI type formats, they were organically defined over a decade or more in most cases and include everything every company on the committee ever wanted. They were supposed to be industry generic, and they became extremely industry specific and extremely brittle.

There is no royal road to the design or information you want. Do the time and the effort to get the requirements and get a concrete estimate. Otherwise you will be more wrong than correct. The only way to know what you need to know is to ask the questions and figure it out yourself.

Many CRM systems use what is now called an Expando object pattern, previously known as a dynamic property pattern. It is basically a key value pair dictionary construct. Except for very special cases it is considered an design Anti-Pattern and should be avoided.

I have designed and built at least 8 custom CRM solutions in the last 20 years, each and everyone had different requirements and none of the data models ( logical or physical ) would have worked across the board for all the domains.

Specific solutions for specific cases will always be better designs.

4

There is a thread in the DBA stack on best practices for common person fields that discusses the issues. It matters a great deal what you are planning to do with the data and how thorough you need to be. If you actually need to support all valid email addresses or all valid names, your columns are going to need to be much bigger than if you merely want to support whatever your organization and application consider a reasonable subset of the valid values.

Justin Cave
  • 12,811
3

As Jarrod pointed out, if you follow a generic standard, you will definitely end up with a record format that includes a lot of things your system will never need. Since you already know that there will be a fairly large number of records, it's likely that you'll get unnecessary performance issues because you're supporting data that will never be used. Conversely, it's also likely that the standard won't include fields that you do need, which will be a painful problem to solve; either you break the standard by adding these fields, or you'll have to find some (probably clunky) way of including the non-standard fields within the standard.

I think the real problem here isn't about finding a one-size-fits-all standard (which will almost always be one-size-fits-NONE) but that you have been tasked to estimate a solution where the requirements aren't specified yet. In these cases, I think the only professional thing to do is to make a minimum estimate based on the requirements you do have, and then make a maximum estimate based on all possible undefined requirements that you think might come up. Sure enough, the estimate might become ridiculously rough, in which case you should explain to whoever tasked you with this, that it's just not feasible to make a good estimate until the requirements are more well-defined.

1

Existing International Standards

There are quite a few standards, but specific to certain fields, with varying requirements for each of them depending on their data collection needs.

For instance, but not limited to (and talking from experience with both of these):

Some of the above link to fairly detailed documents, listing even requirements for health and formatting of fields (for instance, HL7 uses well-defined data-types). At lot of them do not go in this much detail though.

Government-Driven Standards for Internal Records

Governments, national or local, often have a strong need to record and store personal information for public offices, and obviously have come up with own "standards", which they implement across their organizations (with varying levels of success and interoperability with partner organizations).

An example could be this Data Formats for Identity Records Standard from the Government of New Zealand.

De-Facto Standards in Software

You could take inspiration from these, or use the source of known open-source CRM software to use as best-practices and guidelines for the data specifications of your customer data.

See the Top 10 Open-Source Business and Social CRM Software list, for which you could look up their data-models yourself.

haylem
  • 29,005
0

I'd say you need to find standards for EDI systems. There are hundreds of 'standard' documents, so you'd need to choose one based on your requirements. For example, here's a format for TRADACOMS invoice that you could grab fields you want from.

gbjbaanb
  • 48,749
  • 7
  • 106
  • 173
0

The Open Applications Group publishes a set of open standards for application implementation and interoperability. They're mostly XML-oriented, but they do specify a standard customer record with individual fields and sizes (look for CustomerPartyMaster in the document standards list).

TMN
  • 11,383
0

I would say "You ain't gonna need it (yet)". And with Ron Jeffries: "Always implement things when you actually need them, never when you just foresee that you need them."

So maybe if it is time to add a concrete database to the project you have much more knowledge about the data that will be stored there.

habakuk
  • 135
  • 2