Skip to content

Data Modeling Best Practices

A practical guide for designing effective ERDs. Common mistakes and principles to avoid them.


Naming Conventions

Entity Naming

Rule Good Example Bad Example Reason
Use singular form Customer, Order Customers, OrderList Entities represent an instance, not a collection
Use business terms Accounts Receivable AR_TABLE Logical models use business language
Avoid abbreviations Order Detail Ord DTL Don't mix abbreviations in logical names
Avoid verbs Shipment Ship Items Entities are nouns

Column Naming

Rule Good Example Bad Example Reason
Entity prefix (for PK) Customer No Number Makes it clear which number
Consistent suffixes Order Date, Reg Date Order Day, Reg DT Same concept = same word
Use standard words Amount (AMT) Money, Cash Use words registered in the standard dictionary
Clear boolean names Delete Flag Flag1 The meaning of the value should be obvious

Physical Name Rules

Rule Example
UPPERCASE + UNDERSCORE CUST_NO, ORD_DT
30 characters or less (Oracle compatibility) CUST_ORD_DTL_SEQ
Avoid reserved words ORD (instead of ORDER), GRP (instead of GROUP)
Use standard abbreviations NM(Name), NO(Number), DT(Date), AMT(Amount)

→ Standard abbreviation management: Words, Terms & Domains


Entity Design Principles

1. One Entity = One Concept

If an entity contains two or more concepts, consider splitting it.

Customer_Order (Entity)
├── Customer No (PK)
├── Customer Name
├── Order No
├── Order Date
└── Order Amount
→ Customer and order information are mixed. If the same customer orders multiple times, customer info is duplicated.

Customer         Order
├── Cust No(PK)  ├── Order No(PK)
├── Cust Name    ├── Cust No(FK)
└── Email        ├── Order Date
                 └── Order Amount
→ Each entity handles a single concept. Connected by relationships.

2. Define PK for Every Entity

Entities without a PK cannot distinguish individual rows, making data integrity impossible to guarantee.

Entity Type PK Strategy Example
Master (Customer, Product) Business key or surrogate key CUST_NO, PROD_CD
Transaction (Order, Payment) Numbering rule-based ORD_NO (YYYYMMDD + SEQ)
Bridge (Enrollment, Assignment) Composite PK (STUDENT_NO, COURSE_CD)
History (Change Log) Composite PK (SOURCE_PK, CHANGE_DTIM)

3. Normalization Decisions

Level Key Question Problem When Violated
1NF Are there repeating groups in columns? Phone1, Phone2, Phone3...
2NF Are columns dependent on only part of a composite PK? Partial dependency → data duplication
3NF Are columns dependent on non-PK columns? Transitive dependency → update anomalies

Practical Normalization Principles

  • In the logical modeling phase, normalize up to 3NF
  • At physical design time, only intentionally denormalize where performance requires it
  • When denormalizing, record the reason in a Note

Relationship Design Principles

1:N vs M:N Decision

Question 1:N M:N
"Can one A have multiple B?"
"Can one B also have multiple A?"

When you discover an M:N relationship, create a bridge entity to decompose it into two 1:N relationships.

Common Mistakes

Mistake Problem Solution
Missing FK Relationship exists but no FK Create relationship in ThinkERD → FK auto-generated
Circular reference A→B→C→A cycle Review business rules. Confirm if the cycle is truly needed
Excessive identifying relationships All relationships made identifying Use non-identifying if the child has independent meaning
FK overuse FK set for all code tables For code-type references, use logical relationships only without FK

Managing Large-Scale ERDs

When You Have 50+ Entities

Strategy Method
Separate with Zones Create zones per domain for visual grouping
Color Coding Unify entity colors per domain
Use View Modes Logical for business review, physical for dev handoff
Use Notes Record design decisions in notes

Review Checklist

Before reviewing an ERD, verify the following:

  • Is a PK defined for every entity?
  • Are entity names singular business terms?
  • Are cardinality and optionality correct?
  • Are there unnecessary column duplications? (Check normalization)
  • Are identifying/non-identifying relationships appropriate?
  • Are standard words/domains applied?
  • Is code-type data separated into its own entities?