Data Management

Data Modeling

Digital colored waves in data modeling representation

Data modeling is the practice of creating a visual representation of data elements and flows within the context of a business process. Data models are created to document how data is structured to support a business process. They can be reverse-engineered from existing database schemas to support re-engineering efforts.

Why Model Data?

Data models are a best practice method for creating a base design. Visual representation helps document and communicate the design with reviewers and collaborators. It is beneficial when a system needs to be enhanced to extend, and it is essential when refactoring an application to use a microservices architecture.

Types of Data Models

There are multiple ways to model data. Below are some examples:

  • An Entity-Relationship (ER) diagram is the most common model. It shows how different objects that represent entities are interrelated using lines.
  • An object-oriented model is suitable for showing connections between objects to be stored in an object-oriented database.
  • A relational model shows how tables are related through keys in a relational database schema.
  • A dimensional data model is useful when designing a star schema for a data warehouse.
  • A network data model can show how record-level data is structured and connected.
  • A graph model can describe connections between nodes in a graph database.

The Three Levels of Data Modeling

Depending on what stage of modeling you are at, a business can benefit from all three levels to articulate data relationships:

  • At the beginning of a project, it is helpful to outline at a high level what data will be used to support a business process. The conceptual data model provides this summary-level view of the processes used to operate the business process. Details are omitted at this level so concepts can be conveyed easily.
  • Most models start at the logical data model of the business process to be implemented. This model details all the entities, their attributes, and the relationships between them. Primary key relationships are highlighted, along with foreign key relationships. A business analyst would author such as data model.
  • The lowest level is the physical data model, which contains sufficient detail to create a database schema. A database designer or application developer would draw this data model level. The physical model can map to a schema appropriate for the database technology. For a relational database, the physical model may de-normalize a logical data model to optimize performance for the most used questions to be answered by the database. The schema types that a physical design supports include SQL, NoSQL, and JSON.

Modeling Multiplicities

There are multiple ways for entity relationships to be depicted. The general types of relationships mapped include:

  • Zero, one or optional relationship.
  • One to One relationship.
  • Zero or more relationships.
  • One or more relationships.
  • A bounded range relationship.

Data Modeling Notations

Designers have a choice of notation styles to use when drawing models. Some of the common styles are:

  • The IE notation is a clean style for high-level data models. It is simple because it omits attribute details.
  • The Barker notation can be used at multiple levels. It is more detailed than the IE notation.
  • The IDEF1X notation is well suited to physical database design but is no longer popular.
  • The Object Management Group (OMG) has proposed a modeling standard called the Universal Modeling Language (UML). This is still in draft form until it is ratified.
Notation styles

Constructing a Model

Building a model includes multiple steps and can be laborious. The benefit of going through the steps is developing a deep understanding of the data and its interrelationships, which will help you build better applications. The more common steps, in order, are:

  1. Understand what use cases or business processes the model will support.
  2. Create a conceptual model.
  3. Understand data volumes and what questions the database needs to answer.
  4. Create the logical data model and validate it for completeness and accuracy.
  5. Decide what database model best supports the application, such as SQL, NoSQL, or JSON.
  6. Normalize, reduce duplicates, and de-normalize for performance.
  7. Create a physical database model.
  8. Create the database schema.
  9. Load test data and write queries.
  10. Validate the query results.
  11. Deploy the model.

Data Modeling with Actian Solutions

The Actian Data Platform supports SQL database schemas and JSON data. Actian Zen can be used to deploy SQL and NoSQL database schemas. Visit the Actian website to learn how our solutions help people connect, manage, and analyze data.