Data in Motion

The term “data in motion” refers to data that is being actively processed or transformed, or that is moving from a source device, sensor, storage device, or data communications node to a target location. The target can be an intermediate node or a final destination, where the data becomes “data at rest”.

Data in Use

When data is created, edited, transformed, or processed, it is considered “in use”. Common sources of new data are transaction systems such as point-of-sale (POS) terminals and banking or stock market systems. IoT applications create data based on events such as car telemetry, motion detection, or voltage fluctuations. Event-driven data creation that generates large volumes of data is not limited to business applications. Streams from gaming, social media engagement, and sports events generate masses of data that marketing systems consume to drive multi-channel advertising campaigns.

Data Processing

Cloud providers often consider data to be in motion when it is being actively processed by an application. Examples include data from an IoT sensor used to update a weather map, data sent in an email, or data being replicated to a remote location for data protection. For data to be processed, it must be moved from non-volatile secondary storage into volatile RAM or CPU cache. When an application alters data, the changes must be written back to secondary storage to persist across server shutdowns and unplanned power failures. Data stored on non-volatile secondary storage such as hard disks and SSDs is termed “data at rest”.
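The round trip between volatile memory and secondary storage can be illustrated with a minimal Python sketch. The function names, the file name, and the sample payload are hypothetical and chosen only for illustration. The sketch assumes the operating system honors the flush request, which is not guaranteed on every device.

```python
import os
import tempfile


def persist(path: str, payload: bytes) -> None:
    """Write payload to path and ask the OS to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(payload)      # data sits in a user-space buffer (RAM)
        f.flush()             # hand the buffer to the operating system
        os.fsync(f.fileno())  # request that the OS cache be flushed to the device


def load(path: str) -> bytes:
    """Read the file back from secondary storage into RAM."""
    with open(path, "rb") as f:
        return f.read()


# Hypothetical sensor reading used only for this example.
with tempfile.TemporaryDirectory() as workdir:
    reading_path = os.path.join(workdir, "reading.bin")
    persist(reading_path, b"temperature=21.5")
    restored = load(reading_path)
```

Without the flush and sync steps, the altered data may still be sitting in volatile buffers when power is lost.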

The Three States of Data

The three states of data are:

  1. Data at rest – on secondary storage media.
  2. Data in motion – between media, applications, or locations.
  3. Data in use – being used or created by applications.

Data Encryption and Data in Motion

When data is in motion, it is susceptible to interception by anyone who can access the communication path between endpoints. If the data is transmitted as plaintext, it can be stolen or used to gain access to secure servers. For this reason, data needs to be encrypted with a cryptographic key using algorithms such as the Advanced Encryption Standard (AES) and Rivest-Shamir-Adleman (RSA). AES, a symmetric cipher, is used for both in-transit and at-rest encryption. RSA, an asymmetric algorithm, is commonly used to exchange keys and authenticate endpoints when data is transmitted across a wide area network (WAN), rather than to encrypt the bulk data itself. AES supports key lengths of 128, 192, and 256 bits. Longer keys take more effort to break, so the sensitivity of the data in motion usually dictates which key length is used. Actian products can encrypt data at rest or in motion using a 256-bit AES key.
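To make the key lengths concrete, the following sketch generates random key material of each AES size and compares the sizes of the key spaces. It uses only the Python standard library, so it produces raw keys and does not perform AES encryption itself; actual encryption would need a vetted cryptographic library. The helper names are hypothetical.

```python
import secrets

AES_KEY_BITS = (128, 192, 256)


def generate_key(bits: int) -> bytes:
    """Return cryptographically strong random key material of the given size."""
    if bits not in AES_KEY_BITS:
        raise ValueError(f"AES keys must be one of {AES_KEY_BITS} bits")
    return secrets.token_bytes(bits // 8)


def key_space(bits: int) -> int:
    """Number of distinct keys an exhaustive search would have to consider."""
    return 2 ** bits


key = generate_key(256)                      # 32 bytes of key material
ratio = key_space(256) // key_space(128)     # how much larger the 256-bit space is
```

Each additional key bit doubles the number of candidate keys, so a 256-bit key space is 2^128 times larger than a 128-bit one.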

Ways to Protect Data in Motion

Data protection works best when it is implemented proactively. Encryption is not the only way for organizations to protect data. Firewalls can stop outsiders from entering private networks. Strong passwords and access policies can protect sensitive data from internal and external threats. Data should always be encrypted when it traverses any network, whether external or internal.
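One common way to encrypt data as it crosses a network is Transport Layer Security (TLS). The sketch below, which assumes a Python client and uses a hypothetical helper name, builds a TLS configuration with the standard `ssl` module. It only prepares the configuration and does not open a connection.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS configuration with certificate checks enabled."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2 (a policy choice for this sketch).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

An application would then wrap its socket with `context.wrap_socket(sock, server_hostname=...)`, passing the name of the server it intends to reach, so that data is encrypted before it leaves the host.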

Transient Data in a Data Warehouse

Databases that underpin data warehouses are managed by a relational database management system (RDBMS). To ensure each database transaction is processed reliably, a database needs four properties: atomicity, consistency, isolation, and durability. When databases possess these properties, they are said to be ACID-compliant.
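Atomicity, the first of these properties, can be demonstrated with a small sketch using SQLite from the Python standard library. SQLite stands in here for a data warehouse RDBMS, and the table, account names, and `transfer` helper are hypothetical. The second update violates a constraint, so the first update is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, source: str, target: str, amount: int) -> bool:
    """Move funds as one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance


ok = transfer(conn, "alice", "bob", 500)  # alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, both balances are unchanged, even though the credit to the target account had already been applied inside the transaction.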

To deliver ACID compliance, an RDBMS must protect transactional data to maintain data integrity. To do this, every creation, update, or deletion of user data is recorded in a transaction log, which is always written to secondary storage. Transient data in RAM is kept consistent by locking it, so that only one user or application program can change it until the transaction is either committed or rolled back. When a database server restarts after an unplanned shutdown, it returns to the last checkpoint in the transaction log and then rolls forward through the log, applying any committed data changes made since the checkpoint. An orderly shutdown performs a transaction log checkpoint to minimize startup times.
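The recovery sequence described above can be sketched as a toy simulation. This is not how any particular RDBMS implements its log. The record layout and the `recover` function are hypothetical, and the sketch assumes that each checkpoint record carries a snapshot of the committed state and that no transaction spans a checkpoint.

```python
from typing import Dict, List

# Hypothetical record shapes used by this sketch:
#   ("set", tx_id, key, value)   a change made inside a transaction
#   ("commit", tx_id)            the transaction was committed
#   ("checkpoint", snapshot)     committed state at the time of the checkpoint


def recover(log: List[tuple]) -> Dict[str, int]:
    """Rebuild committed state from the last checkpoint plus the log tail."""
    # Step 1: go back to the most recent checkpoint.
    start, state = 0, {}
    for index, record in enumerate(log):
        if record[0] == "checkpoint":
            start, state = index + 1, dict(record[1])

    # Step 2: find the transactions that committed after the checkpoint.
    tail = log[start:]
    committed = {record[1] for record in tail if record[0] == "commit"}

    # Step 3: roll forward, applying only changes from committed transactions.
    for record in tail:
        if record[0] == "set" and record[1] in committed:
            _, _, key, value = record
            state[key] = value
    return state


log = [
    ("set", "t1", "stock", 10),
    ("commit", "t1"),
    ("checkpoint", {"stock": 10}),
    ("set", "t2", "stock", 7),
    ("commit", "t2"),
    ("set", "t3", "stock", 0),  # server failed before t3 committed
]
recovered = recover(log)
```

In this example the change made by `t2` survives the restart, while the uncommitted change made by `t3` is discarded.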