Data Sharing makes a single copy of data available to multiple applications, users, and external business partners.
Why Is Data Sharing Important?
Data sharing provides controlled, trusted, and well-maintained copies of high-quality data to subscribers. It is more efficient and effective than letting internal and external organizations work from multiple uncontrolled copies of a data set.
Data Sharing Approaches
There are many ways to share data. Below are some common examples:
- File Transfer Protocol (FTP): Operating system folders containing data files can be shared with internal and external users through mechanisms such as FTP and SFTP. Access is typically secured with passwords or key-based authentication. This approach works well for sharing software patches or data files with customers.
- Cloud files: Google, Amazon, and Microsoft offer scalable file stores that are not constrained by physical volume sizes, making them ideal for file sharing. A consumer needs only login credentials and the URL of the file or folder.
- Web API: Streaming platforms such as Apache Kafka share data streams using a publish-and-subscribe model. An application programming interface (API) lets consuming applications receive records in near real time.
- Event-based services: Event management systems provide immediate access to data for applications such as gaming, stock trading, navigation, and emergency alerting. In these cases, changes are pushed to client systems through protocols such as SMS and MQTT.
- Download: Many data publishers, such as the US government, share data through simple web downloads. Download links can be embedded in page text or shared as URLs in emails.
- Database: Data can also be shared through a higher-level API such as SQL access. This saves subscribers the trouble of extracting the data and loading it into their own data warehouse, and it lets users point their BI dashboards at the shared data warehouse to analyze and visualize the data.
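The database approach in the last bullet can be sketched with Python's standard `sqlite3` module, using a local SQLite file as a stand-in for a shared data warehouse. The table and column names here are purely illustrative.

```python
import sqlite3

# A local SQLite database stands in for a shared data warehouse.
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, years_loyal INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (name, years_loyal) VALUES (?, ?)",
    [("Acme Corp", 7), ("Globex", 3)],
)

# A subscriber runs ordinary SQL against the shared store instead of
# extracting files and reloading them into a separate warehouse.
rows = conn.execute(
    "SELECT name FROM customers WHERE years_loyal >= 5"
).fetchall()
print(rows)  # -> [('Acme Corp',)]
```

The same query pattern is what a BI dashboard issues when it is pointed at a shared warehouse; only the connection string changes.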
Benefits of Data Sharing
Below are some of the many benefits of data sharing:
- Improves data governance by promoting the use of best-in-class data.
- Lowers data management costs by limiting the number of copies of data in circulation.
- Discourages siloed use of data by creating natural data bridges across lines of business or departments. Customer data, for example, should live in a single repository so every department can see how long a customer has been loyal and whether a billing issue affects them.
- Lets publishers of shared data retain sole write access to update and correct data as needed, so subscribers always have access to current, high-quality data.
- Democratizes data analysis by providing clean, high-quality data to citizen data analysts as well as data professionals.
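The sole-write-access point above can be illustrated with SQLite's read-only URI mode, a minimal sketch: the publisher opens the file normally and writes; a subscriber opens the same file with `mode=ro` and can read but not modify it. The file name is hypothetical, and a managed warehouse would enforce this with role-based grants rather than file modes.

```python
import os
import sqlite3
import tempfile

# The publisher creates and maintains the shared database.
path = os.path.join(tempfile.mkdtemp(), "shared.db")
publisher = sqlite3.connect(path)
publisher.execute("CREATE TABLE prices (symbol TEXT, price REAL)")
publisher.execute("INSERT INTO prices VALUES ('XYZ', 101.5)")
publisher.commit()

# A subscriber opens the same database read-only: reads succeed,
# but any attempt to write is rejected.
subscriber = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
latest = subscriber.execute("SELECT price FROM prices").fetchone()
print(latest)  # -> (101.5,)

try:
    subscriber.execute("INSERT INTO prices VALUES ('ABC', 50.0)")
except sqlite3.OperationalError as err:
    print("write rejected:", err)
```

Because only the publisher holds write access, every subscriber sees the same current, corrected copy of the data.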
Data Sharing With Actian
The Actian Data Platform is an ideal SQL-based data-sharing solution. Pre-built connectors to hundreds of data sources make data integration easy. Data can be stored and transmitted securely using strong encryption. Analytic and transaction-processing database services can be downloaded for self-managed instances or accessed through a subscription on a cloud service platform, including AWS, Azure, and Google Cloud.
Data publishers can also charge for their data. Refinitiv, an Actian customer, publishes high-quality stock information on its Elektron feed, which uses Actian Vector columnar databases as a back end to deliver data with latencies of 20 milliseconds or less.
The Clinical Trial Service and Epidemiological Studies Unit (CTSU) in the Nuffield Department of Population Health at the University of Oxford deployed the Actian X enterprise-grade hybrid database for next-generation operational analytics. With Actian X, CTSU researchers can run ad hoc analytic queries against large data volumes and get answers in minutes, even seconds, rather than days.
Actian DataConnect is built into the Actian Data Platform to help organizations prepare and share data. Data can be profiled and transformed, and data movement can be scheduled, using a simple visual design studio.