Blog | Actian Life | 2 min read

Actian Acknowledged as One of the Top Workplaces in Austin

Austin, TX Actian Office

When I moved to Austin 10 years ago from the California Bay Area, little did I know that I would one day be working at a company recognized as one of the Top Workplaces in Austin. This accolade was based on nominations and votes from the company’s own employees.

Actian Corporation came about through the consolidation of several companies worldwide between 2010 and 2013. In 2016, the company welcomed a new leadership team that was very employee-friendly and took various steps that benefited all of us in Austin.

One of the first steps was to treat all employees in the US equally, irrespective of which constituent company they came from. The second was establishing a modern office and workplace that gave everyone a reason to get up and come to work feeling happy every day. Friendly people made the workplace even more exciting, and the office became a welcoming place not just to work, but to enjoy the cafeteria and socialize. There were also areas in the office that greatly helped us collaborate.

When Covid-19 struck earlier this year, we all started to work from home. This was a great challenge for the company in general and for the Austin office in particular. At the company level, various teams were formed to ensure that we had a uniform corporate direction globally while following local laws and making decisions locally. All of us missed our workplace. Locally in Austin, we organized regular virtual team get-togethers where we would share work-from-home (WFH) challenges and the support we needed. Many employees said that they “felt heard,” and all “appreciated the help that the IT team put in place,” procuring additional monitors to make the WFH experience better. Virtual coffee meets and Friday virtual happy hours helped keep the community connected and social. The CEO started weekly all-hands meetings to ensure that everyone knew what was going on. Leadership was on full display, and all employees felt cared for.

The early steps taken from 2016 to 2019 made Actian a great place to work, and the events of 2020 and the company’s support through them have made it a Top Workplace. Check out our Press Release.


Blog | Data Intelligence | 7 min read

What is a Knowledge Graph and How Does it Enhance Data Catalogs?

Visual of a knowledge graph for data catalogs

Knowledge graphs have been part of our lives for quite some time, whether through personalized shopping recommendations on websites such as Amazon and Zalando, or through our favorite search engine, Google.

However, this concept still often challenges data and analytics managers, who struggle to aggregate and link their business assets in order to take advantage of them as these web giants do.

To support this claim, Gartner stated in their article “How to Build Knowledge Graphs That Enable AI-Driven Enterprise Applications” that:

“Data and analytics leaders are encountering increased hype around knowledge graphs, but struggle to find meaningful use cases that can secure business buy-in.”

In this article, we will define the concept of a knowledge graph by illustrating it with the example of Google and highlighting how it can empower a data catalog.

What is a Knowledge Graph Exactly?

According to GitHub, a knowledge graph is a type of ontology that depicts knowledge in terms of entities and their relationships in a dynamic, data-driven way, in contrast to static ontologies, which are very hard to maintain.

Here are other definitions of a knowledge graph by various experts: 

  • A “means of storing and using data, which allows people and machines to better tap into the connections in their datasets.” (Datanami)
  • A “database which stores information in a graphical format – and, importantly, can be used to generate a graphical representation of the relationships between any of its data points.” (Forbes)
  • “Encyclopedias of the Semantic World.” (Forbes)

Through machine learning algorithms, it provides structure for all your data and enables the creation of multilateral relationships across your data sources. This structure grows more fluid as new data is introduced, allowing more relationships to be created and more context to be added, which helps your data teams make informed decisions based on connections you may never have found otherwise.

The idea of a knowledge graph is to build a network of objects, and more importantly, create semantic or functional relationships between the different assets. 

Within a data catalog, a knowledge graph is therefore what represents the different concepts and links objects together through semantic or static links.
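
As a minimal illustration of this “network of objects with semantic relationships,” here is a sketch of a knowledge graph as entities plus typed edges in plain JavaScript. The node and relation names are invented for the example, not taken from any particular catalog product.

```javascript
// A knowledge graph in miniature: entities as nodes, typed relationships
// as edges. All identifiers are invented for this example.
const graph = {
  nodes: new Map(),
  edges: [],
  addNode(id, type, props = {}) {
    this.nodes.set(id, { id, type, ...props });
  },
  addEdge(from, relation, to) {
    this.edges.push({ from, relation, to });
  },
  // Everything connected to `id`, regardless of relation type or direction.
  related(id) {
    return this.edges
      .filter(e => e.from === id || e.to === id)
      .map(e => ({ relation: e.relation,
                   node: this.nodes.get(e.from === id ? e.to : e.from) }));
  },
};

// A dataset, its owner, and a business term, linked semantically.
graph.addNode('customers_table', 'dataset', { source: 'CRM' });
graph.addNode('jane', 'person', { role: 'data owner' });
graph.addNode('customer', 'business term');
graph.addEdge('customers_table', 'ownedBy', 'jane');
graph.addEdge('customers_table', 'describedBy', 'customer');

// Asking what surrounds the dataset returns its owner and its meaning.
console.log(graph.related('customers_table'));
```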

Google Example:

Google’s algorithm uses this system to gather and provide end users with information relevant to their queries.

Google’s knowledge graph contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects.

Their knowledge graph enhances Google Search in three main ways:

  • Find the right thing: Search not only based on keywords but on their meanings.
  • Get the best summary: Collect the most relevant information from various sources based on the intent.
  • Go deeper and broader: Discover more than you expected thanks to relevant suggestions.

How do Knowledge Graphs Empower Data Catalog Usages?

Embedded in a data catalog, knowledge graphs can benefit your enterprise’s data strategy through:

Rich and In-Depth Search Results

Today, many search engines use multiple knowledge graphs in order to go beyond basic keyword-based searching. Knowledge graphs allow these search engines to understand concepts, entities, and the relationships between them, as the sketch after this list illustrates. Benefits include:

  • The ability to provide deeper and more relevant results, including facts and relationships, rather than just documents.
  • The ability to form searches as questions or sentences — rather than a list of words.
  • The ability to understand complex searches that refer to knowledge found in multiple items using the relationships defined in the graph.
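
A tiny sketch of the difference between plain keyword search and relationship-aware search, using invented node and relation names:

```javascript
// Keyword search matches names only; a knowledge graph search also follows
// relationships, so querying a business term surfaces the datasets linked
// to it. All names here are illustrative.
const nodes = ['customer', 'customers_table', 'orders_table'];
const edges = [
  { from: 'customers_table', relation: 'describedBy', to: 'customer' },
  { from: 'orders_table',    relation: 'references',  to: 'customers_table' },
];

function graphSearch(term) {
  const hits = nodes.filter(n => n.includes(term));
  // Expand each keyword hit one hop through the graph's relationships.
  const expanded = hits.flatMap(hit =>
    edges
      .filter(e => e.from === hit || e.to === hit)
      .map(e => ({ match: e.from === hit ? e.to : e.from, via: e.relation })));
  return { keywordHits: hits, relatedHits: expanded };
}

// "customer" matches two nodes by keyword and also pulls in 'orders_table'
// through the 'references' edge -- a result plain keyword search misses.
console.log(graphSearch('customer'));
```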

Optimized Data Discovery

Enterprise data moves from one location to another at the speed of light and is stored in various data sources and storage applications. Employees and partners access this data from anywhere and at any time, so identifying, locating, and classifying your data in order to protect it and gain insights from it should be a priority.

The benefits of knowledge graphs for data discovery include:

  • A better understanding of enterprise data, where it is, who can access it and where, and how it will be transmitted.
  • Automatic data classification based on context.
  • Risk management and regulatory compliance.
  • Complete data visibility.
  • Identification, classification, and tracking of sensitive data.
  • The ability to apply protective controls to data in real time based on predefined policies and contextual factors.
  • The ability to adequately assess the full data picture.

On one hand, this helps implement the appropriate security measures to prevent the loss of sensitive data and avoid devastating financial and reputational consequences for the enterprise. On the other, it enables teams to dig deeper into the data context and identify the specific items that answer their questions.

Smart Recommendations

As mentioned in the introduction, recommendation services are now a familiar component of many online stores, personal assistants and digital platforms.

These recommendations need to take a content-based approach. Within a data catalog, machine learning capabilities combined with a knowledge graph can detect certain types of data, apply tags, or apply statistical rules to data in order to deliver effective, smart asset suggestions.

This capacity is also known as data pattern recognition. It refers to being able to identify similar assets and rely on statistical algorithms and ML capabilities that are derived from other pattern recognition systems.

This data pattern recognition system helps data stewards with their metadata management:

  • Identify duplicates and copy metadata
  • Detect logical data types (emails, city, addresses, and so on)
  • Suggest attribute values (recognize documentation patterns to apply to a similar object or a new one)
  • Suggest links – semantic or lineage links
  • Detect potential errors to help improve the catalog’s quality and relevance

The idea is to use some techniques that are derived from content-based recommendations found in general-purpose catalogs. When the user has found something, the catalog will suggest alternatives based both on their profile and pattern recognition. 
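
As a crude stand-in for the pattern-recognition idea, the sketch below scores asset similarity by tag overlap (Jaccard index) and suggests look-alike assets above a threshold. The assets, tags, and threshold are invented for illustration; production catalogs use far richer statistical models.

```javascript
// Content-based similarity over asset tags: a simple proxy for the
// statistical pattern recognition described above.
function jaccard(a, b) {
  const A = new Set(a), B = new Set(b);
  const shared = [...A].filter(t => B.has(t)).length;
  return shared / (A.size + B.size - shared);
}

const assets = [
  { name: 'crm_customers',   tags: ['customer', 'email', 'pii'] },
  { name: 'web_signups',     tags: ['customer', 'email', 'marketing'] },
  { name: 'warehouse_stock', tags: ['inventory', 'logistics'] },
];

function suggestSimilar(target, catalog, threshold = 0.3) {
  return catalog
    .filter(a => a.name !== target.name)
    .map(a => ({ name: a.name, score: jaccard(target.tags, a.tags) }))
    .filter(a => a.score >= threshold)
    .sort((x, y) => y.score - x.score);
}

// Suggests 'web_signups' (two shared tags) but not 'warehouse_stock'.
console.log(suggestSimilar(assets[0], assets));
```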

Some Data Catalog Use Cases Empowered by Knowledge Graphs

  • Gathering assets that have been used or related to causes of failure in digital projects.
  • Finding assets with particular interests aligned with new products for the marketing department.
  • Generating complete 360° views of people and companies in the sales department.
  • Matching enterprise needs to people and projects for HR.
  • Finding regulations relating to specific contracts and investments assets in the finance department.

Conclusion

With the never-ending increase of data in enterprises, organizing your information without a strategy means being unable to stay competitive and relevant in the digital age. Ensuring that your data catalog has an enterprise knowledge graph is essential for avoiding the dreaded ‘black box’ effect.

Through a knowledge graph combined with AI and machine learning algorithms, your data will have more context, enabling you not only to discover deeper and more subtle patterns but also to make smarter decisions.

For more insights on what a knowledge graph is, here is a great article by BARC analyst Timm Grosser: “Linked Data for Analytics?”

Start Your Data Catalog Journey

The Actian Data Intelligence Platform is a 100% cloud-based solution, available anywhere in the world with just a few clicks. By choosing the Actian Data Intelligence Platform Data Catalog, you can control the costs associated with implementing and maintaining a data catalog while simplifying access for your teams.

The automatic feeding mechanisms, as well as the suggestion and correction algorithms, reduce the overall costs of a catalog and guarantee your data teams quality information in record time.


GigaOm published a comprehensive evaluation of leading cloud data warehouse services based on performance and cost. Offerings analyzed included Snowflake, Amazon Redshift, Microsoft Azure Synapse, Google BigQuery, and our very own Actian Data Platform.

There are many intriguing results included in the report, but GigaOm reached one indisputable conclusion: “In a representative set of corporate-complex queries from TPC-H standard, Actian Data Platform consistently outperformed the competition.”

To put “outperformed” in more concrete terms, Actian Data Platform was 8.5X faster than Snowflake in a test of 5 concurrent users. In terms of price performance, the advantage over Snowflake was 6.4X.

GigaOm chart: price-performance comparison

Actian Data Platform Was Built for Performance Out-of-the-Box

So what’s the secret behind Actian Data Platform’s superior performance? There is a simple, fundamental explanation: Actian Data Platform was built from the ground up to deliver unrivaled performance on commodity infrastructure. Its original design goal was to make the most of every CPU clock cycle, every byte of memory, and every I/O operation, and ensuring high performance continues to be a priority for us. This efficient design is the reason why, even with the limitless resources of the cloud, you won’t see your costs balloon as you increase concurrent users or data volume.

How specifically does Actian Data Platform deliver best-in-class analytics performance without the need for tuning? It comes down to a combination of the following eight factors. Vendors such as Snowflake may offer a few of these capabilities, such as Vector processing, but the unique combination creates a powerful compounding effect (a small code sketch of the columnar idea follows the list):

  1. Optimizing the Use of Microprocessor Cores to run multiple data operations in parallel during a single CPU cycle. This is known as Vector processing. Traditional scalar architectures typically consume many more CPU cycles to compute the same calculations, which impacts overall throughput.
  2. Taking Full Advantage of Multi-Core CPUs – Actian Data Platform can perform Vector processing across all available cores, which maximizes concurrency, parallelism, and system resource utilization.
  3. Processing Data Using the CPU’s On-Chip Cache is faster and closer to where operations are performed and therefore optimizes performance. Our competitors tend to use DRAM for query execution cache, which is far slower.
  4. Using Advanced Compression – Typically With a 5:1 Compression Ratio – Actian Data Platform’s compression algorithms are designed for maximum efficiency, particularly in decompression, yet still deliver about a 5:1 compression ratio. Compression is handled automatically, so no tuning is required.
  5. Optimizing I/O – Actian Data Platform is a pure columnar implementation. The data lives its life in columnar format, which results in I/O efficiency.
  6. Using Patented Technology to Maintain Indexes Automatically so that an indexing strategy is not necessary.
  7. MPP Architecture parallelizes query execution within and across nodes to power through business workloads regardless of size and complexity.
  8. Real-Time Updates that enable operational insights with zero performance penalty, made possible by Actian Data Platform’s patented positional delta trees.

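To make the columnar point (factor 5) concrete, here is a minimal, illustrative Node.js sketch of why scanning one contiguous column beats walking whole rows for an aggregate. This is not Actian’s engine or code, just the storage-layout idea; the data and timings are invented for demonstration.

```javascript
// Row layout vs. column layout for a simple aggregate (sum of one field).
// Purely illustrative -- not Actian code.
const N = 1_000_000;

// Row store: an array of row objects; summing one field drags every row in.
const rows = Array.from({ length: N }, (_, i) => ({ id: i, price: i % 100, qty: i % 7 }));

// Column store: the price column alone, as one contiguous typed array.
const priceCol = Float64Array.from({ length: N }, (_, i) => i % 100);

console.time('row scan');
let rowSum = 0;
for (const r of rows) rowSum += r.price;
console.timeEnd('row scan');

console.time('column scan');
let colSum = 0;
for (let i = 0; i < priceCol.length; i++) colSum += priceCol[i];
console.timeEnd('column scan');
// The column scan touches a fraction of the memory, which is the same
// reason columnar engines make better use of CPU caches and I/O (factor 3).
```
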
All this adds up to blazing-fast analytics performance, resulting in faster iterations on data models, quicker root cause analysis, and, ultimately, a data-driven organization.

When Cost is More Important Than Performance

If you’re satisfied with the current performance of your data warehouse, Actian Data Platform can deliver that same level of performance while enabling you to dial back your spend on compute resources—which can immediately translate into considerable cost savings. Conversely, if you can benefit from increased performance, Actian Data Platform can give you much more at a much lower cost. In other words, you have two levers to play with – price and performance – and Actian Data Platform enables you to achieve the lowest cost of ownership for the level of performance your business demands.


Blog | Data Intelligence | 6 min read

What is Data Literacy? Tips on Becoming Data Literate.

Data literacy definition

Data literacy has been a trending topic for a few years, and it is recognized as a vital skill for enterprises seeking to fully transform their organizations and become data-driven.

While technology can be a point of failure if not handled properly, it is often not the most important roadblock to progress. In fact, in Gartner’s annual Chief Data Officer survey, the top roadblocks to success were cultural: the human factor, skills, and data literacy.

However, many of these enterprises still struggle to understand what data literacy truly is, or know how to reshape their cultural organization into a data literate one.

In its 2020 survey, New Vantage Partners observed that:

“Companies continue to focus on the supply side for data and technology, instead of increasing demand for them by business executives and employees. It’s a technology push rather than a pull from humans who want to make more data-based decisions, develop more intelligent business processes, or embed data and analytics into more products and services.”

In this article, we’d like to shed light on what data literacy is and why it is important for your enterprise, and share tips on how to become a data-literate organization.

The Definition of Data Literacy

Just as literacy means to have “the ability to read for knowledge, write coherently and think critically about printed material,” data literacy is the ability to consume for knowledge, produce coherently, and think critically about data.

In 2019, Gartner defined data literacy as: “the ability to read, write, and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied, and the ability to describe the use case, application, and resulting value.”

So, based on these definitions, we can conclude that data literate people can, among other things:

  • Make analyses using data.
  • Use data to communicate ideas for new services, products, workflows or even strategies.
  • Understand dashboards (visualizations for example).
  • Make data-based decisions rather than based on intuition.

In summary, being data literate signifies having the set of skills to be able to effectively use data individually and collaboratively. 

Why is Data Literacy Important?

Gartner predicted that, by 2020, 80% of organizations would initiate deliberate competency development in the field of data literacy to overcome extreme deficiencies, and that 50% of organizations would still lack sufficient AI and data literacy skills to achieve business value.

The increasing volume and variety of data that businesses are flooded with on a daily basis require employees to employ higher order skills such as critical thinking, problem-solving, computational, and analytical thinking using data. And as organizations become more data-driven, poor data literacy will become an inhibitor to growth. In fact, in their survey “The Human Impact of Data Literacy”, Accenture found that:

  • 75% of employees are uncomfortable when working with data.
  • 1/3 of employees have taken a sick day from work due to headaches working with data.
  • A lack of data literacy costs employers about five working days per employee each year, translating to billions of dollars in lost productivity.

Furthermore, a Deloitte survey conducted in 2019 found that 67% of executives are not comfortable accessing or using data resources.

Data uplifts the success of organizations in creating both physical and digital business opportunities—improving accuracy, increasing efficiency and augmenting the ability of the workforce to deliver greater value. It is therefore important and essential to be able to interpret, analyze and communicate findings on data to be able to uncover the secrets to successful business and competitive advantage. 

How to Become Data Literate

In order to build a successful data literacy program, here are some tips to help your organization on your data fluency journey:

Develop a Data Literacy Vision and Associated Goals

Any organization investing in data and AI capabilities should have already undertaken the creation of a data vision and roadmap. In the process, data and IT leaders will have identified and prioritized the areas of the business where data can produce value.

These steps are critical to creating a data-literate organization and reducing the friction around understanding and using data.

Management and HR need to communicate across the entire enterprise that data is a strategic asset that creates value. Using the data vision and roadmap as context, they should be able to explain to all employees why data matters, how it creates value, and how it impacts the business.

Without a clear vision for data and a plan to create value from it, employees will not understand why they are being asked to make these efforts and, as a consequence, will lack the motivation to do so.

In addition, a data literacy vision should detail desirable skills, abilities, and the level of literacy required for different business units and roles.

Business, IT, and HR leaders need to create a framework to achieve literacy goals, measure progress, and maintain data literacy over time. This includes deciding what skills are required, how to measure and track skills development, and to what degree different parts of the organization should use data in achieving their strategic objectives.

Assess Workforce Skills

Data literacy skills should ideally be assessed during the recruitment process for new hires. That way, HR will already know what kind of data literacy learning to offer the new hire over time.

However, for already existing employees, HR can map current employee data skills based on the roles and responsibilities provided in the above steps, and determine where there are gaps.

Create Data Literacy Modules

According to Qlik, only 34% of firms provide data literacy training.

In most cases, the HR department is responsible for helping business managers identify and track areas of improvement and development opportunities for employees. HR is also in charge of organizing how specific organizational skills are learned and how much time is devoted to learning them. It’s no different when it comes to becoming data literate.

Once HR and managers have a general idea of an employee’s or a business unit’s strengths and weaknesses in data skills, HR can begin to construct personalized and efficient learning programs that allow employees to upskill in data and analytics responsibilities.

Track, Measure, and Repeat

A successful data literacy program takes time to put in place. Business leaders must allow their employees to invest the time required to become data literate and improve their skills.  Over time, data thinking will become part of the corporate culture.

Finally, it’s important to communicate data literacy progress across the enterprise and on an individual basis. Tracking and communicating on the progress is key to continuing the evaluation of your organization’s data roadmap, vision and literacy.

This type of long-term planning and investment in educating the entire organization about how to access, understand and analyze data on the job will accelerate the efforts and investment that data science, machine learning and AI teams are making.

The results of data literacy efforts will allow organizations to finally be able to embrace and leverage data across the enterprise and for maximum value.


Blog | Data Intelligence | 8 min read

Air France: Their Big Data Strategy in a Hybrid Cloud Context

Air France big data

Air France-KLM is the leading group in terms of international traffic departing from Europe. The airline is a member of the SkyTeam alliance, which consists of 19 airlines and offers access to a global network of more than 14,500 daily flights to over 1,150 destinations worldwide. In 2019, Air France represented:

  • 104.2 million passengers.
  • 312 destinations.
  • 119 countries.
  • 546 aircraft.
  • 15 million members enrolled in their “Flying Blue” loyalty program*.
  • 2,300 flights per day*.

At Big Data Paris 2020, Eric Poutrin, Lead Enterprise Architect Data Management & Analytics at Air France, explained how the airline business works, how Air France’s big data infrastructure began, and what their data architecture looks like today in a hybrid cloud context.

How Does an Airline Company Work?

Before we start talking about data, it is imperative to understand how an airline company works, from the creation of a flight path to the aircraft’s landing.

Before planning a route, the first step for an airline such as Air France is to have a flight schedule. Note that in times of health crises, they are likely to change quite frequently. Once the flight schedule is set up, there are three totally separate flows that activate for a flight to have a given departure date and time: 

  • The flow of passengers, which represents different forms of services to facilitate the traveler’s experience along the way, from buying tickets on their various platforms (web, app, physical) to the provision of staff or automatic kiosks in various airports to help travelers check in, drop off their luggage, etc.
  • The flow of crew management, with profiles adapted to the qualifications required to operate or pilot the aircraft, as well as the management of flight attendant schedules.
  • The engineering flow which consists of getting the right aircraft with the right configuration at the right parking point.

However, Eric tells us that all this… is in an ideal world:

The “product” of an airline goes through the customer, so all of the hazards are visible. And they all impact each other’s flows! So the closer you get to the date of the flight, the more critical these hazards become.

Following these observations, 25 years ago now, Air France decided to set up a “service-oriented” architecture, which allows, among other things, the notification of subscribers in the event of hazards on any flow. These real-time notifications are pushed either to agents or passengers according to their needs: prevention of technical difficulties (an aircraft breaking down), climate hazards, prevention of delays, etc.

“The objective was to bridge the gap between a traditional analytical approach and a modern analytical approach based on omni-present, predictive and prescriptive analysis on a large scale” affirmed Eric.
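
To picture the service-oriented notification pattern Eric describes, here is a toy publish/subscribe sketch in Node.js. The flows, topic names, and event fields are invented for illustration; Air France’s actual systems are of course far richer than an in-process event emitter.

```javascript
// Toy version of the notification pattern: each flow subscribes to the
// flights it cares about and is pushed hazard events as they happen.
const { EventEmitter } = require('events');
const bus = new EventEmitter();

// The crew-management flow reacts to hazards on flight AF123...
bus.on('hazard:AF123', e => console.log(`[crew] adjusting schedules: ${e.reason}`));
// ...and the passenger flow notifies travelers of the same event.
bus.on('hazard:AF123', e => console.log(`[passengers] alert pushed: ${e.reason}`));

// Engineering publishes once; every subscribed flow is notified, which is
// how one hazard can ripple through otherwise separate flows.
bus.emit('hazard:AF123', { reason: 'aircraft swap, departure delayed 40 min' });
```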

Air France’s Big Data Journey

The Timeline

In 1998, Air France began their data strategy by setting up an enterprise data warehouse on the commercial side, gathering customer, crew and technical data that allowed the company’s IT teams to build analysis reports. 

Eric tells us that in 2001, following the SARS (Severe Acute Respiratory Syndrome) health crisis, Air France had to redeploy their aircraft after the ban on incoming flights to the United States. It was the firm’s data warehouse, with its machine learning and artificial intelligence algorithms, that allowed them to find other sources of revenue. This way of working with data served the firm well for 10 years and even allowed it to overcome several other difficulties, including the tragedy of September 11, 2001, and the crisis of rising oil prices.

In 2012, Air France’s data teams decided to implement a Hadoop platform in order to perform predictive or prescriptive analysis (depending on individual needs) in real time, as the data warehouse could no longer meet these new needs or handle the high volume of information to be managed. Within just a few months of implementing Hadoop, Kafka, and other new-generation technologies, the firm was able to obtain much “fresher” and more relevant data.

Since then, the teams have been constantly improving and optimizing their data ecosystem in order to stay up to date with new technologies and thus, allow data users to work efficiently with their analysis.

Air France’s Data Challenges

During the conference, Eric also presented the firm’s data challenges in the implementation of a data strategy:

  • Delivering a reliable analytics ecosystem with quality data.
  • Implementing technologies adapted for all profiles and their use cases regardless of their line of business.
  • Having an infrastructure that supports all types of data in real time.

Air France was able to resolve some of these issues with the implementation of a robust architecture (which notably enabled the firm to withstand the COVID-19 crisis), the setting up of dedicated teams, the deployment of applications, and security structures, particularly regarding the GDPR and other regulations.

However, Air France-KLM has not finished working to meet their data challenges. With ever-increasing volumes of data and a growing number of data and business users, managing data flows across the enterprise’s different channels and governing that data is a constant effort:

We must always be at the service of the business, and as people and trends change, it is imperative to make continuous efforts to ensure that everyone can understand the data.

Air France’s Unified Data Architecture

The Unified Data Architecture (UDA) is the cornerstone of Air France’s data ecosystem. Eric explains that there are four types of platforms:

The Data Discovery Platform

Separated into two different platforms, these are the applications of choice for data scientists and citizen data scientists. Among other things, they allow users to:

    • Extract the “knowledge” from the data.
    • Process unstructured data (text, images, voice, etc.).
    • Have predictive analytics support to understand customer behaviors.

A Data Lake

Air France’s data lake is a logical instance and is accessible to all the company’s employees, regardless of their profession. However, Eric specifies that the data is well secured: “The data lake is not an open bar at all! Everything is done under the control of the data officers and data owners.” The data lake:

    • Stores structured and unstructured data.
    • Combines the different data sources from various businesses.
    • Provides a complete view of a situation, a topic or a data environment.
    • Is very scalable.

“Real Time Data Processing” Platforms

To operate on the data, Air France has implemented eight real-time data processing platforms to meet the needs of each “high priority” business use case. For example, they have platforms for predictive maintenance, customer behavior insights, and process optimization for stopovers.

Eric confirms that when an event or hazard occurs, their platform is able to push recommendations in “real time” in just 10 seconds.

Data Warehouses

As mentioned above, Air France had also already set up data warehouses to store external data, such as customer and partner data, and data from operational systems. These data warehouses allow users to query these datasets in complete security, and they are an excellent communication vector for explaining the data strategy across the company’s different business lines.

The Benefits of Implementing a Hybrid Cloud Architecture

Air France’s initial considerations regarding the move to the cloud were:

  • Air France KLM aims to standardize its calculation and storage services as much as possible.
  • Not all data is eligible to leave Air France’s premises due to regulations or sensitive data.
  • All the tools already used in UDA platforms are available both on-premise and in the public cloud.

Eric says that a hybrid cloud architecture would give the firm more flexibility to meet today’s challenges:

Putting our UDA on the Public Cloud would give greater flexibility to the business and more options in terms of data deployment.

According to Air France, here is the checklist of best practices before migrating to a Hybrid Cloud:

  • Check if the data has a good reason to be migrated to the Public Cloud.
  • Check the level of sensitivity of the data (according to internal data management policies).
  • Verify compliance with the UDA implementation guidelines.
  • Verify data stream designs.
  • Configure the right network connection.
  • For each implementation tool, choose the right level of service management.
  • For each component, evaluate the locking level and exit conditions.
  • Monitor and forecast possible costs.
  • Adopt a security model that allows Hybrid Cloud security to be as transparent as possible.
  • Extend data governance in the Cloud.

Where is Air France Today?

It’s clear that the COVID-19 crisis has completely changed the aviation sector. Every day, Air France has to take the time to understand new passenger behavior and adapt flight schedules in real time, in line with the travel restrictions put in place by various governments. By the end of summer 2020, Air France will have served nearly 170 destinations, or 85% of their regular network.

Air France’s data architecture has therefore been a key catalyst in the airline’s recovery:

A huge thanks to our business users (data scientists) who, every day, try to optimize services in real time so that they can understand how passengers are behaving in the midst of a health crisis. Even if we are working on artificial intelligence, the human factor is still an essential resource in the success of a data strategy.


Blog | Data Management | 3 min read

Speedier Interactions With Actian Zen From Node.js

Actian Zen

Make Faster Calls Using the High-Speed Btrieve 2 API

Developers building real-time, data-intensive edge applications are increasingly turning to Node.js. It’s not itself a programming language but an open-source, multi-platform runtime environment that leverages JavaScript and its ecosystem, and it turns out to be quite well suited for today’s data streaming and JSON API applications.

If you’re using Actian Zen as your edge data management platform — and naturally we think you should — you’ll find that Node.js pairs well with Zen. However, there’s more than one way to pair them. You can easily interact with Actian Zen from Node.js using SQL via ODBC, for example, and when the complexity of your interactions warrants the use of SQL that’s a perfect option.

But SQL via ODBC isn’t the fastest way to interact with Zen, and when you need speed there’s a better option: from Node.js you can access Zen data via the Btrieve 2 API. Let’s talk conceptually about how you can do this, and then we’ll dive into the practicalities. You’ll need certain software components to facilitate interaction with the Btrieve 2 API – including PHP, Python3, C++, and a few others that are easily downloaded – but let’s skip over the setup for now and focus on how you can speed up access to the Zen data you need.

Using the Btrieve 2 API

Conceptually, your JavaScript program is going to push a call through a special Node.js interface to the Btrieve 2 API, which is a C++ library that interacts directly with the Zen database engine.

From the standpoint of a JavaScript program, the interactions are relatively straightforward. Here’s the procedural logic, sketched in code after the list:

  • Define the libraries and components to be loaded.
  • Set up any variables to be used.
  • Define the name, location, and record characteristics of the data file to hold the results of a query.
  • Instantiate an instance of the BtrieveClient class, which is used for performing engine-wide operations such as creating, deleting, opening, and closing files.
  • Prepare information defining the key segment.
  • Set the created key segment information into the index attribute.
  • Create a file attributes object and set the fixed record length.
  • Create a new Btrieve file based on the information set (the BtrieveFile object is a class that handles Btrieve data files).
  • Open the file.
  • Perform the database operations that your application requires.
  • Close the Btrieve File.
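
Before downloading the real sample, here is a rough, hypothetical sketch of how those steps might look in JavaScript. The module name and every method call below are assumptions made for illustration only; the exact classes and signatures are in the Zen SDK documentation and in the sample file referenced next.

```javascript
// Hypothetical sketch of the procedure above -- not the verbatim Zen API.
// Module name and method signatures are assumed; consult the Zen SDK docs.
const btrieve = require('btrievejs'); // assumed module name

// BtrieveClient handles engine-wide operations (create/delete/open/close).
const client = new btrieve.BtrieveClient();

// Describe the data file: a fixed 10-byte record, i.e., an 8-byte timestamp
// plus a 2-byte integer, as in the sample application described below.
const fileAttrs = new btrieve.BtrieveFileAttributes();
fileAttrs.SetFixedRecordLength(10); // assumed setter

// Key-segment and index-attribute setup elided; see the sample for details.

client.FileCreate(fileAttrs, 'sensor.btr');  // assumed signature
const file = new btrieve.BtrieveFile();      // handles Btrieve data files
client.FileOpen(file, 'sensor.btr');         // assumed signature

// ...perform whatever database operations your application requires...

client.FileClose(file);
```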

You can download a sample .js file here that will enable you to see the logic in action. In operation, the 43-line sample application creates a Btrieve file and populates it with 10,000 10-byte records (each record consisting of an 8-byte timestamp and a 2-byte integer that, in this instance, might represent input from, say, an IoT sensor). The sample program also stores the timestamp index of every 200th record for later use and, ultimately, extracts the last written record from the data file and displays the value recorded in that record. Naturally, your use case may be far more involved, but you’ll see how easy it is to create JavaScript that provides high-performance interaction with Zen.

Putting the Sample Through its Paces

Care to run the aforementioned .js file and experience the performance of the Btrieve 2 API for yourself? Give it a try! Net-net, Node.js and Zen provide a powerful array of options when it comes to developing mobile and IoT applications.


Blog | Data Intelligence | 4 min read

What is the Difference Between a Data Owner and a Data Steward?

There are many different definitions associated with data management and data governance on the internet. Moreover, depending on the company, these definitions and responsibilities can vary significantly. To clarify the situation, we’ve written this article to shed light on these two profiles and establish their potential complementarity.

Above all, we firmly believe that there is no idyllic or standard framework. These definitions are specific to each company because of their organization, culture, and their “legacy”.

Data Owners and Data Stewards: Two Roles With Different Maturities

The recent appointment of CDOs was largely driven by the digital transformations undertaken in recent years: mastering the data life cycle from its collection to its value creation. To try to achieve this, a simple – yet complex – objective has emerged: first and foremost, to know the company’s information assets, which are all too often siloed.

Thus, the first step for many CDOs was to reference these assets. Their mission was to document them from a business perspective as well as the processes that have transformed them, and the technical resources to exploit them.

This founding principle of data governance was also evoked by Christina Poirson, CDO of Société Générale, during a roundtable discussion at Big Data Paris 2020. She explained the importance of knowing your data environment and the associated risks in order to ultimately create value. During her presentation, Christina Poirson expanded on the role of the Data Owner and the challenge of sharing data knowledge. As part of the business, Data Owners are responsible for defining their datasets as well as their uses and their quality level, without being called into question:

“The data in our company belongs either to the customer or to the whole company, but not to a particular BU or department. We manage to create value from the moment the data is shared”. 

It is evident that the role of “Data Owner” has been present in organizations longer than that of the “Data Steward.” Data Owners are stakeholders in the collection, accessibility, and quality of datasets. We qualify a Data Owner as the person in charge of the final data. For example, a marketing manager can take on this role in the management of customer data. They thus have the responsibility and duty to control its collection, protection, and uses.

More recently, the democratization of data stewardship has led to the creation of dedicated positions in organizations. Unlike the Data Owner, the Data Steward is more widely involved in a challenge that has been regaining popularity for some time now: data governance.

In our article “Who are data stewards?”, we explain more about this profile, which is involved in the referencing and documentation phases of enterprise assets (we are talking about data, of course!) to simplify their comprehension and use.

Data Steward and Data Owners: Two Complementary Roles?

In reality, companies do not always have the means to open new positions for Data Stewards. In an ideal organization, the complementarity of these profiles could tend towards:

A Data Owner is responsible for the data within their perimeter in terms of its collection, protection, and quality. The Data Steward is then responsible for referencing and aggregating the information, definitions, and any other business needs to simplify the discovery and understanding of these assets.

Let’s take the example of the quality level of a dataset. If a data quality problem occurs, you would expect the Data Steward to report the problems encountered by data consumers to the Data Owner, who is then responsible for investigating and offering corrective measures.

To illustrate this complementarity, Chafika Chettaoui, CDO at Suez – also present at the Big Data Paris 2020 roundtable – confirms that they added another role in their organization: the Data Steward. According to her, at Suez the Data Steward is the person who makes sure that the data flows work. She explains:

“The Data Steward is the person who will lead the so-called Data Producers (the people who collect the data in the systems), make sure they are well trained and understand the quality and context of the data to create their reporting and analysis dashboards. In short, it’s a business profile, but with real data valence and an understanding of data and its value”. 

To conclude, two notions differentiate the two roles: the Data Owner is “accountable for” data, while the Data Steward is “responsible for” the day-to-day data activity.


What is a modern data platform? When we ask this question, we get two different groups of answers. The first group centers on the technology needed for data processing in today’s business environment; answers in this group mention things like cloud services, containers, and on-prem deployments. The second group of answers involves sources of data, types of data, and the management of that data; this group is more focused on using the data that is collected. The fact is that both groups need to work together to provide a modern data platform.

Modern Data Platform Technology

The modern data platform has to deal with three different aspects of business data.

1. Volume – This is perhaps the easiest aspect to deal with, using storage systems that utilize cloud technologies. Products like the Actian Data Platform can be delivered on hybrid systems that include private and public cloud service providers, including Amazon Web Services, Microsoft Azure, and Google Cloud. These platforms all offer the capability to grow to support increasing volumes of data and, in many cases, the ability to retain that data for long periods.

2. The Variety of Data Types and Input Sources – Most people have multiple digital devices and interact with multiple input sources. A single individual may use a mobile phone, a tablet, and a computer to interact with a given business. Beyond direct interaction, inputs may come from social media applications and mobile phone apps in the form of structured or semi-structured data. One of the keys to a successful modern data platform is reliance on standard protocols rather than proprietary connections.

3. The Velocity of Data Accrual – Because data arrives from a variety of sources, it tends to arrive quickly as well. The consolidation of disparate data sources is another key to the modern data platform. It is difficult at best to bring all of the data together into a single repository or format, so virtual unification through data management and operations is necessary. One way to accomplish this virtual consolidation is through adaptive indexing and metadata use. Replacing or augmenting traditional taxonomies with faceted classifications enables the data to be searched and organized in multiple ways, as sketched below. This enables different kinds of analytics and helps organizations understand their data.
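
As a tiny illustration of faceted classification, the sketch below tags each dataset with several metadata facets and lets the same pool be sliced by any of them. The facet names and records are invented for the example.

```javascript
// Faceted classification in miniature: instead of one fixed taxonomy, each
// dataset carries metadata facets, and the same pool can be sliced by any
// of them -- multiple "ways in" to the same data.
const catalog = [
  { name: 'clickstream', source: 'web',    format: 'semi-structured', domain: 'marketing' },
  { name: 'orders',      source: 'erp',    format: 'structured',      domain: 'sales' },
  { name: 'app_events',  source: 'mobile', format: 'semi-structured', domain: 'marketing' },
];

// One generic filter serves every facet.
const byFacet = (facet, value) => catalog.filter(d => d[facet] === value);

console.log(byFacet('domain', 'marketing'));  // slice by business domain
console.log(byFacet('format', 'structured')); // slice by data format
```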

Beyond the technology is the need for a business to understand the data that they have, what additional data they might need, and how all of that data can be used.

Data Usage

The goal for any organization is data-driven insight into business operations and business needs. Those insights are only achievable with data pipelines that cleanse, format, and organize the data. The data pipeline is fed by business data operations, which are in turn fed by raw data collection and data repositories. That data should be accessed via standard protocols for things like workflows, data quality measures, and data governance.

The insights derived from the process need, in turn, to be distributed to the appropriate consumers in order to provide value. That distribution can be in the form of some combination of reports, service layer APIs for automated action, and mobile device alerts. The delivery of value based on the processes of a modern data platform needs to align with the needs of the users of the processed data. Dashboards and reports on things like financial and operational performance can be actionable data products. That means that reports and dashboards are important, but so also is information in a form that takes a step beyond reporting. For example, linking regulatory or previously stored data with current data would be useful when processing invoice data in a financial system.

From a business perspective, the types of data are an important aspect of the data platform. Customer data was ranked in a recent survey as the most important data in a data warehouse. Businesses want to turn prospects into customers, customers into loyal customers, and loyal customers into product advocates. To do that, good customer data collection and processing is necessary in order to build aggregated customer data that identifies customer groups and segments.

Best practices for managing customer data include:

  • Each customer has a single source record, no matter how many data sources there are (illustrated in the sketch after this list).
  • Access to customer data should have standard governance for storage, retrieval, and usage.
  • Data that is not relevant to the business relationship should not be stored or processed.
  • Customer privacy needs to be protected.
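
As a minimal illustration of the single-source-record practice, the sketch below merges customer rows from several hypothetical systems into one golden record per normalized email. Real master-data tools use far richer matching and survivorship rules.

```javascript
// Single source record in miniature: rows about the same customer arrive
// from several systems and are merged into one golden record, keyed on a
// normalized email. All systems and rows are invented for illustration.
const sources = [
  { system: 'crm',  email: 'Ann@Example.com', name: 'Ann Lee', phone: null },
  { system: 'web',  email: 'ann@example.com', name: null,      phone: '555-0100' },
  { system: 'shop', email: 'bo@example.com',  name: 'Bo Chan', phone: null },
];

const golden = new Map();
for (const row of sources) {
  const key = row.email.toLowerCase();          // normalize the match key
  const current = golden.get(key) ?? { email: key };
  // Survivorship rule: the first non-null value wins for each field.
  for (const [field, value] of Object.entries(row)) {
    if (field !== 'system' && value != null && current[field] == null) {
      current[field] = value;
    }
  }
  golden.set(key, current);
}

console.log([...golden.values()]); // two customers, each a single record
```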

The modern data platform needs to manage multiple types of data. In the survey, transactional data is listed as the second most important type of data. Transactional data is often the result of customer interactions, such as product purchases, product returns, payments, subscriptions to newsletters and other recurring information, and where applicable, donations. These types of data often have legal significance in addition to business significance. Other common types of data include operational data, contact center data, marketing data, and reservations.

Once a business has decided on what data is important to collect, then the data platform needs to be deployed and monitored. Data needs to be processed in a timely manner, appropriately governed, and protected. The data storage needs to be managed as well as data access and usage.

Learn more about Actian’s hybrid data warehouse platform here.