What 37signals’ Cloud Repatriation Taught Us About AI Infrastructure

Summary

Cloud repatriation can cut costs by millions at scale.
AI workloads amplify savings due to GPU and storage costs.
On-prem or hybrid suits predictable, high-volume inference.
Cloud still fits burst workloads like model training.
Hybrid strategies balance cost, performance, and compliance.

In 2023, 37signals announced that it had completely left the public cloud and followed up by publicly documenting its cloud repatriation process, providing one of the clearest real-world examples of on-premises economics at scale. By reversing its cloud migration and shifting workloads to private cloud infrastructure, the company drastically reduced its annual cloud infrastructure spend by almost $2 million.

The transparency of the numbers made the case compelling. In 2022, 37signals spent $3,201,564 on cloud services, which is about $266,797 per month. These detailed cost breakdowns, along with published hardware investment and payback timelines, provided a rare look into the financial mechanics of large-scale cloud repatriation.

For commodity SaaS workloads, the math was clear. But it raises an important question for the next generation of compute-heavy systems: “Does the economic argument extend to AI infrastructure as well?” This article examines that question.

TL;DR

37signals spent ~$3.2M/year on AWS in 2022.
After repatriating workloads to their own infrastructure, cloud spend dropped to ~$1.3M by 2024.
The company invested roughly $700K–$800K in servers and paid them off in under 18 months.
The entire infrastructure is still run by the same 10-person team. No additional operational overhead.
The key takeaway is that at a sustained scale, owning infrastructure can be dramatically cheaper than renting it.

The 37signals Playbook: What Hansson Actually Documented

In 2022, 37signals spent $3.2 million annually on AWS. After leaving the cloud in 2023, their annual costs had dropped to approximately $1.3 million by 2024, a reduction of almost $2 million per year.

The transition required a hardware investment of roughly $700,000–$800,000 in Dell servers. The company fully recouped the investment in under 18 months, achieving complete payback in the second half of 2023 as their AWS reserved instance contracts expired. From that point forward, the savings flowed directly to operating margin rather than offsetting capital expense.

For the storage exit that followed, 37signals projected $1.5 million in hardware costs (Pure Storage, 18PB) and roughly $200,000 per year in operating expenses. This shift replaces a recurring $1.3 million annual cloud storage bill with a one-time capital outlay plus a fraction of the ongoing operating cost. 37signals has since revised its five-year total savings projection upward from $7 million to more than $10 million.

37signals cloud exit financials by year

To illustrate the financial impact of 37signals’ cloud exit over time, the table below breaks down annual cloud spending, on-premises hardware investments, and operating costs, highlighting the resulting net savings and key operational notes.

Year	Cloud spend	Hardware investment	Operating costs	Notes
2022 Baseline	~$3.2M	$0	Included in cloud spend	Full cloud dependency
2023 Migration	~$2M	~$700–800K	Moderate	Hardware fully recouped in under 18 months
2024+ Post-repatriation	~$1.3M	~$1.5M (storage)	~$200K/year	~$1.9M annual savings
2025+	Minimal AWS dependency	~$1.5M (Pure Storage, 18PB)	~$200K/year	$10M+ projected 5-year savings

Notably, the migration did not require the team to expand operations. A 10-person infrastructure team handled the entire repatriation without adding new staff. Addressing a common concern about operational overhead, 37signals co-founder David Heinemeier Hansson noted:

We’ve been out for just over a year now, and the team managing everything is still the same. There were no hidden dragons of additional workload associated with the exit that required us to balloon the team, as some spectators speculated when we announced it. All the answers in our Big Cloud Exit FAQ continue to hold.

This directly challenges the common assumption that moving away from public cloud environments inevitably requires a significantly larger infrastructure team.

Execution followed a “criticality ladder” strategy where the team migrated lower-risk services first and more critical ones later. The team moved the HEY email system in stages, starting with caching, then database, and finally, job services. To minimize risk, they colocated infrastructure approximately one millisecond from the AWS region to preserve rollback capability during the cloud repatriation process. After stabilizing the system, they replaced managed services with substantial recurring costs, including RDS and managed Elasticsearch, which exceeded $500,000 together annually.

What makes 37signals’ case study consequential is the publicly documented cost efficiency. For organizations questioning long-term cloud adoption assumptions, particularly with regard to storage costs and managed services, the 37signals documentation provides a rare baseline for comparison.

Why AI Infrastructure Economics are Even More Extreme

The lessons from 37signals’ cloud repatriation take on a sharper edge when applied to AI infrastructure. Higher GPU costs, predictable inference workloads, massive embedding storage, and stricter data regulations create financial and operational pressures that amplify the advantages of on-premises or hybrid cloud solutions that allow you to move workloads where they make the most sense. Below, we break down the key drivers.

AI infrastructure cost comparison

To evaluate the cost implications of different AI infrastructure approaches, the table below compares upfront setup costs, monthly operating expenses at varying workloads, and expected break-even timelines for cloud, on-premises, and hybrid configurations.

Setup	Setup cost	Monthly cost	Break-even
Cloud GPU rental (AWS/Azure)	$0	$2,900–3,500 (8h/day × $4–8/hour × 15 days)	N/A
Cloud inference APIs (Lambda Labs)	$0	$1,800–2,500 (8h/day × $3.67/hour × 15 days)	N/A
Self-hosted GPU (8×H100 server)	$200K–400K	$1,500–2,000 (power + maintenance)	<12 months
Hybrid (Cloud training + On-Prem)	$200K–400K	Training only, inference minimal	<12 months

Note: For cloud GPU rental, we estimate monthly cost assuming eight hours/day per GPU. The cost scales linearly with utilization; it is not directly per-query.

GPU cloud markups are high

AI workloads depend heavily on GPUs, and cloud providers charge far steeper premiums for GPU capacity than for typical CPU compute. On-demand AWS P5 instances with H100 GPUs cost roughly $4–8 per GPU-hour, while comparable Azure H100 instances are about $3.67 per hour. By contrast, spot markets and alternative providers such as Lambda Labs offer similar GPU capacity for $1–2 per hour, or $1.85–2.49 per hour with reserved commitments.

The result is a 4–8× markup for on-demand hyperscaler GPU capacity relative to the spot or specialized GPU cloud market. In other words, the premium cloud providers charge for high-end AI compute is significantly larger than typical CPU cloud markups. For organizations running sustained inference workloads, this pricing gap quickly becomes the dominant cost driver in AI infrastructure.

Predictable inference makes GPU ownership economical

High GPU rental pricing makes outright purchase worth evaluating for sustained workloads. A single H100 GPU costs roughly $25K–40K, while a complete 8×H100 server ranges from $200K–400K. Lenovo’s analysis shows that six or more hours of sustained daily usage reaches payback against AWS within the first year.

The reason this break-even arrives so quickly is that AI inference workloads are unusually predictable. Unlike SaaS traffic, which fluctuates throughout the day, production AI systems such as recommendation engines tend to process steady volumes of requests.

Predictability changes the economics. When infrastructure runs at consistent utilization, owned hardware can be amortized efficiently across the workload. Paying cloud premiums for burst capacity that teams rarely use becomes unnecessary.

For organizations running inference continuously, the hardware investment is often recouped in under 12 months. From that point forward, the savings resemble the pattern documented by 37signals: fixed infrastructure replacing an ongoing rental bill.
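The payback reasoning above can be sketched as a small calculation. This is a minimal illustration, not a pricing model: the function names are hypothetical, and the inputs are assumptions picked from within the ranges quoted in this article ($6 per GPU-hour, a $300,000 8×H100 server, $2,000 per month for power and maintenance, round-the-clock usage).

```python
def monthly_cloud_cost(gpus: int, rate_per_gpu_hour: float,
                       hours_per_day: float, days_per_month: int = 30) -> float:
    """Rental cost for a month of GPU time at a flat hourly rate."""
    return gpus * rate_per_gpu_hour * hours_per_day * days_per_month


def break_even_months(hardware_cost: float, cloud_monthly: float,
                      onprem_monthly: float) -> float:
    """Months until avoided rental fees cover the hardware purchase."""
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        # Owning never pays back if it costs more per month than renting.
        return float("inf")
    return hardware_cost / monthly_saving


# Assumed inputs (illustrative only, chosen from the ranges in the text).
cloud = monthly_cloud_cost(gpus=8, rate_per_gpu_hour=6.0, hours_per_day=24)
months = break_even_months(hardware_cost=300_000,
                           cloud_monthly=cloud,
                           onprem_monthly=2_000)
print(f"cloud: ${cloud:,.0f}/month, break-even after {months:.1f} months")
```

The result is highly sensitive to utilization: lowering `hours_per_day` shrinks the monthly saving and stretches the payback period, which is why sustained workloads are the ones that favor ownership.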

Embedding storage requirements are massive

Even if GPU compute were optimized, AI systems introduce another rapidly growing cost layer: embedding storage. Vector databases store high-dimensional embeddings used for search, retrieval, and recommendation. As datasets scale into millions or billions of records, storage requirements expand quickly.

For instance, 10 million vectors at 1,536 dimensions require at least 58GB of raw storage, often 200–300GB with indexes and metadata. Managed vector databases like Pinecone charge $0.33/GB/month, meaning 500GB could cost $165/month before any queries. Self-hosted solutions like PostgreSQL with pgvector dramatically reduce cloud spending while keeping sensitive data under direct control. Over time, these storage requirements compound infrastructure costs alongside GPU compute, further reinforcing the economic advantages of self-hosted or hybrid architectures.
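The storage arithmetic can be reproduced with a short sketch. It assumes each dimension is stored as a 4-byte float32 value, which the article does not state, and the function names are hypothetical. Under that assumption the raw footprint comes out at about 61 GB in decimal units (about 57 GiB), in the same range as the figure quoted above.

```python
def raw_embedding_bytes(num_vectors: int, dimensions: int,
                        bytes_per_value: int = 4) -> int:
    """Raw size of the vectors alone, excluding indexes and metadata.

    bytes_per_value=4 assumes float32 components (an assumption).
    """
    return num_vectors * dimensions * bytes_per_value


def monthly_storage_cost(gigabytes: float, price_per_gb_month: float) -> float:
    """Flat per-GB monthly storage charge, before any query costs."""
    return gigabytes * price_per_gb_month


raw = raw_embedding_bytes(num_vectors=10_000_000, dimensions=1536)
raw_gb = raw / 1e9        # decimal gigabytes
raw_gib = raw / 2**30     # binary gibibytes
cost = monthly_storage_cost(gigabytes=500, price_per_gb_month=0.33)

print(f"raw vectors: {raw_gb:.2f} GB ({raw_gib:.2f} GiB)")
print(f"500 GB at $0.33/GB/month: ${cost:.2f}")
```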

Data sovereignty and compliance favor on-premises deployment

Data residency and regulatory compliance are growing priorities as the AI industry becomes increasingly regulated. Notably, the EU AI Act introduced strict regulations for AI systems, including prohibitions on certain AI use cases that took effect in February 2025. On-premises deployment simplifies compliance.

For financial organizations navigating complex regulatory environments, solutions like the Actian Data Intelligence Platform help enforce data governance and streamline compliance workflows.

The Cloud Infrastructure Case Studies 37signals Validated

Although the financial transparency of 37signals’ cloud exit was unusual, the repatriation itself was not an isolated occurrence. It was part of a growing trend among organizations trying to regain cost control and optimize their cloud infrastructure. Several high-profile case studies illustrate the scale and economics of moving workloads back from public clouds to owned or hybrid infrastructure.

Dropbox

Dropbox pioneered enterprise cloud repatriation as early as 2015, completing the migration between 2016 and 2018. The company moved roughly 90% of customer data, reportedly over 500 petabytes, off AWS to three owned colocation facilities. The infrastructure investment totaled $53 million, yet Dropbox reported $74.6 million in operational savings over two years per its 2018 S‑1 filing. A small portion of workloads, primarily European customers and specialized services, remain in AWS. Internally, the initiative was known as “Magic Pocket,” and it exemplifies how a well-executed hybrid cloud approach can deliver substantial savings while aligning with long-term business objectives.

Ahrefs

Ahrefs, the SEO tools company, relied on a Singapore colocation setup with 850 servers. Their reported savings from avoiding public cloud were approximately $400 million over 2.5 years. The actual infrastructure cost was $39.5 million for 850 servers (~$1,500/server/month), versus an estimated $447.7 million if hosted entirely on AWS (~$17,557/server/month equivalent). As Ahrefs put it: “We wouldn’t be profitable, or even exist, if our products were 100% on AWS.” While critics argue that Ahrefs inflated its AWS estimates, the direction of the savings is hard to dispute, illustrating that cloud repatriation challenges can be surmounted at scale with careful planning.
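The per-server figures follow from dividing each total by 850 servers and 30 months (2.5 years). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def per_server_monthly(total_cost: float, servers: int, months: int) -> float:
    """Average cost per server per month over the whole period."""
    return total_cost / servers / months


SERVERS = 850
MONTHS = 30  # 2.5 years

colo = per_server_monthly(39_500_000, SERVERS, MONTHS)
aws_estimate = per_server_monthly(447_700_000, SERVERS, MONTHS)
savings = 447_700_000 - 39_500_000

print(f"colocation: ${colo:,.0f} per server per month")
print(f"AWS estimate: ${aws_estimate:,.0f} per server per month")
print(f"difference over the period: ${savings / 1e6:.1f}M")
```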

GEICO

GEICO spent a decade migrating to multiple cloud providers, only for its costs to climb and exceed projections by 2.5×, reaching $300 million by 2022 across eight providers. In response, GEICO began moving workloads to a private cloud using OpenStack and Kubernetes, targeting over 50% repatriation by 2029. Early results show a 50% reduction in compute costs and a 60% reduction in per-gigabyte storage costs compared with public cloud services, demonstrating how a hybrid cloud architecture can deliver efficiency, compliance, and alignment with long-term business objectives.

Akamai

Akamai was on the path to spending over $100 million on third-party cloud services before migrating compute workloads to its own global edge network of 350,000+ servers. The migration delivered savings of roughly $100 million per year, a testament to the economics of repatriation when existing infrastructure and scale align.

What these cases share is the same economic pattern documented by 37signals. Predictable, high-volume workloads eventually become cheaper to run on owned infrastructure than on hyperscaler clouds.

These examples reflect a broader shift occurring across enterprise infrastructure strategies. Barclays’ Chief Information Officer (CIO) surveys show cloud repatriation trending upward in recent years, with sentiment peaking in the second half of 2024, when 86% of CIOs reported planning repatriation.

Barclays CIO survey showing 86% of CIOs planning cloud repatriation

However, this statistic does not mean that companies are abandoning public cloud environments completely. According to IDC, only 8–9% of companies favor full repatriation, with most preferring a hybrid approach that combines public and private clouds. Hybrid cloud infrastructure lets organizations optimize workload placement by keeping sensitive data and mission-critical applications on-premises while using public cloud services for less critical workloads. Teams exploring similar transitions therefore need to understand the nuances of hybrid deployments and their associated risks.

Cloud Repatriation Statistics

Cloud repatriation is accelerating at the same time as public cloud spending keeps climbing. IDC projects global public cloud spend will reach $1.6 trillion in 2028, doubling from its 2024 estimate. Yet as mentioned earlier, 86% of CIOs are planning some form of repatriation according to Barclays. Both trends can be true because this is not a cloud exodus so much as a rebalancing. Enterprises are leaning towards a hybrid cloud model.

AI is likely to accelerate that shift. AI workloads account for less than 10% of total cloud compute today, but Gartner projects that this figure will approach 50% by 2029. Hyperscalers are responding with enormous capital investment: an estimated $600 billion in infrastructure spend in 2026, roughly three-quarters of it tied to AI. The assumption is clear: enterprises will rent that GPU capacity. But the 37signals math suggests that once AI workloads move from experimentation to steady production, ownership economics begin to dominate.

Cost pressure is already driving behavior. Flexera reports that 27% of cloud resources are wasted or underutilized, and 21% of workloads have already been repatriated. The primary reason cited is cost exceeding projections, followed by performance concerns. With GPUs, the margin for inefficiency is thinner. There are fewer optimization levers, higher hourly rates, and faster budget burn.

Regulation adds another layer. The EU AI Act, DORA for financial services, China’s PIPL, and India’s DPDP are tightening data governance requirements. Mimecast reports that 87% of organizations now factor data sovereignty into vendor decisions. For AI systems, sovereignty extends beyond data location to model provenance, audit trails, and compliance documentation. On-premises deployment does not eliminate regulatory complexity, but it centralizes control, and for many enterprises, that simplicity is becoming strategically attractive.

A bar chart showing why enterprises repatriate

The Counter-Arguments and When Cloud Providers Win

Not all observers agree that cloud repatriation is the best path for every organization. Public cloud environments still deliver value in certain circumstances. But the arguments for staying in the cloud are often weaker for AI workloads.

When cloud wins vs. when on-premises wins

Component	Cloud advantage	On-prem advantage
Workload predictability	Handles spiky or unpredictable workloads	Predictable workloads cheaper to self-host
Team expertise	Requires minimal in-house infrastructure skill	Strong IT teams can optimize and reduce vendor reliance
Scale and growth	Rapid scaling and global expansion	Predictable growth enables cost-efficient hardware
Regulatory requirements	Managed compliance, geo-redundancy	Direct control simplifies regulatory alignment
Cost and margins	Pay-as-you-go reduces upfront spend	Long-term savings from owned infrastructure
Service quality	Cloud SLAs ensure availability and performance	Dedicated resources guarantee predictable uptime

Cloud “wrong usage” argument

Jeremy Daly, a serverless advocate, argues that “37signals was using the cloud wrong.” By treating cloud environments as virtual colocation, running VMs and Kubernetes, they were paying cloud premiums without capturing the value of serverless, managed services, and instant scaling. As Daly notes, “In the cloud, we should be renting services, not servers.”

For SaaS workloads with highly variable or spiky traffic, this argument is compelling. Serverless infrastructure allows organizations to scale instantly and pay only for the compute they actually use.

However, AI inference workloads often behave very differently. Production inference systems, such as recommendation models, copilots, and document processing pipelines, tend to run at steady, sustained utilization rather than unpredictable bursts. In these cases, the economic advantage of elastic cloud scaling diminishes. The premium paid for burst capacity still exists, but the workload itself rarely needs that burst capacity.

Daly’s argument, therefore, holds for variable SaaS workloads, where elasticity is critical. For sustained AI inference workloads running at high utilization, paying a premium for burst capacity that is rarely used can make dedicated infrastructure or hybrid deployments more cost-efficient.

Full cost critique

Some critics also question the financial assumptions behind 37signals’ approach. They point out that hardware and software normally account for only about 20% of IT costs, with the remainder covering electricity, cooling, physical security, racking, uninterruptible power supplies (UPS), and opportunity costs. David Heinemeier Hansson’s analysis did not include all of these overheads because 37signals used colocation facilities rather than fully owned data centers. Even so, considering 37signals’ figures, it is reasonable to conclude that renting colocation space can still be far cheaper than relying on cloud services.

Competence vs. growth framework

Forrest Brazeal’s IT competence versus growth aspirations framework provides additional nuance. He places 37signals in the High Competence/Low Growth quadrant, ideal for self-hosting. “Not every company has the competence (high) or growth aspirations (low) of 37signals,” he observes. Startups with uncertain or spiky workloads benefit from cloud flexibility, but AI companies running production inference at scale often combine high operational competence with steady growth. Such profiles are well suited to repatriation.

Applying the Playbook to AI Infrastructure

If 37signals provided the economic blueprint, AI infrastructure makes the economics more concrete. The decision is no longer abstract. It becomes a structured assessment grounded in workload behavior, utilization, and regulatory exposure.

A practical four-question framework helps translate the 37signals logic into AI terms:

1. Is your inference workload predictable and sustained?

Unlike SaaS traffic spikes, most production AI systems such as recommendation engines, RAG pipelines, or fraud detection models process steady volumes with gradual growth.

2. Are projected GPU utilization rates above 60–70%?

At this threshold, owned hardware amortization typically undercuts public cloud GPU pricing within the first year.

3. Are you processing more than 10–50 million queries per month?

At this scale, per-token and per-query pricing from cloud APIs compound rapidly.

4. Do you face data sovereignty or strict compliance requirements?

For financial services, healthcare, or government workloads, regulatory mandates can tilt the decision toward controlled environments.

If the answer is “yes” to three or four of these, the repatriation economics tend to favor on-premises deployment for production inference.
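The four questions can be expressed as a simple scoring rule. The sketch below is illustrative only: the class and function names are hypothetical, and the cut-offs use the lower bounds of the ranges above (60% utilization, 10 million queries per month), which is one possible reading of those ranges rather than a fixed rule.

```python
from dataclasses import dataclass

# Assumed cut-offs: the lower bounds of the ranges given in the framework.
UTILIZATION_THRESHOLD = 0.60
QUERY_THRESHOLD = 10_000_000


@dataclass
class InferenceWorkload:
    predictable_and_sustained: bool
    gpu_utilization: float          # projected, as a fraction between 0 and 1
    queries_per_month: int
    strict_compliance: bool


def yes_count(w: InferenceWorkload) -> int:
    """Number of framework questions answered 'yes'."""
    answers = [
        w.predictable_and_sustained,
        w.gpu_utilization > UTILIZATION_THRESHOLD,
        w.queries_per_month > QUERY_THRESHOLD,
        w.strict_compliance,
    ]
    return sum(answers)


def favors_on_prem(w: InferenceWorkload) -> bool:
    """Three or four 'yes' answers tilt toward on-premises inference."""
    return yes_count(w) >= 3


# Hypothetical example workload.
rag_service = InferenceWorkload(
    predictable_and_sustained=True,
    gpu_utilization=0.72,
    queries_per_month=25_000_000,
    strict_compliance=False,
)
print(yes_count(rag_service), favors_on_prem(rag_service))
```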

Decision matrix

Workload stage	Recommended environment	Rationale
Model training	Public cloud	Compute-intensive; cloud GPUs handle burst workloads cost-effectively
Experimentation and prototyping	Public cloud	Flexible, fast provisioning for early-stage iteration
Production inference	On-premises / Hybrid	Steady workloads; owned hardware cheaper at 60–70%+ GPU utilization
Vector storage (embeddings)	On-premises	Reduces recurring managed-service costs and ensures data control

The hybrid AI pattern

In practice, most AI organizations adopt a hybrid model rather than an all-or-nothing shift. Training remains in the cloud. Inference moves closer to owned infrastructure.

Lenovo documented that training Llama 3.1 at hyperscale (39.3 million GPU hours) in the cloud would exceed $483 million. That type of elastic, short-term scale is exactly where public cloud excels. Inference is different. Once a model is trained, serving it for three to five years becomes steady, predictable work. That is where amortized hardware economics has the upper hand.

This split architecture also simplifies data migration risk. Instead of relocating entire AI pipelines at once, organizations can migrate production inference workloads gradually while leaving experimentation and early-stage training in cloud environments. A controlled, phased migration process reduces operational disruption while ensuring seamless integration between cloud-based training and on-premises serving layers.

Self-hosted inference economics

The economics of self-hosted inference depend heavily on utilization and token volume. According to enterprise deployment benchmarks, a 7B-parameter model running on an H100 GPU at roughly 70% utilization costs about $10,000 per year in spot nodes or hardware amortization. Power costs about $300 annually, bringing the total costs to about $10,300.

Public LLM APIs, by contrast, typically charge per million tokens, with enterprise pricing in 2025 ranging from $0.25–$15 per million input tokens and $1.25–$75 per million output tokens depending on model tier and provider.

At low usage levels, APIs remain the more economical option because infrastructure sits idle. However, the economics change as workloads scale. Industry analyses suggest that self-hosted deployment begins to break even at roughly two million tokens per day, after which the fixed cost of owned infrastructure is amortized across a large inference volume.

At high volumes, self-hosted inference can reduce costs by up to 78%. Artefact’s analysis found break-even around 8,000 conversations per day. Below that threshold, managed cloud APIs remain more economical. Above it, ownership compounds savings. The pattern mirrors 37signals: predictable workload plus high utilization equals rapid payback.
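A rough way to see where a token break-even lands is to divide the daily cost of the self-hosted setup by the API price per token. The sketch below uses the roughly $10,300 annual figure from this section; the $14 per million tokens blended API price is an assumption picked from within the quoted price range, and the function name is hypothetical.

```python
def break_even_tokens_per_day(annual_self_hosted_cost: float,
                              api_price_per_million_tokens: float) -> float:
    """Daily token volume at which API spend equals self-hosted cost."""
    daily_cost = annual_self_hosted_cost / 365
    return daily_cost / api_price_per_million_tokens * 1_000_000


# $10,300/year comes from the text; $14 per million tokens is an assumed
# blended input/output price, not a quoted rate from any provider.
tokens = break_even_tokens_per_day(10_300, 14.0)
print(f"break-even at about {tokens / 1e6:.2f} million tokens per day")
```

With a cheaper model tier the break-even volume rises sharply, so the threshold depends heavily on which API tier the self-hosted model would replace.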

Vector databases

Instacart documented migrating from Elasticsearch plus FAISS to PostgreSQL with pgvector, achieving 80% cost savings and a 10× reduction in write amplification. Timescale’s pgvectorscale benchmarks show approximately 75% lower costs than managed vector services like Pinecone at comparable performance.

For RAG systems handling millions of queries monthly, self-hosted vector infrastructure produces savings that resemble the 37signals S3 case: large recurring storage bills replaced by amortized hardware and open-source tooling.

Data sovereignty as a structural driver

Grand View Research projects that the sovereign cloud market will reach USD 648.87 billion by 2033. Also, according to Gartner, around 60% of financial firms outside the United States are expected to adopt sovereign or on-premises deployments by 2028.

Frameworks such as the EU AI Act, China’s PIPL, and India’s DPDP mandate data localization and traceability. For organizations processing sensitive training datasets or proprietary inference logs, on-premises deployment inherently satisfies residency requirements because data never leaves jurisdictional boundaries.

The Bottom Line

37signals showed that teams can measure, model, and defend cloud repatriation decisions with hard numbers. With AI infrastructure, the economics can be even more pronounced. If cloud repatriation is projected to save 37signals roughly $10 million over five years, an equivalent AI company running production inference at a comparable scale could save multiples of that amount, given the much higher cost of GPU compute and embedding infrastructure.

For organizations choosing to run AI workloads in controlled environments, platforms like Actian VectorAI DB provide a purpose-built vector database designed for high-volume vector search and AI inference workloads. It can be deployed on-premises or in the cloud, allowing organizations to place vector infrastructure where it best fits their operational and economic requirements.

About Author

About Nick Johnson

Nick Johnson is a Senior Product Marketing Manager at Actian, driving the go-to-market success for HCL Informix and Actian Zen. With a career dedicated to shaping compelling messages and strategies for databases, Nick brings a wealth of experience from his impactful work at leading technology companies, including Neo4j, Microsoft, and SAS.

Securing Your Data With Actian Vector, Part 7

By Martin Fuerderer

#Actian Vector #Data Security #Databases

Summary

Encrypted databases allow rotating table-level encryption keys independently of passphrases or main keys.
ALTER KEYS enables DBAs to rotate keys for all tables or selected tables as needed.
Table key rotation re-encrypts data, improving security but may be resource-intensive.
Staggering table key changes helps reduce performance impact during peak workloads.

Actian Vector 7.0 is renamed to Actian Analytics Engine beginning with version 8.0

Besides just changing the passphrase and optionally also the main key, as explained in the previous blog posts in this series, it is also possible to change the encryption keys for the individual tables in the encrypted database. The collection of these commands gives the DBA full control over changing keys in the Actian Vector encrypted database.

Managing Encryption Keys for Encryption at Rest

Rotating Table Keys

“Table keys” can be changed only in encrypted databases because there are no table keys in non-encrypted databases. Table keys are changed independently from the passphrase and the “main key.” For this, the SQL statement ALTER KEYS is used.

When a table key is changed, a new table key is generated randomly. The old table key is used to decrypt the data of the table, then the data is re-encrypted with the new table key. The new table key is encrypted with the “database key” and stored in the container.

Depending on the amount of data in the table, changing the table key can be a computationally expensive operation that takes considerable time. While it is possible to change all table keys in a database with a single SQL command, i.e. “at the same time,” table keys can also be changed for individual tables only. The latter allows an administrator to spread the workload by running separate SQL statements for different tables at different times, e.g. when the general workload from database users is low.

Changing the keys for all tables in a database:
ALTER KEYS ON DATABASE;
Changing keys of individual tables:
ALTER KEYS ON TABLE <table1>, <table2>, … ;
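Staggering the rotation can be prepared by splitting the table list into batches and issuing one statement per batch at different times. The helper below is a hypothetical sketch that only builds the statement text in the form shown above. The table names are made up, identifiers are not quoted or validated, and scheduling and execution are left to the administrator's own tooling.

```python
from typing import List


def alter_keys_batches(tables: List[str], batch_size: int) -> List[str]:
    """Build one ALTER KEYS statement per batch of table names."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    statements = []
    for start in range(0, len(tables), batch_size):
        batch = tables[start:start + batch_size]
        statements.append(f"ALTER KEYS ON TABLE {', '.join(batch)};")
    return statements


# Hypothetical table names, rotated two at a time.
for statement in alter_keys_batches(["orders", "customers", "invoices"], 2):
    print(statement)
```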

The effect of changing table keys for two individual tables is shown in the following graphic:

database key diagram

Explore Other Blogs on Securing Your Data With Actian Vector:

Using database encryption in Actian Vector
Managing encryption keys for encryption at rest.
Changing only the passphrase for an encrypted database.
Upgrading an existing encrypted database to Actian Vector 7.0.
Understanding different encryption keys.
Leveraging Actian Vector functional encryption capabilities.

About Author

About Martin Fuerderer

Martin Fuerderer is a Principal Software Engineer for HCLSoftware, with 25+ years in database server development. His recent focus has been on security features within database environments, ensuring compliance and robust data protection. Martin has contributed to major product releases and frequently collaborates with peers to refine database security standards. On the Actian blog, Martin shares insights on secure database server development and best practices. Check his latest posts for guidance on safeguarding enterprise data.

The Data Product Advantage: What New Global Research Reveals About AI Success

By Actian Corporation

#AI #Data Contracts #Data Management #Data Products

Summary

Organizations investing in AI often struggle to move from experimentation to production due to weak data foundations.
Data product adoption is rapidly increasing, growing from 48% in 2024 to 69% in 2026.
Companies with mature data products are far more likely to scale AI, with 85% running multiple production AI projects.
Data products improve AI success by ensuring governance, quality, ownership, and reusable, trusted data assets.

Insights from the 2026 BARC x Actian global research study of 300+ enterprise data leaders

Organizations across industries are investing heavily in AI. Yet many initiatives still struggle to move beyond experimentation and deliver consistent business value.

To better understand what separates AI initiatives that scale from those that stall, Actian partnered with BARC, a leading global analyst firm for data and analytics, to conduct a global research study of enterprise data leaders.

Based on insights from more than 300 respondents across industries and regions, the study examines how organizations adopt and operationalize data products and data contracts, and how these approaches influence the success of AI initiatives.

The results reveal a clear pattern: organizations that adopt data products at scale achieve significantly stronger AI outcomes.

BARC × Actian Global Research Report (2026)

AI Ambition vs. AI Reality

Many organizations today have ambitious AI strategies. However, turning those ambitions into production systems that deliver business value is far more difficult than launching pilots or proofs of concept.

AI initiatives depend on reliable data, clear ownership, and consistent data quality across systems. The research shows that trustworthy inputs for AI and decision-making are now the primary drivers behind data product adoption. Without these foundations, models may be developed successfully in isolation but struggle to operate reliably in real-world environments.

This gap between AI ambition and AI execution has become one of the defining challenges for enterprise data teams today.

The research conducted with BARC suggests that the difference often lies in how organizations manage and operationalize their data.

Data Products Have Entered the Roll-Out Phase

The research also shows that data products are moving beyond experimentation and into broader enterprise adoption.

Adoption of data products is accelerating rapidly across the organizations surveyed. Between 2024 and 2026, the share of organizations using data products operationally increased from 48% to 69%, marking a major shift in how enterprises design and manage data assets.

This suggests that the concept is evolving from an architectural idea into a practical approach for managing and delivering trusted data at scale.

Adoption of data products across organizations. Data products are entering the enterprise roll-out phase – © BARC 2026

The Data Product Advantage

One of the most striking findings from the research is the strong correlation between data product adoption and AI maturity. Organizations that treat data as well-defined, governed, and reusable products are far more likely to move AI initiatives into production and scale them successfully.

In fact, the research shows a clear statistical gap between organizations that have scaled data products and those that have not: 85% of organizations that have established data products company-wide report three or more AI projects in production, compared to just 25% of organizations that are only experimenting with data products or not using them at all.

This suggests that data products are increasingly becoming a practical operating model for delivering reliable data to AI and analytics initiatives.

Why Data Products Matter for AI

Data products introduce several elements that are critical for AI systems to operate reliably at scale — particularly as organizations build the data foundations required for AI.

Data products typically include clearly defined data ownership, documented data contracts and schemas, built-in data quality monitoring and governance, and reusable, discoverable data assets. These elements help ensure that the data used by AI models remains consistent, trustworthy, and well governed.
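To make these elements concrete, here is a minimal sketch of a data product that carries an owner, a schema contract, and a basic quality check. The class names, fields, and the `orders` example are hypothetical illustrations; they are not taken from the BARC study or from any Actian API.

```python
from dataclasses import dataclass, field


@dataclass
class DataContract:
    """Hypothetical contract: expected columns and their Python types."""
    schema: dict[str, type]
    required: set[str] = field(default_factory=set)

    def violations(self, row: dict) -> list[str]:
        """Return human-readable problems found in one record."""
        problems = []
        for column in sorted(self.required):
            if row.get(column) is None:
                problems.append(f"missing required column: {column}")
        for column, expected in self.schema.items():
            value = row.get(column)
            if value is not None and not isinstance(value, expected):
                problems.append(f"{column}: expected {expected.__name__}")
        return problems


@dataclass
class DataProduct:
    """Hypothetical data product: a named asset with an owner and a contract."""
    name: str
    owner: str
    contract: DataContract

    def quality_report(self, rows: list[dict]) -> dict:
        """Summarize how many rows satisfy the contract."""
        failures = {i: v for i, row in enumerate(rows)
                    if (v := self.contract.violations(row))}
        return {"rows": len(rows), "valid": len(rows) - len(failures),
                "failures": failures}


orders = DataProduct(
    name="orders",
    owner="sales-data-team",
    contract=DataContract(
        schema={"order_id": int, "amount": float, "currency": str},
        required={"order_id", "amount"},
    ),
)

report = orders.quality_report([
    {"order_id": 1, "amount": 19.99, "currency": "EUR"},
    {"order_id": 2, "amount": "n/a", "currency": "EUR"},
    {"order_id": None, "amount": 5.0},
])
```

In this toy run, only the first record satisfies the contract; the report lists the other two by row index with the reason each one failed, which is the kind of signal a quality monitor could surface to the product's owner.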

As AI initiatives grow more complex, particularly with the emergence of agentic and autonomous systems, these foundations become even more important. Without reliable data pipelines and clearly defined data assets, scaling AI across an enterprise becomes extremely difficult.

Download the Full Research Report

This article introduces several key insights from the BARC x Actian global research study.

The full report, Data Products and Data Contracts in 2026: The Foundation for AI Success, explores how organizations across industries and regions are adopting and operationalizing data products and data contracts—and how these practices influence AI maturity, governance, and real-world outcomes.

Get the full report to explore the findings and learn how leading organizations are scaling AI with data products.

About Author

About Actian Corporation

Actian empowers enterprises to confidently manage and govern data at scale. Organizations trust Actian data management and data intelligence solutions to streamline complex data environments and accelerate the delivery of AI-ready data. Designed to be flexible, Actian solutions integrate seamlessly and perform reliably across on-premises, cloud, and hybrid environments. Learn more about Actian, the data and AI division of HCLSoftware, at actian.com.

Why is Data Lineage Important?

By Actian Corporation

#Data Intelligence #Data Lineage

Summary

Data lineage provides end-to-end visibility into data flow, improving trust and transparency.
It enables faster root cause analysis, reducing downtime and improving data reliability.
Lineage supports compliance, governance, and audit readiness across regulated industries.
It helps teams assess impact, reduce risk, and ensure accurate analytics and AI outcomes.

Data is the foundation of business strategy, innovation, compliance, and competitive advantage. Organizations across industries rely on analytics, artificial intelligence, reporting dashboards, and regulatory submissions to make critical decisions. But as data flows through complex pipelines, moving between systems while being transformed, aggregated, filtered, and enriched, its origins and journey often become unclear.

Data lineage is the solution. It provides a clear visual and traceable map of how data moves from its original source to its destination. It answers essential questions, such as: Where did this data come from? How was it transformed? Who touched it? Why does it look the way it does? Can we trust it?

Understanding why data lineage is important requires examining its impact on trust, compliance, operational efficiency, analytics accuracy, and long-term scalability.

What is Data Lineage?

Data lineage refers to the end-to-end lifecycle of data. It documents:

Data origins (source systems).
Movement across platforms.
Transformations and calculations.
Aggregations and filters.
Dependencies between datasets.
Final outputs (reports, dashboards, ML models).

Think of it as a detailed supply chain map for your data.

In modern environments using tools like Apache Airflow, Snowflake, dbt, and cloud warehouses, data pipelines can involve hundreds or thousands of transformations. Without lineage visibility, teams operate in the dark.
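One way to picture this supply chain map is as a directed graph: datasets are nodes, and each edge records how one dataset was derived from another. The sketch below is a simplified illustration under that assumption; the `LineageGraph` class and the dataset names are hypothetical and do not represent how any particular lineage tool stores its metadata.

```python
from collections import defaultdict


class LineageGraph:
    """Minimal lineage store: edges point from a source dataset to a derived one."""

    def __init__(self):
        self.downstream = defaultdict(set)  # source -> datasets derived from it
        self.upstream = defaultdict(set)    # dataset -> its direct sources
        self.transform = {}                 # (source, target) -> description

    def add_edge(self, source, target, how):
        self.downstream[source].add(target)
        self.upstream[target].add(source)
        self.transform[(source, target)] = how

    def origins(self, dataset):
        """Walk upstream until reaching datasets with no recorded source."""
        seen, stack, roots = set(), [dataset], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            parents = self.upstream.get(node, set())
            if not parents:
                roots.add(node)
            stack.extend(parents)
        return roots


g = LineageGraph()
g.add_edge("crm.accounts", "staging.accounts", "deduplicate")
g.add_edge("erp.invoices", "staging.invoices", "filter cancelled")
g.add_edge("staging.accounts", "mart.revenue", "join on account_id")
g.add_edge("staging.invoices", "mart.revenue", "sum by month")
g.add_edge("mart.revenue", "dashboard.revenue", "aggregate")

sources = g.origins("dashboard.revenue")
```

Here `sources` answers the question "where did this data come from?" for the dashboard, and the `transform` dictionary keeps a note of what happened at each hop.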

1. Building Trust in Data

Trust is the currency of modern data-driven organizations. If stakeholders cannot trust data, they will not rely on analytics to make decisions.

Why Trust Matters

Imagine a CFO reviewing revenue reports. A sudden 8% discrepancy appears compared to last month. Without data lineage, the team must manually investigate:

Was there a change in source system logic?
Did someone modify a transformation?
Was a filter removed?
Is there a data duplication issue?

With lineage, teams can trace the report back to the source table, see recent transformation changes, and quickly identify the root cause.

Data lineage transforms guesswork into data incident triage.

2. Faster Root Cause Analysis

Data issues are inevitable. Pipelines break. Schema changes happen. Columns are renamed. Data quality deteriorates.

Without lineage:

Debugging can take days or weeks.
Teams rely on tribal knowledge.
Investigations involve dozens of stakeholders.

With lineage:

Impacted datasets are immediately visible.
Downstream dependencies are mapped.
Engineers can pinpoint the exact transformation step causing the issue.

This dramatically reduces downtime and increases operational resilience.
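As a rough illustration of how lineage narrows a search, the sketch below combines an upstream map with a change log and returns only the datasets that changed since the last known-good run. Both dictionaries and the `suspects` function are hypothetical; a real tool would read this information from its own metadata store.

```python
from datetime import datetime

# Hypothetical lineage: each dataset maps to the datasets it is built from.
UPSTREAM = {
    "report.revenue": ["mart.revenue"],
    "mart.revenue": ["staging.invoices", "staging.accounts"],
    "staging.invoices": ["erp.invoices"],
    "staging.accounts": ["crm.accounts"],
}

# Hypothetical change log: when each dataset's logic or schema last changed.
LAST_CHANGED = {
    "report.revenue": datetime(2026, 1, 5),
    "mart.revenue": datetime(2026, 1, 5),
    "staging.invoices": datetime(2026, 3, 2),
    "staging.accounts": datetime(2026, 1, 5),
    "erp.invoices": datetime(2026, 1, 5),
    "crm.accounts": datetime(2026, 1, 5),
}


def suspects(dataset, last_good_run):
    """Return the dataset and any upstream datasets changed after the last
    known-good run, ordered from nearest to farthest from the failing dataset."""
    found, seen, frontier = [], {dataset}, [dataset]
    while frontier:
        next_frontier = []
        for node in frontier:
            if LAST_CHANGED[node] > last_good_run:
                found.append(node)
            for parent in UPSTREAM.get(node, []):
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return found


result = suspects("report.revenue", last_good_run=datetime(2026, 3, 1))
```

Instead of reviewing all six datasets, the investigation in this toy example starts with the single step that changed after the report last looked correct.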

3. Regulatory Compliance and Audit Readiness

In regulated industries such as finance, healthcare, and insurance, compliance is not optional.

Applicable regulations may include:

GDPR
HIPAA
SOX
Basel III

These regulations require organizations to demonstrate transparency in how data is collected, processed, stored, and reported.

For example, under GDPR, organizations must explain how personal data is used and where it resides. In financial services, regulators may require proof of how risk metrics were calculated.

Data lineage provides:

Documented transformation history.
Audit trails.
Traceability to source systems.
Evidence of governance controls.

Without lineage, audits become reactive, stressful, and risky. With lineage, audit preparation becomes structured and defensible.

4. Improved Data Governance

Data governance is about control, accountability, and clarity. But governance frameworks fail without visibility.

Data lineage strengthens governance by enabling:

Ownership tracking.
Change impact analysis.
Policy enforcement.
Access control validation.
Data classification mapping.

For example, if a sensitive column (e.g., social security number) is accidentally included in an analytics table, lineage can show where it propagated and who accessed it.

This prevents data sprawl and ensures responsible data usage.
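The propagation check described above can be sketched with column-level lineage: start from columns classified as sensitive and follow every derivation that reads from them. The column names and the `tainted_columns` helper are hypothetical, and the example does not cover the access-audit part of the scenario.

```python
# Hypothetical column-level lineage: derived column -> columns it is computed from.
COLUMN_SOURCES = {
    "staging.customers.ssn": ["crm.customers.ssn"],
    "analytics.profile.ssn_hash": ["staging.customers.ssn"],
    "analytics.profile.region": ["crm.customers.postcode"],
    "analytics.kpi.customer_count": ["staging.customers.id"],
}

SENSITIVE = {"crm.customers.ssn"}  # columns classified as sensitive at the source


def tainted_columns():
    """Return every derived column that depends, directly or not, on a sensitive one."""
    tainted = set(SENSITIVE)
    changed = True
    while changed:  # repeat until no new column picks up the tag
        changed = False
        for column, sources in COLUMN_SOURCES.items():
            if column not in tainted and tainted.intersection(sources):
                tainted.add(column)
                changed = True
    return tainted - SENSITIVE


exposed = tainted_columns()
```

The result lists the staging copy and the hashed analytics column, while the region and customer-count columns stay out because nothing in their ancestry touches the sensitive source.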

5. Supporting Data Quality Initiatives

Data quality initiatives often focus on accuracy, completeness, consistency, and timeliness. But when quality issues arise, lineage becomes essential.

If a dashboard shows incorrect metrics, lineage allows teams to:

Trace data back to ingestion.
Identify transformation errors.
Detect schema drift.
Confirm calculation logic.

Rather than patching symptoms, teams can resolve root causes.

This leads to higher confidence in KPIs, improved reporting reliability, and stronger executive trust in analytics teams.

6. Enabling Impact Analysis Before Changes

Modern data environments evolve constantly. Engineers deploy new models. Analysts update calculations. Source systems introduce schema changes.

Without lineage, even small changes can have unknown downstream consequences.

Consider renaming a column in a source table. That column may feed:

15 downstream models.
4 dashboards.
2 executive reports.
1 machine learning pipeline.

Data lineage enables proactive impact analysis, showing:

All downstream dependencies.
Systems affected.
Stakeholders impacted.

This allows teams to communicate changes ahead of time and avoid breaking production systems.
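A proactive impact check like the one described here amounts to a breadth-first walk over downstream dependencies. The following is a minimal sketch with a hypothetical `DOWNSTREAM` map and asset names; it reports each affected asset together with how many hops it sits from the change.

```python
from collections import deque

# Hypothetical downstream lineage: dataset -> assets that read from it.
DOWNSTREAM = {
    "src.customers": ["model.customers_clean"],
    "model.customers_clean": ["model.churn_features", "dashboard.retention"],
    "model.churn_features": ["ml.churn_pipeline"],
    "dashboard.retention": ["report.exec_summary"],
}


def impacted(asset):
    """Breadth-first walk returning every downstream asset with its distance."""
    distance = {}
    queue = deque([(asset, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in DOWNSTREAM.get(node, []):
            if child not in distance:
                distance[child] = depth + 1
                queue.append((child, depth + 1))
    return distance


blast_radius = impacted("src.customers")
```

Sorting the result by distance gives a natural order for notifications: owners of directly dependent models first, then dashboards, reports, and machine learning pipelines further down the chain.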

7. Accelerating Data Democratization

Organizations increasingly aim to make data accessible to non-technical users. Self-service BI platforms empower teams across marketing, operations, finance, and HR.

But democratization without clarity leads to chaos.

Data lineage helps business users understand:

Where metrics originate.
What transformations were applied.
Which version of a dataset is authoritative.
Whether data is certified or experimental.

This reduces duplicate datasets, shadow analytics, and conflicting reports.

When users can see the journey of data, they use it more confidently and responsibly.

8. Enhancing Collaboration Between Teams

Data engineering, analytics, compliance, and business units often operate in silos. Miscommunication about data definitions and ownership can slow progress.

Lineage creates a shared language and visual understanding of data flows.

For example:

Engineers see pipeline dependencies.
Analysts see transformation logic.
Compliance teams see data movement.
Executives see reporting dependencies.

This shared visibility reduces friction and accelerates decision-making.

9. Supporting Cloud and Modern Data Architectures

Cloud adoption has increased system complexity. Organizations use multiple platforms:

Cloud data warehouses.
ETL/ELT tools.
Streaming platforms.
Business intelligence dashboards.
Machine learning services.

Data often flows between hybrid environments and third-party SaaS systems.

Lineage tools help unify this complexity by mapping cross-platform flows. Without lineage, cloud migration efforts can introduce hidden risks and broken dependencies.

10. Strengthening AI and Machine Learning Governance

As AI adoption grows, organizations must understand how training data is sourced and transformed.

Poorly governed data pipelines can lead to:

Biased models.
Inaccurate predictions.
Regulatory violations.
Reputational damage.

Data lineage enables teams to trace:

Training dataset origins.
Feature engineering transformations.
Data version histories.
Model input dependencies.

This is critical for explainable AI and responsible AI initiatives.

If an AI-driven decision is questioned, lineage provides transparency.

11. Reducing Operational Risk

Operational risk increases when organizations depend on undocumented data pipelines.

Key risks include:

Single points of failure.
Knowledge loss when employees leave.
Accidental data corruption.
Inconsistent business logic across reports.

Lineage reduces reliance on tribal knowledge. Documentation becomes automated and centralized.

When institutional knowledge is captured visually, organizations become more resilient.

12. Improving Efficiency and Reducing Costs

Data inefficiencies can be costly:

Redundant pipelines.
Duplicate datasets.
Unused tables.
Overlapping transformations.

Lineage provides visibility into unused assets and redundant processes.

Teams can:

Decommission obsolete datasets.
Consolidate logic.
Reduce storage costs.
Simplify pipelines.

This operational clarity leads to leaner, more efficient data ecosystems.
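One simple way to spot decommission candidates is to start from the outputs people actually use and mark everything they depend on; whatever is left over is worth a closer look. The edge list, the `ACTIVE_OUTPUTS` set, and the function name below are hypothetical, and a real cleanup would also check query logs before removing anything.

```python
# Hypothetical lineage edges (source, target) and the outputs people actually use.
EDGES = [
    ("raw.events", "staging.events"),
    ("staging.events", "mart.sessions"),
    ("staging.events", "mart.sessions_v1"),   # older copy of the same logic
    ("mart.sessions", "dashboard.traffic"),
    ("raw.legacy_export", "staging.legacy"),
]
ACTIVE_OUTPUTS = {"dashboard.traffic"}


def decommission_candidates(edges, active_outputs):
    """Datasets that no active output depends on, found by walking upstream."""
    parents = {}
    nodes = set()
    for source, target in edges:
        parents.setdefault(target, set()).add(source)
        nodes.update((source, target))
    needed, stack = set(), list(active_outputs)
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(parents.get(node, ()))
    return sorted(nodes - needed)


unused = decommission_candidates(EDGES, ACTIVE_OUTPUTS)
```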

13. Empowering Strategic Decision-Making

Executives depend on data for strategic decisions—market expansion, acquisitions, product development, and resource allocation.

But strategic confidence requires reliable foundations.

Data lineage ensures that:

KPIs are traceable.
Financial reports are auditable.
Forecast models are transparent.
Metrics are consistent across departments.

Without lineage, leadership decisions may rely on flawed assumptions.

With lineage, organizations gain strategic certainty.

14. Facilitating Mergers and Acquisitions

During mergers or acquisitions, organizations must integrate disparate data systems.

Common challenges include:

Conflicting definitions.
Redundant reporting structures.
Unclear data ownership.
Inconsistent transformation logic.

Lineage tools accelerate integration by revealing:

Overlapping datasets.
Dependency conflicts.
Redundant processes.
Governance gaps.

This speeds due diligence and reduces post-merger disruption.

15. Preparing for the Future of Data

The future of data is:

Real-time
Distributed
AI-driven
Highly regulated
Increasingly complex

As organizations scale, data pipelines become more intricate. Manual documentation cannot keep up.

Automated data lineage becomes a foundational capability—not a luxury.

It enables:

Observability
Scalability
Compliance by design
Agile experimentation
Sustainable growth

Organizations that invest in lineage build a durable data foundation capable of adapting to future demands.

Common Misconceptions About Data Lineage

You may have heard some of the following phrases uttered in your workplace. Read on to learn why they are misconceptions.

“We Have Documentation—That’s Enough.”

Static documentation becomes outdated quickly. Lineage must be automated and continuously updated to remain accurate.

“Only Engineers Need Lineage.”

Analysts, compliance teams, executives, and auditors all benefit from visibility into data flows.

“Lineage is Only for Large Enterprises.”

Even startups experience data complexity as they grow. Implementing lineage early prevents scaling problems later.

Key Benefits of Data Lineage

Area	How data lineage helps
Trust	Verifiable data origins
Compliance	Audit-ready documentation
Debugging	Faster root cause analysis
Governance	Clear ownership and control
Efficiency	Reduced redundancy
AI	Transparent model inputs
Risk	Lower operational exposure
Strategy	Confident executive decisions

Improve Your Data Lineage Tracking With Actian

Data lineage is a technical feature of modern data environments as well as a strategic enabler.

Since organizations rely on data to drive innovation, manage risk, and maintain regulatory compliance, understanding the journey of data is essential. Without lineage, businesses operate with blind spots. They react to issues rather than prevent them. They question reports instead of trusting them.

Ready to see how the Actian Data Intelligence Platform streamlines data lineage and makes tracking easier? Sign up for a personalized demonstration of the platform today.

From Spatial to Vectors: How HCL Informix® Brings AI to Your Existing Data

By Jean-Georges Perrin

#AI #Databases #Informix #Vector

Summary

Actian introduces native vector support in Informix, enabling AI use cases without new databases.
Eliminates data movement by combining vectors and operational data in one system.
Supports SQL-based similarity search with full ACID transactions.
Reduces complexity by leveraging existing security, governance, and infrastructure.
Positions “vector as a feature” over standalone vector databases.

The Database That Keeps Evolving

Here’s a story I don’t share often. Working on databases in college made me hate them.

Then karma did what karma does: one of my first jobs involved Informix. That was nearly 30 years ago, and the rest is history. What kept me around wasn’t just the performance or the reliability: it was the fact that Informix never stood still. Every time the industry said, “you need a new tool for that,” Informix said, “or you could just teach me.”

Today, the industry says you need a dedicated vector database for AI. Pinecone. Milvus. Weaviate. A whole new category of infrastructure to deploy, secure, and maintain. And what for? Just to store embeddings alongside the data you already manage.

I’m here to tell you: you don’t need another database. You need the one you have to do more. And that’s exactly what’s happening. HCL Informix® 15 is getting native vector support, coming in Summer 2026. And it’s Actian making it happen.

Be among the first to try the vector blade in HCL Informix 15: join the waitlist.

Why Vector Matters for Your Business

Before we get into the how, let’s talk about the why. Vector search turns unstructured data (text, images, sensor readings, documents) into numerical representations called embeddings. These embeddings can then be compared for similarity. That’s the foundation of semantic search, recommendation engines, and retrieval-augmented generation (RAG). This is the state of the art of AI now.
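To show what "compared for similarity" means in practice, here is a small sketch using cosine similarity, one common distance measure for embeddings. The three-dimensional vectors and product names are made up for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and production systems rely on indexes rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings"; the values are invented for this example.
EMBEDDINGS = {
    "red running shoes": [0.9, 0.1, 0.0],
    "crimson sneakers": [0.8, 0.2, 0.1],
    "stainless steel kettle": [0.0, 0.1, 0.9],
}


def most_similar(query, k=1):
    """Rank the other items by similarity to the query item."""
    q = EMBEDDINGS[query]
    scored = [(cosine_similarity(q, v), name)
              for name, v in EMBEDDINGS.items() if name != query]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


nearest = most_similar("red running shoes")
```

The two shoe descriptions share no keywords beyond the category, yet their vectors point in nearly the same direction, so the sneakers rank ahead of the kettle.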

This isn’t abstract or futuristic. It’s happening right now across the industries where Informix has been a trusted workhorse for decades.

Retail: Product recommendations and visual search that understand intent, not just keywords.

Manufacturing: Anomaly detection from sensor embeddings, catching defects before they become recalls.

Financial services: Fraud pattern matching and document similarity across millions of transactions.

IoT: Similarity-based alerting on time series patterns, a natural bridge from Informix’s existing world-class TimeSeries capabilities.

Hospitality: A hotel chain stores guest profiles in Informix, including booking history, room preferences, dining choices, and spa usage. With vector embeddings, a similarity search at check-in finds guests with the most similar taste profiles and surfaces what they enjoyed: the rooftop restaurant, the late checkout, the spa package, or the bourbon selection at the bar (they start to know me really well). Not because a rule said so, but because similar guests loved it. And because HCL Informix supports read/write vectors, the guest’s embedding updates with every stay, every meal, every review, and this happens within the same ACID transaction that records the booking. No batch job. No stale recommendations.
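The hospitality scenario hinges on one property: the booking and the refreshed embedding are written in a single transaction, so either both land or neither does. The sketch below illustrates that pattern with Python's built-in `sqlite3` module as a stand-in and stores the embedding as JSON text. It does not show the Informix vector blade's syntax or data type, which this article does not document, and the table and column names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE guest (id INTEGER PRIMARY KEY, embedding TEXT NOT NULL)")
conn.execute("CREATE TABLE booking (id INTEGER PRIMARY KEY, guest_id INTEGER NOT NULL,"
             " nights INTEGER NOT NULL CHECK (nights > 0))")
conn.execute("INSERT INTO guest (id, embedding) VALUES (1, ?)", (json.dumps([0.2, 0.8]),))
conn.commit()


def record_stay(guest_id, nights, new_embedding):
    """Insert the booking and refresh the embedding atomically: both or neither."""
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute("UPDATE guest SET embedding = ? WHERE id = ?",
                         (json.dumps(new_embedding), guest_id))
            conn.execute("INSERT INTO booking (guest_id, nights) VALUES (?, ?)",
                         (guest_id, nights))
        return True
    except sqlite3.IntegrityError:
        return False


def current_embedding(guest_id):
    """Read back the stored embedding as a Python list."""
    row = conn.execute("SELECT embedding FROM guest WHERE id = ?", (guest_id,)).fetchone()
    return json.loads(row[0])


ok = record_stay(1, nights=2, new_embedding=[0.3, 0.7])
failed = record_stay(1, nights=0, new_embedding=[0.9, 0.1])  # violates the CHECK
```

The second call fails on the booking insert, and the embedding update that preceded it is rolled back with it, so the guest profile never drifts away from the bookings that justify it.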

The pressure from leadership is real: “add AI” without increasing operational overhead. But there’s a subtler challenge that kills most AI initiatives: the path to production. A proof of concept is easy. Getting it through security review, compliance certification, infrastructure provisioning, backup integration, and operational sign-off? That’s where projects stall (or die more or less quietly). Vector support inside your existing database collapses that path. The security model is already approved. The backup procedures are already in place. The ops team already knows the engine. You’re not asking anyone to adopt new infrastructure. You’re asking them to do more with what they trust.

Nothing Like Informix Doing Vector

Yes, there are vector databases. Yes, PostgreSQL has pgvector. But none of them are Informix.

The new HCL Informix vector blade introduces a native vector data type through the same extensibility architecture that made Informix a leader in spatial, time series, and JSON data. Vectors aren’t bolted on or constrained — they’re first-class citizens, replicated, backed up, indexed, and governed like every other data type in the engine.

Other databases are adding vector support too, but the depth of implementation varies. PostgreSQL with pgvector is the most popular open-source option, but scaling it for enterprise workloads requires careful tuning, and you’re on your own for security and governance. Oracle AI Vector Search is technically strong, but brings Oracle’s heavyweight stack, licensing costs, and complexity with it. And standalone vector databases like Pinecone or Milvus? They solve one problem while creating another: a new system to deploy, secure, sync, and pay for.

HCL Informix takes a different approach. The vector blade treats vectors as native types inside the engine, with the same operational maturity you expect from every other data type Informix handles. Embeddings can be inserted, updated, and deleted like any other column. This enables dynamic RAG workflows, real-time updates, and operational AI (clearly, not just batch analytics).

Here’s what makes HCL Informix unique in this space:

True multi-model from the ground up. SQL + NoSQL + JSON + time series + spatial + vector, all in one engine. Not bolted on, architecturally native.

Proven at scale. 2 million+ transactions per second, enterprise-grade high availability, minimal administration overhead. Your vectors get the same industrial treatment as your transactional data.

No data duplication and no data movement. Your operational data and your AI-ready embeddings live side by side, governed by the same security, backed up by the same processes. No ETL to a sidecar vector store.

SQL you already know. Similarity search through standard SQL using vector distance metrics. No new query language, no new API. If your team knows SQL (and they do, I saw it), adoption will be fast.

ACID on vectors. Transactions that include vector operations alongside relational updates with full consistency. Try that with Pinecone.

AI framework integration. Developers can use HCL Informix as a vector store for RAG applications, connecting directly to AI frameworks.

Free for HCL Informix customers. No additional licensing. No surprise costs. If you run HCL Informix, you get vector capabilities.

And it was no surprise that when I interviewed my friend, Pradeep “M” Muthalpuredathe, Actian’s VP of Engineering for Database Solutions, he frankly told me:

Business leaders in enterprises are consistently being told they need a new database for their AI solutions. I disagree. What they need is for the database they already trust to continuously innovate and meet their requirements. That’s what Informix has always done. Spatial? We got it. Time series? Got it. JSON? Same. Now vectors. HNSW indexing. Semantic search. Production-grade RAG. You see where this is going. All inside the engine our customers love and have relied on for decades. HCL Informix doesn’t ask you to start over. It grows with you and your business needs. That’s not marketing: that’s 30+ years of engineering conviction.

Informix in the Actian AI Ecosystem

The HCL vector blade doesn’t exist in isolation. Actian is building an AI-ready ecosystem around HCL Informix:

The new MCP Server for HCL Informix, also an Actian exclusive that is not available for IBM Informix, exposes database capabilities, including vector search, as tools that AI agents can call directly. Your Informix data becomes accessible to agentic AI workflows without custom integration.

Combined with the Actian Data Intelligence Platform for governance and discovery, Actian Data Observability for data quality monitoring, and Actian AI Analyst (fka Wobby) for conversational analytics grounded in a governed semantic layer, vector data in Informix feeds an ecosystem where business users can ask questions in natural language and get trusted answers from the data you already manage. This isn’t a silo play. This is about making your entire data stack AI-aware from storage to insight.

And let me be direct: both the vector blade and the MCP Server are HCL Informix innovations, researched and developed by Actian. They will not be available in IBM Informix. This is what active R&D investment looks like.

Your Database, Now AI-Ready

You don’t need another database. You need the one you have to do more.

Informix has always been at the forefront of innovation. From being one of the first multi-model databases to handling spatial, time series, and JSON data natively, the engine has never stopped learning. The vector blade is the next chapter, and it’s being written solely by Actian.

My personal wish for what comes next? Native support for data contracts and data products. Through the Linux Foundation’s Bitol project, I chair the development of open standards like ODCS and ODPS. Imagine Informix not just storing your data and vectors, but natively understanding the contracts that describe it and the products that deliver it. No other database does that.

They say you can’t teach an old dog new tricks. They’re wrong. They just haven’t met Informix.

The vector blade for HCL Informix ships in Summer 2026. It’s free for HCL Informix 15 customers.

Informix is a trademark of IBM Corporation in at least one jurisdiction and is used under license.

Three Decades of Teaching Informix New Tricks

The DataBlade® Legacy

The DataBlade® architecture, born in the mid-1990s with Informix Universal Server, was built on a radical idea: the database engine should be able to learn new data types without being rebuilt. Instead of waiting for the vendor to add support for your data, you could extend the engine itself.

That architecture proved itself again and again. Informix was the first commercial database ported to Linux. Spatial data? DataBlade. Time series? DataBlade. JSON and BSON? Native support is built on the same extensibility framework. Each time a new data paradigm emerged, Informix absorbed it natively rather than requiring a separate engine or a bolt-on service.

In fact, this isn’t even Informix’s first encounter with vectors. The Excalibur Image DataBlade, available in the late 1990s, extracted feature vectors from images using neural network techniques and performed similarity search on them, returning ranked results based on vector distance. That was vector similarity search inside a relational database, before “vector database” was even a term.

The vector blade isn’t a new idea for Informix. It’s a homecoming.

Actian Invests, Informix Evolves

The vector blade is an HCL Informix innovation, developed by Actian. It will not be available in IBM Informix.

Actian is actively investing in Informix R&D. HCL Informix 15 delivered massive scalability improvements, external smartblobs, Kubernetes deployment, and REST APIs. The return of 4GL availability. And now, native vector support.

This is not a product on life support. This is a database with an active engineering roadmap, a dedicated R&D team, and a company that’s building its future, not just maintaining its past.

Comparison Table: Vector Database Landscape

Capability	HCL Informix	DB2	pgvector	Oracle AI	Pinecone/Milvus	LanceDB
Read/Write vectors	Yes	Yes*	Yes	Yes	Yes	Yes
Vector replication	Yes	No	Yes	Yes	N/A	N/A
Vector backup/restore	Yes	No**	Yes	Yes	N/A	N/A
Vector indexing	Yes	Early preview	Yes (HNSW)	Yes	Yes	Yes
SQL-native	Yes	Yes	Yes	Yes	No	No
Multi-model (same engine)	Yes	Limited	Extension	Yes	No	No
ACID on vectors	Yes	Yes	Yes	Yes	No	No
On-prem/hybrid	Yes	Yes	Yes	Yes	Limited	Yes
Operational footprint	Light	Heavy	Varies	Heavy	New infra	Light
Free for existing customers	Yes	No	Open source	No	No	Open source
Enterprise security	Yes	Yes	DIY	Yes	Limited	DIY

* DB2 12.1.2+ supports INSERT/UPDATE on VECTOR columns, but with significant operational constraints [16].

** DB2 documentation states: “Logical backup and restore operations do not support the VECTOR type” [16].

DB2: [9], [10], [16], [17]. pgvector: [15], [18]–[21]. Oracle: [22]–[27]. Pinecone/Milvus: [21], [28]–[30]. LanceDB: [14]. Excalibur heritage: [31], [32]. Comparison based on publicly available information as of March 2026.

Bibliography

HCL Informix, Product & Capabilities

[1] Actian. “HCL Informix: High-Performance Database.” https://www.actian.com/databases/hcl-informix/
[2] Taylor, Emily. “Experience Near-Unlimited Storage Capacity With HCL Informix 15.” Actian Blog, August 2025. https://www.actian.com/blog/databases/hcl-informix-15/
[3] Schulte, Mary. “User-Friendly External Smartblobs Using a Shadow Directory.” Actian Blog, February 2025. https://www.actian.com/blog/databases/user-friendly-external-smartblobs-using-a-shadow-directory/
[4] “Data Wars: The Rise of HCL Informix.” Actian Blog, February 2025. Dedicated to Carlton Doe III (in memoriam), founding member of IIUG. https://www.actian.com/blog/databases/data-wars-rise-of-hcl-informix/
[5] Johnson, Nick. “Imagine New Possibilities With HCL Informix.” Actian Blog, August 2025. https://www.actian.com/blog/databases/imagine-new-possibilities-with-hcl-informix/

Actian AI Ecosystem

[6] Radh, Dee. “Actian’s Winter 2026 Product Launch Solves the Agentic Trust Problem and More.” Actian Blog, February 2026. https://www.actian.com/blog/product-launches/winter-2026-launch/
[7] Actian Corporation. “Actian Introduces Data Observability Agents for the Agentic AI Era.” Press release, February 24, 2026. Via BigDATAwire.
[8] Actian. “Actian Data Intelligence Platform.” https://www.actian.com/data-intelligence/platform/

Competitive Landscape & Comparison Table Sources

IBM. “Announcing IBM Db2 12.1.2: Empowering your AI and cloud data transformation.” June 2025. https://www.ibm.com/new/announcements/ibm-db2-12-1-2-empowering-your-ai-and-cloud-data-transformation
IBM. “IBM Db2 12.1.3 now generally available.” November 2025. https://www.ibm.com/new/announcements/ibm-db2-12-1-3-now-generally-available-advancing-ai-for-enterprise-data-management
IBM. “Announcing the IBM Db2 Vector Store integration forLlamaIndex.” November 2025. https://www.ibm.com/new/announcements/announcing-the-ibm-db2-vector-store-integration-for-llamaindex
LangChain. “IBM db2 vector store and vector search integration.” https://python.langchain.com/docs/integrations/vectorstores/db2/
SQLServerCentral. “Vectors in SQL Server 2025.” March 2026. https://www.sqlservercentral.com/articles/vectors-in-sql-server-2025
LanceDB. https://lancedb.com/
pgvector. PostgreSQL vector extension. GitHub. https://github.com/pgvector/pgvector
IBM. “Vector values.” Db2 12.1.x docs. Sections: “UPDATE and INSERT operations with vectors” (confirms read/write), “Vector limitations” (no replication, no logical backup/restore, no index/primary/foreign keys, no ORDER BY, no GROUP BY, no JOIN, no SELECT DISTINCT). https://www.ibm.com/docs/en/db2/12.1.x?topic=list-vector-values
Garcia-Arellano, Christian. “Vector Indexes in Db2 — An early preview.” IDUG, February 12, 2026.
Instaclustr/NetApp. “pgvector: Key features [2026 guide].” “Replication, backup, and role-based access control automatically extend to vector data.” https://www.instaclustr.com/education/vector-database/pgvector-key-features-tutorial-and-pros-and-cons-2026-guide/
Calmops. “PostgreSQL Vector Search: Complete Guide 2026.” “pg_dumpand continuous archiving work with vector columns. Point-in-time recovery includes vector data.” https://calmops.com/database/postgresql-vector-search-pgvector-2026/
Microsoft Azure. “Optimize performance of vector data on Azure Database for PostgreSQL.” HNSW andIVFFlatindexes, 2000-dimension limit. https://learn.microsoft.com/en-us/azure/postgresql/extensions/how-to-optimize-performance-pgvector
DEV Community (polliog). “PostgreSQL as a Vector Database.” 2026. ACID transactions for vectors + relational data; “No ACID — Like Pinecone, not a general database.” https://dev.to/polliog/postgresql-as-a-vector-database-when-to-use-pgvector-vs-pinecone-vs-weaviate-4kfi
Oracle. “Oracle AI Vector Search User’s Guide.” VECTOR data type, INSERT/UPDATE, similarity search. https://docs.oracle.com/en/database/oracle/oracle-database/26/vecse/overview-ai-vector-search.html
Oracle blog. “GoldenGate23ai and Oracle Database 23ai Vectors.” “Full replication of vectors.” https://blogs.oracle.com/dataintegration/goldengate-database-23ai-vectors
Oracle blog. “GoldenGate23ai vector replication between Oracle and PostgreSQL.” https://blogs.oracle.com/dataintegration/goldengate-23ai-vector-replication
Oracle. “Oracle Database 23ai Brings the Power of AI.” May 2024. “All mission-critical features now work transparently with AI vectors.” https://www.oracle.com/news/announcement/oracle-announces-availability-database-23ai-with-ai-vector-search-2024-05-02/
Oracle. “Oracle AI Database 26ai Release Notes.” “Data redaction is not supported for the VECTOR data type.” https://docs.oracle.com/en/database/oracle/oracle-database/26/rnrdm/issues-all-platforms-2.html
Oracle. “Indexing Guidelines with AI Vector Search” (June 2025) and “Using Hybrid Vector Indexes” (May 2025). https://www.oracle.com/database/ai-vector-search/
Oracle (competitive page). “What Is Pinecone?” “Lacking in SQL support and advanced relational querying.” https://www.oracle.com/database/vector-database/pinecone/
Pinecone Docs. “Database limits.” https://docs.pinecone.io/reference/api/database-limits
Braincuber Technologies. “Pinecone vs pgvector: Comparison Guide 2025.” https://www.braincuber.com/blog/pinecone-vs-pgvector-which-vector-db-for-your-project
Oninit. “Excalibur Text Search DataBlade Module.” etx access method, ranked text search. https://www.oninit.com/manual/informix/english/docs/dbdk/is40/dbdktour/xb4.html
IBM. “Excalibur Image DataBlade Module.” Feature vector extraction via neural networks, similarity search with ranked results. https://public.dhe.ibm.com/software/data/informix/pubs/pdfs/excalibur2.pdf

Informix History & Community

“Informix.” Wikipedia. https://en.wikipedia.org/wiki/Informix
“Actian.” Wikipedia. https://en.wikipedia.org/wiki/Actian
International Informix Users Group (IIUG). https://www.iiug.org
IBM. “IBM Informix DataBlade Modules: Release notes.” https://www.ibm.com/support/pages/ibm-informix-DataBlade-modules-release-notes-documentation-notes-and-machine-notes
“Informix Corporation.” Wikipedia. https://en.wikipedia.org/wiki/Informix_Corporation

Industry Trends

McKinsey & Company. 51% of enterprises using AI have encountered negative consequences. Referenced in Actian Data Observability Agents press release [7].
Gartner. “By 2026, 50% of enterprises implementing distributed data architectures will have adopted data observability tools.” Market Guide for Data Observability Tools, June 2024.
Actian Corporation. “The Governance Gap: Why 60% of AI Initiatives Fail.” Actian Blog. https://www.actian.com/blog/data-governance/the-governance-gap-why-60-percent-of-ai-initiatives-fail/

Actian AI Analyst

Actian Corporation. “Actian Unveils Conversational Analytics Solution.” Press release, March 10, 2026. https://www.actian.com/company/press-releases/actian-unveils-conversational-analytics-solution-with-intelligently-generated-semantic-foundation-for-trusted-insights/

About Author

About Jean-Georges Perrin

Jean-Georges Perrin has been part of the Informix community for nearly 30 years. Elected to the Board of Directors of the International Informix Users Group (IIUG) in 2002, he served for 15 years and was the first non-US citizen elected to that board. A Lifetime IBM Champion, a distinction he earned through 16 consecutive years of recognition starting in 2009, when he became the first French citizen to receive the title, Jean-Georges has authored two ebooks on Informix, presented at IIUG conferences on topics ranging from 4GL modernization to proving that Informix is not just for legacy applications, and contributed to the Informix community across three continents. He serves in Actian’s CTO Office, focused on data standards and AI strategy. But when HCL Informix does something worth writing about, old habits kick in. He chairs the Linux Foundation's Bitol project, where he leads the development of open standards for data contracts (ODCS) and data products (ODPS). He is the author of several books, including Implementing Data Mesh (O'Reilly) and Spark in Action, 2nd edition (Manning), and is currently writing Building Data Products for O'Reilly. Working on databases in college made him hate them. Then karma introduced him to Informix. He never left.

How to Evaluate Vector Databases in 2026

By Tahiya Chowdhury

#Databases #RAG #VectorAI DB

Summary

Most vector database benchmarks are vendor-optimized and fail to reflect real-world production conditions like concurrency, filtering, and continuous ingestion.
Key production risks include tail latency (P95/P99), performance degradation over time, and rising total cost of ownership at scale.
The industry is shifting toward “vector as a feature,” favoring integrated platforms like PostgreSQL + pgvector or Actian VectorAI DB over standalone vector databases.
Effective evaluation requires real-world testing with high-dimensional data, concurrent workloads, and long-term cost modeling.

In 2026, the vector database market faces a crisis of synthetic performance claims. A GitHub search for “vector database benchmark” reveals polished repositories with dashboards and performance charts. However, vendors often build these tools to evaluate their own products and portray architecture-specific strengths as objective comparisons.

Zilliz maintains VectorDBBench. Redis and Qdrant publish benchmark suites that highlight their own systems. Even widely cited Approximate Nearest Neighbor (ANN) evaluations, such as ANN-Benchmarks, rely on older, comparatively low-dimensional image-descriptor datasets such as SIFT (Scale-Invariant Feature Transform, 128 dimensions) and GIST (a global image descriptor, 960 dimensions). Modern Large Language Model (LLM) embeddings often reach 3,072 dimensions. These benchmarks do not reflect that reality.

Leaderboards reward performance under static conditions, yet production systems must survive continuous writes, metadata filters, and concurrency spikes. As software engineer Simon Frey famously noted in a viral post: “The best vector database is the one you already have.” This captures the 2026 market shift, prompting teams to move from specialized silos toward the databases they already trust and operate.

This guide takes a production-first approach. We define the five critical tests for 2026 and explore why your optimal vector database may already exist within your current architecture, whether that is PostgreSQL with pgvector or an enterprise hybrid engine like Actian VectorAI DB.

TL;DR

The bias: Most benchmark suites originate from vendors and optimize for narrow architectural advantages.
The reality: Production workloads include continuous ingestion, metadata filtering, and concurrency spikes that synthetic tests ignore.
The risk: Tail latency (P99), index fragmentation, and write amplification degrade systems long before average QPS drops.
The cost curve: Managed vector services often introduce nonlinear pricing as the dataset size increases.
The direction: 2026 favors integrated platforms, from established relational extensions (PostgreSQL + pgvector) to enterprise hybrid systems (Actian VectorAI DB), over “vector-only” silos.

Why Every Benchmark You’ve Seen is Vendor-Optimized

Benchmarks create a perception of objectivity but often encode architectural assumptions. Tools like VectorDBBench (Zilliz) reward distributed scaling, while Redis and Qdrant suites emphasize in-memory operations. To find objective data, architects must look to peer-reviewed academic conferences such as NeurIPS and VLDB (Very Large Data Bases), which prioritize algorithmic rigor over marketing.

Before examining what matters in production, it helps to understand how common benchmark tools shape outcomes.

Benchmark tool	Primary creator	Optimization focus	Typical bias
VectorDBBench	Zilliz (Milvus)	High-throughput scaling	Favors massive clusters; penalizes single-node systems.
vector-db-benchmark	Redis/Qdrant	In-memory operations	Favors RAM-heavy architectures; ignores TCO of memory.
ANN-Benchmarks	Academic	Raw algorithm efficiency	Uses outdated, low-dimensional datasets (SIFT/GIST).
NeurIPS / VLDB	Academic Peers	Algorithmic robustness	Focuses on math/theory; ignores operational/SLA reality.

The Hidden Rules of Benchmarking

A significant hurdle is the “DeWitt Clause,” a legal provision in many End User License Agreements (EULAs) that prohibits users from publishing independent benchmarks without the vendor’s permission. In 2024, BenchANT found that 30% of the major vector databases use such terms to legally prohibit the publication of unfavorable benchmark results.

Furthermore, these benchmarks often operate at “Time Zero,” the artificial window immediately following ingestion but preceding live updates. In production, systems must constantly insert and delete data, forcing the index to re-optimize in real time. Vendor benchmarks often omit the Out-of-Memory (OOM) failures that result.

The Five Production Tests That Actually Matter

Most benchmarks measure performance after loading data, before any real updates occur. But production is a nonstop, unpredictable process. To find a database that can handle real users, you should run these five stress tests.

1. Filtering under concurrent load

Pure vector similarity searches are rare in real life. In production, you’re more likely to search for something like “Product recommendations WHERE category is ‘shoes’ AND stock > 0.”

Reddit’s engineering team, managing 340M+ vectors, identified metadata filtering as the primary performance bottleneck in their 2025 deployment. They found that as concurrent users grew, the database spent more time resolving metadata filters than calculating similarity distances.

The reality: Production means 100+ concurrent clients hitting different metadata subsets.
The gap: VectorDBBench only tests with a single client. In real-world situations, moving data between the vector graph and the relational metadata store can cause P99 latency to jump by 10x, as the CPU waits for disk I/O.
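A concurrent, filter-heavy test of this kind can be approximated with a small harness. The sketch below is a minimal illustration, not a finished benchmark: `run_filtered_query` and the `FILTERS` list are hypothetical placeholders (the query is simulated with a short sleep so the harness runs on its own), and you would swap in your real client call and the predicates your application issues.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical metadata filters; substitute the predicates your application really issues.
FILTERS = [{"category": c, "in_stock": True} for c in ("shoes", "bags", "hats", "coats")]


def run_filtered_query(metadata_filter):
    """Placeholder for a real filtered vector search (assumption: not a real client API)."""
    time.sleep(random.uniform(0.001, 0.003))  # simulated work so the harness runs standalone
    return []


def timed_query(metadata_filter, query_fn=run_filtered_query):
    """Run one query and return its latency in milliseconds."""
    start = time.perf_counter()
    query_fn(metadata_filter)
    return (time.perf_counter() - start) * 1000.0


def run_load(clients=100, queries_per_client=5, query_fn=run_filtered_query):
    """Issue queries from `clients` concurrent workers, each job using a random filter."""
    jobs = [random.choice(FILTERS) for _ in range(clients * queries_per_client)]
    with ThreadPoolExecutor(max_workers=clients) as pool:
        return list(pool.map(lambda f: timed_query(f, query_fn), jobs))


if __name__ == "__main__":
    latencies = run_load()
    print(f"{len(latencies)} queries, slowest {max(latencies):.1f} ms")
```

Threads are a reasonable fit here because a database client spends most of its time waiting on the network. Keep the per-query latencies rather than only the average, since the slowest queries are what this test is meant to expose.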

2. Performance degradation over time

While archival retrieval-augmented generation (RAG) systems can technically use static knowledge bases, production-grade applications in 2026 must reflect real-time data, such as customer tickets or product inventory. As the engineering team at Milvus admitted, “Benchmarks test after data ingestion completes, but production data never stops flowing.” If the database cannot re-index as quickly as it ingests data, your AI may provide stale or incorrect answers for hours.

Benchmarks that omit a “72-hour continuous write-and-query” test provide zero value. You must determine whether query performance degrades after six months of continuous index maintenance.

3. Tail latency under load (P95/P99)

Average latency can be misleading and doesn’t show what users really experience. For example, a 10ms average response time doesn’t help if your slowest 1% of queries (P99) take 800ms. This makes your AI agent seem slow and unreliable. Only high-concurrency tests reveal these spikes, which often happen during garbage collection or index locking.
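To make the gap between average and tail latency concrete, the following sketch computes the mean, P95, and P99 from a list of latency samples using the nearest-rank method. The sample values are invented purely for illustration and are not measurements of any system.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the mean alongside the tail percentiles that averages hide."""
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


if __name__ == "__main__":
    # Illustrative numbers only: 980 fast queries and 20 slow ones.
    samples = [5.0] * 980 + [800.0] * 20
    print(summarize(samples))
```

In this made-up sample the mean stays near 21 ms and P95 is 5 ms, while P99 is 800 ms, which is why a report that publishes only averages can hide the queries users notice most.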

4. Total cost of ownership (TCO)

In 2025, managed vendors introduced complex “read unit” pricing. This created a “Growth penalty”: if your index grows from 10GB to 100GB, you may pay 10x as much for the same query result.

Scale metric	Managed Vector DB (usage-based)	Integrated/Hybrid platform	TCO impact
Initial (10GB)	High (Platform fee + usage)	Moderate (Fixed resource)	Integrated is ~40% lower
Growth (100GB)	High (Scales with volume)	Low (Vertical scaling)	8x cost gap
Enterprise (1TB+)	Prohibitive (Linear growth)	Optimized (Reserved capacity)	90%+ long-term savings

This economic reality primarily drives the market’s shift toward “Vector as a Feature,” in which teams prioritize on-premises capabilities and predictable scaling over usage-based silos.
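A rough way to compare the two cost curves is to model them side by side. The sketch below is a simplified model under stated assumptions: every price, fee, and capacity figure is a hypothetical placeholder, not a quote from any vendor, and real TCO should also include egress, re-indexing, and staffing costs.

```python
import math


def usage_based_cost(data_gb, price_per_gb_month, platform_fee_month, months=36):
    """Cost that grows with stored volume, plus a flat monthly platform fee."""
    return months * (platform_fee_month + data_gb * price_per_gb_month)


def fixed_capacity_cost(data_gb, node_capacity_gb, node_price_month, months=36):
    """Cost that grows in steps: you pay per provisioned node, not per GB."""
    nodes = max(1, math.ceil(data_gb / node_capacity_gb))
    return months * nodes * node_price_month


if __name__ == "__main__":
    # HYPOTHETICAL prices for illustration only; replace with quotes from your own vendors.
    for size_gb in (10, 100, 1000):
        usage = usage_based_cost(size_gb, price_per_gb_month=2.0, platform_fee_month=50.0)
        fixed = fixed_capacity_cost(size_gb, node_capacity_gb=500, node_price_month=300.0)
        print(f"{size_gb:>5} GB  usage-based: {usage:>9.0f}  fixed-capacity: {fixed:>9.0f}")
```

The output of this model depends entirely on the inputs you supply. Its purpose is to show the shape of each curve over a 36-month horizon, so plug in your own projected data growth before drawing conclusions.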

5. Operational maturity

Benchmarks ignore the “Operational Support Tax,” which quantifies the cost and risk of maintaining specialized infrastructure. You can easily find a PostgreSQL expert because the community has thrived for 30 years, but hiring someone proficient in a niche, three-year-old vector database often creates a bottleneck.

Evaluate the ecosystem: Does the database work with standard backup tools? Can it integrate with Prometheus? How long does it take to rebuild an index after a crash?

Here’s how benchmark claims compare to production reality.

Metric	Benchmark focus	Production reality
Ingestion	Static QPS after completion	Sustained QPS during continuous writes
Latency	Average latency	P95/P99 Latency under concurrent load
Filtering	Single-client filtered search	100+ Concurrent metadata-filtered queries
Cost	Infrastructure cost per query	TCO at 100M+ queries/month

Figure: The ingestion cliff

Spotting these hidden bottlenecks is the first step to building a strong system. In 2026, the answer is rarely to use a faster, specialized database. Instead, engineers are adding these features to the tools they already know and trust.

The Consolidation Shift: Vector as a Feature

Corey Quinn, Chief Cloud Economist, once said: “Vector is a feature, not a product.” This prediction shapes the 2026 market. Teams are moving away from specialized “Vector-Only” databases and choosing integrated “Vector-Also” platforms. Shifting data between a main database and a separate vector database often causes more problems than it fixes.

The PostgreSQL renaissance

Engineers frequently argue on platforms like Hacker News that ~80% of RAG use cases (specifically those with fewer than 2M embeddings) do not require a specialized vector database. For these workloads, standalone silos often introduce more operational friction than they offer in performance gains. Instacart validated this at scale by migrating from Elasticsearch to PostgreSQL, achieving 80% cost savings and reducing write workload by 10x after eliminating the need to coordinate and reconcile data across fragmented architectures.

Recently, pgvectorscale achieved 471 queries per second at 99% recall on 50 million vectors, outperforming Qdrant’s 41 QPS on identical AWS hardware. Vendor benchmarks often omit this result because it shows that most RAG applications don’t require a specialized vendor.

Performance metric	PostgreSQL (pgvector + pgvectorscale)	Qdrant (Specialized)	The Delta
Throughput (QPS)	471.57	41.47	11.4x higher in Postgres
P95 Latency	60.42 ms	36.73 ms	Qdrant is 39% faster at tail
P99 Latency	74.60 ms	38.71 ms	Qdrant is 48% faster at tail
Hardware	AWS r6id.4xlarge (16 vCPU)	AWS r6id.4xlarge (16 vCPU)	Parity
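The figures in the Delta column follow directly from the numbers in the table. As a quick check, this sketch recomputes them; the helper names are illustrative, and the inputs are the values listed above.

```python
def throughput_ratio(a_qps, b_qps):
    """How many times higher a_qps is than b_qps."""
    return a_qps / b_qps


def percent_lower(slower_ms, faster_ms):
    """How much lower faster_ms is, expressed as a percentage of slower_ms."""
    return (slower_ms - faster_ms) / slower_ms * 100.0


if __name__ == "__main__":
    print(round(throughput_ratio(471.57, 41.47), 1))  # 11.4 (throughput, Postgres vs. Qdrant)
    print(round(percent_lower(60.42, 36.73)))         # 39 (P95, Qdrant vs. Postgres)
    print(round(percent_lower(74.60, 38.71)))         # 48 (P99, Qdrant vs. Postgres)
```

The same comparison shows both sides of the trade-off: the PostgreSQL stack wins on throughput, while Qdrant holds lower tail latency.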

The integrated enterprise gap

For workloads that exceed basic extensions, Actian VectorAI DB bridges the gap by embedding a high-performance engine with native vector support. Teams can execute metadata filtering and similarity search within a single system, reducing data movement and simplifying query execution.

Platform	Architectural strategy	Intended AI capability
Actian VectorAI DB	High-performance hybrid	Engineered for integrated analytics + native vector support.
PostgreSQL	Integrated feature	Leverages `pgvector` within standard SQL.
AWS S3 Vectors	Storage-centric	Designed to query multi-billion vectors in object storage.
MongoDB Atlas	Unified document/vector API	Integrates native vector search directly into the existing document store workflow.

As the market consolidates, the way we evaluate databases shifts. Teams no longer ask, “Who has the fastest graph?” They ask, “Which architecture provides the most reliable query engine?” No universal winner exists. Teams instead face a spectrum of trade-offs between specialized speed and integrated reliability.

The evaluation process now puts more weight on operational strength and real-world flexibility. Reliable query execution is becoming the top priority, especially given the growing demand for hybrid search.

Hybrid Search Reality That Pure Vector Benchmarks Hide

Pure vector search often fails the “groundedness” test, which measures how strictly an AI’s response relies on provided source material. A high groundedness score ensures that the LLM avoids fabrication and adheres closely to your internal data.

According to an analysis by the Microsoft Azure DevBlog, pure vector search alone struggles with factual accuracy, scoring a mediocre 2.79 out of 5 for groundedness. The solution is Hybrid Search, which blends semantic vector similarity with traditional keyword matching (BM25).

The 20–40% performance penalty

Hybrid search demands significant computation. The database must rank results from two different engines, such as lexical and semantic, then merge them using a fusion algorithm. Production implementations typically see a 20–40% performance penalty when moving from pure vector search to hybrid search. Reciprocal Rank Fusion (RRF) creates most of this “merge tax”, which, according to Elastic’s research, can significantly increase query latency compared to single-index lookups.
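For readers unfamiliar with the fusion step, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. It uses the commonly cited constant k = 60; the document ids and the two input rankings are invented for illustration, and production engines typically run this step inside the query engine rather than in application code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids; score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    lexical = ["doc_a", "doc_b", "doc_c"]   # e.g. BM25 order
    semantic = ["doc_c", "doc_a", "doc_d"]  # e.g. vector-similarity order
    print(reciprocal_rank_fusion([lexical, semantic]))
    # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists rise to the top, but the engine must first produce both full rankings, which is where the extra latency comes from.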

Databases that integrate vector search with filtering, full-text search, and query execution in a single engine execute hybrid queries within a single atomic statement. The query optimizer can evaluate metadata filters, full-text conditions, and vector similarity at once. This lets the optimizer produce better execution plans and move less data.

In contrast, specialized vector silos fragment the query path. Applications route requests across multiple systems and merge results outside the database. This increases system complexity and introduces unpredictable latency under load.

Hybrid platforms such as Actian VectorAI DB address this problem by embedding vector search within the database engine. This design removes cross-system joins, simplifies operations, and reduces long-term architectural overhead.

Figure: Integrated query execution vs. application layer merge

Build Your Own Evaluation Framework

Stop asking which database won a GitHub leaderboard. Start asking which architecture survives your constraints. In 2026, these constraints center on data residency, scale, and team expertise.

The case for hybrid and on-premises

Data residency is no longer optional for global companies. With EU AI Act penalties reaching 35M Euros or 7% of global revenue, cloud-only vector databases represent a legal non-starter for regulated industries.

Sovereignty: 60% of financial firms outside the US plan to adopt sovereign/on-premises vector solutions by 2028.
Cost: As query volumes hit 100M/month, the “cloud tax” becomes visible. Self-hosting or using hybrid platforms like Actian can cut your infrastructure bill in half.
Maturity: If you already manage a relational database, your team possesses 90% of the required skills.

The 2026 architecture decision tree

Does the data require on-premises storage for compliance? → Prioritize Actian VectorAI DB or self-hosted PostgreSQL.
Does your query volume exceed 100M/month? → Avoid managed usage-based pricing; use self-hosted or reserved capacity.
Do you require complex metadata filtering? → An integrated relational/vector engine is non-negotiable.
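The three questions above can be written down as a small function so the decision is explicit and repeatable across teams. This is only a sketch of the checklist as stated; the function name, parameters, and the fallback message are assumptions added for illustration.

```python
QUERY_VOLUME_THRESHOLD = 100_000_000  # queries per month, from the checklist above


def recommend_architecture(needs_on_prem, monthly_queries, complex_filtering):
    """Return the recommendations triggered by the three decision-tree questions."""
    recommendations = []
    if needs_on_prem:
        recommendations.append("Prioritize Actian VectorAI DB or self-hosted PostgreSQL.")
    if monthly_queries > QUERY_VOLUME_THRESHOLD:
        recommendations.append(
            "Avoid managed usage-based pricing; use self-hosted or reserved capacity."
        )
    if complex_filtering:
        recommendations.append("Use an integrated relational/vector engine.")
    # Fallback wording is an assumption; the checklist itself does not define this case.
    return recommendations or ["No checklist constraint applies; decide on workload tests."]


if __name__ == "__main__":
    for line in recommend_architecture(True, 250_000_000, False):
        print(line)
```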

How to Evaluate the Evaluators

To avoid letting vendor benchmarks mislead you, give the evaluation tool the same careful review you give the database. To spot a biased test, look past the headline QPS numbers and check the exact conditions that produced them.

Use the following evaluation rubric to review any benchmark report before it shapes your architectural decisions.

Evaluation metric	Red flag (Discard result)	Green flag (Trustworthy result)
Ingestion state	Queries run against a static, immutable index with zero background writes.	“Read-while-Write” testing, where queries run during continuous data ingestion.
Hardware parity	Vendor cloud “Optimized” vs. Competitor “Default” local/mismatched instances.	Verified identical CPU, RAM, and Disk I/O configurations across all tested systems.
Data selectivity	“High Selectivity” filters (99% of data removed) that hide join/scan inefficiencies.	“Low Selectivity” (10–20% filtered) tests that force the engine to handle large-scale index traversal.
Dimensionality	Testing only on legacy datasets such as SIFT (128 dimensions) or GIST (960 dimensions).	Testing on 1,536 or 3,072-dimension vectors that match modern LLM outputs.
Latency metric	Focuses strictly on “Average Latency” or “Mean Response Time.”	Clearly publishes P95 and P99 tail latency under high concurrent load.

Pre-Commitment Checklist

Test with production-representative high-dimensional embeddings (3,072d+).
Measure P99 latency with 100+ concurrent users hitting diverse metadata filters.
Calculate 3-year TCO, including storage growth, egress, and re-indexing fees.
Confirm that your team can manage observability and backups for the new stack.

Final Thoughts

Real evaluation requires testing with your data, your patterns, and your scale. Load your production-representative data, run a week-long stability test under concurrent load, and measure P99 latency and the TCO.

If your workload requires compliance, hybrid deployment, or production-grade operational maturity that managed vector databases don’t offer, then Actian VectorAI DB early access is the right next step.

Join the Actian community on Discord to discuss vector architecture with engineers solving real production problems.

About Author

About Tahiya Chowdhury

Tahiya Chowdhury is the Product Manager for Actian Zen, where she leads the strategy for the industry's most robust edge data platform. Drawing from her background at MongoDB and Goldman Sachs, Tahiya specializes in building products that sit at the intersection of high-scale enterprise needs and modern developer velocity. She is passionate about removing the complexity from data infrastructure, empowering engineering teams to move faster from prototype to production.

Actian Zen and Apache Kafka Integration Using Kafka Connect (JDBC)

By Johnson Varughese

#Databases #ETL #JDBC #Zen

Summary

Build a real-time financial data pipeline by streaming Actian Zen data to Apache Kafka using JDBC Source and Sink connectors.
Append-only source tables and idempotent upserts enable low-latency, replayable, and audit-ready trade event streaming.
Avro with Schema Registry ensures strong schema governance and safe evolution for financial workloads.
This architecture modernizes batch systems into streaming-first designs without replacing operational databases.

Modern financial systems are no longer built around overnight batches or periodic ETL jobs. Pricing engines, trade capture systems, risk dashboards, and compliance platforms all depend on continuous streams of events that must be processed with low latency, high reliability, and full observability.

At the same time, many organizations already rely on proven operational databases to store transactional data and power business-critical applications. Replacing those systems is rarely an option.

This engineering walkthrough shows how Actian Zen and Apache Kafka can work together to form a robust real-time data pipeline—without rewriting applications or introducing complex custom code. Using Kafka Connect JDBC Source and Sink connectors, we stream financial trade-like data from Zen into Kafka and back into Zen, creating a reusable architectural pattern suitable for real-world financial workloads.

Why Streaming Matters in Finance

Financial data has a unique set of characteristics:

Time sensitivity: Stale data can invalidate decisions.
Burstiness: Market open/close and volatility create spikes.
Strict correctness: Duplicates or missing events are unacceptable.
Auditability: Teams must replay and explain historical decisions.

Traditional batch architectures struggle under these requirements. By contrast, streaming architectures treat each record as an immutable event and allow downstream systems to react in near real time.

Kafka has become the backbone for event-driven pipelines, but Kafka alone doesn’t solve database integration. Kafka Connect bridges this gap by moving data between databases and Kafka using configuration rather than custom code.

What We’re Building

This pipeline demonstrates how financial trade-like data can be streamed from an operational Zen database into Kafka and then written back into a downstream Zen table using JDBC Source and Sink connectors.

The flow is:

A Python process generates synthetic trade ticks.
Each tick is inserted into a Zen source table (FinanceSource).
A Kafka Connect JDBC Source Connector reads new rows incrementally.
Records are published to Kafka as Avro messages (Schema Registry manages schemas).
A Kafka Connect JDBC Sink Connector consumes the topic.
Records are upserted into a Zen sink table (Finance).

This pattern maps directly to market data ingestion, trade replication, streaming ETL, and operational reporting.

A Look at Architecture

Figure: Architecture diagram

At a high level, the architecture has three layers:

Data generation and operational storage: Actian Zen stores incoming trade ticks.
Streaming backbone: Kafka provides a durable, replayable event log.
Integration and delivery: Kafka Connect reads from Zen and writes back to Zen.

A key design principle is decoupling: producers don’t depend on consumers, and the database remains the system of record.

Data Model Design in Action

Schema design is foundational. This demonstration uses two Zen tables with clearly defined roles:

Source Table: FinanceSource (append-only)

CREATE TABLE FinanceSource (
    id            IDENTITY      PRIMARY KEY,
    symbol        VARCHAR(16)   NOT NULL,
    trade_date    DATE          NOT NULL,
    trade_time    TIME          NOT NULL,
    price         DECIMAL(18,6) NOT NULL,
    volume        INTEGER       NOT NULL,
    bid           DECIMAL(18,6),
    ask           DECIMAL(18,6),
    exchange      VARCHAR(16),
    currency      VARCHAR(8)    DEFAULT 'USD',
    recorded_at   TIMESTAMP     NOT NULL
);

Two columns are especially important for streaming:

id provides a stable incrementing cursor.
recorded_at provides event time and enables safe incremental reads.

Sink Table: Finance (Materialized State)

CREATE TABLE Finance (
    id            INTEGER       PRIMARY KEY,
    symbol        VARCHAR(16),
    trade_date    DATE,
    trade_time    TIME,
    price         DECIMAL(18,6),
    volume        INTEGER,
    bid           DECIMAL(18,6),
    ask           DECIMAL(18,6),
    exchange      VARCHAR(16),
    currency      VARCHAR(8),
    recorded_at   TIMESTAMP
);

The sink uses id as the primary key, enabling idempotent upserts during replay or restart.

Generating Trade Ticks With Python

The generator simulates a live market feed by inserting a new record every two seconds. Each event includes symbol, price, bid/ask, volume, exchange, currency, and timestamps.

The generator function creates realistic market data:

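import random
from datetime import datetime

# Illustrative placeholders; the demo's actual symbol and exchange lists are not shown here.
SYMBOLS = ["AAPL", "MSFT", "GOOG"]
EXCHANGES = ["NASDAQ", "NYSE"]
CURRENCY = "USD"


def gen_tick():
    """Build one synthetic trade tick with a bid/ask spread centered on the price."""
    symbol = random.choice(SYMBOLS)
    price = round(random.uniform(10, 1500), 6)
    spread = round(random.uniform(0.01, 0.50), 6)
    bid = round(price - spread / 2, 6)
    ask = round(price + spread / 2, 6)
    vol = random.randint(1, 5000)
    now = datetime.now()
    return {
        "symbol": symbol,
        # Derive date and time from the same instant so they cannot disagree at midnight.
        "trade_date": now.date(),
        "trade_time": now.time().replace(microsecond=0),
        "price": price,
        "volume": vol,
        "bid": bid,
        "ask": ask,
        "exchange": random.choice(EXCHANGES),
        "currency": CURRENCY,
        "recorded_at": now,
    }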

Insert statement:

sql = """
INSERT INTO FinanceSource (
    symbol, trade_date, trade_time, price, volume,
    bid, ask, exchange, currency, recorded_at
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
"""

This append-only approach is a good match for Kafka: every row is an immutable event that can be streamed, replayed, and consumed by multiple downstream services.

Streaming Zen → Kafka With the JDBC Source Connector

Kafka Connect’s JDBC Source Connector polls FinanceSource and publishes messages to Kafka.

Topic mapping:

Connector name: demo-finance-source
Topic prefix: finance.
Topic: finance.FinanceSource

Incremental mode:

"mode": "timestamp+incrementing",
"timestamp.column.name": "recorded_at",
"incrementing.column.name": "id",
"poll.interval.ms": "2000"

This mode reads only new rows, avoids full scans, and supports safe restarts. Polling every two seconds balances latency and load for demo and moderate workloads; in production, tune the interval based on row-insert frequency and database capacity.
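The idea behind a combined timestamp and incrementing cursor can be illustrated with a simplified polling loop. The sketch below is not the connector's implementation: it uses an in-memory SQLite table as a stand-in for Zen, stores timestamps as ISO-8601 strings, and the helper names are hypothetical. It only shows why tracking both columns lets a poller resume without re-reading or skipping rows.

```python
import sqlite3


def make_source():
    """Create an in-memory stand-in for the FinanceSource table (SQLite, not Zen)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE FinanceSource ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, symbol TEXT, recorded_at TEXT)"
    )
    return conn


def insert_tick(conn, symbol, recorded_at):
    conn.execute(
        "INSERT INTO FinanceSource (symbol, recorded_at) VALUES (?, ?)",
        (symbol, recorded_at),
    )


def poll_once(conn, cursor):
    """Return rows strictly after the (timestamp, id) cursor, plus the advanced cursor."""
    last_ts, last_id = cursor
    rows = conn.execute(
        "SELECT id, symbol, recorded_at FROM FinanceSource "
        "WHERE recorded_at > ? OR (recorded_at = ? AND id > ?) "
        "ORDER BY recorded_at, id",
        (last_ts, last_ts, last_id),
    ).fetchall()
    if rows:
        newest = rows[-1]
        cursor = (newest[2], newest[0])  # remember the last timestamp and id seen
    return rows, cursor


if __name__ == "__main__":
    conn = make_source()
    insert_tick(conn, "AAA", "2026-01-01T09:30:00")
    insert_tick(conn, "BBB", "2026-01-01T09:30:02")
    rows, cursor = poll_once(conn, ("", 0))
    print(len(rows), cursor)
    insert_tick(conn, "CCC", "2026-01-01T09:30:02")  # same timestamp, higher id
    rows, cursor = poll_once(conn, cursor)
    print(rows)
```

The second poll returns only the new row even though it shares a timestamp with an already-delivered one, because the id breaks the tie.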

Complete source connector configuration:

"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"connection.url": "jdbc:pervasive://host.docker.internal:1583/DEMODATA",
"dialect.name": "ZenDatabaseDialect",
"mode": "timestamp+incrementing",
"timestamp.column.name": "recorded_at",
"incrementing.column.name": "id",
"table.whitelist": "FinanceSource",
"topic.prefix": "finance.",
"poll.interval.ms": "2000",
"value.converter": "io.confluent.connect.avro.AvroConverter"

Avro and Schema Registry for Schema Governance

Financial schemas evolve: new metrics, new identifiers, or adjusted precision come into play. Avro with Schema Registry provides strong typing, centralized versioning, and compatibility controls.

Connector configuration:

"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081"

With this setup, schemas are registered automatically and consumers can evolve safely over time. Schema Registry is required only when using Avro (or Protobuf/JSON Schema); JSON converters can be used for lighter-weight demos at the cost of schema governance.
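Connector settings like these are submitted to Kafka Connect as a JSON document with a `name` and a `config` object, sent to the `/connectors` REST endpoint. The following sketch shows one way to build and send that request from Python using only the standard library. The helper names are hypothetical, the base URL assumes the demo's Connect endpoint at http://localhost:8083, and the config values are copied from the fragments in this article.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # assumed: the demo's Kafka Connect REST endpoint

SOURCE_CONFIG = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:pervasive://host.docker.internal:1583/DEMODATA",
    "dialect.name": "ZenDatabaseDialect",
    "mode": "timestamp+incrementing",
    "timestamp.column.name": "recorded_at",
    "incrementing.column.name": "id",
    "table.whitelist": "FinanceSource",
    "topic.prefix": "finance.",
    "poll.interval.ms": "2000",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
}


def build_connector_request(name, config, base_url=CONNECT_URL):
    """Build (but do not send) the POST request that creates a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_connector(name, config, base_url=CONNECT_URL):
    """Send the request; requires a running Kafka Connect worker."""
    with urllib.request.urlopen(build_connector_request(name, config, base_url)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    request = build_connector_request("demo-finance-source", SOURCE_CONFIG)
    print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect and test without a running cluster.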

Kafka → Zen With the JDBC Sink Connector (Upsert)

The Sink Connector consumes the Kafka topic and writes into the Finance table.

Upsert configuration:

"topics": "finance.FinanceSource",
"table.name.format": "Finance",
"insert.mode": "upsert",
"pk.mode": "record_value",
"pk.fields": "id",
"auto.create": "false",
"auto.evolve": "true"

Upsert is a strong default because restarts and replays remain idempotent, and late-arriving corrections can update existing keys.
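To see why replays stay idempotent when the sink is keyed on id, consider this small simulation. It uses an in-memory SQLite table as a stand-in for the Zen sink and SQLite's `INSERT OR REPLACE` as a stand-in for the connector's upsert; both are assumptions made so the example runs anywhere, and the table is trimmed to three columns.

```python
import sqlite3


def make_sink():
    """Create an in-memory stand-in for the Finance sink table (SQLite, not Zen)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Finance (id INTEGER PRIMARY KEY, symbol TEXT, price REAL)")
    return conn


def upsert(conn, record):
    """Write a record keyed on id; an existing row with the same id is overwritten."""
    conn.execute(
        "INSERT OR REPLACE INTO Finance (id, symbol, price) VALUES (:id, :symbol, :price)",
        record,
    )


def apply_events(conn, events):
    for event in events:
        upsert(conn, event)


if __name__ == "__main__":
    events = [
        {"id": 1, "symbol": "AAA", "price": 10.0},
        {"id": 2, "symbol": "BBB", "price": 20.0},
    ]
    sink = make_sink()
    apply_events(sink, events)
    apply_events(sink, events)  # simulate a replay after a restart
    print(sink.execute("SELECT COUNT(*) FROM Finance").fetchone()[0])  # 2, not 4

    # A late-arriving correction updates the existing key instead of adding a row.
    upsert(sink, {"id": 2, "symbol": "BBB", "price": 20.5})
    print(sink.execute("SELECT price FROM Finance WHERE id = 2").fetchone()[0])  # 20.5
```

With plain inserts, the replay would either fail on the primary key or, without a key, duplicate every row.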

Deployment and Orchestration

All Kafka components run in Docker: Kafka broker, Schema Registry, Kafka Connect, and Kafka UI (Kafbat / AKHQ-compatible). Actian Zen runs on the host.

A single orchestration script starts the stack, initializes tables, creates connectors, and launches the generator. This “one command demo” model is useful for training, proofs of concept, and repeatable testing.

Endpoints typically used during validation:

Kafbat UI: http://localhost:8080
Kafka Connect REST: http://localhost:8083
Schema Registry: http://localhost:8081

Operational Validation

To validate end-to-end flow:

Confirm the generator prints new ticks every two seconds.
Check connector status via Kafka Connect REST.
Inspect messages in the finance.FinanceSource topic.
Query the Zen sink table Finance.

Status calls:

curl http://localhost:8083/connectors/demo-finance-source/status
curl http://localhost:8083/connectors/demo-finance-sink/status
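The same check can be scripted. The status endpoint returns JSON with a `connector` object and a `tasks` list, each carrying a `state` field. The sketch below treats a connector as healthy only when the connector and every task report RUNNING; requiring at least one task is a design choice of this example, and the function names are hypothetical.

```python
import json
import urllib.request


def is_running(status):
    """True when the connector and all of its tasks report the RUNNING state."""
    connector_ok = status.get("connector", {}).get("state") == "RUNNING"
    tasks = status.get("tasks", [])
    return connector_ok and bool(tasks) and all(t.get("state") == "RUNNING" for t in tasks)


def fetch_status(name, base_url="http://localhost:8083"):
    """Fetch a connector's status; requires a running Kafka Connect worker."""
    with urllib.request.urlopen(f"{base_url}/connectors/{name}/status") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Sample payload in the shape the status endpoint returns (values are illustrative).
    sample = {
        "name": "demo-finance-source",
        "connector": {"state": "RUNNING", "worker_id": "connect:8083"},
        "tasks": [{"id": 0, "state": "FAILED", "worker_id": "connect:8083"}],
    }
    print(is_running(sample))  # False: the task failed even though the connector is up
```

Checking task state matters because a connector can report RUNNING while its only task has failed and no data is moving.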

If something fails, Kafka Connect logs are usually the fastest signal: missing JDBC jars, dialect issues, or authentication problems.

Production Considerations

This demo is intentionally simple, but the architecture scales well. In production, consider:

TLS and authentication for Kafka and Connect.
Topic partitioning for parallelism (e.g., by symbol).
Dead-letter queues for problematic records.
Schema compatibility enforcement in Schema Registry.
Multi-worker Connect clusters for throughput and resilience.
Monitoring (Prometheus/Grafana).

The core pattern—append-only source + incremental polling + Avro + idempotent sink upserts—remains a strong baseline.

Take a Visual Walkthrough

The following screenshots demonstrate the pipeline in action, from data generation through Kafka to the final sink table:

Data Generator Output

The Python generator continuously produces synthetic trade ticks every two seconds, simulating live market data:

Screenshot: Data generator output

Kafbat UI – Topic View

The Kafbat UI provides real-time visibility into Kafka topics, showing messages flowing through the pipeline:

[Screenshots: Kafbat UI; Kafbat UI, FinanceSource topic]

Connector Status

Both source and sink connectors show RUNNING status, confirming the pipeline is operational:

[Screenshot: Kafka Connect connector status]

Message Contents

Individual messages in Kafka contain the full trade tick data in Avro format with schema versioning:

[Screenshot: Kafbat UI, demo cluster]

Sink Table Results

[Screenshot: Finance sink table]

The Finance sink table in Zen receives the streamed data, demonstrating a successful end-to-end flow.

Getting Started

The demo includes a comprehensive orchestration script that automates the entire setup process. Running the demo is as simple as executing a single Python script.

One-Command Demo Launch

The orchestrator handles five key steps automatically:

Start Docker Compose stack (Kafka, Schema Registry, Connect, UI).
Wait for all services to become healthy (45 to 60 seconds).
Initialize FinanceSource and Finance tables in Zen.
Create and configure JDBC source and sink connectors.
Launch the data generator in the background.

Core orchestration logic:

def run(self):
    # Step 1: Start Docker Compose
    self.start_docker_compose()

    # Step 2: Wait for services
    self.wait_for_services()

    # Step 3: Initialize databases
    self.initialize_databases()

    # Step 4: Setup connectors
    self.setup_connectors()

    # Step 5: Start data generator
    self.start_data_generator()

    # Show status and keep running
    self.show_status()

The script provides clear status updates at each step and handles cleanup on interruption (Ctrl+C).
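The body of `wait_for_services()` is not shown in this article. As a rough sketch of what such a step can look like, the following polls a health probe until it succeeds or a deadline passes. The names `http_ok` and `wait_until`, their parameters, and the default timeout are assumptions made for illustration.

```python
import time
from urllib.error import URLError
from urllib.request import urlopen


def http_ok(url, timeout=3):
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as "not ready"
        return False


def wait_until(probe, timeout=90.0, interval=2.0, sleep=time.sleep, clock=time.monotonic):
    """Call probe() every `interval` seconds until it returns True or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until(lambda: http_ok("http://localhost:8083/connectors"))` blocks until Kafka Connect answers or 90 seconds pass. Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.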

Table Initialization

The initialization script drops any existing tables to ensure a clean state, then creates both tables with their schemas. The source table is created as follows:

def create_finance_source(conn):
    exec_sql(conn, "DROP TABLE IF EXISTS FinanceSource")

    create_sql = """
    CREATE TABLE FinanceSource (
        id IDENTITY PRIMARY KEY,
        symbol VARCHAR(16) NOT NULL,
        trade_date DATE NOT NULL,
        trade_time TIME NOT NULL,
        price DECIMAL(18,6) NOT NULL,
        volume INTEGER NOT NULL,
        bid DECIMAL(18,6),
        ask DECIMAL(18,6),
        exchange VARCHAR(16),
        currency VARCHAR(8) DEFAULT 'USD',
        recorded_at TIMESTAMP NOT NULL
    )
    """
    exec_sql(conn, create_sql)

Build and Benefit From a Real-Time Financial Pipeline

This solution demonstrates a practical way to build a real-time financial pipeline with Actian Zen and Kafka Connect:

Zen stores operational ticks and remains the system of record.
Kafka provides a durable, replayable stream.
Kafka Connect moves data reliably with configuration.
Avro and Schema Registry add schema safety.
The sink table provides queryable materialized state.

For organizations modernizing financial data flows, this architecture offers a clear path from batch processing to streaming-first designs without abandoning existing database investments.

Read more in our blog series that focuses on helping embedded app developers get started with Actian Zen.

About Author

About Johnson Varughese

Johnson Varughese manages Support Engineering at Actian, assisting developers leveraging ZEN interfaces (Btrieve, ODBC, JDBC, ADO.NET, etc.). As a Google Certified Data Practitioner and Generative AI Leader, he provides technical guidance and troubleshooting expertise to ensure robust application performance across different programming environments. Johnson's wealth of knowledge in data access interfaces has streamlined numerous development projects. His Actian blog entries detail best practices for integrating Btrieve and other interfaces. Explore his articles to optimize your database-driven applications.

Should You Use RAG or Fine-Tune Your LLM?

By Nick Johnson

#Databases #LLM #RAG #VectorAI DB

Summary

RAG dominates enterprise AI due to flexibility, but fine-tuning excels at scale, latency, and structured outputs.
RAG adds recurring costs from context and retrieval, while fine-tuning shifts cost upfront with stable per-query pricing.
Hybrid approaches combine retrieval with fine-tuning for higher accuracy and better reasoning.
Choosing the right approach depends on data volatility, query volume, and team capabilities.

The debate over retrieval augmented generation (RAG) vs. fine-tuning appears simple at first glance. RAG pulls in external data at inference time. Fine-tuning modifies model weights during training. In production systems, that distinction is insufficient.

According to the Menlo Ventures 2024 State of Generative AI in the Enterprise report, 51 percent of enterprise AI deployments use RAG in production. Only nine percent rely primarily on fine-tuning. Yet research such as the RAFT study from UC Berkeley shows that hybrid systems combining retrieval and fine-tuning outperform either approach alone across benchmarks.

If hybrid systems can produce better results, why does industry adoption favor only RAG? In this article, we’ll compare RAG, fine-tuning, and a hybrid architecture to understand the trade-offs and where each approach excels.

TL;DR

RAG: Best for frequently changing knowledge and moderate traffic; easy to update without retraining.
Fine-tuning: Best for stable domains and high-volume or low-latency tasks; improves task-specific accuracy and formatting.
Hybrid/RAFT: Combines up-to-date retrieval with optimized model behavior for the highest accuracy.
Key trade-off: Choice depends on query volume, how often knowledge changes, and team expertise.

Why the Standard RAG vs. Fine-Tuning Comparison Fails

RAG is a method where the model dynamically pulls in external data at inference time. Each query retrieves relevant documents or knowledge chunks, which the system appends to the prompt, allowing the model to produce answers grounded in current information.
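To make the retrieval step concrete, here is a deliberately small sketch. It ranks documents by word overlap instead of vector similarity, which is a simplification for illustration only; a production system would typically use an embedding model and a vector database. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return up to k documents that share the most words with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Append the retrieved chunks to the prompt ahead of the user question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Every token in `context` is billed as input on each request, which is why retrieval size matters for the cost analysis later in this article.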

Fine-tuning is the process of modifying a model’s weights during training using labeled data. Instead of relying on external retrieval, the model internalizes patterns directly, producing consistent outputs without querying external sources.

While these definitions are technically correct, most standard comparisons miss the factors that actually drive decisions in production. In real-world systems, the choice between RAG and fine-tuning depends on variables like scale, query volume, and how often your data changes.

Missing variable 1: Context expansion at scale

In many production RAG systems, every request appends hundreds of tokens. That added context changes how the model distributes attention across the prompt, even though the weights themselves stay fixed at inference time.

Large retrieved contexts compete for attention with the prompt and instructions, which can dilute signal quality. Small retrieval errors or loosely relevant chunks can introduce formatting drift, or shift reasoning in subtle ways. The system’s output becomes tightly coupled to retrieval quality.

Fine-tuning works differently. Instead of injecting large volumes of text at inference time, it embeds patterns and constraints directly into the model during training. The distinction affects how the system behaves under real workloads.

Missing variable 2: Retraining frequency

The common advice says “use RAG if knowledge changes frequently” and “use fine-tuning if behavior is stable.” But how frequently is “frequently”?

If your knowledge base changes daily, retraining pipelines may introduce operational friction. Evaluation cycles, dataset versioning, and deployment validation all add delay.

Data preparation also matters. If your organization lacks structured, versioned, and clean datasets, the hidden cost of preparing training data can exceed compute costs.

The Cost Math of RAG vs. Fine-Tuning

Surface-level comparisons of RAG and fine-tuning often ignore the cost curves that determine long-term viability. In production systems, cost estimates are central to architectural decisions. To evaluate RAG vs. fine-tuning realistically, we need to examine three cost layers:

Token cost and context expansion.
Retrieval infrastructure cost.
Training infrastructure cost.

The cost structure of RAG

RAG systems introduce a recurring operational cost because each query retrieves external information and injects it into the model’s prompt. That additional context is billed on every request.

Context expansion

Assume a production RAG system appends around 500 tokens of retrieved context to each query. The provider bills those tokens on every request.

Using pricing similar to GPT-5.2 at $1.75 per million input tokens, the incremental cost per query becomes:

Cost per query
500 tokens × $1.75/1,000,000 = $0.000875 per query

At a small scale, this cost appears negligible. However, because it applies to every query, the total overhead grows linearly with traffic.

At different traffic levels:

Monthly queries	Context cost
10 million	$8,750
50 million	$43,750
100 million	$87,500

This is context overhead alone. It does not include output tokens or base prompt tokens. At a sustained scale, what appears flexible and inexpensive becomes a significant recurring expense.
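The table above can be reproduced with a few lines of arithmetic. This sketch uses the same assumptions as the text (500 context tokens per query, $1.75 per million input tokens); the function name `context_cost` is ours.

```python
def context_cost(monthly_queries, context_tokens=500, usd_per_million_input_tokens=1.75):
    """Monthly cost in USD of the retrieved context appended to every query."""
    return monthly_queries * context_tokens * usd_per_million_input_tokens / 1_000_000


for queries in (10_000_000, 50_000_000, 100_000_000):
    print(f"{queries:>11,} queries/month -> ${context_cost(queries):,.0f}")
```

Changing `context_tokens` shows how sensitive the total is to retrieval size: doubling the retrieved context doubles the overhead at every traffic level.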

Vector database and retrieval cost

Token cost is only one component of RAG costs. RAG also relies on a vector database for semantic search. The system must store, index, and query embeddings efficiently.

Public pricing of Pinecone lists:

Storage at approximately $0.33 per gigabyte per month.
Read units at approximately $16 per million.
Write units at approximately $4 per million.

For example, consider a system handling 50 million queries per month, where each query performs a single vector search (assuming a 1,024-dimension vector). That would result in 50 million read operations monthly. If the system also writes approximately six million records per month, the combined read and write activity would bring the total estimated monthly cost to around $1,532.

Figure 1: Pinecone pricing for 50M vectors

At 200 million queries per month, the total expense rises to $9,000 per month.

Two RAG systems serving identical traffic can therefore have materially different cost structures depending on how the vector database is designed and optimized.

Infrastructure cost

RAG systems require storage and compute infrastructure to generate embeddings, store and index vectors, execute retrieval queries, and run inference. Each of these stages consumes compute resources, typically provisioned through cloud servers that must scale with traffic.

For real-time or high-throughput applications, additional capacity is required to maintain low latency and system reliability. Replication, autoscaling, monitoring, and failover mechanisms all add operational complexity. These infrastructure layers are essential for production-grade RAG, but they expand the total cost footprint beyond token usage alone.

The cost structure of fine-tuning

Fine-tuning introduces a different economic model from RAG systems. Instead of paying incremental costs on every request for external context, you invest upfront to modify the model’s internal behavior.

That upfront investment can be broken into four primary cost categories: data, training compute, experimentation, and operational maintenance.

Data preparation costs

High-quality labeled data is the foundation of effective fine-tuning. This includes collecting domain-specific examples, cleaning inconsistencies, formatting inputs and outputs correctly, and validating annotation quality.

In many organizations, data preparation consumes 20 to 40 percent of the total fine-tuning budget. Poorly curated data directly degrades model performance, leading to additional retraining cycles and wasted compute.

Training compute costs

OpenAI lists fine-tuning at roughly $25 per million training tokens for GPT-4.1. A run using 20 million tokens would cost about $500 in direct training fees, with larger datasets or multiple runs increasing this total.

For self-hosted training, costs depend on model size and hardware. High-performance GPUs such as A100 clusters can cost thousands of dollars per training epoch. Because fine-tuning is rarely a single-pass process, multiple epochs, evaluations, and retraining cycles are common, which further increases the overall cost.

Experimentation and validation costs

Fine-tuning is an iterative process that requires experimentation with hyperparameters, evaluation against baseline models, and testing across edge cases. These workflows require engineering time, infrastructure, and structured evaluation frameworks. Unlike prompt engineering, fine-tuning introduces a full ML lifecycle, adding ongoing operational overhead.

This creates a non-linear cost curve. Fine-tuning concentrates cost at the beginning, while marginal cost per request remains relatively stable as traffic grows.

Figure 2: Non-linear cost curve

Whether that trade-off is advantageous depends on three variables: query volume, knowledge stability, and retraining frequency. Without modeling those explicitly, cost comparisons between RAG and fine-tuning remain incomplete.
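One way to model the trade-off explicitly is a break-even calculation. The sketch below is a simplification: it compares only RAG's context-token overhead against a single amortized monthly fine-tuning figure, and it ignores vector database costs and any per-token price difference for fine-tuned models. The function names and the example budget are hypothetical.

```python
def rag_overhead_per_query(context_tokens=500, usd_per_million_input_tokens=1.75):
    """Extra input-token cost that RAG adds to each query (context only)."""
    return context_tokens * usd_per_million_input_tokens / 1_000_000


def break_even_queries(monthly_fine_tuning_cost, **rag_pricing):
    """Monthly query volume at which RAG context overhead equals the
    amortized monthly cost of fine-tuning (training, data prep, evaluation)."""
    return monthly_fine_tuning_cost / rag_overhead_per_query(**rag_pricing)
```

For instance, if retraining, data preparation, and evaluation amortize to an assumed $20,000 per month, `break_even_queries(20_000)` comes out at roughly 22.9 million queries per month. More frequent retraining raises the monthly figure and pushes the break-even point higher.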

When RAG Wins

Despite its scaling trade-offs, RAG remains the dominant production choice for a reason. In certain operating conditions, it is structurally more flexible, faster to iterate, and operationally safer than fine-tuning. RAG is suitable in the following scenarios:

When knowledge changes frequently

If your domain knowledge changes weekly or daily, fine-tuning becomes operationally expensive. Dataset updates, retraining, evaluation, and deployment introduce delays that can stretch from hours to weeks, depending on governance requirements.

Teams frequently underestimate the operational overhead of keeping a fine-tuned model synchronized with a rapidly evolving knowledge base. In these environments, RAG shifts the problem from model retraining to data indexing.

When you have extensive unstructured data but limited labeled data

Many organizations possess terabytes of internal documents but lack high-quality supervised datasets. Building labeled training corpora requires annotation workflows, domain experts, and quality validation pipelines. In practice, this often becomes the most expensive part of fine-tuning projects.

RAG bypasses this constraint by allowing models to operate directly on existing document corpora without constructing large labeled datasets.

When governance and data residency requirements are strict

Once sensitive information is embedded in model weights, deletion and auditing become difficult. Removing a specific record from a fine-tuned model often requires retraining or maintaining complex dataset lineage.

RAG architectures avoid this issue by keeping sensitive information in external storage systems where standard governance controls already exist.

When query volume is moderate

As shown in the earlier cost analysis, context expansion overhead grows with query volume, reaching approximately $43,750 per month at 50 million queries. At moderate traffic, RAG’s per-request costs are typically lower than the amortized expenses of fine-tuning, including training and ongoing maintenance. This makes RAG an attractive choice for organizations that want high-quality outputs without front-loading infrastructure and compute investments.

Use cases

Large-scale examples illustrate RAG’s effectiveness at this volume. Notion’s Q&A assistant is effectively a large-scale RAG system over workspace data. The difficult engineering problem was not retrieval itself, but enforcing identity and access controls during retrieval. When a user queries the assistant, the system must ensure the model only retrieves documents that the user is permitted to see.

LinkedIn leveraged RAG and knowledge graphs to preserve the structure of their support cases. This system retrieved relevant subgraphs rather than isolated text chunks, improving retrieval accuracy by 77.6% and reducing median issue resolution time by 28.6%.

For systems at this scale, RAG combines cost efficiency with flexibility, allowing teams to update knowledge sources rapidly without retraining models, while still delivering high-quality results.

When Fine-Tuning Wins

Fine-tuning becomes structurally advantageous under different conditions. These conditions typically involve scale, stability, and behavioral precision.

When query volume exceeds 100 million per month

At very high traffic levels (100M+ queries per month), RAG’s per-request context overhead becomes significant. Each query adds hundreds of retrieved tokens that the model processes, causing costs to scale linearly with traffic. Large context windows can also increase latency, reduce throughput, and complicate infrastructure reliability.

If domain knowledge is relatively stable, fine-tuning can become more efficient. By embedding knowledge directly into the model, organizations avoid repeated retrieval and token costs, leading to more predictable per-query expenses, better consistency, and simpler operations at scale.

When output structure is critical

Fine-tuned models often excel in tasks that require strict adherence to structure or formal constraints. For example, Cosine, an AI software engineering assistant that autonomously resolves bugs and builds features, achieved a SOTA score of 43.8% on the SWE-bench Verified benchmark using a fine-tuned model.

Figure 3: SWE-bench leaderboard

Similarly, Distyl secured the top position on the BIRD-SQL benchmark, widely regarded as the premier evaluation for text-to-SQL performance. Its fine-tuned GPT-4o model reached an execution accuracy of 71.83% on the leaderboard.

Figure 4: Execution accuracy leaderboard

In applications where errors propagate downstream, into financial calculations, automated APIs, or compliance documents, behavioral consistency is mandatory. In these contexts, fine-tuning provides the reliability needed to minimize risk and maintain trust in automated outputs.

When latency requirements are strict

RAG adds multiple steps to the inference pipeline that increase response time. Each query must go through embedding generation, vector search, and context injection before reaching the model.

Fine-tuned models skip retrieval entirely. All necessary knowledge and reasoning patterns are internalized, allowing the model to generate outputs immediately. In applications where sub-100ms responses are required, such as live recommendation engines or high-frequency trading systems, removing the retrieval pipeline eliminates a major bottleneck.

When deep domain reasoning matters more than freshness

A domain-specific agriculture benchmark study found that fine-tuning improved model accuracy from 75% to 81%, while hybrid systems (fine-tuning + retrieval) reached 86%. Because the dataset focused on specialized agricultural knowledge and reasoning tasks, the improvement primarily reflects stronger domain reasoning, not simply better access to external information.

In domains such as legal analysis or medical decision support, reasoning patterns can be complex. Fine-tuning enables models to internalize domain expertise rather than rely solely on retrieved context.

The Hybrid Approach

While RAG and fine-tuning each have clear advantages, research shows that combining them effectively can produce superior results, but only when done correctly. The RAFT (Retrieval Augmented Fine-Tuning) approach, developed by UC Berkeley, Microsoft, and Meta Research, demonstrates how to do this in practice.

RAFT trains a model to operate in an “open-book” setting. It learns to process retrieved context, identify relevant passages, ignore distractors, and cite evidence accurately. Without this explicit training, simply layering RAG on top of a fine-tuned model often fails. For instance, a model fine-tuned on medical reasoning may retrieve irrelevant journal articles if it hasn’t learned to filter and prioritize context, resulting in hallucinations or incorrect recommendations.

RAFT addresses this with a structured 80/20 training split. 80% of training examples include oracle documents that the model should use, and 20% do not, forcing the model to learn when to trust retrieved data and when to rely on internalized knowledge. This operational detail is crucial for engineers evaluating whether their team can implement a hybrid approach successfully. It is not enough to just combine RAG and fine-tuning. The model must be trained to reason over the retrieved context.
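The split described above can be sketched as a small dataset builder. This is an illustration of the idea, not the RAFT authors' code: the function name, the dictionary keys, and the default of three distractors are assumptions, and the distractor pool must contain more than `num_distractors` documents other than the oracle.

```python
import random


def build_raft_examples(qa_items, distractor_pool, p_oracle=0.8, num_distractors=3, seed=0):
    """Build training examples where only a fraction p_oracle include the oracle document.

    qa_items: list of dicts with "question", "answer", and "oracle" (the document
    that contains the answer). Every example ends up with num_distractors + 1 documents.
    """
    rng = random.Random(seed)
    examples = []
    for item in qa_items:
        candidates = [d for d in distractor_pool if d != item["oracle"]]
        include_oracle = rng.random() < p_oracle
        # Without the oracle, one extra distractor keeps the context size constant
        docs = rng.sample(candidates, num_distractors + (0 if include_oracle else 1))
        if include_oracle:
            docs.append(item["oracle"])
        rng.shuffle(docs)  # avoid teaching the model that position signals relevance
        examples.append({"question": item["question"], "context": docs, "answer": item["answer"]})
    return examples
```

Keeping the context size constant across both kinds of examples means the model cannot use document count as a shortcut for deciding whether the answer is present.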

A common and practical pattern is “fine-tune for format, RAG for knowledge.” Fine-tuning shapes the model’s internal behavior, enforcing domain-specific reasoning, output structure, and style. RAG provides dynamic access to external information that changes frequently or is too large to store in the model weights. In healthcare, for example, fine-tuning ensures the model understands medical terminology, follows proper diagnostic reasoning, and formats outputs according to clinical documentation standards. RAG supplements this by retrieving the latest research, newly published treatment guidelines, or patient-specific records, keeping recommendations current without retraining the entire model.

Similarly, Harvey AI fine-tuned on 10 billion case law tokens, but still leverages RAG to handle current cases and updates. This pattern is widely used in other domains too. Legal systems fine-tune for statutory reasoning and citation style, then layer RAG to retrieve the most current case law; finance models fine-tune for portfolio analysis rules, then layer RAG for market updates and regulatory changes. It’s a way to balance the stability of learned behavior with the adaptability of retrieval.

A Quantified Decision Framework for RAG vs. Fine-Tuning

The question is no longer “Which approach is better?” It is “Under what conditions does each approach make economic and operational sense?”

Instead of defaulting to architectural preference, evaluate three measurable variables:

Knowledge change frequency.
Monthly query volume.
Infrastructure capability and governance constraints.

When those variables are quantified, the decision becomes far clearer.

Step 1: Measure knowledge volatility

Knowledge change frequency is often the fastest way to eliminate one option. If your domain knowledge changes weekly or daily, RAG is structurally favored. Updating an index is far simpler than retraining a fine-tuned model. The separation between model weights and external data enables real-time data retrieval without redeployment cycles.

If knowledge remains stable for months at a time, fine-tuning becomes economically viable. Retraining frequency drops, and training cost can be amortized over longer intervals. In these environments, embedding domain-specific knowledge directly into model parameters may reduce long-term inference overhead.

As a practical threshold:

Knowledge changes more often than monthly → prioritize RAG.
Knowledge stable for multiple months → evaluate fine-tuning.

Step 2: Calculate context expansion cost

The next variable is query volume. Large-scale RAG systems append hundreds of tokens to every query, and this context overhead scales linearly with traffic.

Quantitative triggers

Monthly queries	Guidance
<10M	RAG is cheaper
10–50M	Evaluate fine-tuning vs. RAG
50–100M	Fine-tuning or hybrid
>100M	Fine-tuning or hybrid

Step 3: Assess infrastructure maturity

Even if economics favor one approach, infrastructure capability may dictate feasibility.

RAG requires:

Strong data engineering.
Reliable data pipelines.
Efficient vector database architecture.
Observability and monitoring.

Fine-tuning requires:

High-quality labeled data.
Machine learning expertise.
Compute resource allocation.
Evaluation discipline.

When teams ignore their actual capabilities, architecture decisions collapse under scale. Many production failures blamed on “model quality” are really symptoms of immature infrastructure.

Decision matrix

The following matrix translates the analysis into practical guidance.

Scenario	Monthly queries	Knowledge update frequency	Recommendation	Rationale
Domain knowledge updates weekly, moderate traffic	10–50M	Weekly/Daily	RAG	Immediate indexing and low recurring cost
High-scale traffic, knowledge stable	50–100M+	<1 update/month	Fine-tuning	Avoids recurring context injection, reduces latency
Structured output or code generation required	Any	Any	Fine-tuning	Embeds domain-specific rules and formatting internally
Specialized reasoning + frequent updates	10–50M	Weekly/Daily	Hybrid	Combines internalized reasoning with dynamic knowledge
Multi-domain systems with diverse knowledge update cycles	10–100M	Mixed	Hybrid	Fine-tuning stabilizes core domains, RAG handles rapidly changing sources

Using this matrix, it becomes easier to decide whether to use RAG, fine-tune your LLM, or take the hybrid approach.
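The thresholds and matrix can also be written down as a small function, which makes the rules easy to review and adjust. This is a sketch of one possible encoding, not a definitive policy. The function name and parameters are hypothetical, and the handling of volatile knowledge at high volume (returning "hybrid") is our assumption, since the matrix does not cover that combination explicitly.

```python
def recommend(monthly_queries, knowledge_updates_per_month,
              strict_output_structure=False, specialized_reasoning=False):
    """Map the framework's variables to a starting recommendation."""
    volatile = knowledge_updates_per_month > 1  # changes more often than monthly
    high_volume = monthly_queries >= 50_000_000

    if strict_output_structure:
        return "fine-tuning"
    if volatile:
        # Assumption: volatile knowledge plus specialized reasoning or high volume -> hybrid
        return "hybrid" if (specialized_reasoning or high_volume) else "RAG"
    if high_volume:
        return "fine-tuning"
    if monthly_queries < 10_000_000:
        return "RAG"
    return "evaluate fine-tuning vs. RAG"
```

For example, `recommend(20_000_000, 4)` returns "RAG", while `recommend(80_000_000, 0.5)` returns "fine-tuning". Infrastructure maturity from Step 3 is left out on purpose, because it is a judgment about the team and not a number.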

Final Thoughts

The debate between RAG and fine-tuning is often framed as a binary choice, but the more useful question is “If hybrid systems demonstrably outperform either approach alone, why does industry adoption still overwhelmingly favor RAG?”

Hybrid requires both ML and data engineering capabilities simultaneously, a combination few organizations have. RAG remains the practical default, offering agility and transparency with less upfront complexity.

The key takeaway is to choose the architecture that matches your knowledge volatility, query scale, and team capability. For teams exploring enterprise-scale retrieval systems, platforms like Actian VectorAI DB provide purpose-built vector database capabilities designed for performance and scalability.

Join the Discord community and learn how Actian fits into your AI strategy.

About Author

About Nick Johnson

Your Data Strategy and AI Strategy are Now the Same Thing

By Dee Radh

#AI #Data Governance #Data Management #Data Quality #ML

Summary

AI performance depends on trusted, reliable data, making data strategy and AI strategy inseparable.
Poor data quality, weak governance, and missing lineage can undermine enterprise AI outcomes.
AI-ready data requires discovery, observability, governance by design, and clear operational context.
Organizations that unify data and AI foundations can move from AI experiments to reliable production systems.

If you’re treating your data strategy and your AI strategy as two separate initiatives, you’re overlooking a critical reality: AI performance depends on the quality and reliability of the data behind it. Models may get the headlines, but data determines the outcome.

Leading organizations are no longer approaching AI as a standalone technology project. They’re unifying their data and AI strategies into a single foundation for reliable data and trusted AI outcomes.

AI systems don’t operate in isolation. They rely on the quality, structure, and context of the data they consume. As you implement AI agents, copilots, and agentic AI systems, the gap between data strategy and AI strategy effectively disappears.

AI is Only as Reliable as the Data Supporting It

Many organizations have already discovered that building AI applications is easier than making them trustworthy, especially at enterprise scale. Sure, large language models and machine learning frameworks are widely available, but deploying AI into real business workflows requires something far more difficult: reliable, governed, readily accessible, and contextual data.

Research from Gartner underscores the challenge. Gartner predicts that through 2026, organizations will abandon 60% of AI projects that are not supported by AI-ready data. In other words, the problem isn’t the models. It’s the data.

Rushing to connect AI systems to fragmented data environments creates familiar problems:

Inconsistent business definitions across departments.
Missing lineage that makes data origins unclear.
Poor visibility into data quality issues.
Static data catalogs that lack operational context.
Unclear ownership and governance responsibilities.

Unless these data issues are solved, AI systems are at risk of producing inaccurate outputs, unreliable predictions, or decisions that business leaders simply cannot trust.

4 Data Management Capabilities Required for AI

Traditional data strategies are built for analytics and reporting. Data warehouses, dashboards, and BI tools allow you to analyze historical information and generate insights.

AI introduces a new set of requirements. Instead of only analyzing data, AI systems actively consume, reason over, and act on data in real time. That means you must ensure your data is not only accessible, but also trustworthy and explainable.

This requires a more comprehensive approach to data management that includes these four capabilities:

Data intelligence and discovery. Your teams must understand what data exists across the enterprise, how it relates to other assets, and which datasets are appropriate for AI use. This data must also be easily discoverable and accessible.
Data quality and observability. You need continuous monitoring of data pipelines and assets to detect issues such as schema drift, freshness gaps, or missing values before they affect downstream systems. Observability must do more than send alerts. It should proactively identify and mitigate issues.
Governance by design. Policies that address data access, ownership, and compliance must be embedded directly into the data ecosystem. This helps ensure AI systems operate within trusted boundaries.
Operational context. AI systems require real-time awareness of data reliability, lineage, and dependencies to produce accurate outcomes. They also require data context, including clear business definitions and usage policies, so AI agents and models can interpret data correctly.

These capabilities transform data from a static resource into an operational asset that AI systems can safely use.
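As a small illustration of the quality and observability checks named above (schema drift, freshness gaps, missing values), the sketch below validates a batch of records before it reaches downstream systems. It is a simplified stand-in for a real observability tool; the function name, parameters, and message wording are hypothetical, and timestamps are assumed to be timezone-aware `datetime` values.

```python
from datetime import datetime, timedelta, timezone


def check_batch(rows, expected_columns, timestamp_column, max_age, now=None):
    """Return a list of human-readable issues found in a batch of dict rows."""
    now = now or datetime.now(timezone.utc)
    if not rows:
        return ["empty batch"]
    issues = []
    expected = set(expected_columns)
    for i, row in enumerate(rows):
        if set(row) != expected:
            # Report columns that were added or dropped relative to the expected schema
            issues.append(f"row {i}: schema drift, columns {sorted(set(row) ^ expected)}")
        missing = sorted(k for k, v in row.items() if v is None)
        if missing:
            issues.append(f"row {i}: missing values in {missing}")
    timestamps = [r[timestamp_column] for r in rows if r.get(timestamp_column) is not None]
    if timestamps and now - max(timestamps) > max_age:
        issues.append("freshness gap: newest record is older than allowed")
    return issues
```

An empty result means the batch passed all three checks. A pipeline could refuse to publish a batch, or flag it for review, whenever the list is not empty.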

The Rise of Data Reliability as an AI Requirement

A major shift with AI is the growing importance of data reliability. Oftentimes, data problems remain hidden until they impact dashboards, automation, or business decisions.

When an issue surfaces, teams often spend hours investigating what changed, which pipelines were affected, and how widespread the impact might be. This reactive model is incompatible with AI systems that operate continuously and automatically. If your AI relies on poor quality datasets, risk multiplies quickly.

That’s why modern data strategies increasingly include data observability and automated monitoring. These capabilities allow your teams to identify anomalies early, understand dependencies across data assets, and resolve issues before they cascade downstream to analytics, apps, or AI systems.

Trustworthy AI requires reliable data, and reliability must be continuously measured.

AI is Encouraging Data Teams and Business Teams to Align

AI is changing the conversation about who owns your organization’s data. What was once primarily a technical concern for IT is now a strategic priority for business leaders. Because AI systems influence decisions, automation, and customer interactions, the quality and trustworthiness of data have become business-critical issues.

If an AI system produces unreliable insights or incorrect recommendations based on faulty data, the impact quickly reaches leadership, operations, and customers. This means data governance, quality, and ownership can no longer be treated as purely technical concerns.

Organizations at the forefront of AI adoption typically focus on creating a shared understanding of data across teams and departments. Business users, analysts, engineers, and data product managers all need visibility into the same data context: how trustworthy the data is, how it is used, and what risks may exist.

When everyone works from the same trusted data foundation, AI systems become far more effective.

Moving From AI Experiments to AI Operations

Many organizations are still in the experimental phase of AI adoption. Pilot projects and prototypes demonstrate what’s possible, but scaling them into production requires operational discipline.

That discipline comes from the data layer. Enterprises that successfully operationalize AI focus on three key pillars:

Discover the right data across the enterprise.
Trust that the data is accurate, governed, and reliable.
Activate the data safely within analytics, applications, and AI and agentic workflows.

When these elements work together, AI moves from isolated experimentation to reliable enterprise capability.

Organization leaders often ask how they should build an AI strategy. The answer starts with data. AI models will continue to evolve and improve, but no algorithm or model can compensate for fragmented, poorly governed, or unreliable data.

To succeed with AI, you must recognize a simple but critical shift: your data strategy is no longer separate from your AI strategy. They are now the same thing.

Take a tour of the Actian Data Intelligence Platform to see how to make data discovery, trust, and activation a reality for your AI.

About Author

About Dee Radh

As Senior Director of Product Marketing, Dee Radh heads product marketing for Actian. Prior to that, she held senior PMM roles at Talend and Formstack. Dee has spent 100% of her career bringing technology products to market. Her expertise lies in developing strategic narratives and differentiated positioning for GTM effectiveness. In addition to a post-graduate diploma from the University of Toronto, Dee has obtained certifications from Pragmatic Institute, Product Marketing Alliance, and Reforge. Dee is based out of Toronto, Canada.

4 Guiding Principles That Lead to Success

By Liz Brown

#Actionable Insights #Company Culture #Data Insights

Summary

Clear guiding principles help field marketing teams balance speed, execution, and customer impact.
Customer obsession, ownership, bias for action, and trust drive stronger alignment across marketing and sales.
Principles create shared expectations that shape decisions, collaboration, and accountability.
When consistently applied, guiding principles become a competitive advantage for modern marketing teams.

Early in my career, I worked at companies where mission statements were displayed throughout the building, but rarely lived in practice. They were polished, aspirational, and mostly ignored.

Then I joined Amazon. At the time I was there, we had 12 leadership principles to guide our actions and decisions. The list has grown since then, but what struck me wasn’t the number. It was the fact that people actually embraced them. The principles played a significant role in meetings, feedback, performance reviews, and even in everyday conversations.

If someone committed to a task and didn’t follow through, you might hear, “Where’s your bias for action?” Or “Where’s your ownership?”

These weren’t slogans. They were operating standards. Even though I haven’t worked there since 2022, those guiding principles continue to shape how I lead field marketing at Actian and even how I approach my personal life.

Why Guiding Principles Matter in Modern Field Marketing

Field marketing operates across brand, demand generation, sales, and customer experiences. We sit in a space where speed, execution, and trust all matter.

Without clear principles, it’s easy to default to:

That’s not my swim lane.
It’s good enough for now.
Sales will handle it.

Guiding principles create shared expectations around how we operate, not just what we deliver. For me, four principles continue to guide my work:

1. Customer Obsession: Marketing Starts and Ends With the Customer

In field marketing, it’s tempting to focus on attendance at the events we’re sponsoring, booth traffic, marketing qualified lead volume, or campaign metrics. The truth is, none of this matters if it doesn’t serve the customer.

Customer obsession means asking:

Does this event create value for attendees?
Does our message resonate with the real challenges they’re facing?
Are we helping sales build meaningful conversations?

At Actian, when we show up at events like the Gartner Data and Analytics Summit, I’m constantly thinking about how we make that experience valuable, not just visible. Field marketing is not about presence. It’s about impact.

Customer obsession keeps us focused.

2. Ownership: It’s All of Our Business

Ownership is one of the principles I lean into the most. Ownership means you don’t say, “That’s not my job.”

If demand gen isn’t performing, that’s not “their problem.” It’s all of our problem. Messaging that’s not resonating is never product marketing’s issue alone. It’s a shared responsibility across the organization.

Last year, I worked closely with sales leadership, some of whom were skeptical about marketing’s value. One leader told me directly that he didn’t think marketing did a good job. My response? Challenge accepted.

Ownership means stepping in, listening, improving processes, and delivering results until trust is built and maintained. Over time, my relationship with that skeptical sales leader shifted. He came to see how marketing contributes measurable value, and public recognition followed. More importantly, alignment between sales and marketing improved.

Ownership ultimately builds credibility.

3. Bias for Action: Speed Wins in Technology Marketing

Bias for action is one of my favorite guiding principles. It means speed matters. Decisions are often reversible, and perfection is rarely required before you take action.

In field marketing, especially in the fast-moving AI, data intelligence, and analytics space, waiting too long is a risk. Markets move. Messaging shifts. Competitors act. If you wait for the perfect time with the perfect campaign, you’re already behind.

Bias for action means:

Ship the campaign.
Test the message.
Launch the event strategy.
Iterate based on data.

At Actian, we talk internally about the idea that not everything needs to be perfect before acting. Launch your strategy. Measure it. Adjust and improve as needed.

Bias for action also requires being comfortable with failure, which is a mindset that can take time to accept. This isn’t reckless failure, but a strategic approach.

For example, we tried a direct mail campaign that didn’t work. We tested programs that underperformed, yet every experiment taught us something.

Bias for action means fail fast, learn faster, and move forward.

4. Earn Trust: The Currency of Field Marketing

If I had to pick one principle that transcends an entire organization, it’s earning trust.

Trust is built through:

Delivering what you commit to.
Following up as needed.
Responding quickly to issues and opportunities.
Bringing thoughtful insights.
Owning results, whether they’re good or bad.

I never want to be known as the person who sits on requests or leaves emails unanswered. Even if I don’t have an immediate answer, acknowledgment matters.

Trust is also built through consistency. Over time, when sales teams know that marketing will execute, respond, and deliver, collaboration becomes organic.

When trust exists internally, it shows to customers externally.

Know Your Strengths and Growth Areas

“Think big” is an area that, if I’m honest, I can improve upon. I like diving in and getting things done. Execution energizes me. My mind is always running, often before 7 A.M., thinking about what needs to be started, reviewed, and closed out.

Carving out time to step back and think 30,000 feet above the work doesn’t come naturally to me, but growth comes from recognizing that. The most productive week I had recently wasn’t one where I cleared the most tickets. It was when our marketing team stepped back strategically to analyze what was working, what wasn’t, and where we could improve.

Guiding principles aren’t just about reinforcing strengths. They’re about identifying where you need to stretch, flex, and grow.

What Will You Be Known For?

At one point in my career, my manager asked, “What will you be known for this year?” That question stuck with me.

Each year, I think about:

What impact did I make?
Where did I elevate performance?
Who did I win over?
What results did I deliver?

Field marketing is visible work. Events, campaigns, and customer engagement are all measurable. But reputation is cumulative. You build it through action, ownership, and trust.

Guiding Principles as a Competitive Advantage

Not every organization has leadership principles posted on walls. That’s okay. The more important question is, “What principles are guiding you professionally and personally?”

For me, these principles aren’t corporate slogans. They’re habits. They influence how I run events, collaborate across teams, manage programs, and even how I show up in my personal life.

This mindset doesn’t happen by accident. It’s built through clear standards and repeated behavior. In a fast-moving industry like data and AI, execution without principles leads to chaos. Principles without execution lead to stagnation. By contrast, when guiding principles are ingrained, they become an enabler for getting things done.


About Author

About Liz Brown

Liz Brown is a high-energy, results-driven marketing professional with a proven track record of driving business growth and inspiring, mentoring, and enabling colleagues and peers. Known for her strategic thinking and collaborative leadership, Liz excels at building impactful marketing strategies, ABM programs, and enablement initiatives tailored to top accounts and industries. She has extensive experience in brand positioning, integrated campaigns, and customer engagement, from large-scale events to targeted digital initiatives.

5 Edge AI Architecture Patterns for Disconnected Environments

By Nick Johnson

#AI #IoT

Summary

Disconnected environments require edge AI architectures that operate fully offline without cloud dependency.
Five deployment patterns enable resilient edge AI: drone, factory, federated learning, store-and-forward, and mesh network.
Edge-native designs support real-time inference, low latency, and reliable operations in remote or intermittent networks.
Choosing the right architecture depends on connectivity stability, latency requirements, and hardware constraints.

A haul truck operating 200 miles from the nearest cellular tower does not pause when connectivity drops. An offshore wind turbine does not suspend fault detection because a satellite link fails in a storm. In these environments, inference, control loops, and safety systems must continue operating regardless of network status. Yet the dominant edge AI architecture still revolves around connectivity and cloud AI.

Disconnected environments demand edge-native, offline-first architectures designed for operational autonomy. Market signals reinforce this reality.

ABI Research projects edge server spending to reach $19B by 2027, with on-premises deployments accounting for nearly $10.5B. In 2025, organizations deployed approximately 815 million edge-enabled IoT devices globally.

Most operational environments are inherently distributed, generating data far from centralized cloud systems. Edge deployment strategies that depend on sending that data back and forth for processing cause IoT systems to miss critical insights, increase latency, and introduce data loss. Yet proposed edge architectures still treat offline readiness as an add-on rather than the default.

We present five edge AI deployment patterns that operate without assumed connectivity, covering their implementation tactics, real-world scenarios, trade-offs, and a decision framework for selecting the right pattern for your operational priorities.

TL;DR

Suitable use cases for each documented deployment pattern at a glance.

Pattern	Best for
The drone (self-contained single-node edge AI)	Autonomous mobile systems with strict energy budgets and zero cloud connection
The factory (multi-node edge AI with optional cloud)	Facilities with local infrastructure in intermittent environments
Hierarchical federated learning (client-edge-cloud)	Privacy-sensitive distributed operations where data leakage risks are unacceptable
Store-and-forward disconnected inference	Operations with scheduled connectivity windows
The network (distributed edge-to-edge fabric)	Distributed coordination without cloud dependency

Why Disconnected Environments are an Edge AI Problem

There is a structural blind spot for disconnected environments, driven by the assumption that industries using edge AI models are cloud-centric and operate under persistent connectivity. Where edge AI applications matter most, constant network access does not exist.

What disconnected actually means

Disconnected environments are settings with unreliable or nonexistent connectivity, ranging from airgapped scenarios with complete network isolation to intermittent setups with frequent connectivity degradation.

Figure: Connectivity spectrum

In these operational settings, edge AI capabilities truly shine because they support the real-time data processing, low latency, bandwidth optimization, and data governance that disconnected environments require.

Precedence Research estimates the global edge AI market will reach $143B by 2034, a potential 472% increase from $25B in 2025. For a significant portion of this market, constant cloud connectivity is not feasible. Yet inference, local data storage, and real-time decision-making must continue regardless of network status or location.

Disconnection is where edge AI earns its value

Disconnected environments such as mining sites, manufacturing plants, military operations, offshore wind farms, and smart cities expose the limitations of current edge AI deployment solutions.

Rio Tinto operates on mining sites up to 930 miles from cellular coverage, where operators cannot rely on a centralized infrastructure. They need autonomous inspection robots that use edge AI to track personnel and vehicles, interpreting data from 3D LiDAR, thermal imaging, and gas sensors in real-time.

At least 300 autonomous haul trucks operate in Rio Tinto’s Pilbara region. Each truck processes roughly 5TB of data daily through subterranean tunnels with limited connectivity, requiring private LTE networks for on-device IoT processing.

Offshore wind farms face a similar constraint. Turbines and inspection vessels go offline when satellite connections fail due to harsh weather or line-of-sight blockage, and each turbine averages approximately 8.3 failures per year. These farms need edge AI systems that detect issues early, monitor real-time maritime traffic, analyze local SCADA data, and trigger inspections based on immediate wind conditions.

In remote manufacturing environments, plant managers also need edge AI to automate quality inspections, predict machine failures, and protect workforce health.

A similar demand for local, secure processing drives military operations, where systems operate within airgapped networks in denied, disrupted, intermittent, and limited (DDIL) environments to maintain data confidentiality and integrity. Soldiers must communicate with command units and analyze real-time warfare data without relying on cloud data centers or large computing resources.

These are the environments where edge AI deployment delivers the most impact. According to Dell, enterprise data processing will shift to distributed data centers in 2026, but most documented architectures still emphasize transmitting data back to cloud data centers.

Constrained hardware shapes model deployment

The demands of AI compute and workload scaling at the edge also fuel the cloud-edge deployment recommendations.

A deep learning model with 3B parameters can require up to 4GB of RAM, but edge devices like microcontrollers and IoT sensors typically have less than 1GB for OS, workloads, and storage combined. Connected environment architectures assume large compute availability that doesn’t exist at the edge.

Edge AI architectures must start with offline-first assumptions and hardware ceilings from day one. Retrofitting offline capability into cloud systems will not compensate for connectivity gaps and limited hardware resources. Below, we detail five architectural patterns tailored for disconnected environments.

Pattern 1: The Drone (Self-Contained Single-Node Edge AI)

In environments where connectivity is unavailable, and operational latency cannot tolerate network round-trips, the deployment boundary collapses to a single device. Inference cannot be delegated, synchronized, or deferred. Edge devices like drones, underwater vehicles, and remote inspection robots must make decisions using only locally available compute, memory, and sensor input.

This constraint defines the drone architecture. All AI logic runs on a single device, without external orchestration or cloud offloading.

When the device is the entire stack

Mobile systems that must function autonomously in disconnected environments benefit most from this pattern.

With no external orchestration layer, data capturing, preprocessing, inference, storage, and control logic operate within a self-contained package. This package runs on a single node without networking with other nodes or distributing model training.

Figure: Single-node drone architecture

Onboard decision logic means edge devices can execute predefined operations even when disconnected. Once a device captures data, it filters out redundant information, retaining only relevant data for eventual manual retrieval.

Autonomous drones that perform object detection and terrain classification in mining zones cannot pause execution while awaiting external inference. The drone architecture removes network dependency by focusing on on-device inference.

This makes it the most viable pattern for DDIL environments where connectivity is actively denied or degraded. Defense drones cannot assume that the network will recover or that a command signal will arrive at all. Every battlefield decision must be executable from the device alone.

GE Aerospace, which runs 45,000+ commercial aircraft engines and captures over 480,000 data snapshots daily per aircraft, implements this architecture at scale. Onboard AI models handle predictive maintenance in strict accordance with DO-178C, which requires GE Aerospace to verify every airborne system against all possible failure conditions before it ever leaves the ground. This quality assurance aligns with the drone’s architectural requirement of no external support after model deployment.

Single-node local processing requires machine learning models with small footprints.

Optimizing intelligence for the edge

Edge devices operate within strict memory and power ceilings measured in megabytes and milliwatts. When full-precision networks exceed available RAM or energy budgets, model capacity must be optimized before inference becomes feasible.

Not every edge workload needs a neural network. In constrained environments like offshore wind farms, classical statistical methods such as Welford’s algorithm and linear regression often outperform neural networks on streaming data processing.

A microcontroller computing sensor data with Welford’s algorithm updates statistics sequentially, without retaining past data points, which keeps memory and power consumption low. Before pushing a neural network to its hardware limit, consider whether the model class itself is suitable for the use case.
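To make the sequential update concrete, here is a minimal sketch of Welford’s algorithm in Python. The class name `RunningStats`, the three-standard-deviation anomaly rule, and the sample readings are illustrative assumptions, not part of any specific deployment.

```python
import math


class RunningStats:
    """Streaming mean/variance via Welford's algorithm (constant memory)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new reading into the statistics; the reading is not stored
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance; reported as 0.0 until two readings have arrived
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0

    def is_anomaly(self, x, threshold=3.0):
        # Hypothetical rule: flag readings more than `threshold` std devs away
        std = math.sqrt(self.variance())
        return std > 0 and abs(x - self.mean) > threshold * std


stats = RunningStats()
for reading in [20.1, 19.8, 20.3, 20.0, 19.9]:  # illustrative sensor values
    stats.update(reading)
```

Only three numbers persist between readings (`count`, `mean`, `m2`), which is why the approach suits devices with very little RAM.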

When neural networks are the right fit for the workload, quantization addresses their hardware limitations by reducing the numerical precision of their weights, biases, and activations. Downsizing from 32-bit floats to 8-bit integers shrinks model size by approximately 75%, typically with less than 1% accuracy loss.
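The arithmetic behind that size reduction can be sketched in a few lines. The example below assumes a simple affine (scale plus zero-point) scheme applied to a handful of made-up weights; production converters calibrate per tensor or per channel and handle many more details.

```python
def quantize_int8(values):
    """Map floats to int8 using an affine (scale + zero-point) scheme."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep 0.0 exactly representable
    scale = (hi - lo) / 255 or 1.0       # fall back to 1.0 if all values are zero
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    # Recover approximate float values from the stored integers
    return [(v - zero_point) * scale for v in q]


weights = [-0.62, -0.10, 0.0, 0.25, 0.91]  # illustrative float32 weights
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)

# Storage: 4 bytes per float32 weight vs. 1 byte per int8 weight
size_reduction = 1 - (len(weights) * 1) / (len(weights) * 4)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Here `size_reduction` evaluates to 0.75, matching the roughly 75% figure, and the reconstruction error of each weight stays within one quantization step (`scale`).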

Another model compression technique, pruning, eliminates redundant parameters that contribute minimally to output accuracy. Pruning an object detection model like YOLOv5 can reduce its parameter count and computational cost by 40% before deployment.
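As a rough illustration, the sketch below applies magnitude-based pruning, one common way to decide which parameters count as redundant. The function name, the toy weight list, and the 40% sparsity target are assumptions for the example; real pruning pipelines usually fine-tune the model afterward to recover accuracy.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by absolute value, smallest first
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


layer = [0.80, -0.02, 0.45, 0.01, -0.60, 0.03, 0.30, -0.05, 0.90, 0.04]
pruned = magnitude_prune(layer, sparsity=0.4)  # drop the 40% smallest weights
remaining = sum(1 for w in pruned if w != 0.0)
```

In this toy layer, six of the ten weights survive, and the large-magnitude weights that dominate the output are untouched.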

TinyML frameworks such as TensorFlow Lite for Microcontrollers, ONNX Runtime, and PyTorch Mobile support compact model deployment. The following code shows an example quantization scenario with TensorFlow Lite.

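import numpy as np
import tensorflow as tf

# Post-training full-integer quantization using the TFLite converter.
# Converts 32-bit float weights and activations to 8-bit integers.
# Assumed to be defined elsewhere: saved_model_dir (path to a SavedModel)
# and X_train (NumPy array of training inputs used for calibration).

def representative_dataset():
    # Yield 100 samples so the converter can calibrate activation ranges
    for i in range(100):
        yield [X_train[i:i+1].astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset

# Restrict to int8 ops and make the model's input and output int8 as well
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_quant_model = converter.convert()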

Start with quantization, which delivers higher speedups without significant accuracy loss, then apply pruning to compress the model further. For the drone architecture, the target size on a single microcontroller is <1MB. Plumerai’s person detection model demonstrates how compression techniques can achieve this goal: using binarized neural networks, the model fits in 737KB on an ARM Cortex-M7 microcontroller with less than 256KB of on-chip RAM.

At the hardware level, energy-efficient processors such as the NVIDIA Jetson Nano, Google Edge TPU, and ARM Cortex-M, purpose-built for computer vision and sensor fusion workloads, execute AI models directly on edge devices. ARM Cortex-M variants deliver up to 600 giga-operations per second (GOPS) with an energy efficiency averaging 3 tera-operations per second per watt (TOPS/W), depending on configuration.

Drone deployment introduces architectural rigidity. With limited runtime intervention, the architecture must anticipate every failure state during design. DO-178C reinforces this constraint by requiring full system validation before deployment. Teams must engineer every model update and behavioral correction with no orchestration window.

Pattern 2: The Factory (Multi-Node Edge AI With Optional Cloud)

During network outages in manufacturing and large retail facilities, inference must continue in-house across multiple machines. The factory architecture meets this requirement by distributing AI workloads across on-premises edge clusters, keeping operational control within the facility boundary.

Cloud synchronization remains optional, used only for model retraining or batch analytics rather than as a runtime dependency. The priority is maintaining resilience and operational independence across all nodes, regardless of network availability.

Inference stays on the factory floor

The factory architecture centers on three components: edge gateways, compute nodes, and local storage.

An edge gateway routes sensor requests to edge nodes, which pull context from local edge databases like Actian Zen, act on model inference, and write the results back to the database. Decision-making and local computing stay on-premises. Cloud systems only handle model updates periodically or on trigger.

Figure: The factory architecture
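The request path through a gateway, an edge node, and a local database can be sketched as follows. This is a simplified illustration: an in-memory SQLite database stands in for the local edge database, a threshold rule stands in for the model, and names such as `EdgeGateway`, `edge_node_handle`, and the table layout are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the local edge database (an assumption for
# portability; the article's example store is Actian Zen).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE machine_context (machine_id TEXT PRIMARY KEY, temp_limit REAL)")
db.execute("CREATE TABLE inference_results (machine_id TEXT, reading REAL, label TEXT)")
db.execute("INSERT INTO machine_context VALUES ('press-01', 75.0), ('press-02', 90.0)")


def run_inference(reading, temp_limit):
    # Placeholder for a real model call: a simple threshold rule
    return "overheat" if reading > temp_limit else "normal"


def edge_node_handle(machine_id, reading):
    # 1) Pull context from the local database
    row = db.execute(
        "SELECT temp_limit FROM machine_context WHERE machine_id = ?", (machine_id,)
    ).fetchone()
    # 2) Act on model inference
    label = run_inference(reading, row[0])
    # 3) Write the result back locally; no cloud call on this path
    db.execute("INSERT INTO inference_results VALUES (?, ?, ?)", (machine_id, reading, label))
    return label


class EdgeGateway:
    """Routes sensor requests to edge nodes in round-robin order."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.next_node = 0

    def route(self, machine_id, reading):
        node = self.nodes[self.next_node % len(self.nodes)]
        self.next_node += 1
        return node(machine_id, reading)


gateway = EdgeGateway(nodes=[edge_node_handle, edge_node_handle])
labels = [gateway.route("press-01", 80.2), gateway.route("press-02", 80.2)]
```

The same reading produces different labels for the two machines because each node consults locally stored context before acting, and nothing on this path depends on a cloud round-trip.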

Industrial environments generate continuous, high-volume telemetry data from sensors, controllers, and inspection systems. Distributing inference across multiple edge nodes maintains high inference throughput. But without a local orchestration layer managing distribution and the model lifecycle, edge nodes operate as isolated processors rather than a coordinated system.

K3s, AWS IoT Greengrass, Azure IoT Edge, and Siemens Industrial Edge are popular orchestration tools for managing edge clusters. They differ in how they handle model deployment and node management.

K3s deploys containerized models as clusters of worker nodes with a control plane for health visibility. Configuring its datastore endpoint parameter enables teams to store local data in on-premises databases like PostgreSQL and Actian Zen, replacing the default SQLite. Chick-fil-A uses K3s at the edge to process point-of-sale transactions across 3,000+ restaurants.

AWS IoT Greengrass deploys cloud-compiled AI models as components with predefined inference functions to NVIDIA Jetson TX2, Intel Atom boards, and Raspberry Pi-powered devices. Inference remains on-premises, with data exported optionally to AWS IoT Core for model optimization. Pfizer manufacturing sites use AWS IoT Greengrass for near-real-time bioreactor monitoring to minimize contamination risk.

Siemens Industrial Edge deploys Docker-containerized models directly on the shop floor, delivering real-time machine status. Siemens Electronics Factory Erlangen reduced model deployment time by 80% and false anomaly detection on printed circuit boards (PCBs) by 50% using this orchestrator. By running inference on PCB images locally and outsourcing only model retraining to the cloud, the factory has saved data storage costs by 90%.

Azure IoT Edge uses a JSON deployment manifest to specify which containerized models to download to edge devices. Data processing happens at the edge with Azure IoT Hub providing centralized oversight while the devices maintain autonomy. Thomas Concrete Group uses Azure IoT Edge to collect data from sensors embedded in wet concrete, estimate the concrete’s hardening timeline, and send predictions to Azure IoT Hub.

The table below highlights the differences between each orchestrator.

Criteria	K3s	Azure IoT Edge	AWS IoT Greengrass	Siemens Industrial Edge
Node management	Manages nodes via a lightweight control plane	Manages nodes remotely through Azure IoT Hub	Manages nodes via AWS IoT Core	Manages nodes via the Siemens Industrial Edge Management platform
Model deployment	Deploys models as Kubernetes pods using standard container images	Configures deployments via a JSON manifest that defines which modules, containing the trained models, run on which nodes	Deploys models as components with predefined inference functions	Deploys models directly on shop floors as Docker containers
Cloud integration	Can be integrated with a central infrastructure	Supported via Azure IoT Hub	Integrates with AWS IoT Core	Supports integration with AWS services

When the OT network is the security boundary

Industrial companies converge their IT and operational technology (OT) networks to support on-premises AI and IoT integrations. But this convergence expands their attack surface area. 75% of OT attacks originate in IT environments, and 80% of manufacturers report increasing security threats across their IT/OT networks.

For teams considering factory deployment for industrial systems, network segmentation must become a top priority. Edge AI solutions should operate solely within the OT network in compliance with the Purdue model. Sensitive data and inference stay close to the machines, sensors, and Programmable Logic Controllers (PLCs) that need them. This security boundary minimizes lateral movement of threats from the IT network.

Pattern 3: Hierarchical Federated Learning (Client-Edge-Cloud)

Hierarchical federated learning (HFL) builds on a three-layer infrastructure for teams navigating data mobility restrictions at the edge.

At the lowest layer, client devices perform local training, optimizing model parameters through local gradient descent. Edge servers at the intermediate layer aggregate updated model weights from all client devices for statistical coherence. A final aggregation round by a cloud server marks the top layer, producing a global model that the edge servers distribute back to the client devices. Since only parameter updates traverse this hierarchy, intermittent connectivity does not halt training progress.

The image below captures this iteration, which continues until the global model reaches the desired accuracy or converges.

Figure: Hierarchical federated learning architecture
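One round of this client-edge-cloud loop can be sketched with sample-weighted averaging at both aggregation layers (the FedAvg-style rule, used here as an assumption). The gradients, sample counts, learning rate, and edge names below are made up for illustration.

```python
def weighted_average(updates):
    """Average weight vectors, weighting each by its sample count.

    `updates` is a list of (weights, n_samples) pairs.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total


def local_train(global_weights, local_gradient, lr=0.1):
    # One step of local gradient descent on the client; the gradient is
    # supplied directly here instead of being computed from private data
    return [w - lr * g for w, g in zip(global_weights, local_gradient)]


global_model = [0.0, 0.0]

# Two edge servers, each with clients described as (local gradient, sample count)
edge_clients = {
    "edge-A": [([1.0, -2.0], 100), ([3.0, 0.0], 300)],
    "edge-B": [([-1.0, 4.0], 200)],
}

edge_models = []
for edge, clients in edge_clients.items():
    client_updates = [(local_train(global_model, g), n) for g, n in clients]
    edge_models.append(weighted_average(client_updates))  # edge-level aggregation

global_model, _ = weighted_average(edge_models)  # cloud-level aggregation
```

Only weight vectors and sample counts move between layers in this sketch; the data that produced each gradient never leaves the client.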

Domains such as healthcare and financial services, where raw data is bound to its origin by privacy constraints, regulatory requirements, and bandwidth limitations, are ideal HFL use cases. Data sovereignty mandates and geopolitical tensions add another layer to this constraint, restricting where and how data flows at the infrastructure level.

A study by BARC found that 19% of companies plan to increase their on-premises investments, driven by this need for data sovereignty. HFL allows a shared model to improve across distributed nodes without the underlying data ever crossing a jurisdictional boundary.

A recent experimental HFL training in healthcare achieved 94.23% accuracy on the Modified National Institute of Standards and Technology (MNIST) dataset, while keeping data on client devices. Only relevant aggregated information ever reaches the cloud to preserve privacy and curtail data leakage risks.

In a healthcare deployment, wearable devices (lowest layer) train locally and send model updates to a hospital’s local edge server (intermediate layer), which aggregates updates from multiple wearables and sends them to a regional research institution (top layer) for final aggregation without exposing patient data.

HFL is the most complex pattern to implement. Tooling support remains fragmented, and unlike other patterns discussed, it currently lacks native support within the Actian ecosystem. Teams should weigh this implementation overhead before committing to this architecture.

The HFL architecture has three variants depending on which layer orchestrates data decisions.

1. Cloud-orchestrated hierarchical federated learning

The central cloud server coordinates the training process, client-edge communications, synchronization schedules, and the overall topology, with no additional aggregation rounds from the edge servers.

Cloud-orchestrated HFL fits financial institutions, where occasional reliable connectivity can sustain the coordination loop. In a fraud detection deployment, multiple banking institutions might train models using transaction data, sending updates to the cloud, which aggregates, validates, and redistributes the improved model back to the banks.

2. Edge-orchestrated hierarchical federated learning

Edge servers autonomously manage local client assignments, aggregating client updates to produce a locally improved model without cloud round-trips. Cloud systems step in only at intervals for bulk model retraining. Environments like offshore wind farms, where unstable connectivity is the baseline, benefit most from this variant. Turbines send model updates to a local edge server, which handles aggregation and independent model improvement.

3. Peer-to-peer aggregation

This variant focuses on a gossip-like model with no central orchestrator. Clients exchange their model weights with other nodes, reducing gradient conflicts under heterogeneous data.

Where the core HFL pattern reduces cloud ingress fees through aggregated updates, peer-to-peer aggregation keeps both training and aggregation within participating nodes. In distributed environments like smart cities, traffic sensors exchange anomaly-detection updates directly with neighboring devices until they converge on an improved model across the network organically.
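A gossip-style exchange can be illustrated with a toy simulation. The sketch assumes a synchronous round in which every node averages its weights with those of its direct neighbors; the ring topology, sensor names, and one-parameter models are hypothetical.

```python
def gossip_round(models, neighbors):
    """One synchronous round: each node averages its weights with its neighbors'."""
    new_models = {}
    for node, weights in models.items():
        group = [weights] + [models[peer] for peer in neighbors[node]]
        new_models[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return new_models


# Ring of four traffic sensors, each holding a one-parameter model
neighbors = {"s1": ["s2", "s4"], "s2": ["s1", "s3"], "s3": ["s2", "s4"], "s4": ["s3", "s1"]}
models = {"s1": [1.0], "s2": [2.0], "s3": [3.0], "s4": [6.0]}

for _ in range(30):
    models = gossip_round(models, neighbors)

spread = max(m[0] for m in models.values()) - min(m[0] for m in models.values())
```

After repeated rounds, every node settles near the network-wide average of 3.0 without any node ever seeing all four models at once.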

All three variants differ in their functional requirements, highlighted in the table below.

Feature	Cloud-orchestrated	Edge-orchestrated	Peer-to-peer aggregation
Orchestration model	Cloud coordinates all aggregation and model distribution	Edge server aggregates locally, syncs with cloud periodically	No orchestrator; updates propagate between clients until convergence
Privacy level	Medium; the cloud controls model updates	High; raw data remains on local edge servers	High; no central point oversees aggregated updates
Bandwidth requirements	High; all updates are sent to the cloud	Medium; only aggregated updates reach cloud	Low; updates only travel between neighboring peers
Disconnection tolerance	Low; cloud disconnection breaks coordination	High; edge server operates independently during outages	Medium; network partitions slow convergence

HFL’s layered infrastructure supports large-scale model training by distributing computation and communication across multiple nodes in the hierarchy. The challenge with this multi-tier design lies in navigating communication overhead, stale global models, and node reconfigurations.

In HFL, communication cost is directly proportional to the model update size. Gradient compression techniques such as random sparsification and stochastic rounding shrink update payloads by up to 98% before transmission.
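A minimal sketch of the two techniques named above, using only the standard library. The keep probability, quantization step, and function names are illustrative assumptions, not values from any particular framework, and the reduction you get in practice depends on the model and the settings you choose.

```python
import math
import random


def random_sparsify(update, keep_prob, rng):
    """Keep each coordinate with probability keep_prob.

    Kept values are rescaled by 1/keep_prob so the expected update is unchanged.
    Returns a sparse {index: value} mapping.
    """
    return {i: v / keep_prob for i, v in enumerate(update) if rng.random() < keep_prob}


def stochastic_round(value, step, rng):
    """Round to a multiple of step, rounding up with probability equal to the remainder."""
    scaled = value / step
    lower = math.floor(scaled)
    round_up = rng.random() < (scaled - lower)
    return (lower + (1 if round_up else 0)) * step


def compress(update, keep_prob=0.02, step=0.01, seed=0):
    """Sparsify, then quantize, a dense update before transmission."""
    rng = random.Random(seed)
    sparse = random_sparsify(update, keep_prob, rng)
    return {i: stochastic_round(v, step, rng) for i, v in sparse.items()}


update = [0.001 * i for i in range(10_000)]
payload = compress(update)
print(len(payload), "of", len(update), "coordinates transmitted")
```

Both steps are unbiased in expectation, which is what lets the aggregator average many compressed updates without a systematic drift.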

The asynchronous update cycle of HFL, where the global model incorporates client updates as they arrive, also amplifies the likelihood of stale model parameters. Weighted aggregation limits the influence of stale updates, preventing slower devices from degrading the global model.
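One way to limit the influence of stale updates is to shrink the mixing rate as an update ages. The sketch below assumes a polynomial decay on staleness (measured in global rounds); the decay form, the default rates, and the name apply_update are assumptions chosen for illustration.

```python
def apply_update(global_model, client_model, staleness, base_rate=0.5, decay=0.5):
    """Blend one client update into the global model as it arrives.

    staleness: number of global rounds since the client pulled its starting model.
    The mixing rate shrinks as staleness grows, so slow devices move the
    global model less than fresh ones.
    """
    alpha = base_rate * (1 + staleness) ** -decay
    return [(1 - alpha) * g + alpha * c for g, c in zip(global_model, client_model)]


fresh = apply_update([0.0, 0.0], [1.0, 1.0], staleness=0)
stale = apply_update([0.0, 0.0], [1.0, 1.0], staleness=3)
print(fresh, stale)
```

With these defaults, an update that is three rounds old moves the global model half as far as a fresh one.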

Topology shifts add another challenge. Clients get reassigned to different edge servers, roles shift between client and aggregator nodes, and new devices join mid-training. Each reconfiguration stalls convergence and degrades accuracy if new edge servers lack prior training history.

Pattern 4: Store-and-Forward Disconnected Inference

In disconnected environments, connectivity gaps can stretch for hours or days. Store-and-forward architecture accounts for this reality, sustaining large-scale data processing and storage during downtime, and forwarding summaries to the cloud once the system reconnects.

For industrial automation environments, such as remote oil and gas operations and maritime vessels operating miles from cellular towers, this architecture solves the core problem of maintaining data continuity despite network disruption.

Inference doesn’t wait for the cloud

Store-and-forward deployment follows a hybrid approach. Training begins in the cloud, but execution shifts to the edge after model deployment. When connectivity drops, decision-making, control loops, and alarm triggers continue locally without interruption, and the system buffers timestamped results to a local edge database until synchronization resumes.

Upon network restoration, the edge gateway offloads all buffered events to a central cloud infrastructure, providing the data required to push updated models and optimize AI pipelines.
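The buffer-then-offload cycle can be sketched with SQLite from the Python standard library. The EdgeBuffer class, its table layout, and the send callback are hypothetical names for illustration; a real gateway would add batching, retention limits, and deduplication on the cloud side.

```python
import json
import sqlite3
import time


class EdgeBuffer:
    """Local, timestamped buffer for inference results during an outage."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL NOT NULL, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def store(self, result, ts=None):
        """Persist one result; inference never waits on the network."""
        self.db.execute(
            "INSERT INTO events (ts, payload) VALUES (?, ?)",
            (time.time() if ts is None else ts, json.dumps(result)),
        )
        self.db.commit()

    def forward(self, send):
        """Send buffered events oldest-first; delete each only after send() succeeds."""
        sent = 0
        rows = self.db.execute("SELECT id, ts, payload FROM events ORDER BY id").fetchall()
        for row_id, ts, payload in rows:
            if not send({"ts": ts, **json.loads(payload)}):
                break  # link dropped again; keep the rest for the next window
            self.db.execute("DELETE FROM events WHERE id = ?", (row_id,))
            self.db.commit()
            sent += 1
        return sent

    def pending(self):
        return self.db.execute("SELECT COUNT(*) FROM events").fetchone()[0]


buf = EdgeBuffer()
buf.store({"sensor": "pump-7", "anomaly": True}, ts=1.0)
buf.store({"sensor": "pump-7", "anomaly": False}, ts=2.0)

uploaded = []
buf.forward(lambda event: uploaded.append(event) or True)  # stand-in for a cloud upload
print(len(uploaded), "events forwarded,", buf.pending(), "still buffered")
```

Deleting a row only after the upload succeeds means an interrupted sync can resend an event but never lose one, which is why deduplication belongs on the receiving side.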

[Figure: Store-and-forward architecture]

Store-and-forward architecture creates a feedback loop that prevents data loss during disconnection. In manufacturing plants, supervisory control and data acquisition (SCADA) systems continue collecting data from programmable logic controllers (PLCs), Remote Terminal Units (RTUs), and edge gateways until connection resumes.

When the data finally moves

The “forward” part of this architecture relies on lightweight communication protocols like Message Queuing Telemetry Transport (MQTT), designed for unstable networks and bandwidth-limited environments.

MQTT’s publish-subscribe model routes queued updates from edge gateways to the cloud through brokers like Mosquitto. Publishers (sensors) send messages to a topic (temperature), and subscribers (cloud servers) receive messages from their registered topics. Messages replay in the exact chronological order they were received.

The Python code snippet below illustrates a starting-point implementation using the Paho MQTT library. It publishes with Quality of Service (QoS) 1, which guarantees at-least-once delivery to the broker. Combined with a subscriber's persistent session, this lets Mosquitto queue messages while the subscriber is offline.

# pip install paho-mqtt
 
import paho.mqtt.publish as publish
import sys
 
if len(sys.argv) < 3:
    print("Usage: publisher.py <topic> <message>")
    sys.exit(1)
 
# Production code will add retry logic, local queue persistence, and message deduplication
 
topic = sys.argv[1]
message = sys.argv[2]
publish.single(topic, message, hostname="localhost", qos=1)

To receive the queued messages after reconnection, the subscriber script below requests a persistent session with clean_session=False and calls loop_forever(), which keeps the connection alive and reconnects automatically.

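import paho.mqtt.client as mqtt
import sys

if len(sys.argv) < 2:
    print("Usage: subscriber.py <topic>")
    sys.exit(1)
topic = sys.argv[1]
# A persistent session needs a stable, non-empty client ID so the broker can match the queue to this client
client_id = "test-client"

def on_connect(client, userdata, flags, rc):
    print(f"Connected with result code {rc}")
    # Subscribe on every (re)connect; QoS 1 lets the broker queue messages for this session
    client.subscribe(topic, qos=1)

def on_message(client, userdata, msg):
    print(f"{msg.topic}: {msg.payload.decode()}")

# paho-mqtt 2.x expects a callback API version as the first argument; 1.x has no such enum
if hasattr(mqtt, "CallbackAPIVersion"):
    client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION1, client_id=client_id, clean_session=False)
else:
    client = mqtt.Client(client_id=client_id, clean_session=False)
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883, 60)
# Blocks here, dispatching callbacks and handling reconnects
client.loop_forever()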

Store-and-forward architecture can introduce data replication inconsistencies during gateway synchronization. The system requires an arbitration policy, such as last-write-wins, which applies changes based on each update’s timestamp. When timestamps are identical, data structures like Conflict-free Replicated Data Types (CRDTs) merge copies to achieve a consistent final state across all edge gateways.
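A last-write-wins policy can be sketched as a small merge function. The Versioned type and the tie-break on gateway_id are assumptions for illustration: breaking timestamp ties deterministically is what makes every gateway pick the same winner, and it is far simpler than a full CRDT library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: object
    timestamp: float
    gateway_id: str  # hypothetical stable identifier for the writing gateway


def merge(a, b):
    """Last-write-wins: the later timestamp wins.

    Identical timestamps fall back to gateway_id, so the result does not
    depend on the order in which replicas see the two copies.
    """
    return max(a, b, key=lambda v: (v.timestamp, v.gateway_id))


older = Versioned("valve=open", 100.0, "gw-a")
newer = Versioned("valve=closed", 105.0, "gw-b")
tied = Versioned("valve=half", 105.0, "gw-c")

print(merge(older, newer).value)
print(merge(newer, tied).value)
```

Because merge returns the same result regardless of argument order, gateways can synchronize in any sequence and still converge on one value.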

Delta sync further improves CRDTs’ results. Where full dataset replication triggers on every record change, delta sync resolves conflicts at the property level, addressing only the modified fields.
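Property-level delta sync can be sketched as a diff between two versions of a record. The function names and the {"set": ..., "unset": ...} wire format below are hypothetical, chosen only to show that the payload carries the modified fields rather than the whole record.

```python
_MISSING = object()  # sentinel so a stored None is not mistaken for "absent"


def compute_delta(old, new):
    """Return only the fields that changed between two versions of a record."""
    changed = {k: v for k, v in new.items() if old.get(k, _MISSING) != v}
    removed = [k for k in old if k not in new]
    return {"set": changed, "unset": removed}


def apply_delta(record, delta):
    """Apply a delta to a replica's copy of the record."""
    merged = {**record, **delta["set"]}
    for key in delta["unset"]:
        merged.pop(key, None)
    return merged


old = {"id": 17, "temp_c": 71.5, "status": "ok", "operator": "night-shift"}
new = {"id": 17, "temp_c": 74.0, "status": "ok"}

delta = compute_delta(old, new)
print(delta)
print(apply_delta(old, delta) == new)
```

Here only temp_c and the removed operator field travel over the link; the unchanged id and status fields stay put.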

Pattern 5: The Network (Distributed Edge-to-Edge Fabric)

The network deployment pattern addresses the lack of fault tolerance and distributed processing prevalent in disconnected multi-site operations such as logistics networks and smart grids.

Coordinating edge devices across multiple locations through a cloud system quickly breaks outside network coverage. This is why the network architecture follows an east-west communication pattern, enabling edge nodes to exchange data directly with peers without central coordination.

Mesh communication handles distributed intelligence

The network deployment pattern adopts a non-hierarchical design, connecting multiple IoT devices through a mesh network to improve system uptime during outages. Each node dynamically communicates with its neighbors, forming a bidirectional network that relays data to remote environments via multi-hop paths.

[Figure: Network architecture]

The cloud only joins as a peer for optional sync, but core computing remains on the network, working without centralized control.

Smart grids are well-suited for this architecture, where teleprotection demands 10–20ms latency. A network of transmission substations continuously tracks electricity flow and consumption patterns in real-time to detect imbalances before they escalate. That real-time visibility supports dynamic load redistribution and autonomous microgrid management.

Military uncrewed aerial vehicles (UAVs) are another use case. When GPS fails in denied, disrupted, intermittent, and limited (DDIL) environments, UAVs relay intelligence, surveillance, and reconnaissance (ISR) data between each other through mesh networks. Adaptive interference routing ensures reliable data flow, while line-of-sight transmission reduces latency.

This deployment pattern optimizes for network redundancy. Gossip protocol and distributed consensus algorithms like Raft eliminate single points of failure. When a node loses connection, the network remains operational, rerouting its data through other nodes.

Gossip protocol enables live peer discovery through continuous, lightweight information exchanges. Each node always has a current view of its local network. Raft follows a leader-based approach where an elected leader node handles all writes, and log replication ensures follower nodes maintain a shared state. Edge databases replicate data across multiple nodes to improve consistency.

Treating Gossip and Raft as competing options overlooks what actually matters. The focus should be on understanding where each sits in the CAP theorem and the trade-offs they introduce to a distributed network.

The consistency vs. availability trade-off

When network partitions split the mesh, Raft ensures strong data consistency, while Gossip provides availability fallback and eventual consistency when paired with approaches like CRDTs.

In edge computing, where connection is limited and nodes are numerous, partition tolerance is non-negotiable. Edge AI systems must choose whether to prioritize consistency or availability when implementing the network architecture.

Availability is often optimal, as edge nodes continue to function independently after disconnection. Consistency-focused designs like Raft risk write suspensions and stale reads during network partitions.
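The write suspension follows from Raft's majority rule, which the short sketch below makes concrete. The helper name has_quorum is hypothetical; the rule itself (a leader commits only with acknowledgement from a strict majority of the cluster, itself included) is how Raft works.

```python
def has_quorum(reachable_nodes, cluster_size):
    """True if this side of a partition holds a strict majority of the cluster."""
    return reachable_nodes > cluster_size // 2


# A five-node edge cluster split 3/2 by a network partition
for side in (3, 2):
    state = "accepts writes" if has_quorum(side, 5) else "writes suspended"
    print(f"{side} of 5 nodes reachable: {state}")
```

An even split leaves neither side with a majority, so a four-node cluster partitioned 2/2 stops accepting writes everywhere until the link heals.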

Feature	Raft	Gossip
Architecture	Leader election and log replication	Peer-to-peer
Latency	Moderate; each write must be acknowledged by a quorum of nodes before it commits	Low per message, but reaching every node takes several propagation rounds
Consistency guarantees	Strong consistency	Eventual consistency
Partition tolerance	Moderate; only the side of a partition that holds a quorum keeps accepting writes	High; nodes keep operating and reconcile state once the partition heals

Speed and data delivery trade-offs are another critical constraint of the network architecture. Mesh networking adds latency with each hop, and hop counts grow as the node count increases. Whether your system needs data back in <50ms or can tolerate >100ms should shape your design decision.

Choosing the Right Edge AI Deployment Pattern

There’s no specific “right” edge AI deployment pattern for disconnected environments. A solid architecture implementation begins with a clear grasp of the specific constraints, goals, and characteristics of your target application. This means envisioning the full workload lifecycle, including connectivity profile, available compute resources, and latency requirements.

1. Evaluate network stability

Network stability is the primary driver of any edge AI deployment strategy. Determine how much resilience must be engineered into the edge nodes based on the expected duration of disconnection.

If the system is always disconnected: Use drone or network architectures as they are designed to operate completely offline regardless of connectivity status.
If the interruption persists for only minutes or hours: Use factory or HFL architecture to continue data aggregation and inference without interruption. The system remains functional during the outage because all required dependencies already exist within the operational perimeter.
If disconnection lasts for days or weeks: Use the store-and-forward architecture to buffer inference results and operational data locally until the next scheduled connectivity window.

2. Assess latency requirements

Define the maximum acceptable latency for your specific application by considering network hops, node availability, and geographical proximity of the edge nodes. The thresholds below reflect typical deployment patterns. Validate them against your specific hardware and network conditions.

If the system requires <50ms latency: Use the drone deployment pattern. Its single-node architecture keeps inference directly on sensors, cameras, or gateways, enabling near-real-time responses. Factory architecture also minimizes latency by running on edge servers within the same facility or on the factory floor.
If the system requires <100ms latency: Use the network or HFL architecture to distribute model improvement workloads across multiple nodes.
If <500ms latency is acceptable: Use store-and-forward architecture for non-critical IoT data that requires batch processing or long-term analytics. It batch-offloads data-intensive tasks to the cloud.
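The connectivity and latency criteria above can be combined into a small lookup that narrows the candidate patterns. This is a sketch, not a definitive selector: the tier labels and function names are hypothetical, and it assumes that a pattern meeting a tighter latency tier also satisfies a looser one.

```python
# Candidate patterns per outage profile, as listed under "Evaluate network stability"
BY_OUTAGE = {
    "always": {"drone", "network"},
    "minutes-hours": {"factory", "HFL"},
    "days-weeks": {"store-and-forward"},
}

# Latency tiers from tightest to loosest, as listed under "Assess latency requirements"
LATENCY_TIERS = [
    ("<50ms", {"drone", "factory"}),
    ("<100ms", {"network", "HFL"}),
    ("<500ms", {"store-and-forward"}),
]


def patterns_within(latency):
    """Patterns that fit a latency tier, including those from tighter tiers (assumption)."""
    allowed = set()
    for tier, patterns in LATENCY_TIERS:
        allowed |= patterns
        if tier == latency:
            return allowed
    raise ValueError(f"unknown latency tier: {latency}")


def shortlist(outage, latency):
    """Patterns that satisfy both the outage profile and the latency tier."""
    return sorted(BY_OUTAGE[outage] & patterns_within(latency))


print(shortlist("always", "<50ms"))
print(shortlist("minutes-hours", "<100ms"))
print(shortlist("days-weeks", "<50ms"))
```

An empty shortlist means no single pattern meets both constraints, which is the signal to look at combining patterns across layers of the system.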

3. Evaluate resource constraints

Edge AI applications differ in processing power, storage, and bandwidth consumption, which impacts inference speed, data aggregation, and real-time analytics. Evaluate each resource limit independently:

Power constraint: For compute power <1 GFLOPS, common in microcontrollers used for sensor inference, the drone architecture is most suitable. It runs on constrained IoT devices using lightweight, inference-only models. At 10–100 GFLOPS, common in edge gateways, HFL and network architectures become more effective as they handle data aggregation needs well at this level. For edge GPU clusters that scale to >10 TFLOPS, factory and store-and-forward architecture support clustered inference pipelines, since they run on-premises.
Bandwidth constraint: Use store-and-forward architecture or HFL to store and process raw, high-volume data at the edge, forwarding only summarized updates to the cloud if required.
Data storage constraint: Use factory or store-and-forward architectures paired with embedded databases to store time-series data locally and scale vertically within the facility. Databases like Actian Zen are optimized for edge AI use cases and can also sync with the cloud once connectivity is restored.

4. Consider a hybrid approach

Industrial systems often combine the strengths of multiple architectures into a coordinated system that delivers resilience and flexibility. Rio Tinto’s mining operations illustrate what hybrid deployment looks like at scale.

At the Greater Nammuldi iron ore mine, more than 50 autonomous trucks operate on predefined routes, using onboard sensors to detect obstacles, an example of the drone architecture. Across 17 sites in Western Australia, these trucks transmit operational data to Rio Tinto’s Operations Centre in Perth, reflecting the network architecture. Finally, an autonomous rail system transports mined ore, synchronizing with the Operations Centre upon reaching port facilities. This fits the store-and-forward architecture.

Rio Tinto demonstrates that deployment patterns are not mutually exclusive. If your use case requires multiple architectures, consider running them on the layer of the system where they’re best suited, rather than forcing a single architecture across the entire operation.

[Figure: Decision framework for choosing an edge AI architecture]

The following table maps specific deployment scenarios to their optimal disconnected edge AI deployment pattern to inform your decision.

Deployment scenarios	Recommended pattern	Rationale
Autonomous inspection drones over oil fields or offshore wind farms	Drone (single-node self-contained)	A self-contained inference runtime with embedded local storage eliminates distributed computation to meet hardware limitations
Automotive assembly lines running defect detection models	Factory (multi-node edge AI)	Cloud dependency is too risky for uptime requirements, so edge clusters run within the facility
Hospital networks where patient data cannot leave individual facilities under HIPAA	Hierarchical federated learning	Models train locally and send only weight updates to the cloud, so raw data remains on the local site in compliance with data sovereignty and privacy requirements
Cargo vessels at sea syncing operational data at port	Store-and-forward	A local buffer ensures no inference result or operational event is lost across connectivity gaps that can last days
Smart city traffic management across distributed intersections with no central server dependency	Network (distributed edge-to-edge fabric)	Nodes communicate peer-to-peer via consensus, so node loss reduces capacity without disrupting overall network operation

The Bottom Line

Industries operating across remote, underground, maritime, and geographically dispersed terrain need edge-native architectures that capture real-time insights and keep critical assets running without cloud dependency.

The deployment patterns discussed prioritize what matters most for disconnected environments: local inference, no centralization latency, lower communication costs, and system autonomy.

Before committing to a pattern, validate three things in your own environment: how long your system can tolerate network outage before data loss becomes operationally significant, whether your edge hardware can sustain the compute demands of your chosen architecture without degrading inference quality, and whether your team has the tooling maturity to manage model lifecycle at the edge without cloud dependency. Map your constraints against the decision framework above.

The right answer might not be a single pattern. Layer in hybrid approaches only when the resilience gains justify the operational complexity.

Each pattern depends on a data infrastructure that can operate, store, and sync entirely at the edge. For teams that need to go beyond structured storage and perform semantic search on their local data without exporting vector embeddings to a cloud server, Actian VectorAI DB is optimized for this use case. Join the waitlist for early access.

Join the Actian community on Discord to discuss edge AI architecture patterns with engineers deploying in disconnected environments.

About Author

About Nick Johnson

Why Accuracy Became My Obsession in AI Analytics

By Amra Dorjbayar

#AI #Analytics #Data Governance

Summary

AI analytics can produce plausible answers, but inconsistent results erode trust in enterprise decision-making.
Reliable AI analytics requires deterministic business logic, not probabilistic prompt engineering.
A governed semantic layer ensures consistent definitions for metrics like revenue, churn, and active customers.
Combining AI with strong data governance, quality, and lineage helps deliver trustworthy insights at scale.

Everyone remembers the first time they saw an AI answer a data question. Someone types a question in plain English, and out comes an answer with charts and everything. It feels like magic. You think: This changes everything.

And it does — until you ask the same question twice and get a completely different number. That is the exact moment the magic dies.

This is the core problem with “AI analytics” as a category. Language models are very good at producing responses that sound correct. In data analytics, the answer simply needs to be correct, consistently.

In enterprises, a “plausible” number you can’t trust is significantly worse than no number at all. If a CFO acts on a hallucinated revenue figure, that’s no harmless mistake — it’s a liability.

Solving this trust gap has been our singular mission since day one at Wobby, and it remains our mission now as Actian AI Analyst.

We didn’t set out to build just another “chat with your data” tool; we set out to give business users answers they can trust, so they can make decisions without second-guessing the math.

The Journalist’s Paranoia

My obsession with accuracy didn’t begin in a software startup; it started in a newsroom.

Before Wobby, I was a data journalist. Back then my biggest fear was publishing a calculation error that would mislead millions of readers. When your work becomes the public record, your math must be bulletproof.

During the COVID-19 pandemic, I watched a colleague manually copy government infection data into a spreadsheet every morning to update our graphs. I saw the risk immediately. One slip of a finger or one retroactively updated number could misrepresent a public health crisis. I automated that workflow because the truth was too fragile to leave to manual entry.

That same paranoia drives our approach to AI analytics. We knew that if we were going to ask businesses to trust an AI with their metrics, we couldn’t just “prompt” our way to accuracy.

A Different Architecture for Trustworthy AI Analysts

When teams run into the “different answers for the same question” problem, they usually try to fix it with more instructions. More examples. More context. More guardrails. A longer system message. A few-shot prompt that “teaches” the model what revenue means.

We tried all of it. It works in demos. It doesn’t work as an architecture.

Because the problem isn’t that the prompt is missing some magic sentence. The problem is that you’re asking a probabilistic system to behave like a deterministic one.

So we made a different bet. We stopped asking the model to decide how business definitions should be calculated.

Instead, we defined them explicitly and deterministically in a semantic layer. Terms like revenue, active customer, or churn are structured in advance, along with the filters and relationships that determine how they’re computed. When someone asks a question, the AI interprets the language, but it assembles the answer from logic that has already been governed.

The flexibility remains in how people ask. The consistency remains in how the numbers are calculated.

By making the context about the data deterministic, we eliminated the variation that causes answers to drift.
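To make the idea concrete, here is a deliberately small sketch of a governed metric registry. Every name in it (the METRICS dictionary, the table and column names, build_query) is hypothetical and does not describe the actual product; it only shows how the model's job shrinks to picking a metric name while the calculation itself comes from a fixed definition.

```python
# Hypothetical governed definitions: expression, source table, and mandatory filters
METRICS = {
    "revenue": {
        "table": "orders",
        "expression": "SUM(amount)",
        "filters": ["status = 'paid'"],
    },
    "active_customers": {
        "table": "events",
        "expression": "COUNT(DISTINCT customer_id)",
        "filters": ["event_type = 'login'"],
    },
}


def build_query(metric_name):
    """Assemble SQL from the governed definition; the same name always yields the same SQL."""
    metric = METRICS[metric_name]  # unknown metrics fail loudly instead of being improvised
    where = " AND ".join(metric["filters"])
    return f"SELECT {metric['expression']} AS {metric_name} FROM {metric['table']} WHERE {where}"


print(build_query("revenue"))
```

However a question about revenue is phrased, the query that runs is identical, so asking twice cannot produce two different numbers.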

Why Actian

As a five-person startup, our biggest challenge was never the product. It was convincing enterprises that a small team could solve problems that Snowflake, Databricks, and Microsoft were still struggling with. And even when we proved we could, there was always the next question: Will you still exist in three years?

That’s what led us to Actian — and honestly, it makes sense from so many angles that it almost feels inevitable.

For trustworthy AI analytics to work in production, you need more than a smart agent. You need governance. Data quality. Lineage. Stewardship. Access control. The hard, unsexy infrastructure that determines whether AI agents can actually operate reliably across a large organization.

Actian had spent decades building exactly that. What was missing was the AI glue to connect it all — and that’s what we bring.

We’ve all seen the demo that works perfectly. One polished question, one clean answer. But enterprise analytics doesn’t live in demos. It lives in hundreds of unscripted questions, asked by different people, in different ways. Our goal was never to build magic demos. It was to build something enterprises can actually rely on.

About Author

About Amra Dorjbayar

Amra Dorjbayar is a co-founder of Wobby (acquired by Actian in 2026) and the lead behind the Actian AI Analyst. A former award-winning investigative and data journalist, Amra spent many years uncovering stories within complex datasets. He leveraged that expertise to build “agentic” AI systems that democratize data access through natural language.
