Summary
- Features Eric and Davis Broda discussing the Agentic Mesh paradigm.
- Extends Data Mesh principles to agentic systems and AI agents.
- Explores identity, purpose, communication, and semantic context.
- Reimagines systems where agents reason, collaborate, and act.
Chapters
Okay, let's dive in. Today's topic is the agentic future of the enterprise, based on your book, Eric and Davis, called Agentic Mesh. I have a long list of questions that I want to ask you.
First of all, welcome. It's a pleasure to have you and Davis. The first question is one I would like to ask you, Davis, and Eric, you can obviously chip in if you feel like it, but we're discussing the agentic future of the enterprise.
So why not start with a brief definition of what an agent is? How does it work? What is it, in your words, Davis?
All right. Well, an agent is a software actor that can interpret context, pursue a goal, choose actions, execute work through tools and APIs, and interact with data, with people, and with other agents. The core feature here is agency.
It doesn't just generate an answer; it decides what to do next. It carries state across steps and adjusts based on the outcomes of prior steps. In enterprise use, an agent should be treated as an operational entity with identity, permissions, policies, and observable behavior.
That makes it more than a model call or a chatbot. It becomes a governed software participant that can perform bounded work inside a business environment. Oh, cool.
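Davis's definition can be sketched as a minimal loop in code: the agent chooses its next action, executes it through a tool, carries state forward, and adjusts based on prior outcomes. All names here are illustrative, not from the book.

```python
def run_agent(goal, tools, decide, max_steps=10):
    """Pursue `goal` by repeatedly choosing a tool until `decide` says stop."""
    state = {"goal": goal, "history": []}
    for _ in range(max_steps):
        action = decide(state)            # the "agency": choose what to do next
        if action is None:                # the agent judges the goal is met
            break
        result = tools[action](state)     # execute work through a tool or API
        state["history"].append((action, result))  # carry state across steps
    return state

# Hypothetical usage: a stub tool and a trivial decision function.
tools = {"lookup": lambda state: "42"}

def decide(state):
    return "lookup" if not state["history"] else None

out = run_agent("find the answer", tools, decide)
```

The `max_steps` bound reflects the "bounded work" point above: even an autonomous loop runs inside limits the enterprise sets.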
Yeah. Let me just add a little bit to that. Fantastic answer, Davis.
There are two types of agents I'd like to bring to the discussion. One is coding agents. If folks have been using Claude Code or Codex, they probably recognize that these have changed software engineering, and they are moving into the data engineering space as well as the business domain space. And we see things like OpenClaw, tremendous innovations.
But here's the difference, and this is where we get into our enterprise discussion: these are personal agents. These are ones that run under your ID, your role, your permissions. You give them access to your environment.
You give them access to your resources. Enterprise agents, which we're going to talk about a little bit more, are different. They're operational entities.
As Davis highlighted, they have identity, permissions, policies, and observable behavior. These agents are governed software participants in the enterprise, just like any other enterprise application. And that's the key difference. I just want to lay the groundwork for how people have perhaps heard things about Claude Code and agents, and where we see the environment actually going.
Yeah. Thank you, Eric. Thank you for specifying this.
I guess the participants here have a bit of knowledge in the topic already, or at least are interested, right? So we can get slightly technical here, of course, but some of the questions I have prepared are also of a more high-level nature, so that we can get a thorough understanding of the topic altogether. And I want to open with the premise of your book, actually.
Now that we are into the weeds: you have written a book, and obviously the question is for both of you, so feel free to chip in as you want. But I want to open with a little bit of a challenge, and I guess perhaps it's an easy one.
You chose to write a book on what you call the Agentic Mesh. Why didn't you choose to write a book on what you could provocatively call the monolith? Why is it a mesh?
Why isn't it a monolith? Sure. That's a great question.
I'm going to bring up a few quotes from industry leaders. Andy Jassy, CEO of Amazon, said there are going to be a million agents in every enterprise. Satya Nadella of Microsoft has said that agents will replace software as a service.
Jensen Huang, CEO of Nvidia, has said there will be thousands, if not tens or hundreds of thousands, of agents in every single enterprise. So what we do know is that even if they're vastly incorrect, and there are only a thousand or maybe 10,000 agents in an enterprise, there are very few software entities in any enterprise that number 10,000 of anything. We've perhaps seen something close to that in managing microservices.
The reason a mesh comes into play is that if you need to manage a hundred, a thousand, 10,000, or maybe a million of anything, you need to think of a different architecture. A monolith would just not be practical.
Managing a million agents in a monolith would probably not be deployable in any meaningful, modern architecture today. But if I distribute them, I have a chance. And quite honestly, the distributed architecture is what we see every single day: whether you're using the cloud or even within data centers, you see thousands of nodes out there.
So we think the most logical architecture for agents is a federated model, and a mesh, one where by definition any agent can collaborate with any other agent, lends itself to a mesh-based architecture just out of practical necessity, Ole. Yeah, very fair. This is just a personal observation, but in the very early days of ChatGPT 3.5, I think there was this idea, and this is going to sound awfully untechnical, that the future of AI would somehow be very big chunks of software performing something, simply because of the term large language model and the idea that it would somehow translate into some kind of monolithic architecture.
But I do agree that what we're seeing is a very widespread understanding that agentic architectures must exist in what you can call a mesh, in the sense that you will have a very big number of agents that work together. In your book, in Agentic Mesh, you distinguish between three types of agents, if you will, three ways of working with AI in the company: what you call AI workflows, autonomous agents, and enterprise-grade agents.
Actually, in your definition at the beginning of the conversation, Davis, you already alluded a little bit to this. If you feel like it, I would want to expand a little bit on your definition and ask: what's the difference between AI workflows, autonomous agents, and enterprise-grade agents? All right. That is a great question to ask, and it really gets to some of the changes AI has been going through in the past few years.
An AI workflow is a system where the LLM and its tools proceed along a predefined code path. This means the AI workflow has every step it could take and every decision point laid out in advance by whoever is creating it.
You might be using an LLM at step three or step four to summarize something, or to extract a bit of information, or to classify something. But fundamentally, every step in that process has been defined in advance. The AI can only summarize, it can only extract, once you've already reached that step; it can't influence the next steps very much.
It's basically fixed, and if anything unexpected shows up that the designer of the workflow didn't see in advance, you're just going to get an error, even in cases where it might be obvious to a human what the correct path is. Contrast this with an autonomous agent: an agent can dynamically determine its own execution path. If it has the tools available, even in an unexpected situation, it can apply them and work past the unexpected to figure out a new way to continue moving forward.
It can recognize and implement new solutions on demand to deal with changing conditions. The adaptability and dynamic planning are really what set it apart. And then, beyond that, enterprise-grade agents: that is the way you integrate an autonomous agent into the enterprise to ensure that it is discoverable, that it is secure, that it is trusted, and all the other things that are going to be needed to truly make an enterprise accept agents.
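The workflow-versus-agent distinction Davis draws can be sketched as code. The workflow hard-codes its path; the agent picks its next tool at runtime. All names here are hypothetical illustrations, not from the book.

```python
def workflow(doc, summarize, classify):
    """AI workflow: every step and decision point is fixed in advance."""
    summary = summarize(doc)    # step 1, always happens
    label = classify(summary)   # step 2, always happens
    return {"summary": summary, "label": label}

def agent(doc, tools, plan_next):
    """Autonomous agent: `plan_next` chooses the next tool dynamically,
    so the execution path can adapt to what has happened so far."""
    state = {"doc": doc, "outputs": []}
    while True:
        tool = plan_next(state)
        if tool is None:        # the agent decides it is done
            break
        state["outputs"].append(tools[tool](state))
    return state
```

In a real system `plan_next` would be an LLM call; the point is only that the branching lives inside the agent, not in the code path around it.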
Oh, yeah. Okay. So I see the difference more clearly now, obviously.
I read the book in advance in preparation for this webinar. I also served as a technical reviewer; I don't think I said that in the beginning.
But what I am convinced about is that this kind of thinking, and the way you describe it, will make your book a go-to source for understanding the agentic architectures that are possible to create in the enterprise, exactly because of these kinds of distinctions.
And perhaps turning the mic a little bit to you, Eric: in terms of the organizational reality of the enterprise, we tend to think of agents as something where one human maps to one agent. Now, you very clearly show that in an organizational setting, you cannot really think like that; the comparison has to be a little different.
And so there are some metaphors that you use to translate the enterprise reality. Let me just read the question, to not make it too complicated for the listeners here: how do you translate a team and an organization to groups of agents?
What's the fleet metaphor about, and what's the ecosystem metaphor about? Can you expand a little bit on that, Eric? Yeah, absolutely.
So we use this analogy in the book where we start with: a person is like an agent. And as Davis mentioned, an agent can come up with a task plan and make decisions, just like a person. We extend that analogy into teams.
Just as people work in teams, collaborate with each other, and use tools, we call teams of agents fleets. Now, like people, in any modest-size or even large organization, you have many teams and many layers of organization, and they may be organized in a variety of different ways. Loosely speaking, we call that a company, a firm, if you will.
In our agent terminology, we call it an ecosystem, or an agentic mesh. Now, the interesting thing is that we actually extend the analogy into supply chains, for example, where you have multiple organizations with specific contracts that define how they interact, who they can collaborate with, and what the service-level or even quality expectations are. So the analogy goes all the way into modern-day manufacturing supply chains, which have all these things I mentioned.
But these are exactly the same things we would be able to apply to agents, and we actually think they will be applied as, to coin what may be a new term, agentic process automation, to make it distinct from regular business process automation or robotic process automation. We think the analogy of person to agent, team to fleet, organization to ecosystem, and then process automation is actually the right one, and probably the most intuitive one, because like all of us, we interact with other people and work in teams.
Typically we work within a company organization, and sometimes we even interact with multiple companies. The analogy makes a ton of sense, and that's why we used it extensively in the book.
Yeah, agreed. I also think that the level of coordination obviously increases the more agents you have, right? A fleet is something you're capable of giving one direction with relative ease: you say, you'll go in this direction under these conditions.
Whereas an ecosystem obviously becomes increasingly difficult to coordinate and give direction to as you scale it, and it shouldn't all go in the same direction, right? If you have an ecosystem of agents, what you're aiming at is a multitude of different processes that are not connected, or only very loosely connected. So that gives you a completely different level of complexity to operate with.
And I think what is very much needed in the conversation around agents is to get past where it typically stops, at the agent-human comparison, because you really need something other than a single human to compare a set of agents to. So I like that distinction you're making in the book quite a lot.
Okay. I have also prepared a big one, and there are many different aspects to it, and I guess you can both answer it. You have both revealed parts of the answer, but I hope we can go into depth about it.
This is a big question, so I'll read it out and then we'll chunk it up, because no one will be able to remember it anyway. Enterprise-grade agents have seven characteristics.
So these are the enterprise-grade agents. We're beyond the autonomous agents here; we're into the true enterprise-grade agent, right?
The most refined of its kind. Enterprise-grade agents have seven characteristics: security, reliability, explainability, scalability, discoverability, observability, and operability. No one can remember all of that without repetition. But can you describe what these seven characteristics, in total, make of an enterprise-grade agent? I think we should go through them one by one.
So what's the security aspect of an enterprise-grade agent? Yeah, so Davis, I know you wrote the security chapter in the book. Why don't you start with your perspective on security?
All right. In order for an enterprise to accept an agent into its operations, they need to know that this agent is secure. It doesn't really matter how good the agent's work is if a random person on the internet can go in and delete your database. So it needs to have permissions, it needs to have an identity, it needs to be in your OAuth system, so you can set what it has access to and what it does not.
It needs to communicate via secure protocols. You need to ensure that all your communications are encrypted. You need to ensure that nobody who isn't supposed to can see what the agent is doing, that nobody who isn't supposed to can tell the agent what to do, and that the agent isn't looking at or accessing things it shouldn't need to in order to do its job.
If you cover all of that, well, that's a start. Yeah, it's kind of interesting, Davis. It sounds to me, and this is something we're probably going to come back to, and something we've said in the book and in other venues, that an enterprise agent is just like any other enterprise application: it has an identity, it has roles and permissions, and its communication is secure.
So that's the other thing I mentioned. We're talking about something that sounds very new, and in some respects it actually is not. In fact, the model we used in the book is that security for an agent should mirror the same level of security, although there are obviously some differences; it should mirror the same principles and the same ideas as any other important enterprise application, Ole. Okay. Yeah, that makes a lot of sense.
And obviously that also makes it pretty complicated. Okay. But yes, obviously you need that, but you would also need, I guess, and let's perhaps get back to that, some kind of architecture created around that security, right?
Because managing security at that scale will be super complicated. Yeah, absolutely. Let's use an example in the enterprise that I'm very familiar with, and probably most people are: if we extend the analogy of an agent as like a person, and we think that is quite appropriate, an agent needs to be onboarded just like a person.
And when you're onboarded, you are vetted: can you do the job? Do you have the skills and capability to do the job? Once you're onboarded, you have an identity.
And once you have an identity as an employee, you have an employee ID. Based on your job, you're given permissions to actually do your work. So another analogy we like to bring up is this: just as you've heard terms around HR, or human resource management, we think the same disciplines and the same practices, again within reason, can be applied to agents.
Agents can be onboarded. As Davis mentioned, agents have an ID, and agents have a role. We believe, to coin another phrase, and I don't know if it's going to go too far, that just as we have HR, or human resource practices, we should probably be thinking about AR, or agent resource practices. Minimally it's a good analogy, but we think it actually goes pretty far in terms of how you should think about protecting and governing security around agents.
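The onboarding-and-identity idea Eric describes can be sketched in a few lines: an agent is registered with an identity and a role, and every action is checked against role-scoped permissions. The roles, scopes, and function names here are hypothetical.

```python
# Illustrative role-to-permission mapping, in the style of OAuth scopes.
ROLE_PERMISSIONS = {
    "invoice-reader": {"read:invoices"},
    "invoice-admin": {"read:invoices", "write:invoices"},
}

def onboard_agent(agent_id, role, registry):
    """Onboard an agent like an employee: give it an identity and a role."""
    registry[agent_id] = {"role": role, "scopes": ROLE_PERMISSIONS[role]}

def authorize(agent_id, scope, registry):
    """Deny any action outside the agent's granted scopes."""
    entry = registry.get(agent_id)
    return entry is not None and scope in entry["scopes"]

# Hypothetical usage: the reader agent may read but not write invoices.
registry = {}
onboard_agent("agent-007", "invoice-reader", registry)
```

In a real deployment the registry would live in the enterprise's identity provider, not in a Python dict; the shape of the check is the point.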
Yeah. Again, it's a lot of new ground, right? But it makes sense.
Then let's look into the next aspect here: reliability. It obviously ties back to security, but what's the reliability aspect of an enterprise-grade agent?
Yeah, sure. I'll take this one on. For reliability, there are a bunch of different definitions.
I'll push things like trust off to the side for a moment, because we will probably talk about that later. So let's talk about reliability in a very specific context.
When we say something is reliable, we know what it's supposed to do, we can observe whether it did the job, and we can fix it if it didn't do the job. That's one aspect of it, and we think it's particularly important for agents. Not only are they discoverable, we can understand their purpose, and they can come up with task plans, but we can actually see what they did.
And if they didn't do something, we can go about fixing it. The other part of reliability is: is it available when I expect it to be available? In that context, these are the same disciplines we apply to some of our applications on the cloud, or to the SaaS applications we use in our organization.
There are going to be service-level expectations around their availability and their response time or latency. All those things, we think, should be applied to agents. Now, you're seeing that we try to draw analogies as much as possible to things people actually understand.
The reason we do that, first off, is because they're quite appropriate, but also because it demystifies the whole idea of agents. And what we find is that when you put your architecture hat on, the concepts have been around for a long time. For example, the challenges of distributed computing, which you face if you have lots of agents running on the cloud or elsewhere, are not easy to solve, but they are solvable.
It's a known practice. Not to say that it's easy, but it's a known practice. So these things around reliability are known practices, and we believe strongly that you should be applying those practices directly to agents.
Again, no different from any other enterprise application. Yeah, that makes a lot of sense. But that transition is difficult, right?
Because that knowledge has been sitting with a few select employees. And now, as you scale something as an agentic mesh, you'll have to multiply that and spread the knowledge of such measures to more employees, or at least build the architecture such that it's scalable.
That's super difficult. But we're not even halfway through, so maybe we won't take all of them.
I want to leave time for questions also, if there are any from the participants. But let's take the next one, because it is an obvious one, right? Explainability. How do we explain agents, basically?
Yeah, I can start, and then Davis, I know you have some pretty strong perspectives on this. I'm going to contrast with what people may be using today, Claude Code, for example, or Codex, or even ChatGPT: you can see it says "thinking."
And what that really is, although it's not quite a task plan, is a little bit of a loop. It's explaining what it's doing. You can actually see this if you use any of the modern coding tools or coding agents out there today. What we believe is that if you move into the enterprise domain, where agents are full participants in business processes, you need not just observability of what they did after the fact, but whether they did the right thing.
So when an agent comes up with a task plan, in our book we actually recommend that that task plan is a first-class citizen in the observability sphere. It's a first-class citizen in that it should be logged.
And once the agent completes its task, the results should be attached somehow, or linked, to that task plan. And when we look at the observability metrics, the latency, or whether it completed successfully or not, again, those are attached back to that task plan. So what ends up happening is, when an agent does its work, we can see the task plan, what it thought it should be doing; we can look at the results to see how well it did; and we can look at the observability metrics to see who it talked to to actually complete that task.
That level of explainability is not available today. We're working with some clients where we're starting to move down that particular path. But the coding agents you're seeing today do not have that capability.
The agent frameworks do not have that capability today. They can exercise parts of it, but explainability is that end-to-end understanding: what was it thinking (the task plan), how did it do (the results attached to the task plan), and, through observability, who did it talk to, whether tools or other agents, to actually complete that task plan.
That is what it means to be a first-class citizen that gets logged: any operations person who finds a problem they have to diagnose can actually use the task plan, the results, and the list of collaborators to diagnose it. When you're a developer, it gives you the information to ensure that your agent is actually tested and debugged.
So there are so many uses for this explainability, which is why we think it is something that is not easily available today, but is probably one of the single most important requirements in the enterprise. And it figures prominently in our book. Yes, absolutely.
Davis, did you feel like chipping in on that one, or should we move on? Eric's answer was pretty great and covered a lot of the ground here.
One thing that I do want to add is that you don't just need the plan itself tracked. You want the agent to include explanations of why: you want to be able to see why the agent is doing what it's doing, and include that along with the actual task plan it's executing. So you don't just see "I'm planning steps 1, 2, 3"; you want to see "I'm doing steps 1, 2, 3 for reasons X, Y, Z."
Absolutely. Great point, Davis.
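The task-plan-as-first-class-citizen idea, including Davis's point about logging the why alongside the what, can be sketched as a record structure. This is an assumed shape, not the book's schema; all field names are illustrative.

```python
import time

def log_plan(agent_id, steps):
    """Log a task plan as a first-class record. Each step carries both the
    action (what) and the rationale (why), per Davis's point."""
    return {
        "agent": agent_id,
        "ts": time.time(),
        "steps": steps,          # e.g. [{"action": ..., "reason": ...}, ...]
        "results": [],           # outcomes attached back to the plan
        "collaborators": [],     # who the agent talked to (tools, agents)
    }

def attach_result(plan, step_index, outcome, collaborator=None):
    """Link a step's outcome, and any collaborator used, back to the plan."""
    plan["results"].append({"step": step_index, "outcome": outcome})
    if collaborator is not None:
        plan["collaborators"].append(collaborator)
    return plan
```

With this shape, an operator diagnosing a failure can read one record end to end: what the agent intended, why, what happened, and who it worked with.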
And I'm very thankful for the participants who have posted some questions in the Q&A; I think we should move to them in just a little bit. I just wanted to recap: we discussed security, reliability, and explainability for enterprise-grade agents. We still have scalability, discoverability, observability, and operability to uncover. But I think we should move to the questions that have been asked.
The participants deserve to get their questions answered. And then we might follow up with a blog post or two on the data intelligence platform Substack that we have, or even on our website, if you feel like it. But let's definitely move to the questions in the Q&A.
I see questions in the Q&A; I have them here, just a second. So, Ronald Ban, I believe, from the Netherlands.
It's great to have you on this webinar. He asks: I am interested in how to test agents and upskill or train them so that they are up to the job. Who should do this, and how, so that you can have certified agents for certain jobs? I guess it's a more high-level question that addresses some of the things we had already discussed in detail.
But maybe it's good to take a step back. I have my own idea about it, but I'm not sure I could provide a short answer to this question. Okay, let's take it again: how to test agents and upskill or train them so that they're up to the job, who should do this, and how?
Well, let me start, and then Davis, I'm sure you have some ideas. At first blush, one would say testing an agent is the same as testing software. It is actually a very different exercise.
The fundamental reason is that if I'm testing software that is deterministic, in other words, it gives you the same response given the same inputs, then simple things like the equality operator are obvious things to use in testing. With agents, you don't necessarily have that. An agent has an LLM as its brain, to use that term.
But these LLMs are non-deterministic. What that means in simple English is: if I provide the same inputs twice, I may get a slightly different answer. That means the traditional equality operator in testing doesn't work anymore.
What you need are different ways of actually evaluating an agent. So we think of it a little bit more as performance management, or agent evaluation, as opposed to testing. We come up with rubrics, for example, based on the level of autonomy and the level of criticality.
Obviously, if the agent is highly critical and highly autonomous, you're going to have to have an extremely rigorous testing and evaluation capability. There may be six or seven dimensions that dictate what you can or can't do, or have to do or not. If you have something with low criticality and low autonomy, maybe that's closer to a chatbot, I suppose.
Maybe you don't need a lot of testing. So it depends on where you are in that continuum. The second thing I would say, when you're testing an agent, and there's terminology around this, is that you have a judge agent, or a judge LLM, that can judge the response of another agent.
Those are things we actually see out in the field. So you use an LLM to test an LLM: the response may not be exactly the same or repeatable, but is the intent of the response the same?
That's where we see the actual testing coming into play. We call it evaluation. Now, I'm going to ask Davis a little bit about the training side.
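The evaluation idea Eric describes, replacing the equality check with a judge that scores whether the intent matches, can be sketched like this. The judge here is a stand-in stub; in practice it would be another LLM call. All names and the word-overlap scoring are illustrative assumptions.

```python
def evaluate(agent_fn, cases, judge, threshold=0.5):
    """Run each case, ask the judge to score intent match, report pass rate.
    Replaces the deterministic equality check, which fails for LLM outputs."""
    scores = [judge(expected, agent_fn(prompt)) for prompt, expected in cases]
    passed = sum(1 for s in scores if s >= threshold)
    return passed / len(scores)

# Hypothetical stub judge: crude word overlap standing in for an LLM judge.
def stub_judge(expected, actual):
    exp = set(expected.lower().split())
    act = set(actual.lower().split())
    return len(exp & act) / max(len(exp), 1)

def stub_agent(prompt):
    return "the invoice total is 100 euros"
```

The rubric dimensions Eric mentions (criticality, autonomy) would then decide how large `cases` must be and how high `threshold` is set.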
Because Davis, I know you have some experience from a little while ago with one of our big retail clients, around how they would actually begin to try to retrain the model. And there were some challenges with that.
Back in the day, we came up with what were probably pretty naive RAG-type solutions. So maybe it's not, quote, training the model, but it's equipping the model with information about your enterprise. Why don't you tell us a little bit about that?
Obviously not mentioning the client. Yep. So there are a variety of ways you can, sort of, train an agent and increase its capabilities.
One of them is obviously to train the underlying LLM, but this is a very heavyweight solution. You need a lot of examples, and you need to do it for every single agent. It simply doesn't scale. The ways to improve an agent, or give it new skills, that scale somewhat better are: give it more data to work with, or give it more tools or collaborators to handle the job.
You can give it a new tool that gives it a new capability for interacting with the world, by writing that tool and having the agent include it in its task plan. You can give it additional collaborators, other agents it can contact that have capabilities your primary agent doesn't, so it can simply call those agents and get them to do the task. Or you can effectively provide the agent with more data.
You could do a relatively naive RAG, or more advanced solutions like knowledge graphs, to get it more information. This lets the AI make better decisions because it has more information available. These are the main ways to, sort of, train or upskill an agent in a way that scales to the thousands and thousands of agents we're eventually expecting to have.
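The three scaling paths Davis lists, more tools, more collaborators, more data, can be sketched without touching the underlying model at all. The class, method names, and naive substring retrieval here are hypothetical illustrations.

```python
class Agent:
    """Minimal sketch: an agent upskilled by extension, not by retraining."""

    def __init__(self):
        self.tools = {}          # new capabilities to act on the world
        self.collaborators = {}  # other agents it can delegate tasks to
        self.knowledge = []      # retrieved context, e.g. a naive RAG store

    def add_tool(self, name, fn):
        self.tools[name] = fn

    def add_collaborator(self, name, other_agent):
        self.collaborators[name] = other_agent

    def delegate(self, name, task):
        """Hand a task to a collaborator with capabilities this agent lacks."""
        return self.collaborators[name](task)

    def retrieve(self, query):
        """Naive retrieval: return stored snippets mentioning the query."""
        return [doc for doc in self.knowledge if query.lower() in doc.lower()]
```

Each extension point maps to one of Davis's options; none of them requires examples, fine-tuning, or per-agent training runs.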
Yeah, no, I want to, I wanna address the last part of Ronald's, uh, question around certification. 'cause we, we actually dedicate an entire chapter to this. It's part of our trust framework.
So again, forgive me, I'm going to use another analogy. In Canada, and I suspect it's the same in Europe, when I pick up my toaster, there's a little logo on the bottom of it from the Canadian Standards Association. What it says is that my toaster is not going to burn my house down when I use it. That brand means there's a level of service expectation and quality; the label means that toaster is certified.
In the United States, they have Underwriters Laboratories, which does the same thing. In Europe, I'm not sure, but in Canada, again, if I look at a power outlet, I can see a little logo that says this power outlet is safe to use. We think that same idea applies to agents, and we think it is the top level of that governance stack.
The trust stack actually starts with identity and roles and moves up through explainability, but fundamentally, at the top of the house, what we have is certification and governance. What we believe strongly is that there are models out there today, like the Canadian Standards Association and Underwriters Laboratories: federated governance models, federated certification models, that every single day are able to certify that literally hundreds of millions of products are safe to use. We believe that same idea can apply to agents.
So we believe the model is already laid out for us. Among other things, it has a completely delegated framework: there are macro-level policies that products need to adhere to, and those break down into, say, electrical specifications for a toaster, heating specifications, specifications for the metal of the toaster's casing. All of those things have a federated specification framework, a federated governance framework, and accredited institutions.
Now, that sounds very, very complicated, and if you're managing hundreds of millions of products, maybe it should be. But inside the enterprise, this model works also. So at the top level, we may say, for example, that an agent must be GDPR compliant.
That policy then delegates down, say, to an identity verification agent, and you work out the implications of how that identity verification agent would be GDPR compliant. You'd apply that to all the other agents as well.
So maybe what we should have is a set of specifications, probably broken down by business domain at a granular level. Those specifications have experts who can actually validate them, and who can verify, using the explainability approach we mentioned earlier, that the agent actually adheres to the specification. We believe that model, metaphorically, will ensure that your agent doesn't burn down your data center house.
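The delegated certification framework described above, a macro-level policy like "GDPR compliant" that fans out into granular, domain-specific checks, can be sketched as a tree of policies. The policy names and checks below are illustrative assumptions, not the book's actual specification:

```python
# Sketch: federated certification as a tree of delegated policies.
# An agent is certified only if it passes a policy and everything
# that policy delegates to. Names and rules are illustrative.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Policy:
    name: str
    # A check returns True when the agent satisfies this policy's own rule.
    check: Callable[[dict], bool] = lambda agent: True
    children: list["Policy"] = field(default_factory=list)

    def certify(self, agent: dict) -> bool:
        # Pass this policy AND every delegated sub-policy beneath it.
        return self.check(agent) and all(c.certify(agent) for c in self.children)

gdpr = Policy("GDPR")  # macro-level policy; delegates the details
gdpr.children.append(Policy("identity-verification",
                            check=lambda a: a.get("verifies_identity", False)))
gdpr.children.append(Policy("data-retention",
                            check=lambda a: a.get("retention_days", 999) <= 30))

agent = {"verifies_identity": True, "retention_days": 14}
print(gdpr.certify(agent))  # → True
```

In the analogy, each `Policy` node is owned by an accredited expert body for that domain, which is what makes the governance federated rather than centralized.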
So we see a lot of examples and analogies that will simplify an enterprise's agentic journey, and they're standing right in front of us, Ole. Yeah, no, that makes sense.
I mean, it's really about taking some surprisingly concrete perspectives from the existing world and simply applying them in an agentic reality. And I can see also, Ronald, thank you for getting back to us and confirming that the answer was complete. We also have questions from Michael and from Sabrina.
Hi, both of you. It's great to have you on the webinar. Michael, I'll go first.
I can see you posted first, but I did see your question also, Sabrina. So Michael writes: he's curious to understand the participants' perspectives on the impact of a business-defined semantic layer on AI service providers, specifically unlike that of the Epic EHR health system or SAP ERP platform, which generally force their clients to adopt much of their workflows, user roles, et cetera; compare that to what it means to have a business-defined agentic architecture, and how to anticipate whether AI service providers like OpenAI and Anthropic can scale their businesses if their clients will each need their own uniquely defined semantic layer. Yeah, I get the question.
Can you follow it? Or is it too long? No, no, I'm reading it.
Actually, I have it sitting in front of me on the screen. First off, Michael, that is a fantastic question, and I think the timing is so appropriate, for a number of different reasons. Now, I am not an expert in the data catalog or semantic space, Ole probably is, but let me tell you, I do run into it and I do have a perspective, because we see it in almost every client engagement we work on.
So let me take a step back to some simple basics. When an agent tries to respond to a request, as Davis mentioned, we load information about that business or about that topic as close as possible to that request, so that the agent has all the information available to it. That's called context engineering: we want to find the right context at the right time, for the right token budget, and put that into the agent's context window so it can most effectively do its job.
Here's the challenge: in an enterprise that has hundreds of terabytes or petabytes of information, a corpus, whether it's structured data, unstructured data, images, or video, there are petabytes of that stuff out there. How do I get the right information from those petabytes down to what is effectively anywhere from 200,000 to a million tokens, a token being about a word, just to make it real?
So how do I do that? What we find is that in order to do that well, you need what we call an agentic knowledge fabric, a fancy way of saying I need a virtual context manager that creates the minimum, and I emphasize minimum, token-budget-viable context, which means it has the concepts, policies, and decision boundaries, and I can put that into the agent's or the LLM's context window.
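The virtual context manager idea above, rank candidate snippets against the request and pack the best ones into a fixed token budget, can be sketched as follows. The keyword-overlap scoring and word-count tokenizer are deliberately naive placeholders for this illustration; a real system would use embeddings, knowledge graphs, and a proper tokenizer:

```python
# Sketch: a minimal "virtual context manager" that assembles the
# smallest token-budget-viable context for a request.
# Scoring and token counting are naive stand-ins, not a real design.

def estimate_tokens(text: str) -> int:
    return len(text.split())  # crude: roughly one token per word

def build_context(request: str, corpus: list[str], budget: int) -> list[str]:
    words = set(request.lower().split())
    # Rank snippets by keyword overlap with the request (naive relevance).
    ranked = sorted(corpus,
                    key=lambda s: len(words & set(s.lower().split())),
                    reverse=True)
    context, used = [], 0
    for snippet in ranked:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            break  # stay inside the token budget
        context.append(snippet)
        used += cost
    return context

corpus = [
    "refund policy: customers may return items within 30 days",
    "aml policy: flag transfers above the reporting threshold",
    "office kitchen cleaning rota for the quarter",
]
print(build_context("what is the refund policy", corpus, budget=12))
```

The point of the sketch is the shape of the problem: relevance ranking plus a hard budget, so only the minimum viable context ever reaches the model's window.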
In order to do that, you could use a bunch of different approaches. Technically, you could do a naive RAG, as Davis mentioned, which probably doesn't work very well by itself. You could augment that with knowledge graphs.
There's a variety of different things. We've talked to some clients who are using things like PageRank to understand and link the concepts in a very large, vast graph, fundamentally. That's a mechanism you need to think about, but the way you make sense of it is through taxonomies and ontologies.
What are the concepts that I should be capturing? How do I actually represent a policy in a knowledge graph? How do I manifest the relationships between these policies that may apply to a customer in the AML sense, the anti-money-laundering sense, which is different from a customer in the sales sense, where the attributes are very, very different?
So these are all things that you need to think about, and we call it this agentic knowledge fabric that has underneath it, like I said, a knowledge graph, an ontology, some taxonomy capabilities. But ultimately what you need is semantic consistency, semantic coherence, so you can give the right information, at the right time, within the right token budget, to the agent or the LLM. We see semantic understanding, taxonomies, and ontologies, all sitting on top of this knowledge fabric, as the next innovation within the agent landscape.
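The AML-customer versus sales-customer point above can be made concrete with a tiny knowledge graph of (subject, predicate, object) triples, where the same business concept carries different attributes per domain. The schema below is an illustrative assumption, not a standard ontology:

```python
# Sketch: one concept ("customer"), two domain-specific views,
# stored as (subject, predicate, object) triples.
# Predicates and attribute names are illustrative assumptions.

triples = {
    ("customer:aml",   "is_a",          "customer"),
    ("customer:aml",   "has_attribute", "risk_rating"),
    ("customer:aml",   "has_attribute", "transaction_history"),
    ("customer:aml",   "governed_by",   "aml_policy"),
    ("customer:sales", "is_a",          "customer"),
    ("customer:sales", "has_attribute", "lifetime_value"),
    ("customer:sales", "has_attribute", "preferred_channel"),
}

def attributes(concept: str) -> set[str]:
    # All attributes the ontology declares for this domain-specific concept.
    return {o for s, p, o in triples if s == concept and p == "has_attribute"}

print(sorted(attributes("customer:aml")))    # → ['risk_rating', 'transaction_history']
print(sorted(attributes("customer:sales")))  # → ['lifetime_value', 'preferred_channel']
```

A context manager querying this graph for an anti-money-laundering request would pull the `customer:aml` view, with its policies and attributes, rather than the sales view, which is exactly the semantic consistency being described.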
That is certainly also something that I am following very, very closely and thinking about. Obviously, this webinar is not about Acton as a company, so out of respect for the participants, I don't think I should go into it here. But that is, I think, very precisely defined, and one of the challenges that will have to be solved in the coming time.
Right. We also have a question from Sabrina, and I saw your question, Sabrina, so I want to just go ahead and ask it. Well, actually, it's also a little bit about testing, so let me go ahead and ask it, though I actually think you have answered it.
Can you see it also, Eric and Davis? Yeah, I can see it. Yeah, I saw it in the webinar chat. Yeah, I can see it.
Maybe, Davis, did you have anything you wanted to chip in about this question? I think a lot of it was covered in the previous answer, but it might be worth going over some of the points again.
One of the first things mentioned is who should be writing the tests: should these be written by agents, by humans, or both? This is largely going to depend on how mature your agentic mesh is. At the early stage, when you're building your first agent, or even your first few dozen agents, it makes sense that most of your testing should be done by people.
You're still getting the hang of agents, so you're going to want manual evaluation. But as the agentic mesh grows, as you get thousands and tens of thousands or more agents, you're going to have to start relying more on agents testing agents, simply to manage at the scale needed, where you don't have enough people to manually check every single output every single time. You're not going to completely get rid of human testing.
You're still going to need humans to go in and verify that the tests are actually correct, to do spot checks, to make sure nothing's gone wrong, but how much is human versus agent will depend on how mature you are in developing your own agentic mesh. Yeah, Davis, one of the things we talked about, and I know we had great debates with some of the other folks in our company and some clients, is high autonomy combined with high criticality, in other words, high-risk scenarios. There's a discussion that says, for some of those very highly autonomous and highly critical agents, today at least, maybe this will change in the future, there's a subset that must have human interaction and human verification: things related to health, for example, or high-value payments where there may be some fraud exposure or anti-money-laundering risk. There's a subset of the testing that we think, for the foreseeable future, needs to remain with, or at least have, that human in the loop or human above the loop, explicitly.
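The split being described, agents testing agents at scale, with mandatory human review for high-criticality domains, can be sketched as a simple routing rule. The domain list, threshold, and grader below are illustrative assumptions, not a prescribed policy:

```python
# Sketch: automated evaluators score agent outputs, but high-criticality
# domains (health, high-value payments) always escalate to a human.
# Domains, threshold, and the grader are illustrative assumptions.

HIGH_CRITICALITY = {"health", "payments"}

def evaluator_agent(output: str) -> float:
    # Placeholder for an LLM-based grading agent; here a trivial heuristic:
    # empty output scores 0.0, anything else scores 1.0.
    return 1.0 if output.strip() else 0.0

def review(output: str, domain: str, score_threshold: float = 0.8) -> str:
    score = evaluator_agent(output)
    if domain in HIGH_CRITICALITY:
        return "escalate_to_human"   # human in/above the loop, always
    if score < score_threshold:
        return "escalate_to_human"   # low-scoring outputs get a human spot check
    return "auto_approved"

print(review("Transfer approved", "payments"))      # → escalate_to_human
print(review("Here is your summary", "marketing"))  # → auto_approved
```

The design choice worth noting is that criticality is checked before the score: no evaluator confidence, however high, lets a high-risk output skip the human.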
Yeah.
As we're approaching the end, I think I would like to ask the final question that I had prepared, and we can get back to the remaining discussion topics in posts. I do want to ask you: what was it like writing this book? Because I guess with all books, you have a certain feeling about them.
They can be too early, they can be spot on, they can be too late; they can get different kinds of reactions. What has the feedback been so far? Are you too early?
Or too late? I guess you're not too late. Well, let me start, and then I'll let Davis answer what it was like working with his father, I suppose.
But here's what I would say: when Davis and I came up with the idea for this book, and Ole knows this, having written a few books with O'Reilly, there's a long process. So this is about 14 to 16 months ago, when we didn't know how the agent landscape was going to evolve.
There was no such thing as Claude Code or Codex, and ChatGPT was still at version 3.5 or something like that. Not very effective, to be honest with you, at least not compared to today.
It was wonderful back in the day, but compared to today, it wasn't really where it needed to be. Interestingly enough, we looked at analogies, again, analogies: what happened in the enterprise when you went from one to many?
Back in the day, I did a lot of work with APIs, and when you have one or two or three APIs, which was a very long time ago, it's easy to manage them. But when you have 10, a hundred, or a thousand APIs, you need to think very, very differently. Same thing with data products.
If you have one or two data products, that's interesting, but when you have 10, a hundred, or more than a hundred, like some of my clients do, you're talking about a different way of thinking about it. You have to think about different management practices. So when Davis and I came up with the idea for this book, and when we pitched it to O'Reilly, that was the argument: we're not going to stop with one agent.
We're going to have tens, hundreds, thousands. And the industry leaders have confirmed that. So the long-winded answer, Ole, is that we think we're at the right time in the right place, which we didn't know 14 to 16 months ago, but we're fortunate to be there right now.
Well, yeah, that's kind of my feeling as well. Davis, do you want to chip in? I want to hear the details about writing with your father.
Yeah, writing with my father was a great experience. We've obviously known each other a long time, and we easily get on the same wavelength when we're talking about our ideas. It was, however, quite a bit different than the usual consulting work we do together, especially because my father has a lot more experience writing books than I had at the start of this.
And I'm quite glad I was writing the book with someone who had that experience, as it proved quite helpful in getting the book up to the high standards we set for ourselves. Yeah, no, that makes a lot of sense. It really is a different experience, writing with someone else and writing a book for the first time.
But if I have to say something about the timing of your book, I think it's at the right moment in time, because everyone kind of sees this now. They see that we need to think about building agentic architectures in our companies, and we can understand also that this is not something you can take lightly. It'll be complicated to set this up right.
So I think the book is very timely, and it's a super interesting read. I guess with these words, I want to thank you for participating in this webinar here today, and we will be back with more insights in shared posts and writings.
I can guarantee that you are at least most welcome to write for us, and I would be happy to write with you. I also want to thank all the participants for their patience.
This was, I guess, a slightly technical webinar, but a very needed one. So thank you everyone for listening in, and thank you, Eric and Davis. Well, Ole, thank you very much to you and the whole Acton team for having us.
Thank you very much. And I say thank you as well. It was great to be on.
Thank you.