Welcome back to the Agents of Data podcast. Whilst we were in San Francisco at Snowflake Summit just a few weeks ago, we had the opportunity to sit down with Shishir of Tech Systems Global Services. We dug into how AI is being used under the covers of projects to help make data teams and their organizations wildly more productive with their data. Tech Systems Global Services provides end-to-end technology solutions and consulting. This episode is led by Angie Hastings. Angie is our senior solution architect here at Matillion. With that, over to you, Angie.
Hi, I'm Angie Hastings. I'm a Solution Architect for Matillion. And here with me today on Data Driven is Shishir Srivastava from Tech Systems Global Services. And we're going to be talking today all about big data and all of the changes that are happening in our industry around AI and kind of everything in between. So Shishir, along with myself, has spent nearly two decades in the tech industry and in the data space. And in that time, we've seen quite a bit of change in technology and in data and what businesses are doing with that data. So we're just going to kind of dig into all of those challenges that you're seeing in the industry.
Yeah. So for more than 20 years in the industry, you've witnessed a bunch of changes, as have I. From your vantage point, what are the most pivotal shifts that we're seeing in business and with regard to data today?
Yeah, thanks. Thanks, Angie. Thanks, Matillion, for hosting me here. I think it's a huge transformation. I've been in the industry now pretty much close to 24 years, and with Tech Systems Global Services 15 years now, and happy to be in beautiful company like this. I've been in data all along my career. Since the 2000s I've been working in this space, and over all of these years I've seen the way data has transformed. We talk about monolithic systems. We had on-prem systems, and customers were struggling: we need to scale up our system, take it to the next level. New innovations have come. I remember being in Oracle, Oracle Exadata, one of the high-end computing machines, like the IBM mainframe. So the systems have moved from those days where the systems were, you know, monolithic to a little more powerful. And then the era of big data, from big data to cloud, from cloud to now AI. So it's a huge transformation across the years, but it is great to see how the industry has changed over the years. But if I have to put my perspective, there are different areas where the industry has struggled over the years. And I saw that ray of light, the hope, with the advent of big data and then, you know, with the cloud. Now things have simplified and kept on moving at a pace which no one had ever thought of. Like, you know, it's the speed of light or it's the speed of thought, the way technology is changing. Some of the key issues or concerns the industry had, if you ask me, are things like data silos. A number of organizations, not just one, and it's cultural, organic, the way organizations have been laid out and framed, the way the whole culture has been framed. Everybody has their own way of working, whether in terms of departmental data or holding the data among themselves.
It's a cultural shift as well, and you can say there are policies and regulations that make it difficult at times to share the data across departments. So, you know, holding that data creates silos. And then because of those silos, you see data quality issues as well. That's the next step. Those are some of the roadblocks or bottlenecks. I've seen that over the years, you know, where customers are struggling: they have a quality issue, and then comes the governance. Now you have to put in one more layer: how do I keep the data clean? Who should access the data? Who shouldn't? What privileges need to be given? Those kinds of issues, of course, along with governance, security and proprietary data. So all that fun stuff comes along. But those are some of the bottlenecks. And those, I feel, I have seen going away slowly.
Yeah, so we had this shift from all these data silos and kind of the move to the cloud, with Tech Systems Global Services helping companies kind of break down those silos and move to the cloud. And now what we're seeing is this next big sea change, this next big gear change, right? The addition of AI, and with it even more concerns about data and how we are dealing with that data. So, you know, you previously talked about data as the new oil and kind of helping companies to figure out how to, like you said, integrate, cleanse and govern as they move and modernize to the cloud. How do you see that changing now with AI coming on?
I mean, that's a great analogy if I have to put it, right? So when we say data is the new oil, right? Oil, you know, if you go back decades, powered the economies of countries and organizations, and that is how they moved from stagnant, monolithic organizations or countries to powerhouses. And if I put that analogy in terms of data, data is the foundation. It's the foundation of AI. You know, you see all those innovations like large language models, RAG, or, you know, agentic AI and all these frameworks. It's all driven by data. So you need to have that power of data, and it's not just the data, it's quality data. You know, if you put in garbage, of course you're going to get garbage out. So you need to have quality data, and for all the critical aspects of a data platform, or to make a strategic move to a modernized data platform, you need to have the right mindset for sure, along with, you know, the right strategy. Now, when we talk about the right strategy: do we have data governance in place? Do we have integration tools in place? There are federated systems we are talking about. Now, to give you a particular example, one of the organizations we work with, you know, I don't want to name the customer for sure, struggled in a number of places. We had been engaged, you know, to modernize the system. They were struggling with the performance of the whole system, reporting was pretty slow, and they wanted us to come on board, understand their whole landscape and give them a solution which could be, you know, agile, fast and accurate. And on the first day of our engagement, I was having an across-the-table conference with the stakeholders, and what we learned, essentially, was that they were running their entire system on a Windows XP desktop.
And I was surprised: how can you run an enterprise on such, you know, low-end machines? The argument I heard was that it was the culture of the team: they didn't want to move away from what they knew. The upskilling and the adoption part of it was completely missing. As part of our recommendation, definitely, you know, we recommended them Red Hat and Unix. And I'm talking about a story from way back in 2008 when, you know, there was no advent of cloud or big data. Yeah, they didn't want to do that whole skill shift. They didn't want to learn about, you know, those new technologies. Although, you know, we gave them a good hybrid solution, and we helped them to modernize eventually. But the learning that engagement gave me, and in fact the stakeholders on the customer side, is that they need to invest a lot in upskilling and reskilling on new technology, you know, given the way the whole landscape is moving. And I'm proud to be a part of an organization where we focus really, really hard on, you know, upskilling our team members and resources. Whether you talk about the data side of it or the platform side of it, you know, everybody's learning about AI and how things are shaping up. So that's a good, you know, story to tell when I say how organizations need to change. So when I say data is the new oil, what it means essentially, you know, is that it's a foundation. It's going to, you know, power the transformation, or act as a catalyst for it. Now, every organization has data, but are they using it to its full potential? Yeah. We don't know.
Yeah, on that analogy, really in order for data to really help a company and be of value, you know, it has to be clean, it has to be trustworthy, all of that. What are some of the use cases that you're seeing around kind of the, you know, the cloud and the AI use cases that customers are using in order to see that value?
Yeah, I think, I mean, there are a number of use cases. I mean, the opportunities are, I would say, infinite. I've seen those days where you pump the data overnight, you have to wait for the batch processing to happen, and you see the result the next morning. Now we are in an era where the data is produced at whatever the source system is, and you see it fleshed out and ready for consumption within a minute or within a second, like a fraction of a second. So the data is moving at that speed, but that also creates a lot of challenges. Now, when I look at the use cases and, I would say, the POVs or MVPs we usually come across, you know, customers are asking, hey, what can we do? And there are really, really, you know, interesting use cases I've come across when working on different engagements. One of them stands out really well to me. The customer was already on their journey, you know, they were pretty mature, and they wanted to move to the next level. What they asked us was, you know, why don't you guys come, understand our landscape, and suggest the best possible use case we can do. And we did, you know, a sort of deep-dive phase where we go assess the environment, discover the whole technology landscape of the customer, and then define and, you know, suggest certain use cases based on the scenarios they're working on. Now, what we learned was that they were really struggling with exposing the data to the different business users and consumers of the data sets. They had ServiceNow, they had Salesforce, they had, you know, some mainframe systems.
So one of the aspects when we modernize a system is bringing those federated data sets together, making a unified platform and exposing it to the business users or the right consumers of the data. So, you know, we gave them a great use case, which they loved, and it is still in production. They wanted the speed, the agility. So the framework or the solution we put together was an automation, I would say a hyper-automation. It's automation on steroids, we can call it. The use case was that the user just has to, you know, identify the data. So they had a portal of data sets that the different departments have: healthcare, finance and manufacturing, you know, they had different departments. It was a manufacturing organization. They had different data sets. So all the user has to do is, you know, from a drop-down in the ServiceNow portal, choose the data set they want. Whether that data is in the system or not doesn't matter. You just have to choose a data set and, you know, submit. It goes to the owner of the data. The data stewards were identified, the owners of the data who can approve it. And behind the scenes, once the request has been approved, the data will be pulled through wrappers. You know, automation scripts will be running behind the scenes. They'll take the data set and put the right guardrails around it, the privileges, you know, the policies, so that once it is available for consumption, it is available to the right people rather than exposed to everyone. And this is all in an automated fashion. And that was a fantastic use case. So, you know, that's that-
And was there AI being used under the covers to kind of prepare this for the end user, for the business to be able to understand their data better?
Exactly, yeah. So that was one, and then we came across one of our manufacturing customers, for whom we did manufacturing analytics as a solution, leveraging Snowflake's Cortex Analyst capability. They were a long-time manufacturing customer that had built their data warehouse, a dimensional data model, which was I think close to 10 or 15 years old. Literally, we put a layer of Cortex Analyst on top of that and it gave them the capability of self-service BI, literally through natural language. You know, you can just type in the queries and you get the answers. That was a fantastic use case. And another great AI use case, I would say, was with one of our partners, I would say one of our customers, who were really, really struggling with data collaboration. That was also a great use case we had. It was a manufacturing company, they had inventories across the North American region, and it's basically a paint company. They used to maintain their inventory in multiple different locations and across their dealer network. Now, just think of a situation where you have inventory, but then it's running out of the space. Or the quality is bad, or, yeah, it's not accurate. They had different parameters, and they had to deliver, and if this situation comes up, they eventually have to request a refill. Now that's a long-drawn process: you have to understand your own system, see how much inventory you have, and then make a request to the organization that we are running short of inventory, we need more, we need to refill the inventory. And they send tons and tons of gallons of their paints, and the whole process used to take a week or two.
With this power of Snowflake and its data sharing capability, all they have to do now, literally, is raise a request at the click of a button. Under the hood, there is Cortex Search. They have to produce the proof of inventories and all that, and it processes all of that unstructured data, images and PDFs. It reads them, uncovers layers and layers of information in bulk, and provides that input in a structured format. On the organization's side, they have a Snowflake system which is set up through a proper governing body, and the data engineering team works on it. They just put the data in, process it, and within minutes they'll have the approvals ready for refilling the inventory. Those are some of the really good use cases in different industries.
That's a great point. We had all of the structured data and then kind of semi-structured data, and now there's just this huge explosion of unstructured data with the same challenges, right? You have to have access to it, and quality, and determine whether you can get value out of it, whether it's clean or not, right? Garbage in, garbage out, right?
And in fact, that reminded me: recently we came across a good AI use case working with one of our customers, where government directives keep coming to them, kind of in the form of PDFs, and they had a team of advisors and lawyers.
Like regulations, you mean?
Yeah, yeah. Government directives, regulations and policies, and those are PDF files or a bunch of documents, like thousands and thousands of pages. So they have to read through it, scan it, understand it, make the changes in how they are catering to their customers, and immediately inform the body that issues the directives or policies that we are on top of this and we have taken care of all the different scenarios. The turnaround time for all of that was a week or two. They have to go through all of it, and it's a team of folks. Now you just dump all of the data in, put the LLM engine behind it, and it just scans over it. You literally talk to your PDF files and ask the questions: what are my next courses of action, what has changed from my previous data set in this new one, is there anything unique I need to take action on? And it literally gives you the right course of action within, like, minutes, and then you just come to a meeting, understand it, give instructions to your team and you know--
The time difference is amazing. Matillion now has the capability to take those large PDFs and send them to a large language model. I have an example of a 100-page document that I just asked it to summarize, and it does it in seconds, whereas it would take a human maybe weeks to do that.
Yeah, so what are you finding, are you finding that people are kind of embracing this new way of thinking? Oh yes. Or is there some hesitancy on using the large language models?
No, I think, I've seen that the adoption is pretty high across the customer base, not just at the enterprise level, for the large language models and even the advent of AI. I call AI a baby; it used to be a baby. I usually put that analogy of a baby: when a baby is born, it starts speaking, then it starts babbling, and after babbling it starts picking up context, so if it has to talk about a topic, it'll talk about that. And then eventually, once they grow, once they mature, the kids will do their own tasks on their own. So that's the way AI has progressed over the years. LLMs were step one, and then eventually you have the RAG model, where you are giving context from your own data set to the AI. Because an LLM is like, it's giving you the entire universe of data and you just ask your question, and there are problems of hallucinations, data accuracy and all of that. That has been addressed by the retrieval-augmented generation model. The RAG model gives that context, that specificity and accuracy, to the data set. And now we are talking about agentic AI. Agentic AI is all about autonomous agents which make decisions and perform tasks on their own, and all of those things. So I think it's moving at a rapid pace. Yesterday I was listening to the keynote by Sam, it was fantastic, and he mentioned that it's the time where we need to start exploring it out of curiosity and just do it, rather than wait for things to happen. It's that time. And I see that there is huge adoption. Customers are curious. They are asking us: we know a little of the capability, but you, being in the industry for such a long time, tell us what the use cases are, how can we take our organization to the next level?
It's like sitting across the table, understanding their problem statements, and then putting that framework or layer of AI in place and trying to solve the problem, because it's a new world altogether. The capabilities, I see, are immense.
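As a rough illustration of the retrieval-augmented idea described above (give the model context from your own data before it answers), here is a minimal Python sketch. It is not the speaker's implementation: the word-overlap scoring, the function names `retrieve` and `build_prompt`, and the sample documents are all assumptions made for illustration, and a real system would typically use embeddings and send the prompt to an actual LLM.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k documents that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    # Drop documents with no overlap at all so irrelevant text never reaches the prompt.
    return [d for d in ranked[:k] if q & tokenize(d)]


def build_prompt(question, documents, k=2):
    """Assemble a prompt that restricts the model to the retrieved context."""
    lines = ["Answer using only the context below.", "Context:"]
    lines += [f"- {doc}" for doc in retrieve(question, documents, k)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Hypothetical documents standing in for an organization's own data.
docs = [
    "Warehouse A holds 400 gallons of white paint.",
    "The cafeteria opens at 8 am.",
    "Warehouse B holds 120 gallons of blue paint.",
]
print(build_prompt("How many gallons of paint does warehouse B hold?", docs))
```

The point of the sketch is only the shape of the flow: retrieve the relevant slice of your own data first, then ask the question against that slice rather than against "the entire universe of data".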
Yeah and you kind of touched on this a little bit, the trustworthiness of what's coming out of those models and so what kind of guardrails are you finding that people are putting around and human in the loop to really trust what information is coming back?
Yeah. I think that's a concern in the industry: as you said, it's like a child who's maturing and growing, and it's about making sure that those models don't get trained with the wrong stuff and hallucinate. So is Tech Systems Global Services, like, figuring out how to put those guardrails in place?
Exactly. In fact, in one of our meetings I was explaining to one of our customers that we are building tons and tons of agentic AI solutions within the organization. We are not a product company, we are a consulting firm, but being a technology consulting firm, we definitely need to have an upper hand in what we do as a business, and we know it. So there are a lot of agentic AI related frameworks or accelerators we are building. In fact, my team is working on a data validation agent, and it's good that you brought up the topic, so I can explain a bit about how we are trying to rectify this, because it is a very common problem across the industry and we've been hearing a lot about it: we have data, but how do we ensure that the data is accurate? So what we have done is put an agentic AI framework on top of any enterprise data. It doesn't matter which, but it has to be on cloud for sure. The moment data enters the system, there's an agentic AI framework which keeps running behind the scenes, and it will flush out any outliers in the system. Say there is something wrong: let's say in a first name or last name field you see numbers or special characters. It will flush that out and create a feedback loop to the humans: hey, you go and take a look at it, record your action. I mean, what we've used is reinforcement learning. It's a unique concept in the field of AI, and we are leveraging it in building our solution, which understands the data set and brings the issue to the fore in front of the customer or whoever the guardians of the data are: is it okay, what action do I need to take? You take those actions, record those actions, and all these things can be done in natural language. You can just type it in and say, I just want to delete the data.
I want to keep this data, or I'll just quarantine this data from the data set, and it will take those actions behind the scenes for you and be ready for the next set of actions. In fact, when the same kind of problem comes in the next time, it will ask you, are you okay to take the same action I took previously? You just say yes and it will take it. You don't need to literally do those steps every time you come across the problem.
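The feedback loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the team's framework: it uses a plain lookup of the last human decision per issue type rather than reinforcement learning, and the names `classify_issue`, `ValidationAgent`, the `last_name` field and the `decide` callback are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Assumption: a plausible name is letters plus spaces, periods, apostrophes, hyphens.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z .'-]*$")
ACTIONS = {"keep", "delete", "quarantine"}


def classify_issue(value: str) -> Optional[str]:
    """Return a coarse issue label for a name field, or None if it looks fine."""
    if NAME_PATTERN.match(value):
        return None
    if any(ch.isdigit() for ch in value):
        return "digits_in_name"
    return "special_chars_in_name"


@dataclass
class ValidationAgent:
    remembered: dict = field(default_factory=dict)   # issue label -> last human action
    quarantine: list = field(default_factory=list)   # records set aside for review

    def process(self, records, decide):
        """Flag suspicious records, ask the human (decide), and apply the action."""
        clean = []
        for record in records:
            issue = classify_issue(record["last_name"])
            if issue is None:
                clean.append(record)
                continue
            # Offer the previously recorded action, if any, as a suggestion.
            action = decide(record, issue, self.remembered.get(issue))
            if action not in ACTIONS:
                raise ValueError(f"unknown action: {action!r}")
            self.remembered[issue] = action
            if action == "keep":
                clean.append(record)
            elif action == "quarantine":
                self.quarantine.append(record)
            # "delete" simply drops the record.
        return clean


def reviewer(record, issue, suggestion):
    """Stand-in for the human: accept the suggestion if offered, else quarantine."""
    return suggestion or "quarantine"


agent = ValidationAgent()
rows = [{"last_name": "Smith"}, {"last_name": "Sm1th"}, {"last_name": "J0nes"}]
print(agent.process(rows, reviewer), agent.quarantine)
```

In this sketch the second flagged record reuses the first decision, which mirrors the "are you okay to take the previous action" behavior; the natural-language interface mentioned in the conversation is left out.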
Yeah, I think there's people that are concerned, oh, with this AI, is my job going away? And then I think that we always have this kind of data engineering and understanding of the data, is it going away? It's just, it's teaching us to be prompt engineers instead of data engineers. So that we can get through way more data faster. Yeah. Big opportunities in the AI world, like what are some of the kind of risks or things to watch out for as we're kind of moving into this space?
I think the major risk I see is governance. You need to have AI governance. Internally, within our organization at Tech Systems itself, we have an AI governing council which has members from across different fields, whether you talk about the data side, the app side, or infosec and regulatory bodies. Everybody's involved in that. It's a group, and I would say it's not just for our organization. I would suggest to customers that they should have that kind of a council, that kind of a group, which can look into the system and ask where you have biases, say, about race or color or country, location, whatever. So those kinds of biases, are they coming out of it?
And that's amazing. Yeah, right.
And those kinds of concerns are very prevalent. And that, I would say, should be a major focus area of the industry for the customers who are adopting AI frameworks in future. That's something to watch out for. Other than that, if you are a customer who is still on your journey of data modernization, look after your data, look after the quality of the data, because wrong data can produce wrong results. And you might end up thinking that it's an AI-generated output, but it's not something AI has produced; it's the garbage you have there.
Yeah. That's something. It's like the challenges we had previously: we always talked about having good quality data and data that could be trusted. And now it's even more important, right? Because there's more of it. And now we're training models on having clean data.
Having clean data. And the last, probably, I would say, is the upskilling of the resources. Those who are in this space need to know, because the innovations that are happening are not limited to just one company or big public cloud provider. It's not tied to just AWS or Snowflake or Azure. It is across the industry. Innovations are happening in all different places, and there are open source technologies as well.
So I think the resources, those who are working in the system, need to upskill themselves. And for the industry, that's my message. They need to focus on that and see how they can improve on it.
Yeah. Yeah. Looking ahead and kind of upcoming in the industry, like any trends beyond, just kind of interacting with the AI, any trends that you can see coming up that we need to sort of watch out for.
Well, I think agentic AI is the new kid on the block. Everybody's watchful, everybody's interested and curious about it. I think that's the area we all should be focusing on.
So maybe let's take a step back and kind of talk a little bit more about what that really means. What does agentic AI mean? It's like a buzzword, it's the new kid on the block. But kind of explain what it means.
Yeah, so agentic AI is a system where you are adding an autonomous engine to your AI. AI used to be reactive, usually: you give it a task, it takes care of the task and gives you the results. But now, with agentic AI, it takes those actions itself, and it's not driven by just one action; it can take care of multiple sets of actions. For example, I have to order a pizza, right? My phone knows that I usually order a pizza from maybe Pizza Hut or Domino's or whatever my favorite restaurant is. All we have to do is create a system which will read from my phone what my preferred choice is and what time the restaurant is open, and it'll order it. All I have to do is say this to Siri, right? On the iPhone, it'll order the pizza for me. It'll literally order the pizza on my behalf and make a payment through another agent, which is the payment system. The third agent will take care of delivering the pizza from that pizza house to my home, and it is delivered to my doorstep. So it's multiple tasks driven by multiple agents, which combined together make a seamless experience of what you want.
Each little agent has its own specific task that it's doing and they all work together as a team.
That's right, that's right. It's more of a collaboration of different agents.
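The pizza example can be sketched as three small agents chained by an orchestrator. This is a toy illustration in Python under stated assumptions: the agents are plain functions rather than autonomous LLM-driven components, and every name here (`Order`, `ordering_agent`, `payment_agent`, `delivery_agent`, `order_pizza`) and every sample value is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    restaurant: str
    item: str
    price: float
    paid: bool = False
    delivered_to: Optional[str] = None


def ordering_agent(preferences: dict) -> Order:
    """Agent 1: turn the user's stored preferences into an order."""
    return Order(preferences["restaurant"], preferences["item"], preferences["price"])


def payment_agent(order: Order, wallet_balance: float) -> float:
    """Agent 2: pay for the order and return the remaining balance."""
    if wallet_balance < order.price:
        raise ValueError("insufficient funds")
    order.paid = True
    return wallet_balance - order.price


def delivery_agent(order: Order, address: str) -> Order:
    """Agent 3: deliver the order, but only once it has been paid for."""
    if not order.paid:
        raise RuntimeError("cannot deliver an unpaid order")
    order.delivered_to = address
    return order


def order_pizza(preferences: dict, wallet_balance: float, address: str):
    """Orchestrator: each agent does one task and hands off to the next."""
    order = ordering_agent(preferences)
    remaining = payment_agent(order, wallet_balance)
    delivery_agent(order, address)
    return order, remaining


prefs = {"restaurant": "Example Pizza", "item": "margherita", "price": 12.5}
print(order_pizza(prefs, wallet_balance=20.0, address="1 Example Street"))
```

Keeping each agent to a single responsibility is what lets them be combined, swapped or reused, which is the "collaboration of different agents" point made in the conversation.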
All right, for the data listeners that are listening, like data leaders that are listening, what's one piece of advice that you would offer them to better unclog their data pipelines that they have today?
I think the main focus area, as I was saying, is that data is the foundation. Everybody should focus on the data. And yesterday at the keynote, I was talking to Matt, and everybody was so pumped about what AI is offering. But the focus should be on the data. We need to bring in the right amount of data, because we cannot produce the expected result unless and until we have the right amount of data and the right quality of data. So data is the main foundation, and then the right skills in your resources, the resources who can execute your system or the program or the projects. You need to have the right mindset and the right resources who know all of those technologies, because technology is changing pretty fast. So I think those are the two key factors, in my opinion, because the rest of the things can definitely be taken care of.
And for the data leaders that are concerned about completely changing their skills, I know there's a lot of tool sets and Matillion provides the ability to easily create those pipelines without having to have data scientists and stuff. So are you finding it easier to kind of upskill those customer teams and whatever you build?
Yeah, absolutely. In fact, AI could help in building those answers to the question. Literally, there are LLMs which you can query. Whether you want a detailed answer to a detailed question or a very brief answer, it can literally give it to you in seconds. So it's all about exploring the power of the LLMs out there and building the different solutions. In fact, those are the kinds of things we are investing in internally in our organization, Tech Systems Global Services: training, certifications and building an ecosystem of learning. It is there.
Yeah, people don't need to be intimidated by this scary world of artificial intelligence because it's actually getting easier.
Yeah, in fact, I see that in the era when iPhones and mobile phones picked up, the talk of the town was that people were going to lose their jobs. In fact, it created a different shift and a huge number of jobs for different skills. It's going to be the same. With AI, we will be a little freer to do more innovative things ourselves. It'll create new and different types of jobs which are not there in today's world. Something to explore.
Being able to take all of that massive volumes of data that's growing and growing and being able to really add value to it by having it clean and properly trained and integrated with all of those models.
Yep. Right.
All right, well, thank you for coming today and having a talk with us about all things data and AI.
Thank you. Thanks for inviting me. Sure.
Thanks for listening to the Agents of Data podcast. The podcast is brought to you by intelligent data integration platform Matillion. To discover agentic data engineering visit matillion.com/maia. Don't forget to subscribe and let us know what you think of everything we discussed over on Matillion's LinkedIn and Instagram.