Gable Blog - Panel: State of the Data And AI Market | Apoorva Pandhi, Matt Turck, Chris Riccomini, Chad Sanderson

Talk Description

Artificial Intelligence is reshaping the landscape of software development, driving a fundamental shift towards empowering developers to take control earlier in the development lifecycle—known as "shift left." In this panel, venture capital leaders and industry experts will explore how emerging trends in AI and data technologies are influencing investment decisions, creating new opportunities, and transforming development workflows. Attendees will gain valuable insights into the evolving market dynamics, understand the strategic significance of shifting left in today's AI-driven world, and discover how organizations and developers can stay ahead in this rapidly changing environment.

Additional Shift Left Data Conference Talks

Shifting Left with Data DevOps (recording link)

Chad Sanderson - Co-Founder & CEO - Gable.ai

Shifting From Reactive to Proactive at Glassdoor (recording link)

Zakariah Siyaji - Engineering Manager - Glassdoor

Data Contracts in the Real World, the Adevinta Spain Implementation (recording link)

Sergio Couto Catoira - Senior Data Engineer - Adevinta Spain

Panel: State of the Data And AI Market (recording link)

Apoorva Pandhi - Managing Director - Zetta Venture Partners
Matt Turck - Managing Director - FirstMark
Chris Riccomini - General Partner - Materialized View Capital
Chad Sanderson (Moderator)

Wayfair’s Multi-year Data Mesh Journey (recording link)

Nachiket Mehta - Former Head of Data and Analytics Eng - Wayfair
Piyush Tiwari - Senior Manager of Engineering - Wayfair

Automating Data Quality via Shift Left for Real-Time Web Data Feeds at Industrial Scale (recording link)

Sarah McKenna - CEO - Sequentum

Panel: Shift Left Across the Data Lifecycle—Data Contracts, Transformations, Observability, and Catalogs (recording link)

Barr Moses - Co-Founder & CEO - Monte Carlo
Tristan Handy - CEO & Founder - dbt Labs
Prukalpa Sankar - Co-Founder & CEO - Atlan
Chad Sanderson (Moderator)

Shift Left with Apache Iceberg Data Products to Power AI (recording link)

Andrew Madson - Founder - Insights x Design

The Rise of the Data-Conscious Software Engineer: Bridging the Data-Software Gap (recording link)

Mark Freeman - Tech Lead - Gable.ai

Building a Scalable Data Foundation in Health Tech (recording link)

Anna Swigart - Director, Data Engineering - Helix

Shifting Left in Banking: Enhancing Machine Learning Models through Proactive Data Quality (recording link)

Abhi Ghosh - Head of Data Observability - Capital One

Panel: How AI Is Shifting Data Infrastructure Left (recording link)

Joe Reis - Founder - Nerd Herd Education (Co-author of Fundamentals of Data Engineering)
Vin Vashishta - CEO - V Squared AI (Author of From Data to Profit)
Carly Taylor - Field CTO, Gaming - Databricks
Chad Sanderson (Moderator)

Transcript

*Note: Video transcribed via AI voice to text; there may be inconsistencies.

" And the next session is our first panel of the day.

Look at this. We've got some hard hitters popping up onto the stage. How is everybody doing? I see we've got Matt, Chris, and Apoorva. Oh, and look, Chad's back too. So Chad, I think you're gonna host this one. I'm gonna hand it over to you, man, and I'll see you all in 20, 30 minutes.

Sounds great. Okay, well, hey folks, we are back and we've got a panel, and this panel is gonna be focusing on the future of data.

So we've got AI coming, we've got all this shift left coming. What is the future of data and data companies? And who better to hear from than some of the best data investors in the world? So I'll turn it over to the guys and let them introduce themselves. Apoorva, maybe you wanna kick it off?

Yeah, Chad, good to see you. My name is Apoorva. I'm one of the managing directors at Zetta Venture Partners. Zetta has been around for almost a decade. It was founded as an AI- and data-focused fund back in 2014, and we are investing out of our third fund right now. It's a $180 million fund, and we lead or co-lead seed rounds.

We broadly invest in data infrastructure, AI and ML infrastructure, as well as the application layer. Excited to be here. Thanks for having me.

Absolutely. Matt, you wanna go next?

Sure. Hi everyone. Thanks for having me. I'm a partner at FirstMark. FirstMark is a New York-based, early-stage venture firm. As a firm, we do a little bit of everything by design, so that includes consumer and marketplaces and that type of stuff. So we invested in, you know, Pinterest and Shopify and Discord and Airbnb.

But I'm the partner that's focused on this world of data, data infrastructure, machine learning, and AI, so I very actively invest in it. And then, for what it's worth, I do a lot of extracurriculars around it: an annual market map called the MAD Landscape, and I run a big meetup in New York called Data Driven NYC.

And then, because I'm a VC, I have to have a podcast. Those are the rules. So I have a podcast called the MAD Podcast as well.

Awesome. Chris, you wanna round things out?

Yeah, sure. Hey everybody, I'm Chris. I have a much smaller fund. I mostly do little seed and angel investments. And in addition to that, I do some writing.

I'm helping Martin Kleppmann with the second edition of Designing Data-Intensive Applications, and I also do some open source stuff as well. I launched SlateDB last year, along with the folks at Responsive, which is like a RocksDB on object storage, is sort of the way to describe it. So I'm kind of straddling between investing and code writing.

That's how I keep myself busy these days.

Awesome. Yeah, Chris is a rockstar. If you don't follow him yet, you should. He has a ton of great content and software that he has written. Okay, let's just jump into it, I think. There's a lot to cover. So Matt, maybe I'll start with you. I think pretty much everyone who is listening to this conference, whether they know it or not, has probably seen the MAD Landscape floating around somewhere on the internet. It's 10 years old.

It's been coming out annually. All this gen AI stuff is happening. I'd love to know from you, and then hear from Apoorva and Chris, how you see the next change in the landscape happening as you add gen AI to the mix. And what is the potential impact of that on the data world?

Yeah, so actually, thanks for reminding me that the next version of that MAD Landscape is overdue. I was targeting early in the year and we're still working on it. So there'll be a version of it coming soon, I promise. I promise this to myself. And look, it could be a very long conversation.

We could go in all sorts of different directions with this. But I think what's particularly interesting about generative AI from the landscape perspective is that it's an application-first trend. That's one thing that's probably obvious to practitioners in the field but may not be obvious to everyone, which is that, in addition to all the technological prowess, generative AI is also comparatively very easy to use.

If you think back to machine learning, machine learning was very much (and still is, because it is not going away) the province of a few technical users and data scientists who basically would go away and spend a year building a model, and then would come back, and then you would build an application on top of it, and the whole process would take a very long time.

Here the fundamental difference is that we have those foundation models that are widely available as APIs and enable just about anyone to build an application very quickly. So, compared to prior trends, which were sort of infrastructure first and application second, here everything is happening at the same time.

So if you think of the modern data stack in earlier versions of the landscape, the modern data stack was very much: okay, well, you've got Snowflake, and then you just move data into Snowflake via Fivetran or any number of different vendors.

And there would be a long march to go from that fundamental infrastructure thing to applications. The big difference here is that, because it's so easy, you've got applications everywhere, and so that right side of the landscape is the most vibrant. That's the one big trend, the big evolution in 2025.

Got it. Apoorva, Chris, feel free to jump in anytime.

Yeah. I actually see this like any infrastructure phase, right? All infrastructure layers, whether it's the data layer or the ML tooling layer or the dev infra layer, go through their cycles of consolidation and fragmentation, which are linked to market shifts.

Market shifts historically have been the cloud shift, or the ML shift, and the big data and analytics shift. And more recently, there's what I call the AI shift. The AI shift, as Matt referred to, has enabled people to develop applications very quickly.

But as this market is maturing, I think what's exciting is that with this platform shift happening, there are a lot of new data primitives coming up. And the new primitives come up for two different reasons, right? One is to address the pain points of the incumbent stack.

And there are clear pain points with the incumbent stack around data being thrown into the warehouse and data not being high quality, people not being able to own their own data in the lake or where it belongs, and data not being high quality at the source itself. So those are the incumbent issues.

On the other hand, there are these bigger challenges that are emerging, which are more developer-first, just because developers are now the ones who are taking these models and creating applications. Now they care about data curation. They care about generating new data sets. And I think it's not just for training some of these models for your internal use cases, but it's for test-time compute.

Like, how do you use the right data, or how do you evaluate the data that'll actually help with lower latency and higher accuracy for your different tasks and use cases? So as an investor, I'm very excited about new primitives, not just to address the pain points of the existing stack, so that you can own your own data in the lake and have flexibility around compute engines on top, but at the same time I'm also excited about how you enable developers with tooling. Historically, a lot of that data used to be given to the likes of Scale AI and Labelbox and so on.

How do you put tooling in the hands of developers so that they are able to curate data as well as even generate data? Certain verticals, certain areas like life sciences, manufacturing, even robotics, will require data to be collected from physical surroundings, and that data does not exist on the web.

So how do you generate that data, and how do you create net new applications that are full stack? I think those are the areas that I'm excited about.

Cool. Awesome. Well, next question. Chris, you have this incredible background as a distinguished engineer, and you've sort of been living at the overlap of the Venn diagram between seeing the future in terms of the investments you're making and actually getting your hands on code and doing the work.

What do you think is the future of data engineering as we move into the age of AI? How does the role of being a data engineer start to change, in your opinion?

Oh. So I've been thinking about this a lot lately as sort of a consolidation of a number of roles. I wrote a post on this fairly recently, about how we sort of exploded the number of roles we have.

We have analytics engineer, we have data engineer, we have ML engineer, we have data scientist. I don't think over the long term we're gonna be able to sustain those as distinct roles. I just don't think they're gonna be able to justify the cost of having four people doing four things that are like 80% the same.

And the way that I see this playing out is that data engineer as a title will largely get compacted in along with analytics engineer and maybe ML engineer as well. And where there are skill gaps, I think this is where AI helps quite a bit. I already see a lot of companies going after AI helpers that will help you manage your Airflow jobs and help you manage your ETL pipeline and stuff.

And so there's a world in which these things collapse down, and the data engineer's responsibility is probably gonna expand. You're gonna have to start handling ML engineering-type tasks, maybe even data science-type tasks, maybe even analytics engineer-type tasks, which I think is a good thing.

It's going to cause that role to be a little bit more focused on business outcomes. I think historically data engineers have sort of been like, and I'm guilty of this too: I'm responsible for extract and load, you guys do what you want with the data, I don't want to worry about it.

So that's sort of how I've been thinking about it. And you can see some trends in recent startups where they're trying to unify the stack a bit more. I think dltHub is one example, where they're trying to unify the data pipeline with analytics engineering, dbt kind of use cases.

So that's my take on the situation.

Yeah, that makes sense. Basically, you have a lot of fragmentation. Is it fair to say that you're sort of describing a full stack data engineer, basically?

Yeah, I think that's fair. And I think the next leap from there is, well, you know, shift left, this is the whole title of this conference, right?

As well, why can't application engineers do this? And I think there's room for that as well. So yes, I think full stack data engineer is probably a reasonable way to phrase that. I kind of like that term. And to piggyback off that, you have existing full stack engineers, so why can't they do the data the way they're doing DevOps and SecOps and all the other stuff that we've shifted left?

And I think our data tooling, when you compare it to a DevOps maturity level, is way behind. So making that happen is gonna take a lot of platform development: things like data contracts, things like dltHub and dbt, and instilling software engineering best practices into what we're doing.

But we'll get there.

Cool. Awesome. Apoorva, I've got a question for you. You and Matt have both referenced the modern data stack. For the last 10, 12, 15 years maybe, the whole world was aflame with Snowflake, Fivetran, dbt, and I think obviously to a large degree that's still true and will continue to be true.

But what does the next horizon look like in your mind? What's the next big set of technologies that data leaders should be thinking about now, potentially to enable all the massive AI use cases that are coming down the road?

Well, I think with Snowflake, they were one of the first few folks to really solidify this concept of separating compute and storage at that time.

And that was pretty helpful for the community overall, because before that it was deeply coupled, and there was less flexibility in the minds of the practitioners. But we realized very quickly, right, with the likes of Snowflake and the entire modern data stack, that there are limitations to what you can do.

One is the limitation on the kind of data you can address. It is very much focused on structured and semi-structured data. And as the world is moving towards pictures and text and audio and video, I think people don't want to throw all of that data into one blob, right?

They don't want to throw it into one warehouse. They want to keep it at the source, and they want to get value from it at the source. And then there's this limitation around governance: this entire thing requires governance, and whenever you need access to any data, you basically figure out quality as an afterthought, and you figure out the use case after you've built out that entire stack.

And then there are limitations in terms of what kind of things you can ETL and all of that. I think all of that needs to change in the future, and I think that's the promise and the beauty of a lake. And that's why, to a certain extent, when Databricks builds the Lakehouse and builds that full stack on top of it, it's considered a pretty promising full stack platform.

But a lot of organizations might not want to use Databricks, or might want flexibility beyond Databricks, to just keep data where it is, build tooling on top of that, and have flexibility around ETL, flexibility around cataloging that data and being able to use it with different compute engines, as well as to fetch it as and when you need it, which is what the power of something like DuckDB is.

For that you need tooling, and Chris referenced dltHub, which plays pretty nicely in the lake, right? And then there are technologies like Iceberg that help with ACID compliance, which help with CRUD for the lake itself. So I think you need those kinds of technologies, which for now are more bespoke.

It's more like the best technologies for the best use cases, but over time they'll get consolidated, because you need simplicity in that stack as well. And that's why I like the idea of the neutral storage that Tabular started with, and now obviously it's a part of Databricks. But I think we need that kind of tooling that'll give you the flexibility of having data at the source, but also the flexibility of using the compute engines and the data at the point of use.

And that's something that I'm looking forward to, and I'm actively exploring technologies in that realm.

Interesting. So it seems like there are multiple avenues of shifting left: there's shifting quality left, and there's bringing the compute to the point that's closest to the engineer that's actually doing the work, potentially breaking the compute apart into the places where it makes the most sense. It's not having a single file format that you're throwing around, or not having multiple file formats that you're throwing around, but starting to standardize. And all of that is kind of part and parcel. It feels like it's the same type of shift left happening everywhere.

But also not giving the overhead, whether to developers or to the data teams, of figuring out orchestration, figuring out cataloging, figuring out everything possible in the stack.

And I think all of that has to be simplified, at least for mid-market companies. And I think that's an opportunity for the stack.

So I very much agree with that. Again, I think you can go deep on this product versus that product kind of thing.

But what I'm hearing, talking to customers, and maybe that's a super obvious comment, is a deep, deep desire for consolidation. There are too many companies, too many tools. The modern data stack is a prime example of this. I mean, it's great to have all these different vendors work together, but ultimately people want stuff that works.

So it could be company consolidation, corporate consolidation, which effectively means Databricks and Snowflake buying up and down the stack. That's one way. There could be product consolidation, where you have products do more stuff. Then you could have AI-powered consolidation, where you used to have 10 people do one thing.

Now you could have one person do 10 things, which is sort of what Chris was alluding to. But something's gotta give, right? Something's gonna happen, because ultimately people don't care whether that's structured or unstructured data. They don't care whether that's batch or real time. They don't care about whether that's BI or predictive analytics.

They just want everything to work together. And that's been the fundamental problem of our industry, I think. If there's one thing that this landscape that I do year after year shows, it's that it keeps getting worse: there are more and more companies, more and more tools. And until we solve that problem one way or the other, I think we're gonna be inefficient as an overall industry.

Yeah. Interesting. It feels like we're going back to the old-school data warehouse, where the data warehouse is an integration layer, Bill Inmon. And the problem we have now is that all these technologies are not actually integrating with each other, it seems like, is what you're sort of getting at. Sorry, Chris, go ahead, please.

Yeah, I would say it's, it's, it's, it is ge it is getting worse. And, and, sorry, I'll shut, it's, it's, it is getting worse where, because there is a, uh, now there's like the, obviously the emerging generative AI stack, right? Like the vector database, the rag, and all the things. So there's even more tools. So like the, the, the problem keeps compounding and getting worse.

And at some point, we need to solve that.

Yeah. I would almost flip what you're saying, Chad: it's almost that everything integrates with everything. And so you have Parquet files and Iceberg formats and 80 different query engines that can all query the same data.

And to Matt's point, the end user doesn't care whether you are using Daft or Trino or pandas. They just wanna run the query, you know? And so I think the MDS pattern is being played out at the query engine level itself. There's a way in which, if you squint at this, and there's Wes McKinney's post about decomposing the database, it's further fractionalizing and balkanizing what used to just be one tool.

And I couldn't agree more with what Matt said about these different approaches, whether that's consolidation or AI-powered tools that are gonna help manage this stuff. It's probably a combination of everything.

All of the above. But yeah, things are worse, not better. Everyone's been complaining about MDS for the past 10 years and how fractionalized it is, and we're just continuing to make it worse and worse. So I think we need to start seeing some bundling or partnering. Something that's been interesting to me is watching streaming and the data lake consolidate a bit.

So Confluent did Tableflow. WarpStream is doing something similar in this space. Bufstream just did something. So they're trying to unify the streaming and data warehousing platforms, at least, a little bit. So maybe something will shake out there where at least we'll be able to query streaming and, quote unquote, stable data more easily.

But it's gonna take a long time, unfortunately, I think.

Yeah, I think that makes total sense. On the same note, Matt, you brought up a really interesting point, which is that there's the reality of the data practitioner, of the data engineer and the analytics engineer, and the things that we are all excited about.

And with the rise of product-led growth and open core and that whole model of business that's evolved over the last 10 years or so, it was really the individual engineers, the individual workers, that were driving the adoption of tools. They would bring in a tool that solved one particular problem, and then bring in another tool that solved another particular problem.

And what I'm inferring from what you're saying is that the leadership of these organizations is looking at the state of the data landscape now and saying, hey, that's not gonna work so well anymore. We can't have 20 different systems that are all doing a different cut of the same thing.

On that note, what are you seeing from leadership in terms of what they care about, from both a data perspective and also the future AI state? What are the foundational pieces that they're looking to put into place to get value out of artificial intelligence, that may not exist today, or might exist today and they're just not reaping the rewards quite yet? And leadership being the CEOs of those companies.

Not the data leadership.

Yeah, right. I think we're still in that moment that has been abundantly documented, where people feel under heavy pressure to do AI. So all the things that people have been saying for the last year, oh, the board is bringing that up, you need to be able to show that you do AI.

That's still very true today. What's interesting is that in the enterprise (and when I say the enterprise, I mean Global 2000 companies versus tech-native large companies; in other words, I'm talking about Pfizer, not Uber), we are just starting to emerge from the low-hanging fruit phase.

And the low-hanging fruit phase, for me, was a combination of: let's bring OpenAI on top of Azure to do ChatGPT for our own files, so the most basic, obvious use case; and coding; and consulting as the third leg of the stool. That was that low-hanging fruit phase.

We're just starting to go into the "so what do we actually do with that thing" phase, meaning the low-hanging fruit has been taken care of, for the more advanced ones, not for everyone. Now it's very much: okay, what do we do that is specific to our company, that is specific to our industry, that leverages our data?

So what are the tools that we need? Where is that gonna live? Can our data go to any one of those commercial API vendors, or should we do it on-prem? If we do it on-prem, what does that mean? Do we need GPUs? So it's that rubber-meets-the-road kind of moment, which is the very hard moment, but also the most important one for the industry, because that's what people want to do.

If they're gonna spend big budgets on generative AI, and if that's truly gonna be a thing, it's very much: how do we customize generative AI for our own needs? And that's just starting to happen. And then, very much on the data infra side, we're going to get into that, and it's gonna give us another spin on this whole thing that we've been talking about for years.

You know, into the data contracts, the data governance, so that's gonna revive this entire discussion: orchestration, security, who has access to what. And so, as that reality moment truly happens, we are gonna take another spin of the same discussion.

So it's gonna be both a daunting and also an exciting, fun time.

And spin the block again. Yeah, I mean, as you're talking, and as I'm listening to what you guys are saying, and as we're hearing what customers and other companies are starting to do with AI, it feels to me like a lot of the same types of challenges we've been seeing in data management for 30 years are already starting to rear their head a bit.

So, for example, if I've got a bunch of different agents and I'm unleashing those agents into my code base so that they can make changes, how do those agents actually communicate with each other? Well, they're passing information from agent to agent. And now you've got a data pipeline, and all the same kinds of management problems start to apply.

Okay. Well, I think we're almost at time. We've got a couple minutes left. So I'd love to just go around and give everybody a last word: what are you excited for in the future? Is there a company that you're excited about, and why?

Chris, you wanna start?

Yeah, I'll go first. I guess a company, not a portfolio company of mine, that I'm very excited about is Loophole Labs. This is sort of deep tech stuff, so not directly related to data, but they're doing a lot of virtualization and migration stuff.

And one of the side effects of that is being able to migrate GPU workloads between clouds, even using spot instances. So I'm just excited from an engineering and use case perspective, but they're doing a ton of work with labs, like government labs and stuff, and getting a lot of value there.

So I think that whole space is really fascinating to me. That's one area where I'm excited, 'cause I just think the bottleneck's probably gonna be GPUs and power for the foreseeable future, so figuring out how to load balance that stuff is really key.

Cool. Matt, you wanna go next?

Yeah. Sure. So I'm actually very excited for that phase I was just describing, which is: okay, what are we actually doing with that thing, with generative AI in the enterprise? I think that's the really interesting and sort of creative moment. I'm excited to see it.

And in terms of a company I'm excited about, I'm sorry to be such a cliche VC, but I'm actually going to mention a portfolio company, which is Dataiku. The reason why I'm mentioning them is precisely because, for that phase where big enterprises are figuring out what to do with generative AI, as a Dataiku board member I have a front row seat.

'Cause it's a company that's at some scale, right? Above 300 million in ARR. And that orchestration of models, agents, analytics, all the things, is what they do. So it is fascinating to see how those large companies are starting to use those tools in the context of Dataiku.

But in general, what is it that they actually do with it?

Excellent. Apoorva?

Yeah. I'll also be a cliche VC and talk about Gable, and put in a plug for Gable itself. I think people know the company more for data contracts. We just wanted to extend that.

It's not just about data contracts. It's about visibility of data assets. It's about ensuring that the consumers of data get what they're looking for, and at the same time governance as well. Because until now, most governance tools and lineage tools were downstream of the warehouse.

This enables you to have lineage and visibility from the source itself, and I think that's what's exciting.

Awesome. Well, that was not planned in advance, but thank you, everybody. Thanks, folks on the panel, for coming. This was awesome. And Mark and Demetrios, back over to you."

Chad Sanderson

April 1, 2025

Panel: State of the Data And AI Market | Apoorva Pandhi, Matt Turck, Chris Riccomini, Chad Sanderson | Shift Left Data Conference 2025
