Audio transcript:
MLOps: The Secret to Successful AI
Michelle
Good morning and thank you all for joining us for another Insight Tech Talk. I'm Michelle Reaux. And today, I am joined by Amol and Ken and I am actually going to have the two of you introduce yourselves. So, Ken, do you want to start?
Ken
Yeah, hi. My name's Ken Millard. I'm a Machine Learning Architect with Insight and I especially focus on MLOps.
Michelle
Great, fantastic. Amol, how about you?
Amol
Hey, everyone. My name is Amol Ajgaonkar. I'm the CTO for Intelligent Edge, and in my role I focus on running workloads at the Edge and then scaling them using the Cloud.
Michelle
Great. Well, a big welcome to both of you. Thank you so much for being here with me today.
Amol
Thank you, Michelle.
Ken
Happy to be here.
Michelle
All right. So I want to set the stage for what we're going to be talking about this morning. Adoption of Artificial Intelligence, or AI as we all know it, is growing rapidly. It's becoming just a norm in everyday life. A lot of businesses have future plans to invest in AI, and our guests today, like you just heard, are really going to explain what we need to do in order to drive that adoption. Even while businesses see value in AI and have plans to adopt it, the unfortunate reality is that a large percentage of AI projects fail. So today, we're going to explain how organizations can successfully operationalize AI. All right. Like I said, much of what we're talking about today is around AI and a term called MLOps. So Ken, I'm going to flip it over to you. Can you level set: what is MLOps, and is this really a term everybody should know?
Ken
Yeah, absolutely. So I'm really glad you mentioned that the truth with AI projects is that most of them fail. More often than not, what we see is that these projects don't fail because these organizations can't get a model together; they fail in what we call the last mile, the path from model development to production. Internally, we've started to call it the last marathon, because it's at least half of the work. Organizations don't adequately understand the technical complexities of getting these AI projects out of the lab and into a production setting. And that's essentially what MLOps is: a set of practices and standards to ease that path and accelerate the process of getting something out of the lab environment and into production. We use the term MLOps because we're piggybacking on a lot of DevOps concepts, right? We're utilizing that same CI/CD methodology; in the same way that software development accelerated with DevOps implementations, we're doing the same thing with AI. Really, the point is to enable a framework that allows your AI engineers to develop this code and these models and have a self-service method for getting them from their development environment into production.
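The CI/CD-style quality gate Ken describes, where a retrained model only moves toward production if it holds up against the current one, could be sketched roughly like this. All the function names here are illustrative placeholders, not any specific tool's API:

```python
# Minimal sketch of an MLOps promotion gate, analogous to a CI/CD quality
# gate in DevOps: a candidate model is only promoted if it scores at least
# as well as the production baseline on a held-out dataset.

def evaluate(model, dataset):
    """Score a model on labeled (input, expected) pairs; accuracy here."""
    correct = sum(1 for x, y in dataset if model(x) == y)
    return correct / len(dataset)

def promote_if_better(candidate, baseline, dataset, min_gain=0.0):
    """Return the gate decision for the candidate model."""
    cand_score = evaluate(candidate, dataset)
    base_score = evaluate(baseline, dataset)
    return "promote" if cand_score >= base_score + min_gain else "reject"

# Toy "models": classify a number as positive or negative.
baseline = lambda x: "pos" if x > 0 else "neg"
candidate = lambda x: "pos" if x >= 0 else "neg"   # also handles zero
dataset = [(1, "pos"), (0, "pos"), (-1, "neg"), (2, "pos")]

print(promote_if_better(candidate, baseline, dataset))  # → promote
```

In a real pipeline the evaluation step would run automatically on every retrain, which is exactly the self-service path from development to production that Ken mentions.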
Michelle
All right, fantastic. Thank you for that. Thank you for that level set. Amol, I'm going to send it over to you. So can you give us some examples of real life use of MLOps?
Amol
Absolutely. So I'll give you examples of where ML is used, right? And then it'll make sense when I talk about MLOps in those scenarios. Let's take manufacturing, to begin with. You have a camera, you've got products, and you want to check for defects. You say, "Okay, I'm going to build a model that'll look at those products and check for defects." Great. They build a model, they test it out, it looks fantastic, it works. Like, all right, I need to get this in production. Well, you get it in production the first time and it all works. You put it there, it's working, and you're like, success! Fantastic. Then the product changes and the defects change. And now you're like, "Oh, I need to train a new model or a new version of that model." And everything starts to break down, because your line is still running. You still have to do the defect detection, but now you've got another version out there.
How are you going to run that in the same space and make sure you don't disrupt your existing process, while also checking whether your new model is better or worse? That is where MLOps comes in. It's enabling the data scientists and the architects to build a model, right? Collect the new data, build the model, train the model, and now they want to test it. How do you operationalize that model, make sure it's running at the edge wherever it is running, and be able to do A/B testing? Be able to see, "Okay, if I run this model, I'm getting better results; maybe I can detect the new defects that are showing up." Or is it worse? Has the lighting changed? Is there some other variation in those images that's causing the model to fail, right? So that entire life cycle: being able to collect new data, train the model, deploy it back, and then test it while the existing model is still going, right? The A/B testing, and then being able to say, "Okay, my newer model is great, I'm going to switch it out," and just switch it. Now suddenly your newer model is taking over and you can safely remove that older model, or even keep it as a backup. Like, "Hey, maybe something went wrong here and I want to just go back to that earlier model." That is how we do it. So this is just manufacturing, but the same pipeline works for any AI model. You could have retail scenarios like product shrinkage or inventory management, or building heat maps, or anything else that relates to an AI model; you run into the same issue. The first version is easy. It's from the second version onwards that, if you haven't thought this through when you start building, you're going to run into issues.
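The A/B testing and switch-over lifecycle Amol walks through, running a candidate model alongside the production one, then promoting it or rolling back, can be sketched as a small traffic router. The class and its behavior are a hypothetical illustration, not a specific serving framework:

```python
import random

# Sketch of routing inference traffic between the current production model
# and a candidate version for A/B testing, with promotion and rollback.

class ModelRouter:
    def __init__(self, current, candidate, candidate_share=0.1):
        self.current = current
        self.candidate = candidate
        self.candidate_share = candidate_share  # fraction of traffic to the new model

    def predict(self, x, rng=random.random):
        # Send a small slice of traffic to the candidate; the line keeps
        # running on the current model either way.
        model = self.candidate if rng() < self.candidate_share else self.current
        return model(x)

    def switch(self):
        # Candidate wins: promote it, keep the old model around as a backup.
        self.current, self.candidate = self.candidate, self.current
        self.candidate_share = 0.0

    def rollback(self):
        # Something went wrong: go back to the earlier model.
        self.current, self.candidate = self.candidate, self.current

# Toy usage with stand-in models and a fixed rng for a deterministic demo.
router = ModelRouter(current=lambda x: "v1", candidate=lambda x: "v2",
                     candidate_share=0.1)
print(router.predict("frame", rng=lambda: 0.05))  # → v2
```

The key property is that promotion and rollback are just a pointer swap, so the line never stops while models are exchanged.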
Michelle
Perfect. And thank you for explaining that, that makes a ton of sense. One of the themes you're hearing is that a lot of these AI projects fail, right? So we're here to help figure out how to make them successful. You both work with clients, you're seeing this, and you're working with technical teams. So what are the common challenges that you're seeing? And why is it so incredibly difficult for organizations to get AI off and running and really manage it on their own? Ken, I'm going to flip it back to you to answer that first.
Ken
Yeah. So I think one of the first things that's really important for organizations to understand as they engage on these AI projects is that, like Amol mentioned, there are fundamentally different things about the way AI systems work that give them different requirements than a traditional software project. The first thing to realize is that, in contrast to traditional software development, AI projects typically require a lot of care and feeding, right? Like Amol said, we experience what we call drift: the data, the system that a model is built to predict, changes over time. And that requires you to continually update that model to keep up with those changes. Amol hit the nail on the head; the biggest challenge is, "Okay, how do I continually update and ensure quality, so that when I retrain these models and get them into production, I'm serving predictions that are just as good as they were before?" That's a really big one. Another big hurdle organizations face is that as a model evolves over time, eventually just passing newer data to the model stops being effective. So we actually need to change the data that the model consumes, and being able to change that easily, without having to rebuild your entire data pipeline, is difficult if you don't go in with that mindset and design with it in mind: "Hey, right now this model needs these five pieces of data, but in the future it may need eight pieces of data." Right? So we want to architect our systems in a way that allows us to make those kinds of changes without having to rebuild the entire pipeline from scratch.
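Ken's point about going from five pieces of data to eight without rebuilding the pipeline is essentially an argument for keying features by name behind a registry. Here's a minimal sketch of that idea, plus a crude drift signal; the feature names, record fields, and drift rule are all invented for illustration:

```python
# Sketch of a schema-driven feature pipeline: the model's input list is
# just a list of names, so adding a feature means extending the schema
# and registry, not rebuilding the pipeline.

FEATURE_REGISTRY = {
    "temperature": lambda rec: rec["temp_c"],
    "pressure":    lambda rec: rec["pressure_kpa"],
    "humidity":    lambda rec: rec.get("humidity_pct", 0.0),  # new, optional
}

def build_vector(record, schema):
    """Extract features by name in schema order."""
    return [FEATURE_REGISTRY[name](record) for name in schema]

def mean_drift(baseline, live):
    """Crude drift signal: absolute shift in the mean of a feature stream."""
    return abs(sum(live) / len(live) - sum(baseline) / len(baseline))

record = {"temp_c": 21.5, "pressure_kpa": 101.3}
v1_schema = ["temperature", "pressure"]
v2_schema = ["temperature", "pressure", "humidity"]  # model v2 wants more data

print(build_vector(record, v1_schema))  # → [21.5, 101.3]
print(build_vector(record, v2_schema))  # → [21.5, 101.3, 0.0]
```

A real system would use a feature store and a statistical drift test rather than a mean comparison, but the design point is the same: the pipeline survives schema changes.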
Michelle
All right, perfect. And so then Amol, I'm going to flip it over to you and ask you. So how do we operationalize these AI models?
Amol
Right. So before we get to the operationalization conversation, the planning actually starts way before that, right? The planning starts when you begin to build a model and you're thinking of the solution and say, "Okay, building an AI model makes sense." At that point, realizing everything that's required to actually put this in production is important. If you think about building a model and say, "Okay, I just need to collect this data set, build a model, put it here, and test it," that is just one small lane of a six-lane highway. You're going through one lane and ignoring all of the other stuff. You really have to have a team that thinks in terms of: how do I get this in production? Which means, how do I continuously collect data on demand? Like Ken said, the data changes; due to variations you see in the data, you might need additional data points. If you need that, how are you going to do it in the future? How are you going to do it when it goes into production? Right? All of this has to be planned. You need the right people in place and the right technology stack in place to be able to do that. You need that kind of expertise. So it's not just, "I hire a data architect or a data scientist, give them the data, and say, 'Give me a model,' and I'm good." No, you need other roles as well. Just like when you're building a software solution, you need the DevOps team, you need data architects, you need the software architects, right? And cloud architects.
And there are so many roles, because those are required for the successful deployment, maintenance, and running of a solution. Similarly, with an AI model, some of those roles are required. So when we look at operationalization of a model, we're looking at a framework where, when we deploy it, the customer or the team should be able to collect data on demand, deploy a model from the Cloud onto the Edge or whatever that pattern is, and test it and see the results side-by-side. Also, if you're operationalizing at the Edge, you need to scope out the right hardware as well. You need to think about security. You need to think about networking. Is the bandwidth enough for that much data to be passed around just locally, not even to the cloud, right? Just locally, is my network good enough? All of these things have to be taken into consideration before we talk about operationalization, because this entire stack is operationalization. It's not just one switch; it's a plethora of switches, and knowing when to turn each one on and off. So we build IP around that. Let's say it's a custom vision or computer vision kind of application. We have IP that will do video ingestion, and we work with partners that use that framework so that we can do inferencing and switch between GPUs, CPUs, and iGPUs, and so on, right? We try to abstract those complexities out, so that when the data scientists come up with a new version, we have a better way of pushing that model down, testing it on different types of hardware, whether it's iGPU or GPU or CPU, and distributing the load between different machines as well.
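The hardware abstraction Amol describes, pushing the same model to whatever compute a given edge node has, could be sketched as a simple placement decision. The device names, preference order, and `deploy` shape are all hypothetical, not any particular inference runtime's API:

```python
# Sketch of abstracting the inference target so one model artifact can
# land on a GPU, iGPU, or CPU depending on what the edge node offers.

PREFERENCE = ["gpu", "igpu", "cpu"]  # fastest available device wins

def pick_device(available):
    """Choose the best supported device present on this node."""
    for dev in PREFERENCE:
        if dev in available:
            return dev
    raise RuntimeError("no supported inference device found")

def deploy(model_version, available_devices):
    """Report where a model version would be placed on this node."""
    device = pick_device(available_devices)
    # A real framework would load the model onto the device here;
    # this sketch only records the placement decision.
    return {"model": model_version, "device": device}

print(deploy("defect-detector:v2", {"cpu", "igpu"}))
# → {'model': 'defect-detector:v2', 'device': 'igpu'}
```

Keeping the placement rule outside the model is what lets the data scientists ship a new version without caring which box it lands on.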
Michelle
All right, perfect. So you just talked about computer vision, and that's really another technology trend we're seeing accelerate. Actually, Insight and IDG just released a computer vision report, and it talks about the success of early adopters. So Ken, can you talk to us about MLOps use cases in different industries?
Ken
Yeah, so I think computer vision is a really interesting and exciting field. Even a couple of years ago, it seemed like an insurmountable task for your average organization to implement a robust computer vision model for their particular use cases. Now, with the development of tools and leveraging techniques like transfer learning, we're seeing that the time to develop and the overall data requirements for computer vision projects have come way down. And actually, in that IDG report, we see that of the early adopters who have implemented computer vision projects, half of them expect to get their ROI within the first year, right? For machine learning projects, that's an incredibly fast turnaround to show profitability. I think that's both because the dev time has decreased and because these use cases are so powerful in terms of what they can do. Across different industries, there's a ton of ways people are leveraging computer vision. Really common ones we see are in manufacturing: a lot of quality control work, identifying defective product as it's being manufactured, and getting general quality metrics, like how we're trending in terms of overall production. In retail, we see a lot of heat mapping to understand customer foot traffic and flows, where customers spend more time looking at some things versus others. And then even in healthcare, you see a lot of uses, from analyzing radiology images and that kind of thing.
Michelle
All right, fantastic. So as we wrap up today, we like to always, you know, help our listeners keep an eye on the future. So I want to ask both of you and see if you have anything else to add before we close. So Ken, really where do you see MLOps evolving? And then the next question, is this a skill that all IT professionals really need to study?
Ken
So my personal opinion is that we're still in the very early days of MLOps, to be quite honest. We still have several years to go before a lot of this is as regimented as DevOps practices are. But as we mature, I really see AI in general becoming just another part of the developer's toolkit. Right now it's a very specialized skill set that a very specialized group of people implement, but I think 10 or 15 years in the future, you're going to have ML practitioners on most development teams, right? This is going to become ubiquitous in software development.
Michelle
All right, fantastic. And then Amol, anything else you want to add before we end today?
Amol
Yeah, absolutely. It's not just MLOps; ML itself will be changing as well. As the tooling improves, there are a bunch of companies that have come up with tooling that'll allow you to annotate, that'll self-annotate, and that'll help you manage these versions of models and even deploy some of them automatically. Everybody's moving in the direction where the MLOps pipeline is almost going to be an as-a-service offering, where you go in, upload your data, get models, and be able to manage them. And if you don't want to use those, there are other tools, like Lego blocks, that you can put together, and those tools themselves are evolving as well. So it'll be much easier for developers who are not heads down in this space every day to use them. I don't think everybody needs to be super, amazingly talented in this space, but they need to be aware of it, because ML is going to be more and more ingrained in all types of solutions. Even in a software solution that we build, like a web application or a desktop application or a mobile app, ML is going to be part of it, because you might be looking at different types of errors and predicting what's happening in your environment, right? That's on the operations side, but it's still ML, which means you still need the MLOps pipeline there as well. So everybody just needs to be aware of it and think about it when they're building solutions: if we were to do this in the future, what do I need? So helping spread that awareness matters. Understand that it's required, not optional. You could get away with certain things in the short run, but in the long run, you are absolutely going to need it.
Michelle
Well, Ken, Amol, thank you so much for sharing all of your insights with us today. You made it really easy to understand, and like you said, this is a big, big thing to really understand and learn. And I love the idea of building that awareness. I just appreciate you both so much. Thank you for being here.
Amol
Thank you, Michelle.
Ken
Thank you for having us.
Michelle
Absolutely. So for those of you listening, we have some fantastic related resources, and those are going to be in the show notes. You can also find more technical insights about AI and MLOps in our digital magazine, "The Tech Journal." You can read it and subscribe at insight.com/techjournal. Again, thank you all so much for being here with us today, and have a fantastic day.