AI is fast becoming critical to business and IT applications and operations. For years, organizations have invested in artificial intelligence and machine learning systems and hired the best data science teams to stay competitive. However, implementing AI/ML models is not easy, and the risk of failure is always close at hand. A solid methodology is needed to reduce this risk and enable companies to succeed.
What is ModelOps and what problems does it solve
AI executives have been working for years to get more models into the business. The first hurdle was hiring data scientists and acquiring tools for rapid model creation. That problem has been solved. The next hurdle is getting those models into production in a timely, compliant manner. Many companies have a backlog of models that sit idle and degrade, contributing no value or revenue to the business. We refer to this as model debt. Others are working to get their first AI models into production and finding that it is much more complicated, and takes much longer, than they anticipated. Most important, they are realizing that models, especially ML models, intrinsically bring a lot of risk into the entire organization. Boards understand that if they want to use AI at scale, they need process standardization and automation, backed by strong governance.
Coming from operations, and having been involved in many digital transformation programs, I am not surprised that ModelOps is the cornerstone of every AI initiative. There is a huge difference in mindset, requirements and cross-team collaboration between the Labs and Production. This is not a first in the enterprise world. Something very similar happened with software, when developers’ code was not going into production in a timely and effective way. DevOps helped by improving collaboration across IT teams, accelerating deployment cycles and delivering better experiences through a modern software development methodology. In a similar way, ModelOps helps enterprises operationalize models (AI, machine learning and traditional models). Here it is even more complicated, because models drift over time and carry a huge amount of board-level risk.
Data Scientists build models with multiple tools and languages, and in most cases the models never see the light of day in production. When they do, it often takes a long time, sometimes so long that the model is no longer useful by then. Models that do make it into production very often run without proper monitoring, controls and overall governance, so they do not always perform as they should, and they may expose the entire company to multiple kinds of risk, such as compliance and reputational risk. This is an enterprise-level risk.
That said, we can define ModelOps as a combination of culture, policies, practices and tools that is key to an organization’s ability to operationalize AI, ML and analytic models at scale, evolving and improving model governance and performance at a faster pace than existing/homegrown methods. This speed enables organizations to employ AI decisioning to better serve their customers, secure their businesses, and compete more effectively in their market.
Shadow AI: the key issue that organizations will have to contend with in the coming years
An important aspect that is often underestimated in the early stages is that ModelOps and MLOps are distinct and separate from each other. While MLOps is for Data Scientists, ModelOps is primarily a concern for CIOs. Models in production must be monitored and governed 24x7, and regulations are coming, and not only for the Financial Services industry. This is something the CIO organization should handle. Otherwise, the risk is a repeat of what happened with Shadow IT, which we can call Shadow AI: each BU puts models into production without standardization across the enterprise, and the result is a wild west of models. Even the simple question “how many models are in production?” becomes hard to answer, let alone getting visibility into the state and status of each model in production or answering questions about compliance and risk management.
If we think of Shadow IT, it was not necessarily bad, as it sparked innovation. The CIO organization had to control it, not eliminate it. The problem we have been seeing a lot, and one I mention in my recent articles, is that organizations still treat models as BU-level assets that belong to the BU and its Data Scientists even in production, rather than as enterprise assets that should be managed centrally, like many other shared services managed by the IT organization. A big mindset shift is required. The starting point is to understand that ModelOps is necessarily separate and distinct from Data Science. Labs and Production should be like Church and State. Data Scientists should not be asked to double as operational resources, as they have neither the bandwidth, the skillset nor the interest to manage the complex 24x7 model life cycles that proper operationalization requires.
CIOs should step up with the help of the Chief Risk Officers and implement ModelOps.
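To make the inventory question above concrete, here is a minimal sketch in Python of a central model inventory that can answer “how many models are in production?” and flag production models that run without monitoring. Everything in it is an assumption made for illustration: the `ModelRecord` fields, the `production_summary` helper and the sample records are hypothetical and do not come from any particular ModelOps platform.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """One entry in a hypothetical enterprise-wide model inventory."""
    name: str
    business_unit: str
    stage: str        # assumed values: "lab" or "production"
    owner: str
    monitored: bool   # whether any monitoring is attached to the model


def production_summary(inventory):
    """Count production models per business unit and list unmonitored ones."""
    in_production = [m for m in inventory if m.stage == "production"]
    per_unit = Counter(m.business_unit for m in in_production)
    unmonitored = sorted(m.name for m in in_production if not m.monitored)
    return {
        "total": len(in_production),
        "per_business_unit": dict(per_unit),
        "unmonitored": unmonitored,
    }


# Illustrative records only; these are not real systems.
inventory = [
    ModelRecord("churn-score", "marketing", "production", "team-a", True),
    ModelRecord("credit-limit", "risk", "production", "team-b", False),
    ModelRecord("demand-forecast", "supply-chain", "lab", "team-c", False),
]

summary = production_summary(inventory)
print(summary)
```

The point of the sketch is the design choice rather than the code: once every BU registers its models in one shared inventory, questions about count, ownership and monitoring status become simple queries instead of cross-team investigations.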
Potential risks for companies that operate without the right ModelOps capabilities in place
A recent Forrester report declared: “Your AI Transformation is doomed without ModelOps”. ModelOps is the core capability that enterprises need to deploy, monitor and govern AI/ML models. Companies that do not adopt ModelOps risk having a hard time keeping up with their competition, and they certainly will not overtake or lead it.
Good governance absolutely requires rigorous policies, practices and tools to enforce controls and track steps and changes for auditability, risk management and compliance. The risks of doing ModelOps badly are unreliable business decisioning (that impacts revenue), regulatory and compliance penalties and potential breaches that lead to huge financial and brand costs.
And this is not only true for highly regulated industries such as Financial Services.
- A manufacturer that uses AI to drive product design decisions finds that a machine-learning model made errors, leading to a serious customer safety issue. To defend against liability, the company is challenged to prove that its AI model was properly developed and maintained.
- The HR department in a large retailer uses a third-party software tool to assess employees and make compensation and career advancement decisions. The tool incorporates capabilities that allow a “citizen data scientist” — in this case, an HR manager with no data science or risk management background — to create and deploy AI models to assess employees and determine salaries, bonuses and promotions. The model is deployed and operated without any involvement from the risk management or compliance teams. An employee passed over for promotion claims that the AI model suffers from bias, sues for unfair labor practices and challenges the company to prove that the model is fair.
These are just a few of the real-world challenges that enterprises of all types may soon face as AI becomes more widely and routinely employed. Indeed, Gartner predicts that 50% of AI use cases will be assessed for risk by 2024, and 15% of application leaders will face board-level investigations into AI failures by 2022. Clearly, it’s well past time to pay attention to AI compliance.
How organizations should get started with ModelOps
To Analytics and Data Science teams: involve your IT department and Risk department early on, as they are there to take this operational aspect off your plate, and it will save costs and avoid exposing the entire company to unwanted risk.
Leverage a ModelOps platform to automate and standardize the design of life cycles for production models. Most platforms offer Lego-like building blocks that reduce the time to production, such as out-of-the-box workflows, controls, monitoring and pre-designed remediation processes.
Another important aspect when selecting a ModelOps platform is to respect the needs of Data Scientists and business users. Data Scientists use on average three model creation tools or languages; this number will only grow, and tools will constantly phase in and out. Models run on-prem and in the cloud, and serve multiple disparate applications. This “multi-multi” situation requires an agnostic ModelOps platform.
Final thoughts
ModelOps is a must-have capability for operationalizing AI at scale. It comprises the tools, technologies and practices that enable organizations to deploy, monitor and govern AI/ML models and other analytical models in production applications. ModelOps is about more than moving bits. Deploying models doesn’t end with provisioning infrastructure and copying code. Machine learning models are unique in that they must be constantly monitored while in production and regularly retrained, which requires the collaboration of a host of stakeholders, from data scientists to ops pros.
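As an illustration of what monitoring a production model can involve, the sketch below computes a Population Stability Index (PSI) between a baseline sample of one numeric input and a recent sample. PSI is one commonly used drift measure, chosen here as an example; the article does not prescribe a specific metric. The function name, the equal-width binning and the sample data are assumptions, and the 0.25 threshold used in the usage lines is a frequently quoted rule of thumb, not a standard.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one numeric feature.

    Bins are equal-width over the baseline range; recent values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative data: a baseline and a strongly shifted recent sample.
baseline = [float(v) for v in range(100)]
recent = [v + 50.0 for v in baseline]

psi = population_stability_index(baseline, recent)
needs_review = psi > 0.25  # assumed rule-of-thumb threshold for significant drift
print(round(psi, 3), needs_review)
```

In practice a check like this would run on a schedule for each monitored input and output, with a breach triggering a review or retraining workflow rather than a silent log entry.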
Organizations should start to understand why ModelOps is important, the capabilities they need, and how to develop or acquire them.