Software & Data

AI and Machine Learning in the Oil & Gas Data Stack: Platforms, Models, and What Actually Works

Dr. Mehrdad Shirangi | 28 min read

Editorial disclosure: This article reflects the independent analysis and professional opinion of the author, informed by published research, vendor documentation, industry surveys, and hands-on experience building ML models for energy companies since 2018. No vendor reviewed or influenced this content prior to publication. Where Groundwork Analytics' own work is referenced, it is clearly noted.

The AI-in-oil-and-gas market hit $4.28 billion in 2026 and is growing at 13% CAGR. Upstream operations account for 61% of that spend. Every major operator has at least one AI initiative. Every service company has launched an AI platform. Every industry conference features an AI keynote.

And yet 70% of digital transformation initiatives in the industry remain stuck in pilot phase.

That statistic should give pause to anyone evaluating AI investments. The technology exists. The budget exists. The pilots produce results. But somewhere between "promising proof of concept on 20 wells" and "production deployment across 2,000 wells," something breaks. Usually, it is not the model. It is everything underneath the model: the data plumbing, the operational trust, the retraining infrastructure, the gap between what a data scientist builds in a Jupyter notebook and what a production engineer will actually use at 6 AM when a well is acting up.

This article is a practitioner-level guide to the AI and ML layer of the upstream oil and gas data stack. It covers the platforms available at each tier, the domain-specific AI products that have actual deployments, the physics-informed approaches that bridge the gap between data science and reservoir engineering, the emerging role of LLMs and AI agents, and -- most importantly -- what actually delivers value versus what is still a conference slide.


The ML Platform Layer: General-Purpose Infrastructure

Before domain-specific AI products enter the picture, operators need somewhere to train, track, and deploy machine learning models. This is the ML platform layer -- the infrastructure that data science teams use to build and manage models.

Databricks ML (MLflow)

Databricks has become the platform of choice for progressive operators building internal data science capabilities. Permian Resources runs its entire data stack on Databricks with Dagster orchestration and dbt transformations. Devon Energy and Shell both use Azure Databricks for ML workloads.

The appeal is integration. Databricks unifies the data lake, feature engineering, model training, and model serving in a single platform. MLflow -- the open-source ML lifecycle management tool that Databricks created and now hosts -- tracks experiments, manages model versions, and handles deployment. For an operator that has already adopted Databricks as its lakehouse platform, adding ML capabilities is a natural extension rather than a separate procurement.

Where it works well: Operators with 5+ data science staff, modern cloud data infrastructure, and enough well count (500+) to generate meaningful training datasets. Production forecasting, decline curve analysis, artificial lift optimization, and completion design optimization are the most common use cases.

Where it falls short: Databricks is a horizontal platform. It knows nothing about wells, reservoirs, or production operations. Every domain-specific feature -- connecting to SCADA data, parsing LAS files, implementing petroleum engineering calculations -- must be built from scratch. This is fine for Shell's 500-person data science team. It is a significant barrier for a mid-size Permian operator with two data engineers.

AWS SageMaker

SageMaker is the default ML platform for operators running on AWS, and it is the backbone of Baker Hughes' Leucipa production automation platform. SageMaker provides managed training infrastructure, built-in algorithms, model hosting, and MLOps pipelines through SageMaker Pipelines.

The Baker Hughes relationship is significant. When Baker Hughes deploys Leucipa at an operator, the underlying ML infrastructure runs on SageMaker. Expand Energy's January 2026 contract to deploy Leucipa across thousands of US gas wells means SageMaker is now powering production optimization at one of the largest natural gas producers in the country.

Where it works well: Operators already on AWS, or operators using Baker Hughes Leucipa. SageMaker's managed infrastructure reduces the DevOps burden compared to self-managed Kubernetes deployments.

Where it falls short: Same domain knowledge gap as Databricks. The platform provides compute and tooling but no understanding of oilfield data. Also, AWS holds only about 17% of the O&G cloud market compared to Azure's 57%, which limits the natural user base.

Azure Machine Learning

Azure ML benefits from Microsoft's dominant position in oil and gas cloud infrastructure. Equinor's internal ML platform, EurekaML, is built on Azure ML. Any operator running Azure as its primary cloud -- which is the majority -- can adopt Azure ML with minimal additional procurement.

Azure ML Studio provides a visual interface that appeals to engineers who are not full-time data scientists. The AutoML capabilities can generate reasonable baseline models without extensive ML expertise. Integration with Power BI allows model outputs to flow directly into the dashboards that decision-makers already use.

Where it works well: Operators already committed to the Microsoft ecosystem (Azure, Power BI, Office 365). The EurekaML case at Equinor demonstrates that Azure ML can support enterprise-grade ML operations at a supermajor.

Where it falls short: AutoML models are starting points, not production solutions for complex petroleum engineering problems. The visual interface can create a false sense of confidence -- "we built an AI model" -- when the hard work of feature engineering, data quality validation, and domain-informed model design has not been done.

Google Vertex AI

Vertex AI is the least adopted ML platform in upstream oil and gas, reflecting Google Cloud's smaller market share (roughly 13% of operators). However, Google's OSDU partnership and Aramco's CNTXT joint venture (which is a Google Cloud reseller for MENA) could expand the footprint.

Where it works well: Operators aligned with Google Cloud, or teams that want access to Google's latest ML research (PaLM, Gemini) through a managed platform.

Where it falls short: Limited O&G ecosystem. Fewer integrations with oilfield software vendors. Smaller community of practitioners with O&G-specific Vertex AI experience.

Custom Python (scikit-learn, XGBoost, LightGBM)

The most widely used "ML platform" in oil and gas is not a platform at all. It is a petroleum engineer with a Jupyter notebook running scikit-learn or XGBoost on a laptop. Every data science team in the industry starts here. Most never leave.

This is not necessarily a criticism. XGBoost on a well-prepared dataset, with proper cross-validation and domain-informed feature engineering, frequently outperforms more exotic approaches on common petroleum engineering problems -- production forecasting, artificial lift failure prediction, completion optimization. The challenge is not the modeling. It is the operationalization: scheduling retraining, monitoring for data drift, making predictions accessible to field engineers who will never open a notebook.
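To make "proper cross-validation" concrete: for time-series production data, random K-fold splits leak future information into training. A minimal sketch of an expanding-window (walk-forward) splitter, in plain Python for illustration:

```python
# Walk-forward (expanding-window) cross-validation for time-series
# production data. Each fold trains on strictly earlier months than
# it is tested on, which is what a deployed forecasting model faces.

def walk_forward_splits(n_samples, n_folds=4, min_train=12):
    """Yield (train_indices, test_indices) with an expanding train window."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = min(train_end + fold_size, n_samples)
        yield list(range(train_end)), list(range(train_end, test_end))

# Example: 36 months of production history, 4 validation folds.
for train_idx, test_idx in walk_forward_splits(36):
    # fit XGBoost (or any model) on train_idx months, score on test_idx
    print(f"train on months 0-{train_idx[-1]}, test on {test_idx[0]}-{test_idx[-1]}")
```

An XGBoost model validated this way reports an honest estimate of forward-looking error; the same model validated with random folds will look several points better than it actually is.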


Domain-Specific AI Platforms: Where the Industry Actually Lives

General-purpose ML platforms require operators to build everything from scratch. Domain-specific AI platforms package the models, the data connectors, and the engineering knowledge into products that can be deployed without a dedicated data science team. This is where most AI adoption in oil and gas actually happens.

SLB Lumi + Tela

SLB launched Lumi as its data and AI platform in September 2024, followed by Tela -- its explicitly "agentic" AI assistant -- in November 2025. This is the most significant AI product launch from any service company in the current cycle.

Lumi is the foundation layer: domain-specific foundation models trained on decades of SLB's petrotechnical data, covering subsurface characterization, drilling operations, and production optimization. These are not general-purpose LLMs fine-tuned on oil and gas documents. They are purpose-built models trained on structured well logs, drilling parameters, and reservoir data. SLB's partnership with NVIDIA provides the compute infrastructure, and Mistral provides additional AI infrastructure for Lumi deployment.

Tela is the agentic interface on top of Lumi. It uses an observe-plan-generate-act-learn loop where the system can autonomously interpret well logs, predict drilling issues, optimize equipment parameters, and explain its reasoning in natural language. The December 2025 strategic collaboration with Shell to co-develop AI technologies for subsurface, drilling, and production workflows represents enterprise-grade validation of the approach.

The most concrete proof point is SLB's Autonomous Directional Drilling system (early 2026), where AI adjusts drilling paths in real-time, achieving a reported 30% reduction in drilling time on tested wells.

What this means for operators: SLB is bundling Tela with its existing service relationships. If you are already running Petrel, Eclipse, and DrillOps on the Delfi platform, Tela becomes a natural extension. If you are not an SLB customer, Tela is not available to you. This is an integrated product, not a standalone AI platform.

For a deeper examination of how Tela and other agentic systems work, see Agentic AI for Upstream Oil & Gas: What It Is, What It Isn't, and Why 2026 Is the Inflection Point.

Baker Hughes Leucipa + Cordant

Baker Hughes has taken a different architectural approach: agents that each focus on a single task, running within two platform layers.

Leucipa is the automated field production solution -- AI-driven production optimization deployed on AWS. Its January 2026 deal with Expand Energy to deploy across thousands of US gas wells is the largest known agentic AI production operations contract in the industry.

Cordant (Release 25.2.1, September 2025) is the broader industrial AI software platform with agents for process optimization, asset strategy, and asset health. Each agent is narrowly scoped: one monitors compressor health, another optimizes gas lift injection rates, another evaluates asset replacement timing. This agent-per-task architecture avoids the risks of a single omniscient AI system while delivering concrete automation at each point.

Baker Hughes' GenAI partnership with EPAM adds natural language interfaces to this infrastructure, allowing engineers to query production data and receive synthesized answers without writing SQL.

What this means for operators: Leucipa is deployable without a dedicated data science team. Baker Hughes manages the models, the retraining, and the infrastructure. The trade-off is that you are locked into Baker Hughes' optimization logic and AWS infrastructure. For operators who want turnkey AI production optimization without building internal ML capabilities, this is currently the most proven option.

Cognite Atlas AI (on Cognite Data Fusion)

Cognite occupies a unique position in the stack. Cognite Data Fusion (CDF) is not an AI platform per se -- it is an industrial data platform that contextualizes OT and IT data into a unified knowledge graph. Atlas AI sits on top of CDF, delivering pre-built AI agents and a low-code workbench for building custom agents.

The results are concrete: Aker BP cut root-cause analysis time from weeks or months to hours using Cognite agents. TotalEnergies expanded its partnership in September 2025 to deploy Cognite across its entire upstream asset base over three years. The January 2026 Snowflake partnership integrates CDF with Snowflake's AI Data Cloud.

Cognite's customer roster -- Equinor (long-term collaboration embedded in the Omnia platform), Saudi Aramco (CNTXT joint venture, exclusive MENA reseller), BP, OMV, Wintershall Dea, Cairn Oil & Gas -- represents substantial operational scale.

What this means for operators: Cognite is the most expensive option but also the most comprehensive data foundation. If your primary challenge is data fragmentation -- which McKinsey says is the number one barrier to AI adoption in oil and gas -- Cognite addresses the root cause rather than papering over it with models. The AI agents then operate on clean, contextualized data, which is why they actually work.

The caveat is cost and complexity. Cognite deployments are enterprise undertakings measured in months and millions of dollars. This is supermajor and large independent territory.

Halliburton DecisionSpace 365 (DS365.ai)

Halliburton has quietly deployed more AI and ML models at scale than any other service company: 60+ models in production across drilling, completions, and reservoir workflows. The DS365.ai suite includes AI-driven lithology interpretation, intelligent fracturing (co-developed with Chevron), geosteering optimization, and well planning.

DecisionSpace 365 runs on Halliburton's iEnergy hybrid cloud -- the industry's first hybrid cloud for E&P -- with an AWS-hosted Essentials tier available on a fixed monthly subscription. The use of Azure OpenAI (GPT-4) and Microsoft Fabric indicates that Halliburton is integrating LLM capabilities into its existing ML infrastructure.

What this means for operators: Halliburton's approach is less flashy than SLB's Tela or Cognite's Atlas AI, but 60+ production-deployed models represent a depth of operational ML that the industry should take seriously. The hybrid cloud architecture and subscription pricing make DS365 more accessible to operators who are not ready for full cloud commitments.

Other Domain Players

Palantir Foundry powers BP's digital twin of production operations (covering 2 million+ sensors across Gulf of Mexico, North Sea, and Oman) and was selected by ExxonMobil for analytics and BI in 2022. Palantir's December 2025 expanded partnership with Accenture specifically targets AI acceleration in the energy sector.

Novi Labs provides ML-driven production forecasting and well planning for unconventional plays. Trusted by Shell, ExxonMobil, Chevron, and Devon, Novi Labs raised $35 million in a June 2025 funding round. Its dynamic PUD/PDP forecasting models represent one of the more mature ML applications in upstream.

Corva has evolved from a drilling data visualization platform into an AI-powered drilling intelligence system. Corva Copilot provides predictive insights for drilling and completions, and the April 2025 partnership with Nabors (RigCLOUD Powered by Corva) combines rig contractor hardware with AI analytics. A 36% average improvement in rate of penetration from their Predictive Drilling technology is a concrete ROI data point.

Kelvin operates perhaps the most genuinely autonomous AI system in upstream production. Its closed-loop optimization platform autonomously adjusts gas lift injection rates, plunger lift timing, and other operating parameters based on real-time SCADA data. This is not LLM-based, but it is functionally agentic: the system observes, decides, and acts without human approval for routine adjustments.

Ambyint delivers similar closed-loop AI for artificial lift, specifically rod pump and ESP optimization. Like Kelvin, it connects directly to SCADA and adjusts setpoints autonomously.


Physics-Informed ML: The Bridge Between Data Science and Reservoir Engineering

Pure machine learning models -- neural networks trained on production history with no physical constraints -- have a well-documented failure mode in petroleum engineering. They overfit to training data, extrapolate nonsensically beyond the training distribution, and produce forecasts that violate basic conservation laws. A neural network that predicts a well will produce more oil in year five than in year one is useless regardless of its training loss.

Physics-informed machine learning (PIML) addresses this by embedding physical constraints directly into the model architecture or training process. This is not a theoretical concept. It is the difference between AI tools that petroleum engineers trust and AI tools that get deleted after the pilot.

Physics-Informed Neural Networks (PINNs)

PINNs embed partial differential equations -- the governing equations of fluid flow in porous media, for example -- as penalty terms in the neural network loss function. The model is simultaneously trained to fit observed data and to satisfy the physics. This means that even in regions with sparse data, the model produces physically consistent predictions.

Active research includes domain decomposition approaches that break large reservoir models into subdomains, Hard-Soft PINNs (HS-PINNs) for pressure simulation without labeled data, and hybrid architectures that combine PINNs with traditional reservoir simulation.

Reality check: PINNs remain mostly academic. The published results from Stanford, UT Austin, and KAUST are promising, but production deployments of full PINNs for reservoir simulation are rare. The computational cost of training PINNs on realistic 3D reservoir models is still prohibitive for routine engineering use. What operators actually deploy are simpler surrogate models -- reduced-order physics models augmented with ML -- rather than end-to-end PINNs.

Hybrid Physics-ML Models

The practical sweet spot is hybrid models that combine ML flexibility with physics constraints without requiring full PDE integration. Examples include:

  • Constrained decline curve models: ML models for production forecasting that are constrained to honor material balance, minimum decline rates, and known recovery factor bounds. These outperform both pure Arps analysis and pure ML approaches on most unconventional well datasets.
  • Physics-informed feature engineering: Instead of feeding raw time series to a model, engineers create features based on physical understanding -- flowing material balance variables, dimensionless time groups, completion-normalized production rates -- and then train conventional ML models (XGBoost, random forests) on these engineered features. The physics is in the features, not the model architecture.
  • Surrogate reservoir models: ML models trained on thousands of reservoir simulation runs to approximate the simulator's behavior at a fraction of the computational cost. These are used for real-time optimization, uncertainty quantification, and history matching where running the full simulator for every scenario is impractical.
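To make the first bullet concrete, here is a toy sketch (not any vendor's implementation) of a hyperbolic Arps fit in which the b-factor and decline rate are held to physically plausible bounds. A real implementation would use constrained nonlinear least squares; the constraint logic, not the optimizer, is the point:

```python
# Toy constrained decline-curve fit: Arps hyperbolic model with the
# b-factor bounded to [0, 1] and the initial decline rate bounded to a
# plausible range, fit by coarse grid search over the bounded space.
from math import exp

def arps_rate(qi, di, b, t):
    """Hyperbolic Arps rate; b -> 0 reduces toward exponential decline."""
    if b < 1e-6:
        return qi * exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

def fit_constrained(times, rates, b_bounds=(0.0, 1.0), di_bounds=(0.05, 2.0)):
    """Grid-search (qi, di, b) minimizing squared error within physical bounds."""
    qi = max(rates)  # anchor the initial rate at the observed peak
    best, best_err = None, float("inf")
    for bi in range(21):
        b = b_bounds[0] + (b_bounds[1] - b_bounds[0]) * bi / 20
        for di_i in range(40):
            di = di_bounds[0] + (di_bounds[1] - di_bounds[0]) * di_i / 39
            err = sum((arps_rate(qi, di, b, t) - q) ** 2
                      for t, q in zip(times, rates))
            if err < best_err:
                best, best_err = (qi, di, b), err
    return best
```

An unconstrained ML fit can happily return a b-factor of 3 on noisy early-time data and forecast absurd EUR; the bounds make that failure mode impossible by construction.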

For a detailed treatment of why pure ML approaches fail for decline curve analysis and how physics-informed alternatives perform, see Why Your Decline Curve AI Keeps Getting It Wrong: Physics-Informed vs. Pure ML Approaches.

What to Ask Your AI Vendor

When evaluating AI products that claim physics-informed capabilities, the questions that separate substance from marketing are:

  1. What physical constraints does the model enforce? A vendor that cannot name specific equations or conservation laws is probably not doing physics-informed ML.
  2. How does the model perform on wells outside the training distribution? The entire point of physics constraints is improved generalization. Ask for out-of-distribution test results, not just test-set accuracy.
  3. What happens when the model encounters conditions it has never seen? A physics-informed model should degrade gracefully toward physics-based predictions. A pure ML model produces garbage.
  4. Can the model explain its predictions in physical terms? If the model says production will decline by 15% next quarter, can it attribute that to pressure depletion, interference, or mechanical issues?

AI Agents and LLMs in Oil & Gas: The 2025-2026 Inflection

Large language models and AI agents represent the newest -- and most hyped -- layer of the AI stack in oil and gas. The adoption statistics tell a clear story: 13% of O&G organizations have deployed agentic AI, and 49% plan to in 2026. That 49% number should be read with appropriate skepticism (planning to deploy and actually deploying are very different activities), but the direction is unmistakable.

What LLMs Can Actually Do for Petroleum Engineers Today

The most immediate value of LLMs in petroleum engineering is not autonomous decision-making. It is information retrieval and synthesis from unstructured data.

Document analysis: A typical mid-size operator has thousands of well files -- daily drilling reports, completion summaries, workover records, regulatory filings -- scattered across shared drives, document management systems, and filing cabinets. An LLM connected to this document corpus can answer questions that previously required hours of manual search: "What cement blend did we use on the Wolfcamp B wells drilled in 2023?" or "Which wells in the southern section have had casing integrity issues?"

Code generation for engineering workflows: Petroleum engineers who use Python for data analysis can use Claude or GPT to write and debug scripts for decline curve fitting, production data cleaning, LAS file parsing, and well spacing optimization. This does not replace engineering judgment, but it dramatically accelerates the cycle from question to analysis.
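As an illustration of the kind of script this enables, here is a minimal reader for the curve and data sections of an LAS 2.0 well-log file. This sketch handles only clean, unwrapped files; production workflows should use a maintained library such as lasio:

```python
# Minimal LAS 2.0 reader of the kind an LLM can draft in seconds.
# Real well-log files have wrapped lines, null values, and vendor
# quirks, so production code should use a tested library (e.g. lasio),
# but the structure of the format really is this simple.

def read_las_curves(text):
    """Return (curve_names, rows) from the ~Curve and ~ASCII sections."""
    curves, rows, section = [], [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("~"):
            section = line[1].upper()  # section letter: C, A, W, ...
            continue
        if section == "C":             # curve definitions: MNEM.UNIT : desc
            curves.append(line.split(".")[0].strip())
        elif section == "A":           # space-delimited numeric data rows
            rows.append([float(v) for v in line.split()])
    return curves, rows

sample = """~Curve
DEPT.FT  : depth
GR.GAPI  : gamma ray
~ASCII
1000.0  85.2
1000.5  90.1
"""
names, data = read_las_curves(sample)
# names -> ['DEPT', 'GR']; data -> [[1000.0, 85.2], [1000.5, 90.1]]
```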

Report generation: Drafting daily production reports, morning briefings, and regulatory submissions from structured production data. This is not cutting-edge AI -- it is template-filling with natural language polish -- but the time savings are real and immediate.

Knowledge retrieval from technical literature: SPE papers, vendor manuals, regulatory codes, and internal best practices can be indexed and queried conversationally. An engineer can ask "What is the recommended maximum dogleg severity for 5.5-inch casing in a horizontal lateral?" and get an answer with source citations, rather than searching through three technical manuals.

Where LLMs Fall Short in Petroleum Engineering

The limitations are equally important to understand:

Numerical precision: LLMs are unreliable for calculations. They can set up the equation but should not be trusted to solve it without verification. Any workflow that involves reserves estimation, economic analysis, or regulatory reporting must treat LLM outputs as drafts that require human validation.

Real-time operational decisions: No responsible operator should let an LLM decide whether to shut in a well, change gas lift rates, or modify drilling parameters without human oversight. The stakes are too high, the domain knowledge required is too specialized, and the consequences of hallucination are too severe.

Data access: An LLM without access to your production database, SCADA system, and engineering files is limited to its training data, which does not include your wells. This is the fundamental connectivity problem that MCP (Model Context Protocol) is designed to solve.

MCP: Connecting LLMs to Oilfield Data

The Model Context Protocol is an open standard (developed by Anthropic) that provides a structured way for AI models to access external data sources and tools. Think of it as a universal adapter between an LLM and your data systems.

In oil and gas, the MCP opportunity is enormous and almost entirely untapped. As of early 2026, only three energy-related MCP servers exist publicly: one for commodity prices (OilpriceAPI), one for Oil & Gas RAG queries (kukuhtw), and one for EIA energy data (ebarros23). Compare that to 8,600+ MCP servers across all industries.

No MCP servers exist for LAS well logs, WITSML drilling data streams, OSDU platforms, reservoir simulation decks, or production databases. This means that every operator deploying LLMs for petroleum engineering workflows must build custom integrations from scratch -- or accept that their AI assistants cannot access the data that matters.

This is exactly the problem that petro-mcp, an open-source MCP server for petroleum engineering data, is designed to address. It provides tools for reading LAS files, querying production data, analyzing decline curves, and connecting AI models to the data formats that petroleum engineers work with daily. (Disclosure: petro-mcp is developed by Groundwork Analytics. The project is open source on GitHub.)

For a complete treatment of MCP architecture and its application to oilfield data, see MCP Servers for Oilfield Data: Connecting LLMs to Well Logs, Production Data, and Reservoir Models.

The Agentic AI Landscape: What Is Real

Agentic AI -- systems that can plan multi-step workflows, call tools, query databases, and take actions autonomously -- is the frontier. The major deployments as of early 2026:

| Platform | Architecture | Stage | Key Customer |
| --- | --- | --- | --- |
| SLB Tela | Observe-plan-act-learn loop on Lumi foundation models | Production | Shell (co-development) |
| Baker Hughes Leucipa/Cordant | Agent-per-task (optimization, strategy, health) | Production, scaling | Expand Energy (thousands of wells) |
| Cognite Atlas AI | Pre-built agents + low-code workbench on CDF | Production | TotalEnergies (full upstream), Aker BP, Equinor |
| Corva Copilot | AI assistant for drilling/completions intelligence | Production | Nabors partnership |
| Halliburton DS365.ai | 60+ ML models, GPT-4 integration, autonomous frac | Production at scale | Chevron (intelligent frac) |
| Kelvin | Closed-loop autonomous production optimization | Production | Multiple operators |

The common pattern across all successful deployments: the AI operates on a narrow, well-defined scope with clear physical constraints. Broad, general-purpose AI agents for petroleum engineering do not exist in production. What exists are specialized agents for specific tasks -- optimize this gas lift system, interpret this well log section, predict when this ESP will fail -- operating within guardrails.

For the definitive guide to evaluating agentic AI maturity and use cases, see Agentic AI for Upstream Oil & Gas.


Deployment Challenges: Why Models Die After the Pilot

Building a model that works on historical data is the easy part. Keeping it working in production, getting engineers to trust it, and demonstrating enough value to justify continued investment -- that is where most AI initiatives in oil and gas fail.

Model Drift and Retraining

Production data is non-stationary. Wells decline. Completion designs evolve. Parent-child interference changes the production signature of an entire section. Regulatory changes alter operating practices. A model trained on 2023 data will degrade on 2025 data, and the degradation is often silent -- the model still produces outputs, but the outputs are increasingly wrong.

The retraining problem: Most operators do not have automated retraining pipelines. The data scientist who built the model retrained it manually during the pilot. When the data scientist left (or moved on to the next project), retraining stopped. Six months later, the model is making predictions based on stale patterns, and the field engineers who were already skeptical now have confirmation that "the AI does not work."

What works: Scheduled retraining pipelines with automated data validation checks. MLflow (or equivalent) model versioning that allows rollback when a retrained model performs worse than its predecessor. Monitoring dashboards that track prediction accuracy against actual production -- and alert when accuracy degrades below a threshold. This is MLOps, and it is not glamorous, but it is the difference between a pilot and a production deployment.
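The monitoring piece is less work than it sounds. A sketch of the degradation alert described above, with illustrative thresholds:

```python
# Sketch of a model-drift alarm: compare recent rolling prediction
# error against the error level measured at deployment, and flag the
# model for retraining when it degrades past a threshold. The window
# and tolerance values here are illustrative, not recommendations.

def mean_abs_error(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def drift_alarm(pred, actual, baseline_mae, window=30, tolerance=1.5):
    """True when rolling MAE over the last `window` points exceeds
    `tolerance` times the MAE measured when the model was deployed."""
    recent_mae = mean_abs_error(pred[-window:], actual[-window:])
    return recent_mae > tolerance * baseline_mae
```

Wire the alarm to a retraining trigger (or at minimum a notification) and the silent-degradation failure mode described above becomes a visible, actionable event.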

Trust and Adoption

The trust problem in oil and gas AI is not irrational. Production engineers have decades of domain expertise, and they have seen enough technology initiatives come and go to be appropriately skeptical. When an AI model recommends changing the gas lift injection rate, the engineer wants to understand why -- not because they are resistant to technology, but because they are responsible for wells that produce revenue every day and fail spectacularly when operated incorrectly.

What works: Models that show their reasoning. Uncertainty quantification that tells the engineer "I am 90% confident this ESP will fail within 14 days" rather than "this ESP will fail on March 15." Gradual deployment that starts with recommendations (human-in-the-loop) before progressing to autonomous actions. And, critically, starting with use cases where the worst case of a wrong prediction is a missed optimization opportunity, not a well failure.

Data Quality at the Foundation

No amount of ML sophistication compensates for bad input data. The most common data quality issues that kill ML models in oil and gas:

  • Sensor drift: Pressure transmitters that read 5% high after six months of operation. The model learns the drift as a real production trend.
  • Missing data gaps: SCADA communication outages that create gaps in the time series. Imputation methods introduce artifacts that the model treats as signal.
  • Allocation errors: Production allocated to the wrong well. The model dutifully learns to predict the wrong well's production profile.
  • Unit inconsistencies: MCF versus MSCF. Calendar day rates versus producing day rates. Barrels of oil versus barrels of liquid. One unit conversion error propagated through a training dataset can corrupt an entire model.

The boring, essential work of data validation, cleaning, and governance is the foundation that every successful AI deployment is built on. If your SCADA data has gaps, your production allocation is approximate, and your well identifiers are inconsistent across systems, your AI project will fail regardless of which ML platform you choose.


MLOps in Oil and Gas: The Missing Infrastructure

MLOps -- the practices and tools for managing ML models in production -- is well established in the technology industry and almost nonexistent in oil and gas. The gap is striking.

What MLOps Looks Like

A mature MLOps practice includes:

  • Experiment tracking: Recording every model training run with its hyperparameters, training data snapshot, and evaluation metrics (MLflow, Weights & Biases, Neptune).
  • Model registry: Versioned storage of trained models with metadata about training data, performance benchmarks, and deployment status.
  • Automated retraining: Scheduled or trigger-based retraining when new data arrives or model performance degrades.
  • Feature stores: Centralized repositories of engineered features that multiple models can share.
  • Model monitoring: Real-time tracking of prediction accuracy, data drift, concept drift, and feature importance changes.
  • CI/CD for models: Automated testing and deployment pipelines that validate a retrained model against holdout data before promoting it to production.
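The CI/CD gate in the last bullet reduces to a small amount of logic. An illustrative sketch, assuming error metrics have already been computed on a shared holdout set:

```python
# The promotion gate from the CI/CD bullet: a retrained candidate
# replaces the incumbent only if it beats it on held-out data by a
# margin, which prevents automated retraining from silently shipping
# a worse model. The margin value is illustrative.

def should_promote(candidate_err, incumbent_err, min_improvement=0.02):
    """Promote only if the candidate's holdout error is at least
    `min_improvement` (fractional) better than the incumbent's."""
    return candidate_err < incumbent_err * (1.0 - min_improvement)

# e.g. incumbent MAE of 100 bbl/d: a candidate must score below 98 to ship
```

Combined with a model registry, the same comparison supports rollback: if a promoted model underperforms in production, redeploying the previous registered version is a one-line operation rather than an archaeology project.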

Where Oil and Gas Stands

Most operators are at Stage 0 or Stage 1 of MLOps maturity:

Stage 0 (most small and mid-size operators): No ML models in production. Any analysis is ad hoc, done in Jupyter notebooks or Excel, not operationalized.

Stage 1 (progressive mid-size and large independents): A few models deployed, typically production forecasting or artificial lift optimization. Models trained manually, updated infrequently, monitored informally ("does the engineer still use it?"). No automated retraining. No model registry.

Stage 2 (a few large independents and supermajors): MLflow or equivalent for experiment tracking. Some automated retraining. Power BI or Grafana dashboards showing model predictions versus actuals. Still manually managed by a data science team.

Stage 3 (supermajors with dedicated ML engineering teams): Full MLOps stack with automated training pipelines, model monitoring, feature stores, and CI/CD. Equinor's EurekaML on Azure ML, Shell's Databricks ML infrastructure, and BP's Palantir Foundry approach this level.

The opportunity for most operators is not to leap to Stage 3. It is to move from Stage 0 to Stage 1 with a single, well-chosen use case -- and to build it on infrastructure that can grow to Stage 2 without a rewrite.


What Actually Works, by Company Size

The AI stack that makes sense for a 10,000-well supermajor is irrelevant for a 200-well Permian operator. Here is a realistic assessment of what works at each tier.

Supermajors (ExxonMobil, Chevron, Shell, BP, TotalEnergies, Equinor)

Budget: $100M-$1B+ annual digital/IT spend. In-house data science teams of 100-500+ people.

What they run: Custom ML models on Databricks, Azure ML, or SageMaker. Palantir Foundry or Cognite Data Fusion as the data platform. Service company AI products (SLB Tela, Halliburton DS365.ai) for domain-specific applications. Full MLOps with experiment tracking, automated retraining, and model monitoring.

Key stat: Equinor reported $130 million in AI-driven savings in 2025. Aramco reported $1.8 billion in AI value in 2024. These numbers are real, but they represent the ceiling of what is possible with massive investment and organizational commitment.

Large Independents (Devon, EOG, Diamondback, ConocoPhillips)

Budget: $20-100M annual digital/IT spend. Data science teams of 5-30 people.

What they run: Databricks or Snowflake as the data platform. Vendor AI products for specific use cases. Mix of custom Python models and service company tools.

Mid-Size Operators (Permian Resources, Matador, Crescent, Ring, 500-5,000 wells)

Budget: $5-20M annual IT spend. 0-3 dedicated data science staff.

What works:

  • Vendor-delivered AI products that do not require internal ML expertise (Baker Hughes Leucipa, Corva Copilot, Ambyint)
  • XGBoost or random forest models built by a petroleum engineer with Python skills
  • LLM-assisted document search and engineering analysis (Claude or GPT with domain prompting)
  • MCP-based connections between LLMs and production databases (early adopter opportunity)

The mid-size operator trap: Attempting to replicate what Devon or Shell does, at one-tenth the budget and one-twentieth the staff. The result is a half-built data lake, an abandoned Databricks environment, and a production engineer who went back to Excel. The better approach: pick one high-value use case, use a vendor product or a simple custom model, prove ROI, then expand.
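To make the "simple custom model" path concrete: an exponential Arps decline fit needs nothing beyond the standard library, and a Python-literate engineer can build it in an afternoon. This is a sketch under simplifying assumptions (exponential rather than hyperbolic decline, noise-free synthetic rates), not a substitute for proper decline analysis or an ML forecast:

```python
import math

def fit_exponential_decline(t, q):
    """Least-squares fit of q(t) = qi * exp(-D * t) via ln(q) = ln(qi) - D*t.
    t in months, q in bopd. Returns (qi, D) with D as nominal decline/month."""
    n = len(t)
    y = [math.log(rate) for rate in q]
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) \
        / sum((ti - t_bar) ** 2 for ti in t)
    return math.exp(y_bar - slope * t_bar), -slope

def forecast(qi, D, months):
    """Project monthly rates forward from the fitted parameters."""
    return [qi * math.exp(-D * m) for m in range(months)]

# Synthetic well: 500 bopd initial rate, 5%/month nominal decline
t = list(range(24))
q = [500 * math.exp(-0.05 * m) for m in t]
qi, D = fit_exponential_decline(t, q)
print(f"qi = {qi:.0f} bopd, D = {D:.3f}/month")
```

Proving ROI with a model this simple, then graduating to XGBoost or a vendor product once the data pipeline exists, is the incremental path the "trap" paragraph argues for.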

Small Operators (100-500 wells)

Budget: $100K-$2M annual IT spend. Zero data science staff.

What works:

  • Vendor SCADA with built-in analytics (eLynx, zdSCADA) -- anomaly detection embedded in the platform, no ML expertise required
  • LLM assistants (Claude, GPT) for ad-hoc analysis, report writing, and regulatory research -- zero infrastructure required
  • MCP servers that connect LLMs directly to production data and LAS files, bypassing the need for a data platform entirely

The opportunity: Small operators are the most underserved segment in oil and gas technology. The gap between supermajor tech stacks ($100M+/year) and small operator tech stacks (Excel + GreaseBook) is enormous. LLMs with domain-specific data access (via MCP or similar) could be the first AI technology that delivers value at this tier without requiring a data science team.
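To make the MCP idea concrete: the tool such a server would expose can be very small. Below is a sketch of a LAS well-header reader in plain Python; in a real deployment, a function like this would be registered as a tool with an MCP server SDK so an LLM can call it on demand. The LAS fragment and field values are illustrative:

```python
def read_las_well_section(las_text):
    """Extract mnemonic/unit/value/description entries from the ~Well
    section of a LAS file. LAS line format: MNEM.UNIT  VALUE : DESCRIPTION"""
    info, in_well = {}, False
    for line in las_text.splitlines():
        line = line.strip()
        if line.startswith("~"):
            # Track whether we are inside the ~Well section
            in_well = line.upper().startswith("~W")
            continue
        if not in_well or not line or line.startswith("#"):
            continue
        head, _, desc = line.partition(":")
        mnem, _, rest = head.partition(".")
        # Unit is the run of non-space characters right after the dot
        i = 0
        while i < len(rest) and not rest[i].isspace():
            i += 1
        info[mnem.strip()] = {
            "unit": rest[:i],
            "value": rest[i:].strip(),
            "description": desc.strip(),
        }
    return info

# Illustrative LAS fragment
sample = """~Well Information
WELL.    SMITH 14-22H : WELL NAME
FLD .    SPRABERRY    : FIELD
~Curve Information
DEPT.FT               : Measured depth
"""
print(read_las_well_section(sample)["WELL"]["value"])  # SMITH 14-22H
```

Wrap a handful of functions like this (well headers, daily production, downtime codes) behind an MCP server, and an off-the-shelf LLM assistant can answer field questions without the operator ever standing up a data platform.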


The Hype vs. Reality Scorecard

Based on actual deployments, customer references, and measurable outcomes as of early 2026:

Delivering Real Value

| Technology | Evidence | Confidence |
| --- | --- | --- |
| Artificial lift optimization (ML-based) | Ambyint, Kelvin deployed at scale; measurable production uplift | High |
| Drilling optimization AI | SLB autonomous drilling (30% time reduction); Corva (36% ROP improvement) | High |
| Production anomaly detection | Widely deployed; embedded in SCADA platforms; clear ROI | High |
| Physics-informed production forecasting | Hybrid models outperform pure Arps and pure ML; Novi Labs commercial | High |
| NLP for document search and knowledge retrieval | Deployed at supermajors; GenAI assistants in Baker Hughes/EPAM, Corva Copilot | Medium-High |

Promising but Early

| Technology | Status | Watch For |
| --- | --- | --- |
| Agentic AI for production operations | SLB Tela, Baker Hughes Leucipa deployed at scale -- primarily at supermajors | Mid-size operator deployments; cost reduction |
| Autonomous completions optimization | Halliburton-Chevron intelligent fracturing in field trials | Published results with well count and economics |
| LLM-powered petroleum engineering | Ad hoc use by individual engineers; no institutional deployments | Standardized tooling (MCP servers, domain fine-tuning) |
| Full-stack MLOps in E&P | Stage 2-3 at a handful of supermajors | Managed MLOps platforms for E&P |

Still Mostly Hype

| Technology | Why |
| --- | --- |
| Full PINNs replacing reservoir simulation | Computationally prohibitive for realistic 3D models; mostly academic |
| General-purpose AI agents for "all petroleum engineering" | No production deployments; narrow agents work, broad agents do not |
| "Digital twin" that runs a physics simulation in real time | Most "digital twins" are dashboards with a marketing name |
| AI replacing petroleum engineers | The shortage of engineers is real; AI augments their productivity, not their existence |

Where This Is Heading

Three trends will shape the AI layer of the oil and gas data stack over the next 2-3 years:

1. AI moves down-market. Today, meaningful AI capability is confined to supermajors and large independents with seven-figure budgets. The combination of cheaper LLM inference, MCP-based data connectivity, and vendor products designed for smaller operators will make AI accessible to the mid-size and small operator tiers. This is where the largest unserved market exists.

2. Physics-informed approaches become the default. Pure ML models will persist for problems where physics constraints are hard to formulate (e.g., equipment failure prediction from vibration data). But for core petroleum engineering applications -- production forecasting, reservoir characterization, completion optimization -- physics-informed hybrid models will become standard practice rather than a research niche.

3. The agent layer matures from demos to infrastructure. SLB Tela, Baker Hughes Leucipa, and Cognite Atlas AI have established that agentic AI can work in upstream oil and gas. The next phase is standardization: open protocols like MCP that allow agents to access data regardless of vendor, composable agent architectures that operators can customize, and governance frameworks that define what an agent can and cannot do autonomously.

The operators who benefit most from these trends will not be the ones who buy the most expensive AI platform. They will be the ones who solve the data problem first. Clean, accessible, well-governed data is the prerequisite. Everything else -- the ML models, the LLM agents, the physics-informed forecasts -- is built on that foundation.


Need help building your AI and ML data stack? Get in touch.
