The Petroleum Engineering Skills Gap: What Universities Aren't Teaching and What Operators Desperately Need

Dr. Mehrdad Shirangi | Published by Groundwork Analytics LLC

Editorial disclosure

This article reflects the independent analysis and professional perspective of the author, informed by a decade of experience at the intersection of petroleum engineering, optimization, and AI. No university, operator, or vendor reviewed or influenced this content prior to publication. Where we reference Groundwork Analytics' open-source work, we say so explicitly.


Petroleum engineering has always been a cyclical profession. Booms bring hiring surges. Busts bring layoffs and program closures. Students choose PE when oil is at $100 and switch to computer science when it drops to $40. The industry has lived with this pattern for decades.

But the skills gap facing the profession in 2026 is not cyclical. It is structural. And it has nothing to do with oil prices.

The gap is between what petroleum engineering programs teach and what operators actually need from the engineers they hire. The traditional PE curriculum -- reservoir simulation, well testing, petrophysics, drilling engineering, production operations -- remains essential. Nobody is arguing otherwise. But it is no longer sufficient. Operators are increasingly looking for engineers who can also write Python scripts to automate workflows, query production databases with SQL, build dashboards from SCADA data, apply machine learning to well performance prediction, and work with AI tools that are reshaping how engineering decisions get made.

Most PE programs have not caught up. The result is a graduating class that can run a material balance calculation by hand but cannot write a for-loop, that can interpret a well test but cannot pull the test data from a database, and that can describe how a reservoir simulator works but has never seen a real production dataset.

This article examines the gap in detail: what it looks like, why it exists, and what students, faculty, and departments can do about it. The perspective here is informed by my own path through a Stanford PhD in Energy Systems Optimization, years of building AI-powered solutions for upstream operators, and direct observation of what the industry actually asks for when it hires.


What the Traditional PE Curriculum Gets Right

Before discussing what is missing, it is worth being explicit about what PE programs do well. The core curriculum exists for good reasons, and none of what follows should be read as an argument to abandon it.

A well-designed PE program teaches:

Reservoir engineering fundamentals. Material balance, Darcy's law, multiphase flow in porous media, decline curve analysis, well test interpretation. These are the physics of the subsurface, and they are non-negotiable. No amount of Python or machine learning replaces the need to understand why a pressure transient looks the way it does.

Drilling engineering. Wellbore stability, directional drilling, casing design, hydraulics, cementing. The mechanical engineering of getting a hole in the ground safely and efficiently.

Production engineering. Artificial lift selection, nodal analysis, surface facility design, flow assurance. The practical challenge of getting hydrocarbons from the reservoir to the surface and keeping them flowing.

Petrophysics. Log interpretation, core analysis, formation evaluation. The art and science of characterizing what is in the rock.

Reservoir simulation. Numerical methods for solving flow equations, history matching, forecasting. The computational backbone of field development planning.

These subjects form the foundation of the profession. A petroleum engineer who does not understand them is not a petroleum engineer. The problem is not that these topics are taught. The problem is that they are taught as if the tools used to apply them in the field have not fundamentally changed.


What Operators Are Actually Hiring For in 2026

Pull up any job board -- LinkedIn, Indeed, SPE's career site -- and search for petroleum engineering roles posted in the last six months. The pattern is unmistakable.

Entry-level and mid-career PE positions at operators increasingly list requirements that did not appear in job postings a decade ago:

  • Python or R programming -- not as a "nice to have" but as a listed requirement or strong preference
  • SQL and database querying -- ability to pull data from production databases, data warehouses, and SCADA historians
  • Data visualization -- Spotfire, Power BI, Tableau, or Python libraries like Matplotlib and Plotly
  • Machine learning fundamentals -- experience with scikit-learn, regression models, classification, or time-series forecasting
  • Cloud platform familiarity -- AWS, Azure, or GCP, particularly for data storage and compute
  • Version control -- Git and GitHub for collaborative development

Consider some specific examples from 2025-2026 job postings:

Permian Resources, one of the largest pure-play Permian Basin operators, built its data infrastructure on Databricks, Dagster, and dbt. Their data engineering and analytics roles require Python, SQL, and cloud platform experience as baseline qualifications. When they hire engineers who will interact with production data, they expect those engineers to navigate this stack.

Devon Energy, ConocoPhillips, and Pioneer (now part of ExxonMobil) have all posted roles in the past year seeking petroleum engineers with Python scripting capabilities for "workflow automation" and "data-driven decision making."

Chord Energy reported deploying AI optimization on 99% of its rod lift wells. The engineers managing those systems need to understand not just artificial lift physics but also the data pipelines, model outputs, and monitoring dashboards that make the AI work.

The shift is not limited to large operators. Mid-size E&Ps, including companies backed by private equity firms like EnCap, NGP, and KKR, are under pressure to run lean and generate returns quickly. That means fewer engineers doing more with data tools, not more engineers doing things manually. When Crescent Energy, a KKR-backed operator, brought in a CIO with deep SCADA and AI/ML experience, the signal was clear: digital capability is not optional, even at mid-cap scale.

This is the reality of the job market. PE programs that prepare students only for the technical interview -- reservoir simulation theory, drilling calculations, petrophysics interpretation -- are leaving them unprepared for the actual work environment they will enter.


The Great Crew Change: Opportunity and Crisis

The "great crew change" has been discussed in petroleum engineering circles for two decades. The concept is straightforward: a large cohort of experienced engineers who entered the industry during the 1970s and 1980s boom cycles are retiring, and there are not enough incoming engineers to replace them.

The trend is well documented. SPE membership data and Bureau of Labor Statistics projections both point to a significant workforce shortfall as experienced professionals exit, and shifts in SPE's own membership demographics reflect the same pattern.

But the crew change is not just a headcount problem. It is a knowledge transfer crisis. The engineers retiring carry decades of tacit knowledge -- pattern recognition built from watching thousands of wells, intuition about reservoir behavior that never made it into a database, operational judgment honed through field experience that no textbook captures.

Here is where the skills gap intersects with the crew change in a way that creates both crisis and opportunity:

The crisis: If incoming engineers lack the digital skills to build systems that capture, organize, and learn from operational data, the tacit knowledge walking out the door with retiring engineers is lost permanently. It does not matter how good your reservoir engineering fundamentals are if you cannot build the data infrastructure to preserve institutional knowledge.

The opportunity: Engineers who combine PE domain expertise with data engineering and AI skills are extraordinarily valuable precisely because they can bridge this gap. They can build the AI agents and data systems that encode expert knowledge into scalable, repeatable workflows. They can create the tools that make a 25-year-old with three years of experience as effective as someone with fifteen -- not by replacing judgment, but by giving them access to the accumulated data and patterns that would otherwise take decades to absorb.

This is not about replacing experienced engineers with algorithms. It is about making sure the next generation has the skills to build the systems that preserve and extend what the current generation knows. And that requires skills that most PE programs do not teach.


The Five Specific Skills Gaps

Gap 1: Programming (Python)

This is the most visible and most consequential gap.

Most petroleum engineering programs in the United States require zero to one programming courses. Some programs include a general engineering computing course -- often in MATLAB -- in the freshman or sophomore year. A few have introduced Python electives. Very few require Python as part of the PE-specific curriculum.

The result: most PE graduates in 2026 cannot write a script to read a CSV file, filter it, and plot the results. They cannot automate a repetitive calculation they perform daily. They cannot parse a LAS file, clean SCADA data, or build a simple data pipeline.

Meanwhile, Python has become the lingua franca of technical computing in the energy industry. Every major operator's data science team uses it. The open-source petroleum engineering ecosystem runs on it -- libraries like lasio for well log data, pyautogui for workflow automation, scipy for optimization, and tools like our own petro-mcp for connecting AI to oilfield data are all Python-based.

MATLAB served its purpose for decades, and it remains useful for certain numerical methods courses. But teaching MATLAB in 2026 as the primary computing language for PE students is like teaching FORTRAN in 2000. It works. It is not where the industry is going. Python is free, has vastly more community support, integrates with modern data tools, and is what employers expect.

Gap 2: Data Engineering

Petroleum engineers work with data constantly. Production volumes, wellhead pressures, choke sizes, artificial lift parameters, completion records, well logs, drill bit records, mud properties, core measurements. The volume of data generated by a single well over its lifetime is enormous.

Yet PE students almost never interact with data in its real-world form. Coursework uses clean, pre-formatted textbook datasets. Students never have to:

  • Query a SQL database to pull production history
  • Parse and clean SCADA data with missing values, sensor errors, and timestamp inconsistencies
  • Read LAS 2.0 files and handle header parsing, null values, and unit conversions
  • Join data from multiple sources (production database + completion database + well header database) into a usable dataset
  • Work with APIs to pull public data from state regulatory agencies (Texas RRC, COGCC, NDIC)
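The cleaning tasks above can be sketched in a few lines of pandas. The snippet below is a minimal, hedged example on synthetic data -- the column name `thp_psi` and the `-999.25` sentinel are illustrative stand-ins, not the format of any particular historian:

```python
# Sketch of a typical SCADA-cleaning step with pandas. Data is synthetic;
# the tag name and sentinel value are illustrative assumptions.
import io
import pandas as pd

raw = io.StringIO(
    "timestamp,thp_psi\n"
    "2026-01-01 00:00:00,412.5\n"
    "2026-01-01 00:00:15,\n"          # sensor dropout -> missing value
    "2026-01-01 00:00:15,413.1\n"     # duplicate timestamp from re-polling
    "2026-01-01 00:00:45,-999.25\n"   # sentinel null from the historian
    "2026-01-01 00:01:00,414.0\n"
)

df = pd.read_csv(raw, parse_dates=["timestamp"])
df["thp_psi"] = df["thp_psi"].replace(-999.25, float("nan"))  # sentinel -> NaN
df = df.drop_duplicates(subset="timestamp", keep="last")
df = df.set_index("timestamp").sort_index()

# Resample irregular 15-second polling to a clean 1-minute average,
# then interpolate the short gaps left by dropouts.
clean = df["thp_psi"].resample("1min").mean().interpolate(limit=2)
print(clean)
```

None of this is exotic, but a graduate who has only ever seen pre-cleaned textbook tables will not know to look for duplicate timestamps or sentinel nulls in the first place.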

In the real world, data engineering -- getting the data into a usable state -- consumes 60-80% of the time in any analytics or AI project. PE graduates who have never experienced this are blindsided when they arrive at their first job and discover that the production data they need is not in a neat Excel spreadsheet but in a SCADA historian with 15-second polling intervals, sensor dropouts, and unit mismatches between fields.

We wrote about the data connectivity challenge in detail in our article on MCP servers for oilfield data. The core problem is that petroleum engineering data is heterogeneous, domain-specific, and scattered across systems that were never designed to talk to each other. Engineers who can navigate this mess are worth their weight in gold. PE programs that never expose students to it are doing them a disservice.

Gap 3: Machine Learning Basics

To be clear: PE programs do not need to produce machine learning researchers. A petroleum engineer does not need to derive backpropagation or understand transformer architectures at the mathematical level.

But they do need to understand:

  • What supervised learning is and when it applies (e.g., predicting well performance from completion parameters)
  • What regression vs. classification means and which one you use for which problem
  • What overfitting is and why a model that looks great on training data might be useless in production -- a topic we explored in depth in our article on physics-informed vs. pure ML decline curve analysis
  • How to evaluate a model using holdout sets, cross-validation, and domain-appropriate metrics
  • When ML is the wrong tool and a physics-based approach or simple statistical method is better

The practical ML skills a PE engineer needs are not exotic. Anomaly detection on SCADA sensor data. Production forecasting using time-series methods. Well clustering by completion design and performance. ESP failure prediction using historical maintenance records. These are applications where basic scikit-learn models outperform deep learning, and where domain expertise matters more than algorithmic sophistication.
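To make the supervised-learning and holdout ideas concrete, here is a minimal scikit-learn sketch on synthetic data. The feature names (lateral length, proppant intensity) and the "true" relationship are invented for illustration -- the point is the workflow: split, fit, and judge the model on wells it never saw:

```python
# Minimal supervised-learning sketch: predict cumulative oil from completion
# parameters. All data is synthetic; feature names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
lateral_ft = rng.uniform(5000, 12000, n)     # lateral length
proppant_lb_ft = rng.uniform(1000, 3000, n)  # proppant intensity
# Assumed "true" relationship plus noise standing in for unmodeled geology.
cum_oil_bbl = 15 * lateral_ft + 40 * proppant_lb_ft + rng.normal(0, 10000, n)

X = np.column_stack([lateral_ft, proppant_lb_ft])
X_train, X_test, y_train, y_test = train_test_split(X, cum_oil_bbl, random_state=0)

model = LinearRegression().fit(X_train, y_train)
# The holdout score, not the training score, tells you whether the model
# generalizes to wells it has never seen.
print("train R^2:  ", r2_score(y_train, model.predict(X_train)))
print("holdout R^2:", r2_score(y_test, model.predict(X_test)))
```

An engineer who can read this twenty-line pattern can also interrogate a vendor's model: what went into `X`, how the split was done, and what the holdout score actually was.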

The danger of not teaching this is not that PE engineers will fail to use ML. It is that they will either avoid it entirely (missing opportunities) or use it blindly (trusting vendor black boxes without the ability to evaluate whether the model is sound). Both outcomes are bad. The engineer who understands enough ML to ask "What was your training set? How did you handle the wells that went on artificial lift mid-history? What is your holdout R-squared?" is vastly more valuable than one who either dismisses ML or accepts it on faith.

Gap 4: Cloud and DevOps Fundamentals

The upstream industry's migration to cloud computing is well underway. Operators are moving production databases, SCADA historians, and even reservoir simulation workloads to AWS, Azure, and GCP. The economics are compelling: elastic compute for simulation studies that used to require dedicated server rooms, managed databases that reduce IT overhead, and data lake architectures that consolidate previously siloed data.

Most PE graduates have essentially no exposure to any of this. Many have never used a command-line terminal, let alone deployed a virtual machine, queried a cloud database, or encountered concepts like containers, serverless functions, or infrastructure-as-code.

They do not need to become cloud architects. But they need to understand enough to work effectively in an environment where their data lives in the cloud, their analytics tools run in the cloud, and their collaboration happens through cloud-based platforms. At minimum, that means understanding:

  • How cloud storage and databases work (S3, Azure Blob, cloud SQL)
  • What an API is and how to call one
  • How to use a terminal and command line tools
  • What version control (Git) is and why it matters for collaborative work
  • How to deploy and run scripts in a cloud environment
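"What an API is and how to call one" fits in a dozen lines of standard-library Python. The endpoint, parameters, and field names below are hypothetical -- real regulatory APIs differ -- and the network call is shown in a comment so the sketch runs offline against a canned response:

```python
# Sketch of calling a JSON API with only the standard library. The endpoint
# and field names are hypothetical assumptions, not a real agency's API.
import json
import urllib.parse
import urllib.request

def build_url(base, params):
    """Attach query parameters to a base URL (e.g. ?county=MIDLAND&year=2025)."""
    return base + "?" + urllib.parse.urlencode(params)

def monthly_oil(payload):
    """Pull (month, oil_bbl) pairs out of a decoded JSON response."""
    return [(r["month"], r["oil_bbl"]) for r in payload["records"]]

url = build_url("https://api.example.com/production",
                {"county": "MIDLAND", "year": 2025})
# In a live script you would fetch and decode like this:
#     with urllib.request.urlopen(url) as resp:
#         payload = json.load(resp)
# Here we parse a canned response so the example runs without a network.
payload = json.loads('{"records": [{"month": "2025-01", "oil_bbl": 12500}]}')
print(url)
print(monthly_oil(payload))
```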

Gap 5: AI Literacy

This is the newest gap and arguably the one that will matter most over the next five years.

AI tools -- Claude, ChatGPT, Gemini, Copilot -- are becoming standard instruments in engineering workflows. They are not replacing engineers. They are changing how engineers work. An engineer who knows how to effectively use an AI assistant to draft a well proposal, debug a Python script, interpret a complex dataset, or summarize 50 pages of regulatory filings will outperform one who does not, all else being equal.

But AI literacy is more than knowing how to use ChatGPT. It includes:

  • Understanding what AI can and cannot do -- knowing that an LLM cannot access your proprietary data unless you give it access (which is exactly what protocols like MCP solve), knowing that it can hallucinate, knowing that it is a reasoning tool and not an oracle
  • Prompt engineering -- the skill of formulating questions and instructions that get useful outputs from AI tools
  • Evaluating AI outputs -- knowing when to trust, when to verify, and when to reject what an AI tells you
  • Understanding agentic AI architectures -- as AI agents become more prevalent in operations (a trend we covered in our article on agentic AI in upstream oil and gas), engineers need to understand what these systems do, how they make decisions, and where human oversight is required

PE programs currently teach none of this. Some faculty are actively prohibiting AI use in coursework, which is understandable from an academic integrity standpoint but counterproductive if the goal is to prepare students for a profession where AI tools are standard.


What Forward-Thinking Programs Are Doing

Not every PE department is standing still. A few programs have begun adapting, though the pace varies significantly.

Texas A&M has introduced data analytics and machine learning electives within its petroleum engineering department and has been one of the more active programs in integrating computational methods into PE coursework. Their partnership with the Texas A&M Institute of Data Science creates at least a pathway for PE students to access data science education.

Colorado School of Mines has leveraged its smaller size and interdisciplinary culture to offer courses that blend geoscience, engineering, and data science. Its Data Science program and various certificate options give PE students access to Python, ML, and data engineering courses, though these are not always required within the PE degree itself.

Stanford's Energy Science & Engineering program (previously Energy Resources Engineering, and where I completed my PhD in optimization for reservoir systems) has always been computationally oriented, with students expected to code as part of their research from day one. But Stanford is unusual in this regard, and it is a graduate program -- the undergraduate PE skills gap remains largely unaddressed even there.

University of Texas at Austin's Hildebrand Department has explored integrating data science elements and has the advantage of being located in the same city as a large concentration of energy tech companies, creating natural industry-academia feedback loops.

University of Tulsa has made efforts to modernize its curriculum with computational elements, recognizing the shifting demands of the industry.

These are positive developments. But they share a common limitation: in most cases, the data science and AI content is elective, not required. A PE student can still graduate from any of these programs without writing a line of Python, without ever touching a real production dataset, and without any exposure to machine learning or AI tools. Until the core curriculum changes, the skills gap will persist.


Practical Recommendations for PE Students

If you are currently pursuing a PE degree or have recently graduated, the skills gap is your problem to solve. Waiting for your department to update the curriculum is not a strategy. Here is a practical roadmap:

1. Learn Python. Now.

This is the single highest-return investment you can make in your career. You do not need to become a software engineer. You need to be a petroleum engineer who can code.

Start here:

  • Python for Everybody (py4e.com) -- free, self-paced, excellent for beginners
  • Automate the Boring Stuff with Python (automatetheboringstuff.com) -- practical, project-based, gets you building useful things quickly
  • Once you have basics, learn pandas for data manipulation, matplotlib/plotly for visualization, and numpy/scipy for numerical computing
  • Work through the lasio library documentation and practice reading actual LAS files

Set a concrete goal: within 60 days, you should be able to write a Python script that reads a CSV of production data, cleans it, calculates decline curve parameters, and generates a plot. That is not a high bar. But it puts you ahead of 80% of PE graduates.
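For a sense of scale, here is the entire 60-day goal as one hedged sketch -- synthetic data standing in for a real export, an exponential (Arps b = 0) decline fit with scipy, and a saved plot:

```python
# The 60-day goal in one script: read production data from CSV, clean it,
# fit an exponential decline, and save a plot. The CSV here is synthetic.
import io
import matplotlib
matplotlib.use("Agg")  # render without a display (works on servers/CI)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def exp_decline(t, qi, D):
    """Arps exponential decline: q(t) = qi * exp(-D * t), t in months."""
    return qi * np.exp(-D * t)

# Stand-in for a real export; note the missing value to clean.
csv = io.StringIO("month,oil_bbl\n0,1000\n1,860\n2,\n3,640\n4,550\n5,470\n")
df = pd.read_csv(csv).dropna()

(qi, D), _ = curve_fit(exp_decline, df["month"], df["oil_bbl"], p0=(1000, 0.1))
print(f"qi = {qi:.0f} bbl/month, D = {D:.3f} 1/month")

plt.scatter(df["month"], df["oil_bbl"], label="data")
t = np.linspace(0, 5, 50)
plt.plot(t, exp_decline(t, qi, D), label="fit")
plt.xlabel("month"); plt.ylabel("oil rate (bbl/month)"); plt.legend()
plt.savefig("decline_fit.png")
```

Swap the synthetic CSV for a Texas RRC export and you have a portfolio piece.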

2. Learn SQL

Production databases, well completion databases, and regulatory datasets all live in relational databases. SQL is how you access them.

You do not need to become a database administrator. You need to be able to write SELECT, JOIN, WHERE, and GROUP BY queries fluently. This takes a few weeks of focused practice. SQLBolt (sqlbolt.com) and Mode Analytics SQL Tutorial are both free and effective.
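Here are those four verbs in action, using Python's built-in sqlite3 so the example is self-contained. The table and column names are illustrative, not any operator's schema:

```python
# SELECT + JOIN + WHERE + GROUP BY against an in-memory database.
# Table and column names are illustrative assumptions.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE wells (api TEXT PRIMARY KEY, name TEXT, operator TEXT);
    CREATE TABLE production (api TEXT, month TEXT, oil_bbl REAL);
    INSERT INTO wells VALUES ('42-329-00001', 'SMITH 1H', 'ACME'),
                             ('42-329-00002', 'JONES 2H', 'ACME');
    INSERT INTO production VALUES ('42-329-00001', '2025-01', 12000),
                                  ('42-329-00001', '2025-02', 10500),
                                  ('42-329-00002', '2025-01', 8000);
""")

# Total oil per well for one operator, highest first.
rows = con.execute("""
    SELECT w.name, SUM(p.oil_bbl) AS total_oil
    FROM wells w
    JOIN production p ON p.api = w.api
    WHERE w.operator = 'ACME'
    GROUP BY w.name
    ORDER BY total_oil DESC
""").fetchall()
print(rows)  # [('SMITH 1H', 22500.0), ('JONES 2H', 8000.0)]
```

This is roughly the level of fluency the job postings mean: join a well header table to a production table and aggregate the result.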

3. Build a Portfolio Project Using Public Data

The Texas Railroad Commission (RRC), Colorado Oil and Gas Conservation Commission (COGCC), New Mexico OCD, and North Dakota Industrial Commission (NDIC) all publish production and well data publicly. Use it.

Build a project that demonstrates you can work with real-world petroleum data:

  • Download production data for a county or field from the Texas RRC
  • Clean it, analyze it, and build a visualization or simple predictive model
  • Document your work in a Jupyter notebook or a GitHub repository
  • Write a brief explanation of what you found and what it means operationally

This kind of project -- data acquisition, cleaning, analysis, visualization, interpretation -- is exactly what operators do every day. Showing that you can do it with public data proves you can do it with their data.

4. Contribute to Open-Source Petroleum Engineering Projects

Open-source contributions demonstrate technical skill, collaborative ability, and initiative. The petroleum engineering open-source ecosystem is small but growing.

Our own petro-mcp project -- an open-source MCP server for petroleum engineering data -- is one example. It provides tools for working with production data, decline curve analysis, and well log parsing through the Model Context Protocol. Contributing to projects like this exposes you to real codebases, real engineering data problems, and the collaborative development workflow (Git, pull requests, code review) that industry teams use daily.

Other PE-adjacent open-source projects worth exploring include lasio (LAS file parsing), welly (well data management), and OPM Flow (reservoir simulation).

5. Get Comfortable With AI Tools

Start using Claude, ChatGPT, or Copilot in your engineering coursework (where permitted). Not as a shortcut to avoid learning -- as a force multiplier for learning.

Use AI to:

  • Explain concepts from your coursework in different ways
  • Debug your Python scripts
  • Explore "what if" scenarios in reservoir engineering problems
  • Summarize technical papers and identify key findings
  • Generate starter code for data analysis projects

The engineers who will thrive in the next decade are not the ones who can do everything an AI can do. They are the ones who can do what AI cannot -- apply domain judgment, understand physical systems, make decisions under uncertainty -- while leveraging AI for everything else.

6. Pursue Complementary Coursework

If your university offers courses in data science, machine learning, database systems, or cloud computing through its CS, statistics, or data science departments, take them. Even one or two courses outside your PE core curriculum can meaningfully change your skill profile.

Look for:

  • Introductory data science or machine learning (scikit-learn level, not deep learning theory)
  • Database systems (SQL, data modeling)
  • Statistics beyond what PE programs require (time-series analysis, Bayesian methods)
  • Any course that involves working with real datasets rather than textbook problems

Recommendations for PE Departments

For faculty and curriculum committees reading this, the following recommendations are offered with respect for the constraints you operate under -- accreditation requirements, limited faculty capacity, student course loads that are already heavy. These are not easy changes. But they are necessary ones.

1. Make Python a Required Course, Integrated Into PE Curriculum

Not a general engineering computing course. Not a MATLAB elective. A required Python course, taught within the PE department (or co-taught with CS/Data Science), using petroleum engineering data and problems.

The course should cover:

  • Python fundamentals (variables, functions, loops, data structures)
  • Data manipulation with pandas
  • Visualization with matplotlib/plotly
  • File I/O including CSV, JSON, and LAS formats
  • Basic API calls (pulling data from regulatory agencies)
  • Introduction to version control with Git

Every problem set should use PE data. Students should parse LAS files, not arbitrary CSV datasets. They should analyze production decline curves, not stock prices. They should clean SCADA data, not web-scraped movie reviews. The domain context is what makes the course relevant and what keeps PE students engaged.
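As one example of the kind of problem set this implies: a deliberately simplified sketch of reading the ~A (data) section of a LAS 2.0 file and mapping the standard -999.25 null. In practice students would graduate to lasio, but seeing the format by hand is instructive:

```python
# Simplified LAS 2.0 ~A-section reader -- an assumption-laden teaching sketch,
# not a replacement for lasio, which handles wrapped lines, units, and headers.
LAS_NULL = -999.25

def parse_ascii_section(las_text):
    """Return rows of floats from the ~A section, with nulls mapped to None."""
    rows, in_data = [], False
    for line in las_text.splitlines():
        if line.startswith("~A"):
            in_data = True
            continue
        if in_data and line.strip():
            vals = [float(v) for v in line.split()]
            rows.append([None if v == LAS_NULL else v for v in vals])
    return rows

sample = """~Version
VERS. 2.0 :
~A  DEPT      GR       RHOB
  5000.0    45.2     2.45
  5000.5  -999.25    2.47
"""
print(parse_ascii_section(sample))
# [[5000.0, 45.2, 2.45], [5000.5, None, 2.47]]
```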

2. Partner With CS and Data Science Departments

Many universities have strong data science programs that PE departments could leverage. Cross-listed courses, joint capstone projects, and shared teaching resources can deliver data science education to PE students without requiring PE departments to hire data science faculty.

The key is making these partnerships structured, not ad hoc. A formal minor or certificate in data science for PE students, with a curated set of courses that include PE-relevant applications, is far more effective than telling students "you should probably take a CS elective."

3. Use Real Industry Data in Coursework

Textbook problem sets with clean, perfectly formatted data do not prepare students for the real world. Every PE course that involves data analysis -- reservoir engineering, production engineering, formation evaluation -- should include at least some assignments using real, messy, imperfect data.

Public data sources make this feasible:

  • Texas RRC production data (available via their public query system)
  • COGCC production and well data
  • FracFocus completion chemical disclosure data
  • Kansas Geological Survey well log data (free LAS files)
  • Volve field dataset from Equinor (a complete field dataset released for research and education)

The Volve dataset, in particular, is an extraordinary resource. Equinor released the full operational dataset from the Volve field in the North Sea -- including well logs, production data, seismic, reports, and reservoir simulation models. It is the closest thing to a real operator's data environment that a student can access without an internship.

4. Invite Industry Practitioners for Guest Lectures on Digital Transformation

Faculty are experts in reservoir engineering, drilling, and petrophysics. They are not necessarily experts in how AI agents are being deployed to monitor 500 wells, or how a Permian Basin operator built a data pipeline using Databricks and Dagster, or what an MCP server does.

Regular guest lectures from industry practitioners who work at the intersection of PE and technology can fill this gap. These do not need to be formal courses. A monthly seminar series where working engineers present real-world examples of how digital tools are used in operations would be transformative.

5. Address AI Tools Directly in Academic Integrity Policies

The current approach at many universities -- banning AI tools entirely -- is understandable but unsustainable. Students will use these tools regardless of policy, and banning them means they learn to use them without guidance.

A more productive approach: define clearly when AI tools may and may not be used, teach students how to use them effectively and ethically, and design assessments that test understanding rather than outputs. An exam that asks a student to interpret a pressure transient test and explain the physics cannot be meaningfully completed by ChatGPT alone. A homework assignment that asks a student to calculate skin factor from given data can.

Design the curriculum so that AI tools are an accelerant, not a shortcut. Then teach students to use them as professionals.


The Engineers Who Will Thrive

Petroleum engineering is not dying. The narrative that PE is an obsolete profession fails to account for the fact that the world still runs on hydrocarbons, that the energy transition itself requires enormous engineering capability, and that the subsurface expertise PE engineers possess is not replicated by any other discipline.

But PE is evolving. The engineers who will thrive -- who will be sought after by operators, promoted into leadership, and positioned to build the next generation of energy companies -- are the ones who combine deep domain expertise with modern technical skills. They understand reservoir physics AND Python. They can interpret a well test AND query a database. They know petrophysics AND machine learning. They can run a reservoir simulation AND deploy an AI agent to monitor their wells.

This combination is rare today. PE programs that produce it will differentiate their graduates. Students who build it on their own will differentiate themselves.

The skills gap is real, it is widening, and it will not close on its own. But for students and programs willing to act, it represents an extraordinary opportunity. The demand for engineers who can bridge the traditional and digital worlds of petroleum engineering has never been higher, and the supply has never been lower.

Close the gap. The industry is waiting.


Dr. Mehrdad Shirangi is the founder of Groundwork Analytics and holds a PhD from Stanford University in Energy Systems Optimization, where his research focused on computational methods for reservoir management under uncertainty. He has been building AI solutions for the energy industry since 2018. Connect on X/Twitter and LinkedIn, or reach out at info@petropt.com.

