
Your Best Engineer Just Retired. 20 Years of Decisions Went With Them. Here's How to Prevent That.

Dr. Mehrdad Shirangi | 14 min read

Editorial disclosure: This article reflects the independent analysis and professional opinion of the author, informed by published regulatory documents, SPE research, vendor documentation, and direct experience with operational data systems across multiple basins. No vendor reviewed or influenced this content prior to publication. Groundwork Analytics offers an Operational Decision Memory product referenced in the concluding section; the analysis presented here stands on its own regardless of product interest.

The phone call comes on a Tuesday morning.

A drilling engineer on a new Wolfcamp A well at 14,200 feet TVD sees something on the Pason EDR that does not match the drilling program. The plan says 12.8 ppg mud weight through this interval, but the last well on this pad -- Well 47, drilled nine months ago -- ran 13.2 ppg through the same zone. There is no note in the drilling program explaining why. No revision history. No lessons learned document. The engineer picks up the phone and calls the drilling superintendent.

"Why did we run 13.2 on Well 47?"

"Ask Bob. He was the drilling engineer on that one."

"Bob retired in January."

Silence on the line.

The engineer now faces a choice. Run 12.8 as programmed and hope the original plan was right. Or bump to 13.2 because somebody must have had a reason -- but without knowing that reason, is 13.2 even enough? Was it a kick indicator? Pore pressure gradient revision? A geology call from the geologist who also transferred to another basin last quarter?

This is not a hypothetical scenario. Variations of this conversation happen across the upstream oil and gas industry every single day. And the consequences range from minor inefficiency to catastrophic failure.

This article is about a problem that every operator knows exists and almost none have solved: the systematic loss of operational decisions. Not data -- decisions. The reasoning behind why things were done the way they were done. And it is about a category of technology that did not exist until recently: Operational Decision Memory.


The Scale of What We Are Losing

231,000 Years of Experience Walking Out the Door

The oil and gas industry's workforce demographics tell a story that should alarm anyone responsible for operational continuity.

Seventy-one percent of the energy workforce is 50 years old or older. Up to 50% of skilled energy workers may retire within the next five to seven years. The industry has already lost an estimated 231,000 years of cumulative experience to retirement, and the pace is accelerating.

The workforce is bimodally distributed in a way that makes this worse. Baby boomers dominate the senior ranks. Millennials are entering at the junior level. But Gen X -- the generation that would normally bridge the gap -- is almost entirely absent. The 1980s oil price collapse drove an entire generation away from the industry. Between 1980 and 1998, U.S. oil and gas employment fell from roughly 700,000 to 300,000. The people who would today be veterans with decades of field experience never entered the industry.

The result is a knowledge transfer gap that mentorship programs cannot bridge. You cannot mentor 50 new engineers simultaneously with 50 retiring engineers on staggered timelines. The math does not work.

A 2008 Ernst & Young and Rice University survey found that 90% of senior executives at 22 top international oil and gas companies called the talent shortage one of the top five issues facing the industry. Eighteen years later, the problem has only deepened.

The Dollar Cost of Lost Decisions

When institutional knowledge disappears, the financial impact shows up in three places.

Unplanned downtime. By one recent estimate, the average upstream oil and gas facility loses $149 million per year to unplanned downtime -- a 76% increase over prior reporting periods -- and the hourly cost has more than doubled, now exceeding $500,000 per hour. Other estimates are lower but still severe: upstream companies average 27 days of unplanned downtime annually, at a cost of roughly $38 million per company per year.

Not all of this is attributable to lost institutional knowledge. Equipment fails for mechanical reasons. But a significant portion of unplanned downtime stems from decisions that were made without the context of prior decisions. Running the wrong mud weight. Selecting an artificial lift method that failed on the offset well for reasons nobody documented. Repeating a completions design that caused a frac hit on the neighboring pad -- a $2-3 million mistake that the previous completions engineer knew to avoid but took that knowledge with them when they left.

Repeated mistakes. The Schiehallion drilling team in the North Sea is one of the few documented cases where systematic lessons-learned capture delivered measurable results. By embedding operational lessons after each successive well, the team saved $50 million in drilling costs and achieved record drilling times. The fact that this case is cited in virtually every SPE paper on knowledge management tells you how rare this outcome is. The Schiehallion team is the exception. Repeated mistakes are the rule.

Knowledge replacement cost. Replacing a senior petroleum engineer costs 1.5 to 2 times their annual salary in recruitment alone -- $200,000 to $400,000 or more for experienced subsurface professionals. But the recruitment cost is the small number. The time for a new hire to reach equivalent operational competence in reservoir or subsurface roles is estimated at three to seven years. During those years, the new engineer is making decisions without the accumulated context their predecessor carried.

Companies that laid off experienced staff during the 2014-2016 downturn -- when 350,000 workers were eliminated globally, including 195,000 in the United States -- learned this the hard way. Many had to rehire their own retirees as contractors at 1.5 to 2 times their previous salary because no one else knew how to operate those assets.

When Lost Decisions Kill People

The most consequential examples of lost institutional knowledge in oil and gas are not financial. They are fatal.

Piper Alpha, 1988. 167 people killed in the worst offshore oil disaster in history. A similar catastrophe had occurred eight years earlier on the Alexander Kielland platform in the North Sea, killing 123. The investigation recommendations from the Alexander Kielland were implemented in Norwegian operations -- but not adopted by UK operators. After Piper Alpha, Lord Cullen's inquiry produced 106 safety recommendations. One hundred and five of them had already been implemented in Norway. The knowledge existed. It was not shared across organizational boundaries.

Deepwater Horizon, 2010. 11 killed. The largest marine oil spill in history. Over $65 billion in total cost to BP. The investigation found that issues aboard the Deepwater Horizon mirrored those on Piper Alpha 22 years earlier. BP had also failed to learn from its own 2005 Texas City refinery explosion that killed 15 and injured 170. A deeply entrenched culture of prioritizing cost over safety was documented across multiple prior incidents -- incidents whose lessons were available but were neither captured in accessible form nor enforced.

The pattern is consistent across every major oil and gas disaster of the past 40 years: the information needed to prevent the catastrophe existed before the catastrophe occurred. It was in someone's head, in a filed-and-forgotten report, in a lessons-learned database that nobody queried, or in a different operating company that experienced the same failure mode years earlier.

The problem is not that operators lack data. They are drowning in data. The problem is that decisions -- the human reasoning that connects data to action -- are almost never captured.


Why Current Tools Fail

If the problem is this well-known and this expensive, why has nobody solved it? The answer lies in understanding what current tools actually do -- and what they were never designed to do.

SharePoint: The Universal Default (and Universal Failure)

SharePoint is what most operators actually use for lessons learned, MOC documentation, and knowledge management. And it fails predictably.

Reported average time to find and access a specific document in SharePoint: more than 30 minutes. For operational decisions made during a drilling incident six months ago? Often impossible. The document may not exist. If it does, it is buried in a folder structure that only the person who created it understands. That person may no longer work at the company.

SharePoint is a file cabinet, not a knowledge system. It stores documents that someone manually created and manually filed. There is no intelligence, no context linking, no temporal awareness. A drilling report from Well 47 and an email thread about the mud weight change on Well 47 exist in completely separate locations with no connection between them.

Microsoft recognized this problem and built Viva Topics -- a SharePoint knowledge layer that attempted to automatically organize and surface relevant content. Viva Topics was retired in early 2025. Microsoft is pivoting to Copilot for knowledge management, but Copilot is a generic enterprise AI assistant. It finds documents. It does not capture, contextualize, or connect decisions across time.

Cognite Data Fusion: Data, Not Decisions

Cognite Data Fusion is used by Equinor, Aker BP, and other major operators for operational data integration and contextualization. It excels at liberating data from silos -- connecting SCADA, PI historian, SAP, and other systems into a unified data model.

But Cognite themselves acknowledge the core limitation. Their own documentation states that "once analysis is complete and decisions are made, the application files, analytical tools, and captured knowledge become additional silos of information, remembered or understood only by the people originally involved."

Read that again. Cognite -- one of the most sophisticated operational data platforms in the industry -- explicitly says that decisions become siloed the moment they are made. Every new project repeats the cycle of data gathering, analysis, decision, and then... forgetting.

Cognite does the plumbing. It connects the sensors to the dashboards. It does not capture why a human looked at those dashboards and chose one course of action over another. (For more on how digital platforms handle operational data in upstream oil and gas, see our detailed landscape analysis.)

Palantir Foundry: The Closest -- and the Most Expensive

Palantir Foundry has an Action Log feature that stores "the state of the world when decisions are made." It tracks not just what changed but why, with full audit trails and access controls.

This is, on paper, the closest existing product to true decision memory.

The problem is price. Palantir enterprise contracts typically run $5 to $20 million per year. The platform requires dedicated Palantir Forward Deployed Engineers for implementation and ongoing support. It is designed for supermajors and government agencies, not mid-size operators managing 500 to 2,000 wells.

If you operate 10,000 wells and have a $50 million annual technology budget, Palantir may be an option. If you are a PE-backed Permian operator with 800 wells and a lean IT team, Palantir is not in your universe. And yet, the PE-backed Permian operator has the same knowledge loss problem.

OpenWells and WellView: Well Data, Not Enterprise Decisions

Halliburton's OpenWells includes a lessons learned feature that enables engineers to record observations and document improvements for the next well. Peloton's WellView manages well data across the full lifecycle.

Both are useful tools for well-level documentation. Neither captures cross-functional operational decisions. The completions engineer's spacing rationale, the production engineer's artificial lift selection, the facilities engineer's pipeline routing decision -- these happen outside the well data management system. They happen in meetings, in Teams channels, in email threads, and in hallway conversations. OpenWells captures what you explicitly type into a form. It does not capture the 95% of operational decisions that never make it into any form.

EHS and MOC Software: Process, Not Memory

Sphera, Enablon, VelocityEHS, and similar platforms handle Management of Change workflows -- forms, approvals, notifications, compliance tracking. They are good at what they do: digitizing the MOC paperwork process.

But they capture only formal change requests. They do not capture the informal daily decisions that constitute the vast majority of operational activity. The morning ops call where the drilling superintendent says "let's pull the BHA and switch to a shorter bit run." The email thread where the production foreman explains why she shut in Well 12 during the cold snap. The Teams message where the completions engineer says "I talked to the frac crew and we're dropping proppant loading from 2,200 to 1,800 lb/ft because of what we saw on the last stage."

These decisions are operationally significant. They affect safety, production, and costs. They are almost never documented. And they are invisible to every tool in the current market.


The Concept: Operational Decision Memory

What if operational decisions were captured the way operational data is captured -- automatically, continuously, and without requiring anyone to fill out a form?

This is the concept behind Operational Decision Memory. It is not a database. It is not a dashboard. It is not another knowledge management platform that requires manual input and gets abandoned in six months.

Operational Decision Memory works by listening to the communications that already exist -- Teams meeting transcripts, email threads, Slack channels, meeting recordings -- and using AI to extract the decisions embedded within them.

When a drilling engineer says on the morning ops call, "We're bumping mud weight from 12.8 to 13.2 because the PWD tool is showing 13.0 ppg equivalent pore pressure at 14,200," the system captures:

  • The decision: Increase mud weight from 12.8 to 13.2 ppg
  • The rationale: PWD tool reading 13.0 ppg equivalent at 14,200 ft TVD
  • The decision-maker: The drilling engineer (by name)
  • The timestamp: When it was said
  • The context: Well 47, Wolfcamp A, 14,200 ft TVD
  • The data reference: PWD tool, Pason EDR

No form was filled out. No SharePoint document was created. No workflow was triggered. The engineer made the decision the same way they always make decisions -- by talking about it. The system captured it because the system was listening.
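The six captured fields above map naturally onto a small structured record. The following is a minimal sketch of what such a record might look like, not a description of any product's schema: the class name `DecisionRecord`, its field names, the `answer_why` helper, and the date are all hypothetical, and "Bob" is simply the engineer from the opening scenario.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DecisionRecord:
    """One extracted operational decision (hypothetical schema)."""
    decision: str            # what was decided
    rationale: str           # why it was decided
    decision_maker: str      # who decided, by name
    timestamp: datetime      # when it was said
    context: dict            # well, formation, depth
    data_references: tuple   # instruments or systems cited

    def answer_why(self) -> str:
        """Render a one-line answer to 'why did we do this?'."""
        return (
            f"{self.decision_maker} on {self.timestamp:%Y-%m-%d}: "
            f"{self.decision}, because {self.rationale}."
        )


record = DecisionRecord(
    decision="Increase mud weight from 12.8 to 13.2 ppg",
    rationale="PWD tool reading 13.0 ppg equivalent at 14,200 ft TVD",
    decision_maker="Bob",                      # placeholder name
    timestamp=datetime(2025, 6, 3, 6, 45),     # placeholder date and time
    context={"well": "Well 47", "formation": "Wolfcamp A", "tvd_ft": 14200},
    data_references=("PWD tool", "Pason EDR"),
)

print(record.answer_why())
```

Keeping the rationale and the data reference as separate fields is what lets a later query answer "why" rather than only "what."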

Nine months later, when the next drilling engineer on the next well asks, "Why did we run 13.2 on Well 47?" the system answers in seconds. Not in two days. Not "ask Bob." In seconds.

Three Use Cases That Change How Operators Work

Use Case 1: Regulatory Audit -- 60 Seconds Instead of 2 Days

A regulatory auditor asks: "Who approved the mud weight change on Well 47 and what was the technical basis?"

Today, answering this question requires pulling the well file, searching email archives, calling people who may or may not remember, and assembling a narrative from fragmentary evidence. This takes one to three days and may still produce an incomplete answer.

With Operational Decision Memory: type the question. The system returns the decision chain -- who decided, when, why, what data informed the choice, what alternatives were discussed, and whether the decision was consistent with prior decisions on similar wells. The system generates an audit-ready report with timestamps, participants, and source references. Time to answer: under 60 seconds.
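Once decisions are stored as structured records, the audit answer described above is essentially a filter and a sort. The sketch below illustrates that idea only; the `Decision` fields, the `audit_trail` and `format_report` functions, and every sample record (names, times, sources) are hypothetical placeholders rather than real data or a real product interface.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Decision:
    well: str
    topic: str
    decided_by: str
    decided_at: datetime
    rationale: str
    source: str  # e.g. a transcript or email reference


def audit_trail(decisions, well, topic):
    """Return the matching decisions in chronological order."""
    hits = [d for d in decisions if d.well == well and d.topic == topic]
    return sorted(hits, key=lambda d: d.decided_at)


def format_report(trail):
    """Render one line per decision: timestamp, person, rationale, source."""
    return "\n".join(
        f"{d.decided_at:%Y-%m-%d %H:%M} | {d.decided_by} | {d.rationale} | {d.source}"
        for d in trail
    )


# Placeholder records for illustration only.
decisions = [
    Decision("Well 47", "mud weight", "Drilling superintendent",
             datetime(2025, 6, 3, 7, 10), "Approved increase to 13.2 ppg",
             "ops-call transcript"),
    Decision("Well 47", "mud weight", "Drilling engineer",
             datetime(2025, 6, 3, 6, 45), "PWD showed 13.0 ppg equivalent",
             "ops-call transcript"),
    Decision("Well 12", "shut-in", "Production foreman",
             datetime(2025, 1, 20, 9, 0), "Cold snap", "email thread"),
]

trail = audit_trail(decisions, well="Well 47", topic="mud weight")
print(format_report(trail))
```

The hard part in practice is populating the records, not querying them; the query side can stay this simple.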

For offshore operators, BSEE civil penalties can reach $55,764 per day per violation. OSHA willful violation penalties reach $161,323 per occurrence. The difference between "we can demonstrate our decision process" and "we're reconstructing it from memory" is the difference between a clean audit and a finding. (For a deeper look at SCADA data quality issues that compound these problems, see our data quality audit guide.)

Use Case 2: Knowledge Transfer -- Instant Context from 3 Years of Decisions

A new reservoir engineer is assigned to the Reeves County asset team. They need to understand why the team runs ESPs on some wells and rod pumps on others, why certain pads use 660-foot spacing and others use 880-foot, why the team switched from slickwater to hybrid frac designs in the northern part of the field.

Today, this context transfer takes six to twelve months of asking questions, sitting in on meetings, and gradually absorbing institutional knowledge. If the person they would have asked has left, some of that context is permanently lost.

With Operational Decision Memory, the new engineer queries: "How did we handle ESP versus rod pump selection on Reeves County wells?" The system returns every decision related to artificial lift on Reeves County wells over the past three years -- the rationale, the debate, the data that informed the choice, and the outcomes. A typical answer reads: "On Wells 12, 15, and 23, the team selected ESPs over rod pumps because GOR exceeded 800 scf/bbl and the gas interference was causing rod pump failures averaging every 47 days. The production engineer at the time (Sarah Chen, transferred to another asset team October 2025) noted that the ESP installations increased run time to 180+ days."

Three years of accumulated context, delivered in seconds. The new engineer is not starting from zero. They are starting from the full history of what this team has learned.

Use Case 3: Contradiction Detection -- Problems Surfaced Before They Become Costly

In January, the completions team decided to reduce cluster spacing from 60 feet to 45 feet on the Mesquite pad based on fiber optic data showing uneven stimulation. In March, the same team specified 60-foot cluster spacing on the adjacent Cottonwood pad -- apparently without referencing or even remembering the January decision.

No one noticed the contradiction because no one was tracking decision consistency. The information existed in two separate email threads, three months apart, involving slightly different personnel (one engineer transferred between January and March).

Operational Decision Memory detects this automatically. The system identifies that a decision made in March contradicts a decision made in January on a substantively similar topic and flags it for review. Not as an error -- there may be a valid reason for the different approach on the Cottonwood pad. But as a question: "Your team specified 60-ft cluster spacing on Cottonwood, but reduced to 45-ft on the adjacent Mesquite pad in January based on fiber data. Is this intentional?"

This is contradiction detection. It is one of the capabilities that distinguishes decision memory from simple document search. The system does not just find decisions -- it compares them against each other across time and identifies inconsistencies that humans miss because no human can hold thousands of decisions in working memory.
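In its simplest form, the comparison described above is a pairwise check over structured decisions. The sketch below is a deliberately naive illustration under stated assumptions: the `Decision` fields, the `ADJACENT_PADS` lookup, the `find_contradictions` function, and the specific dates are hypothetical, and a real system would need semantic matching of topics rather than exact string equality.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass
class Decision:
    pad: str
    parameter: str
    value: float
    unit: str
    decided_on: date
    rationale: str


# Hypothetical adjacency lookup; a real system would derive this from asset data.
ADJACENT_PADS = {frozenset({"Mesquite", "Cottonwood"})}


def find_contradictions(decisions):
    """Flag pairs that set the same parameter to different values on adjacent pads."""
    flags = []
    ordered = sorted(decisions, key=lambda d: d.decided_on)
    for earlier, later in combinations(ordered, 2):
        same_topic = (earlier.parameter == later.parameter
                      and earlier.unit == later.unit)
        adjacent = frozenset({earlier.pad, later.pad}) in ADJACENT_PADS
        if same_topic and adjacent and earlier.value != later.value:
            # Phrase the flag as a question, not an error.
            flags.append(
                f"{later.pad} specifies {later.value:g} {later.unit} {later.parameter}, "
                f"but {earlier.pad} used {earlier.value:g} {earlier.unit} on "
                f"{earlier.decided_on:%Y-%m-%d} ({earlier.rationale}). "
                "Is this intentional?"
            )
    return flags


# Placeholder dates; the article gives only the months.
decisions = [
    Decision("Mesquite", "cluster spacing", 45, "ft", date(2026, 1, 15),
             "fiber optic data showed uneven stimulation"),
    Decision("Cottonwood", "cluster spacing", 60, "ft", date(2026, 3, 10),
             "no rationale recorded"),
]

for flag in find_contradictions(decisions):
    print(flag)
```

Note that the output is a prompt for review: the check has no way to know whether Cottonwood's geology justifies the difference, so it asks rather than asserts.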


Why This Was Not Possible Five Years Ago

Two technology shifts made Operational Decision Memory feasible in a way it was not before 2023.

First: Automatic meeting transcription. Microsoft Teams, Zoom, and Google Meet now offer built-in automatic transcription that an organization can switch on for every meeting. This means the raw material for decision extraction -- what people said in operational meetings -- can be generated and stored with no extra effort from the participants. Five years ago, extracting decisions from meetings meant manual note-taking (which nobody did consistently) or expensive specialized transcription services.

Second: Large language models. Extracting structured decisions from unstructured conversation requires understanding context, inferring rationale, and distinguishing a decision from a suggestion or a hypothetical. Traditional NLP tools could not do this reliably. Modern LLMs can. When a drilling superintendent says, "Based on what we saw on the last well, I think we should bump the mud weight, and actually let's just go ahead and do that -- call the mud engineer and tell him 13.2," a language model can distinguish between the initial suggestion ("I think we should"), the decision point ("let's just go ahead"), and the action item ("call the mud engineer"). Rule-based systems could not parse this. LLMs can.
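One way to picture the extraction step is as a prompt plus a validation pass over the model's reply. The sketch below assumes nothing about any particular model or vendor API: `call_model` is a hypothetical stand-in for whatever text-in, text-out model call is used, `fake_model` returns a canned reply so the example runs offline, and the label names and prompt wording are illustrative choices.

```python
import json

LABELS = {"suggestion", "decision", "action_item"}

PROMPT_TEMPLATE = (
    "Label each span of the utterance as suggestion, decision, or action_item. "
    'Reply with a JSON list of objects with keys "label" and "text".\n'
    "Utterance: {utterance}"
)


def extract_spans(utterance, call_model):
    """Ask the model to label spans, then keep only replies that pass validation."""
    reply = call_model(PROMPT_TEMPLATE.format(utterance=utterance))
    checked = []
    for span in json.loads(reply):
        text = span.get("text", "")
        # Reject labels outside the schema and text the speaker never said.
        if span.get("label") in LABELS and text and text in utterance:
            checked.append((span["label"], text))
    return checked


UTTERANCE = (
    "Based on what we saw on the last well, I think we should bump the mud "
    "weight, and actually let's just go ahead and do that -- call the mud "
    "engineer and tell him 13.2"
)


def fake_model(prompt):
    """Stand-in for a real model call; returns a canned reply for the demo."""
    return json.dumps([
        {"label": "suggestion", "text": "I think we should bump the mud weight"},
        {"label": "decision", "text": "let's just go ahead and do that"},
        {"label": "action_item", "text": "call the mud engineer and tell him 13.2"},
        {"label": "decision", "text": "run 14.0 ppg"},  # invented span, filtered out
    ])


for label, text in extract_spans(UTTERANCE, fake_model):
    print(f"{label}: {text}")
```

The substring check is the design point worth keeping: requiring every extracted span to appear verbatim in the transcript is a cheap guard against a model inventing a decision nobody made.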

The combination of ubiquitous transcription and capable language models means that for the first time, it is technically feasible to capture operational decisions from natural communication without requiring anyone to change their behavior.


What This Means for Mid-Size Operators

The operators most affected by knowledge loss are not the supermajors. ExxonMobil and Chevron have institutional processes, dedicated knowledge management teams, and technology budgets that allow them to at least partially address the problem (though even they struggle with it).

The operators most at risk are in the 200-to-5,000 well range -- mid-size independents and PE-backed companies. These operators have:

  • Smaller teams where each person's knowledge is more concentrated and harder to replace
  • Higher turnover driven by PE portfolio timelines, M&A activity, and basin-to-basin transfers
  • Leaner IT with no dedicated knowledge management function
  • The same regulatory obligations as larger operators (BSEE SEMS, OSHA PSM for covered facilities, state reporting requirements)
  • No access to enterprise tools like Palantir ($5M+/year) or Cognite ($500K+/year)

For these operators, the gap between "we know this is a problem" and "we have a solution" has been unbridgeable -- until now.

A purpose-built Operational Decision Memory system, designed for the mid-market, does not require a $5 million annual commitment. It does not require Forward Deployed Engineers. It connects to the tools the team already uses (Teams, email, meeting recordings), extracts decisions automatically, and makes them queryable. The technology that makes this possible has existed for only a few years. The companies building solutions on this technology are just now entering the market.

This is relevant whether you are managing post-M&A data integration challenges or trying to build an AI capability at a mid-size operator. The decisions your team makes about data architecture, production operations, and technology adoption are themselves institutional knowledge that should be captured and queryable.


The Regulatory Tailwind

Decision memory is not just operationally valuable -- it is increasingly a compliance requirement.

BSEE SEMS (30 CFR 250 Subpart S) requires documented audit trails for all elements of Safety and Environmental Management Systems on the Outer Continental Shelf. Audit reports must be submitted within 60 days. Corrective Action Plans must include the name and job title of responsible personnel, actions to be taken, and schedules. BSEE audits occur on a three-year cycle with civil penalties up to $55,764 per day per violation.

OSHA PSM (29 CFR 1910.119) requires written Management of Change procedures for changes to process chemicals, technology, equipment, procedures, and facilities at covered installations. Each MOC must document the technical basis for the change, the safety and health impact, modifications to operating procedures, duration of change, and authorization requirements. Compliance audits are required every three years.
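The five documentation elements listed above lend themselves to a simple completeness check. The sketch below is illustrative only: the field names in `REQUIRED_MOC_FIELDS` are hypothetical labels for the elements named in the paragraph, not identifiers from the regulation or from any MOC software, and the draft record contains placeholder text.

```python
# Hypothetical field names for the five elements each MOC must document.
REQUIRED_MOC_FIELDS = (
    "technical_basis",
    "safety_health_impact",
    "operating_procedure_changes",
    "duration_of_change",
    "authorization",
)


def missing_moc_fields(record):
    """Return the required fields that are absent or blank in an MOC record."""
    return [
        name for name in REQUIRED_MOC_FIELDS
        if not str(record.get(name) or "").strip()
    ]


# Placeholder draft: two elements filled in, three still missing.
draft = {
    "technical_basis": "Placeholder: engineering justification for the change",
    "safety_health_impact": "Placeholder: hazard review summary",
    "operating_procedure_changes": "",
    "duration_of_change": None,
}

print(missing_moc_fields(draft))
```

A check like this only confirms that each element is present; whether the technical basis is actually sound remains a human judgment.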

State regulators are tightening requirements. The Texas Railroad Commission adopted Chapter 4 revisions effective July 2025, reshaping waste management documentation. Colorado's Energy and Carbon Management Commission (ECMC, formerly the COGCC) requires annual greenhouse gas intensity verification reports and new monthly monitoring reports for pre-production operations.

None of these regulators explicitly require "decision tracking." But all of them require documented trails of what changed, who approved it, what the basis was, and what was done about findings. The gap is that most operators satisfy these requirements with the absolute minimum paperwork -- and then scramble to reconstruct fuller documentation when an auditor asks a specific question.

An operator with Operational Decision Memory does not scramble. The decisions are already captured, timestamped, attributed, and queryable. The audit report is a query result, not a multi-day reconstruction project.


What Operational Decision Memory Is Not

It is important to be clear about what this technology does not do and is not intended to replace.

It is not a replacement for operational data systems. SCADA, PI historian, production databases, and well data management systems capture operational measurements. Decision Memory captures the human reasoning layer that sits on top of those measurements. Both are necessary. Neither substitutes for the other.

It is not a surveillance tool. Decision Memory extracts operational decisions from work communications. It does not monitor individual productivity, track time, or evaluate employee performance. The purpose is to capture institutional knowledge, not to create a record of who said what for disciplinary purposes. Operators who deploy this technology need clear policies communicating this to their teams.

It is not infallible. AI extraction from meeting transcripts will sometimes misidentify a suggestion as a decision, miss a decision made in a side conversation, or extract an incomplete rationale. The system improves with tuning and feedback, but it will never achieve 100% capture accuracy. The relevant comparison is not perfection -- it is the current state, where capture accuracy for operational decisions is approximately 0%.
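The two failure modes named above, a suggestion misread as a decision and a real decision missed, correspond to the standard precision and recall measures. The following is a small worked sketch of how an operator might score extraction against a human-reviewed sample; the function name `extraction_quality` and the decision IDs are hypothetical, and the numbers are invented purely to show the arithmetic.

```python
def extraction_quality(extracted, actual):
    """Compare extracted decision IDs against a human-reviewed set.

    precision: share of extracted items that were real decisions
    recall:    share of real decisions that were extracted
    """
    extracted, actual = set(extracted), set(actual)
    true_positives = len(extracted & actual)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Invented sample: five real decisions in one reviewed meeting.
actual = {"d1", "d2", "d3", "d4", "d5"}
# The system found three of them and mislabeled one suggestion ("s1").
extracted = {"d1", "d2", "d3", "s1"}

precision, recall = extraction_quality(extracted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this invented sample, precision is 3/4 and recall is 3/5. Tracking both matters because tuning that reduces false decisions tends to increase missed ones, and the reverse.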

It is not a magic solution to organizational culture problems. If an organization has a culture where people hoard information, avoid documentation, or resist transparency, technology alone will not fix it. Decision Memory lowers the friction of knowledge capture from "fill out a form" to "do nothing differently." But it requires organizational willingness to let the system listen to operational communications. Some operators will not be comfortable with this. That is a legitimate concern, not a technical problem.


The Path Forward

The oil and gas industry has known about the knowledge loss problem for at least 20 years. The Great Crew Change has been discussed at every SPE conference, in every workforce planning report, and in every executive briefing since the mid-2000s. And yet, in 2026, 75 to 90 percent of operational knowledge still lives exclusively in people's heads.

The reason is not lack of awareness. It is that until now, every proposed solution required people to do something they will not do: manually enter their knowledge into a system. Every lessons-learned database, every SharePoint knowledge base, every knowledge management initiative has foundered on the same rock. People are busy running operations. They will not stop to fill out forms.

Operational Decision Memory is the first approach that does not require behavioral change. It captures decisions from the communications people are already having. It does not add work. It does not add forms. It does not require a knowledge management initiative, a change management program, or an executive mandate to "start documenting your decisions."

It just listens. And it remembers.

The 20 years of decisions that walked out the door with Bob cannot be recovered. But the next 20 years do not have to be lost the same way.


Have questions about Operational Decision Memory for your operations? Get in touch.

3-Month Pilot

$35,000 – $50,000

One asset team. Full deployment. Prove the value before committing.

  • Deployment and integration with your Teams and email environment
  • Decision extraction tuning for your operational vocabulary
  • Decision timeline dashboard
  • Audit report templates

Ongoing after pilot: $8,000 – $15,000/month depending on team size and data volume.
