Refrac Candidate Selection with ML: How to Identify the Eagle Ford and Permian Wells Worth Refracing

Dr. Mehrdad Shirangi | Published by Groundwork Analytics LLC

Editorial disclosure

This article reflects the independent analysis and professional opinion of the author, informed by published research, public operator disclosures, and hands-on experience building machine learning models for petroleum engineering applications. No vendor or operator reviewed or influenced this content prior to publication.

The refrac opportunity in North American shale is enormous and growing. Tens of thousands of horizontal wells completed between 2010 and 2018 used completion designs that are now considered suboptimal -- lower proppant loading, wider cluster spacing, fewer stages, and limited understanding of landing zone optimization. These vintage wells sit on depleted but far-from-exhausted rock, and modern completion techniques can unlock reserves that the original stimulation left behind.

The economics are attractive. BP's BPX Energy has reported triple-digit returns on Eagle Ford refracs, with refracs producing more flow than the original motherbores. ConocoPhillips expects its Marathon-inherited Eagle Ford refrac program to deliver 60% EUR uplift at a cost of supply in the low-to-high $30s per barrel. Devon Energy reports refrac wells producing 70-80% of the original IP rate, effectively resetting the decline clock. Verdun Oil Company, the quiet leader in Eagle Ford refracs, has executed approximately 100 recompletions delivering over 14 million incremental BOE to market.

But here is the problem that every completions engineer and VP Completions faces: refracs cost $3-5 million each, success rates vary from 30% to 70% depending on candidate selection methodology, and the wrong pick burns capital while the well next to it might have been a home run. The difference between a top-quartile refrac program and a mediocre one is not pumping technique or diverter chemistry. It is candidate selection.

Traditional screening approaches rely on empirical rules -- wells completed before a certain year, below a certain proppant loading threshold, in a certain landing zone. These rules work at a coarse level, but they miss the complex, nonlinear interactions between geology, original completion design, depletion state, well spacing, and pressure history that actually determine whether a refrac will deliver economic uplift. Machine learning does not miss those interactions. That is why the operators running the most successful refrac programs are building ML-driven screening workflows, and why your next refrac candidate list should be ranked by a model, not a spreadsheet filter.

The Scale of the Opportunity

The numbers are staggering. In the Eagle Ford alone, ConocoPhillips has identified approximately 500 refrac candidates on Freehold royalty lands -- and that is one operator on a fraction of its acreage. Verdun Oil has identified roughly 700 recompletion candidates across its Eagle Ford position. Industry-wide, analysts estimate that the Eagle Ford contains several thousand wells completed with legacy designs that are viable refrac candidates.

The Permian Basin dwarfs even the Eagle Ford in scale. With tens of thousands of horizontal wells drilled between 2012 and 2018 using completion designs that look primitive by today's standards, the Permian represents the largest potential refrac inventory in North America. Wells from the 2013-2018 vintage are considered the sweet spot -- completed with 1,000-1,500 pounds of proppant per lateral foot when today's designs use 2,000-2,500 pounds or more, and with wider cluster spacing that left significant rock unstimulated.

Yet refracs currently account for only 1-2% of completions activity. The gap between the addressable inventory and actual activity reflects a screening problem, not a technical one. Operators know how to pump a modern refrac. What they lack is confidence in identifying which wells will respond.

Why Traditional Screening Fails

The standard approach to refrac candidate selection uses threshold-based screening: filter for wells completed before 2016, with original proppant loading below 1,500 pounds per foot, lateral lengths exceeding 5,000 feet, and evidence of remaining reserves (typically estimated from decline curve analysis or volumetric calculations). Verdun Oil, for instance, has singled out wells with lateral lengths of around 6,000 feet completed before 2016.
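A threshold screen of this kind is trivial to implement, which is part of its appeal. The sketch below uses the criteria just described; the field names (`vintage_year`, `proppant_lbs_per_ft`, etc.) and the reserves floor are illustrative, not any operator's actual schema or cutoff.

```python
# Minimal threshold screen of the kind described above. Field names and the
# remaining-reserves floor are illustrative assumptions.

def passes_threshold_screen(well: dict) -> bool:
    """Binary pass/fail: each condition is evaluated independently."""
    return (
        well["vintage_year"] < 2016
        and well["proppant_lbs_per_ft"] < 1500
        and well["lateral_ft"] > 5000
        and well["remaining_reserves_mboe"] > 100  # assumed economic floor
    )

wells = [
    {"api": "42-255-00001", "vintage_year": 2014, "proppant_lbs_per_ft": 1100,
     "lateral_ft": 6200, "remaining_reserves_mboe": 180},
    {"api": "42-255-00002", "vintage_year": 2017, "proppant_lbs_per_ft": 1300,
     "lateral_ft": 7500, "remaining_reserves_mboe": 220},
]

candidates = [w["api"] for w in wells if passes_threshold_screen(w)]
```

Note that the second well fails solely on vintage, regardless of how favorable its other attributes are -- exactly the rectangular decision boundary discussed next.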

This approach has two fundamental limitations.

It Treats Features as Independent

A threshold filter asks: is the proppant loading below X? Is the lateral length above Y? Is the vintage before Z? Each condition is evaluated independently. But refrac performance is driven by the interaction of these variables. A short-lateral well with very low proppant loading in high-quality rock might be a better candidate than a long-lateral well with moderate proppant loading in marginal rock. The threshold approach cannot express this. It applies a rectangular decision boundary to a problem with a highly nonlinear decision surface.

It Cannot Rank Candidates

Even when threshold screening correctly identifies a population of viable candidates, it provides no basis for ranking them. If your budget allows 15 refracs this year and your screening criteria identify 200 candidates, how do you pick the best 15? Threshold screening gives you a binary pass/fail. It does not tell you which well is expected to deliver 400 BOPD IP versus 150 BOPD IP. The difference between those two outcomes is the difference between a 150% ROR and a break-even project.

Some teams attempt ranking by combining screening criteria with a weighted scoring rubric. This is better than pure thresholds but still requires the engineer to specify the weights -- effectively encoding their assumptions about what drives refrac success into the scoring system. If those assumptions are wrong or incomplete, the ranking is wrong.

Machine learning replaces assumed weights with learned weights. It discovers the relative importance of each feature from actual refrac outcomes, including interaction effects that no engineer would think to specify in a scoring rubric.

The ML Approach: Building a Refrac Screening Model

A well-constructed refrac screening model follows a supervised learning workflow: train on historical refrac outcomes (production uplift, post-refrac EUR, or economic return), then predict the expected outcome for unrefrac'd candidate wells. The architecture has three components: feature engineering, model selection, and economic integration.

Feature Engineering: What the Model Needs to See

The input features for a refrac screening model span four domains, and getting the feature engineering right matters more than the choice of algorithm.

Original Completion Design. The most important feature category. This includes original proppant loading (pounds per lateral foot), fluid volume per foot, number of stages, cluster spacing, perforation design (cluster count, shot density), and fluid type (slickwater, crosslinked gel, hybrid). The gap between original completion intensity and modern design standards is the primary driver of refrac potential. Wells completed with 800-1,200 lbs/ft of proppant when current designs call for 2,000+ lbs/ft have the most unstimulated rock available.

Production History. Cumulative production (oil, gas, water), current production rate, time on production, decline rate (both instantaneous and over trailing 6-12 months), GOR trend, and water cut trend. The production history encodes information about how much resource has been recovered and how the reservoir is behaving. A well with low cumulative recovery relative to its estimated OOIP and a steep early decline may have had poor fracture coverage -- a good refrac signal.

Reservoir and Geology. Landing zone (Upper vs. Lower Eagle Ford, Wolfcamp A vs. B vs. C), net pay thickness, total organic carbon (TOC), thermal maturity (Ro), porosity, permeability estimates, and pressure gradient. These features capture the resource quality. A refrac in high-quality rock with favorable pressure has a higher ceiling than one in marginal rock regardless of the completion design gap.

Spacing and Depletion Context. Distance to nearest offset wells, number of adjacent wells, vintage of adjacent wells, and estimated pressure depletion from production data or pressure surveys. This category captures whether the drainage area has been depleted by offsets, which affects both the remaining resource and the risk of fracture hits during the refrac. Wells in sections with heavy infill development may have less remaining resource and higher frac-hit risk, both of which reduce expected refrac performance.

Engineered Features. Beyond raw inputs, derived features significantly improve model performance. Proppant intensity ratio (original proppant loading divided by current best-practice loading for that area), recovery factor estimate (cumulative production divided by estimated OOIP), production efficiency (cumulative production per lateral foot), and vintage-adjusted decline (comparing actual decline to type-curve expectation for the completion vintage). These engineered features encode domain knowledge about what constitutes "underperformance" or "untapped potential" in a way the raw data does not.
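The four derived features above can be computed directly from raw well records. The sketch below assumes hypothetical input keys (`cum_boe`, `ooip_boe`, `type_curve_cum_boe`) and a placeholder best-practice proppant loading; substitute your own data model and area-specific benchmark.

```python
def engineered_features(w: dict, best_practice_lbs_ft: float = 2500.0) -> dict:
    """Derived features described above. Input keys and the best-practice
    loading benchmark are illustrative assumptions."""
    return {
        # Gap between original and modern completion intensity
        "proppant_intensity_ratio": w["proppant_lbs_per_ft"] / best_practice_lbs_ft,
        # Fraction of estimated OOIP already recovered
        "recovery_factor": w["cum_boe"] / w["ooip_boe"],
        # Cumulative production normalized by lateral length
        "boe_per_lateral_ft": w["cum_boe"] / w["lateral_ft"],
        # Actual cumulative vs. type-curve expectation for the vintage
        "type_curve_deviation": w["cum_boe"] / w["type_curve_cum_boe"] - 1.0,
    }

well = {"proppant_lbs_per_ft": 1000.0, "cum_boe": 250_000.0,
        "ooip_boe": 5_000_000.0, "lateral_ft": 6000.0,
        "type_curve_cum_boe": 320_000.0}
feats = engineered_features(well)
```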

Model Selection: What Works

Research on ML-based refrac screening -- including work presented at URTeC using XGBoost and Factor Contribution Analysis on 1,127 refractured wells across 14 plays -- has converged on a set of model architectures that consistently perform well for this problem.

Gradient Boosting (XGBoost, LightGBM, CatBoost). The workhorse for tabular data problems in petroleum engineering. Gradient-boosted trees handle mixed feature types (continuous and categorical), are robust to outliers, automatically capture feature interactions, and provide built-in feature importance rankings. XGBoost in particular has become the default for refrac screening in research and practice. It handles the moderate dataset sizes typical of refrac problems (hundreds to low thousands of observations) better than deep learning, which needs orders of magnitude more data to outperform gradient boosting on tabular data.
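The core fit-and-rank loop is short. The sketch below uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost (the API shape is similar), trained on synthetic data whose uplift is driven by an interaction of proppant gap, rock quality, and lateral length -- the kind of surface a threshold filter cannot express but trees capture automatically. All feature names and the response function are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 400

# Synthetic candidate wells: proppant gap, rock quality, lateral length.
proppant_gap = rng.uniform(0.0, 1.0, n)   # 1.0 = largest gap vs. modern design
rock_quality = rng.uniform(0.0, 1.0, n)
lateral_kft = rng.uniform(4.0, 10.0, n)

# Assumed response: uplift driven by the *interaction* of the three features,
# plus noise. This is a toy signal, not a calibrated type curve.
uplift = (300.0 * proppant_gap * rock_quality * (lateral_kft / 10.0)
          + rng.normal(0.0, 10.0, n))

X = np.column_stack([proppant_gap, rock_quality, lateral_kft])
model = GradientBoostingRegressor(n_estimators=300, max_depth=3,
                                  learning_rate=0.05, random_state=0)
model.fit(X, uplift)

print("train R^2:", round(model.score(X, uplift), 3))
print("feature importances:", model.feature_importances_.round(2))
```

In practice you would evaluate on a holdout split (see the validation strategy below) rather than training R², and swap in `xgboost.XGBRegressor` with early stopping.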

Random Forests. Useful as a benchmark and for ensemble diversity. Random forests are less prone to overfitting than individual gradient-boosted models and provide uncertainty estimates through the variance across trees. In a production screening workflow, running both XGBoost and a random forest and comparing results provides a simple robustness check -- wells that rank highly in both models are higher-confidence candidates.

Physics-Informed Approaches. The most sophisticated implementations embed domain physics into the ML model. This can take several forms: using a physics-informed decline curve model to generate engineered features (estimated remaining reserves, type-curve deviation), constraining the model output to honor material balance (predicted post-refrac EUR cannot exceed estimated OOIP minus cumulative production), or using a simplified reservoir simulation as a feature generator (estimated fracture half-length, stimulated reservoir volume from the original completion).

Physics-informed approaches are especially valuable for refrac screening because the training dataset is inherently small. Most operators have refrac'd fewer than 100 wells, and many have refrac'd fewer than 20. With limited training data, embedding domain knowledge through physics constraints compensates for statistical limitations of the learning algorithm.
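The simplest physics constraint to implement is the material-balance cap mentioned above: whatever the model predicts, total recovery cannot exceed the estimated resource in place. A minimal sketch, with all volumes in BOE:

```python
def constrain_eur(predicted_refrac_eur: float, ooip: float, cum: float) -> float:
    """Clip a model's post-refrac incremental EUR prediction so that total
    recovery never exceeds estimated OOIP (material balance). A crude but
    effective guardrail when training data is scarce."""
    remaining = max(ooip - cum, 0.0)
    return min(predicted_refrac_eur, remaining)
```

An optimistic model output of 600 MBOE against a well with only 300 MBOE of estimated remaining resource gets clipped to 300 MBOE; predictions already inside the physical envelope pass through unchanged.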

What does not work well: Deep neural networks (insufficient data for most operators), linear regression (cannot capture nonlinear interactions that dominate refrac performance), and unsupervised clustering alone (identifies groups but does not predict performance).

Feature Importance: What the Models Reveal

One of the most valuable outputs of an ML refrac screening model is not the candidate ranking itself but the feature importance analysis -- what the model learned about what drives refrac success.

Across multiple published studies and our own modeling work, the feature importance hierarchy for refrac performance prediction is remarkably consistent:

  1. Original proppant loading (lbs/ft). The single most predictive feature. The gap between original and modern proppant loading is the strongest indicator of remaining unstimulated rock volume. The URTeC study on 1,127 refrac'd wells across 14 plays found that proppant per foot was one of only three features with significant impact on post-refrac performance.
  2. Rock quality (landing zone, TOC, maturity). The second most important category. Refracs in the best rock deliver outsized results regardless of original completion design. You cannot refrac your way out of bad rock.
  3. Lateral length. Longer laterals have more rock to restimulate. The URTeC study identified lateral length as one of the three significant features, and Verdun Oil's focus on wells with ~6,000-foot laterals reflects this empirically.
  4. Production history indicators. Wells with low recovery factors relative to rock quality and low cumulative production per lateral foot are under-recovered -- exactly the wells with the most to gain from restimulation.
  5. Spacing and depletion context. This acts as a downside limiter. Wells in heavily developed sections may show all the hallmarks of good refrac candidates (low proppant, good rock, short laterals) but deliver disappointing results because the offsets have already drained significant reserves from the target interval.
  6. Vintage and completion era. A proxy for multiple design decisions that changed simultaneously (proppant loading, cluster spacing, fluid type, diversion technology). Useful for coarse screening but partially redundant with direct completion parameters when those are available.

SHAP (SHapley Additive exPlanations) analysis on gradient-boosted models provides more nuance than simple feature importance rankings. SHAP values reveal interaction effects: for example, low proppant loading is most predictive of strong refrac performance when paired with high-quality rock and low offset-well density. The same low proppant loading in marginal rock with tight spacing produces SHAP values near zero -- the model has learned that the combination does not work, even though each individual feature looks favorable.
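In production workflows the shap library's `TreeExplainer` computes these values efficiently for tree ensembles. To make the underlying idea concrete, the sketch below computes exact Shapley values for a toy three-feature scoring function by brute force over all feature orderings -- the scoring function and baseline values are invented, and the O(n!) loop is only viable for a handful of features.

```python
from itertools import permutations

def exact_shapley(score, features: dict, baseline: dict) -> dict:
    """Exact Shapley values: average marginal contribution of each feature
    over all orderings. O(n!) -- demo only; use shap.TreeExplainer in practice."""
    names = list(features)
    phi = {k: 0.0 for k in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        prev = score(current)
        for name in order:
            current[name] = features[name]  # flip this feature to its value
            val = score(current)
            phi[name] += val - prev         # marginal contribution
            prev = val
    return {k: v / len(orders) for k, v in phi.items()}

# Toy refrac-uplift score with an interaction: low proppant only pays off
# in good rock (mirrors the SHAP behavior described above).
def score(w):
    return (100.0 * (1.0 - w["proppant_ratio"]) * w["rock_quality"]
            - 20.0 * w["spacing_density"])

phi = exact_shapley(
    score,
    features={"proppant_ratio": 0.4, "rock_quality": 0.9, "spacing_density": 0.2},
    baseline={"proppant_ratio": 1.0, "rock_quality": 0.5, "spacing_density": 0.5},
)
```

The additivity property is the point: the per-feature values sum exactly to the difference between the well's score and the baseline score, which is what makes SHAP attributions auditable well by well.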

Economic Screening Integration

A refrac screening model that predicts production uplift but ignores economics is only half-finished. The complete workflow integrates predicted production into an economic model that accounts for the full cost structure and produces a ranked candidate list based on expected return.

Refrac Cost Structure

Refrac costs vary by method and basin but fall into a well-defined range:

  • Full mechanical isolation (plug-and-perf): $3.5-5.0 million. The most expensive approach, involving milling out the existing completion, running a liner, and performing a full restimulation. Delivers the most complete restimulation but requires well downtime and intervention risk.
  • Diversion-based (bullhead or limited entry): $1.0-2.5 million. Lower cost with continuous pumping capability. Limited ability to control exactly where the fluid goes, but operators like Devon Energy have reported strong results in the Eagle Ford using this approach.
  • Hybrid approaches: $2.0-3.5 million. Combining coiled-tubing-deployed diversion with selective perforating. Increasingly common as operators balance cost against coverage.

For economic screening, the model must map predicted production uplift (incremental barrels over the base decline) to a net present value using assumed commodity pricing, LOE, taxes, and royalties, then rank candidates by NPV-to-investment ratio or internal rate of return.
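The mapping from predicted incremental barrels to a ranked list can be sketched in a few lines. The price, LOE, net revenue interest, and discount rate below are placeholder assumptions, not a recommendation; a real implementation would use the operator's price deck and full tax treatment.

```python
def incremental_npv(monthly_boe, price_per_boe=55.0, loe_per_boe=12.0,
                    nri=0.75, annual_discount=0.10):
    """Discount a stream of incremental monthly barrels (above base decline)
    to present value. All economic inputs are placeholder assumptions."""
    m_rate = (1.0 + annual_discount) ** (1.0 / 12.0) - 1.0
    npv = 0.0
    for t, boe in enumerate(monthly_boe, start=1):
        cash = boe * (price_per_boe - loe_per_boe) * nri
        npv += cash / (1.0 + m_rate) ** t
    return npv

def rank_candidates(candidates):
    """Rank refrac candidates by NPV-to-investment ratio (higher is better)."""
    scored = [(c["api"], incremental_npv(c["monthly_boe"]) / c["capex"])
              for c in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

Ranking by NPV-to-investment rather than raw NPV matters when capital is constrained: a smaller, cheaper uplift can beat a larger, more expensive one.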

Expected Uplift Ranges and Decline Behavior

Post-refrac production behavior follows a pattern that is now well-documented from hundreds of completed refracs:

  • IP uplift: Successful refracs in the Eagle Ford deliver initial production rates of 70-100% of the original IP. Devon's wells have been producing 70-80% of the original IP rate, essentially resetting the decline. BP has seen refracs producing more than the original motherbore.
  • Decline behavior: Post-refrac decline curves typically show steeper initial decline than the original well (hyperbolic b-factor of 0.8-1.2 in the first 12 months) before settling into a shallower terminal decline. The decline is steeper because the refrac accesses partially depleted rock, which produces faster initial transients. Modeling post-refrac decline accurately requires physics-informed approaches -- a topic we covered in detail in our article on physics-informed decline curve analysis.
  • Returns: At current commodity prices, well-selected refracs deliver returns that compete with or exceed new drill economics. Devon has reported 97-250% RORs at its Zgabay project. BP reports triple-digit returns. Verdun's 100-well program has delivered 14 million incremental BOE. The key word is "well-selected" -- the best results come from rigorous candidate screening, not from refrac'ing everything that passes a vintage filter.
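The hyperbolic decline shape referenced above follows the standard Arps form, q(t) = qi / (1 + b · Di · t)^(1/b). A minimal sketch, with an illustrative (not field-calibrated) post-refrac parameter set:

```python
def arps_hyperbolic_rate(qi: float, di_annual: float, b: float,
                         t_years: float) -> float:
    """Arps hyperbolic rate: q(t) = qi / (1 + b * Di * t)^(1/b).
    qi in BOPD, Di as nominal annual decline fraction."""
    return qi / (1.0 + b * di_annual * t_years) ** (1.0 / b)

# Illustrative post-refrac decline: 400 BOPD IP, 80%/yr initial decline, b = 1.0.
q0 = arps_hyperbolic_rate(400.0, 0.80, 1.0, 0.0)
q1 = arps_hyperbolic_rate(400.0, 0.80, 1.0, 1.0)
```

Fitting b and Di to actual post-refrac data, and blending with a physics-informed prior when history is short, is where the decline-curve article referenced above picks up.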

The Economic Screening Pipeline

The complete ML-driven economic screening pipeline works as follows:

  1. Feature extraction: Pull completion records, production data, geological attributes, and spacing data for all wells in the candidate universe.
  2. Model prediction: Run the trained gradient-boosted model to predict expected production uplift (incremental 12-month cumulative, incremental 36-month cumulative, and incremental EUR) for each candidate.
  3. Uncertainty quantification: Generate P10/P50/P90 predictions using either a probabilistic model (quantile regression, conformal prediction) or an ensemble of models trained on bootstrap samples. Never make a $4 million capital decision on a point estimate.
  4. Economic calculation: For each candidate, calculate NPV at P10/P50/P90 production levels, using the operator's standard economic assumptions (commodity price deck, LOE, taxes, royalties, discount rate).
  5. Ranking and portfolio optimization: Rank candidates by risk-adjusted return (P50 NPV / investment, or expected value incorporating the probability of sub-economic outcomes). If budget-constrained, solve a portfolio optimization problem to maximize expected program NPV subject to the capital budget.
  6. Engineering review: Present the ranked list with model explanations (SHAP values for each well) to the completions engineering team for review. The model identifies candidates; engineers validate them. No competent operator should refrac a well solely because a model said so.
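Two of the steps above -- percentile extraction from an ensemble and budget-constrained selection -- can be sketched in stdlib Python. The nearest-rank percentile and the greedy knapsack heuristic are simplifications; a production pipeline would use proper quantile estimation and an exact integer-programming solve.

```python
def p10_p50_p90(samples):
    """Percentiles from an ensemble of model predictions, using the oilfield
    convention (P10 = high case, P90 = low case). Nearest-rank method."""
    s = sorted(samples)
    def pct(p):
        return s[min(int(p * len(s)), len(s) - 1)]
    return pct(0.90), pct(0.50), pct(0.10)

def select_within_budget(candidates, budget):
    """Greedy portfolio pick: take the best risk-adjusted candidates that fit
    the capital budget. (An exact knapsack solve is the rigorous version.)"""
    ranked = sorted(candidates, key=lambda c: c["p50_npv"] / c["capex"],
                    reverse=True)
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c["capex"] <= budget:
            chosen.append(c["api"])
            spent += c["capex"]
    return chosen
```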

Case Study Framework: Eagle Ford Refrac Screening with Public Data

The Texas Railroad Commission (RRC) maintains publicly available data on every well permit, completion report, and production record in the state. This creates a unique opportunity to build and validate refrac screening models using entirely public data.

Data Sources

  • Completion records (Form W-2): Original completion design including perforated intervals, fluid volumes, and proppant mass. From 2012 onward, most Eagle Ford completions include lateral length and stage count.
  • Production data (Form PR): Monthly oil, gas, and water production by lease. Individual well allocation requires combining PR data with completion records and operator reporting.
  • Well location and spacing: API number, latitude/longitude, and survey data allow computation of inter-well distances for spacing analysis.

Building the Training Set

The training set consists of wells that have been refrac'd, with the target variable being post-refrac production uplift (typically incremental 12-month cumulative production relative to the pre-refrac base decline). Identifying refrac'd wells in RRC data requires matching recompletion permits to production data showing a step-change in production rate.

The features come from each well's original completion record (pre-refrac) and production history (pre-refrac). Geological attributes can be sourced from public well logs or estimated from regional geology maps. Spacing is computed from well coordinates.
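Computing the spacing features from surface coordinates is straightforward. The sketch below uses the haversine great-circle distance as a proxy for inter-well distance; a rigorous version would use lateral midpoints or minimum 3D distance between wellbore paths rather than surface locations.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance between two surface locations, in feet."""
    r_ft = 20_902_231.0  # mean Earth radius in feet
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2.0 * r_ft * asin(sqrt(a))

def nearest_offset_ft(well, offsets):
    """Distance to the closest offset well -- a key spacing feature."""
    return min(haversine_ft(well["lat"], well["lon"], o["lat"], o["lon"])
               for o in offsets)
```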

With approximately 500-1,000 horizontal well refracs completed in the Eagle Ford through 2025, the training set is large enough for gradient-boosted models but too small for deep learning -- another reason tree-based models are the right choice for this problem.

Validation Strategy

The critical test for any refrac screening model is out-of-sample validation. The recommended approach:

  • Temporal holdout: Train on refracs completed before 2024. Validate on refracs completed in 2024-2025. This tests whether the model generalizes to wells it has never seen, completed under potentially different conditions.
  • Spatial holdout: Train on refracs in certain counties or areas. Validate on refracs in held-out areas. This tests whether the model generalizes across geological settings within the basin.
  • Operator holdout: If sample size permits, train on refracs from some operators and validate on others. This tests whether the model is learning geology and completion design rather than operator-specific practices.

A model that performs well on temporal and spatial holdouts is ready for deployment. A model that only performs well in-sample is memorizing -- and memorization is the enemy of good capital allocation.
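The temporal holdout is the easiest of the three splits to get wrong by accident (leaking future wells into training), so it is worth making explicit in code. A minimal sketch, assuming each refrac record carries a `refrac_date`:

```python
from datetime import date

def temporal_split(refracs, cutoff=date(2024, 1, 1)):
    """Train on refracs completed before the cutoff; validate on later ones.
    Guards against leaking future outcomes into the training set."""
    train = [r for r in refracs if r["refrac_date"] < cutoff]
    holdout = [r for r in refracs if r["refrac_date"] >= cutoff]
    return train, holdout
```

Spatial and operator holdouts follow the same pattern with a county or operator field in place of the date.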

Where the Puck Is Going

The operators building the most sophisticated refrac screening programs are moving beyond simple candidate ranking into several advanced capabilities:

Refrac design optimization. Once the model identifies a candidate, the next question is how to refrac it. ML models trained on refrac design parameters (proppant loading, fluid volume, stage count, diversion method) and post-refrac outcomes can optimize the refrac completion design for each candidate well. This is a distinct model from the screening model -- it asks not "should we refrac this well?" but "given that we are going to refrac this well, what is the optimal design?"

Real-time screening updates. As new production data arrives monthly, the candidate rankings should update. A well that was marginal six months ago may become a strong candidate as its production declines further below the type curve. Automated pipelines that re-score the candidate universe monthly ensure the refrac program is always working from current information.

Transfer learning across basins. Models trained on Eagle Ford refrac outcomes can provide a starting point for Permian screening, even though the geology and completion practices differ. Transfer learning fine-tunes a pre-trained model on a smaller dataset from the target basin, requiring fewer historical refracs in the new basin to build a useful model. As Permian refrac activity increases from its current 1-2% of completions, transfer learning from the Eagle Ford will accelerate the learning curve.

Integration with petro-data infrastructure. Refrac screening models require clean, integrated data from multiple sources -- completion records, production databases, geological models, and well spacing analyses. The petro-mcp open-source MCP server provides a standardized interface for accessing petroleum engineering data and calculations, including decline curve analysis and economic evaluation tools that can plug directly into a refrac screening pipeline. Building screening workflows on open, interoperable data infrastructure ensures the model can ingest data from whatever SCADA, production accounting, or geological modeling system the operator uses.

The Bottom Line for Completions Teams

The refrac opportunity in the Eagle Ford and Permian is real, large, and economically compelling at current commodity prices. The operators capturing the most value -- Verdun with 100+ recompletions and 14 million incremental BOE, Devon with 97-250% RORs, BP with triple-digit returns, ConocoPhillips with 500+ identified candidates -- are not just pumping better refracs. They are picking better wells to refrac.

Machine learning does not replace completions engineering judgment. It augments it by processing more variables, capturing more interactions, and producing a quantitative ranking that spreadsheet screening cannot. The models reveal what drives refrac success (proppant loading gap, rock quality, lateral length), quantify the expected uplift for each candidate, and integrate that prediction into an economic framework that makes capital allocation decisions defensible.

If you are running a refrac program or evaluating whether to start one, the question is not whether ML-based screening adds value. The published results and operator disclosures make that case clearly. The question is whether your current screening approach -- whatever combination of vintage filters, engineering judgment, and production analysis you are using today -- is leaving value on the table by missing the nonlinear interactions that determine refrac success. For most operators, the answer is yes.

The wells worth refracing are hiding in your database. The right model can find them.


Dr. Mehrdad Shirangi is the founder of Groundwork Analytics and holds a PhD from Stanford University in Energy Systems Optimization. His published work on closed-loop field development and prescriptive analytics for well completions bridges the gap between academic research and operational deployment. Connect on X/Twitter and LinkedIn, or reach out at info@petropt.com.

Building a refrac screening workflow or evaluating your candidate selection process? Let's talk about your wells.

