Detecting Oscillating Submission Lags in HBO Therapy Claims: A Variance‑Based Fraud Detection Metric for Medicare Program Integrity
Definition of Key Term – Submission Lag
Throughout this paper, submission lag is defined as:
Submission Lag = Submission Date – Claim‑Line From Date (in days)
Where:
Claim‑Line From Date = the date the service was actually provided (e.g., date of HBO therapy session).
Submission Date = the date the provider submitted the claim to the payer (e.g., Medicare).
Examples:
Service on Jan 1, submitted on Jan 1 → lag = 0 days.
Service on Jan 1, submitted on Feb 20 → lag = 50 days.
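The two examples can be checked with SAS date arithmetic. This is an illustrative sketch only: the dataset name lag_examples and the year 2024 are assumptions, since the examples give no year (the result is the same in any year because both dates fall before the end of February).

```sas
/* Illustrative check of the two examples; the year 2024 is an assumption. */
data lag_examples;
    format service_date submission_date date9.;
    input service_date :date9. submission_date :date9.;
    /* SAS dates are day counts, so subtraction gives the lag in days */
    lag_days = submission_date - service_date;
    datalines;
01JAN2024 01JAN2024
01JAN2024 20FEB2024
;
run;

proc print data=lag_examples noobs;
run;
```

The first row gives lag_days = 0 and the second gives lag_days = 50, matching the examples.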
This lag is not inherently suspicious; providers may batch claims weekly or monthly. However, certain patterns of lags can indicate manipulation.
Executive Summary
The Issue
Medicare limits Hyperbaric Oxygen Therapy (HBO) to 60 sessions per 365 days based on service dates. Some providers manipulate submission dates – not service dates – to evade prepayment edits and hide excessive sessions. They create an alternating pattern of submission lags (e.g., 0 days, then 100 days, then 0, then 100…). Both the mean lag and simple batching rules miss this pattern.
The Solution
Two variance‑based metrics calculated from sorted claim sequences:
1. Variance of submission lag – measures the amplitude of oscillation.
2. Variance of reordering – measures how far claims are shuffled out of service order when they are sorted by submission date.
Together, they flag providers who are “gaming” the submission timing.
Impact
In a pilot review of 45 HBO providers, two had extreme values on both metrics. Audit confirmed one case of backdating (90+ days) and one case of exceeding the 60‑session limit (85 sessions). Both were referred for recovery.
Personal note – earning the endorsement
For nearly a year I had asked Dr. Zhenhua Huang (PhD in biostatistics) for a LinkedIn endorsement for SAS and Statistics, and he never responded. He had himself been trying to work out how others used variance in similar models, and after I showed him this variance‑based approach he finally gave the endorsement. This paper is dedicated to that principle: earn it.
1. The Scam – “Submission Lag Offsetting”
The rule: No more than 60 HBO sessions in any rolling 365‑day period (by service date).
The cheat:
Deliver 60 sessions legitimately (service dates Jan–Jun).
Submit half on time (lag=0), half with a long lag (e.g., 100 days).
Deliver a second block of sessions later in the year, but submit those with the opposite pattern.
Why?
Many payers run prepayment edits only on claims submitted within 90 days. The alternating pattern ensures half the claims skip prepayment checks. Also, when sorted by submission date, the two blocks interleave, hiding the true service date density.
The clue: Normal providers have low‑variance lags (e.g., all 45 days). Alternating schemes produce high variance and scrambled submission order.
2. Metrics – Technical Definition
Let a provider have n claims for a given patient (or aggregated at provider level). Sort claims by service date (oldest to newest). Assign service_order = 1,2,…,n.
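Define submission lag for claim i:
\[ \mathrm{lag}_i = \mathrm{submission\_date}_i - \mathrm{service\_date}_i \]
Metric 1 – Variance of lag
\[ \mathrm{var\_lag} = \frac{1}{n-1} \sum_{i=1}^{n} \left( \mathrm{lag}_i - \overline{\mathrm{lag}} \right)^2 \]
Metric 2 – Variance of reordering
Sort claims by submission date (ties broken by service date). Assign submission_order = 1,2,…,n. For each claim, compute the absolute difference:
\[ d_i = \left| \mathrm{service\_order}_i - \mathrm{submission\_order}_i \right| \]
Then calculate:
\[ \mathrm{var\_reorder} = \frac{1}{n-1} \sum_{i=1}^{n} \left( d_i - \bar{d} \right)^2 \]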
Rule of thumb (based on simulation, n≥15):
Low risk: var_lag below the peer median AND var_reorder ≈ 0
Medium risk: either metric above 75th percentile
High risk: both metrics above 90th percentile (flag for audit)
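One way the medium and high tiers could be assigned is sketched below. Everything here is an assumption made for illustration: the dataset names (demo_metrics, demo_pctl, demo_tiers), the made-up metric values, and the simplification that any provider not reaching the medium or high tier is labelled Low.

```sas
/* Hypothetical provider-level metrics; values are made up for illustration. */
data demo_metrics;
    input provider_id $ var_lag var_reorder;
    datalines;
P01 4 0
P02 9 0
P03 16 0.2
P04 25 0.3
P05 36 0.5
P06 49 0.8
P07 64 1.1
P08 81 1.5
P09 900 2.0
P10 2800 6.5
;
run;

/* Peer percentiles: one prefix per analysis variable, in VAR order. */
proc univariate data=demo_metrics noprint;
    var var_lag var_reorder;
    output out=demo_pctl pctlpre=P_lag_ P_reo_ pctlpts=75 90;
run;

/* Attach the one-row percentile table to every provider and assign a tier. */
data demo_tiers;
    if _n_ = 1 then set demo_pctl;
    set demo_metrics;
    length risk_tier $6;
    if var_lag > P_lag_90 and var_reorder > P_reo_90 then risk_tier = 'High';
    else if var_lag > P_lag_75 or var_reorder > P_reo_75 then risk_tier = 'Medium';
    else risk_tier = 'Low';   /* simplification: everything else */
run;
```

The High test runs first because a provider above the 90th percentile on both metrics also satisfies the Medium condition.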
3. Results from Pilot Data
Using simulated data that mirrored real patterns (n=10 per provider, as in the attached Excel file):
Provider B would be flagged for high var_lag alone. Provider D (random, chaotic submission) would be flagged for both. In real data, high var_reorder without high var_lag might indicate a different issue (e.g., frequent resubmissions). The two‑metric approach reduces false positives.
4. Discussion
Why variance beats mean:
Mean lag is blind to oscillation. Variance captures the amplitude. This is what distinguishes suspicious alternating patterns from normal batch billing.
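As a hypothetical illustration (numbers chosen for arithmetic convenience, not taken from the pilot), compare two providers with n = 10 claims each: one with a constant 50-day lag and one alternating between 0 and 100 days. Both have the same mean lag, but only the second has a non-zero sample variance:

```latex
\[
\overline{\mathrm{lag}}_{\text{const}} = \overline{\mathrm{lag}}_{\text{alt}} = 50,
\qquad
s^2_{\text{const}} = 0,
\qquad
s^2_{\text{alt}} = \frac{1}{10-1}\sum_{i=1}^{10}\left(\mathrm{lag}_i - 50\right)^2
                 = \frac{10 \cdot 50^2}{9} \approx 2777.8
\]
```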
Why reordering matters:
A provider who batches every 45 days will have zero reordering variance. A provider who alternates will scramble submission order, producing positive reordering variance. The combination is powerful.
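The effect can be seen on a toy sequence. The sketch below assumes ten claims on consecutive service days with lags alternating 0 and 100 days. The dataset name toy_claims, the provider ID and the start date are all hypothetical.

```sas
/* Toy data: 10 consecutive service days, lags alternating 0 and 100 days. */
/* The dataset name toy_claims and the start date are hypothetical.        */
data toy_claims;
    provider_id = 'TOY';
    do k = 1 to 10;
        service_date = '01JAN2024'd + k - 1;
        lag_days = 100 * (mod(k, 2) = 0);      /* 0 for odd k, 100 for even k */
        submission_date = service_date + lag_days;
        output;
    end;
    format service_date submission_date date9.;
    drop k;
run;

/* Rank by service date. */
proc sort data=toy_claims; by provider_id service_date submission_date; run;
data toy_claims;
    set toy_claims;
    by provider_id;
    if first.provider_id then service_order = 0;
    service_order + 1;
run;

/* Rank by submission date (ties broken by service date), then displacement. */
proc sort data=toy_claims; by provider_id submission_date service_date; run;
data toy_claims;
    set toy_claims;
    by provider_id;
    if first.provider_id then submission_order = 0;
    submission_order + 1;
    displacement = abs(service_order - submission_order);
run;

/* Sample variance of the lag and of the displacement. */
proc means data=toy_claims n mean var maxdec=2;
    class provider_id;
    var lag_days displacement;
run;
```

Worked by hand, the five on-time claims take submission positions 1 to 5 and the five delayed claims take positions 6 to 10. In service order the displacements are 0, 4, 1, 3, 2, 2, 3, 1, 4, 0, so var_reorder = 20/9 ≈ 2.22 and var_lag = 25000/9 ≈ 2777.8. With a constant lag, both would be zero.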
Limitations:
Small claim counts (<15) give unstable variances.
Trends (e.g., linearly increasing lags) also increase variance; detrending may be required.
Not diagnostic – flags only indicate need for audit.
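The detrending mentioned above could be done by regressing the lag on service order and taking the variance of the residuals. This is a sketch under stated assumptions, not part of the original method: it assumes SAS/STAT (PROC REG) is available, a linear trend is adequate, and the dataset names trend_demo and detrended are hypothetical.

```sas
/* Hypothetical input: one provider whose lag grows by about 5 days per claim. */
data trend_demo;
    provider_id = 'TREND';
    do service_order = 1 to 20;
        lag_days = 5 * service_order + 2 * mod(service_order, 3);
        output;
    end;
run;

/* Fit lag_days = a + b * service_order per provider and keep the residuals. */
proc reg data=trend_demo noprint;
    by provider_id;
    model lag_days = service_order;
    output out=detrended r=lag_resid;
run;
quit;

/* Compare the raw variance with the detrended (residual) variance. */
proc means data=detrended n var maxdec=2;
    by provider_id;
    var lag_days lag_resid;
run;
```

A steadily lengthening lag has a large raw variance but a small residual variance. An alternating pattern stays large on both, so the residual variance is the safer quantity to benchmark when trends are present.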
Extensions:
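Add the lag‑1 autocorrelation of the submission‑lag series (in service order) to explicitly test for alternation; a strongly negative value indicates an alternating pattern.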
Use peer‑group benchmarking (specialty, region) instead of fixed percentiles.
Integrate into automated monthly monitoring dashboard.
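The autocorrelation extension could be prototyped as below. This is a minimal sketch: the dataset names (alt_demo, alt_pairs, lag1_corr) and the toy data are hypothetical, and the Pearson correlation between each lag and the previous one is used as an approximation to the lag-1 autocorrelation.

```sas
/* Hypothetical input: lags alternating 0 and 100 days in service order. */
data alt_demo;
    provider_id = 'ALT';
    do service_order = 1 to 20;
        lag_days = 100 * (mod(service_order, 2) = 0);
        output;
    end;
run;

/* Pair each claim's lag with the previous claim's lag within provider. */
data alt_pairs;
    set alt_demo;
    by provider_id;
    prev_lag_days = lag(lag_days);            /* LAG must run on every row */
    if first.provider_id then prev_lag_days = .;
run;

/* Correlation of lag_days with its previous value, one result per provider. */
proc corr data=alt_pairs noprint outp=lag1_corr;
    by provider_id;
    var lag_days;
    with prev_lag_days;
run;
```

For a perfectly alternating series like this one the correlation is -1. A provider with a constant lag has zero variance, so its correlation is undefined rather than negative.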
5. Conclusion
A simple, explainable metric – variance – can uncover a sophisticated submission timing scam that mean‑based statistics miss. The dual‑metric approach (lag variance + reordering variance) is easy to implement in SAS, requires no machine learning, and has already led to real recoveries. For program integrity analysts, it’s a new tool in the toolkit.
Acknowledgments
I thank my esteemed and dear friend, Dr. Zhenhua Huang, who made me earn every bit of praise, and whose honesty and rigor I deeply respect.
Code and methodology are open for reuse. Contact me for collaboration or questions.
SAS Implementation
*** Step 1: Sort by provider and service date;
proc sort data=claims out=step1;
by provider_id service_date submission_date; run;
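*** Step 2: Create service_order and compute lag_days;
data step2;
set step1;
by provider_id;
lag_days = submission_date - service_date; /* assumes both are numeric SAS date values */
if first.provider_id then service_order = 0;
service_order + 1; run;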
*** Step 3: Sort by provider and submission date to get submission_order;
proc sort data=step2 out=step3;
by provider_id submission_date service_date; run;
data step4;
set step3;
by provider_id;
if first.provider_id then submission_order = 0;
submission_order + 1; run;
*** Step 4: Sort back into service order for variance calculation;
proc sort data=step4 out=final_aligned;
by provider_id service_order; run;
*** Step 5: Compute metrics using PROC SQL;
proc sql;
create table provider_metrics as
select provider_id,
count(*) as claim_count,
mean(lag_days) as mean_lag,
var(lag_days) as var_lag,
var(abs(service_order - submission_order)) as var_reorder
from final_aligned
group by provider_id
having count(*) >= 15; quit;
*** Step 6: Flag outliers (above the 90th percentile on both metrics);
proc univariate data=provider_metrics noprint;
var var_lag var_reorder;
output out=pctl pctlpre=P_var_lag_ P_var_reorder_ pctlpts=75 90; run;
data flagged;
if _n_=1 then set pctl;
set provider_metrics;
flag_lag_high = (var_lag > P_var_lag_90);
flag_reorder_high = (var_reorder > P_var_reorder_90);
flag_audit = (flag_lag_high and flag_reorder_high); run;
Notes on the code:
var() in PROC SQL returns sample variance (denominator n-1).
Ties in submission date are broken by service_date in the second sort, matching the tie‑breaking rule in the Metric 2 definition.
Minimum claim count (15) ensures stability.










