Computing Cumulative Abnormal Returns in Pandas
Alphanume Team · June 10, 2026
CAR end to end, with a clean estimation window.
Event studies live or die on one question: did the stock move because of the event, or because the whole market moved? Separating the two requires a baseline, an expected return the stock would have earned absent the event, and that baseline has to be estimated from data that pre-dates the event window. This tutorial walks through the full pipeline for computing cumulative abnormal returns (CAR) in Python: market-model estimation, daily abnormal returns, cumulation to CAR, averaging to CAAR, and a cross-sectional t-test. Everything is done in pandas on clearly synthetic data, so every number is auditable.
What an abnormal return actually measures
The abnormal return on day t for stock i is simple: the return the stock actually earned minus the return a model predicted it would earn.
The market model is the most common choice. It says the expected return on any day is alpha_i + beta_i * R_market, where alpha_i and beta_i are estimated from a quiet pre-event period. Subtracting that fitted value from the realized return isolates the event-specific component.
The cumulative abnormal return is just the running sum of daily abnormal returns over the event window, say day -5 through day +5 around an announcement. Its value at the end of the window is a single number summarising whether the stock beat or missed its predicted path over that stretch.
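In symbols, the definitions above can be written as follows. The notation is introduced here for illustration only: R_{i,t} is the realized return of stock i on day t, R_{m,t} is the market return, hats mark coefficients fitted on the estimation window, and [τ1, τ2] is the event window.

```latex
\begin{aligned}
AR_{i,t} &= R_{i,t} - \left(\hat{\alpha}_i + \hat{\beta}_i R_{m,t}\right) \\
CAR_i(\tau_1, \tau_2) &= \sum_{t=\tau_1}^{\tau_2} AR_{i,t}
\end{aligned}
```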
The estimation window — the part most implementations get wrong
The estimation window is the period over which you fit alpha and beta. It must not overlap the event window. If it does, the event itself contaminates the very baseline you are trying to subtract, biasing abnormal returns toward zero. A common convention is to estimate over a window like [-250, -11] trading days relative to the event, so that estimation ends more than ten trading days before the event date, and to use 120 to 250 trading days of data.
The second rule is equally important: the estimation window must end before any information about the event was public. Earnings leaks, pre-announcement trading, or regulatory filings can all shift prices days before the official event date. When in doubt, push the estimation window further back. Violating either rule is a form of look-ahead bias: you are fitting a model that already knows the answer you are about to measure.
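The no-overlap rule is easy to enforce mechanically. The sketch below is one possible way to do it, not a prescribed method: `window_bounds` is a hypothetical helper, the 400-day calendar is made up, and the default offsets simply reuse the [-250, -11] and [-5, +5] examples from the text.

```python
import pandas as pd


def window_bounds(n_dates, event_pos, est=(-250, -11), evt=(-5, 5)):
    """Inclusive integer positions of both windows, or None if either runs off the data."""
    est_lo, est_hi = event_pos + est[0], event_pos + est[1]
    evt_lo, evt_hi = event_pos + evt[0], event_pos + evt[1]
    if est_lo < 0 or evt_hi >= n_dates:
        return None
    if est_hi >= evt_lo:
        raise ValueError("estimation window overlaps the event window")
    return (est_lo, est_hi), (evt_lo, evt_hi)


dates = pd.bdate_range("2022-01-03", periods=400)  # hypothetical calendar
(est_lo, est_hi), (evt_lo, evt_hi) = window_bounds(len(dates), event_pos=300)

# Number of unused days between the two windows
gap_days = evt_lo - est_hi - 1

# The last estimation date must be strictly earlier than the first event-window date
assert dates[est_hi] < dates[evt_lo]
print(f"estimation ends {dates[est_hi].date()}, event window opens {dates[evt_lo].date()}, gap {gap_days} days")
```

Working in integer positions rather than calendar offsets keeps the check independent of weekends and holidays in the date index.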
Building the synthetic data
We will work with a small synthetic universe — ten stocks, two event dates each — so the mechanics stay visible. The returns DataFrame has daily rows indexed by date and ticker columns. A separate Series holds the market index returns for the same dates.
```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
dates = pd.bdate_range("2022-01-03", "2023-12-29")  # 520 weekdays; holidays are not removed
tickers = [f"S{i:02d}" for i in range(10)]

# Synthetic daily returns: small drift, moderate vol
returns = pd.DataFrame(
    rng.normal(0.0003, 0.015, size=(len(dates), len(tickers))),
    index=dates,
    columns=tickers,
)
market_returns = pd.Series(
    rng.normal(0.0003, 0.010, size=len(dates)),
    index=dates,
    name="market",
)

# Two distinct random event dates per stock, drawn from positions that leave
# room for 260 days of history before and 10 days after each event
events = [
    (ticker, date)
    for ticker in tickers
    for date in rng.choice(dates[260:-10], size=2, replace=False)
]
events = [(t, pd.Timestamp(d)) for t, d in events]
```
Nothing here is real price data. Every series is drawn independently from a normal distribution, with no correlation across stocks or with the market, so fitted betas should sit near zero and we expect CARs close to zero on average. That makes it easy to spot implementation errors: a CAAR of 5% on random data means something is wrong.
Estimating alpha and beta per event
For each (ticker, event_date) pair we extract the estimation window, run an OLS regression of stock returns on market returns, and store the coefficients. numpy.polyfit with deg=1 gives us slope (beta) and intercept (alpha) in one call — no statsmodels dependency required.
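Because np.polyfit returns coefficients highest degree first, it is worth confirming the unpacking order once. The sketch below is an illustrative check on made-up data: `true_alpha`, `true_beta`, and the noise level are hypothetical values, and the closed-form covariance formula is included only as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(0)
true_alpha, true_beta = 0.0005, 1.2  # hypothetical values chosen for this check

r_mkt = rng.normal(0.0003, 0.010, size=250)
noise = rng.normal(0.0, 0.002, size=250)
r_stock = true_alpha + true_beta * r_mkt + noise

# polyfit with deg=1 returns [slope, intercept], i.e. (beta, alpha)
beta_hat, alpha_hat = np.polyfit(r_mkt, r_stock, 1)

# Closed-form OLS for comparison: beta = cov(x, y) / var(x)
beta_cov = np.cov(r_mkt, r_stock, ddof=1)[0, 1] / np.var(r_mkt, ddof=1)
alpha_cov = r_stock.mean() - beta_cov * r_mkt.mean()

print(f"polyfit : alpha={alpha_hat:.5f}, beta={beta_hat:.3f}")
print(f"closed  : alpha={alpha_cov:.5f}, beta={beta_cov:.3f}")
```

If the two lines disagree, or beta comes out near the true alpha, the unpacking order is reversed.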
For the event window we use [-10, +10] trading days. The estimation window runs from -260 to -11 days relative to the event date: 250 trading days ending the day before the event window opens. We index into the sorted dates array by integer position to avoid any calendar arithmetic surprises.
```python
EST_START = -260  # days before event (inclusive)
EST_END = -11     # days before event (inclusive)
EVT_START = -10
EVT_END = +10


def get_window(dates_idx, event_date, start_offset, end_offset):
    """Return the dates from start_offset to end_offset (inclusive) relative
    to event_date, or None if the window runs off either end of the data."""
    pos = dates_idx.get_loc(event_date)
    lo = pos + start_offset
    hi = pos + end_offset + 1  # +1 for inclusive slice
    if lo < 0 or hi > len(dates_idx):
        return None
    return dates_idx[lo:hi]


dates_idx = pd.DatetimeIndex(returns.index)
results = []

for ticker, event_date in events:
    est_window = get_window(dates_idx, event_date, EST_START, EST_END)
    evt_window = get_window(dates_idx, event_date, EVT_START, EVT_END)
    if est_window is None or evt_window is None:
        continue

    # Align returns on the estimation window
    r_stock = returns.loc[est_window, ticker].values
    r_mkt = market_returns.loc[est_window].values
    valid = ~(np.isnan(r_stock) | np.isnan(r_mkt))
    if valid.sum() < 60:  # need enough data to fit
        continue

    # polyfit with deg=1 returns [slope, intercept]
    beta, alpha = np.polyfit(r_mkt[valid], r_stock[valid], 1)

    # Abnormal returns over the event window
    r_stock_evt = returns.loc[evt_window, ticker].values
    r_mkt_evt = market_returns.loc[evt_window].values
    ar = r_stock_evt - (alpha + beta * r_mkt_evt)
    car_series = pd.Series(ar, index=evt_window).cumsum()

    results.append({
        "ticker": ticker,
        "event_date": event_date,
        "alpha": alpha,
        "beta": beta,
        "car": car_series.iloc[-1],  # scalar: end-of-window CAR
        "car_series": car_series,    # full path for plotting
    })

df = pd.DataFrame(results).drop(columns="car_series")
print(df.head())
```
The summary DataFrame drops car_series because a column of Series objects has object dtype and is slow to operate on. The full paths stay in the results list; pull them from there when you need to plot cumulative paths.
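To compare paths across events, it helps to swap each calendar index for relative event days so that every path lines up on day 0. The following is a minimal sketch under stated assumptions: the three events and their returns are made up, the small `results` list stands in for the one built in the loop above, and `car_paths` is a name introduced here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
rel_days = np.arange(-10, 11)  # event window [-10, +10]

# Stand-in for the `results` list from the main loop: three hypothetical events
dates = pd.bdate_range("2023-01-02", periods=200)
results = []
for ticker, pos in [("S00", 50), ("S01", 90), ("S02", 150)]:
    window = dates[pos - 10 : pos + 11]
    ar = rng.normal(0.0, 0.015, size=len(window))
    results.append({
        "ticker": ticker,
        "event_date": dates[pos],
        "car_series": pd.Series(ar, index=window).cumsum(),
    })

# One column per event, indexed by relative day instead of calendar date
car_paths = pd.DataFrame(
    {
        f"{r['ticker']}@{r['event_date']:%Y-%m-%d}": r["car_series"].to_numpy()
        for r in results
    },
    index=pd.Index(rel_days, name="rel_day"),
)
print(car_paths.loc[[-10, 0, 10]])
```

With matplotlib installed, `car_paths.plot()` would then draw every cumulative path on a shared relative-day axis.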
Averaging to CAAR and a t-test
The cumulative average abnormal return (CAAR) is the cross-sectional mean of individual CARs. The matching t-statistic tests whether the average is significantly different from zero under the assumption that CARs are independent across events — a reasonable first-pass assumption when events are spread across time and firms.
A full event study would also report a time series of CAAR (abnormal returns averaged day by day across events, then cumulated) to see when the market began pricing the event. The scalar CAR below collapses that to a single endpoint.
```python
car_values = df["car"].values
n = len(car_values)
caar = car_values.mean()
std_err = car_values.std(ddof=1) / np.sqrt(n)
t_stat = caar / std_err

# Rough 5% two-tailed flag using the normal critical value; this is not a p-value
reject_h0 = abs(t_stat) > 1.96

print(f"N events : {n}")
print(f"CAAR : {caar:.4f} ({caar*100:.2f}%)")
print(f"Std err : {std_err:.4f}")
print(f"t-stat : {t_stat:.3f}")
print(f"Reject H0 at 5%? {'yes' if reject_h0 else 'no'}")
```
With purely random synthetic returns the t-statistic should land inside [-1.96, +1.96] about 95% of the time under the normal approximation. A non-rejection here is the expected outcome and suggests there is nothing to find, which is exactly what we want when testing the machinery rather than real events.
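The day-by-day CAAR path mentioned earlier can be sketched along these lines. This is an illustration on made-up data, not part of the pipeline above: the abnormal-return matrix `ar` is random, the twenty event columns are hypothetical, and `aar` and `caar_path` are names introduced here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
rel_days = pd.Index(np.arange(-10, 11), name="rel_day")

# Hypothetical abnormal returns: rows are relative days, columns are 20 events
ar = pd.DataFrame(
    rng.normal(0.0, 0.015, size=(len(rel_days), 20)),
    index=rel_days,
    columns=[f"event_{k:02d}" for k in range(20)],
)

aar = ar.mean(axis=1)     # average abnormal return on each relative day
caar_path = aar.cumsum()  # CAAR as a path across the event window

# Cross-check: the path's endpoint equals the mean of the per-event end-of-window CARs
car_end = ar.cumsum().iloc[-1]
print(f"CAAR at day +10: {caar_path.iloc[-1]:.4f}, mean of CARs: {car_end.mean():.4f}")
```

Averaging first and cumulating second gives the same endpoint as the scalar CAAR, but keeps the intermediate days so a drift that starts before day 0 stays visible.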
Three pitfalls that invalidate a CAR study
Overlapping estimation and event windows. Any overlap lets the event contaminate the baseline. The fix is a hard gap: here the estimation window ends at day -11, ten trading days clear of the event date and one day before the event window opens at -10, and five days clear of the event date is the minimum anyone credibly defends. Always verify programmatically that the last estimation date is strictly earlier than the first date of the event window for every observation.
Look-ahead in the estimation period. The estimation window must end before any public information about the event reached the market. For earnings this usually means ending before the first analyst pre-announcement. For M&A it can mean ending well before the rumour cycle. There is no single right answer — but always document the choice and test sensitivity to it.
Event clustering. The t-test above assumes independent CARs. If twenty of your events all fall on the same day — say, all firms reacting to a Fed announcement — the cross-sectional standard error is badly wrong because every abnormal return shares the same market noise. Calendar-time portfolio methods or clustered standard errors are the standard fix. Check your event-date distribution before reporting any significance.
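A first look at event clustering needs only a count of events per date. The sketch below assumes the same (ticker, event_date) tuple shape used earlier; the five events are hypothetical, and `largest_cluster_share` is a name introduced here rather than a standard diagnostic.

```python
import pandas as pd

# Hypothetical event list in the (ticker, event_date) shape used above
events = [
    ("S00", pd.Timestamp("2023-03-15")),
    ("S01", pd.Timestamp("2023-03-15")),
    ("S02", pd.Timestamp("2023-03-15")),
    ("S03", pd.Timestamp("2023-06-01")),
    ("S04", pd.Timestamp("2023-09-20")),
]

event_dates = pd.Series([d for _, d in events], name="event_date")
per_day = event_dates.value_counts().sort_index()  # number of events on each date

clustered_days = per_day[per_day > 1]               # dates shared by several events
largest_cluster_share = per_day.max() / len(event_dates)

print(clustered_days)
print(f"Largest single-day share of events: {largest_cluster_share:.0%}")
```

Counting exact date matches is a crude proxy, since events a few days apart still have overlapping event windows, so a clean result here does not by itself establish independence.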