How to Calculate Rolling Returns in Pandas
Alphanume Team · June 3, 2026
Windowed returns without off-by-one bugs.
The rolling-return code that pandas users reach for first is often subtly wrong: the window is off by one, the signal peeks at future prices, or the cumulative math silently mixes simple and log returns. This post covers the full stack: simple versus log returns, cumulative returns over a window, rolling CAGR, alignment traps, and how rolling differs from resample for periodic returns. All examples operate on a price or returns Series or DataFrame and stay within the rules pandas actually enforces today.
Simple returns versus log returns
Two conventions dominate: simple (arithmetic) returns and log (continuously compounded) returns. Both start from a price Series.
import numpy as np
import pandas as pd
# Assume `prices` is a DatetimeIndex-indexed Series of daily closes.
simple = prices.pct_change() # (p_t - p_{t-1}) / p_{t-1}
log_r = np.log(prices).diff() # ln(p_t) - ln(p_{t-1})
Simple returns aggregate correctly across assets: if you hold a portfolio of two positions, the portfolio return for the period is the weighted sum of the positions' simple returns, using start-of-period weights. Log returns aggregate correctly across time: add N consecutive log returns and you have the cumulative log return over the whole period. That additivity is why log returns are preferred for rolling time-series arithmetic. The numerical difference between the two is negligible for daily moves under a few percent, but it accumulates over longer windows and for volatile assets. Choose based on what comes next: cross-sectional weighting calls for simple; time-series aggregation calls for log.
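Both properties can be checked on a toy example. The sketch below is illustrative only: the two-asset price table, the tickers X and Y, and the 60/40 weights are made-up assumptions, not data from this post.

```python
import numpy as np
import pandas as pd

# Hypothetical two-asset price table with three daily closes each.
prices = pd.DataFrame(
    {"X": [100.0, 110.0, 99.0], "Y": [50.0, 49.0, 53.9]},
    index=pd.bdate_range("2024-01-02", periods=3),
)
weights = pd.Series({"X": 0.6, "Y": 0.4})  # assumed start-of-period weights

simple = prices.pct_change().dropna()
log_r = np.log(prices).diff().dropna()

# Across assets: the one-day portfolio return is the weighted sum of simple returns.
day1_portfolio = (simple.iloc[0] * weights).sum()
# Cross-check against the growth in value of the same 60/40 portfolio.
value_ratio = (weights * prices.iloc[1] / prices.iloc[0]).sum()

# Across time: log returns sum to the log of the total price ratio.
total_log = log_r.sum()
total_from_prices = np.log(prices.iloc[-1] / prices.iloc[0])

# Simple returns do not add across time: X gains 10%, then loses 10%,
# and the sum is zero even though the price ends below where it started.
naive_sum_x = simple["X"].sum()
actual_x = prices["X"].iloc[-1] / prices["X"].iloc[0] - 1
```

Here naive_sum_x and actual_x disagree, while total_log and total_from_prices match for both columns, which is the practical argument for staying in log space when summing over time.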
Cumulative return over a window
Given daily prices, or only a Series of simple daily returns, there are three equivalent ways to compute the cumulative return over n return periods (n + 1 price observations):
n = 21 # approx. one trading month
# Method 1: price ratio (cleanest, no compounding error)
cum_price = prices / prices.shift(n) - 1
# Method 2: compound the simple returns (useful when you only have returns)
cum_ret = (1 + simple).rolling(n).apply(np.prod, raw=True) - 1
# Method 3: sum log returns, then exponentiate
cum_log = np.log(prices).diff().rolling(n).sum()
cum_from_log = np.exp(cum_log) - 1 # back to simple return space
Method 1 is the most numerically stable because floating-point errors do not accumulate across n multiplications. Method 3 exploits the additive property of log returns (a rolling sum of log returns is exactly the window's cumulative log return), which makes it attractive when you are already working in log space. All three methods produce the same result, up to floating-point tolerance, when the price series has no gaps. Use Method 2 only when prices are unavailable and you must work from the returns column alone.
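A quick way to convince yourself the methods agree is to run them side by side on a gap-free series. This is a minimal sketch on synthetic random-walk prices; the seed, the series length, and the volatility are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed for the illustration
idx = pd.bdate_range("2023-01-02", periods=120)
prices = pd.Series(100 * np.exp(rng.normal(0.0, 0.01, len(idx)).cumsum()), index=idx)

n = 21
simple = prices.pct_change()

m1 = prices / prices.shift(n) - 1                              # price ratio
m2 = (1 + simple).rolling(n).apply(np.prod, raw=True) - 1      # compounded simple returns
m3 = np.exp(np.log(prices).diff().rolling(n).sum()) - 1        # summed log returns

# Each method is NaN for the first n rows and defined from row n onward.
first_valid = m1.first_valid_index()
# Largest absolute disagreement between Method 1 and the other two.
max_gap = pd.concat([m1 - m2, m1 - m3], axis=1).abs().max().max()
```

If max_gap is not at floating-point noise level on your own data, the usual culprits are missing prices or a returns column that was computed on a different calendar than the price column.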
Rolling windowed returns and the pitfalls
Pandas' .rolling(window) attaches a window of length window ending at the current row inclusive. That "inclusive of the current row" detail is the source of most look-ahead bugs in signal construction.
window = 63 # approx. one quarter
# This is a lookback return: value at row t = return over [t-62, t].
# Here `window` counts price observations, so the span holds window - 1 daily returns.
rolling_ret = prices / prices.shift(window - 1) - 1
# Equivalent with rolling (slower): compound the window - 1 returns in the span.
rolling_ret_v2 = (
    (1 + simple)
    .rolling(window - 1)
    .apply(np.prod, raw=True) - 1
)
# WRONG for signal generation — uses today's return to predict today's move:
# signal = rolling_ret # already includes price[t]
# CORRECT — shift by 1 so signal at t uses only data through t-1:
signal = rolling_ret.shift(1)
The shift aligns the signal with the return it is meant to predict. If you skip it, a backtest that pairs rolling_ret.iloc[t] with the return earned over day t (close t-1 to close t) is using price t both to form the signal and to score it, a classic look-ahead leak. A second trap: with an integer window, .rolling(window) requires window non-NaN observations by default, so the first window - 1 rows are NaN. Pass min_periods=1 only if partial windows make conceptual sense for your metric.
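One way to test a signal for this kind of leak is a perturbation check: change only the most recent close and see whether the signal on that same row moves. The sketch below uses a made-up six-day price series and a deliberately tiny window so the arithmetic is easy to follow; neither comes from the post.

```python
import pandas as pd

# Hypothetical closes; window is kept small on purpose.
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 110.0, 108.0],
    index=pd.bdate_range("2024-03-04", periods=6),
)
window = 3  # three price observations per window

rolling_ret = prices / prices.shift(window - 1) - 1
signal = rolling_ret.shift(1)

# Perturb only the final close and rebuild both series.
bumped = prices.copy()
bumped.iloc[-1] = 200.0
bumped_ret = bumped / bumped.shift(window - 1) - 1
bumped_signal = bumped_ret.shift(1)

# The unshifted value on the last row reacts to the last close: it leaks.
leaks = rolling_ret.iloc[-1] != bumped_ret.iloc[-1]
# The shifted signal on the last row is unchanged: it only saw earlier closes.
safe = signal.iloc[-1] == bumped_signal.iloc[-1]
```

The same check scales to a real pipeline: if bumping the close at row t changes anything your strategy uses to decide the position held over day t, there is a leak somewhere upstream.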
Annualizing a windowed return
A 21-day return is not directly comparable to a 252-day return. Annualizing puts both on the same scale by scaling to a 252-trading-day year.
trading_days_per_year = 252
# Annualized simple return. rolling_ret spans window - 1 daily periods.
periods = window - 1
ann_simple = (1 + rolling_ret) ** (trading_days_per_year / periods) - 1
# Annualized from a log return sum. cum_log from earlier was summed over
# n periods, so it is scaled by n, not by window.
ann_log = np.exp(cum_log * (trading_days_per_year / n)) - 1
This is the rolling CAGR: the compound annual growth rate implied by the window's price change, rescaled as if the window repeated until it filled a year. It is not a forecast; it is a normalized comparison metric. When the number of daily periods in the window equals trading_days_per_year, the exponent is 1 and the annualized return collapses to the raw window return, which is a useful sanity check.
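The rescaling and its sanity check can be written as two small helpers. These are hypothetical functions introduced for illustration (annualize and annualize_log are not pandas APIs), and the 2% and 10% moves are made-up inputs; `periods` here means the number of daily return periods the window spans.

```python
import numpy as np

def annualize(window_return, periods, periods_per_year=252):
    """Rescale a simple return earned over `periods` daily steps to a yearly rate."""
    return (1 + window_return) ** (periods_per_year / periods) - 1

def annualize_log(cum_log_return, periods, periods_per_year=252):
    """Same rescaling, starting from a summed log return."""
    return np.exp(cum_log_return * (periods_per_year / periods)) - 1

monthly_move = 0.02  # hypothetical 2% gain over 21 daily periods
yearly_move = 0.10   # hypothetical 10% gain over 252 daily periods

ann_from_month = annualize(monthly_move, periods=21)
ann_from_year = annualize(yearly_move, periods=252)  # exponent is 1, input comes back unchanged
ann_from_log = annualize_log(np.log1p(monthly_move), periods=21)
```

Both helpers also accept a pandas Series or DataFrame of window returns, since the arithmetic broadcasts elementwise.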
Rolling versus resample for periodic returns
Rolling windows and resample answer different questions and should not be confused. A rolling window produces a value for every row, where each value covers the last n periods. Resample snaps the data to calendar buckets — monthly, weekly, quarterly — and produces one row per bucket. Use rolling when you want a time series of overlapping windows (momentum signals, drawdown metrics). Use resample when you want non-overlapping, calendar-aligned intervals (monthly P&L, quarterly attribution). For a deeper look at the resample side, see resampling intraday data with pandas.
# Rolling 21-day return — one value per trading day, windows overlap
rolling_21 = prices / prices.shift(20) - 1
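# Monthly return: one value per calendar month, no overlap
# ("ME" needs pandas 2.2 or later; older versions spell it "M")
monthly = prices.resample("ME").last().pct_change()
# To combine: map each trading day to the latest month-end value at or
# before it. A plain reindex would drop month-ends that fall on weekends.
monthly_on_daily = monthly.reindex(prices.index, method="ffill")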
Mixing the two — computing a "monthly return" with a 21-day rolling window — produces overlapping windows that do not align with month-end dates and introduce serial correlation into any downstream test. If the intended frequency is a calendar month, use resample.
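The difference in shape is easy to see on synthetic data. The sketch below is illustrative only: the seed and the random-walk prices are assumptions, and it groups by calendar month with to_period("M") as a version-independent stand-in for the resample call, which yields the same month-end closes but labels rows by period rather than by timestamp.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # arbitrary seed for the illustration
idx = pd.bdate_range("2023-01-02", "2023-12-29")
prices = pd.Series(100 * np.exp(rng.normal(0.0, 0.01, len(idx)).cumsum()), index=idx)

# Rolling: one value per trading day, each covering 21 price observations.
rolling_21 = prices / prices.shift(20) - 1

# Calendar months: last close in each month, then month-over-month change.
month_end = prices.groupby(prices.index.to_period("M")).last()
monthly = month_end.pct_change()

n_rolling = int(rolling_21.notna().sum())  # close to the number of trading days
n_monthly = int(monthly.notna().sum())     # one per month, minus the first

# Adjacent rolling windows share 20 of their 21 prices, so consecutive
# values move together; this lag-1 autocorrelation is typically high.
overlap_autocorr = rolling_21.autocorr(lag=1)
```

A high overlap_autocorr is a property of the window construction, not of the asset, which is why overlapping returns should not be fed into tests that assume independent observations.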
Worked example
The following ties everything together on a synthetic price series. It computes rolling 63-day returns, annualizes them, shifts the signal for correct alignment, and flags the top-quintile momentum names (a true decile cut is not attainable with only five names), ready to plug into a strategy or into plotting an equity curve in Python.
import numpy as np
import pandas as pd
rng = np.random.default_rng(42)
idx = pd.bdate_range("2020-01-02", periods=756) # ~3 years of trading days
tickers = ["A", "B", "C", "D", "E"]
# Synthetic daily log returns, then cumulative price
log_returns = pd.DataFrame(
    rng.normal(0.0003, 0.012, size=(len(idx), len(tickers))),
    index=idx,
    columns=tickers,
)
prices = np.exp(log_returns.cumsum()) * 100 # start at 100
# --- Rolling 63-day return ---
window = 63
rolling_ret = prices / prices.shift(window - 1) - 1
# --- Annualize (the return spans window - 1 daily periods) ---
ann_ret = (1 + rolling_ret) ** (252 / (window - 1)) - 1
# --- Shift signal so today's signal uses only yesterday's prices ---
signal = ann_ret.shift(1)
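# --- Rank cross-sectionally, flag the top quintile (momentum filter) ---
# With five names, pct ranks are 0.2, 0.4, ..., 1.0, so a decile cut is not
# attainable; > 0.80 keeps the single top name (>= 0.80 would keep two).
ranks = signal.rank(axis=1, pct=True)
top_quintile = ranks > 0.80 # top 20% of 5-name universe
print(signal.tail(3).round(4))
print(top_quintile.tail(3))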
The prices.shift(window - 1) in the rolling return step uses window - 1, not window, because the numerator is price at row t (inclusive) and the denominator is the price window - 1 rows earlier — giving exactly window observations. Using prices.shift(window) would produce a return spanning window + 1 rows, an off-by-one error that widens every window silently. The full field reference and endpoint details are in the API documentation.
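The counting argument can be checked directly: a window of `window` price observations contains window - 1 daily returns. The following is a minimal sketch on synthetic prices (the seed and series length are arbitrary assumptions) that compares the price-ratio form against compounding the returns inside the same span.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)  # arbitrary seed for the illustration
prices = pd.Series(
    100 * np.exp(rng.normal(0.0, 0.01, 200).cumsum()),
    index=pd.bdate_range("2022-01-03", periods=200),
)
window = 63
simple = prices.pct_change()

# Price at row t over the price window - 1 rows earlier.
by_ratio = prices / prices.shift(window - 1) - 1
# The same span compounds window - 1 daily returns, not window.
by_compounding = (1 + simple).rolling(window - 1).apply(np.prod, raw=True) - 1
# Rows of price data inside each full window, counted by pandas itself.
rows_in_window = prices.rolling(window).count()

same_start = by_ratio.first_valid_index() == by_compounding.first_valid_index()
max_gap = (by_ratio - by_compounding).abs().max()
```

If you prefer to define the window by the number of returns instead of the number of prices, the two forms still have to agree with each other; what matters is that the shift and the rolling length describe the same span.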