Alphanume

Vectorized vs Event-Driven Backtesting in Python

Alphanume Team · June 2, 2026

Speed versus realism, and when to use each.

Every strategy researcher faces the same fork early on: write a vectorized backtest that Python can run in milliseconds, or build an event-driven simulator that models the market bar by bar. The choice shapes how long your research loop takes, how realistic your performance numbers are, and which bugs hide until live trading. This tutorial covers both approaches, shows working code for each, and gives you a decision framework so you spend time on the right tool for each stage. For a concrete case study, see our guide on backtesting a momentum strategy in Python.

What vectorized backtesting is

A vectorized backtest encodes the entire strategy as aligned pandas or numpy arrays. Signals are computed across the whole price history in a single pass; positions are those signals shifted one period to prevent look-ahead; returns are the element-wise product of positions and the asset's period returns. No loop, no state machine — just array math.

import pandas as pd
import numpy as np

# prices: a DatetimeIndex Series of adjusted closes
def vectorized_backtest(prices: pd.Series, short_window: int = 20,
                        long_window: int = 60) -> pd.Series:
    short_ma = prices.rolling(short_window).mean()
    long_ma = prices.rolling(long_window).mean()

    # +1 long, -1 short, 0 flat
    signal = np.where(short_ma > long_ma, 1.0,
                      np.where(short_ma < long_ma, -1.0, 0.0))
    signal = pd.Series(signal, index=prices.index)

    # shift(1) is the anti-look-ahead rule:
    # today's position was decided on yesterday's close
    position = signal.shift(1)

    asset_returns = prices.pct_change()
    strategy_returns = position * asset_returns

    return strategy_returns.dropna()

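The shift(1) call is non-negotiable. Without it the strategy "knows" today's close at the moment it decides today's position, a look-ahead error that inflates returns on any trend signal. With it, the position array is always one bar behind the signal array, so each bar's return is earned only by a position decided on earlier data. Note the implicit fill assumption: because positions are multiplied by close-to-close returns, each trade is treated as executed at the close that produced the signal, which only approximates what you could actually get at the next bar's open.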

Vectorized backtests finish quickly — a 20-year daily series runs in well under a second — which makes them ideal for parameter sweeps and universe screens during early research. The weakness is that array math elides everything that happens between bars: partial fills, slippage that depends on bar volume, position sizing that depends on current portfolio value, and any logic that must branch on the path the portfolio took to reach its current state.
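A parameter sweep of the kind described above can be sketched as follows. This is a minimal illustration, not a tuned research harness: the helper names (ma_crossover_returns, sweep), the window grids, the 252-bar annualization factor, and the synthetic random-walk prices are all assumptions made for the example. The crossover logic is repeated from vectorized_backtest so the block runs on its own.

```python
import itertools

import numpy as np
import pandas as pd


def ma_crossover_returns(prices: pd.Series, short_window: int,
                         long_window: int) -> pd.Series:
    """Same long/short crossover logic as vectorized_backtest above."""
    short_ma = prices.rolling(short_window).mean()
    long_ma = prices.rolling(long_window).mean()
    # sign() gives +1 / -1 / 0; warm-up NaNs become 0 (flat)
    signal = np.sign(short_ma - long_ma).fillna(0.0)
    return (signal.shift(1) * prices.pct_change()).dropna()


def sweep(prices: pd.Series, short_windows, long_windows) -> pd.DataFrame:
    """Run every (short, long) pair with short < long and tabulate results."""
    rows = []
    for s, l in itertools.product(short_windows, long_windows):
        if s >= l:
            continue  # skip degenerate pairs
        r = ma_crossover_returns(prices, s, l)
        rows.append({
            "short": s,
            "long": l,
            "total_return": float((1.0 + r).prod() - 1.0),
            "ann_vol": float(r.std() * np.sqrt(252)),  # assumes daily bars
        })
    return pd.DataFrame(rows)


# Synthetic random-walk prices stand in for real data (assumption).
rng = np.random.default_rng(0)
prices = pd.Series(
    100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 5000))),
    index=pd.bdate_range("2005-01-03", periods=5000),
)
grid = sweep(prices, short_windows=[5, 10, 20], long_windows=[40, 60, 120])
print(grid.sort_values("total_return", ascending=False).head())
```

On random-walk input the ranking is noise, which is itself a useful reminder: the best cell of a sweep is a candidate for validation, not a result.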

Look-ahead traps in vectorized code

Beyond the unshifted signal, the subtler traps are:

  • Full-series statistics. A trailing .rolling(n).mean() only looks backward, so it is safe once the signal is shifted. But if you normalize prices by the series maximum before computing signals, that maximum uses the future.
  • Back-filled gaps. If a missing price is filled with the next known value (a backward fill), any signal computed on that bar sees a future price. Forward-filling from the last known value does not leak.
  • In-sample scaling. Fitting a scaler (min-max, z-score) on the full history and then backtesting on that same history leaks distributional information from the end of the series back to the beginning.
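One way to catch traps like these mechanically is a truncation test: compute the signal on the full history and again on a shortened history, then check that the overlapping values agree. If adding later bars changes an earlier signal value, the signal depends on the future. The sketch below is illustrative; the function names and the cut parameter are hypothetical, and the linear price ramp is synthetic data chosen to make the leak obvious.

```python
import numpy as np
import pandas as pd


def trailing_signal(prices: pd.Series) -> pd.Series:
    """Causal: each value uses only prices up to and including that bar."""
    return (prices.rolling(5).mean() > prices.rolling(20).mean()).astype(float)


def leaky_signal(prices: pd.Series) -> pd.Series:
    """Leaky: normalizing by the full-series maximum uses future bars."""
    scaled = prices / prices.max()
    return (scaled > 0.5).astype(float)


def leaks_future(signal_fn, prices: pd.Series, cut: int) -> bool:
    """True if the signal on the first `cut` bars changes once later
    bars are added to the input."""
    on_full = signal_fn(prices).iloc[:cut]
    on_truncated = signal_fn(prices.iloc[:cut])
    return not on_full.equals(on_truncated)


# Synthetic ramp from 100 to 300 (assumption, for illustration only).
prices = pd.Series(np.linspace(100.0, 300.0, 200))

print(leaks_future(trailing_signal, prices, cut=100))  # False
print(leaks_future(leaky_signal, prices, cut=100))     # True
```

A passing test at one cut point does not prove a signal is clean, so in practice it is worth running the check at several cut points.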

What event-driven backtesting is

An event-driven backtest processes time one bar at a time, passing explicit event objects through a pipeline: market data arrives as a MarketEvent, a signal generator emits a SignalEvent, a position sizer converts that to an OrderEvent, and a fill simulator converts the order to a FillEvent that updates portfolio state. Because state is explicit and updated incrementally, path-dependent logic — drawdown-triggered de-risking, cash constraints, margin calls, rolling stops — is straightforward to add.

from dataclasses import dataclass, field
from collections import deque
from typing import Deque
import pandas as pd

@dataclass
class Portfolio:
    cash: float = 100_000.0
    position: float = 0.0
    equity_curve: list = field(default_factory=list)

def event_driven_backtest(prices: pd.Series,
                          short_window: int = 20,
                          long_window: int = 60) -> pd.Series:
    portfolio = Portfolio()
    queue: Deque[str] = deque()
    history = prices.to_list()
    dates = prices.index.to_list()

    short_ma = prices.rolling(short_window).mean()
    long_ma = prices.rolling(long_window).mean()

    for i, (date, price) in enumerate(zip(dates, history)):
        if pd.isna(short_ma.iloc[i]) or pd.isna(long_ma.iloc[i]):
            portfolio.equity_curve.append(portfolio.cash)
            continue

        # MarketEvent: new bar available
        queue.append("MARKET")

        while queue:
            event = queue.popleft()

            if event == "MARKET":
                # SignalEvent: compute signal on *this* bar's data
                if short_ma.iloc[i] > long_ma.iloc[i]:
                    queue.append("BUY")
                elif short_ma.iloc[i] < long_ma.iloc[i]:
                    queue.append("SELL")

            elif event == "BUY" and portfolio.position <= 0:
                # OrderEvent -> FillEvent: market order at current close.
                # Size on the slippage-inclusive price so the order is
                # always affordable; sizing on the raw price can push
                # cost above cash, and the order is then silently skipped.
                fill_price = price * 1.001  # 0.1 % slippage
                shares = int(portfolio.cash // fill_price)
                if shares > 0:
                    portfolio.position += shares
                    portfolio.cash -= shares * fill_price

            elif event == "SELL" and portfolio.position > 0:
                # Long/flat only: a SELL closes the long, it does not short
                proceeds = portfolio.position * price * 0.999
                portfolio.cash += proceeds
                portfolio.position = 0

        equity = portfolio.cash + portfolio.position * price
        portfolio.equity_curve.append(equity)

    equity_series = pd.Series(portfolio.equity_curve, index=dates)
    return equity_series.pct_change().dropna()

The skeleton above is intentionally minimal, but it already does things the vectorized version does not: slippage is applied per fill, the position is gated on available cash, and the portfolio value is tracked bar by bar. Note that it is long/flat, whereas the vectorized example is long/short, so the two are not directly comparable as written. Extending it to short selling (see backtesting a short-selling strategy in Python) means adding margin accounting inside the fill handler: a few lines of state, not a rethink of the architecture.
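The skeleton passes plain strings through the queue. One way to make the four event types named earlier explicit is a set of small dataclasses. The field names below and the apply_fill helper are illustrative assumptions, not a fixed schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MarketEvent:
    when: date
    close: float


@dataclass(frozen=True)
class SignalEvent:
    when: date
    direction: int  # +1 long, 0 flat


@dataclass(frozen=True)
class OrderEvent:
    when: date
    quantity: int  # signed share count: positive buys, negative sells


@dataclass(frozen=True)
class FillEvent:
    when: date
    quantity: int  # signed share count actually filled
    price: float   # fill price, slippage already included


def apply_fill(cash: float, position: int,
               fill: FillEvent) -> tuple[float, int]:
    """Update portfolio state from a fill: buys reduce cash, sells add it."""
    return cash - fill.quantity * fill.price, position + fill.quantity


cash, position = apply_fill(100_000.0, 0,
                            FillEvent(date(2026, 1, 5), 10, 100.0))
print(cash, position)  # 99000.0 10
```

Typed events cost a few extra lines, but each handler can then dispatch on the event class and read named fields instead of relying on loop variables.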

Look-ahead traps in event-driven code

Event-driven code is not automatically safe. Common mistakes:

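  • Signals computed on the fill bar's close. If your signal fires on bar i and you also fill on bar i's close, you traded on information available only at end-of-bar. Fill at bar i+1's open instead. The skeleton above fills at the signal bar's close for brevity, so it carries this optimistic assumption.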
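  • Pre-computing the full indicator series. If you call rolling().mean() on the entire price history before the loop (as the skeleton above does for brevity), a trailing mean introduces no look-ahead in the signal values. But the loop now depends on having the whole history up front, so it cannot run unchanged on a live feed, and any indicator that is not strictly trailing (a centered window, a full-sample normalization) would leak. A stricter implementation recomputes the indicator from the growing window at each bar.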
  • Peeking at future portfolio state. Dynamic position sizing that uses tomorrow's volatility estimate to size today's order is a leak even inside a loop.
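The first two fixes can be combined in one loop: decide on bar i's close, fill at bar i+1's open, and build the moving averages from a growing window. The sketch below is a simplified long/flat illustration under stated assumptions: bars is a DataFrame with open and close columns, the function name and the short default windows are hypothetical, and the sample bars are made up for the demonstration.

```python
from collections import deque

import pandas as pd


def next_open_backtest(bars: pd.DataFrame, short_window: int = 3,
                       long_window: int = 5, cash: float = 10_000.0,
                       slippage: float = 0.001) -> pd.Series:
    """Long/flat MA crossover: decide on bar i's close, fill at bar i+1's open.
    Returns the equity curve marked at each bar's close."""
    short_buf: deque = deque(maxlen=short_window)
    long_buf: deque = deque(maxlen=long_window)
    position = 0
    pending = None  # order decided on the previous bar, not yet filled
    equity = []

    for row in bars.itertuples():
        # 1) Fill the previous bar's order at this bar's open,
        #    before this bar's close is known.
        if pending == "BUY" and position == 0:
            fill_price = row.open * (1 + slippage)
            shares = int(cash // fill_price)
            position += shares
            cash -= shares * fill_price
        elif pending == "SELL" and position > 0:
            cash += position * row.open * (1 - slippage)
            position = 0
        pending = None

        # 2) Update indicators from the growing window only.
        short_buf.append(row.close)
        long_buf.append(row.close)
        if len(long_buf) == long_window:
            short_ma = sum(short_buf) / short_window
            long_ma = sum(long_buf) / long_window
            # 3) Decide on this close; the order waits for the next bar.
            if short_ma > long_ma:
                pending = "BUY"
            elif short_ma < long_ma:
                pending = "SELL"

        equity.append(cash + position * row.close)

    return pd.Series(equity, index=bars.index)


# Made-up bars: each open equals the previous close (assumption).
closes = [10, 10, 10, 10, 10, 11, 12, 13, 12, 11, 10, 9]
bars = pd.DataFrame({"open": [10] + closes[:-1], "close": closes},
                    dtype=float)
equity = next_open_backtest(bars)
print(equity)
```

In this sample the crossover first fires on the sixth bar, but equity does not move until the seventh, when the order fills at that bar's open.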

Side-by-side comparison

Dimension           | Vectorized                                       | Event-driven
Speed               | Very fast: seconds for years of daily data       | Slower: loop overhead, especially tick data
Realism             | Low: ignores fill mechanics, cash, margin        | High: models slippage, partial fills, costs
Path dependence     | Hard: positions cannot react to portfolio state  | Natural: state machine tracks everything
Code volume         | Small: tens of lines                             | Large: hundreds to thousands of lines
Parameter sweeps    | Excellent                                        | Slow without parallelism
Live-trading parity | Low                                              | High: same event types as a live feed

When to use each

The practical workflow is to prototype vectorized and validate event-driven. In the research phase you are filtering hundreds of signals and parameter combinations; vectorized code lets you do that in a morning. Once a handful of candidates survive, you promote them to an event-driven simulator to stress-test fill assumptions, measure the impact of realistic transaction costs, and confirm that drawdown-triggered rules behave as intended. Strategies that look very different under the two regimes — large discrepancies in Sharpe or max drawdown — usually have a hidden assumption about fills or position sizing that the array math concealed.
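Comparing the two regimes comes down to computing the same metrics on both return series and looking at the gap. A minimal sketch, assuming daily returns, a zero risk-free rate, and 252 periods per year; the function names and the sample numbers are illustrative, not output from the backtests above.

```python
import numpy as np
import pandas as pd


def annualized_sharpe(returns: pd.Series, periods_per_year: int = 252) -> float:
    """Mean over standard deviation of per-period returns, annualized.
    Assumes a zero risk-free rate."""
    std = returns.std()
    if std == 0 or np.isnan(std):
        return float("nan")
    return float(returns.mean() / std * np.sqrt(periods_per_year))


def max_drawdown(returns: pd.Series) -> float:
    """Largest peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity = (1.0 + returns).cumprod()
    peak = equity.cummax().clip(lower=1.0)  # starting capital counts as a peak
    return float((equity / peak - 1.0).min())


def compare(vectorized: pd.Series, event_driven: pd.Series) -> pd.DataFrame:
    """Tabulate both metrics per backtest, plus the vectorized-minus-event gap."""
    rows = {
        name: {"sharpe": annualized_sharpe(r), "max_dd": max_drawdown(r)}
        for name, r in [("vectorized", vectorized),
                        ("event_driven", event_driven)]
    }
    table = pd.DataFrame(rows).T
    table.loc["gap"] = table.loc["vectorized"] - table.loc["event_driven"]
    return table


# Made-up returns: the event-driven series is the vectorized one
# minus a flat 0.1 % per-bar cost drag (assumption, for illustration).
vec = pd.Series([0.01, -0.02, 0.015, 0.005])
evt = vec - 0.001
table = compare(vec, evt)
print(table)
```

A large entry in the gap row is the cue to go looking for the fill or sizing assumption that the array math concealed.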

The main case for going straight to event-driven is a strategy whose logic is inherently sequential from day one: execution algorithms, strategies with hard cash constraints, or anything with a stop-loss that ratchets based on the high-water mark. In those cases the vectorized prototype would require so many hacks to approximate the path dependence that it stops saving time.

For the full field list and endpoint details used when loading price data for either framework, see the API documentation.