Preparing time series

Forecast quality is tied to how you curate the source data. The SDK enforces some guard rails, but high-signal forecasts start with clean inputs.

Rules of thumb

  • Provide at least 30 observations for robust forecasts.
  • Keep series consistent: no NaNs, no None, and a single measurement per interval (a validation sketch follows this list).
  • Detrend and normalise when you mix heterogeneous scales.
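
The first two rules are easy to enforce mechanically before a request leaves your pipeline. A minimal sketch follows; the validate_series helper and its threshold are illustrative conventions, not part of the SDK.

import math

MIN_OBSERVATIONS = 30  # rule-of-thumb floor, not an SDK-enforced limit

def validate_series(values):
    """Raise if the series is too short or contains gaps."""
    if len(values) < MIN_OBSERVATIONS:
        raise ValueError(
            f"need at least {MIN_OBSERVATIONS} observations, got {len(values)}"
        )
    for i, value in enumerate(values):
        if value is None or (isinstance(value, float) and math.isnan(value)):
            raise ValueError(f"gap at position {i}: fill it before forecasting")

validate_series([110.0] * 30)  # passes; a shorter or gappy list raises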

Filling gaps

import pandas as pd

raw = pd.Series(
    [110, None, 118, 120, None, 131],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)
# Interpolate interior gaps by timestamp, then forward-fill anything
# interpolation leaves behind (fillna(method=...) is deprecated in pandas).
clean = raw.interpolate(method="time").ffill()
series = clean.astype(float).tolist()

The SDK converts numpy arrays to lists during serialisation, so casting to plain floats ahead of the request keeps payloads lean.

Scaling and clipping

import numpy as np
from simulacrum import Simulacrum

client = Simulacrum(api_key="sim-key_id-secret")
window = np.asarray(series, dtype=float)
window = np.clip(window, a_min=0, a_max=None)  # floor negative readings at zero
window = (window - window.mean()) / window.std()  # z-score; assumes std > 0

forecast = client.forecast(series=window, horizon=10, model="tempo")

Log transforms can also stabilise variance for multiplicative series before you forecast. Whatever transform you apply, keep its parameters (here the mean and standard deviation) so you can map the forecast back to the original scale.
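
A minimal sketch of that pattern, applied to the clipped (but not z-scored) values; how predictions come back depends on the SDK, so the log_forecast.values access below is an assumption:

import numpy as np

# log1p copes with zeros; forecasting in log space damps multiplicative swings.
log_window = np.log1p(np.clip(np.asarray(series, dtype=float), 0, None))
log_forecast = client.forecast(series=log_window, horizon=10, model="tempo")

# Invert the transform so predictions land back on the original scale.
predictions = np.expm1(np.asarray(log_forecast.values, dtype=float))  # .values is assumed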

Request metadata

Store the context you use to shape the series alongside the request so you can replay it later if an issue surfaces.

metadata = {
    "segment": "north-america",
    "transforms": ["interpolate", "zscore"],
    "model": "tempo"
}

Log both the metadata and the response body in your data warehouse to power audit trails.
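
A minimal sketch of that logging step; the warehouse client and its insert method are placeholders for whatever store you use, not part of the SDK:

import json
from datetime import datetime, timezone

def record_forecast(metadata, response_body, warehouse):
    """Persist request context and the raw response for replay and audits."""
    warehouse.insert(  # placeholder: swap in your warehouse client's write call
        "forecast_audit",
        {
            "logged_at": datetime.now(timezone.utc).isoformat(),
            "metadata": json.dumps(metadata),
            "response": json.dumps(response_body),
        },
    )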