The browser agent now uses a rolling context window: as a run gets long, older steps are summarised so that the most recent steps still fit in the model's context window. Tests that were previously truncated mid-run (long onboarding flows, complex purchasing journeys, exploratory crawls) now run end-to-end.
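
To illustrate the idea, here is a minimal sketch of a rolling context window. It is not the agent's actual implementation: `RollingContext`, `budget`, `keep_recent` and `naive_summarise` are hypothetical names, the budget is counted in characters rather than tokens, and the summariser is a simple stand-in for what would presumably be a model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RollingContext:
    """Keeps recent steps verbatim and folds older ones into a summary."""

    budget: int                            # hypothetical size limit, in characters
    summarise: Callable[[str, str], str]   # (current summary, evicted step) -> new summary
    keep_recent: int = 1                   # steps that are never summarised away
    summary: str = ""
    recent: List[str] = field(default_factory=list)

    def size(self) -> int:
        return len(self.summary) + sum(len(step) for step in self.recent)

    def add(self, step: str) -> None:
        self.recent.append(step)
        # Fold the oldest steps into the summary until the context fits,
        # always keeping the newest `keep_recent` steps verbatim.
        while self.size() > self.budget and len(self.recent) > self.keep_recent:
            oldest = self.recent.pop(0)
            self.summary = self.summarise(self.summary, oldest)

    def render(self) -> str:
        parts = [f"Summary of earlier steps: {self.summary}"] if self.summary else []
        return "\n".join(parts + self.recent)


def naive_summarise(summary: str, step: str) -> str:
    # Assumption: stand-in for a real summariser; keeps the first 12 characters.
    short = step[:12]
    return f"{summary}; {short}" if summary else short


if __name__ == "__main__":
    ctx = RollingContext(budget=50, summarise=naive_summarise)
    for step in ["open the signup page", "fill in the email field", "submit the form and wait"]:
        ctx.add(step)
    print(ctx.render())
```

In this sketch the newest step is always kept in full, so the model sees its latest action verbatim, while earlier steps survive only in condensed form.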