TL;DR

  • Unit tests, PHPUnit, seeding, and the build passed; the Playwright E2E run timed out on five UI steps.
  • The failures cluster around config import/export and sidebar interactions.
  • Next step is to harden selectors, wait for stable UI state, and reduce flakiness.

What failed

The E2E run timed out on:

  • Saving config presets (wizard not hidden in time).
  • Importing config (file input not found or not ready).
  • Exporting config (download event never fired).
  • Activity tab toggle (locator never became clickable).
  • Calendar seed verification (expected text never appeared).

The earlier steps (unit tests, build, PHPUnit, seeding) succeeded, so the environment and seed data were in place. That points to UI timing or fragile selectors rather than a broken pipeline.

Why this is likely flaky

Common Playwright failure modes in CI:

  • UI transitions are slower on shared runners.
  • Selectors rely on labels or IDs that are conditionally rendered.
  • The app is still loading data when assertions run.
  • The test expects a download event, but the UI never triggers it or fires it before the test starts listening.

Hardening ideas

1) Make UI state explicit

  • Wait for a known stable element after each nav step.
  • Use expect(locator).toBeVisible() before clicking, and expect(locator).toBeHidden() after a step that should close the wizard.
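
A minimal sketch of this pattern for the preset-save step. The helper name savePreset, the 15 second default timeout, and the LocatorLike interface are assumptions for illustration. LocatorLike declares only the waitFor and click methods of Playwright's Locator so the snippet compiles without the dependency; a real Locator should fit it structurally.

```typescript
// Minimal structural subset of Playwright's Locator (assumption: only the
// two methods used below are declared).
type WaitState = 'attached' | 'detached' | 'visible' | 'hidden';

interface LocatorLike {
  waitFor(options?: { state?: WaitState; timeout?: number }): Promise<void>;
  click(): Promise<void>;
}

// Hypothetical helper: click "save" only once it is visible, then block
// until the wizard is hidden so the next step starts from a known state.
async function savePreset(
  saveButton: LocatorLike,
  wizard: LocatorLike,
  timeoutMs: number = 15_000,
): Promise<void> {
  await saveButton.waitFor({ state: 'visible', timeout: timeoutMs });
  await saveButton.click();
  await wizard.waitFor({ state: 'hidden', timeout: timeoutMs });
}
```

In a spec, the two arguments would be locators for the save button and the wizard container. Inside a test body, expect(wizard).toBeHidden() expresses the same final check as an assertion.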

2) Improve selectors

  • Prefer data-testid over text or label selectors.
  • Avoid IDs that are created dynamically.
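
As a sketch of what test-id selectors could look like for the import step: the id 'config-file-input', the helper name importConfig, and the two interfaces are hypothetical. The interfaces cover only the Playwright methods used here (page.getByTestId, locator.waitFor, locator.setInputFiles) so the snippet stands alone.

```typescript
// Structural stand-ins for the parts of Playwright's Locator and Page that
// this helper touches (assumption: nothing else is needed).
interface FileInputLike {
  waitFor(options?: {
    state?: 'attached' | 'detached' | 'visible' | 'hidden';
    timeout?: number;
  }): Promise<void>;
  setInputFiles(files: string): Promise<void>;
}

interface ImportPageLike {
  getByTestId(testId: string): FileInputLike;
}

// Hypothetical test id; the app would need a matching data-testid attribute.
const CONFIG_FILE_INPUT = 'config-file-input';

async function importConfig(page: ImportPageLike, filePath: string): Promise<void> {
  const input = page.getByTestId(CONFIG_FILE_INPUT);
  // File inputs are often visually hidden, so wait for 'attached'
  // rather than 'visible' before handing over the file.
  await input.waitFor({ state: 'attached' });
  await input.setInputFiles(filePath);
}
```

Keeping the ids in one place means a renamed label or regenerated element ID in the UI does not break the selector.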

3) Add targeted waits

  • Use network idle sparingly; waiting for the specific response or element a step depends on is more reliable.
  • Replace fixed waitForTimeout sleeps with UI state checks.
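
One way to replace a fixed sleep is to wait for the specific response the step depends on. This is a sketch under assumptions: clickAndAwaitResponse and the interfaces are made up for illustration and cover only page.waitForResponse, response.url, response.ok, and locator.click from Playwright.

```typescript
// Structural stand-ins for Playwright's Response, Locator, and Page
// (assumption: only the members used below are declared).
interface ResponseLike {
  url(): string;
  ok(): boolean;
}

interface ClickableLike {
  click(): Promise<void>;
}

interface NetworkPageLike {
  waitForResponse(
    predicate: (response: ResponseLike) => boolean,
  ): Promise<ResponseLike>;
}

// Hypothetical helper: click, then wait for a successful response whose URL
// contains urlPart, instead of sleeping for a fixed time.
async function clickAndAwaitResponse(
  page: NetworkPageLike,
  trigger: ClickableLike,
  urlPart: string,
): Promise<ResponseLike> {
  // Start listening before the click so a fast response is not missed.
  const responsePromise = page.waitForResponse(
    (response) => response.url().includes(urlPart) && response.ok(),
  );
  await trigger.click();
  return responsePromise;
}
```

After the response arrives, the test can assert on the rendered element, which is the state it actually cares about.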

4) Isolate heavy tests

  • Move config import/export into a separate spec.
  • Run stateful tests serially to avoid interference.

Observability

The run already uploads Playwright reports and traces. The next step is to open the trace for a failing step and compare its timing against a local run.

Next changes I will try

  • Add data-testid for wizard, preset save button, and file input.
  • Make the export test assert that clicking the button triggers a download, registering the download listener before the click.
  • For calendar checks, wait for the seed to finish, then assert on a stable list item.
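
A sketch of the export check from the list above. The test id 'config-export-button', the helper name exportConfig, and the interfaces are hypothetical; the interfaces cover only page.waitForEvent('download'), page.getByTestId, locator.click, and download.suggestedFilename from Playwright.

```typescript
// Structural stand-ins for Playwright's Download and Page
// (assumption: only the members used below are declared).
interface DownloadLike {
  suggestedFilename(): string;
}

interface ExportPageLike {
  waitForEvent(event: 'download'): Promise<DownloadLike>;
  getByTestId(testId: string): { click(): Promise<void> };
}

// Hypothetical helper: returns the suggested filename of the download that
// the export click triggers. If no download fires, the awaited promise
// rejects on timeout when backed by a real Playwright page.
async function exportConfig(page: ExportPageLike): Promise<string> {
  // Register the listener first; a download that starts before the test
  // is listening would otherwise be missed.
  const downloadPromise = page.waitForEvent('download');
  await page.getByTestId('config-export-button').click();
  const download = await downloadPromise;
  return download.suggestedFilename();
}
```

A spec could then assert on the returned filename, which fails with a clear message if the click never produces a download.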

Takeaway

The pipeline is correct; the UI tests need to be more deterministic under CI load. Fixing that should bring the job back to green.