[chore] Add testing back #3701

Merged
junaway merged 22 commits into release/v0.85.5 from chore/add-testing-back on Feb 13, 2026

@junaway commented Feb 10, 2026

Copilot AI review requested due to automatic review settings February 10, 2026 19:00

vercel bot commented Feb 10, 2026

The latest updates on your projects.

Project | Deployment | Actions | Updated (UTC)
agenta-documentation | Ready | Preview, Comment | Feb 13, 2026 7:52am

@dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) on Feb 10, 2026

Copilot AI left a comment

Pull request overview

This PR re-enables and expands automated testing across the monorepo by restoring/adding Playwright E2E suites for Web (OSS/EE), migrating SDK “integration” tests to an “e2e” structure/marker model, and updating API pytest suites and docs to align with the new testing taxonomy and endpoints.

Changes:

  • Web: switch Playwright configuration to AGENTA_LICENSE/AGENTA_WEB_URL, adjust navigation helpers, and add multiple OSS/EE E2E test suites.
  • SDK: reorganize pytest suites into tests/pytest/{unit,e2e}, add unit-test scaffolding/docs, and relabel “integration” tests to @pytest.mark.e2e with updated credential provisioning.
  • API: update pytest E2E tests for revised endpoints/query shapes and improve eventual-consistency handling for tracing tests; update DB/query logic and various EE/OSS copy strings.
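
The eventual-consistency handling mentioned for the tracing tests can be pictured as a small polling helper. This is a minimal sketch, not the helper added in this PR: the name `poll_until`, its parameters, and the default timeout and interval are all assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_ready: Callable[[T], bool],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> T:
    """Call fetch() until is_ready(result) is true or the timeout elapses.

    Hypothetical helper: fetch would wrap a query request, and is_ready would
    check that the ingested spans or traces are visible in the response.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated backend whose data only becomes queryable on the third call.
    calls = {"n": 0}

    def fake_query() -> list:
        calls["n"] += 1
        return ["span"] if calls["n"] >= 3 else []

    spans = poll_until(fake_query, lambda r: len(r) == 1, interval=0.01)
    print(spans, "after", calls["n"], "calls")
```

Polling on a readiness predicate, rather than sleeping for a fixed time, keeps the test fast when the backend is quick and still fails with a clear error when the data never appears.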

Reviewed changes

Copilot reviewed 82 out of 146 changed files in this pull request and generated 14 comments.

File | Description
web/tests/tests/fixtures/user.fixture/authHelpers/utilities.ts | Updates environment detection + initial user state creation for Playwright workers.
web/tests/tests/fixtures/base.fixture/uiHelpers/helpers.ts | Makes waitForPath tolerant of full URLs and workspace-scoped path prefixes.
web/tests/tests/fixtures/base.fixture/apiHelpers/index.ts | Loosens URL matching for /apps navigation in API helpers.
web/tests/playwright/global-teardown.ts | Simplifies teardown configuration; adds OSS cleanup + secret cleanup logic.
web/tests/playwright/config/types.d.ts | Refactors tag type definitions toward license/plan/role/speed/cost dimensions.
web/tests/playwright/config/testTags.ts | Replaces environment tags with license/plan/role/cost tags and updates tag arguments.
web/tests/playwright/config/projects.ts | Collapses multi-project setup into a single project using env-driven baseURL/license.
web/tests/playwright.config.ts | Switches testDir/baseURL to license-based paths and removes multi-project config.
web/package.json | Bumps packageManager pnpm version.
web/oss/tests/playwright/e2e/testsset/testset.spec.ts | Adds OSS testset spec wrapper.
web/oss/tests/playwright/e2e/testsset/index.ts | Adds OSS testset E2E flow using fixtures + tags.
web/oss/tests/playwright/e2e/smoke.spec.ts | Adds a basic OSS smoke navigation test.
web/oss/tests/playwright/e2e/settings/model-hub.ts | Adds OSS Model Hub provider add/delete E2E flow.
web/oss/tests/playwright/e2e/settings/model-hub.spec.ts | Adds OSS Model Hub spec wrapper.
web/oss/tests/playwright/e2e/settings/api-keys.ts | Adds OSS API keys E2E flow (currently spec is skipped).
web/oss/tests/playwright/e2e/settings/api-keys-management.spec.ts | Adds OSS API keys spec wrapper (skipped).
web/oss/tests/playwright/e2e/prompt-registry/prompt-registry-flow.spec.ts | Adds OSS prompt registry spec wrapper.
web/oss/tests/playwright/e2e/prompt-registry/index.ts | Adds OSS prompt registry flow test logic (partial flow).
web/oss/tests/playwright/e2e/playground/tests.ts | Adds OSS playground fixture extensions (variant actions).
web/oss/tests/playwright/e2e/playground/run-variant.spec.ts | Adds OSS playground spec wrapper.
web/oss/tests/playwright/e2e/playground/index.ts | Fixes OSS playground spec import path.
web/oss/tests/playwright/e2e/playground/assets/types.ts | Adds OSS playground fixture/types definitions.
web/oss/tests/playwright/e2e/playground/assets/constants.ts | Adds OSS playground test data constants.
web/oss/tests/playwright/e2e/playground/assets/README.md | Documents OSS playground fixtures and usage.
web/oss/tests/playwright/e2e/observability/observability.spec.ts | Adds OSS observability spec wrapper.
web/oss/tests/playwright/e2e/observability/index.ts | Adds OSS observability E2E flow.
web/oss/tests/playwright/e2e/deployment/index.ts | Adds OSS deployment E2E flow.
web/oss/tests/playwright/e2e/deployment/deploy-variant.spec.ts | Adds OSS deployment spec wrapper.
web/oss/tests/playwright/e2e/app/test.ts | Adjusts OSS app test navigation URL waiting.
web/oss/tests/playwright/e2e/app/index.ts | Adds OSS app creation flows using extended fixtures + tags.
web/oss/tests/playwright/e2e/app/create.spec.ts | Adds OSS app creation spec wrapper.
web/oss/tests/playwright/e2e/app/assets/types.ts | Adds OSS app test types/fixtures contracts.
web/oss/tests/playwright/e2e/app/assets/README.md | Documents OSS app management test approach.
web/oss/tests/manual/datalayer/utils/test-types.ts | Adds shared types for manual datalayer test analysis.
web/oss/tests/manual/datalayer/utils/shared-test-setup.ts | Adds reusable setup/recording utilities for manual datalayer tests.
web/oss/tests/manual/datalayer/test-observability.ts | Adds a manual observability atom test runner script.
web/oss/src/pages/w/[workspace_id]/p/[project_id]/apps/[app_id]/endpoints/index.tsx | Updates EE-only tooltip copy.
web/oss/src/components/pages/settings/WorkspaceManage/Modals/InviteUsersModal.tsx | Updates RBAC availability copy.
web/oss/src/components/pages/overview/deployments/DeploymentDrawer/index.tsx | Updates deployment history tooltip copy.
web/ee/tests/playwright/e2e/testsset/testset.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/settings/model-hub.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/settings/api-keys-management.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/prompt-registry/prompt-registry-flow.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/playground/run-variant.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/observability/observability.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/human-annotation/tests.ts | Makes URL assertions regex-safe for querystring.
web/ee/tests/playwright/e2e/human-annotation/index.ts | Adds EE human-annotation E2E flows.
web/ee/tests/playwright/e2e/human-annotation/human-annotation.spec.ts | Adds EE human-annotation spec wrapper.
web/ee/tests/playwright/e2e/human-annotation/assets/types.ts | Adds EE human-annotation fixture types.
web/ee/tests/playwright/e2e/deployment/deploy-variant.spec.ts | Updates EE imports to new OSS Playwright test locations.
web/ee/tests/playwright/e2e/auto-evaluation/tests.ts | Adds EE auto-evaluation fixture extensions.
web/ee/tests/playwright/e2e/auto-evaluation/run-auto-evaluation.spec.ts | Adds EE auto-evaluation spec wrapper.
web/ee/tests/playwright/e2e/auto-evaluation/index.ts | Adds EE auto-evaluation flows.
web/ee/tests/playwright/e2e/auto-evaluation/assets/types.ts | Adds EE auto-evaluation fixture types.
web/ee/tests/playwright/e2e/auto-evaluation/assets/README.md | Documents EE auto-evaluation fixtures.
web/ee/tests/playwright/e2e/app/create.spec.ts | Adds EE app creation wrapper using OSS shared tests.
web/ee/tests/2-app/create.spec.ts | Removes old EE app creation spec path.
sdk/tests/pytest/unit/test_tracing_decorators.py | Fixes unit tests to match updated tracing/span status API + redact behavior.
sdk/tests/pytest/unit/conftest.py | Adds unit-test-only conftest (no external deps).
sdk/tests/pytest/unit/__init__.py | Marks unit tests as a package.
sdk/tests/pytest/unit/TESTING_PATTERNS.md | Adds detailed SDK unit testing patterns documentation.
sdk/tests/pytest/unit/README.md | Adds quick-start docs for SDK unit tests.
sdk/tests/pytest/e2e/workflows/test_legacy_applications_manager.py | Relabels as e2e + adds retry helper for rate limiting.
sdk/tests/pytest/e2e/workflows/test_apps_shared_manager.py | Relabels as e2e + adjusts slug assertions for fully-qualified slugs.
sdk/tests/pytest/e2e/workflows/__init__.py | Package marker for workflows e2e tests.
sdk/tests/pytest/e2e/observability/test_observability_traces.py | Relabels as e2e.
sdk/tests/pytest/e2e/observability/__init__.py | Package marker for observability e2e tests.
sdk/tests/pytest/e2e/integrations/test_vault_secrets.py | Relabels as e2e.
sdk/tests/pytest/e2e/integrations/test_testsets_manager.py | Relabels as e2e.
sdk/tests/pytest/e2e/integrations/test_prompt_template_storage.py | Relabels as e2e.
sdk/tests/pytest/e2e/integrations/test_evaluators_manager.py | Relabels as e2e.
sdk/tests/pytest/e2e/integrations/__init__.py | Package marker for integrations e2e tests.
sdk/tests/pytest/e2e/healthchecks/test_healthchecks.py | Adds SDK e2e healthcheck tests.
sdk/tests/pytest/e2e/healthchecks/conftest.py | Adds e2e healthcheck fixtures wiring.
sdk/tests/pytest/e2e/healthchecks/__init__.py | Package marker for healthchecks e2e tests.
sdk/tests/pytest/e2e/evaluations/test_evaluations_flow.py | Relabels as e2e.
sdk/tests/pytest/e2e/evaluations/__init__.py | Package marker for evaluations e2e tests.
sdk/tests/pytest/e2e/conftest.py | Reworks SDK e2e fixtures to provision accounts via admin API env vars.
sdk/tests/pytest/e2e/__init__.py | Package marker for SDK e2e tests.
sdk/tests/pytest/conftest.py | Minimizes root conftest so unit tests don’t require env/services.
sdk/tests/integration/__init__.py | Removes legacy integration test package marker/doc.
sdk/pytest.ini | Adds markers and introduces e2e marker.
docs/drafts/security/sso-providers.mdx | Clarifies EE naming.
docs/docs/self-host/guides/03-deploy-to-kubernetes.mdx | Clarifies EE naming.
docs/designs/testing/testing.running.specs.md | Adds comprehensive “how to run tests” spec.
docs/designs/testing/testing.principles.specs.md | Adds testing principles and pyramid guidance.
docs/designs/testing/testing.interfaces.specs.md | Adds cross-interface testing overview + matrix.
docs/designs/testing/testing.interface.web.specs.md | Adds Web testing interface spec.
docs/designs/testing/testing.interface.sdk.specs.md | Adds SDK testing interface spec.
docs/designs/testing/testing.interface.api.specs.md | Adds API testing interface spec.
docs/designs/testing/testing.fixtures.specs.md | Adds shared fixtures spec.
docs/designs/testing/testing.dimensions.specs.md | Adds unified dimensions/tagging spec.
docs/designs/testing/README.md | Adds index for testing design docs.
api/pytest.ini | Adds markers for license/cost dimensions.
api/oss/tests/pytest/utils/accounts.py | Makes test account creation emails unique.
api/oss/tests/pytest/e2e/workflows/test_workflows_retrieve.py | Updates retrieve requests to JSON body contract.
api/oss/tests/pytest/e2e/workflows/test_workflows_basics.py | Adds workflows basics e2e tests in new layout.
api/oss/tests/pytest/e2e/workflows/test_workflow_revisions_basics.py | Updates commit payload field naming.
api/oss/tests/pytest/e2e/workflows/test_workflow_lineage.py | Updates revisions querying and “latest revision” selection logic.
api/oss/tests/pytest/e2e/tracing/test_traces_basics.py | Adds polling helpers to handle eventual consistency.
api/oss/tests/pytest/e2e/tracing/test_spans_queries.py | Refactors spans query tests for new ingest/query behavior and eventual consistency.
api/oss/tests/pytest/e2e/tracing/test_spans_basics.py | Updates spans ingest route + polling for query readiness.
api/oss/tests/pytest/e2e/testsets/test_testsets_queries.py | Updates list/query endpoints and expectations.
api/oss/tests/pytest/e2e/testsets/test_testcases_basics.py | Updates testcases endpoints + assertions.
api/oss/tests/pytest/e2e/healthchecks/test_healthchecks.py | Adds API healthcheck e2e tests in new layout.
api/oss/tests/pytest/e2e/evaluators/test_evaluators_queries.py | Updates evaluator query shapes and pagination behavior.
api/oss/tests/pytest/e2e/evaluations/test_evaluation_scenarios_queries.py | Updates scenarios queries to POST/query endpoint format.
api/oss/tests/pytest/e2e/evaluations/test_evaluation_runs_queries.py | Updates runs queries and close behavior; improves test isolation via markers.
api/oss/tests/pytest/e2e/evaluations/test_evaluation_metrics_queries.py | Updates metrics creation/query payload shapes and timestamp filters.
api/oss/tests/pytest/e2e/evaluations/test_evaluation_metrics_basics.py | Updates metric test run names and fetch behavior to query-based retrieval.
api/oss/tests/pytest/e2e/annotations/test_annotations_basics.py | Adds annotations basics e2e tests in new layout.
api/oss/src/services/variants_manager.py | Makes forked variant slugs always unique via suffix.
api/oss/src/dbs/postgres/git/dao.py | Disables meta containment queries for JSON columns; adjusts revision fetch and versioning logic.
api/oss/src/dbs/postgres/folders/dao.py | Disables meta containment queries for JSON columns.
api/oss/src/dbs/postgres/evaluations/dao.py | Ensures close flags are persisted; disables meta containment queries for JSON columns.
api/oss/src/dbs/postgres/blobs/dao.py | Disables meta containment queries for JSON columns.
api/oss/src/apis/fastapi/workflows/router.py | Fixes commit guard for optional variant id mismatch.
api/oss/src/apis/fastapi/evaluations/router.py | Normalizes id comparisons to string-to-string.
api/oss/src/apis/fastapi/auth/router.py | Updates EE-only error message copy.

@dosubot bot added the lgtm label (This PR has been approved by a maintainer) on Feb 10, 2026

@devin-ai-integration bot left a comment

Devin Review found 3 new potential issues.

View 12 additional findings in Devin Review.

Copilot AI left a comment

Pull request overview

Copilot reviewed 83 out of 147 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (1)

web/tests/playwright/global-teardown.ts:33

  • Global teardown now deletes all accounts whenever AGENTA_AUTH_KEY is present and AGENTA_LICENSE === "oss", regardless of whether the target is a local instance. Given the header comment says this is for local OSS testing, this is risky if AGENTA_WEB_URL/AGENTA_API_URL point at a shared/staging/prod OSS environment. Consider additionally gating on baseURL/apiURL being localhost (or an explicit AGENTA_ALLOW_ACCOUNT_PURGE=true flag) before calling /admin/accounts/delete-all.
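
The gating suggested in this comment could look roughly like the predicate below. The teardown itself lives in a TypeScript file; the decision logic is sketched in Python only to make the conditions explicit. `may_purge_accounts` and `LOCAL_HOSTS` are hypothetical names, and `AGENTA_ALLOW_ACCOUNT_PURGE` is the opt-in flag proposed in the comment, not a variable confirmed to exist in the codebase.

```python
from typing import Mapping
from urllib.parse import urlparse

# Assumption: hosts treated as "local" for the purpose of this sketch.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def may_purge_accounts(env: Mapping[str, str]) -> bool:
    """Return True only when deleting all accounts looks safe."""
    # Existing conditions described in the comment: auth key set and OSS license.
    if not env.get("AGENTA_AUTH_KEY"):
        return False
    if env.get("AGENTA_LICENSE") != "oss":
        return False
    # Proposed explicit opt-in for non-local targets.
    if env.get("AGENTA_ALLOW_ACCOUNT_PURGE") == "true":
        return True
    # Otherwise require every configured URL to point at a local host.
    urls = [u for u in (env.get("AGENTA_WEB_URL"), env.get("AGENTA_API_URL")) if u]
    return bool(urls) and all(urlparse(u).hostname in LOCAL_HOSTS for u in urls)


if __name__ == "__main__":
    base = {"AGENTA_AUTH_KEY": "key", "AGENTA_LICENSE": "oss"}
    print(may_purge_accounts({**base, "AGENTA_WEB_URL": "http://localhost:3000"}))
    print(may_purge_accounts({**base, "AGENTA_WEB_URL": "https://staging.example.com"}))
```

A URL without a scheme parses to no hostname here, so the check fails closed and the purge is skipped.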

Copilot AI review requested due to automatic review settings February 10, 2026 19:30

Copilot AI left a comment

Pull request overview

Copilot reviewed 85 out of 149 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (2)

sdk/tests/pytest/e2e/conftest.py:70

  • host = api_url[:-4] assumes AGENTA_API_URL always ends with exactly /api. This will mis-handle values like .../api/ (trailing slash) or a different base path, and can produce an invalid host. Prefer parsing and stripping a trailing /api segment safely (e.g., api_url.rstrip('/').removesuffix('/api')).

sdk/tests/pytest/e2e/workflows/test_legacy_applications_manager.py:39

  • The helper _aupsert_with_retry claims it retries only on 429 rate-limit errors, but it retries whenever applications.aupsert(...) returns None (which can happen for other failures too). Either adjust the docstring to match the behavior or plumb through the underlying error/status so retries are limited to rate-limits.
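
Both suppressed comments can be illustrated with a short sketch. The first function applies the `rstrip`/`removesuffix` approach suggested above (`str.removesuffix` requires Python 3.9+). The second limits retries to an explicit rate-limit error instead of retrying on any `None` result. `host_from_api_url`, `retry_on_rate_limit`, and `RateLimitError` are hypothetical names; whether the SDK surfaces a 429 status in a form that can be caught this way is an assumption, not something the comments confirm.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


def host_from_api_url(api_url: str) -> str:
    """Strip an optional trailing slash and a trailing '/api' segment."""
    return api_url.rstrip("/").removesuffix("/api")


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 surfaced by the client."""


async def retry_on_rate_limit(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    delay: float = 0.5,
) -> T:
    """Retry call() only when it raises RateLimitError; other errors propagate."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except RateLimitError:
            if attempt == attempts:
                raise
            # Linear backoff between attempts.
            await asyncio.sleep(delay * attempt)
    raise RuntimeError("attempts must be at least 1")


if __name__ == "__main__":
    print(host_from_api_url("http://localhost/api/"))

    state = {"n": 0}

    async def flaky() -> str:
        state["n"] += 1
        if state["n"] < 3:
            raise RateLimitError()
        return "ok"

    print(asyncio.run(retry_on_rate_limit(flaky, delay=0.01)))
```

Unlike slicing with `api_url[:-4]`, a URL that does not end in `/api` passes through unchanged apart from the trailing slash.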

@devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 13 additional findings in Devin Review.

@junaway changed the base branch from main to release/v0.85.5 on February 13, 2026 07:45
Copilot AI review requested due to automatic review settings February 13, 2026 07:50

Copilot AI left a comment

Pull request overview

Copilot reviewed 86 out of 150 changed files in this pull request and generated 8 comments.


@junaway merged commit 2313765 into release/v0.85.5 on Feb 13, 2026
5 checks passed

Labels

  • lgtm (This PR has been approved by a maintainer)
  • size:XXL (This PR changes 1000+ lines, ignoring generated files.)
  • tests

3 participants