@bencap commented Dec 24, 2025

This pull request improves the robustness and clarity of the Pydantic view models and their data transformations. "Synthetic" fields (fields generated from ORM objects) are now populated only when the relevant ORM attributes are present, which prevents errors during model instantiation. In addition, publication/source fields are standardized to always be lists (not Optionals), and exception handling is more consistent and informative.

This addresses an inconsistency in which a response returned by an API route could not be used to recreate the model it had been validated against before the server returned it.

Pydantic Model Robustness and Data Transformation:

  • Updated all Pydantic model validators for synthetic fields to perform attribute transformations only if the relevant ORM attributes are present (e.g., user_associations, score_sets, experiments, publication_identifier_associations), reducing the risk of attribute errors when creating models from non-ORM data.
  • Improved exception handling in model validators to catch both AttributeError and KeyError, and updated error messages to describe coercion failures rather than missing attributes.
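
The guard described in the two points above could look roughly like the following. This is a minimal sketch, not the code from this PR: it assumes the Pydantic v2 API, and `ExperimentView`, `synthesize_score_set_urns`, and the error message are hypothetical names chosen for illustration. A `SimpleNamespace` stands in for an ORM object.

```python
from types import SimpleNamespace
from typing import Any

from pydantic import BaseModel, ConfigDict, ValidationError, model_validator


class ExperimentView(BaseModel):
    """Hypothetical view model; names are illustrative, not taken from the PR."""

    model_config = ConfigDict(from_attributes=True)

    urn: str
    # Synthetic field: derived from the ORM `score_sets` relationship.
    score_set_urns: list[str]

    @model_validator(mode="before")
    @classmethod
    def synthesize_score_set_urns(cls, data: Any) -> Any:
        # Only synthesize when the ORM attribute is present. A dict (for
        # example a serialized API response) has no `score_sets` attribute
        # and passes through unchanged, so it must already carry the field.
        if hasattr(data, "score_sets"):
            try:
                setattr(data, "score_set_urns", [s.urn for s in data.score_sets])
            except (AttributeError, KeyError) as exc:
                # Report a coercion failure rather than a missing attribute.
                raise ValueError(
                    f"Unable to coerce score set URNs for {cls.__name__}: {exc}"
                ) from exc
        return data


# ORM-like input: the synthetic field is derived from `score_sets`.
orm_like = SimpleNamespace(
    urn="urn:example:exp-1",
    score_sets=[SimpleNamespace(urn="urn:example:ss-1")],
)
from_orm = ExperimentView.model_validate(orm_like)

# Dict input: the dumped response recreates the same model.
from_dict = ExperimentView.model_validate(from_orm.model_dump())
```

In this sketch a dict that lacks `score_set_urns` is rejected with an ordinary Pydantic validation error, because the validator never tries to synthesize the field from a dict.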

Publication/Source Field Standardization:

  • Changed all score calibration and related models (ScoreCalibrationBase, ScoreCalibrationModify, ScoreCalibrationCreate, SavedScoreCalibration, ScoreCalibration, etc.) so that threshold_sources, classification_sources, and method_sources are always required lists (not Optionals), ensuring consistency in API responses and internal data structures.
  • Updated field validators to always expect collections for publication/source fields, removing previous handling for None values.
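
As a rough illustration of the required-list declaration, the sketch below uses a hypothetical `ScoreCalibrationSketch` model and a hypothetical `PublicationIdentifierSketch` item type; only the three field names come from this PR. It assumes the Pydantic v2 API and omits the field validators themselves.

```python
from pydantic import BaseModel, ValidationError


class PublicationIdentifierSketch(BaseModel):
    """Hypothetical stand-in for a publication identifier view model."""

    identifier: str
    db_name: str


class ScoreCalibrationSketch(BaseModel):
    """Hypothetical model; the source fields are required lists, never None."""

    threshold_sources: list[PublicationIdentifierSketch]
    classification_sources: list[PublicationIdentifierSketch]
    method_sources: list[PublicationIdentifierSketch]


# An empty list is a valid way to say "no sources".
empty = ScoreCalibrationSketch(
    threshold_sources=[],
    classification_sources=[],
    method_sources=[],
)


def rejects(payload: dict) -> bool:
    """Return True when the payload fails validation."""
    try:
        ScoreCalibrationSketch.model_validate(payload)
    except ValidationError:
        return True
    return False


# None and omitted fields are both rejected once the fields are required lists.
none_rejected = rejects(
    {"threshold_sources": None, "classification_sources": [], "method_sources": []}
)
missing_rejected = rejects({"classification_sources": [], "method_sources": []})
```

With this shape, consumers of the API can iterate over the source fields without a None check.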

These changes collectively improve the reliability of data ingestion and transformation in the API and make error reporting more actionable for developers.

- Only generate synthetic fields for experiments (e.g., publication identifiers, score set URNs) when ORM attributes are present, avoiding dict-based synthesis.
- Validators now check for ORM attribute presence before transformation, ensuring correct behavior for both ORM and API/dict contexts.
- Updated tests to expect Pydantic validation errors when required synthetic fields are missing.
- Refactored SavedTargetGene and TargetGeneWithScoreSetUrn to populate synthetic fields (e.g., external_identifiers, score_set_urn) only from ORM objects, not dicts.
- Updated model validators to require either target_sequence or target_accession for all construction contexts.
- Added tests to ensure SavedTargetGene and TargetGeneWithScoreSetUrn can be created from both ORM (attributed object) and non-ORM (dict) contexts.
- Refactored SavedCollection and CollectionWithUrn to ensure robust handling of synthetic and required fields for both ORM and dict contexts.
- Added parameterized tests to verify all key attributes are correctly handled in both construction modes.
- Added tests for creation from both dict and ORM contexts, mirroring the approach used for other models.
- Refactored MappedVariant view models to ensure robust handling of synthetic and required fields for both ORM and dict contexts.
- Added tests to verify all key attributes and synthetic properties are correctly handled in both construction modes.
- Ensured creation from both dict and ORM contexts, mirroring the approach used for other models.
- Refactored Variant view models to ensure robust handling of synthetic and required fields for both ORM and dict contexts.
- Added tests to verify all key attributes and synthetic properties are correctly handled in both construction modes.
- Ensured creation from both dict and ORM contexts, mirroring the approach used for other models.
- Refactored ScoreCalibration view models to ensure robust handling of synthetic and required fields for both ORM and dict contexts.
- Made the source fields non-optional to enforce required data integrity.
- Added tests to verify all key attributes and synthetic properties are correctly handled in both construction modes.
- Ensured creation from both dict and ORM contexts, mirroring the approach used for other models.
- Refactored ScoreSet view models to ensure robust handling of synthetic and required fields for both ORM and dict contexts.
- Added tests to verify all key attributes and synthetic properties are correctly handled in both construction modes.
- Ensured creation from both dict and ORM contexts, mirroring the approach used for other models.
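
The dual-context tests listed above might follow a pattern like the one below. This is a hedged sketch rather than the test code from this PR: `TargetGeneSketch`, `build_inputs`, and `check_both_contexts` are hypothetical, a plain loop stands in for a parameterized test, and "either target_sequence or target_accession" is interpreted here as "at least one of the two". The Pydantic v2 API is assumed.

```python
from types import SimpleNamespace
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, ValidationError, model_validator


class TargetGeneSketch(BaseModel):
    """Hypothetical view model used only to illustrate the test pattern."""

    model_config = ConfigDict(from_attributes=True)

    name: str
    target_sequence: Optional[str] = None
    target_accession: Optional[str] = None

    @model_validator(mode="after")
    def require_sequence_or_accession(self) -> "TargetGeneSketch":
        # Assumption: at least one of the two must be set, regardless of
        # whether the model was built from an ORM object or a dict.
        if self.target_sequence is None and self.target_accession is None:
            raise ValueError("either target_sequence or target_accession is required")
        return self


def build_inputs() -> dict[str, Any]:
    """Build the same data as a dict and as an attributed (ORM-like) object."""
    fields = {"name": "BRCA1", "target_sequence": "ACGT", "target_accession": None}
    return {"dict": dict(fields), "orm": SimpleNamespace(**fields)}


def check_both_contexts() -> list[str]:
    """Validate each construction mode and return the modes that passed."""
    checked = []
    for mode, raw in build_inputs().items():
        model = TargetGeneSketch.model_validate(raw)
        assert model.name == "BRCA1"
        assert model.target_sequence == "ACGT"
        checked.append(mode)
    return checked


def fails_without_target(raw: Any) -> bool:
    """Return True when validation fails because neither target field is set."""
    try:
        TargetGeneSketch.model_validate(raw)
    except ValidationError:
        return True
    return False
```

Running the same assertions over both inputs keeps the ORM and dict paths from drifting apart, which is the inconsistency this PR sets out to fix.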

Successfully merging this pull request may close these issues.

Proper handling of synthetic properties from all view model creation contexts
