@bencap bencap commented Jul 22, 2025

Enhancements to error handling:

  • Added detailed error messages for cases where accession values cannot be determined during allele or haplotype annotation, replacing generic ValueError exceptions (src/dcd_mapping/annotate.py).
  • Improved error messages for transcript selection failures so that they state the specific reason for the failure, such as a missing gene symbol or no matching transcripts (src/dcd_mapping/transcripts.py).
  • Added error handling for unsupported sequence namespaces and intronic variants in genomic mapping, returning MappedScore objects with appropriate error messages (src/dcd_mapping/vrs_map.py).
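
As a rough illustration of the pattern in the last bullet, the sketch below returns a per-variant result that carries an error message instead of raising. `MappedScoreSketch`, `map_variant`, the namespace set, and the intronic check are hypothetical stand-ins chosen for this example; they are not the project's `MappedScore` model or the code in `vrs_map.py`.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for the project's MappedScore model.
@dataclass
class MappedScoreSketch:
    accession_id: str
    score: Optional[str]
    mapped: Optional[str] = None
    error_message: Optional[str] = None


# Assumed set of supported namespaces, for illustration only.
SUPPORTED_NAMESPACES = {"refseq", "ga4gh"}

# Simplified heuristic: an HGVS position with a +/- offset (e.g. c.88+2T>G).
INTRONIC_OFFSET = re.compile(r"\d[+-]\d")


def map_variant(accession_id: str, score: str, sequence_id: str, hgvs: str) -> MappedScoreSketch:
    """Return a result for every variant, recording failures as messages."""
    namespace = sequence_id.split(":", 1)[0].lower()
    if namespace not in SUPPORTED_NAMESPACES:
        return MappedScoreSketch(
            accession_id,
            score,
            error_message=f"Unsupported sequence namespace '{namespace}' for {sequence_id}",
        )
    if INTRONIC_OFFSET.search(hgvs):
        return MappedScoreSketch(
            accession_id,
            score,
            error_message=f"Intronic variant {hgvs} cannot be mapped",
        )
    # Placeholder for the real mapping step.
    return MappedScoreSketch(accession_id, score, mapped=f"{sequence_id}:{hgvs}")
```

The caller always gets an object back, so a failed variant still appears in the output with a reason attached.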

Improvements to mapping logic:

  • Introduced checks for mismatches between the number of mapped scores and total score records, with corresponding error messages (src/api/routers/map.py).
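
A check of this kind could look roughly like the following. The function name, signature, and message wording are assumptions made for illustration and are not taken from `src/api/routers/map.py`.

```python
from typing import Optional, Sequence


def check_mapping_count(mapped_scores: Sequence[object], total_records: int) -> Optional[str]:
    """Return an error message when the counts differ, otherwise None."""
    mapped_count = len(mapped_scores)
    if mapped_count != total_records:
        return (
            f"Mapped {mapped_count} of {total_records} score records; "
            "every score record is expected to produce a mapped score."
        )
    return None
```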

Codebase simplification:

  • Removed hard-coded logic for specific datasets, such as manually selecting protein annotations or predefined results for certain score sets (src/dcd_mapping/annotate.py, src/dcd_mapping/transcripts.py).
  • Simplified conditional statements by directly assigning error messages without redundant checks (src/dcd_mapping/annotate.py).

Performance and timeout adjustments:

  • Increased the HTTP request timeout from 30 to 60 seconds in the http_download function to accommodate slower network conditions (src/dcd_mapping/resource_utils.py).

bencap added 2 commits July 18, 2025 16:33
This should help slightly with timeout issues we see in production
…g annotated mappings

Prior to this change, some score rows could generate valid mappings while other rows in the same score set produced no mapped variant. This had some negative downstream consequences. They are remedied by ensuring that if any variant receives a mapped variant, all variants receive one.
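
One way to picture that guarantee is the minimal sketch below, which fills in a placeholder record for any variant that did not map. The function name, record shape, and message text are hypothetical and do not reflect the project's actual data model.

```python
from typing import Dict, List


def ensure_all_mapped(accessions: List[str], mapped: Dict[str, dict]) -> List[dict]:
    """Return one record per accession, in input order.

    `mapped` holds the records that were produced, keyed by accession.
    If at least one variant mapped, every missing variant gets a
    placeholder record with an error message.
    """
    if not mapped:
        # Nothing mapped at all; this sketch leaves that case to the caller.
        return []
    return [
        mapped.get(
            accession,
            {
                "accession_id": accession,
                "mapped": None,
                "error_message": "No mapping was generated for this variant",
            },
        )
        for accession in accessions
    ]
```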
@bencap bencap requested a review from sallybg July 22, 2025 22:22
@bencap bencap linked an issue Jul 22, 2025 that may be closed by this pull request
@bencap bencap force-pushed the feature/bencap/45/guarantee-mapping-output-for-all-variants branch from b75c18a to 36bc828 Compare July 23, 2025 19:55
@sallybg sallybg left a comment

This looks good! There are a few places where we are now returning individual variant-level errors when there is a problem with a reference sequence. I lean toward having the whole mapping job fail if these errors happen, although I recognize that for multi-target score sets we'd then be failing all targets if even one target fails.

bencap commented Jul 28, 2025

Thanks @sallybg, totally agree.

I added some custom exceptions for those possibilities and included them in the try/except block during mapping. I also went ahead and refactored the other custom exceptions we have into a dedicated exception module to keep them a little more organized.
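
For readers following along, the general shape of that change might resemble the sketch below: reference-sequence problems raise a dedicated exception, and the job-level try/except turns it into a failure of the whole job instead of a per-variant error. All class and function names here are hypothetical; the actual exception module in the PR may be organized differently.

```python
from typing import Callable, Iterable


class MappingError(Exception):
    """Hypothetical base class for mapping failures."""


class ReferenceSequenceError(MappingError):
    """Hypothetical error for a problem with a target's reference sequence."""


class UnsupportedNamespaceError(ReferenceSequenceError):
    """Hypothetical error for a reference sequence in an unsupported namespace."""


def run_mapping_job(records: Iterable[str], map_one: Callable[[str], dict]) -> dict:
    """Map every record, failing the whole job on a reference-level error."""
    try:
        mapped = [map_one(record) for record in records]
    except ReferenceSequenceError as error:
        # A reference-level problem affects every variant, so no partial
        # results are returned.
        return {"status": "failed", "error_message": str(error), "mapped": []}
    return {"status": "ok", "error_message": None, "mapped": mapped}
```

Catching the shared parent class keeps the job-level handler short while still letting each failure mode carry its own message.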

@sallybg sallybg left a comment

Looks great!

@bencap bencap merged commit f037ce1 into mavedb-dev Aug 5, 2025
6 checks passed
@bencap bencap deleted the feature/bencap/45/guarantee-mapping-output-for-all-variants branch August 5, 2025 21:15
@bencap bencap mentioned this pull request Aug 8, 2025
Development

Successfully merging this pull request may close these issues.

Guarantee an output for each variant in a mapped score set
