Skip to content

Conversation

@Arslan-Siraj
Copy link
Contributor

@Arslan-Siraj Arslan-Siraj commented Aug 1, 2025

Summary by CodeRabbit

  • New Features

    • Enhanced the "View Results" tab to allow selection and viewing of result files, with improved handling and annotation of MS2 spectra using tolerant matching for more accurate visualization.
    • Added interactive data grid selection and robust annotation assignment for spectra.
  • Bug Fixes

    • Improved warning messages and fallback behavior when associated MS2 files are missing.
  • Chores

    • Updated dependencies, pinning specific versions of key packages for stability.
    • Adjusted Docker configuration related to Python wheel building steps.

@coderabbitai
Copy link

coderabbitai bot commented Aug 1, 2025

Walkthrough

The changes update dependency management files by pinning versions and adding new packages, especially for data visualization and scientific computation. The Dockerfile is modified to skip building and installing pyOpenMS Python wheels. The main application logic for viewing results is refactored to improve file selection and spectrum annotation, introducing tolerant matching using scipy.spatial.cKDTree.

Changes

Cohort / File(s) Change Summary
Dependency Management
environment.yml, requirements.txt
Added and pinned new dependencies (scipy, streamlit-plotly-events, streamlit-aggrid, pandas, captcha). Updated version pinning for streamlit and reordered some packages.
Docker Build Process
Dockerfile
Commented out the steps for building and installing pyOpenMS Python wheels. No other build steps changed.
Result Viewing Logic Refactor
content/Result_2.py
Refactored logic for loading and selecting result files. Improved spectrum annotation using tolerant matching via cKDTree. Enhanced UI flow and removed redundant code. No changes to public APIs.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant App (Result_2.py)
    participant FileSystem
    participant AgGrid
    participant Plotter

    User->>App (Result_2.py): Open "View Results" tab
    App->>FileSystem: List .idXML files in workspace
    App->>User: Show select box for .idXML files
    User->>App: Selects a file
    App->>FileSystem: Load selected .idXML and associated .mzML
    App->>AgGrid: Display CSM dataframe with selectable rows
    User->>AgGrid: Selects a CSM row
    AgGrid->>App: Return selected row data
    App->>App: Annotate spectrum using cKDTree matching (if MS2 available)
    App->>Plotter: Plot annotated spectrum
    Plotter->>User: Display annotated spectrum
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~18 minutes

Poem

In the garden of code where dependencies grow,
New pins and packages line up in a row.
The Docker bunny takes a rest,
Skips the wheel-building quest.
With cKDTree, spectra align just right—
AgGrid and Plotly bring data to light.
🐇✨ All’s ready for a hoppy insight!

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (3)
Dockerfile (1)

1-1: Update the Dockerfile comment to reflect that pyOpenMS is no longer built

The comment states "This Dockerfile builds OpenMS, the TOPP tools, pyOpenMS and thidparty tools" but lines 82-84 that build pyOpenMS are now commented out. Update the comment to reflect the current behavior.

Apply this diff:

-# This Dockerfile builds OpenMS, the TOPP tools, pyOpenMS and thidparty tools.
+# This Dockerfile builds OpenMS, the TOPP tools, and third-party tools.
+# Note: pyOpenMS is installed via pip from requirements.txt, not built from source.

Also fix the typo: "thidparty" → "third-party"

content/Result_2.py (2)

1-11: Avoid star imports for better code clarity

The star imports (lines 2, 3) make it difficult to track where symbols come from and can lead to namespace pollution. This is causing the static analysis warnings about potentially undefined names.

Replace star imports with explicit imports:

-from src.common.common import *
-from src.result_files import *
+from src.common.common import page_setup, show_table, show_fig, v_space, save_params
+from src.result_files import (load_example_result_files, list_result_example_files,
+                            copy_local_result_files_from_directory, remove_selected_result_files,
+                            remove_all_result_files, download_selected_result_files,
+                            create_zip_and_get_base64_, save_uploaded_result, readAndProcessIdXML)

54-56: Improve exception handling to aid debugging

The broad exception handler hides the actual error, making debugging difficult.

    except Exception as e:
+       st.error(f"Error loading mzML file: {str(e)}")
        return None  # Return None if any exception occurs

Or at least log the exception for debugging purposes.

🧹 Nitpick comments (3)
environment.yml (1)

1-1: Address the TODO: Use exact version specifiers for all packages

The TODO comment correctly identifies an inconsistency issue. Some packages use exact versions (e.g., python==3.11, scipy==1.16.1) while others use minimum version constraints (e.g., streamlit>=1.38.0, pyopenms_viz>=0.1.2). For reproducible builds, consider pinning all versions.

Would you like me to help create a version-pinned environment file with all exact versions?

content/Result_2.py (2)

134-138: Replace hardcoded string check with a more robust approach

Using find("Example") != -1 is fragile. Consider using a more explicit check.

-            if (selected_file.find("Example") != -1): 
-               file_name_wout_out = "Example"
+            if selected_file.startswith("Example"):
+                file_name_wout_out = "Example"

186-186: Consider making the m/z tolerance configurable

The hardcoded tolerance of 0.001 might not be suitable for all use cases. Consider making it a configurable parameter.

# At the top of the file or in a config
MZ_TOLERANCE = 0.001  # m/z tolerance for peak matching

# Then use it here:
tolerance = MZ_TOLERANCE
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2701ebe and 7c372b3.

📒 Files selected for processing (4)
  • Dockerfile (1 hunks)
  • content/Result_2.py (2 hunks)
  • environment.yml (1 hunks)
  • requirements.txt (1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
content/Result_2.py

111-111: load_example_result_files may be undefined, or defined from star imports

(F405)


114-114: Path may be undefined, or defined from star imports

(F405)


120-120: Path may be undefined, or defined from star imports

(F405)


128-128: readAndProcessIdXML may be undefined, or defined from star imports

(F405)


139-139: os may be undefined, or defined from star imports

(F405)


139-139: Path may be undefined, or defined from star imports

(F405)


182-182: np may be undefined, or defined from star imports

(F405)


187-187: np may be undefined, or defined from star imports

(F405)


188-188: np may be undefined, or defined from star imports

(F405)


215-215: pd may be undefined, or defined from star imports

(F405)


218-218: os may be undefined, or defined from star imports

(F405)


220-220: show_fig may be undefined, or defined from star imports

(F405)


220-220: os may be undefined, or defined from star imports

(F405)

🔇 Additional comments (4)
environment.yml (1)

18-18: LGTM!

The addition of scipy==1.16.1 is appropriate for the new cKDTree functionality used in Result_2.py.

Dockerfile (1)

82-84: pyOpenMS installation is covered by requirements.txt

  • requirements.txt includes pyopenms==3.2.0, so pip install -r requirements.txt will install pyOpenMS.
  • The commented-out build steps in Dockerfile (lines 82–84) can be removed, as source-building is no longer needed.
requirements.txt (1)

13-13: SciPy 1.16.1 is a valid, current release.
No changes needed—you can keep the exact pin at line 13 in requirements.txt as is.

content/Result_2.py (1)

178-205: Well-implemented tolerant matching with KDTree

The implementation correctly uses scipy's cKDTree for efficient nearest-neighbor queries with tolerance. The fallback logic for missing MS2 data is also properly handled.

MS2 = process_mzML_file(os.path.join(Path.cwd().parent , str(st.session_state.workspace)[3:] , "mzML-files" ,selected_mzML_file))
if MS2 is None:
st.warning("The corresponding " + ".mzML file could not be found. Please re-upload the mzML file to visualize all peaks.")
MS2 = process_mzML_file(os.path.join(Path.cwd().parent , str(st.session_state.workspace)[3:] , "mzML-files" ,f"{file_name_wout_out}.mzML"))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Improve path construction for robustness

The path construction using Path.cwd().parent and string slicing [3:] is fragile and could break if the directory structure changes.

-            MS2 = process_mzML_file(os.path.join(Path.cwd().parent ,  str(st.session_state.workspace)[3:] , "mzML-files" ,f"{file_name_wout_out}.mzML"))
+            mzml_path = Path(st.session_state.workspace) / "mzML-files" / f"{file_name_wout_out}.mzML"
+            MS2 = process_mzML_file(str(mzml_path))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
MS2 = process_mzML_file(os.path.join(Path.cwd().parent , str(st.session_state.workspace)[3:] , "mzML-files" ,f"{file_name_wout_out}.mzML"))
mzml_path = Path(st.session_state.workspace) / "mzML-files" / f"{file_name_wout_out}.mzML"
MS2 = process_mzML_file(str(mzml_path))
🧰 Tools
🪛 Ruff (0.12.2)

139-139: os may be undefined, or defined from star imports

(F405)


139-139: Path may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents
In content/Result_2.py at line 139, the current path construction uses
Path.cwd().parent combined with string slicing [3:], which is fragile and
error-prone. Replace this with a more robust approach by using pathlib's Path
methods to join paths without manual string slicing. Construct the full path by
properly joining Path.cwd().parent with the workspace directory and other
subdirectories using Path objects and the / operator or Path.joinpath to ensure
platform-independent and reliable path handling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant