-
-
Notifications
You must be signed in to change notification settings - Fork 47
feat: Add AI-powered dependency conflict prediction (#428) #438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add LLM-based conflict predictor that analyzes system state before install - Detect version incompatibilities using Claude/Anthropic API - Generate resolution strategies ranked by safety score - Display conflicts with safety bars and [RECOMMENDED] labels - Support venv isolation, upgrade, downgrade, and remove strategies - Chain venv commands with && to run in same shell session - Add pip package extraction to installation_history - Add demo script and documentation
…redictor - Remove unused _build_analysis_prompt() and _parse_llm_response() methods - Rewrite test_conflict_predictor.py to match current LLM-based implementation - Add security tests (command injection, constraint validation) - Add JSON extraction robustness tests - Add integration tests for full prediction flow - Add timeout and memory usage tests - Remove debug print statements
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the 📝 WalkthroughWalkthroughThis PR introduces an AI-powered dependency conflict prediction system that analyzes potential package conflicts before installation. It adds a new ConflictPredictor module leveraging an LLM via LLMRouter, integrates conflict detection into the CLI install workflow, tracks resolution outcomes in a new database table, and includes comprehensive test coverage and documentation. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User/CLI
participant CliModule as cli.install()
participant Predictor as ConflictPredictor
participant Router as LLMRouter
participant LLM as LLM Provider
participant History as InstallationHistory
participant Installer as pip/apt
User->>CliModule: cortex install package
CliModule->>CliModule: Extract packages with versions
CliModule->>Predictor: Initialize ConflictPredictor
CliModule->>Predictor: predict_conflicts_with_resolutions()
Predictor->>Predictor: get_pip_packages() + get_apt_packages_summary()
rect rgb(230, 245, 250)
Predictor->>Router: Query LLM for conflict analysis
Router->>LLM: Send combined prompt
LLM->>Router: Return conflict predictions & strategies (JSON)
Router->>Predictor: Return LLM response
end
Predictor->>Predictor: Parse JSON responses
Predictor->>Predictor: Fallback to basic strategies if needed
Predictor->>CliModule: Return conflicts + strategies
rect rgb(255, 245, 220)
alt Conflicts Found
CliModule->>CliModule: format_conflict_summary()
CliModule->>User: Display conflicts & ranked strategies
CliModule->>User: Prompt for resolution choice
User->>CliModule: Select strategy or cancel
else No Conflicts
CliModule->>Installer: Proceed with install
end
end
rect rgb(240, 255, 240)
alt Strategy Chosen
CliModule->>CliModule: Modify install commands (venv/prepend)
CliModule->>Installer: Execute modified install
Installer->>Installer: Install packages
alt Installation Succeeds
CliModule->>History: record_resolution(success=True)
else Installation Fails
CliModule->>History: record_resolution(success=False, feedback)
end
else No Strategy / Cancelled
CliModule->>User: Abort or proceed without resolution
end
end
History->>History: Persist conflict_resolutions entry
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
CLA Verification FailedThe following contributors have not signed the Contributor License Agreement:
How to Sign
Verified SignersThis check runs automatically. Maintainers can update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (11)
demo_conflict_predictor.sh (1)
1-30: Add explicit shell shebang (and consider strict mode) for this demo scriptWithout a shebang ShellCheck can’t infer the target shell and tooling may mis-handle the script. I’d add a Bash shebang, and optionally strict mode for more predictable failure behavior.
Proposed header
+#!/usr/bin/env bash +set -euo pipefail + echo "Test 1: NumPy/TensorFlow conflict" pip3 install numpy==2.1.0 cortex install "tensorflow==2.15.0" --dry-rundocs/AI_DEPENDENCY_CONFLICT_PREDICTION.md (1)
233-249: Align documented safety-score/learning behavior with current implementationThe “Safety Score Calculation” and “Learning from outcomes” sections describe:
- Dynamic safety scores adjusted by number of risks/affected packages and historical success rates.
- Recording resolutions into
installation_history.dband using past success rates to influence suggestions.In the current code:
_generate_basic_strategiesincortex/conflict_predictor.pyassigns fixed safety scores (0.85, 0.75, 0.5, 0.10) and doesn’t queryInstallationHistory.get_conflict_resolution_success_rate.ConflictPredictor.record_resolutiononly logs and does not callInstallationHistory.record_conflict_resolution, so no learning data is actually persisted.Either wiring
record_resolutioninto the new history APIs and incorporating success rates into scoring, or relaxing the doc claims to “planned/roadmap” would avoid confusion between docs and behavior.Also applies to: 289-310, 367-378
cortex/cli.py (3)
609-696: Reduce LLM calls and repeated system scans in conflict prediction pathThe
install()conflict-prediction block currently:
- Derives
packages_with_versions, then for each package callspredictor.predict_conflicts(...), and- Later calls
predictor.generate_resolutions(all_conflicts)for strategies.Given
predict_conflictsitself delegates topredict_conflicts_with_resolutions(which already does a combined LLM call and system-state scan per package), this pattern yields:
- 1 LLM call per package for conflicts, plus
- 1 additional LLM call for strategies,
- Repeated
pip3 list/dpkg --get-selectionsper package.You now have
predict_conflicts_with_resolutionsdesigned to do conflicts + strategies in a single pass using one prompt. Refactoring this block to:
- Use
predict_conflicts_with_resolutions(ideally in a way that batches the primary package(s) for the install), and- Avoid a separate
generate_resolutionscall when strategies are already returned,would substantially cut LLM latency and system probing overhead, especially for multi-package installs.
624-627: Preserve and restore original logger levels instead of hardcoding INFODuring conflict prediction you temporarily suppress logging by doing:
setLevel(logging.WARNING/ERROR)on variouscortex.*loggers, then- In
finally, unconditionally resetting them tologging.INFO.If another part of the app or user config set different levels (e.g., DEBUG or ERROR), this will silently override their configuration.
Capturing the current levels up front and restoring them in
finallywould be safer.Illustrative pattern
- logging.getLogger("cortex.conflict_predictor").setLevel(logging.WARNING) - logging.getLogger("cortex.dependency_resolver").setLevel(logging.WARNING) - logging.getLogger("cortex.llm_router").setLevel(logging.ERROR) + cp_logger = logging.getLogger("cortex.conflict_predictor") + dr_logger = logging.getLogger("cortex.dependency_resolver") + lr_logger = logging.getLogger("cortex.llm_router") + prev_levels = { + cp_logger: cp_logger.level, + dr_logger: dr_logger.level, + lr_logger: lr_logger.level, + } + cp_logger.setLevel(logging.WARNING) + dr_logger.setLevel(logging.WARNING) + lr_logger.setLevel(logging.ERROR) ... - logging.getLogger("cortex.conflict_predictor").setLevel(logging.INFO) - logging.getLogger("cortex.dependency_resolver").setLevel(logging.INFO) - logging.getLogger("cortex.llm_router").setLevel(logging.INFO) + for logger_obj, level in prev_levels.items(): + logger_obj.setLevel(level)Also applies to: 692-695
774-783: Be awarerecord_resolutioncurrently only logs and does not persist learning dataThe success/failure blocks correctly loop over
all_conflictsand callpredictor.record_resolution(...), butConflictPredictor.record_resolutionis currently a thin logger-only stub and doesn’t write toInstallationHistory’sconflict_resolutionstable.Until
record_resolutionis wired toInstallationHistory.record_conflict_resolution(...)and/or used to influence safety scores, these calls won’t actually contribute to the learning behavior described in the docs. The CLI wiring is good; the missing piece is the implementation insideConflictPredictor.Also applies to: 798-807, 856-865, 875-884
tests/test_conflict_predictor.py (1)
114-143: Strong coverage of predictor core; consider a small test forprompt_resolution_choiceThis suite does a thorough job on:
- Data classes, JSON extraction, LLM-parsing, basic strategy generation, security validation, and system parsing.
- End-to-end flow via
predict_conflicts_with_resolutionswith mocked LLM/router.One small gap is
prompt_resolution_choice’s behavior (auto-select vs user choice, cancel path). A focused test that:
- Calls
prompt_resolution_choice(strategies, auto_select=True), and- Mocks
input()to exercise a non-“1” numeric choice and a cancel/EOF case,would lock down the UX contract that
cortex.cli.install()relies on.Also applies to: 375-440, 648-692
cortex/installation_history.py (2)
283-330: Harden_extract_packages_with_versionsfor richer pip specifiersThe version-extraction regex:
version_match = re.match(r"^([^=]+)[=]+(.+)$", pkg)works for
foo==1.2.3/foo=1.2.3, but will mis-handle more general pip specifiers such as:
numpy>=1.21(name becomesnumpy>),- Combined constraints in a single argument, e.g.
"numpy>=1.21,<2.0"(name includes operators/commas),- Quoted specifiers.
Given this function feeds the conflict predictor (package name + version), it would be good to:
- Strip surrounding quotes,
- Recognize
==/=as version separators but treat>=,<=,!=as constraints rather than splitting name, and- Optionally drop complex constraints into
version=Noneinstead of producing a malformedname.Sketch of a more robust split
pkg = pkg.strip().strip('"').strip("'") m = re.match(r"^([A-Za-z0-9._+-]+)==([^,]+)$", pkg) if m: name, version = m.group(1), m.group(2) else: # Fallback: treat the whole token as a name when it contains constraint operators if any(op in pkg for op in (">", "<", "!", "~")): name, version = pkg, None else: name, version = pkg, None
129-161: Learning table and APIs are ready but not yet used by the predictorThe new
conflict_resolutionstable, indices, and helpers:
record_conflict_resolution(...)get_conflict_resolution_success_rate(...)look sound and give you a good foundation for learning from past resolutions.
Right now, though,
ConflictPredictor.record_resolutionnever callsrecord_conflict_resolution, and no scoring logic queriesget_conflict_resolution_success_rate, so this data model remains unused.Once
ConflictPredictor.record_resolutionis wired to these methods (e.g., storingconflict.to_dict()/strategy.to_dict()as JSON), you’ll get the historical data the docs describe and you can later feed success rates back intosafety_scoreadjustments.Also applies to: 714-801
cortex/conflict_predictor.py (3)
174-198: Graceful parsing of combined LLM responses; consider tightening error-handling path
_parse_combined_responsecorrectly:
- Uses
extract_json_from_responseto tolerate prefixes/markdown/trailing text,- Defaults unknown conflict/strategy types back to
VERSION/VENV,- Falls back to
_generate_basic_strategies(conflicts)when LLM strategies are missing.Note that
extract_json_from_responsealready catchesjson.JSONDecodeErrorinternally, so the outerexcept json.JSONDecodeErrorhere is effectively dead code. Replacing that with a broader log-onlyexcept Exception(or removing it entirely and relying on the inner guard) would simplify the control flow without changing behavior.Also applies to: 199-273
371-445: Safety scores are currently static; historical success rates are not used
_generate_basic_strategiesassigns fixed safety scores (0.85, 0.75, 0.50, 0.10), and there’s no adjustment based on:
- Number of risks,
- Number of affected packages, or
- Historical success rates from
InstallationHistory.get_conflict_resolution_success_rate.Since your docs describe exactly those factors influencing safety scores, it might be worth:
- At least reading a per-(conflict_type, strategy_type) success rate and nudging scores up/down slightly, or
- Clarifying in comments/docs that safety scores are currently heuristic constants and will be made data-driven later.
Also applies to: 465-519
784-803: CLI entrypoint currently can’t perform real predictions (no LLM router)In the
__main__block you construct:predictor = ConflictPredictor() ... conflicts = predictor.predict_conflicts(args.package, args.version)but without an
LLMRouter:
predict_conflicts_with_resolutionsbails out early with “No LLM router available, skipping conflict prediction” and returns([], []).- The CLI then always prints “✅ No conflicts predicted!”.
This is fine if the module-level CLI is just a stub, but if you intend it to be usable, you’ll need to initialize an
LLMRouter(and API key/provider selection) similarly tocortex.cli.install().
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
cortex/cli.pycortex/conflict_predictor.pycortex/installation_history.pydemo_conflict_predictor.shdocs/AI_DEPENDENCY_CONFLICT_PREDICTION.mdtests/test_conflict_predictor.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/installation_history.pytests/test_conflict_predictor.pycortex/cli.pycortex/conflict_predictor.py
**/*install*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations
Files:
cortex/installation_history.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_conflict_predictor.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/installation_history.py
🧬 Code graph analysis (5)
demo_conflict_predictor.sh (1)
cortex/cli.py (1)
install(548-925)
cortex/installation_history.py (5)
cortex/coordinator.py (1)
execute(230-279)cortex/sandbox/sandbox_executor.py (2)
execute(501-633)success(74-76)cortex/graceful_degradation.py (1)
match(280-313)cortex/utils/db_pool.py (1)
get_connection(99-136)cortex/logging_system.py (3)
info(200-202)error(208-210)warning(204-206)
tests/test_conflict_predictor.py (3)
cortex/conflict_predictor.py (19)
ConflictPrediction(65-80)ConflictPredictor(100-458)ConflictType(41-49)ResolutionStrategy(84-97)StrategyType(52-61)escape_command_arg(36-38)extract_json_from_response(585-608)format_conflict_summary(616-670)get_pip_packages(715-735)validate_version_constraint(29-33)to_dict(79-80)to_dict(96-97)predict_conflicts(117-125)predict_conflicts_with_resolutions(127-172)_parse_combined_response(199-272)_generate_basic_strategies(361-445)record_resolution(447-458)get_apt_packages_summary(738-777)_build_combined_prompt(174-197)cortex/cli.py (1)
history(952-1020)cortex/sandbox/sandbox_executor.py (1)
success(74-76)
cortex/cli.py (2)
cortex/conflict_predictor.py (6)
ConflictPrediction(65-80)ConflictPredictor(100-458)ResolutionStrategy(84-97)predict_conflicts(117-125)generate_resolutions(274-319)record_resolution(447-458)cortex/installation_history.py (1)
_extract_packages_with_versions(283-330)
cortex/conflict_predictor.py (1)
cortex/llm_router.py (1)
TaskType(32-42)
🪛 markdownlint-cli2 (0.18.1)
docs/AI_DEPENDENCY_CONFLICT_PREDICTION.md
58-58: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Shellcheck (0.11.0)
demo_conflict_predictor.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Test (Python 3.12)
- GitHub Check: Test (Python 3.11)
- GitHub Check: Test (Python 3.10)
- GitHub Check: Agent
- GitHub Check: test (3.10)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
🔇 Additional comments (1)
cortex/conflict_predictor.py (1)
25-39: Security hardening for constraints and shell args looks solid
validate_version_constraintandescape_command_argare applied in_generate_basic_strategiesbefore constructing downgrade/uninstall commands, and the tests exercise:
- Acceptance of normal pip-style constraints (e.g.
<2.0,~=1.4.2),- Rejection of obvious injection payloads (e.g.
; rm -rf /),- Proper quoting of malicious package names with
shlex.quote.This is a good balance between functionality and safety for shell-based resolution commands.
Also applies to: 361-433
| def _generate_basic_strategies( | ||
| self, conflicts: list[ConflictPrediction] | ||
| ) -> list[ResolutionStrategy]: | ||
| """Generate basic resolution strategies without LLM.""" | ||
| strategies = [] | ||
|
|
||
| for conflict in conflicts: | ||
| pkg = conflict.package1 | ||
| conflicting = conflict.package2 | ||
|
|
||
| # Strategy 1: Virtual environment (safest) | ||
| strategies.append( | ||
| ResolutionStrategy( | ||
| strategy_type=StrategyType.VENV, | ||
| description=f"Install {pkg} in virtual environment (isolate)", | ||
| safety_score=0.85, | ||
| commands=[ | ||
| f"python3 -m venv {escape_command_arg(pkg)}_env", | ||
| f"source {escape_command_arg(pkg)}_env/bin/activate", | ||
| f"pip install {escape_command_arg(pkg)}", | ||
| ], | ||
| benefits=["Complete isolation", "No system impact", "Reversible"], | ||
| risks=["Must activate venv to use package"], | ||
| affects_packages=[pkg], | ||
| ) | ||
| ) | ||
|
|
||
| # Strategy 2: Try newer version | ||
| strategies.append( | ||
| ResolutionStrategy( | ||
| strategy_type=StrategyType.UPGRADE, | ||
| description=f"Install newer version of {pkg} (may be compatible)", | ||
| safety_score=0.75, | ||
| commands=[f"pip install --upgrade {escape_command_arg(pkg)}"], | ||
| benefits=["May resolve compatibility", "Gets latest features"], | ||
| risks=["May have different features than requested version"], | ||
| affects_packages=[pkg], | ||
| ) | ||
| ) | ||
|
|
||
| # Strategy 3: Downgrade conflicting package | ||
| if conflict.required_constraint and validate_version_constraint( | ||
| conflict.required_constraint | ||
| ): | ||
| strategies.append( | ||
| ResolutionStrategy( | ||
| strategy_type=StrategyType.DOWNGRADE, | ||
| description=f"Downgrade {conflicting} to compatible version", | ||
| safety_score=0.50, | ||
| commands=[ | ||
| f"pip install {escape_command_arg(conflicting)}{conflict.required_constraint}" | ||
| ], | ||
| benefits=[f"Satisfies {pkg} requirements"], | ||
| risks=[f"May affect packages depending on {conflicting}"], | ||
| affects_packages=[conflicting], | ||
| ) | ||
| ) | ||
|
|
||
| # Strategy 4: Remove conflicting (risky) | ||
| strategies.append( | ||
| ResolutionStrategy( | ||
| strategy_type=StrategyType.REMOVE_CONFLICT, | ||
| description=f"Remove {conflicting} (not recommended)", | ||
| safety_score=0.10, | ||
| commands=[ | ||
| f"pip uninstall -y {escape_command_arg(conflicting)}", | ||
| f"pip install {escape_command_arg(pkg)}", | ||
| ], | ||
| benefits=["Resolves conflict directly"], | ||
| risks=["May break dependent packages", "Data loss possible"], | ||
| affects_packages=[conflicting, pkg], | ||
| ) | ||
| ) | ||
|
|
||
| # Sort by safety and deduplicate | ||
| strategies.sort(key=lambda s: s.safety_score, reverse=True) | ||
| seen = set() | ||
| unique = [] | ||
| for s in strategies: | ||
| key = (s.strategy_type, s.description) | ||
| if key not in seen: | ||
| seen.add(key) | ||
| unique.append(s) | ||
|
|
||
| return unique[:4] # Return top 4 strategies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement actual learning in record_resolution using InstallationHistory
ConflictPredictor.record_resolution currently only logs:
logger.info(
f"Recording resolution: {chosen_strategy.strategy_type.value} - "
f"{'success' if success else 'failed'}"
)but does not:
- Persist anything to the
conflict_resolutionstable, nor - Leverage
InstallationHistory.get_conflict_resolution_success_rateto inform future decisions.
Given the new APIs in InstallationHistory, this is the missing link for the “learning from outcomes” feature and for any future safety-score tuning.
Proposed wiring to `InstallationHistory`
def record_resolution(
self,
conflict: ConflictPrediction,
chosen_strategy: ResolutionStrategy,
success: bool,
user_feedback: str | None = None,
) -> None:
"""Record conflict resolution for learning."""
- logger.info(
- f"Recording resolution: {chosen_strategy.strategy_type.value} - "
- f"{'success' if success else 'failed'}"
- )
+ logger.info(
+ "Recording resolution: %s - %s",
+ chosen_strategy.strategy_type.value,
+ "success" if success else "failed",
+ )
+
+ try:
+ # Persist to installation history for future learning
+ self.history.record_conflict_resolution(
+ package1=conflict.package1,
+ package2=conflict.package2,
+ conflict_type=conflict.conflict_type.value,
+ strategy_type=chosen_strategy.strategy_type.value,
+ success=success,
+ user_feedback=user_feedback or "",
+ conflict_data=json.dumps(conflict.to_dict()),
+ strategy_data=json.dumps(chosen_strategy.to_dict()),
+ )
+ except Exception as e:
+ logger.warning("Failed to record conflict resolution: %s", e)Once this is in place, you can later fold get_conflict_resolution_success_rate(...) into your safety-score calculations if desired.
🤖 Prompt for AI Agents
In cortex/conflict_predictor.py around lines 361 to 445, record_resolution
currently only logs the outcome and doesn’t persist or feed back results;
persist the chosen resolution to the conflict_resolutions table via the
InstallationHistory API (record the strategy_type, description/commands,
affected package names, any constraints, success boolean, and timestamp) and
then call InstallationHistory.get_conflict_resolution_success_rate for the
affected package(s) to surface the updated success rate for use in future
decisions; handle DB/IO errors (log and continue) and add/update unit tests to
assert the persistence call and that the success rate is queried after
recording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds AI-powered dependency conflict prediction that analyzes potential package conflicts before installation using LLM analysis. The feature predicts conflicts by sending system state (pip/apt packages) to an LLM, generates resolution strategies with safety scores, and provides an interactive CLI workflow.
Key changes:
- LLM-based conflict detection with single API call optimization for combined conflict analysis and resolution generation
- Interactive resolution selection with safety visualization (progress bars and [RECOMMENDED] labels)
- Security hardening with input validation, command escaping, and timeout protection
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| cortex/conflict_predictor.py | Core implementation with LLM-based conflict prediction, resolution strategy generation, and security validations |
| cortex/cli.py | Integration into install command with conflict checking workflow and resolution prompting |
| cortex/installation_history.py | Database schema and methods for recording conflict resolutions and tracking success rates |
| tests/test_conflict_predictor.py | Comprehensive unit tests covering security, parsing, integration, and edge cases (52 tests) |
| docs/AI_DEPENDENCY_CONFLICT_PREDICTION.md | Technical documentation describing architecture, API, and usage examples |
| demo_conflict_predictor.sh | Demo script showing various conflict scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
cortex/conflict_predictor.py (4)
154-172: Consider increasingmax_tokensor implementing chunking for complex scenarios.With up to 50 pip packages and 30 apt packages in the prompt, plus multiple conflict/strategy objects in the response, 3000 tokens may truncate output for complex dependency graphs.
Suggested adjustment
response = self.llm_router.complete( messages=messages, task_type=CONFLICT_TASK_TYPE, temperature=0.2, - max_tokens=3000, + max_tokens=4096, # Allow headroom for complex dependency scenarios )Also, consider using lazy logging (e.g.,
logger.info("Found %d conflicts, %d strategies", len(conflicts), len(strategies))) to avoid string formatting when log level is disabled.
696-704: Add brief comment explaining the ValueError fall-through.The empty
except ValueErrorblock intentionally falls through to the "Invalid choice" handling below. A brief comment clarifies this is intentional.try: idx = int(choice) - 1 if 0 <= idx < max_choices: return strategies[idx], idx except ValueError: - pass + pass # Non-numeric input: fall through to invalid-choice handling print(" Invalid choice. Using option 1.")
406-418: Escape the entire package specifier, not just the package name.The current construction
{escape_command_arg(conflicting)}{conflict.required_constraint}produces malformed commands like'numpy'>=1.0. The constraint should be included in the quoted argument.Proposed fix
if conflict.required_constraint and validate_version_constraint( conflict.required_constraint ): strategies.append( ResolutionStrategy( strategy_type=StrategyType.DOWNGRADE, description=f"Downgrade {conflicting} to compatible version", safety_score=0.50, commands=[ - f"pip install {escape_command_arg(conflicting)}{conflict.required_constraint}" + f"pip install {escape_command_arg(conflicting + conflict.required_constraint)}" ], benefits=[f"Satisfies {pkg} requirements"], risks=[f"May affect packages depending on {conflicting}"], affects_packages=[conflicting], ) )
448-459:record_resolutionstill only logs without persisting to database.This remains unaddressed from previous reviews. The method must call
self.history.record_conflict_resolution(...)to enable the "learn from outcomes" feature specified in the linked issue. Based on learnings, audit logging to~/.cortex/history.dbis required for all package operations.
🧹 Nitpick comments (2)
cortex/conflict_predictor.py (2)
25-38: Consider documenting the constraint validation limitations.The regex rejects valid compound pip constraints (e.g.,
>=1.0,<2.0). This is a security-conscious trade-off, but should be documented in the docstring so callers understand the restriction.Additionally,
escape_command_arglacks a docstring per coding guidelines.Proposed documentation improvements
def validate_version_constraint(constraint: str) -> bool: - """Validate pip version constraint format to prevent injection.""" + """Validate pip version constraint format to prevent injection. + + Note: Only simple single-specifier constraints are supported (e.g., '>=1.0'). + Compound constraints like '>=1.0,<2.0' are rejected for security. + """ if not constraint: return True return bool(CONSTRAINT_PATTERN.match(constraint.strip())) def escape_command_arg(arg: str) -> str: - """Safely escape argument for shell commands.""" + """Safely escape argument for shell commands. + + Args: + arg: The argument string to escape. + + Returns: + Shell-safe quoted string. + """ return shlex.quote(arg)
269-272: Dead exception handler:JSONDecodeErroris already handled internally.
extract_json_from_responsecatchesJSONDecodeErrorinternally and returnsNonerather than raising. Thisexceptblock at lines 269-270 will never execute.Suggested cleanup
- except json.JSONDecodeError as e: - logger.warning(f"Failed to parse LLM response as JSON: {e}") - return conflicts, strategies
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cortex/conflict_predictor.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/conflict_predictor.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/conflict_predictor.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.10)
- GitHub Check: Test (Python 3.11)
- GitHub Check: Test (Python 3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.10)
🔇 Additional comments (11)
cortex/conflict_predictor.py (11)
1-24: LGTM!Module docstring, imports, and logger setup are well-organized. The comment explaining the task type choice is helpful for maintainability.
41-62: LGTM!Enum definitions are clean and well-documented with appropriate conflict and strategy types.
64-98: LGTM!Dataclasses are well-designed with proper type hints, default factories for mutable defaults, and
to_dict()methods that correctly handle enum serialization.
100-126: LGTM!Clean class design with proper docstrings, type hints, and delegation pattern. The legacy method annotation is helpful for guiding callers.
274-320: LGTM!Clean implementation with proper LLM fallback to basic strategies, good error handling, and appropriate logging.
321-360: LGTM!Good JSON parsing with proper exception handling and logging. The sorting by safety_score ensures consistent prioritization.
466-579: LGTM!System prompts are well-crafted with clear JSON schemas, examples, and instructions. The separation of concerns between combined analysis and individual prompts provides flexibility.
586-610: LGTM!Excellent use of
JSONDecoder.raw_decodefor robust JSON extraction from LLM responses that may contain surrounding text. The docstring clearly explains the approach.
617-672: LGTM!Clean formatting with good UX touches: safety bars, emoji indicators, and appropriate truncation of displayed items.
716-737: LGTM!Robust subprocess handling with appropriate timeout, comprehensive exception handling, and debug logging. The 5-second timeout prevents UI blocking.
739-778: LGTM!Smart filtering of apt packages to relevant prefixes, with proper timeout protection and error handling for malformed dpkg output.
Anshgrover23
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShreeJejurikar Please follow the updated contribution guidelines. We’ve added new rules around AI usage in CONTRIBUTING.md. 2e68b29
Yours CI checks are failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
cortex/cli.py (4)
641-649: Clarify the OpenAI to Kimi K2 routing or add proper API key support.The comment states "OpenAI uses Kimi K2 in the router" and passes
kimi_api_key=api_keywhenprovider="openai". This is confusing—users settingCORTEX_PROVIDER=openailikely expect OpenAI's API, not Kimi K2.Consider either:
- Updating the comment to clearly explain this routing decision and that a Kimi API key is expected
- Adding separate
openai_api_keyparameter support toLLMRouter- Using the provider's actual API instead of routing through Kimi
Based on past review comments, this clarification is still needed.
636-639: Verify thread-safety of global logger manipulation.The code temporarily modifies global logger levels (lines 636-639) and restores them in a finally block (lines 704-708). While the try-finally pattern ensures restoration, manipulating global logger settings can be problematic if the install command runs concurrently (e.g., if future changes allow concurrent installs or if users run multiple
cortex installprocesses simultaneously).Consider using a context manager or ensuring the conflict prediction phase doesn't run concurrently. Based on past review comments, this pattern should be validated for concurrent scenarios or documented as single-threaded only.
Also applies to: 704-708
653-661: Use the optimized single-call API to avoid multiple LLM calls.The current implementation calls
predict_conflicts()in a loop for each package (lines 654-656), then separately callsgenerate_resolutions()(line 661). This makes N+1 expensive LLM calls. The PR description emphasizes "Single LLM call optimization," but the optimizedpredict_conflicts_with_resolutions()method is not being used here.🔎 Proposed fix using the optimized API
- # Predict conflicts for each package (with version if available) - for package_name, version in packages_with_versions: - conflicts = predictor.predict_conflicts(package_name, version) - all_conflicts.extend(conflicts) - - # Display conflicts if found - if all_conflicts: - # Generate resolution strategies - strategies = predictor.generate_resolutions(all_conflicts) + # Predict conflicts and generate resolutions in a single optimized LLM call + # Note: This calls the LLM once per package but generates both conflicts and strategies + strategies = [] + for package_name, version in packages_with_versions: + conflicts, pkg_strategies = predictor.predict_conflicts_with_resolutions( + package_name, version + ) + all_conflicts.extend(conflicts) + strategies.extend(pkg_strategies) + + # Display conflicts if found + if all_conflicts: + # Strategies were already generated alongside conflict predictionThis reduces the number of LLM calls from N+1 to N (one call per package that returns both conflicts and strategies).
Based on past review comments, this optimization is critical for the performance improvements promised in the PR.
673-679: Virtual environment activation won't persist with command chaining via subprocess.Chaining venv setup commands with
&&in a single string (lines 674-677) will not work as intended. Whensubprocess.run()executespython3 -m venv env && source env/bin/activate && pip install pkg, thesourcecommand only affects that subprocess shell, and the activation won't persist to subsequent installation commands in thecommandslist.🔎 Recommended solutions
Option 1: Use venv's python directly (recommended)
if chosen_strategy.strategy_type.value == "venv": - # Combine venv commands into one chained command - venv_cmd = " && ".join(chosen_strategy.commands) - commands = [venv_cmd] + commands + # Create venv and modify subsequent pip commands to use venv's python + venv_cmd = chosen_strategy.commands[0] # python3 -m venv env + commands = [venv_cmd] + [ + cmd.replace("pip", "./env/bin/pip") if "pip" in cmd else cmd + for cmd in commands + ]Option 2: Document manual activation requirement
if chosen_strategy.strategy_type.value == "venv": - # Combine venv commands into one chained command - venv_cmd = " && ".join(chosen_strategy.commands) - commands = [venv_cmd] + commands + # Create venv only; user must manually activate + commands = [chosen_strategy.commands[0]] + commands + self._print_status( + "ℹ️", + "Virtual environment created. Activate it manually with: source env/bin/activate", + )Based on past review comments, this shell activation behavior needs to be addressed.
🧹 Nitpick comments (4)
cortex/cli.py (4)
622-623: Consider using a public API instead of calling a private method.The code calls
history._extract_packages_with_versions(), which is a private method (indicated by the leading underscore). This violates encapsulation and may break if the implementation changes.💡 Recommended approach
Either make the method public in
InstallationHistoryby renaming it toextract_packages_with_versions(), or add a public wrapper method in the class.In
cortex/installation_history.py:-def _extract_packages_with_versions(self, commands: list[str]) -> list[tuple[str, str | None]]: +def extract_packages_with_versions(self, commands: list[str]) -> list[tuple[str, str | None]]:Then in this file:
-packages_with_versions = history._extract_packages_with_versions(commands) +packages_with_versions = history.extract_packages_with_versions(commands)
669-669: Remove unused variablechoice_idx.The variable
choice_idxis captured fromprompt_resolution_choice()but never used in the code.🔎 Proposed fix
- chosen_strategy, choice_idx = prompt_resolution_choice(strategies) + chosen_strategy, _ = prompt_resolution_choice(strategies)
787-795: Extract duplicated conflict resolution recording logic to a helper method.The code for recording conflict resolution outcomes is duplicated across four locations (lines 787-795, 811-820, 869-877, 888-897) with nearly identical logic. This violates the DRY principle and makes maintenance harder.
💡 Proposed refactoring
Add a helper method to the
CortexCLIclass:def _record_conflict_resolution_outcome( self, predictor: ConflictPredictor | None, chosen_strategy: ResolutionStrategy | None, all_conflicts: list[ConflictPrediction], success: bool, error_msg: str | None = None, ): """Record conflict resolution outcome for learning.""" if predictor and chosen_strategy and all_conflicts: for conflict in all_conflicts: predictor.record_resolution( conflict=conflict, chosen_strategy=chosen_strategy, success=success, user_feedback=error_msg if not success else None, ) outcome = "successful" if success else "failed" self._debug(f"Recorded {outcome} conflict resolution for learning")Then replace all four blocks with:
self._record_conflict_resolution_outcome( predictor, chosen_strategy, all_conflicts, success=True # or False for failure cases )Also applies to: 811-820, 869-877, 888-897
625-709: Consider extracting conflict prediction logic to a helper method.The conflict prediction block spans 85 lines (625-709) within an already long
install()method (~377 lines total). Extracting this to a separate method would improve readability, testability, and maintainability.💡 Suggested refactoring
def _run_conflict_prediction( self, packages_with_versions: list[tuple[str, str | None]], commands: list[str], api_key: str, provider: str, history: InstallationHistory, dry_run: bool, ) -> tuple[ConflictPredictor | None, list[ConflictPrediction], ResolutionStrategy | None]: """ Run conflict prediction for packages and prompt user for resolution. Returns: Tuple of (predictor, conflicts, chosen_strategy) Returns (None, [], None) if prediction fails or is skipped """ # Move lines 625-709 logic here ... return predictor, all_conflicts, chosen_strategyThen in
install():predictor, all_conflicts, chosen_strategy = self._run_conflict_prediction( packages_with_versions, commands, api_key, provider, history, dry_run ) if chosen_strategy: # Apply strategy to commands ...
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cortex/cli.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/cli.py
🧬 Code graph analysis (1)
cortex/cli.py (3)
cortex/conflict_predictor.py (6)
ConflictPrediction(65-80)ConflictPredictor(100-459)ResolutionStrategy(84-97)format_conflict_summary(617-671)predict_conflicts(117-125)generate_resolutions(274-319)cortex/llm_router.py (1)
LLMRouter(76-846)cortex/installation_history.py (1)
_extract_packages_with_versions(283-330)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test (Python 3.12)
- GitHub Check: Test (Python 3.10)
- GitHub Check: Test (Python 3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.10)
- GitHub Check: test (3.11)
🔇 Additional comments (1)
cortex/cli.py (1)
12-18: LGTM! Conflict prediction imports are correctly added.The new imports for
ConflictPrediction,ConflictPredictor,ResolutionStrategy,format_conflict_summary,prompt_resolution_choice, andLLMRouterare appropriately used in the conflict prediction flow integrated into the install command.Also applies to: 30-30
- Update record_resolution to persist to DB and return success rate - Fix CLI to use single LLM call optimization (predict_conflicts_with_resolutions) - Handle venv command chaining with bash -c wrapper - Increase max_tokens to 4000 for complex scenarios - Update documentation to reflect LLM-based analysis approach - Remove obsolete demo_conflict_predictor.sh script - Add integration tests for conflict predictor
The CLI tests were timing out because they called install() which triggers conflict prediction. This creates a real LLMRouter that makes actual API calls to Claude/Kimi, causing 60s+ timeouts when the fake test API keys are rejected. Added @patch("cortex.cli.LLMRouter") to all tests that call install() so the LLM router is mocked and no real network requests are made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @tests/integration/test_conflict_predictor_integration.py:
- Around line 29-33: The test uses a global InstallationHistory() which writes
to the default DB; change setUp to create an isolated temporary database and
pass it into InstallationHistory so tests do not pollute the system DB.
Specifically, in setUp replace the direct InstallationHistory() call with an
instance created using a temporary path or in-memory option supported by
InstallationHistory (e.g., InstallationHistory(db_path=tmpfile) or
InstallationHistory(in_memory=True)), ensure the temporary resource is created
via tempfile and cleaned up in tearDown, and keep the rest of the setup
(mock_router and ConflictPredictor instantiation) unchanged.
♻️ Duplicate comments (2)
docs/AI_DEPENDENCY_CONFLICT_PREDICTION.md (1)
323-353: Documentation describes patterns not present in implementation.The "Known Conflict Patterns" section documents hardcoded data structures (
mutual_exclusions,port_conflicts,version_conflicts) that don't appear to exist in the actual implementation, which uses LLM-based analysis instead. Consider updating this section to reflect the actual LLM-driven approach or removing it if these patterns are not implemented.cortex/conflict_predictor.py (1)
841-860: CLI entrypoint is non-functional withoutllm_routerinitialization.
ConflictPredictor()without anllm_routeralways returns empty results (line 136-138 early-exit). The CLI should initialize the router or fail fast with a clear error.
🧹 Nitpick comments (3)
cortex/cli.py (1)
658-663: Loop may trigger multiple LLM calls, defeating single-call optimization.The loop iterates over
packages_with_versionsand callspredict_conflicts_with_resolutions()for each package. This contradicts the PR description's "single LLM call optimization" and may cause performance issues when installing multiple packages.Consider batching all packages into a single prediction call:
Suggested approach
- for package_name, version in packages_with_versions: - conflicts, strategies = predictor.predict_conflicts_with_resolutions( - package_name, version - ) - all_conflicts.extend(conflicts) - all_strategies.extend(strategies) + # Batch all packages into single LLM call if multiple packages + if len(packages_with_versions) == 1: + pkg_name, pkg_ver = packages_with_versions[0] + all_conflicts, all_strategies = predictor.predict_conflicts_with_resolutions( + pkg_name, pkg_ver + ) + else: + # For multiple packages, consider implementing batch prediction + # or accept multiple calls for now + for package_name, version in packages_with_versions: + conflicts, strategies = predictor.predict_conflicts_with_resolutions( + package_name, version + ) + all_conflicts.extend(conflicts) + all_strategies.extend(strategies)cortex/installation_history.py (1)
129-161: Consider adding composite index for conflict resolution queries.The
get_conflict_resolution_success_rate()method queries byconflict_typeANDstrategy_type(lines 784-790), but there's no composite index on these columns. While the current indices on(package1, package2)and(success)are useful, adding a composite index would improve query performance as the resolution history grows.Suggested addition after line 160
cursor.execute( """ CREATE INDEX IF NOT EXISTS idx_conflict_success ON conflict_resolutions(success) """ ) + + cursor.execute( + """ + CREATE INDEX IF NOT EXISTS idx_conflict_type_strategy + ON conflict_resolutions(conflict_type, strategy_type) + """ + )cortex/conflict_predictor.py (1)
227-227: Clamp confidence and safety_score to valid range [0.0, 1.0].The LLM may return values outside the expected range (e.g.,
1.5or-0.1). Unbounded values could cause display issues or unexpected behavior downstream.Proposed fix
- confidence=float(c.get("confidence", 0.8)), + confidence=max(0.0, min(1.0, float(c.get("confidence", 0.8)))),- safety_score=float(s.get("safety_score", 0.5)), + safety_score=max(0.0, min(1.0, float(s.get("safety_score", 0.5)))),Also applies to: 255-255
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
cortex/cli.pycortex/conflict_predictor.pycortex/dependency_check.pycortex/installation_history.pycortex/llm/interpreter.pycortex/sandbox/docker_sandbox.pydocs/AI_DEPENDENCY_CONFLICT_PREDICTION.mdtests/integration/test_conflict_predictor_integration.pytests/test_cli.pytests/test_cli_extended.pytests/test_conflict_predictor.py
✅ Files skipped from review due to trivial changes (2)
- cortex/dependency_check.py
- cortex/llm/interpreter.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_conflict_predictor.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
tests/integration/test_conflict_predictor_integration.pytests/test_cli.pycortex/cli.pycortex/sandbox/docker_sandbox.pycortex/installation_history.pycortex/conflict_predictor.pytests/test_cli_extended.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/integration/test_conflict_predictor_integration.pytests/test_cli.pytests/test_cli_extended.py
**/*sandbox*.py
📄 CodeRabbit inference engine (AGENTS.md)
Implement Firejail sandboxing for package operations
Files:
cortex/sandbox/docker_sandbox.py
**/*install*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations
Files:
cortex/installation_history.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/installation_history.pycortex/conflict_predictor.py
🧬 Code graph analysis (4)
tests/test_cli.py (1)
tests/test_cli_extended.py (8)
test_install_dry_run(111-127)test_install_no_execute(135-151)test_install_with_execute_success(160-184)test_install_with_execute_failure(193-217)test_install_no_commands_generated(225-240)test_install_value_error(248-263)test_install_runtime_error(271-286)test_install_unexpected_error(294-309)
cortex/installation_history.py (1)
cortex/utils/db_pool.py (1)
get_connection(99-136)
cortex/conflict_predictor.py (2)
cortex/installation_history.py (2)
record_conflict_resolution(715-767)get_conflict_resolution_success_rate(769-801)cortex/llm_router.py (1)
TaskType(32-42)
tests/test_cli_extended.py (1)
tests/test_cli.py (8)
test_install_dry_run(96-104)test_install_no_execute(109-117)test_install_with_execute_success(123-140)test_install_with_execute_failure(146-163)test_install_no_commands_generated(168-175)test_install_value_error(180-187)test_install_runtime_error(192-199)test_install_unexpected_error(204-211)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
- GitHub Check: test (3.12)
🔇 Additional comments (15)
tests/test_cli_extended.py (1)
109-127: Consistent LLMRouter mocking across install tests.The addition of
@patch("cortex.cli.LLMRouter")and corresponding_mock_llm_routerparameters is consistent with the CLI's new conflict prediction integration. Parameter ordering correctly follows decorator stacking order.cortex/sandbox/docker_sandbox.py (1)
252-254: LGTM - minor string formatting.The error message reformatting is purely cosmetic with no functional change. The message remains helpful for users encountering Docker daemon issues.
tests/test_cli.py (1)
94-104: Consistent LLMRouter mocking pattern.The test updates correctly mock the new
LLMRouterdependency introduced by the conflict prediction feature. The pattern is consistent across all install-related tests.docs/AI_DEPENDENCY_CONFLICT_PREDICTION.md (1)
88-109: Documentation accurately reflects LLM-based approach.The updated documentation at lines 88-109 now correctly describes the LLM-based analysis approach and the actual implemented functions (
get_pip_packages(),get_apt_packages_summary(),validate_version_constraint(),escape_command_arg()). The note at lines 107-109 clarifies that version parsing is handled by the LLM.cortex/cli.py (4)
12-18: New conflict prediction imports added correctly.The imports for the conflict prediction system are properly structured and align with the public API surface of
cortex/conflict_predictor.py.
680-699: Venv activation approach handles shell limitations correctly.The
bash -cwrapper addresses the limitation thatsourcedoesn't persist across subprocess calls. The user warning at lines 695-699 appropriately informs users they need to manually activate the venv.
636-640: Logging level suppression pattern is acceptable for single-threaded CLI.The logging suppression is restored in a
finallyblock (lines 726-730), ensuring cleanup even on exceptions. While this pattern could be fragile in concurrent scenarios, the CLI install command runs single-threaded, making this safe in practice.
809-817: Conflict resolution recording covers all success/failure paths.The code correctly records resolution outcomes in both parallel (lines 809-817, 833-842) and sequential (lines 891-899, 910-919) execution paths, enabling the learning system to gather data from all installation attempts.
Also applies to: 833-842, 891-899, 910-919
cortex/installation_history.py (3)
284-331: New package extraction method correctly handles version parsing.The
_extract_packages_with_versions()method properly extracts both package names and version constraints from pip/apt commands. The regex at line 316 handles both==and=version specifiers correctly.
715-767: Conflict resolution recording follows established patterns.The
record_conflict_resolution()method uses proper connection pooling, UUID generation for IDs, and consistent error handling. As per coding guidelines, this implements audit logging for conflict resolution operations.
769-801: Success rate calculation with sensible default.Returning
0.5when no historical data exists is a reasonable neutral prior that doesn't bias towards or against any strategy.tests/integration/test_conflict_predictor_integration.py (3)
35-79: Good coverage of full prediction flow with history.The test properly mocks the LLM response and verifies that conflicts and strategies are correctly parsed and returned. The test structure is clear and assertions are appropriate.
205-218: Error handling test verifies graceful degradation.The test correctly verifies that when the LLM fails, the predictor returns empty lists rather than propagating exceptions, ensuring the CLI can continue with installation.
220-267: Strategy ranking test validates safety-first ordering.The test verifies that strategies are returned sorted by safety score (highest first), which aligns with the UX requirement to show the safest option first with a
[RECOMMENDED]label.cortex/conflict_predictor.py (1)
452-515: Well-implemented learning persistence.The
record_resolutionmethod now properly persists resolution outcomes toInstallationHistoryand queries updated success rates, addressing the previous feedback. The error handling for DB/IO exceptions is appropriate.
@ShreeJejurikar please check the regex analysis warning by SonarCloud |
- Fix API key detector to prioritize environment variables over cached file sources. Previously, a stale cache pointing to a file would overwrite valid environment variables with invalid keys, causing 401 authentication errors. - Fix ReDoS (Regular Expression Denial of Service) vulnerabilities in installation_history.py: - Replace `[=]+` with `=+` to prevent backtracking - Replace regex with rstrip() for trailing character removal - Add test for environment variable priority over cached file sources
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @cortex/installation_history.py:
- Around line 768-770: The function currently returns record_id even when the DB
insert fails; update the error handling in the try/except around the insert (the
block that logs "Failed to record conflict resolution: {e}") so that on
exception you do not return a successful id—either re-raise the exception or set
and return None (change the return type to Optional if using typing) instead of
returning the previous record_id; ensure callers are updated to handle None or
exceptions accordingly.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_thread_safety.py (1)
1-1: Critical: Black formatting check failed.The CI pipeline reports this file would be reformatted by Black. Run
black .from the repository root to fix all formatting issues before merge.#!/bin/bash # Verify Black formatting compliance black --check tests/test_thread_safety.py
♻️ Duplicate comments (1)
cortex/installation_history.py (1)
147-160: Add composite index for conflict resolution queries.The
get_conflict_resolution_success_ratemethod (lines 787-794) filters by bothconflict_typeandstrategy_type, but there's no composite index to optimize this query pattern. As the conflict resolution history grows, query performance will degrade.🔎 Proposed fix: Add composite index
cursor.execute( """ CREATE INDEX IF NOT EXISTS idx_conflict_success ON conflict_resolutions(success) """ ) + + cursor.execute( + """ + CREATE INDEX IF NOT EXISTS idx_conflict_type_strategy + ON conflict_resolutions(conflict_type, strategy_type) + """ + )Based on past review comments and query patterns in the codebase.
🧹 Nitpick comments (3)
tests/test_api_key_detector.py (1)
158-185: LGTM! Well-structured test for the environment variable priority bug fix.This test properly validates the fix described in the PR objectives where environment variables now take precedence over stale cached file sources. The setup is clear, the scenario is realistic, and the assertions verify the expected behavior.
Optional: minor consistency suggestion
For consistency with other tests in the file, you could use the
_assert_found_keyhelper for the first three assertions and then separately assert the source:- # Should use env var, NOT the stale file - assert found is True - assert key == "sk-ant-valid-env-key" - assert provider == "anthropic" + # Should use env var, NOT the stale file + self._assert_found_key( + (found, key, provider, source), + "sk-ant-valid-env-key", + "anthropic" + ) assert source == "environment"However, the current approach is also fine and matches the style used in other tests like
test_detect_from_environment.cortex/installation_history.py (2)
759-761: Store NULL instead of empty strings for optional fields.The method signature accepts
Nonefor optional fields (user_feedback,conflict_data,strategy_data), but the code stores empty strings""instead. This is inconsistent and prevents distinguishing between "no data provided" and "empty string provided."🔎 Proposed fix
( record_id, timestamp, package1, package2, conflict_type, strategy_type, 1 if success else 0, - user_feedback or "", - conflict_data or "", - strategy_data or "", + user_feedback, + conflict_data, + strategy_data, ), )
718-728: Consider input validation for conflict resolution parameters.Given the PR's emphasis on security hardening (shlex.quote, input validation mentioned in description), consider validating inputs like
package1,package2,conflict_type, andstrategy_typeto prevent malformed data or potential injection risks in downstream processing.Example validations to consider:
- Package names: alphanumeric, hyphens, underscores, dots (reject special chars)
- conflict_type and strategy_type: validate against known enum values
- Limit string lengths to prevent database bloat
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
cortex/api_key_detector.pycortex/installation_history.pytests/test_api_key_detector.pytests/test_thread_safety.py
✅ Files skipped from review due to trivial changes (1)
- cortex/api_key_detector.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
tests/test_api_key_detector.pycortex/installation_history.pytests/test_thread_safety.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_api_key_detector.pytests/test_thread_safety.py
**/*install*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations
Files:
cortex/installation_history.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/installation_history.py
🧬 Code graph analysis (1)
cortex/installation_history.py (1)
cortex/utils/db_pool.py (1)
get_connection(99-136)
🪛 GitHub Actions: CI
tests/test_thread_safety.py
[error] 1-1: Black formatting check failed. 1 file would be reformatted. Run 'black .' to fix code style issues.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
🔇 Additional comments (3)
cortex/installation_history.py (3)
256-282: LGTM! Good security fix.The changes correctly:
- Add pip/pip3 install command support
- Fix the version constraint removal to only match
=instead of the problematic[=:]pattern (addresses past review comment)- Replace regex with
rstrip()to prevent ReDoS vulnerabilities (as noted in commit message)
285-334: LGTM! Well-designed helper method.The
_extract_packages_with_versionshelper is well-structured:
- Properly handles version extraction with
=+pattern (supports both=and==)- Avoids ReDoS with
rstrip()instead of regex- Has clear type hints and documentation
- Consistent with existing extraction logic
772-804: LGTM! Reasonable default strategy.The method correctly computes success rates from historical data and returns a neutral 0.5 default when no history exists or on errors. The error handling is appropriate for a learning system where graceful degradation is preferred over exceptions.
Note: Query performance will benefit from the composite index suggested in the earlier comment.
| logger.error(f"Failed to record conflict resolution: {e}") | ||
|
|
||
| return record_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix error handling: Don't return success indicator when insert fails.
The method returns record_id even when the database insert fails (line 770), making it impossible for callers to detect failures. This could lead to false assumptions about successful logging.
🔎 Proposed fix
except Exception as e:
logger.error(f"Failed to record conflict resolution: {e}")
+ raise
return record_idOr alternatively, return an optional type and return None on failure:
- ) -> str:
+ ) -> str | None:
"""
Record a conflict resolution for learning.
Returns:
- Resolution record ID
+ Resolution record ID, or None if recording failed
"""
# ... existing code ...
except Exception as e:
logger.error(f"Failed to record conflict resolution: {e}")
+ return None
return record_id📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logger.error(f"Failed to record conflict resolution: {e}") | |
| return record_id | |
| logger.error(f"Failed to record conflict resolution: {e}") | |
| raise | |
| return record_id |
🤖 Prompt for AI Agents
In @cortex/installation_history.py around lines 768-770, The function currently
returns record_id even when the DB insert fails; update the error handling in
the try/except around the insert (the block that logs "Failed to record conflict
resolution: {e}") so that on exception you do not return a successful id—either
re-raise the exception or set and return None (change the return type to
Optional if using typing) instead of returning the previous record_id; ensure
callers are updated to handle None or exceptions accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
cortex/installation_history.py (2)
147-160: Add composite index on (conflict_type, strategy_type) for query performance.The query at lines 789-796 filters by both
conflict_typeandstrategy_type, but there's no composite index on these columns. While separate indices exist on(package1, package2)andsuccess, a composite index on(conflict_type, strategy_type)would significantly improve query performance for theget_conflict_resolution_success_rate()method.🔎 Proposed fix
cursor.execute( """ CREATE INDEX IF NOT EXISTS idx_conflict_success ON conflict_resolutions(success) """ ) + + cursor.execute( + """ + CREATE INDEX IF NOT EXISTS idx_conflict_type_strategy + ON conflict_resolutions(conflict_type, strategy_type) + """ + ) conn.commit()
720-772: Critical: Fix error handling to not return success indicator when insert fails.The method returns
record_ideven when the database insert fails (line 772 after exception at line 770), making it impossible for callers to detect failures. This could lead to false assumptions about successful logging and incorrect learning signals for the conflict prediction system.🔎 Proposed fixes
Option 1: Re-raise the exception
except Exception as e: logger.error(f"Failed to record conflict resolution: {e}") + raise return record_idOption 2: Return None on failure
- ) -> str: + ) -> str | None: """ Record a conflict resolution for learning. Returns: - Resolution record ID + Resolution record ID, or None if recording failed """ # ... existing code ... except Exception as e: logger.error(f"Failed to record conflict resolution: {e}") + return None return record_idIf choosing Option 2, ensure callers in
cortex/conflict_predictor.pyandcortex/cli.pyhandleNonereturns appropriately.
🧹 Nitpick comments (1)
cortex/installation_history.py (1)
285-336: Consider reducing code duplication with _extract_packages_from_commands.The implementation correctly extracts packages with versions using secure string operations (avoiding ReDoS). However, this method shares ~80% of its logic with
_extract_packages_from_commands(lines 247-283), differing only in version handling.🔎 Optional refactor to reduce duplication
Consider refactoring to share the common pattern-matching logic. For example, make
_extract_packages_from_commandscall this method and discard versions:def _extract_packages_from_commands(self, commands: list[str]) -> list[str]: """Extract package names from installation commands""" packages_with_versions = self._extract_packages_with_versions(commands) return sorted(set(name for name, _ in packages_with_versions))This reduces maintenance burden and ensures consistent parsing logic.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cortex/installation_history.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/installation_history.py
**/*install*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations
Files:
cortex/installation_history.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/installation_history.py
🧬 Code graph analysis (1)
cortex/installation_history.py (1)
cortex/utils/db_pool.py (1)
get_connection(99-136)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.10)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
🔇 Additional comments (3)
cortex/installation_history.py (3)
256-257: LGTM: pip pattern correctly added.The regex pattern correctly handles pip/pip3 install commands with optional flags, properly capturing package names. This aligns with the PR's objective to support pip package state analysis.
274-279: LGTM: Security improvements to prevent ReDoS.These changes properly address ReDoS vulnerabilities mentioned in the commit message:
- Line 276 now uses
r"=.*$"instead ofr"[=:].*$", removing the unusual colon matching (addressing past review feedback)- Line 279 uses
rstrip()instead of regex for trailing character removal, preventing potential ReDoS attacks
774-806: LGTM: Success rate calculation is sound.The method correctly calculates historical success rates with appropriate fallback behavior:
- Returns 0.5 (neutral) when no historical data exists
- Handles exceptions gracefully by returning the default
- Uses parameterized queries to prevent SQL injection
Note: Query performance would benefit from the composite index on
(conflict_type, strategy_type)suggested in the earlier comment on lines 147-160.
- Fix API key detector to prioritize environment variables over cached file sources. Previously, a stale cache pointing to a file would overwrite valid environment variables with invalid keys, causing 401 authentication errors. - Fix ReDoS (Regular Expression Denial of Service) vulnerabilities in installation_history.py: - Replace `[=]+` with `=+` to prevent backtracking - Replace regex with rstrip() for trailing character removal - Add test for environment variable priority over cached file sources
1a07bcd to
075c962
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @cortex/installation_history.py:
- Around line 285-336: The two methods _extract_packages_with_versions and
_extract_packages_from_commands share most logic; refactor by extracting common
parsing into a helper (e.g., _extract_packages_base(self, commands: list[str],
extract_versions: bool=False)) that contains the patterns, sudo stripping,
matching, splitting and basic filtering, and have both original methods call it
with extract_versions toggled; additionally, in _extract_packages_with_versions
adjust the version extraction to strip all leading '=' characters (or treat
empty version as None) when splitting on '=' so inputs like "package===1.0.0"
yield version "1.0.0" (or None if empty) rather than "=1.0.0".
♻️ Duplicate comments (2)
cortex/installation_history.py (2)
147-160: Add composite index on (conflict_type, strategy_type) for query performance.The query in
get_conflict_resolution_success_rate(lines 789-796) filters by bothconflict_typeandstrategy_type, but there's no composite index on these columns. This will result in suboptimal query performance as the conflict resolution history grows.🔎 Recommended fix
cursor.execute( """ CREATE INDEX IF NOT EXISTS idx_conflict_success ON conflict_resolutions(success) """ ) + + cursor.execute( + """ + CREATE INDEX IF NOT EXISTS idx_conflict_type_strategy + ON conflict_resolutions(conflict_type, strategy_type) + """ + ) conn.commit()
769-772: Fix error handling: Don't return success when database insert fails.The method returns
record_ideven when the database insert fails (line 772), making it impossible for callers to detect failures. This could lead to false assumptions about successful logging and corrupt the learning system's data integrity.🔎 Recommended fix (Option 1: Re-raise exception)
except Exception as e: logger.error(f"Failed to record conflict resolution: {e}") + raise return record_id🔎 Recommended fix (Option 2: Return Optional)
- ) -> str: + ) -> str | None: """ Record a conflict resolution for learning. Returns: - Resolution record ID + Resolution record ID, or None if recording failed """ # ... existing code ... except Exception as e: logger.error(f"Failed to record conflict resolution: {e}") + return None return record_idIf using Option 2, update callers in
cortex/cli.pyandcortex/conflict_predictor.pyto handleNonereturns.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cortex/installation_history.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/installation_history.py
**/*install*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations
Files:
cortex/installation_history.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*install*.py : Implement audit logging to ~/.cortex/history.db for all package operations
Applied to files:
cortex/installation_history.py
🧬 Code graph analysis (1)
cortex/installation_history.py (1)
cortex/utils/db_pool.py (1)
get_connection(99-136)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test (Python 3.10)
- GitHub Check: test (3.10)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
🔇 Additional comments (3)
cortex/installation_history.py (3)
256-258: LGTM! pip/pip3 support added correctly.The pattern correctly matches pip/pip3 install commands and skips flags to capture package names.
274-279: LGTM! Improved efficiency and specificity.The changes correctly address the previous regex concern by:
- Using
r"=.*$"instead ofr"[=:].*$"to avoid matching colons- Replacing regex with
rstrip()for better performance and ReDoS prevention
774-806: LGTM! Success rate calculation is correct.The method correctly:
- Calculates success rate from historical data
- Returns a neutral 0.5 when no history exists (reasonable default)
- Handles exceptions gracefully with fallback
Note: Query performance will be improved once the composite index flagged earlier is added.
| def _extract_packages_with_versions(self, commands: list[str]) -> list[tuple[str, str | None]]: | ||
| """Extract package names with versions from installation commands. | ||
| Returns: | ||
| List of (package_name, version) tuples. Version may be None. | ||
| """ | ||
| packages = [] | ||
| seen = set() | ||
|
|
||
| # Patterns to match package names in commands | ||
| patterns = [ | ||
| r"apt-get\s+(?:install|remove|purge)\s+(?:-y\s+)?(.+?)(?:\s*[|&<>]|$)", | ||
| r"apt\s+(?:install|remove|purge)\s+(?:-y\s+)?(.+?)(?:\s*[|&<>]|$)", | ||
| r"dpkg\s+-i\s+(.+?)(?:\s*[|&<>]|$)", | ||
| # pip/pip3 install commands | ||
| r"pip3?\s+install\s+(?:-[^\s]+\s+)*(.+?)(?:\s*[|&<>]|$)", | ||
| ] | ||
|
|
||
| for cmd in commands: | ||
| # Remove sudo if present | ||
| cmd_clean = re.sub(r"^sudo\s+", "", cmd.strip()) | ||
|
|
||
| for pattern in patterns: | ||
| matches = re.findall(pattern, cmd_clean) | ||
| for match in matches: | ||
| # Split by comma, space, or pipe for multiple packages | ||
| pkgs = re.split(r"[,\s|]+", match.strip()) | ||
| for pkg in pkgs: | ||
| pkg = pkg.strip() | ||
| # Filter out flags and invalid package names | ||
| if pkg and not pkg.startswith("-") and len(pkg) > 1: | ||
| # Extract version if present (e.g., package==1.0.0 or package=1.0.0) | ||
| # Use string split instead of regex to avoid ReDoS | ||
| if "=" in pkg: | ||
| # Split on first '=' to get name, rest is version (may have leading '=') | ||
| name, version = pkg.split("=", 1) | ||
| # Handle == by stripping leading '=' from version | ||
| version = version.lstrip("=").strip() | ||
| name = name.strip() | ||
| else: | ||
| name = pkg | ||
| version = None | ||
|
|
||
| # Remove any trailing special characters from name | ||
| # Use rstrip for efficiency instead of regex to avoid ReDoS | ||
| name = name.rstrip("!@#$%^&*(){}[]|\\:;\"'<>,?/~`") | ||
|
|
||
| if name and name not in seen: | ||
| seen.add(name) | ||
| packages.append((name, version)) | ||
|
|
||
| return packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Significant code duplication with _extract_packages_from_commands.
The method shares ~70% of its logic with _extract_packages_from_commands (lines 247-283), including patterns, command cleaning, and filtering. This violates DRY principles and increases maintenance burden.
🔎 Recommended refactor approach
Extract common logic into a shared helper method:
def _extract_packages_base(
self,
commands: list[str],
extract_versions: bool = False
) -> list[tuple[str, str | None]] | list[str]:
"""Extract packages from commands with optional version extraction."""
packages = [] if extract_versions else set()
seen = set() if extract_versions else None
patterns = [
r"apt-get\s+(?:install|remove|purge)\s+(?:-y\s+)?(.+?)(?:\s*[|&<>]|$)",
r"apt\s+(?:install|remove|purge)\s+(?:-y\s+)?(.+?)(?:\s*[|&<>]|$)",
r"dpkg\s+-i\s+(.+?)(?:\s*[|&<>]|$)",
r"pip3?\s+install\s+(?:-[^\s]+\s+)*(.+?)(?:\s*[|&<>]|$)",
]
for cmd in commands:
cmd_clean = re.sub(r"^sudo\s+", "", cmd.strip())
for pattern in patterns:
matches = re.findall(pattern, cmd_clean)
for match in matches:
pkgs = re.split(r"[,\s|]+", match.strip())
for pkg in pkgs:
pkg = pkg.strip()
if pkg and not pkg.startswith("-") and len(pkg) > 1:
# Process package based on extract_versions flag
# ... (continue with version extraction or simple name extraction)
return packages if extract_versions else sorted(packages)Then refactor both methods to call this helper.
Minor: Edge case with triple equals (===).
Line 320 splits on first = and strips leading = from version. For package===1.0.0, this results in version "=1.0.0" which may not be the intended behavior.
🔎 Suggested fix
if "=" in pkg:
- # Split on first '=' to get name, rest is version (may have leading '=')
name, version = pkg.split("=", 1)
- # Handle == by stripping leading '=' from version
- version = version.lstrip("=").strip()
+ # Handle == or === by stripping all leading '=' from version
+ version = version.lstrip("=").strip() or None
name = name.strip()Setting version to None if empty prevents storing invalid version strings.
🤖 Prompt for AI Agents
In @cortex/installation_history.py around lines 285-336, The two methods
_extract_packages_with_versions and _extract_packages_from_commands share most
logic; refactor by extracting common parsing into a helper (e.g.,
_extract_packages_base(self, commands: list[str], extract_versions: bool=False))
that contains the patterns, sudo stripping, matching, splitting and basic
filtering, and have both original methods call it with extract_versions toggled;
additionally, in _extract_packages_with_versions adjust the version extraction
to strip all leading '=' characters (or treat empty version as None) when
splitting on '=' so inputs like "package===1.0.0" yield version "1.0.0" (or None
if empty) rather than "=1.0.0".
Anshgrover23
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShreeJejurikar PR doesn’t follow contributing.md (AI usage disclosure). Please update this. Also address coderabbit comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cortex/cli.py
Outdated
| venv_cmd = ( | ||
| "bash -c '" | ||
| + " && ".join( | ||
| cmd.replace("'", "'\\''") | ||
| for cmd in chosen_strategy.commands | ||
| ) | ||
| + "'" | ||
| ) | ||
| # Don't prepend to main commands - venv is isolated | ||
| # Just run the venv setup separately | ||
| commands = [venv_cmd] |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The venv commands are wrapped in a bash subshell but if the venv creation or package installation fails partway through, there's no error handling or cleanup. Consider adding error checking (using 'set -e' or explicit checks) to ensure failures are properly reported.
cortex/conflict_predictor.py
Outdated
| packages = json.loads(result.stdout) | ||
| return {pkg["name"]: pkg["version"] for pkg in packages} | ||
| except subprocess.TimeoutExpired: | ||
| logger.debug("pip3 list timed out after 5 seconds") |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message mentions "5 seconds" but the actual timeout value is not visible in the context. If the timeout value changes on line 779, this error message will become misleading. Consider using f-strings to include the actual timeout value or referencing a constant.
cortex/conflict_predictor.py
Outdated
| continue # Skip malformed lines | ||
| return packages[:30] | ||
| except subprocess.TimeoutExpired: | ||
| logger.debug("dpkg --get-selections timed out after 5 seconds") |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message mentions "5 seconds" but the actual timeout value is not visible in the context. If the timeout value changes on line 815, this error message will become misleading. Consider using f-strings to include the actual timeout value or referencing a constant.
cortex/installation_history.py
Outdated
| pkg = pkg.rstrip("!@#$%^&*(){}[]|\\:;\"'<>,?/~`") | ||
| if pkg: | ||
| packages.add(pkg) | ||
|
|
||
| return sorted(packages) | ||
|
|
||
| def _extract_packages_with_versions(self, commands: list[str]) -> list[tuple[str, str | None]]: | ||
| """Extract package names with versions from installation commands. | ||
| Returns: | ||
| List of (package_name, version) tuples. Version may be None. | ||
| """ | ||
| packages = [] | ||
| seen = set() | ||
|
|
||
| for pkg in self._iterate_package_tokens(commands): | ||
| # Extract version if present (e.g., package=1.0.0, package==1.0.0, package===1.0.0) | ||
| # Use string split instead of regex to avoid ReDoS | ||
| # Note: lstrip("=") handles any number of leading '=' characters correctly | ||
| if "=" in pkg: | ||
| # Split on first '=' to get name, rest is version (may have leading '=') | ||
| name, version = pkg.split("=", 1) | ||
| # Strip any leading '=' characters from version (handles ==, ===, etc.) | ||
| version = version.lstrip("=").strip() | ||
| name = name.strip() | ||
| else: | ||
| name = pkg | ||
| version = None | ||
|
|
||
| # Remove any trailing special characters from name | ||
| # Use rstrip for efficiency instead of regex to avoid ReDoS | ||
| name = name.rstrip("!@#$%^&*(){}[]|\\:;\"'<>,?/~`") |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rstrip operation with a long string of special characters is duplicated on lines 292 and 323. This should be extracted to a module-level constant for maintainability and to ensure consistency between the two methods.
cortex/cli.py
Outdated
| venv_cmd = ( | ||
| "bash -c '" | ||
| + " && ".join( | ||
| cmd.replace("'", "'\\''") | ||
| for cmd in chosen_strategy.commands | ||
| ) | ||
| + "'" |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string escaping approach using single quotes replacement may not be sufficient for all edge cases. Consider using shlex.quote() or a more robust escaping mechanism to prevent potential command injection when constructing the bash command string.
cortex/conflict_predictor.py
Outdated
| logger = logging.getLogger(__name__) | ||
|
|
||
| # Security: Validate version constraint format | ||
| CONSTRAINT_PATTERN = re.compile(r"^(<|>|<=|>=|==|!=|~=|~|===)?[\w.!*+\-]+$") |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern allows '' without '=' which is not a valid pip constraint operator. The valid operator is '=' (compatible release). The pattern should be updated to ensure '~' is only allowed when followed by '='.
| CONSTRAINT_PATTERN = re.compile(r"^(<|>|<=|>=|==|!=|~=|~|===)?[\w.!*+\-]+$") | |
| CONSTRAINT_PATTERN = re.compile(r"^(<|>|<=|>=|==|!=|~=|===)?[\w.!*+\-]+$") |
cortex/conflict_predictor.py
Outdated
| ["pip3", "list", "--format=json"], | ||
| capture_output=True, | ||
| text=True, | ||
| timeout=5, # Reduced from 15 to prevent UI blocking |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hardcoded timeout of 5 seconds is duplicated in two places (lines 779 and 815). This magic number should be extracted to a module-level constant for easier maintenance and consistency.
cortex/cli.py
Outdated
| # OpenAI provider maps to Kimi K2 (OpenAI-compatible API) | ||
| # Future: Add native openai_api_key support in LLMRouter |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment states "OpenAI provider maps to Kimi K2" but this is a significant behavioral difference that should be more prominently documented. Users expecting OpenAI's models will get Kimi K2 instead, which could have different capabilities, costs, and behaviors. Consider adding a user-visible warning or more explicit documentation.
| # OpenAI provider maps to Kimi K2 (OpenAI-compatible API) | |
| # Future: Add native openai_api_key support in LLMRouter | |
| # OpenAI provider currently maps to Kimi K2 via an OpenAI-compatible API. | |
| # Future: Add native openai_api_key support in LLMRouter. | |
| cx_print( | |
| "⚠️ Note: The 'openai' provider currently uses the Kimi K2 backend " | |
| "via an OpenAI-compatible API. Behavior, limits, and pricing may " | |
| "differ from OpenAI's own models.", | |
| "warning", | |
| ) |
| StrategyType, | ||
| ) | ||
| from cortex.installation_history import InstallationHistory | ||
| from cortex.llm_router import LLMRouter |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'LLMRouter' is not used.
| from cortex.llm_router import LLMRouter |
| idx = int(choice) - 1 | ||
| if 0 <= idx < max_choices: | ||
| return strategies[idx], idx | ||
| except ValueError: |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'except' clause does nothing but pass and there is no explanatory comment.
| except ValueError: | |
| except ValueError: | |
| # Non-integer input is treated as an invalid choice and handled below. |
- Add error handling (set -e) to venv bash subshell in cli.py - Use shlex.quote for safer command escaping in venv commands - Extract hardcoded timeout values to constants with f-string messages - Extract duplicate rstrip chars to PACKAGE_NAME_STRIP_CHARS constant - Change llm_router logging level from ERROR to WARNING - Fix regex pattern to reject standalone tilde (only ~= is valid pip operator) - Add clarifying comments for OpenAI to Kimi K2 mapping - Add explanatory comments for silent except clauses - Remove unused LLMRouter import from integration tests - Fix misleading test comment about pip constraints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 13 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
|
|



Summary
Adds AI-powered dependency conflict prediction that analyzes potential package conflicts BEFORE installation using LLM analysis instead of hardcoded rules. Includes resolution strategies ranked by safety with interactive prompts.
Related Issue
Closes #428
Type of Change
AI Disclosure
Used GitHub Copilot for code analysis and refactoring suggestions.
Used Cursor Copilot with Claude Opus 4.5 model for generating test cases and documentation. Core implementation was done manually.
Features
[████████░░]and[RECOMMENDED]labelsshlex.quote(), timeout protectionTesting
pytest tests/ -v)tests/test_conflict_predictor.pycovering:tests/integration/test_conflict_predictor_integration.pyDemo
Screencast.from.02-01-26.09.35.23.PM.IST.webm
Checklist
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.