feat: Added voice feature for end-user fix #325 #405

Sahilbhatane · 2025-12-31T13:23:57Z

Related Issue

Closes #325

Summary

Adds voice input capability to Cortex Linux, enabling users to speak commands instead of typing them. This implementation uses local speech-to-text processing with Whisper for privacy and low latency.

Features Added

Voice command mode: cortex voice for continuous voice input
Single command mode: cortex voice --single for one-shot recording
Mic flag integration: cortex install --mic and cortex ask --mic
Push-to-talk hotkey: F9 (customizable via CORTEX_VOICE_HOTKEY)
Confirmation prompt: After voice transcription, users can choose dry-run, execute, or cancel
Visual feedback: Recording animation (● Recording...) during speech capture

Technical Details

Uses faster-whisper (optimized Whisper) for accurate, local STT
Default model: base.en (~150MB, good accuracy/speed balance)
Optional dependencies via pip install cortex-linux[voice]
Cross-platform support (Linux primary, Windows compatible)

Files Changed

cortex/voice.py - New voice input handler module
cortex/cli.py - Added voice commands and --mic flags
cortex/branding.py - Windows ASCII fallback for console output
pyproject.toml - Added [voice] optional dependencies
docs/VOICE_INPUT.md - User documentation
tests/test_voice.py - Unit tests (20 tests)
tests/test_ollama_integration.py - Fixed Windows compatibility

Checklist

Tests pass (pytest tests/)
MVP label added if closing MVP issue
Update "Cortex -h" (if needed)
Linting passes (ruff check, black --check)
Security check passes (bandit -r cortex/)
Documentation added (docs/VOICE_INPUT.md)

Note from maintainer: This PR implements voice input using a local-first approach with faster-whisper instead of the Fizy AI integration originally proposed in #325. This decision was made for the following reasons:

Privacy: All speech processing happens locally—no data sent to external services

Latency: Local processing is faster than round-trip API calls

Offline support: Works without internet connection

Cost: No per-request API fees for users

The Fizy integration may be revisited in a future phase if there's demand for cloud-based voice processing with additional features.

Additionally, for Wayland-based Ubuntu users (if the hotkey doesn't work):

Option 1: Run with sudo (for global hotkey access)
sudo cortex voice

Option 2: Use X11 session (login screen → gear icon → "Ubuntu on Xorg")

Option 3: Use single-shot mode (no hotkey needed)
cortex voice --single

Summary by CodeRabbit

New Features
- Added voice input functionality with push-to-talk recording and continuous mode support
- Added --mic flag to ask and install commands for voice-based input
- Added dedicated voice command for interactive voice interaction
- Improved Windows display with Windows-optimized icons and separators
Documentation
- Added comprehensive voice input guide covering setup, usage, configuration, and troubleshooting
Dependencies
- Added optional voice dependencies group for enhanced audio capabilities

coderabbitai · 2025-12-31T13:24:06Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This PR introduces a voice input system for Cortex, enabling users to control the CLI via voice commands. It adds a new VoiceInputHandler module with speech-to-text transcription using faster-whisper, integrates voice input into CLI commands (ask/install via --mic flag), adds Windows-specific display adaptations, updates dependencies, and includes comprehensive tests and documentation.

Changes

Cohort / File(s)	Summary
Voice Input System Implementation `cortex/voice.py`, `cortex/cli.py`	New `VoiceInputHandler` class with audio capture, faster-whisper transcription, hotkey-driven recording, and error handling; CLI integrates voice via new `voice()` method with continuous/single-shot modes and platform-specific icons/separators
CLI Integration & Wiring `cortex/cli.py`	Adds `--mic` flag to `ask` and `install` commands for voice-driven input; routes voice commands to dedicated handler with confirmation flow (dry-run vs execute); updates help text to advertise voice capabilities
Platform Compatibility `cortex/branding.py`	Conditionalizes Console initialization and icon/separator rendering for Windows vs. non-Windows platforms (ASCII-like icons on Windows, Unicode on others)
Dependencies & Configuration `pyproject.toml`, `requirements.txt`	Introduces new optional dependency group `voice` with faster-whisper, sounddevice, pynput, numpy; updates `all` extra to include voice group; requirements.txt minor formatting adjustment
Test Coverage & Integration Fixes `tests/test_voice.py`, `tests/test_ollama_integration.py`	Comprehensive VoiceInputHandler test suite (dependencies, microphone detection, transcription, cleanup); Ollama presence check refactored from subprocess to `shutil.which`
Documentation & Metadata `docs/VOICE_INPUT.md`, `.gitignore`	New documentation covering Quick Start, installation, usage modes, configuration, workflow diagrams, troubleshooting, and API reference; adds virtual environment directories to gitignore

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CortexCLI
    participant VIH as VoiceInputHandler
    participant Audio as Audio System
    participant Model as Whisper Model
    participant Callback

    User->>CLI: --mic flag or voice command
    CLI->>VIH: Initialize + _ensure_dependencies()
    VIH->>Audio: _check_microphone()
    Audio-->>VIH: Device ready
    
    rect rgba(100, 200, 150, 0.2)
        Note over User,VIH: Voice Input Mode
        User->>Audio: Speak (after hotkey)
        Audio->>VIH: _start_recording() + audio chunks
        User->>Audio: Release hotkey (stop)
        VIH->>VIH: _stop_recording_stream()
    end
    
    rect rgba(100, 150, 200, 0.2)
        Note over VIH,Model: Transcription
        VIH->>VIH: _load_model() [lazy]
        VIH->>Model: transcribe(audio_data)
        Model-->>VIH: Segment text
        VIH->>VIH: Concatenate segments
    end
    
    rect rgba(200, 150, 100, 0.2)
        Note over VIH,Callback: Result Processing
        VIH->>Callback: Invoke with transcribed text
        Callback->>CLI: Return command/question
        CLI->>User: Execute or ask further
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat: add environment variable manager with encryption and templates #344: Both PRs modify CortexCLI class and CLI command routing; adds new top-level command handler infrastructure similar to voice integration.
Added multi-step structured installation coordinator with plan support fixes #8 #190: Both PRs interact with the CortexCLI class implementation; foundational work that voice feature builds upon.

Suggested reviewers

mikejmorgan-ai
dhvll
Suyashd999

Poem

🐰 Whispers through the speedy air,
Faster-Whisper transcends with care,
Voice commands now guide the way,
Hotkeys dance at work and play,
Linux speaks—no typing today! 🎙️

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	PR deviates significantly from linked issue #325: implements local-first voice input with faster-whisper instead of the Fizy AI integration specified in the issue requirements.	The PR implements a different solution than what issue #325 requested. Address the gap by either implementing Fizy AI integration per spec or formally updating the issue scope and acceptance criteria to reflect the new approach.
Out of Scope Changes check	❓ Inconclusive	Most changes directly support voice feature; however, changes to cortex/branding.py (Windows console compatibility) and tests/test_ollama_integration.py (Ollama detection) are tangentially related but not strictly required for voice functionality.	Clarify whether Windows branding adjustments and Ollama integration fixes are prerequisites for voice feature stability or separate improvements bundled into this PR.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description check	✅ Passed	Description comprehensively covers objectives, features, technical details, and files changed; follows template with Related Issue and Summary sections plus extended checklist.
Docstring Coverage	✅ Passed	Docstring coverage is 88.00% which is sufficient. The required threshold is 80.00%.
Title check	✅ Passed	The title 'feat: Added voice feature for end-user fix #325' clearly describes the primary change: adding a voice feature to address issue #325. It is concise, specific, and directly related to the main changeset.

github-actions · 2025-12-31T13:24:11Z

CLA Verification Passed

All contributors have signed the CLA.

Contributor	Signed As
@Sahilbhatane	@Sahilbhatane
@Sahilbhatane	@Sahilbhatane

Copilot

Pull request overview

This PR adds voice input capability to Cortex Linux, enabling users to speak commands instead of typing them. The implementation uses local speech-to-text processing with faster-whisper for privacy and low latency. However, there are several critical issues that need to be addressed before merging.

Key Changes

New voice input module with support for continuous and single-shot voice commands
Optional [voice] dependency group in pyproject.toml
Windows compatibility improvements in branding module (ASCII fallback for console output)
Comprehensive documentation for voice features
20 unit tests for voice functionality

Reviewed changes

Copilot reviewed 12 out of 42 changed files in this pull request and generated 9 comments.

File	Description
tests/test_voice.py	Unit tests for voice input handler with mocked dependencies
requirements.txt	Issue: Voice dependencies incorrectly added as required instead of optional
pyproject.toml	Proper optional voice dependency group configuration
myenv/*	Critical: Entire virtual environment directory committed (should be excluded)
docs/VOICE_INPUT.md	Comprehensive user documentation for voice features
cortex/branding.py	Windows compatibility with ASCII fallback for special characters

Copilot · 2025-12-31T13:27:38Z

tests/test_voice.py

+        handler = VoiceInputHandler()
+        assert handler.model_name == "tiny.en"
+        assert handler.sample_rate == 16000
+        assert handler.hotkey == "f9"


The hotkey default is tested as "f9" in multiple tests (lines 45, 243), but the documentation claims the customizable hotkey is set via CORTEX_VOICE_HOTKEY environment variable which is never tested. The relationship between the default value and the environment variable override should be tested to ensure the configuration mechanism works correctly.
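
A test along these lines could cover the override. This is a minimal sketch that assumes the handler reads `CORTEX_VOICE_HOTKEY` and falls back to `"f9"`; `resolve_hotkey` is a hypothetical stand-in for that logic, not the actual `VoiceInputHandler` code, and the lowercasing is an assumption.

```python
import os
from typing import Optional
from unittest.mock import patch

DEFAULT_HOTKEY = "f9"  # default asserted in tests/test_voice.py


def resolve_hotkey(explicit: Optional[str] = None) -> str:
    """Hypothetical stand-in for how the handler might pick its hotkey."""
    if explicit:
        return explicit.lower()
    return os.environ.get("CORTEX_VOICE_HOTKEY", DEFAULT_HOTKEY).lower()


def test_hotkey_defaults_to_f9() -> None:
    # clear=True removes CORTEX_VOICE_HOTKEY for the duration of the block
    with patch.dict(os.environ, {}, clear=True):
        assert resolve_hotkey() == "f9"


def test_hotkey_env_override() -> None:
    # The environment variable should win over the built-in default
    with patch.dict(os.environ, {"CORTEX_VOICE_HOTKEY": "F10"}):
        assert resolve_hotkey() == "f10"
```

In the real suite the same two cases would construct `VoiceInputHandler()` inside the `patch.dict` block and assert on `handler.hotkey`.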

Copilot · 2025-12-31T13:27:39Z

tests/test_voice.py

+                # Import fresh to get mocked module
+                import importlib
+
+                import cortex.voice


Module 'cortex.voice' is imported with both 'import' and 'import from'.

Copilot · 2025-12-31T13:27:39Z

tests/test_voice.py

+            with patch("cortex.voice.cx_print") as mock_print:
+                import importlib
+
+                import cortex.voice


Module 'cortex.voice' is imported with both 'import' and 'import from'.

coderabbitai

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

requirements.txt (1)
9-25: Duplicate PyYAML dependencies with inconsistent versions.

There are three occurrences of PyYAML in this file:

Line 9: PyYAML>=6.0.0

Line 21: pyyaml>=6.0.0

Line 25: PyYAML==6.0.3

This creates confusion and potential version conflicts. Keep only one entry with a consistent version constraint.
🔎 Proposed fix
 # Configuration
 PyYAML>=6.0.0
 
 # Environment variable loading from .env files
 python-dotenv>=1.0.0
 
 # Encryption for environment variable secrets
 cryptography>=42.0.0
 
 # Terminal UI
 rich>=13.0.0
 
-# Configuration
-pyyaml>=6.0.0
-
 # Type hints for older Python versions
 typing-extensions>=4.0.0
-PyYAML==6.0.3

♻️ Duplicate comments (4)

myenv/Scripts/deactivate.bat (1)

1-22: Part of virtual environment that should not be committed.

This file is part of the myenv/ virtual environment directory that should be removed from version control. See the comment on myenv/pyvenv.cfg for details.

myenv/share/man/man1/isympy.1 (1)

1-188: Third-party package artifact that should not be committed.

This is a SymPy man page installed in the virtual environment. It's part of the myenv/ directory that should be removed from version control. See the comment on myenv/pyvenv.cfg for details.

myenv/Scripts/activate.bat (1)

1-34: Part of virtual environment with hardcoded developer paths.

This file contains a hardcoded path (C:\Users\sahil\...) on line 11 and is part of the myenv/ virtual environment that should be removed. See the comment on myenv/pyvenv.cfg for details.

myenv/Scripts/Activate.ps1 (1)

1-528: Part of virtual environment that should not be committed.

This PowerShell activation script is part of the myenv/ virtual environment directory that should be removed from version control. See the comment on myenv/pyvenv.cfg for details.

🧹 Nitpick comments (4)

docs/VOICE_INPUT.md (1)
66-83: Add language identifier to fenced code blocks for markdown lint compliance.

Static analysis flagged missing language specifiers. For terminal output and diagrams, use text or console as the language identifier.
🔎 Suggested fix

Line 66:
-```
+```text
$ cortex voice
Line 146:
-```
+```text
┌──────────────┐    ┌──────────────┐
cortex/voice.py (3)
296-307: Recording indicator bypasses branding utilities.

The recording indicator at line 302 uses raw string formatting " CX | " instead of the cx_print function from cortex/branding.py. This creates inconsistency in terminal output, especially on Windows where cx_print uses ASCII-only icons.

🔎 Suggested approach

Consider using console.print with Rich markup for consistency, or document why direct stdout is necessary (e.g., for \r carriage return updates).

If carriage return updates are required, you could use Rich's Live context or Status for animated updates instead of raw stdout manipulation.

425-428: Polling loop could be replaced with event-based waiting.

The infinite loop with time.sleep(0.1) wakes the main thread ten times per second only to go back to sleep. While functional, this polling is unnecessary. Consider using an Event.wait() pattern that blocks until signaled.
🔎 Alternative approach
+        self._exit_event = threading.Event()
+
         try:
             # Keep the main thread alive
-            while True:
-                time.sleep(0.1)
+            self._exit_event.wait()
         except KeyboardInterrupt:
             cx_print("\nVoice mode exited.", "info")
Then call self._exit_event.set() in the stop() method.
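
For illustration, the event-based pattern can be sketched on its own. `VoiceLoop` is a hypothetical stand-in for the relevant part of the handler, and the timer simulates another thread (such as a hotkey listener) requesting shutdown.

```python
import threading
from typing import Optional


class VoiceLoop:
    """Hypothetical stand-in for the code that keeps the main thread alive."""

    def __init__(self) -> None:
        self._exit_event = threading.Event()

    def run(self, timeout: Optional[float] = None) -> bool:
        """Block until stop() is called; return False if the timeout expires first."""
        return self._exit_event.wait(timeout)

    def stop(self) -> None:
        """Signal run() to return."""
        self._exit_event.set()


loop = VoiceLoop()
# Simulate another thread requesting shutdown shortly after startup.
threading.Timer(0.05, loop.stop).start()
stopped = loop.run(timeout=2.0)
```

The blocked thread does no work until `stop()` is called, so there is no periodic wake-up.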
482-487: Broad except clause silently swallows errors.

The except Exception: pass at line 485 catches and ignores every ordinary exception raised while stopping the listener. (KeyboardInterrupt and SystemExit derive from BaseException, so they are not caught.) This could mask programming errors during debugging.
🔎 Suggested fix
         if self._hotkey_listener:
             try:
                 self._hotkey_listener.stop()
-            except Exception:
-                pass
+            except OSError:
+                # Listener may already be stopped
+                pass
             self._hotkey_listener = None
Or log the exception at debug level:
except Exception as e:
    logging.debug("Error stopping hotkey listener: %s", e)
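
A self-contained sketch of the logging variant follows. `FlakyListener`, `stop_listener`, and the logger name are hypothetical and exist only to exercise the cleanup path. Note that `except Exception` does not intercept `KeyboardInterrupt` or `SystemExit`, since both derive from `BaseException`.

```python
import logging

logger = logging.getLogger("cortex.voice.sketch")  # hypothetical logger name


class FlakyListener:
    """Hypothetical listener whose stop() fails, to exercise the cleanup path."""

    def stop(self) -> None:
        raise RuntimeError("listener already stopped")


def stop_listener(listener: FlakyListener) -> None:
    """Stop a listener, recording ordinary failures instead of hiding them."""
    try:
        listener.stop()
    except Exception as exc:  # KeyboardInterrupt/SystemExit still propagate
        logger.debug("Error stopping hotkey listener: %s", exc)


stop_listener(FlakyListener())  # logs at DEBUG level and does not raise
```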

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8171eca and 9acb0d7.

⛔ Files ignored due to path filters (29)

myenv/Scripts/coloredlogs.exe is excluded by !**/*.exe
myenv/Scripts/cortex.exe is excluded by !**/*.exe
myenv/Scripts/ct2-fairseq-converter.exe is excluded by !**/*.exe
myenv/Scripts/ct2-marian-converter.exe is excluded by !**/*.exe
myenv/Scripts/ct2-openai-gpt2-converter.exe is excluded by !**/*.exe
myenv/Scripts/ct2-opennmt-py-converter.exe is excluded by !**/*.exe
myenv/Scripts/ct2-opennmt-tf-converter.exe is excluded by !**/*.exe
myenv/Scripts/ct2-opus-mt-converter.exe is excluded by !**/*.exe
myenv/Scripts/ct2-transformers-converter.exe is excluded by !**/*.exe
myenv/Scripts/distro.exe is excluded by !**/*.exe
myenv/Scripts/dotenv.exe is excluded by !**/*.exe
myenv/Scripts/f2py.exe is excluded by !**/*.exe
myenv/Scripts/hf.exe is excluded by !**/*.exe
myenv/Scripts/httpx.exe is excluded by !**/*.exe
myenv/Scripts/humanfriendly.exe is excluded by !**/*.exe
myenv/Scripts/isympy.exe is excluded by !**/*.exe
myenv/Scripts/markdown-it.exe is excluded by !**/*.exe
myenv/Scripts/numpy-config.exe is excluded by !**/*.exe
myenv/Scripts/onnxruntime_test.exe is excluded by !**/*.exe
myenv/Scripts/openai.exe is excluded by !**/*.exe
myenv/Scripts/pip.exe is excluded by !**/*.exe
myenv/Scripts/pip3.12.exe is excluded by !**/*.exe
myenv/Scripts/pip3.exe is excluded by !**/*.exe
myenv/Scripts/pyav.exe is excluded by !**/*.exe
myenv/Scripts/pygmentize.exe is excluded by !**/*.exe
myenv/Scripts/python.exe is excluded by !**/*.exe
myenv/Scripts/pythonw.exe is excluded by !**/*.exe
myenv/Scripts/tiny-agents.exe is excluded by !**/*.exe
myenv/Scripts/tqdm.exe is excluded by !**/*.exe

📒 Files selected for processing (13)

cortex/branding.py
cortex/cli.py
cortex/voice.py
docs/VOICE_INPUT.md
myenv/Scripts/Activate.ps1
myenv/Scripts/activate
myenv/Scripts/activate.bat
myenv/Scripts/deactivate.bat
myenv/pyvenv.cfg
myenv/share/man/man1/isympy.1
pyproject.toml
requirements.txt
tests/test_voice.py

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

tests/test_voice.py
cortex/voice.py
cortex/branding.py
cortex/cli.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

tests/test_voice.py

{setup.py,setup.cfg,pyproject.toml,**/__init__.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use Python 3.10 or higher as the minimum supported version

Files:

pyproject.toml

🧬 Code graph analysis (3)

tests/test_voice.py (1)

cortex/voice.py (9)

VoiceInputHandler (39-495)

_ensure_dependencies (87-125)

_check_microphone (153-175)

transcribe (226-270)

stop (477-495)

VoiceInputError (21-24)

MicrophoneNotFoundError (27-30)

ModelNotFoundError (33-36)

get_voice_handler (498-517)

cortex/voice.py (1)

cortex/branding.py (1)

cx_print (52-82)

cortex/cli.py (1)

cortex/voice.py (4)

VoiceInputError (21-24)

VoiceInputHandler (39-495)

start_voice_mode (399-432)

record_single (434-475)

🪛 GitHub Actions: CI

cortex/cli.py

[error] 580-580: Ruff check failed: W293 Blank line contains whitespace. Command: 'ruff check . --output-format=github'.

🪛 GitHub Check: lint

cortex/cli.py

[failure] 591-591: Ruff (W293)
cortex/cli.py:591:1: W293 Blank line contains whitespace

[failure] 588-588: Ruff (W293)
cortex/cli.py:588:1: W293 Blank line contains whitespace

[failure] 580-580: Ruff (W293)
cortex/cli.py:580:1: W293 Blank line contains whitespace

🪛 GitHub Check: Lint

cortex/cli.py

[failure] 591-591: Ruff (W293)
cortex/cli.py:591:1: W293 Blank line contains whitespace

[failure] 588-588: Ruff (W293)
cortex/cli.py:588:1: W293 Blank line contains whitespace

[failure] 580-580: Ruff (W293)
cortex/cli.py:580:1: W293 Blank line contains whitespace

🪛 LanguageTool

docs/VOICE_INPUT.md

[grammar] ~141-~141: Ensure spelling is correct
Context: ...apture** - Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribe...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

[grammar] ~141-~141: Ensure spelling is correct
Context: ... Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribes using `fa...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)

docs/VOICE_INPUT.md

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

146-146: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Agent
GitHub Check: test (3.11)
GitHub Check: test (3.12)
GitHub Check: test (3.10)

🔇 Additional comments (14)

cortex/branding.py (3)

8-14: Good Windows compatibility improvements for Rich console.

The platform-aware Console initialization with force_terminal=True and legacy_windows based on platform detection is a solid approach for cross-platform terminal rendering.

63-79: Well-structured platform-specific icon fallbacks.

The ASCII fallback icons for Windows (|, +, !, x, *) are appropriate replacements for the Unicode characters that may not render correctly in Windows terminals. The conditional structure is clean and maintainable.

92-102: Consistent platform-specific separators.

The separator adjustments in cx_step and cx_header align with the icon changes above, ensuring consistent visual appearance across platforms.

pyproject.toml (1)

72-79: Voice optional dependencies are correctly structured.

The optional dependency group is properly defined with appropriate version constraints compatible with Python 3.10+. Including it in the all extra is appropriate for comprehensive installation.

docs/VOICE_INPUT.md (1)

1-46: Well-structured documentation with comprehensive coverage.

The documentation covers all essential aspects: installation, usage modes (continuous and single), configuration options, troubleshooting, privacy considerations, and API reference. The privacy-first approach (local processing, no audio uploads) is well highlighted.

tests/test_voice.py (1)

10-11: Good test organization with clear class-based grouping.

The test suite is well-organized into logical test classes (TestVoiceInputHandler, TestVoiceInputExceptions, TestGetVoiceHandler, TestRecordingState), each with focused test methods and appropriate fixtures. The mocking strategy for optional dependencies is sound.

Also applies to: 199-200, 224-225, 268-269

cortex/cli.py (3)

540-552: Good implementation with proper dependency handling.

The voice method correctly handles the optional dependency import with a helpful error message guiding users to install the voice extras. The API key and provider checks follow the established pattern in the codebase.

560-577: Consider case-insensitive verb matching and edge cases.

The install verb detection uses startswith() after lowercasing, which is good. However, the verb removal at lines 574-577 operates on the original text but uses len(verb) from the lowercase version, which works correctly since length is preserved. The logic is sound.

One edge case: if the user says "Install", the slicing text[len(verb):] correctly preserves the original casing of the software name.
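
The matching approach described above can be sketched as follows. The verb list and the function name are assumptions for illustration; the actual list in cortex/cli.py is not shown in this review.

```python
from typing import Optional

INSTALL_VERBS = ("install ", "set up ", "add ")  # hypothetical verb list


def strip_install_verb(text: str) -> Optional[str]:
    """Return the software name if text starts with an install verb, else None."""
    stripped = text.strip()
    lowered = stripped.lower()  # used for matching only
    for verb in INSTALL_VERBS:
        if lowered.startswith(verb):
            # Slice the original string so the name keeps its casing
            return stripped[len(verb):].strip()
    return None
```

With this sketch, `strip_install_verb("Install Docker")` yields `"Docker"`, and input without a leading verb yields `None`.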

1633-1638: Help table correctly advertises new voice capabilities.

The Rich help table now includes voice command entries, making the feature discoverable to users.

cortex/voice.py (5)

21-36: Well-designed exception hierarchy.

The exception classes are properly structured with a base VoiceInputError and specific subclasses for different error conditions. Docstrings are present as required by coding guidelines. The pass statements are appropriate for simple exception classes.

51-85: Good initialization with sensible defaults and type hints.

The __init__ method has proper type hints, comprehensive docstrings, and sensible defaults. The environment variable fallback for model_name provides flexibility. Internal state management with explicit initialization is clean.

226-270: Transcription method is well-implemented with proper validation.

The transcribe method correctly:

Handles lazy model loading (line 238-239)

Returns early for empty audio (line 241-242)

Validates dtype before processing (line 245-246)

Uses appropriate VAD parameters for voice commands

Strips and joins segment texts properly
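
The last step can be illustrated in isolation. This sketch assumes only that each segment exposes a `.text` attribute, as faster-whisper segments do; the `Segment` class here is a local stand-in, and dropping empty segments is an assumption about the desired behavior.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """Local stand-in for a transcription segment with a .text attribute."""

    text: str


def join_segments(segments: Iterable[Segment]) -> str:
    """Strip each segment and join the non-empty ones with single spaces."""
    parts = (seg.text.strip() for seg in segments)
    return " ".join(part for part in parts if part)
```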

498-517: Factory function provides clean API.

The get_voice_handler factory function has proper type hints and docstring. It provides a simpler interface for common use cases while still allowing customization.

177-202: Audio recording implementation is robust.

The _start_recording method properly:

Clears the buffer before starting

Uses a callback-based approach for non-blocking capture

Sets up appropriate stream parameters (16kHz, mono, float32)

Handles errors by resetting state and raising a specific exception

dhvll · 2025-12-31T13:32:32Z

@Sahilbhatane remove all those virtual environment files. You should never push those.

Sahilbhatane · 2025-12-31T13:33:34Z

> @Sahilbhatane remove all those virtual environment files. You should never push those.

yeah ik, .gitignore didn't recognize them ig.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (4)

cortex/voice.py (1)
486-490: Improve exception handling in hotkey listener cleanup.

The except Exception: pass block silently swallows all errors. Following the pattern used at line 498 for the audio stream, add debug logging to track potential issues during cleanup.
🔎 Proposed fix
         if self._hotkey_listener:
             try:
                 self._hotkey_listener.stop()
-            except Exception:
-                pass
+            except Exception as e:
+                logging.debug("Error stopping hotkey listener: %s", e)
             self._hotkey_listener = None
cortex/cli.py (3)
554-558: Unused provider variable.

The provider variable is retrieved at line 558 but never used in the voice method. The process_voice_command callback already retrieves the API key and uses self.ask() and self.install(), which internally get the provider themselves.
🔎 Consider removing unused variable
         api_key = self._get_api_key()
         if not api_key:
             return 1
-
-        provider = self._get_provider()
579-600: Fix trailing whitespace to resolve pipeline failures.

Based on past review comments, lines 580, 588, and 591 contain trailing whitespace on blank lines, causing CI/linter failures. These must be fixed for the PR to pass.
🔎 Remove trailing whitespace

Ensure lines 580, 588, and 591 are completely empty (no spaces or tabs):
                 cx_print(f"Installing: {software}", "info")
-                
+
                 # Ask user for confirmation
                 console.print()
                 console.print("[bold cyan]Choose an action:[/bold cyan]")
                 console.print("  [1] Dry run (preview commands)")
                 console.print("  [2] Execute (run commands)")
                 console.print("  [3] Cancel")
                 console.print()
-                
+
                 try:
                     choice = input("Enter choice [1/2/3]: ").strip()
-                    
+
                     if choice == "1":
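
To find such lines before pushing, a small helper can approximate what W293 flags. This is a rough sketch, not Ruff's implementation, and `whitespace_only_lines` is a hypothetical name.

```python
from typing import List


def whitespace_only_lines(source: str) -> List[int]:
    """Return 1-based numbers of lines that are blank but contain spaces or tabs."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if line and not line.strip()
    ]
```

Running it over the contents of cortex/cli.py should list the same line numbers the linter reported.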
2000-2015: Add error handling for VoiceInputError and its subclasses.

The install --mic flow catches ImportError but doesn't handle VoiceInputError, MicrophoneNotFoundError, or ModelNotFoundError. If voice recording fails (e.g., no microphone detected, model loading fails), an unhandled exception will propagate.
🔎 Proposed fix
             if getattr(args, "mic", False):
                 try:
-                    from cortex.voice import VoiceInputHandler
+                    from cortex.voice import VoiceInputError, VoiceInputHandler

                     handler = VoiceInputHandler()
                     cx_print("Press F9 to speak what you want to install...", "info")
                     software = handler.record_single()
                     if not software:
                         cx_print("No speech detected.", "warning")
                         return 1
                     cx_print(f"Installing: {software}", "info")
                 except ImportError:
                     cli._print_error("Voice dependencies not installed.")
                     cx_print("Install with: pip install cortex-linux[voice]", "info")
                     return 1
+                except VoiceInputError as e:
+                    cli._print_error(str(e))
+                    return 1
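
The reason a single `except VoiceInputError` is enough can be shown with a stand-alone sketch. The class names mirror the hierarchy described for cortex/voice.py; `record_or_fail` and `no_microphone` are hypothetical helpers, and the exit codes are an assumption.

```python
from typing import Callable


class VoiceInputError(Exception):
    """Base class, mirroring the hierarchy described for cortex/voice.py."""


class MicrophoneNotFoundError(VoiceInputError):
    """Raised when no input device is available."""


class ModelNotFoundError(VoiceInputError):
    """Raised when the speech model cannot be loaded."""


def record_or_fail(record: Callable[[], str]) -> int:
    """Hypothetical wrapper: return a CLI-style exit code instead of raising."""
    try:
        text = record()
    except VoiceInputError as exc:  # also catches both subclasses
        print(f"Error: {exc}")
        return 1
    return 0 if text else 1


def no_microphone() -> str:
    """Simulate a recording attempt on a machine without a microphone."""
    raise MicrophoneNotFoundError("No microphone detected")
```

Catching the base class keeps the CLI branch short while still covering any subclasses added later.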

🧹 Nitpick comments (1)

.gitignore (1)
14-20: Consolidate duplicate entries for maintainability.

The .gitignore file contains numerous redundant entries. For example:

env/, venv/, ENV/ appear at lines 14–15, 143–145

.mypy_cache/, .pytest_cache/, .coverage, htmlcov/ appear at lines 70–71 and 146–149

This duplication reduces readability and makes future maintenance harder.
🔎 Suggested cleanup: Remove duplicates and consolidate sections
  # ==============================
  # Logs & Misc
  # ==============================
  *.log
  logs/
  *.tmp
  *.bak
  *.swp
- .env
- .venv
- env/
- venv/
- ENV/
- .mypy_cache/
- .pytest_cache/
- .coverage
- htmlcov/
  *.out
  *~
  *.swo
Then, verify that all non-duplicate entries are already covered in their respective sections above (Virtual Environments, mypy/Pyre/pytype, etc.).

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9acb0d7 and 891dff5.

⛔ Files ignored due to path filters (23)

myenv/Scripts/bandit-baseline.exe is excluded by !**/*.exe
myenv/Scripts/bandit-config-generator.exe is excluded by !**/*.exe
myenv/Scripts/bandit.exe is excluded by !**/*.exe
myenv/Scripts/black.exe is excluded by !**/*.exe
myenv/Scripts/blackd.exe is excluded by !**/*.exe
myenv/Scripts/coverage-3.12.exe is excluded by !**/*.exe
myenv/Scripts/coverage.exe is excluded by !**/*.exe
myenv/Scripts/coverage3.exe is excluded by !**/*.exe
myenv/Scripts/dmypy.exe is excluded by !**/*.exe
myenv/Scripts/mypy.exe is excluded by !**/*.exe
myenv/Scripts/mypyc.exe is excluded by !**/*.exe
myenv/Scripts/nltk.exe is excluded by !**/*.exe
myenv/Scripts/normalizer.exe is excluded by !**/*.exe
myenv/Scripts/pip.exe is excluded by !**/*.exe
myenv/Scripts/pip3.12.exe is excluded by !**/*.exe
myenv/Scripts/pip3.exe is excluded by !**/*.exe
myenv/Scripts/py.test.exe is excluded by !**/*.exe
myenv/Scripts/pytest.exe is excluded by !**/*.exe
myenv/Scripts/ruff.exe is excluded by !**/*.exe
myenv/Scripts/safety.exe is excluded by !**/*.exe
myenv/Scripts/stubgen.exe is excluded by !**/*.exe
myenv/Scripts/stubtest.exe is excluded by !**/*.exe
myenv/Scripts/typer.exe is excluded by !**/*.exe

📒 Files selected for processing (6)

.gitignore
cortex/cli.py
cortex/voice.py
myenv/share/man/man1/bandit.1
tests/test_ollama_integration.py
tests/test_voice.py

✅ Files skipped from review due to trivial changes (1)

myenv/share/man/man1/bandit.1

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_voice.py

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

tests/test_ollama_integration.py
cortex/voice.py
cortex/cli.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Maintain >80% test coverage for pull requests

Files:

tests/test_ollama_integration.py

🧬 Code graph analysis (3)

tests/test_ollama_integration.py (1)

scripts/setup_ollama.py (1)

check_ollama_installed (75-77)

cortex/voice.py (1)

cortex/branding.py (1)

cx_print (52-82)

cortex/cli.py (1)

cortex/voice.py (4)

VoiceInputError (21-24)

VoiceInputHandler (39-499)

start_voice_mode (401-436)

record_single (438-479)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: Test (Python 3.12)
GitHub Check: Test (Python 3.10)
GitHub Check: Test (Python 3.11)
GitHub Check: test (3.12)
GitHub Check: test (3.10)
GitHub Check: test (3.11)

🔇 Additional comments (10)

tests/test_ollama_integration.py (1)

15-15: LGTM: Clean refactor to shutil.which.

Replacing the subprocess-based Ollama detection with shutil.which is a cleaner, more direct approach that aligns with the pattern used in scripts/setup_ollama.py (lines 74-76).

Also applies to: 29-30, 37-37
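
As an illustration of the pattern praised above, here is a minimal sketch of detecting a binary with `shutil.which` instead of spawning a subprocess. The helper name `is_command_available` is hypothetical and is not taken from the PR; the function referenced in scripts/setup_ollama.py is `check_ollama_installed`, whose body is not shown here.

```python
import shutil


def is_command_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    # shutil.which returns the resolved path, or None when nothing matches,
    # so no subprocess has to be spawned just to probe for the binary.
    return shutil.which(name) is not None
```

A test could then skip Ollama-dependent cases with something like `is_command_available("ollama")`, which avoids the platform differences of shelling out.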

cortex/voice.py (5)

1-36: LGTM: Clean module structure and exception hierarchy.

Module docstring clearly describes the feature, and the exception hierarchy provides appropriate granularity for error handling.

39-86: LGTM: Well-structured initialization with environment-driven configuration.

Good use of lazy loading for the model and environment variable overrides. The type hints and docstring meet coding guidelines.
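
A rough sketch of what lazy loading with environment-driven configuration can look like, for readers unfamiliar with the pattern. Only `CORTEX_VOICE_HOTKEY`, the F9 default, and the `base.en` default model come from the PR description; the class name, the `EXAMPLE_WHISPER_MODEL` variable, the lowercase `"f9"` string, and the injected `loader` callable are assumptions made for this example, not the code in cortex/voice.py.

```python
import os
from typing import Any, Callable, Optional


class LazyVoiceConfig:
    """Reads settings from the environment and defers model loading."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        # CORTEX_VOICE_HOTKEY and the F9 default come from the PR description;
        # the exact string format ("f9") is an assumption.
        self.hotkey: str = os.environ.get("CORTEX_VOICE_HOTKEY", "f9")
        # EXAMPLE_WHISPER_MODEL is a made-up variable name for this sketch.
        self.model_name: str = os.environ.get("EXAMPLE_WHISPER_MODEL", "base.en")
        self._loader = loader
        self._model: Optional[Any] = None  # nothing is loaded at construction

    @property
    def model(self) -> Any:
        # Load on first access only; later accesses reuse the cached object.
        if self._model is None:
            self._model = self._loader(self.model_name)
        return self._model
```

The point of the design is that constructing the handler stays cheap: the model download or load cost is paid only when transcription is first requested.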

87-270: LGTM: Robust dependency checks and transcription pipeline.

The dependency validation provides helpful user guidance, and the transcription implementation uses appropriate parameters (beam_size=5, VAD filtering, no_speech_threshold=0.6).
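
To make the parameters mentioned above concrete, the following sketch passes them to a faster-whisper style `transcribe` call and joins the resulting segment texts. The function name `transcribe_text` and the `FakeModel` stand-in are hypothetical; the stand-in exists only so the example runs without faster-whisper installed, and the real pipeline in cortex/voice.py may differ.

```python
from types import SimpleNamespace
from typing import Any


def transcribe_text(model: Any, audio: Any) -> str:
    """Join the text of all segments returned by a faster-whisper style model."""
    segments, _info = model.transcribe(
        audio,
        beam_size=5,              # beam search width used during decoding
        vad_filter=True,          # filter out stretches without detected speech
        no_speech_threshold=0.6,  # threshold on the no-speech probability
    )
    return " ".join(segment.text.strip() for segment in segments).strip()


class FakeModel:
    """Stand-in that mimics the (segments, info) return shape."""

    def transcribe(self, audio: Any, **options: Any):
        self.options = options  # remember what the caller asked for
        segments = [SimpleNamespace(text=" install "), SimpleNamespace(text="docker")]
        return iter(segments), None
```

With a real model, `model` would be a `faster_whisper.WhisperModel` instance and `audio` a float32 array, as the review notes elsewhere in this thread.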

272-436: LGTM: Thread-safe hotkey handling with good UX.

The implementation properly uses threading primitives (locks, events) and provides visual feedback during recording. KeyboardInterrupt handling ensures clean exit.

502-521: LGTM: Clean factory function with proper type hints.

The factory function follows Python conventions and provides a clean entry point for creating voice handlers.

cortex/cli.py (4)

1633-1633: LGTM: Help documentation clearly advertises voice features.

The help table updates appropriately document the new voice input capabilities for users.

Also applies to: 1638-1638

1722-1739: LGTM: Well-structured argument parsing for voice features.

The --mic and --single flags are appropriately configured, and making the question argument optional enables the voice input flow.
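
A minimal `argparse` sketch of the configuration described above, assuming a subcommand layout. The flag and argument names (`--mic`, `--single`, `question`) come from the PR; the function name `build_parser` and the help strings are illustrative and not copied from cortex/cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cortex")
    sub = parser.add_subparsers(dest="command")

    voice = sub.add_parser("voice", help="Voice input mode")
    voice.add_argument("--single", action="store_true", help="Record one command and exit")

    ask = sub.add_parser("ask", help="Ask a question")
    # nargs="?" makes the positional optional, so `cortex ask --mic` parses
    # without a question on the command line.
    ask.add_argument("question", nargs="?", default=None)
    ask.add_argument("--mic", action="store_true", help="Use voice input")
    return parser
```

Because `question` may now be `None`, the caller has to handle the case where neither a question nor `--mic` is given, which is the error path the review mentions for the routing code.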

1743-1755: LGTM: Consistent --mic flag integration.

The install command's --mic flag follows the same pattern as the ask command, maintaining consistency across the CLI.

1989-1998: LGTM: Clean command routing for voice features.

The routing logic appropriately delegates to the voice method for both explicit voice commands and --mic flags, with helpful error messages.

.gitignore

- Remove unused 'provider' variable in cli.py
- Add logging to except block in voice.py stop()
- Remove unused 'threading' import in test_voice.py
- Improve test_ensure_dependencies_missing test
- Fix test_transcribe_loads_model_if_needed to test lazy loading
- Add VoiceInputError handling to install --mic path
- Remove optional voice deps from requirements.txt (use pyproject.toml)

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

cortex/voice.py (2)
185-189: Consider using logging for audio status messages.

Line 187 prints audio status directly to stderr. For consistency with the rest of the module (which uses cx_print and logging.debug), consider using logging.debug() instead.
🔎 Proposed refactor
         def audio_callback(indata, frames, time_info, status):
             if status:
-                print(f"Audio status: {status}", file=sys.stderr)
+                logging.debug("Audio status: %s", status)
             if self._is_recording:
                 self._audio_buffer.append(indata.copy())
238-250: Redundant model check after lazy loading.

Lines 248-250 check if self._model is None after calling _load_model() on line 239. However, _load_model() either sets self._model or raises ModelNotFoundError, so the model can never be None at line 249. This check is dead code.
🔎 Proposed refactor
         if self._model is None:
             self._load_model()

         if len(audio_data) == 0:
             return ""

         # faster-whisper expects float32 audio normalized to [-1, 1]
         if audio_data.dtype != np.float32:
             audio_data = audio_data.astype(np.float32)

-        # Model should be loaded at this point
-        if self._model is None:
-            raise ModelNotFoundError("Model must be loaded before transcription")
-
         segments, info = self._model.transcribe(
docs/VOICE_INPUT.md (1)
66-66: Consider adding language identifiers to code blocks.

The fenced code blocks at lines 66 and 146 are missing language identifiers. While not critical, adding them improves syntax highlighting and documentation quality.
🔎 Suggested additions

For line 66 (example session output):
-```
+```console
 $ cortex voice
For line 146 (ASCII diagram):
-```
+```text
 ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
Also applies to: 146-157

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 891dff5 and 8f2384f.

📒 Files selected for processing (5)

cortex/cli.py
cortex/voice.py
docs/VOICE_INPUT.md
requirements.txt
tests/test_voice.py

🚧 Files skipped from review as they are similar to previous changes (2)

requirements.txt
tests/test_voice.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs

Files:

cortex/cli.py
cortex/voice.py

🧬 Code graph analysis (2)

cortex/cli.py (3)

cortex/voice.py (4)

VoiceInputError (21-24)

VoiceInputHandler (39-499)

start_voice_mode (401-436)

record_single (438-479)

cortex/first_run_wizard.py (1)

_print_error (746-748)

cortex/branding.py (1)

cx_print (52-82)

cortex/voice.py (1)

cortex/branding.py (1)

cx_print (52-82)

🪛 LanguageTool

docs/VOICE_INPUT.md

[grammar] ~141-~141: Ensure spelling is correct
Context: ...apture** - Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribe...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

[grammar] ~141-~141: Ensure spelling is correct
Context: ... Records via sounddevice at 16kHz mono 3. Speech-to-Text - Transcribes using `fa...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)

docs/VOICE_INPUT.md

66-66: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

146-146: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (10)

cortex/voice.py (3)

339-383: LGTM! Thread-safe hotkey handling.

The hotkey setup properly uses a lock for thread safety and correctly sets the recording flag before starting threads. The logic handles both press-to-start and press-to-stop flows correctly.
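
The press-to-start and press-to-stop behaviour described above can be sketched as a small toggle guarded by a lock. The class name `RecordingToggle` and its return values are hypothetical; this is an illustration of the technique under those assumptions, not the PR's implementation.

```python
import threading


class RecordingToggle:
    """Tracks whether a recording is active, safely across threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._is_recording = False

    def on_hotkey(self) -> str:
        """Flip the state atomically and report which action to take."""
        with self._lock:
            if self._is_recording:
                self._is_recording = False
                return "stop"
            # Set the flag before any worker thread would be started, so a
            # second key press arriving immediately cannot start another one.
            self._is_recording = True
            return "start"
```

Holding the lock around both the check and the update is what prevents two near-simultaneous key events from each seeing "not recording" and both starting a recording.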

385-436: LGTM! Proper error handling and resource cleanup.

The recording worker and continuous voice mode implementation correctly handle errors and ensure resources are cleaned up via the finally block.

481-499: LGTM! Exception handling improved.

The cleanup logic properly handles exceptions during shutdown and logs them for debugging. This addresses the past review comment about silent exception handling.
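
For context, a sketch of cleanup that logs a failure instead of swallowing it silently. The function name `stop_listener` and the `listener` object with a `stop()` method are assumptions for illustration; the actual `stop()` in cortex/voice.py is not reproduced here.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def stop_listener(listener: Optional[Any]) -> None:
    """Stop a hotkey listener, logging failures instead of hiding them."""
    if listener is None:
        return
    try:
        listener.stop()
    except Exception as exc:  # cleanup should not raise during shutdown
        # Logged at debug level so the failure is traceable without
        # alarming users on a normal exit.
        logger.debug("Error stopping hotkey listener: %s", exc)
```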

docs/VOICE_INPUT.md (2)

49-49: LGTM! Model size documentation is now consistent.

The documentation correctly states that the default base.en model is ~150MB, which is consistent with the table at line 117. This addresses the past review comment about the documentation inconsistency.

1-261: LGTM! Comprehensive and well-structured documentation.

The documentation provides excellent coverage of the voice input feature, including installation, usage, configuration, troubleshooting, and API reference. The structure is clear and user-friendly.

cortex/cli.py (5)

540-625: LGTM! Well-designed voice input integration.

The voice() method properly handles both continuous and single-shot modes, includes user confirmation for installations, and has comprehensive error handling for missing dependencies and voice input errors.

1989-1996: LGTM! Clean integration of voice input with ask command.

The --mic flag integration properly routes to the voice handler and provides clear error messages when neither a question nor the mic flag is provided.

1998-2027: LGTM! VoiceInputError handling properly implemented.

The install command's --mic integration now correctly imports and catches VoiceInputError (lines 2001, 2014-2016), addressing the past review comment about missing error handling. The implementation provides clear error messages and proper fallback behavior.

1728-1753: LGTM! Parser configuration is correct.

The new voice command and --mic flag integrations are properly configured with clear help text and appropriate argument handling. Making software optional (line 1741) correctly supports the --mic workflow.

1987-1988: LGTM! Voice command routing is correct.

The routing logic correctly maps the --single flag to the continuous parameter by negating it, so that by default (no flag) continuous mode is active, and with --single it switches to single-shot mode.
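
The negation described above amounts to a one-line mapping. In this sketch the helper name `voice_mode_kwargs` is hypothetical, and the `continuous` keyword is taken from the review comment rather than from the source file.

```python
from types import SimpleNamespace


def voice_mode_kwargs(args: SimpleNamespace) -> dict:
    """Translate parsed CLI arguments into keyword arguments for voice mode."""
    # No flag means continuous mode; --single switches to one-shot recording.
    return {"continuous": not getattr(args, "single", False)}
```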

sonarqubecloud · 2026-01-11T11:51:03Z

Quality Gate failed

Failed conditions
B Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Sahilbhatane and others added 2 commits December 31, 2025 18:35

Voice feature: speeech-to-text added

50d92cf

Merge branch 'cortexlinux:main' into issue-325

9acb0d7

Copilot AI review requested due to automatic review settings December 31, 2025 13:23

Sahilbhatane requested a review from mikejmorgan-ai as a code owner December 31, 2025 13:23

Copilot started reviewing on behalf of Sahilbhatane December 31, 2025 13:24

test fixs

bb1de82

Copilot AI reviewed Dec 31, 2025

coderabbitai bot reviewed Dec 31, 2025

chore: add myenv and venv312 to gitignore

891dff5

Sahilbhatane changed the title ~~Issue 325~~ Added voice feature for end-user fix #325 Dec 31, 2025

coderabbitai bot reviewed Dec 31, 2025

Sahilbhatane marked this pull request as draft December 31, 2025 13:38

Sahilbhatane added 2 commits December 31, 2025 19:09

fix: remove myenv from repo and fix voice test mocking

1734cff

docs: remove tiny.en model, use base.en as default everywhere

dab189b

dhvll requested a review from Anshgrover23 December 31, 2025 13:55

Sahilbhatane added 2 commits December 31, 2025 19:27

fix: skip voice tests when numpy not installed (optional dep)

45d3769

Sahilbhatane requested review from Dhruv-89 and Suyashd999 December 31, 2025 14:08

Anshgrover23 removed their request for review December 31, 2025 14:09

Sahilbhatane removed the request for review from mikejmorgan-ai December 31, 2025 14:10

Merge branch 'cortexlinux:main' into issue-325

8f2384f

Sahilbhatane marked this pull request as ready for review December 31, 2025 14:35

Sahilbhatane marked this pull request as draft December 31, 2025 14:35

coderabbitai bot reviewed Dec 31, 2025

Anshgrover23 assigned Sahilbhatane Jan 1, 2026

mikejmorgan-ai self-assigned this Jan 10, 2026

Merge branch 'main' into issue-325

12e7208

Sahilbhatane changed the title ~~Added voice feature for end-user fix #325~~ feat: Added voice feature for end-user fix #325 Jan 11, 2026

Sahilbhatane added 4 commits January 11, 2026 16:56

Suggestion fix and import fix

3ae13dd

Add import for error

4837e51

System requirements for voice and key detection

73a734f

test fix for py 3.11

0cd18a1

Anshgrover23 unassigned mikejmorgan-ai Jan 11, 2026

github-actions bot commented Dec 31, 2025

CLA Verification Passed
