-
Notifications
You must be signed in to change notification settings - Fork 42
More robust bench execution for Dynamo over k8s #793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe change replaces in-cluster Kubernetes API streaming with a subprocess.run call executing kubectl exec directly. Command output is written to genai_perf.log with enhanced error handling, and exception handling is broadened from specific API exceptions to general exceptions with debug-level logging. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip 🧪 Unit Test Generation v2 is now available!We have significantly improved our unit test generation capabilities. To enable: Add this to your reviews:
finishing_touches:
unit_tests:
enabled: trueTry it out by using the Have feedback? Share your thoughts on our Discord thread! Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile OverviewGreptile SummaryReplaced Kubernetes Python Stream API with Key Changes:
Issues Found:
Confidence Score: 3/5
Important Files Changed
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 file reviewed, 3 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/cloudai/systems/kubernetes/kubernetes_system.py`:
- Around line 316-324: The current subprocess.run call in kubernetes_system.py
(the try block around kubectl_exec_cmd) only logs result.returncode and swallows
failures; change it to either pass check=True to subprocess.run or explicitly
check result.returncode and raise a subprocess.CalledProcessError when non-zero,
and re-raise TimeoutExpired and other exceptions instead of only logging them so
failures propagate; update the except block to log and then raise the caught
exception (so _genai_perf_completed is only set when the command actually
succeeds).
- Around line 313-316: The kubectl subprocess command built in kubectl_exec_cmd
does not pass the resolved kubeconfig, so kubectl may target the wrong cluster
when self.kube_config_path differs from the default; update the construction of
kubectl_exec_cmd (the list used in subprocess.run for running genai_perf_cmd) to
include the flag "--kubeconfig" followed by self.kube_config_path (or its
resolved path used by load_kube_config()) before the "exec" invocation so the
subprocess uses the same kubeconfig as the API client.
Summary
Stream API can be sensitive to network issues, delays, broken ping, etc. Using
kubectlis more robust for this case.Test Plan
Additional Notes
—