A battle-tested CI/CD standardization framework for regulated environments. Built from lessons learned deploying pipelines in payment processing, fintech, and healthcare organizations where audit failures cost millions and security incidents end careers.
This is not a tutorial. It's a production-grade reference implementation that handles the edge cases most examples ignore.
The problem: Every team builds their own pipeline. Quality varies. Security is inconsistent. Auditors ask questions nobody can answer. Incidents reveal gaps that "should have been caught."
The solution: A golden path that teams adopt, not adapt. One way to build, scan, sign, and deploy. Exceptions tracked, not hidden. Evidence generated automatically.
What this proves:
- Pipeline standardization at scale (50+ services, same controls)
- Audit-ready evidence generation (PCI-DSS, SOC2)
- Security gates that block, not warn
- Promotion workflow that auditors understand
Evaluate in 5 minutes:
# Clone and run the demo (no Docker required)
git clone <repo-url>
cd openshift-devsecops-golden-path
make demo-e2e
Or manually:
- Read this README (2 min)
- Review Design Decisions (3 min)
- Check Exception Workflow (2 min)
- Skim the Jenkins Shared Library (3 min)
In payment/fintech, you don't get to "move fast and break things":
| Requirement | Why It Matters |
|---|---|
| Change Control | PCI-DSS 6.4 requires documented change management |
| Segregation of Duties | Developers cannot deploy their own code to production |
| Audit Trail | Every deployment must trace back to approved code |
| Vulnerability Management | Known vulnerabilities must be tracked and remediated |
| Evidence | "Trust me" doesn't work with auditors |
I've seen these failures repeatedly:
- Audit finding: "No evidence that security scans ran before production deployment"
- Incident: Vulnerable dependency in production because SCA was "optional"
- Compliance gap: 40% of services had no quality gates at all
- Finger-pointing: "I thought the other team handled that"
This framework exists because voluntary best practices don't work in organizations under regulatory pressure.
┌─────────────────────────────────────────────────────────────────────────────┐
│ IMMUTABLE BUILD PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ git push ──► Jenkins ──► MANDATORY GATES ──► Signed Artifact │
│ │ │ │
│ ┌──────────┴──────────┐ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────┐┌─────────┐┌─────────┐┌─────────┐ │
│ │ SAST ││ SCA ││ Quality ││ Secrets │ │
│ │ Semgrep ││ Trivy ││ Sonar ││Gitleaks │ │
│ │ ││ ││ ││ │ │
│ │ BLOCKS ││ BLOCKS ││ BLOCKS ││ BLOCKS │ │
│ │ on High ││on Crit. ││on Fail ││on Find │ │
│ └─────────┘└─────────┘└─────────┘└─────────┘ │
│ │
│ Artifact = Container Image + SBOM + Signature + Build Provenance │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
│
Digest (immutable)
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ GITOPS PROMOTION (Argo CD) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ DEV │ PR ──► │ QA │ PR + ──► │ PROD │ │
│ │ │ │ │ Approval │ │ │
│ │ auto │ │ team │ │ lead + │ │
│ │ deploy │ │ review │ │ security│ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ │
│ SAME ARTIFACT (by digest) moves through environments │
│ Config changes, code doesn't │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
- Gates block, they don't warn. A warning is a suggestion. A block is a control.
- Artifacts are immutable. Same SHA256 from build to production.
- Promotion is a Git commit. Auditable, reversible, requires approval.
- Exceptions are explicit. Tracked in Git, time-boxed, require justification.
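The "block, don't warn" principle comes down to exit codes: a gate that finds something must return non-zero so the stage fails. A minimal sketch, assuming a hypothetical helper named enforce_gate that receives the number of blocking findings from a scanner wrapper (the name and interface are illustrative, not taken from the shared library):

```shell
# Hypothetical gate helper: turns a count of blocking findings into pass/fail.
# Usage: enforce_gate <gate-name> <blocking-finding-count>
enforce_gate() {
  gate_name=$1
  blocking_findings=$2
  case $blocking_findings in
    ''|*[!0-9]*)
      # A missing or non-numeric count means the scanner did not report.
      echo "BLOCKED: $gate_name produced no usable result" >&2
      return 1 ;;
  esac
  if [ "$blocking_findings" -gt 0 ]; then
    echo "BLOCKED: $gate_name reported $blocking_findings blocking finding(s)" >&2
    return 1   # non-zero exit fails the pipeline stage
  fi
  echo "PASSED: $gate_name"
}
```

Failing closed on a missing count is a design choice in this sketch: a crashed scanner is treated as a blocked build rather than a clean one.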
This framework generates evidence for common compliance requirements:
PCI-DSS:
| Requirement | Control | Evidence Generated |
|---|---|---|
| 6.2.4 Software development personnel are trained | Enforced via pipeline (can't skip) | Build logs showing gate execution |
| 6.3.1 Security vulnerabilities identified and managed | SCA scan on every build | Trivy reports, vulnerability counts |
| 6.3.2 Custom code reviewed before release | SonarQube quality gate | Quality gate status, PR reviews |
| 6.4.1 Development/test environments separated from production | GitOps env separation | Argo CD sync status per env |
| 6.4.2 Separation of duties between dev and prod | PR approval required for prod | Git commit history, approvers |
| 6.5.x Common vulnerabilities addressed | SAST scans | Semgrep reports |
SOC 2:
| Trust Criteria | Control | Evidence |
|---|---|---|
| CC6.1 Logical access controls | Pipeline RBAC, env separation | Jenkins audit logs, Git history |
| CC6.6 Security events logged | All pipeline actions logged | Jenkins logs, Argo CD events |
| CC7.1 Configuration management | GitOps single source of truth | Git history is the audit trail |
| CC8.1 Change management | PR-based promotion | PR history, approvals |
Common auditor questions:
| Auditor Question | Where to Find Evidence |
|---|---|
| "How do you ensure only scanned code reaches production?" | Jenkins build logs: every build shows SAST/SCA execution |
| "Who approved this production deployment?" | Git history: PR merge commit shows approver |
| "What vulnerabilities exist in production?" | SBOM + Trivy DB: exact dependency list, can query CVEs |
| "How do you track security exceptions?" | security/exceptions/: all exceptions in Git |
| "What changed between these two deployments?" | Git diff between promotion PRs |
A production system must handle failures gracefully. Here's how this framework handles common failure scenarios:
Build pipeline failures:
| Failure | Impact | Mitigation |
|---|---|---|
| SonarQube down | Quality gate cannot execute | Pipeline fails fast with clear error. No silent bypass. Ops alerted. |
| Trivy DB outdated | SCA may miss recent CVEs | Pipeline checks DB age, warns if >24h old, fails if >72h |
| Registry unreachable | Cannot push artifact | Retry with backoff (3x). Fail after. No partial states. |
| Signing key unavailable | Cannot sign image | Pipeline fails. Unsigned images cannot promote. |
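The Trivy DB freshness rule (warn after 24 hours, fail after 72) is a simple threshold check. A minimal sketch, assuming the DB age in whole hours has already been computed elsewhere; trivy_db_status is a hypothetical name, not a Trivy command:

```shell
# Classify the Trivy DB age (whole hours) using the thresholds from the table:
# more than 72h fails the build, more than 24h warns, anything else is ok.
# How the age is obtained is left out of this sketch.
trivy_db_status() {
  age_hours=$1
  case $age_hours in
    ''|*[!0-9]*) echo fail; return 1 ;;   # unknown age: fail closed
  esac
  if [ "$age_hours" -gt 72 ]; then
    echo fail
    return 1
  elif [ "$age_hours" -gt 24 ]; then
    echo warn
  else
    echo ok
  fi
}
```

A pipeline stage could log the "warn" result and continue, and stop on a non-zero return.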
Deployment failures:
| Failure | Impact | Mitigation |
|---|---|---|
| Argo CD down | No sync to cluster | GitOps state preserved in Git. Manual sync possible. Alert on sync delay >15min. |
| Deployment fails health check | Bad version in env | Argo CD auto-rollback to last healthy. Progressive rollout limits blast radius. |
| Cluster unreachable | Cannot deploy | Argo CD retries. State preserved. Alert on prolonged disconnect. |
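The registry mitigation ("Retry with backoff (3x). Fail after.") can be sketched as a small wrapper. This assumes "3x" means three attempts in total and doubles the delay between attempts; retry_with_backoff and RETRY_BASE_DELAY are hypothetical names:

```shell
# Run a command up to 3 times, doubling the wait between attempts.
# RETRY_BASE_DELAY (seconds) is an assumed knob, defaulting to 2.
retry_with_backoff() {
  attempt=1
  delay=${RETRY_BASE_DELAY:-2}
  while [ "$attempt" -le 3 ]; do
    "$@" && return 0
    if [ "$attempt" -lt 3 ]; then
      echo "attempt $attempt failed, retrying in ${delay}s" >&2
      sleep "$delay"
      delay=$((delay * 2))
    fi
    attempt=$((attempt + 1))
  done
  echo "giving up after 3 attempts: $*" >&2
  return 1
}
```

Usage would look like `retry_with_backoff docker push "$IMAGE"`, with the stage failing if the wrapper returns non-zero.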
Rollback production deployment:
# GitOps rollback = revert the PR
git revert <promotion-commit>
git push
# Argo CD syncs previous digest automatically
Emergency hotfix (bypassing normal flow):
- Create exception request documenting urgency
- Get security lead approval (Slack + Git)
- Use EMERGENCY=true flag in pipeline (logs extensively)
- Post-incident: full postmortem, convert to proper fix
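One way to keep the emergency flag from becoming a blank check is to honour it only when a tracked exception accompanies it. A minimal sketch under that assumption; check_emergency and the EXCEPTION_FILE variable are hypothetical and not taken from the pipeline itself:

```shell
# Hypothetical guard: EMERGENCY=true is accepted only with an exception file.
# EXCEPTION_FILE is an assumed variable pointing into security/exceptions/.
check_emergency() {
  if [ "${EMERGENCY:-false}" != "true" ]; then
    echo "normal"
    return 0
  fi
  if [ -z "${EXCEPTION_FILE:-}" ] || [ ! -f "$EXCEPTION_FILE" ]; then
    echo "EMERGENCY=true rejected: no exception file found" >&2
    return 1
  fi
  # Log who used the bypass, when, and under which exception.
  echo "EMERGENCY bypass by ${USER:-unknown} at $(date -u +%Y-%m-%dT%H:%M:%SZ) under $EXCEPTION_FILE" >&2
  echo "emergency"
}
```

The function prints the mode on stdout and the audit line on stderr, so a caller can branch on the mode while the log captures the bypass.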
Pipeline infrastructure down:
# Pipeline state is in Git, not Jenkins
# Rebuild Jenkins, reconnect to same repos
# Resume from last successful stage
Delivery metrics:
| Metric | Target | How We Measure |
|---|---|---|
| Deployment Frequency | Daily per service | Count of production promotions |
| Lead Time for Changes | <1 day | Commit timestamp → production deploy |
| Change Failure Rate | <5% | Rollbacks / total deployments |
| MTTR | <1 hour | Incident open → resolved |
Security metrics:
| Metric | Target | How We Measure |
|---|---|---|
| Critical vulns in prod | 0 | Trivy scan of running images |
| High vulns MTTR | <7 days | Time from detection to remediation |
| Exception count | Trending down | Count of active exceptions |
| Gate bypass attempts | 0 | Pipeline logs (should never happen) |
Adoption metrics:
| Metric | Target | How We Measure |
|---|---|---|
| Services on golden path | 100% | Services using shared library / total |
| Pipeline success rate | >95% | Successful builds / total builds |
| Onboarding time | <1 day | Request → first successful build |
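Several of the metrics above are plain ratios: change failure rate (rollbacks / total deployments), pipeline success rate (successful builds / total builds), and golden path adoption (services using the shared library / total). A minimal sketch of the arithmetic, with ratio_pct as a hypothetical helper and the counts supplied by whatever reporting job collects them:

```shell
# Print part/total as a percentage with one decimal place.
# Prints "n/a" when the total is zero or missing, instead of dividing by zero.
ratio_pct() {
  awk -v part="$1" -v total="$2" 'BEGIN {
    if (total <= 0) { print "n/a"; exit }
    printf "%.1f\n", 100 * part / total
  }'
}

# Example: 2 rollbacks in 50 production deployments.
# ratio_pct 2 50    prints 4.0, which is under the <5% target
```

The same helper covers the success-rate and adoption rows: 48 successful builds out of 50 gives 96.0, which meets the >95% target.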
What we deliberately don't measure:
- Lines of code: incentivizes bloat
- Number of deployments: meaningless without quality context
- Vulnerabilities found: finding more isn't better; fixing is
Strong opinions, loosely held:
Why: "Latest" is a lie. It changes. You cannot audit what "latest" meant last Tuesday. Digests are immutable.
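A deploy or promotion step can enforce this mechanically by rejecting any image reference that is not pinned to a sha256 digest. A minimal sketch; is_digest_pinned is a hypothetical helper, not part of the repository:

```shell
# Succeed only if the image reference ends in @sha256:<64 lowercase hex chars>.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Example guard in a promotion script (IMAGE_REF is an assumed variable):
# is_digest_pinned "$IMAGE_REF" || { echo "tag references are not allowed" >&2; exit 1; }
```

A reference such as localhost:5000/demo-service:latest fails the check, while the same repository followed by @sha256: and a full digest passes.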
Why: Segregation of duties. The person who wrote the code should not be the same person who approves production deployment. This is non-negotiable in regulated environments.
Why: A gate that warns but doesn't block is not a gate. It's a suggestion. Suggestions get ignored under deadline pressure.
Why: Without signatures, you cannot prove the image in production came from your pipeline. Supply chain attacks exploit this gap.
Why: Tickets get closed and forgotten. Git history is permanent. When an auditor asks "what exceptions existed on date X?", you can answer with git log.
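Keeping exceptions in Git also makes the time-box checkable by a script. A minimal sketch, assuming each file under security/exceptions/ carries a line of the form `expires: YYYY-MM-DD` (that layout is an assumption, not the repository's documented format):

```shell
# List exception files that have expired before a given date.
# Usage: expired_exceptions <dir> [YYYY-MM-DD]   (defaults to today, UTC)
expired_exceptions() {
  dir=$1
  today=${2:-$(date -u +%Y-%m-%d)}
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    expires=$(sed -n 's/^expires:[[:space:]]*//p' "$f" | head -n 1)
    case $expires in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) echo "$f: missing or malformed expiry"; continue ;;
    esac
    # With the dashes removed, ISO dates compare correctly as integers.
    if [ "$(printf '%s' "$expires" | sed 's/-//g')" -lt "$(printf '%s' "$today" | sed 's/-//g')" ]; then
      echo "$f: expired on $expires"
    fi
  done
}
```

Run against a past date, the same function gives a rough answer to "which exceptions had already lapsed on date X", provided the working tree is checked out at that point in history.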
Why: Every bypass must be logged, justified, and time-boxed. "Emergency" is not a blank check.
Prerequisites:
- Docker & Docker Compose
- make
git clone <repo-url>
cd openshift-devsecops-golden-path
cp .env.example .env
make up
make health
Services:
- Jenkins http://localhost:8080 (admin/admin)
- SonarQube http://localhost:9000 (admin/admin)
- Nexus http://localhost:8081 (admin/admin123)
- Registry localhost:5000
make demo
# Watch at http://localhost:8080/job/demo-service
# Verify image signature
cosign verify --key cosign.pub localhost:5000/demo-service:latest
# View SBOM
cat demo-service/sbom.json | jq '.components | length'
# Check quality gate
curl -s 'http://localhost:9000/api/qualitygates/project_status?projectKey=demo-service'
├── jenkins-shared-library/ # THE CORE: Reusable pipeline steps
│ ├── vars/ # goldenPipeline, qualityGate, securityScan...
│ └── src/org/acme/ # Shared classes
├── gitops/ # Argo CD configuration
│ ├── app-of-apps/ # Bootstrap
│ ├── policies/ # Kubernetes admission policies (Kyverno)
│ └── env/{dev,qa,prod}/ # Environment configs
├── security/ # Governance
│ ├── policies/ # Security baselines
│ └── exceptions/ # Exception workflow + tracking
├── scripts/ # Scanner wrappers with adapters
├── docs/ # Architecture, runbooks
└── demo-service/ # Example application
| Document | Purpose |
|---|---|
| DESIGN_DECISIONS.md | Why we made these choices (and what we rejected) |
| DEMO_RUNBOOK.md | 7-minute demonstration script |
| ARCHITECTURE.md | Technical deep-dive |
| QUALITY_GATES.md | Gate configuration and thresholds |
| ONBOARDING.md | Adding a new service (5 minutes) |
| SECURITY_SETUP.md | Jenkins credentials and security configuration |
# Swap Semgrep for Fortify
SAST_ADAPTER=fortify ./scripts/scanners/run-sast.sh
# Swap Trivy for Checkmarx SCA
SCA_ADAPTER=checkmarx ./scripts/scanners/run-sca.sh
Adapters normalize output. Pipeline logic doesn't change.
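One way a wrapper such as run-sast.sh could select its adapter is a lookup that fails closed on unknown names. A minimal sketch; resolve_adapter and the adapters/ directory layout are hypothetical and may not match the repository:

```shell
# Map a scanner kind and adapter name to an adapter script path.
# Unknown combinations return non-zero so the pipeline cannot run unscanned.
resolve_adapter() {
  kind=$1    # "sast" or "sca"
  name=$2    # for example the value of SAST_ADAPTER or SCA_ADAPTER
  case "$kind:$name" in
    sast:semgrep|sast:fortify|sca:trivy|sca:checkmarx)
      echo "scripts/scanners/adapters/$kind-$name.sh" ;;
    *)
      echo "unknown $kind adapter: $name" >&2
      return 1 ;;
  esac
}

# Example: default to Semgrep when SAST_ADAPTER is unset.
# adapter=$(resolve_adapter sast "${SAST_ADAPTER:-semgrep}") || exit 1
```

Because the pipeline only sees the path of an adapter script, swapping scanners stays a configuration change.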
The GitOps structure supports multiple clusters:
gitops/env/
├── dev/
├── qa/
├── prod-us-east/
├── prod-eu-west/
└── prod-ap-south/
Same artifact, region-specific configuration.
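The "same artifact" rule across regions can be sketched as the file edit a promotion PR might contain: one digest written into every prod-* directory. The image-digest.txt file name and the promote_to_prod_regions helper are hypothetical; a real repository would more likely patch a Kustomize or Helm values file:

```shell
# Write one image digest into every prod-* directory under the env root.
# Usage: promote_to_prod_regions gitops/env sha256:<digest>
promote_to_prod_regions() {
  env_root=$1
  digest=$2
  case $digest in
    sha256:*) ;;
    *) echo "refusing to promote a non-digest reference: $digest" >&2
       return 1 ;;
  esac
  for region_dir in "$env_root"/prod-*/; do
    [ -d "$region_dir" ] || continue
    printf '%s\n' "$digest" > "${region_dir}image-digest.txt"
    echo "updated ${region_dir}image-digest.txt"
  done
}
```

The dev and qa directories are left untouched, and region-specific configuration lives in other files that this edit does not modify.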
For environments without internet access:
- Use keyed Cosign signing (not keyless)
- Mirror Trivy DB internally
- Self-host SonarQube rules
MIT License. See LICENSE.