25 Jan 21:23

f42a6c2

v0.4.0: GPU Infrastructure Generalization & Python Bindings

Highlights

This release extracts more than 7,000 lines of proven GPU infrastructure from RustGraph into RingKernel, making these capabilities available to all RingKernel users.

New: Python Bindings (`ringkernel-python`)

PyO3-based Python wrapper with full async/await support:

import ringkernel
import asyncio

async def main():
    runtime = await ringkernel.RingKernel.create(backend="cpu")
    kernel = await runtime.launch("processor", ringkernel.LaunchOptions())
    await kernel.terminate()
    await runtime.shutdown()

asyncio.run(main())

Features:

Async/await with sync fallbacks
HLC timestamps and K2K messaging
CUDA device enumeration and GPU memory pool management
Benchmark framework with regression detection
Hybrid CPU/GPU dispatcher with adaptive thresholds
Resource guard for memory limit enforcement
Type stubs for IDE support

New: PTX Compilation Cache

Disk-based PTX caching for faster kernel loading with SHA-256 content hashing and compute capability awareness.
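As a rough illustration of content hashing combined with compute capability awareness, the sketch below derives a cache file name from the kernel source and the target architecture. The function names, the `sm_XY` prefix, and the `.ptx` file layout are assumptions made for this example, not RingKernel's actual scheme.

```python
import hashlib
from pathlib import Path


def ptx_cache_key(cuda_source: str, compute_capability: tuple[int, int]) -> str:
    """Hypothetical cache key: SHA-256 over the target architecture and the source text."""
    major, minor = compute_capability
    hasher = hashlib.sha256()
    # Mixing the architecture into the hash keeps PTX built for one GPU
    # generation from being reused on another.
    hasher.update(f"sm_{major}{minor}\n".encode("utf-8"))
    hasher.update(cuda_source.encode("utf-8"))
    return hasher.hexdigest()


def ptx_cache_path(cache_dir: Path, cuda_source: str,
                   compute_capability: tuple[int, int]) -> Path:
    """Hypothetical on-disk location for the compiled PTX of this source."""
    return cache_dir / f"{ptx_cache_key(cuda_source, compute_capability)}.ptx"
```

Because the key depends only on content and architecture, an unchanged kernel maps to the same file across runs, while any edit to the source produces a new entry.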

New: GPU Stratified Memory Pool

Size-stratified GPU VRAM pool with 6 size classes (256B-256KB), O(1) allocation from free lists.
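The idea of size classes with per-class free lists can be sketched as follows. The release notes give only the count (6) and the range (256B-256KB); the factor-of-four spacing between classes, the class and method names, and the use of integer offsets in place of real device pointers are all assumptions of this sketch.

```python
# Assumed spacing: each class is 4x the previous one, which yields exactly
# six classes from 256 B to 256 KB: 256, 1024, 4096, 16384, 65536, 262144.
SIZE_CLASSES = [256 << (2 * i) for i in range(6)]


class StratifiedPool:
    """Toy size-stratified pool; offsets stand in for device pointers."""

    def __init__(self):
        self.free_lists = {size: [] for size in SIZE_CLASSES}
        self.next_offset = 0  # stand-in for a fresh device allocation

    def size_class(self, nbytes):
        """Smallest class that fits the request, or None if it is too large."""
        for size in SIZE_CLASSES:
            if nbytes <= size:
                return size
        return None

    def allocate(self, nbytes):
        size = self.size_class(nbytes)
        if size is None:
            raise ValueError("request exceeds the largest size class")
        free = self.free_lists[size]
        if free:
            # O(1): pop a previously released block of this class.
            return size, free.pop()
        offset = self.next_offset
        self.next_offset += size
        return size, offset

    def release(self, size, offset):
        # O(1): push the block back onto its class's free list.
        self.free_lists[size].append(offset)
```

Rounding every request up to a class trades some internal fragmentation for constant-time reuse, since a released block can serve any later request in the same class.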

New: Multi-Stream Execution Manager

Multi-stream CUDA execution for compute/transfer overlap with event-based synchronization.

New: Benchmark Framework

Comprehensive benchmarking with regression detection, baseline comparison, and multiple report formats (Markdown, JSON, LaTeX).
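Regression detection against a baseline can be as simple as comparing per-benchmark timings with a tolerance. The sketch below is a minimal version of that idea; the function name, the dictionary input format, and the 5% default tolerance are assumptions, and the real framework may use a different statistic.

```python
def detect_regressions(baseline, current, tolerance=0.05):
    """Return {name: relative_slowdown} for benchmarks slower than baseline.

    `baseline` and `current` map benchmark names to mean run times in seconds.
    A benchmark counts as a regression when its time grew by more than
    `tolerance` (a fraction, so 0.05 means 5%).
    """
    regressions = {}
    for name, base_time in baseline.items():
        cur_time = current.get(name)
        if cur_time is None:
            continue  # benchmark was not run this time; nothing to compare
        change = (cur_time - base_time) / base_time
        if change > tolerance:
            regressions[name] = change
    return regressions
```

For example, with a baseline of 1.0 s and a current run of 1.2 s, the benchmark is reported with a relative slowdown of about 0.2.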

New: Hybrid CPU-GPU Dispatcher

Intelligent workload routing with adaptive threshold learning between CPU and GPU execution.
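One way to picture adaptive threshold learning is a single size cutoff that drifts toward whichever backend actually ran faster. The sketch below is a simple heuristic written for illustration; the class name, the multiplicative step, and the update rule are assumptions and are not taken from RingKernel.

```python
class AdaptiveDispatcher:
    """Toy router: workloads at or above `threshold` elements go to the GPU."""

    def __init__(self, threshold=10_000, step=0.1):
        self.threshold = threshold
        self.step = step  # fraction by which the threshold moves per update

    def choose(self, n):
        return "gpu" if n >= self.threshold else "cpu"

    def record(self, n, cpu_seconds, gpu_seconds):
        """Adjust the threshold after timing both backends on a size-n workload."""
        if gpu_seconds < cpu_seconds and n < self.threshold:
            # The GPU won on a workload we would have sent to the CPU: lower the bar.
            self.threshold = max(1, int(self.threshold * (1 - self.step)))
        elif cpu_seconds < gpu_seconds and n >= self.threshold:
            # The CPU won on a workload we would have sent to the GPU: raise the bar.
            self.threshold = int(self.threshold * (1 + self.step)) + 1
```

The threshold only moves when a measurement contradicts the current routing decision, so consistent results leave it unchanged.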

New: Resource Guard

Memory limit enforcement with safety margins and RAII reservation patterns.
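The combination of a safety margin and scoped reservations can be sketched with a context manager, which is the closest Python analogue to RAII. The class name, the 10% default margin, and the use of `MemoryError` are assumptions of this example rather than the actual Resource Guard API.

```python
from contextlib import contextmanager


class ResourceGuard:
    """Toy memory guard: reservations may not exceed limit minus a safety margin."""

    def __init__(self, limit_bytes, safety_margin=0.1):
        # Hold back a fraction of the limit as headroom.
        self.usable = limit_bytes - int(limit_bytes * safety_margin)
        self.reserved = 0

    @contextmanager
    def reserve(self, nbytes):
        if self.reserved + nbytes > self.usable:
            raise MemoryError(
                f"cannot reserve {nbytes} bytes: "
                f"{self.reserved} of {self.usable} usable bytes already reserved"
            )
        self.reserved += nbytes
        try:
            yield nbytes
        finally:
            # Released on scope exit, even if the body raised.
            self.reserved -= nbytes
```

A caller would wrap the allocation-heavy section in `with guard.reserve(n):` so the reservation is returned automatically when the block ends.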

New: Kernel Mode Selector

Intelligent kernel launch configuration based on workload profile and GPU architecture.

See CHANGELOG.md for full details.

21 Jan 09:54

mivertowski

v0.3.2

0481bf5

v0.3.2: GPU Profiling Infrastructure

What's New

GPU Profiling Infrastructure

CUDA event-based timing and NVTX markers
Memory allocation tracking
Chrome trace export for visualization

Publishing Fixes

Fixed publish script to add User-Agent header for crates.io API
Updated dependency versions across all crates for v0.3.2 publishing
ringkernel-ir, ringkernel-graph, ringkernel-montecarlo now use workspace versions

Crates Published

ringkernel-core, ringkernel-cuda-codegen, ringkernel-wgpu-codegen
ringkernel-derive, ringkernel-cpu, ringkernel-cuda, ringkernel-wgpu, ringkernel-metal
ringkernel-codegen, ringkernel-ecosystem, ringkernel-audio-fft
ringkernel (main crate)

See crates.io/crates/ringkernel for the published crates.

19 Jan 20:16

mivertowski

v0.3.1

e92adeb

v0.3.1: Enterprise Readiness

RingKernel v0.3.1: Enterprise Readiness

This release adds comprehensive enterprise-grade features for production deployments.

🔐 Enterprise Security

Real Cryptography: AES-256-GCM, ChaCha20-Poly1305, Argon2 key derivation
Secrets Management: SecretStore trait with key rotation, caching, and chained stores
K2K Message Encryption: Kernel-to-kernel encryption with forward secrecy
TLS/mTLS Support: Full TLS with rustls, certificate rotation, SNI resolution

🔑 Authentication & Authorization

Authentication Providers: ApiKeyAuth, JwtAuth (RS256/HS256), ChainedAuthProvider
RBAC: Role-based access control with deny-by-default PolicyEvaluator
Multi-tenancy: TenantContext, ResourceQuota, usage tracking

📊 Observability

OpenTelemetry: OTLP export to Jaeger, Honeycomb, Datadog, Grafana Cloud
Structured Logging: Multi-sink logger with trace correlation (JSON/Text)
Alert Routing: Severity-based routing with deduplication (Slack, Teams, PagerDuty)
Remote Audit Sinks: Syslog, CloudWatch Logs, Elasticsearch

⚡ Rate Limiting

Algorithms: TokenBucket, SlidingWindow, LeakyBucket
Builder API: Fluent configuration with RateLimiterBuilder
Distributed: SharedRateLimiter for multi-instance deployments
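For readers unfamiliar with the token bucket algorithm named above, here is a minimal sketch of the technique. It is a generic textbook version with explicit timestamps for determinism; the class and method names are illustrative and do not reflect the ringkernel-core API.

```python
class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/second up to `burst` tokens."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = now

    def try_acquire(self, now, cost=1.0):
        """Return True and consume `cost` tokens if enough are available at time `now`."""
        elapsed = now - self.last
        # Refill for the elapsed time, but never beyond the burst capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In this model `rate` bounds the sustained throughput and `burst` bounds how many requests can pass back to back after an idle period.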

🔧 Operational Excellence

Automatic Recovery: Configurable policies per failure type (Restart, Migrate, Checkpoint, Notify, Escalate, Circuit)
Operation Timeouts: Deadline propagation with Timeout and Deadline types
Recovery Manager: Retry tracking, cooldown periods, automatic escalation

📦 Feature Flags

[dependencies]
ringkernel-core = { version = "0.3.1", features = ["enterprise"] }

# Or select specific features:
ringkernel-core = { version = "0.3.1", features = ["crypto", "auth", "tls", "rate-limiting", "alerting"] }

📈 Metrics

Test Coverage: 900+ tests (up from 825+)
Crates Published: 21 crates to crates.io

🚀 Quick Start

use ringkernel_core::prelude::*;

// Enterprise runtime with production preset
let runtime = RuntimeBuilder::new()
    .production()
    .build()?;

// API key authentication
let auth = ApiKeyAuth::new()
    .add_key("sk-prod-abc123", Identity::new("service-a"));

// Rate limiting
let limiter = RateLimiterBuilder::new()
    .algorithm(RateLimitAlgorithm::TokenBucket)
    .rate(1000)
    .burst(100)
    .build();

Full Changelog

See CHANGELOG.md for complete details.

19 Jan 09:34

mivertowski

v0.3.0

8d81824

v0.3.0: Multi-Kernel Dispatch, Memory Pools, Global Reductions

RingKernel v0.3.0

GPU-native persistent actor model framework for Rust. This release adds multi-kernel dispatch, memory pools, global reduction primitives, and two new crates.

Highlights

21 crates published to crates.io - Full workspace now available
825+ tests across the workspace
cudarc 0.18.2 and wgpu 27.0 support

New Features

Multi-Kernel Dispatch and Persistent Message Routing

#[derive(PersistentMessage)] macro for GPU kernel dispatch
KernelDispatcher component with builder pattern and metrics
CUDA handler dispatch code generator (CudaDispatchTable)
Queue tiering system (QueueTier, QueueFactory, QueueMonitor)

Memory Pool Management

StratifiedMemoryPool with 5 size buckets (256B to 64KB)
AnalyticsContext for grouped buffer lifecycle
PressureHandler for memory pressure monitoring
CUDA ReductionBufferCache and WebGPU StagingBufferPool

Global Reduction Primitives

ReductionOp enum: Sum, Min, Max, And, Or, Xor, Product
ReductionBuffer<T> using mapped memory (zero-copy host read)
Multi-phase kernel execution with SyncMode (Cooperative, SoftwareBarrier, MultiLaunch)
PageRank example with dangling node handling
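To show why PageRank is a natural showcase for a global Sum reduction, the CPU sketch below handles dangling nodes by summing their rank each iteration and redistributing it uniformly. This is the standard formulation written for illustration; the function signature and the adjacency-dictionary input are assumptions and this is not the GPU example shipped with the release.

```python
def pagerank(out_edges, n, damping=0.85, iterations=50):
    """PageRank over nodes 0..n-1; `out_edges` maps a node to its list of targets.

    Nodes with no outgoing edges (dangling nodes) would otherwise leak rank,
    so their total rank is summed and spread evenly over all nodes.
    """
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Global reduction: total rank currently held by dangling nodes.
        dangling = sum(rank[v] for v in range(n) if not out_edges.get(v))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = [base] * n
        for v, targets in out_edges.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank
```

With the dangling mass redistributed, the ranks continue to sum to 1 after every iteration.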

CUDA NVRTC Compilation

compile_ptx() function for runtime CUDA compilation
Downstream crates can compile CUDA without direct cudarc dependency

Domain System

20 business domains with reserved type ID ranges
#[message(domain = "FraudDetection")] attribute
Domains: GraphAnalytics, FraudDetection, ProcessIntelligence, Banking, etc.

New Crates

ringkernel-montecarlo - Philox RNG, antithetic variates, control variates, importance sampling
ringkernel-graph - CSR matrix, BFS, SCC (Tarjan/Kosaraju), Union-Find, SpMV

Breaking Changes

cudarc API updated to 0.18.2 (module loading, kernel launch builder pattern)
wgpu API updated to 27.0 (Arc-based resources)

Installation

[dependencies]
ringkernel = "0.3.0"

# Optional backends
ringkernel-cuda = "0.3.0"
ringkernel-wgpu = "0.3.0"

Documentation

Full Changelog: v0.2.0...v0.3.0

14 Jan 16:48

mivertowski

v0.2.0

210073c

RingKernel v0.2.0

What's Changed

Claude/persistent kernel implementation d nc3 o by @mivertowski in #9

Full Changelog: v0.1.3...v0.2.0

Contributors

mivertowski

17 Dec 14:18

mivertowski

v0.1.3

01edfbf

v0.1.3 - Dependency Updates & CI Fixes

Highlights

wgpu 27.0 - Major update with Arc-based resource tracking (~40% performance improvement in some workloads)
Dependency updates - tokio 1.48, axum 0.8, tonic 0.14, egui 0.31, winit 0.30
CI/CD fixes - Workspace builds without CUDA/nvcc installed

What's Changed

Dependencies Updated

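| Package | From | To |
| --- | --- | --- |
| wgpu | 0.19 | 27.0 |
| tokio | 1.35 | 1.48 |
| thiserror | 1.0 | 2.0 |
| axum | 0.7 | 0.8 |
| tower | 0.4 | 0.5 |
| tonic | 0.11 | 0.14 |
| prost | 0.12 | 0.14 |
| egui/egui-wgpu/egui-winit | 0.27 | 0.31 |
| winit | 0.29 | 0.30 |
| glam | 0.27 | 0.29 |
| metal | 0.27 | 0.31 |
| arrow | 52 | 54 |
| polars | 0.39 | 0.46 |
| rayon | 1.10 | 1.11 |
| actix-rt | 2.9 | 2.10 |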

Deferred Updates

iced: Kept at 0.13 (0.14 requires major application API rewrite)
rkyv: Kept at 0.7 (0.8 has incompatible data format)

CI/CD Improvements

CUDA features are now opt-in (not default)
Workspace builds succeed without nvcc installed
Feature-gated CUDA tests with #[cfg(feature = "cuda")]

See CHANGELOG.md for full details.

11 Dec 09:55

github-actions

v0.1.2

581c539

v0.1.2

Release v0.1.2

- **WaveSim3D** - 3D acoustic wave simulation with realistic physics
  - Full 3D FDTD wave propagation solver
  - Binaural audio rendering with HRTF support
  - Volumetric ray marching visualization
  - GPU-native actor system for distributed simulation

- Expanded GPU intrinsics from ~45 to 120+ operations across 13 categories
- Atomic operations: and, or, xor, inc, dec
- 3D stencil intrinsics: up, down, at(dx, dy, dz)
- Warp match (Volta+) and warp reduce (SM 8.0+) operations
- Bit manipulation, memory, special, and timing ops
- 171 tests (up from 143)

- Added required-features to CUDA-only wavesim binaries
- Updated GitHub Actions release workflow

See CHANGELOG.md for full details.

04 Dec 15:40

mivertowski

v0.1.1

8de6c1a

v0.1.1 - AccNet & ProcInt Showcase Applications

What's New

New Showcase Applications

AccNet - GPU-Accelerated Accounting Network Analytics

Network visualization with force-directed graph layout
Fraud detection: circular flows, threshold clustering, Benford's Law violations
GAAP compliance checking for accounting rule violations
Temporal analysis for seasonality, trends, and behavioral anomalies
GPU kernels: Suspense detection, GAAP violation, Benford analysis, PageRank
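The Benford's Law check mentioned above compares the leading digits of amounts against the expected distribution P(d) = log10(1 + 1/d). The CPU sketch below computes a chi-square statistic for that comparison; the function names are illustrative and this is not the AccNet GPU kernel.

```python
import math
from collections import Counter


def benford_expected(d):
    """Expected probability of leading digit d (1..9) under Benford's Law."""
    return math.log10(1 + 1 / d)


def first_digit(x):
    """Leading significant digit of a nonzero number."""
    # The first nonzero digit character in the textual form is the leading digit.
    return int(next(ch for ch in str(abs(x)) if ch in "123456789"))


def benford_chi_square(amounts):
    """Chi-square statistic of observed leading digits against Benford's Law."""
    digits = [first_digit(a) for a in amounts if a != 0]
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_expected(d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat
```

A large statistic suggests the leading digits deviate from the Benford distribution; with 8 degrees of freedom, values above roughly 15.5 are significant at the 5% level.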

ProcInt - GPU-Accelerated Process Intelligence

DFG (Directly-Follows Graph) mining from event streams
Pattern detection: bottlenecks, loops, rework, long-running activities
Conformance checking with fitness and precision metrics
Timeline view with partial order traces and concurrent activity visualization
Multi-sector templates: Healthcare, Manufacturing, Finance, IT
GPU kernels: DFG construction, pattern detection, partial order derivation, conformance checking
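A Directly-Follows Graph counts how often one activity is immediately followed by another within the same case. The sketch below builds one from an event log on the CPU; the `(case_id, timestamp, activity)` tuple format and the function name are assumptions of this example, not the ProcInt data model.

```python
from collections import Counter, defaultdict


def mine_dfg(events):
    """Build a DFG from (case_id, timestamp, activity) events.

    Returns a Counter mapping (activity_a, activity_b) to the number of times
    b directly follows a within a single case.
    """
    traces = defaultdict(list)
    for case_id, timestamp, activity in events:
        traces[case_id].append((timestamp, activity))

    dfg = Counter()
    for trace in traces.values():
        trace.sort(key=lambda event: event[0])  # order each case by time
        for (_, a), (_, b) in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg
```

Edges with unusually high counts back to an earlier activity are one simple signal for the loop and rework patterns listed above.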

Changes

Updated showcase documentation with AccNet and ProcInt sections
Updated CI workflow to exclude CUDA tests on runners without GPU hardware

Fixes

Fixed 14 clippy warnings in ringkernel-accnet
Fixed benchmark API compatibility in ringkernel-accnet
Fixed code formatting issues across showcase applications

Run the Applications

# AccNet - Accounting Network Analytics
cargo run -p ringkernel-accnet --release

# ProcInt - Process Intelligence
cargo run -p ringkernel-procint --release

Full Changelog: v0.1.0...v0.1.1

03 Dec 16:12

mivertowski

v0.1.0

871e793

RingKernel v0.1.0

RingKernel v0.1.0 - Initial Release

A GPU-native persistent actor model framework for Rust.

Highlights

Persistent GPU Kernels: GPU compute units as long-running actors that maintain state between invocations
Lock-free Message Queues: High-performance host↔GPU and kernel-to-kernel communication
Hybrid Logical Clocks (HLC): Causal ordering across distributed GPU operations
Multiple Backends: CPU, CUDA, WebGPU support
Zero-copy Serialization: rkyv-based message passing
Rust-to-GPU Transpilers: Write GPU kernels in Rust DSL, transpile to CUDA C or WGSL
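For background on the Hybrid Logical Clocks highlighted above, the sketch below follows the commonly published HLC update rules, where a timestamp is a pair of a logical time `l` and a counter `c`. The class and method names are illustrative, physical time is passed in explicitly for determinism, and this is not the ringkernel-core implementation.

```python
class HybridLogicalClock:
    """Generic HLC: timestamps are (l, c) tuples that compare lexicographically."""

    def __init__(self):
        self.l = 0  # highest physical time seen so far
        self.c = 0  # counter that breaks ties when l does not advance

    def tick(self, physical_now):
        """Timestamp a local or send event."""
        prev = self.l
        self.l = max(prev, physical_now)
        self.c = self.c + 1 if self.l == prev else 0
        return (self.l, self.c)

    def update(self, physical_now, remote):
        """Merge a received timestamp so the receive event orders after the send."""
        remote_l, remote_c = remote
        prev = self.l
        self.l = max(prev, remote_l, physical_now)
        if self.l == prev and self.l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif self.l == prev:
            self.c += 1
        elif self.l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0
        return (self.l, self.c)
```

Because a receive always yields a timestamp greater than the one carried by the message, causally related events stay ordered even when the receiver's physical clock lags behind the sender's.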

Crates

| Crate | Description |
| --- | --- |
| ringkernel | Main facade crate |
| ringkernel-core | Core traits, types, HLC, K2K, PubSub |
| ringkernel-derive | Proc macros (`#[derive(RingMessage)]`, `#[ring_kernel]`) |
| ringkernel-cpu | CPU backend |
| ringkernel-cuda | NVIDIA CUDA backend |
| ringkernel-wgpu | WebGPU backend |
| ringkernel-cuda-codegen | Rust-to-CUDA transpiler |
| ringkernel-wgpu-codegen | Rust-to-WGSL transpiler |
| ringkernel-wavesim | Wave simulation demo |
| ringkernel-txmon | Transaction monitoring demo |

Quick Start

[dependencies]
ringkernel = "0.1"
tokio = { version = "1", features = ["full"] }

For GPU backends:

ringkernel = { version = "0.1", features = ["cuda"] }
# or
ringkernel = { version = "0.1", features = ["wgpu"] }

Documentation

API Docs: https://docs.rs/ringkernel
Guides: https://mivertowski.github.io/RustCompute/

Performance

Benchmarked on NVIDIA RTX Ada:

CUDA Codegen: ~93B elem/sec (12,378x vs CPU)
Message queue throughput: ~75M ops/sec
HLC timestamp generation: <10ns per tick

What's Included

14 workspace crates
390+ tests
20+ examples
Comprehensive documentation
Educational simulation modes (WaveSim)
Real-time fraud detection demo (TxMon)
