A tool for structural analysis of institutional Excel‑based financial models.
The application ingests Excel workbooks as Directed Acyclic Graphs (DAGs), mapping the flow of data from assumptions to valuation and highlighting structural issues in the model.
```bash
pip install -r requirements.txt
streamlit run app.py
# Upload sample_models/BOBWEIR_Model.xlsx to test
```

To enable the optional LLM summaries:

```bash
pip install anthropic python-dotenv   # or: pip install openai python-dotenv
cp .env.example .env
# Edit .env and add your API key
```
- **Dual‑state ingestion**: Loads each model in two parallel states: values (for numerical checks) and formulas (for dependency tracing).
- **Dependency graph engine**: Uses `networkx` to represent each cell as a node, enabling detection of circular references and disconnected calculation chains. A minimal sketch of both pieces follows this list.
- **Audit checks**:
  - Hard‑coded entries in forecast periods (e.g., constant growth rates in formula rows).
  - Balance sheet consistency: $\text{Assets} - (\text{Liabilities} + \text{Equity}) = 0$.
  - References to missing or external files (e.g., `C:/Users/Analyst/Desktop/Budget.xlsx`).
- **Reporting artifacts**: Produces a PDF memo and an Excel “datatape” summarizing identified issues.
- **Complexity scoring**: Assigns a 1–5 complexity score using graph topology and formula density.
- **LLM‑assisted summaries (optional)**: Uses Claude or GPT‑4 to generate narrative summaries of findings, with prompts constrained to exclude investment advice.
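As a rough illustration of how dual‑state loading and graph construction could fit together (the actual implementations live in `src/ingestion.py` and `src/dependency.py`; the helper names and the simplified A1‑reference pattern below are assumptions, not the repo's API):

```python
# Illustrative sketch only: real logic lives in src/ingestion.py and src/dependency.py;
# the helper names and the simplified reference regex are assumptions.
import re

import networkx as nx
from openpyxl import load_workbook


def load_dual_state(path):
    """Load the same workbook twice: cached values and raw formula strings."""
    values = load_workbook(path, data_only=True)     # numbers as last calculated by Excel
    formulas = load_workbook(path, data_only=False)  # formulas kept as text, e.g. "=B7*C7"
    return values, formulas


# Very rough pattern for same-sheet and cross-sheet A1 references (e.g. B7 or IS!C12).
REF_RE = re.compile(r"(?:(?P<sheet>[A-Za-z0-9_]+)!)?(?P<cell>\$?[A-Z]{1,3}\$?\d+)")


def build_dependency_graph(formulas_wb):
    """One node per cell, one edge per precedent reference found in a formula."""
    graph = nx.DiGraph()
    for ws in formulas_wb.worksheets:
        for row in ws.iter_rows():
            for cell in row:
                if isinstance(cell.value, str) and cell.value.startswith("="):
                    node = f"{ws.title}!{cell.coordinate}"
                    for m in REF_RE.finditer(cell.value):
                        sheet = m.group("sheet") or ws.title
                        ref = f"{sheet}!{m.group('cell').replace('$', '')}"
                        graph.add_edge(ref, node)  # edge points precedent -> dependent
    return graph


# values_wb, formulas_wb = load_dual_state("sample_models/BOBWEIR_Model.xlsx")
# cycles = list(nx.simple_cycles(build_dependency_graph(formulas_wb)))  # circular references
```

On this representation, cycles in the directed graph correspond to circular references, and calculation chains that never reach the valuation sheets show up as disconnected components.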
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/excel-model-eval.git
  cd excel-model-eval
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Launch the dashboard:

  ```bash
  streamlit run app.py
  ```

- Upload a model:
  - Supports `.xlsx` and `.xlsm` for formula analysis.
  - CSV files are supported for value inspection only.
  - Example: upload `sample_models/BOBWEIR_Model.xlsx` for a preconfigured test case.

- Review outputs:
  - Inspect the complexity score and identified issues in the UI.
  - Download the PDF memo and Excel datatape from the sidebar.
The repository includes a sample financial model for testing: `sample_models/BOBWEIR_Model.xlsx`.
| Sheet | Contents |
|---|---|
| Cover | Company overview, product list |
| Assumptions | Growth, margins, working capital drivers |
| Revenue | Revenue build for six products |
| IS | Income statement with linked formulas |
| BS | Balance sheet with working capital calculations |
| CF | Cash flow statement |
| DCF | DCF schedule and terminal value |
Company profile (synthetic example):
- Specialty pharma focused on rare diseases and oncology
- Revenue split across six products (Neurex, Oncovir, Hemaguard, Cardioshield, Dermaclear, Respiron)
- Projection period: 2024E–2028E
The model includes intentional issues for testing:
- A hard‑coded revenue plug in the projection period (Revenue sheet, Neurex 2025E); see the detection sketch after this list.
- Full formula linkages for dependency tracing.
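One way such a plug could be surfaced, as a hedged sketch: flag numeric constants in rows that are otherwise formula‑driven. The production check lives in `src/auditor.py` and may differ; the row‑layout assumption and threshold below are illustrative.

```python
# Illustrative heuristic only; the production check lives in src/auditor.py and may differ.
from openpyxl import load_workbook


def find_hardcoded_plugs(path, sheet_name="Revenue", min_formula_share=0.5):
    """Flag numeric constants sitting in rows that are otherwise formula-driven."""
    ws = load_workbook(path, data_only=False)[sheet_name]
    issues = []
    for row in ws.iter_rows(min_row=2):  # skip the header row (layout assumption)
        cells = [c for c in row if c.value is not None]
        if not cells:
            continue
        formulas = [c for c in cells if isinstance(c.value, str) and c.value.startswith("=")]
        constants = [c for c in cells if isinstance(c.value, (int, float))]
        if len(formulas) / len(cells) >= min_formula_share:
            for c in constants:
                issues.append(f"{sheet_name}!{c.coordinate}: hard-coded value {c.value!r} in a formula-driven row")
    return issues


# find_hardcoded_plugs("sample_models/BOBWEIR_Model.xlsx") should surface the Neurex 2025E plug.
```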
To regenerate the sample:

```bash
python scripts/create_sample_model.py
```

Project layout:

```
excel-model-eval/
├── src/ # Core audit engine
│ ├── ingestion.py # Dual-state loading
│ ├── dependency.py # Graph construction and analysis
│ ├── auditor.py # Heuristic checks and issue catalog
│ ├── reporting.py # PDF/Excel report generation and scoring
│ └── llm_analyzer.py # Optional LLM integration
├── eval/ # LLM evaluation framework
│ └── llm_rubrics/ # YAML rubrics for grading LLM outputs
├── human_review/ # Human-in-the-loop materials
│ ├── reviewer_guidelines.md
│ └── sample_reviews/
├── trainer_tasks/ # Evaluation exercises (AI trainer-style tasks)
├── docs/
│ ├── failure_modes.md # Catalog of LLM failure patterns
│ └── transferability.md # Notes on cross-domain use
├── sample_models/
│ └── BOBWEIR_Model.xlsx
├── scripts/
│ └── create_sample_model.py
├── app.py # Streamlit frontend
├── .env.example # Template for API keys
├── data/ # User models (gitignored)
├── RESULTS/ # Generated reports (gitignored)
└── requirements.txt
```
The optional LLM module (`src/llm_analyzer.py`) adds a structured analysis layer on top of the deterministic audit engine:

```
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│   Audit Engine   │ --> │   LLM Analyzer   │ --> │   Human Review   │
│  (Deterministic) │     │   (Narrative)    │     │  (Final Review)  │
└──────────────────┘     └──────────────────┘     └──────────────────┘
      CONTROL                 REASONING                 DECISION
```
- Audit engine: graph analysis, issue detection, severity calculations.
- LLM analyzer: text summaries, grouping and prioritization of issues, explanations.
- Human review: verification of claims and decisions about any downstream actions.
The LLM operates under explicit scope constraints:
- Allowed: explain findings, order issues by materiality, suggest remediation steps, express uncertainty, reference specific cells.
- Disallowed: investment recommendations, valuation opinions, price targets, data invention, business strategy advice.
The model is used as an analysis and explanation component; it does not execute changes or interact with external systems.
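A minimal sketch of how this scoping might be enforced at the call site, assuming the Anthropic SDK path (the system prompt wording, model alias, and function name are illustrative; the actual wiring is in `src/llm_analyzer.py`):

```python
# Illustrative only; the actual integration lives in src/llm_analyzer.py.
import json
import os

import anthropic

SYSTEM_PROMPT = (
    "You summarize findings from a deterministic Excel model audit. "
    "You may explain issues, rank them by materiality, suggest remediation steps, "
    "reference specific cells, and express uncertainty. "
    "You must not give investment recommendations, valuation opinions, price targets, "
    "business strategy advice, or invent data not present in the findings."
)


def summarize_findings(findings: list[dict]) -> str:
    """Turn the audit engine's structured issue list into a narrative summary."""
    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # illustrative model alias
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": json.dumps(findings, indent=2)}],
    )
    return response.content[0].text
```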
| Principle | Implementation in this repo |
|---|---|
| Separate reasoning from control | LLM produces text analyses; code performs audits and report generation |
| Constrained guidance | Prompts enforce scope limits and evidence referencing |
| Evaluation of non‑numeric outputs | YAML rubrics and human review guidelines |
| Failure‑mode awareness | Documented patterns and targeted test cases |
| Preference for interpretability | Graph‑based checks and explicit evidence paths |
- **Rubrics** (`eval/llm_rubrics/`); a sketch of how one might be applied follows this list.
  - `strategy_quality.yaml`: plausibility and proportionality of proposed actions.
  - `reasoning_fidelity.yaml`: alignment between evidence and conclusions, uncertainty handling.
  - `safety_and_scope.yaml`: scope adherence and hallucination risk.
- **Human review** (`human_review/`)
  - Reviewer guidelines defining process and scoring criteria.
  - Sample reviews illustrating “good”, “borderline”, and “failed” outputs.
- **Failure modes** (`docs/failure_modes.md`)
  - Examples: narrative overfitting, regime anchoring, unwarranted confidence.
  - Detection strategies and mitigation options.
- **Trainer tasks** (`trainer_tasks/`)
  - Exercises where a reviewer grades outputs, identifies failure modes, and proposes prompt or policy changes (suitable as AI trainer / RLHF evaluation tasks).
The same architecture (deterministic core + scoped LLM layer + human review + rubrics) can be adapted to other domains such as compliance, healthcare, cybersecurity, operations, and policy analysis, as discussed in `docs/transferability.md`.