fix: use category name instead of uuid generated by llm (CM-889) by ulemons · Pull Request #3822 · CrowdDotDev/crowd.dev

ulemons · 2026-02-03T15:28:40Z

PR: Stricter prompt + double validation for category UUIDs

Summary

This PR hardens LLM category selection by:

- Increasing prompt rigidity to reduce category/id drift.
- Applying a double validation step to ensure the returned UUID is both well-formed and present in the DB.
- Falling back to DB-backed resolution using the category name when the UUID is missing/invalid/not found.

What changed

Prompt
- Reinforced “closed set” constraints: categories must be copied verbatim from the authoritative JSON list.
- Added stricter self-check requirements to minimize mismatches.
Post-processing / Validation
- Step 1: Accept category only if id matches UUID format and exists in categories (DB list).
- Step 2: If id fails (missing/invalid/not in DB), try resolving by name (case-insensitive) and use the DB id.
- Step 3: If neither id nor name resolves to a DB category, skip the entry and warn.
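
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `activities.ts`: the `DbCategory` and `LlmCategory` shapes, the `resolveCategories` helper, the `warn` callback, and the UUID regex are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real types used by findCategoriesWithLLM may differ.
interface DbCategory {
  id: string
  name: string
}

interface LlmCategory {
  id?: string | null
  name?: string | null
}

// Generic 8-4-4-4-12 hex UUID shape (assumed; the PR may use a stricter check).
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

function resolveCategories(
  llmCategories: LlmCategory[] | null | undefined,
  dbCategories: DbCategory[],
  warn: (msg: string) => void = (msg) => console.warn(msg),
): DbCategory[] {
  // A null/undefined LLM response yields no categories.
  if (!llmCategories) return []

  // Index the authoritative DB list by id and by lower-cased name.
  const byId = new Map<string, DbCategory>()
  const byName = new Map<string, DbCategory>()
  for (const c of dbCategories) {
    byId.set(c.id.toLowerCase(), c)
    byName.set(c.name.trim().toLowerCase(), c)
  }

  const resolved: DbCategory[] = []
  for (const entry of llmCategories) {
    const id = (entry.id ?? '').trim().toLowerCase()
    const name = (entry.name ?? '').trim().toLowerCase()

    // Step 1: accept the id only if it is well-formed AND present in the DB list.
    let match = UUID_RE.test(id) ? byId.get(id) : undefined

    // Step 2: otherwise resolve by case-insensitive name and use the DB id.
    if (!match && name) match = byName.get(name)

    // Step 3: neither id nor name resolves, so skip the entry and warn.
    if (!match) {
      warn(`Skipping unknown category: id=${entry.id ?? 'n/a'} name=${entry.name ?? 'n/a'}`)
      continue
    }
    resolved.push(match)
  }
  return resolved
}
```

Returning the DB row rather than the LLM entry means both the id and the name that get stored come from the database, so a correct name paired with a wrong UUID is corrected rather than dropped.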

Why

LLM outputs can return:

- Correct names but wrong UUIDs
- Correct UUID format that doesn’t exist in DB
- New/modified category names or invented IDs

The validation and name fallback ensure that every category ID the worker keeps exists in the DB, which prevents silent data corruption and keeps categorization within the authoritative category set.

Note

Medium Risk
Moderate risk because it changes production categorization behavior and introduces filtering/correction logic that could drop or alter LLM-selected categories if matching fails.

Overview
Tightens the LLM category-classification flow by switching the category list in the prompt from grouped text to an authoritative closed-set JSON array and adding strict output constraints/self-check instructions to reduce name/id drift.
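
As an illustration of the closed-set idea, the prompt section could be built along these lines. This is a sketch only: the `buildCategoryPromptSection` helper, the `DbCategory` shape, and the instruction wording are assumptions for the example and are not the prompt text used in this PR.

```typescript
// Hypothetical category shape; the real DB rows may carry more fields.
interface DbCategory {
  id: string
  name: string
}

// Builds a prompt fragment that embeds the categories as a JSON array
// (instead of grouped free text) followed by closed-set rules.
function buildCategoryPromptSection(categories: DbCategory[]): string {
  // Keep only the fields the model must copy back verbatim.
  const closedSet = JSON.stringify(
    categories.map((c) => ({ id: c.id, name: c.name })),
    null,
    2,
  )

  return [
    'AUTHORITATIVE CATEGORY LIST (closed set, JSON):',
    closedSet,
    'Rules:',
    '- Choose only from the list above; copy "id" and "name" verbatim.',
    '- Do not invent, rename, or modify categories or ids.',
    '- Before answering, check that every id/name pair you return appears in the list exactly.',
  ].join('\n')
}
```

Even with a prompt like this, the model's output is still untrusted, which is why the PR pairs the stricter prompt with DB-side validation.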

Adds post-processing in findCategoriesWithLLM that handles null LLM responses and keeps returned category IDs DB-consistent. An ID is accepted only if it is a valid UUID and exists in the fetched category list; otherwise the category is resolved by case-insensitive name and its ID is replaced with the DB UUID. Categories that match neither by ID nor by name are skipped with a warning.

Written by Cursor Bugbot for commit 0f8d57e. This will update automatically on new commits.

services/apps/categorization_worker/src/activities/activities.ts

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

services/apps/categorization_worker/src/activities/activities.ts

fix: use category name instead of uuid generated by llm

b0d56a6

ulemons self-assigned this Feb 3, 2026

ulemons added the Bug Created by Linear-GitHub Sync label Feb 3, 2026

ulemons changed the title ~~fix: use category name instead of uuid generated by llm~~ fix: use category name instead of uuid generated by llm (CM-889) Feb 3, 2026

ulemons added 5 commits February 3, 2026 16:56

fix: improve prompt

80ffb3a

fix: improve prompt

fe691e8

fix: using data from db

ab024d3

fix: print after filter

4ccc201

fix: lint

4db030e

ulemons requested a review from joanagmaia February 3, 2026 16:56

ulemons marked this pull request as ready for review February 3, 2026 16:56

cursor bot reviewed Feb 3, 2026


fix: cursor comments

0f8d57e

cursor bot reviewed Feb 4, 2026


services/apps/categorization_worker/src/activities/activities.ts (resolved)

services/apps/categorization_worker/src/activities/activities.ts (resolved)

joanagmaia approved these changes Feb 4, 2026


ulemons wants to merge 7 commits into main from feat/categorization-issue
