Changelog, Feb 10, 2026

v0.4.0 — Sentinel Evaluation Engine, New Models, Workflow Pipelines

Automated quality scoring is live. Plus: Flux.2 Klein support, multi-step workflow builder, and 3 new benchmark categories.

Miguel Rasero

CTO & Co-Founder

v0.4.0 is our biggest release yet. The headline feature is Sentinel, our automated model evaluation engine, but there is a lot more packed into this one. Here is the full breakdown.

Sentinel Evaluation Engine

Every output generated through Runflow is now automatically scored across three dimensions: FID (distributional similarity), CLIP (prompt alignment), and human eval calibration. Scores are weighted per niche, so a corporate headshot is evaluated differently than a creative portrait or a product photo.

This powers our benchmark tables and gives customers real-time quality metrics in the dashboard. Read the full technical deep-dive in Building Sentinel: Our Automated Model Evaluation System.
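To make the per-niche weighting concrete, here is a minimal sketch of how three component scores could be combined into one number. It is not Sentinel's actual implementation: the weight values, the `NICHE_WEIGHTS` table, the `weightedScore` function, and the assumption that every component is already normalized to a 0-100, higher-is-better scale are all hypothetical.

```javascript
// Hypothetical per-niche weights (assumption: not the real Sentinel values).
const NICHE_WEIGHTS = {
  "corporate-headshot": { fid: 5, clip: 2, human: 3 },
  "creative-portrait": { fid: 2, clip: 5, human: 3 },
};

// Combine three component scores into one weighted average for a niche.
// Assumes each component is normalized to 0-100 with higher meaning better.
function weightedScore(niche, components) {
  const weights = NICHE_WEIGHTS[niche];
  if (!weights) throw new Error(`Unknown niche: ${niche}`);
  const totalWeight = weights.fid + weights.clip + weights.human;
  const weightedSum =
    weights.fid * components.fid +
    weights.clip * components.clip +
    weights.human * components.human;
  return weightedSum / totalWeight;
}

// The same raw components produce different scores under different niches.
const components = { fid: 80, clip: 60, human: 70 };
console.log(weightedScore("corporate-headshot", components)); // 73
console.log(weightedScore("creative-portrait", components)); // 67
```

In this sketch the same output ranks higher as a corporate headshot than as a creative portrait, purely because the niche changes which dimension counts most.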

New Models

Flux.2 Klein — A lightweight variant of Flux.2 optimized for speed. 2x faster than Flux.2 [schnell] with only a 3-point quality tradeoff. Ideal for real-time preview generation and interactive applications.
SDXL Lightning v2 — Updated 4-step distilled model with improved face coherence. Scores 3 points higher than v1 on our portrait benchmark while maintaining the same latency profile.

Workflow Pipelines

You can now chain multiple operations into a single API call. Define a pipeline that generates an image, scores it, and enhances it, all in one request with a single webhook callback.

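// Assumes `runflow` is an initialized SDK client and `prompt` is your prompt string.
// Steps run in the order listed; one webhook callback fires for the whole pipeline.
const result = await runflow.workflow({
  steps: [
    { action: "generate", model: "flux.2-dev", prompt },  // generate the image
    { action: "score", niche: "corporate-headshot" },     // score it with Sentinel
    { action: "enhance", model: "real-esrgan-x4" }        // enhance the result
  ]
});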
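Conceptually, a pipeline like this threads each step's output into the next step and reports back once at the end. The sketch below illustrates that idea locally; it is not how Runflow executes workflows. The `handlers` table, the `runPipeline` function, the payload shape, and the placeholder return values are all hypothetical stand-ins for real services.

```javascript
// Hypothetical step handlers standing in for real generate/score/enhance services.
const handlers = {
  generate: async (step, ctx) => ({ ...ctx, image: `image-for:${step.prompt}` }),
  score: async (step, ctx) => ({ ...ctx, score: { niche: step.niche, value: 0 } }), // placeholder value
  enhance: async (step, ctx) => ({ ...ctx, image: `${ctx.image}+${step.model}` }),
};

// Run steps in order, passing each step's output context to the next one,
// then invoke the callback exactly once with either the result or the error.
async function runPipeline(steps, onComplete) {
  let ctx = {};
  try {
    for (const step of steps) {
      const handler = handlers[step.action];
      if (!handler) throw new Error(`Unknown action: ${step.action}`);
      ctx = await handler(step, ctx);
    }
  } catch (error) {
    await onComplete({ status: "failed", error: error.message });
    return;
  }
  await onComplete({ status: "succeeded", result: ctx });
}

// Usage: three steps, one completion callback.
runPipeline(
  [
    { action: "generate", model: "flux.2-dev", prompt: "studio headshot" },
    { action: "score", niche: "corporate-headshot" },
    { action: "enhance", model: "real-esrgan-x4" },
  ],
  (payload) => console.log(payload.status)
);
```

Reporting once at the end, rather than per step, means the consumer only has to handle two outcomes: the whole pipeline succeeded, or it failed at some step.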

New Benchmark Categories

Sentinel now tracks three additional niche categories:

E-commerce product shots — Scoring optimized for white-background product photography, edge clarity, and color accuracy
Creative portraits — Artistic style generation with emphasis on aesthetic quality and adherence to creative prompts
Virtual try-on — Clothing overlay accuracy, body proportion preservation, and garment texture realism

Bug Fixes

Fixed an issue where webhook callbacks would occasionally fire before the image was fully uploaded to CDN, resulting in 404s on the image URL
Resolved a race condition in the async job queue that could cause duplicate processing of the same request under high concurrency
Fixed EXIF orientation handling for reference images uploaded from iOS Safari, which were being rotated incorrectly
Corrected timeout handling for long-running SDXL inpainting jobs that exceeded the default 30-second window
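For readers curious about the duplicate-processing class of bug, one common guard is to deduplicate in-flight work by request ID. The sketch below shows that general technique only; it is not the fix shipped in this release. The `processOnce` function and `inFlight` map are hypothetical, and the approach assumes a single Node.js process (a distributed queue would need a shared store or lock instead).

```javascript
// Jobs currently being processed, keyed by request ID (in-memory, single process).
const inFlight = new Map();

// Run `work` at most once per request ID while that job is in flight.
// Concurrent callers with the same ID receive the same promise.
function processOnce(requestId, work) {
  if (inFlight.has(requestId)) return inFlight.get(requestId);
  const promise = Promise.resolve()
    .then(work)
    .finally(() => inFlight.delete(requestId)); // allow a later retry once settled
  inFlight.set(requestId, promise);
  return promise;
}

// Usage: two concurrent submissions of the same request trigger one run.
let runs = 0;
const job = async () => {
  runs += 1;
  return "done";
};
Promise.all([processOnce("req-1", job), processOnce("req-1", job)]).then(
  (results) => console.log(results, runs)
);
```

The check and the `set` happen synchronously, so within one event loop no second caller can slip in between them.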

Update to v0.4.0 by running npm update @runflow/sdk. The Sentinel scoring API is available immediately on all plans. Workflow pipelines are in beta and available on Pro and Enterprise.
