Case · Internal — Talpro Universe · Responsible-AI eval harness
Forge (self-applied) → Shield productisation · 6 weeks (Forge-shaped, internal) · then productised
Talpro India
Outcomes · measured
312 PRs gated in 14 weeks across 4 products
11 PRs blocked by bias probes (would have shipped subtly biased rankings)
Bias-detection mean time: days (manual review) → minutes (CI gate)
Narrative
01 · Problem
By late 2025 every Talpro AI surface — CV screening, recruiter desktop, candidate matching — had its own ad-hoc evaluation script, written in whatever language the product team preferred. There was no shared definition of 'regression-free', no shared probe library, and no shared vocabulary for bias. When the first enterprise prospect's DPO sent the AI-risk questionnaire, three different product teams gave three different answers about how Talpro tested for bias, and the deal nearly stalled. Worse: the recruiter team measured shortlist-rank correlation but not timezone drift; the matching team measured drift but not bias; nobody measured prompt-injection resilience. Each team held a partial truth, and composing those into one defensible answer meant a rebuild for every prospect.
02 · Approach
Six-week Forge engagement (internal) to build ProveIQ — a single eval harness with a portable probe library, a CI gate, and a public-facing report generator. Weeks 1–2: catalogue every existing eval across the four products into a unified taxonomy (rank correlation, recruiter-agreement, bias probe across protected attributes, drift detection, prompt-injection resilience, PII leakage, cost-per-inference, timezone stability). Weeks 3–4: build the harness as a TypeScript library with a 500-variant probe dataset (name swap, age markers, majority/minority markers across Hindi-belt, Tamil, Malayalam, Bengali surnames). CI plugin gates every PR. Weeks 5–6: build the report generator that emits the 28-page Responsible-AI annex DPOs actually read — the same one used in the Shield case (`/cases/shield-dpdpa-readiness`). Productised as the eval engine behind every Shield engagement.
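The name-swap probe at the core of the library can be sketched as follows. This is a minimal illustration, not the actual ProveIQ API: the `scoreCandidate` shape, the example names, and the 2% tolerance are all assumptions introduced here for clarity.

```typescript
// A candidate profile as seen by a ranking model under test.
type Candidate = { name: string; yearsExperience: number; skills: string[] };
type ScoreFn = (c: Candidate) => number;

// A probe variant pair: two profiles identical in every field except the name.
interface NameSwapVariant { a: Candidate; b: Candidate }

// Build variant pairs from one base profile and a list of name pairs
// (e.g. majority/minority surnames). Only the name differs within a pair.
function makeNameSwapVariants(
  base: Omit<Candidate, "name">,
  namePairs: [string, string][]
): NameSwapVariant[] {
  return namePairs.map(([nameA, nameB]) => ({
    a: { ...base, name: nameA },
    b: { ...base, name: nameB },
  }));
}

// Gate rule: the profiles are identical apart from the name, so any score
// gap beyond `tolerance` is treated as a bias signal and fails the gate.
function biasProbe(
  score: ScoreFn,
  variants: NameSwapVariant[],
  tolerance = 0.02
): { passed: boolean; failures: NameSwapVariant[] } {
  const failures = variants.filter(
    (v) => Math.abs(score(v.a) - score(v.b)) > tolerance
  );
  return { passed: failures.length === 0, failures };
}
```

In a CI plugin, a sketch like this would run against every PR's model build and return a nonzero exit code when `passed` is false, which is how a probe failure blocks the merge.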
03 · Outcome
ProveIQ has gated 312 PRs across the Talpro Universe in the 14 weeks since rollout. 11 PRs were blocked by the bias probe (would have shipped subtly biased rankings), 4 by drift detection, 2 by prompt-injection. Mean time to bias-detection on a regression dropped from days (manual review) to minutes (CI). The same 28-page annex generator is now the single document Talpro hands to enterprise DPOs in Shield engagements — three different DPOs in three months accepted it without a 40-question follow-up. ProveIQ is the foundation under every Talpro AI claim: every metric on this site is reproducible because ProveIQ generated it.
04 · In their words
“Before ProveIQ, three teams had three different answers to the DPO’s bias question. After ProveIQ, we have one answer and a 28-page annex that backs it.”
05 · Who led this engagement
Bhaskar Anand. Every first call.
Founder & CEO, CompetitorX. Pune, India. No associate-level handoff — the person who led this engagement is the person who takes your scoping call.