The client
A startup accelerator offering mentorship programs, investment facilitation, and workshops to early-stage founders across India. They'd been operating for months with a team of SDRs making hundreds of outbound calls per week through Zoho CRM.
They sent us 2,273 call recordings as PDF transcripts. It was the largest dataset we'd processed.
The data quality problem
Before we could analyze anything, we discovered something important: 51% of the data was noise.
| Category | Count | % |
|----------|-------|---|
| Total transcripts | 2,273 | 100% |
| Pre-filter skipped (voicemails, wrong numbers) | 619 | 27% |
| Insights extracted | 1,651 | 73% |
| Qualifying for analysis | 1,117 | 49% |
| Filtered post-extraction (too thin) | 534 | 23% |

Of the 1,651 transcripts with extracted insights:
- 221 were rich: 3+ signals with direct quotes
- 896 were moderate: at least one actionable signal
- The remaining 534 had summaries but no extractable intelligence
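The three tiers above can be sketched as a small classifier. This is a minimal illustration, not the production filter: the `Signal` structure and its field names are hypothetical, and "3+ signals with direct quotes" is read here as at least three signals that each carry a verbatim quote.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signal:
    """One extracted insight from a call (hypothetical structure)."""
    text: str
    actionable: bool = False
    quote: Optional[str] = None  # verbatim prospect quote, if one was captured


def classify_call(signals: List[Signal]) -> str:
    """Return 'rich', 'moderate', or 'thin' for one call's extracted signals."""
    quoted = [s for s in signals if s.quote]
    if len(quoted) >= 3:
        # Assumed reading of "3+ signals with direct quotes".
        return "rich"
    if any(s.actionable for s in signals):
        return "moderate"
    # Summary exists, but nothing extractable: filtered out post-extraction.
    return "thin"
```

Under this reading, only calls labelled `rich` or `moderate` count toward the 1,117 qualifying calls.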
This wasn't a flaw in the accelerator's process. It's normal. Most sales operations record everything, including the voicemails, the "I'm busy, call back later" calls, the wrong numbers, and the 15-second check-ins. The problem is that when you analyze all 2,273 calls equally, the noise drowns out the signal.
Our quality filter separated them automatically. What remained was pure signal.
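The rules of the pre-filter aren't spelled out here, so the following is only one plausible heuristic for catching voicemails, wrong numbers, and very short check-ins. The phrase list and both word-count thresholds are assumptions for illustration and would need tuning against a hand-labelled sample.

```python
import re

# Hypothetical noise phrases; not taken from the actual filter.
NOISE_PATTERNS = [
    r"\bvoicemail\b",
    r"\bwrong number\b",
    r"\bcall (me )?back later\b",
    r"\bleave a message\b",
]
MIN_WORDS = 40          # assumed: below this, a call is too short to carry signal
PHRASE_MAX_WORDS = 150  # assumed: noise phrases only disqualify shortish calls


def is_noise(transcript: str) -> bool:
    """Return True if a transcript should be skipped before extraction."""
    text = transcript.lower()
    n_words = len(text.split())
    if n_words < MIN_WORDS:
        return True
    if n_words < PHRASE_MAX_WORDS:
        return any(re.search(p, text) for p in NOISE_PATTERNS)
    # Long conversations are kept even if they mention a callback.
    return False
```

Limiting the phrase check to shorter calls is a deliberate choice in this sketch: a long, substantive conversation that ends with "call me back later" should still reach the extraction step.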
What the filtered data revealed
Discovery 1: SDRs were calling at the wrong time
Dozens of conversations opened with "I'm sitting with a client" or "Can we talk later?" The team was calling during business hours — but their prospects were early-stage founders at their day jobs.
The fix: call immediately when a registration comes in, while interest is fresh. Not the next business day.
Discovery 2: One-size-fits-all was killing high-value prospects
The cross-conversation analysis surfaced a contradiction invisible in any single call: some prospects said "we've already been funded" while others said "I need help getting started." Both received the same workshop and the same sales pitch.
Workshop attendees explicitly said "the workshop was pretty much basic." Not all of them — but enough that it was a pattern, not an outlier.
Funded startups wanting strategic mentorship were being treated like first-time founders needing idea validation. The company was losing its highest-value prospects because it couldn't distinguish between segments.
Discovery 3: Zero competitive mentions
Across 1,117 qualifying calls, not a single prospect mentioned a competitor by name. Zero.
For most companies, this would sound like good news. The founders immediately recognized it as a red flag: if nobody is comparing you to alternatives, either you have no competition (unlikely in startup services) or your prospects don't see you as being in the same category as the alternatives they're actually considering.
The positioning was so different from established accelerators and angel networks that prospects weren't even making the comparison. That's not differentiation — that's invisibility.
Discovery 4: The workshop funnel was leaking at every stage
Ten evidence threads showed positive workshop feedback ("it was quite insightful," "it was awesome"). But five separate threads revealed the opposite: attendance was inconsistent, follow-ups went to voicemail, and prospects who attended didn't understand the next steps.
The workshop was generating leads. It wasn't converting them.
The founder reaction
When we walked the founders through these findings, the conversation shifted from "how do we get more leads" to "how do we fix our segmentation." They realized that volume wasn't the problem: they had 2,273 recorded calls. Targeting was the problem.
Three immediate actions came out of the session:
1. Segment prospects into three tiers before the first call
2. Call within minutes of registration, not the next day
3. Create separate workshop tracks for different founder stages
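The second action lends itself to a simple check. As a rough sketch, assuming each lead record carries a registration timestamp and an optional first-call timestamp (the `Lead` fields and the 5-minute target are hypothetical; the session only settled on "within minutes"):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

MAX_DELAY = timedelta(minutes=5)  # assumed target, not a figure from the session


@dataclass
class Lead:
    """Minimal lead record (hypothetical fields)."""
    id: str
    registered_at: datetime
    first_call_at: Optional[datetime] = None


def overdue_leads(leads: List[Lead], now: datetime) -> List[str]:
    """Return IDs of leads still uncalled after MAX_DELAY has passed."""
    return [
        lead.id
        for lead in leads
        if lead.first_call_at is None and now - lead.registered_at > MAX_DELAY
    ]
```

Running a check like this on a short interval would surface registrations that are about to go cold, instead of leaving them for the next business day.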
The methodology
- 2,273 PDF transcripts from Zoho CRM, processed with a custom PDF parser
- Quality filter: 51% noise removed before analysis
- Two-pass extraction: comprehension pass + structured extraction per call
- Evidence compilation: 20 cross-conversation threads identified
- 9 reports generated, each for a different audience and use case
The quality filter alone saved the analysis from being diluted by 1,156 low-value transcripts. The insights that emerged from the remaining 1,117 calls were specific, actionable, and — most importantly — invisible in any single call review.
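The headline percentages can be reproduced from the counts in the table. The snippet below is plain arithmetic on those published figures; the variable and function names are ours.

```python
# Counts taken from the data quality table.
total = 2273
pre_filtered = 619   # voicemails, wrong numbers
extracted = 1651     # insights extracted
qualifying = 1117    # rich + moderate
too_thin = 534       # filtered post-extraction


def share(part: int, whole: int) -> int:
    """Percentage of `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


removed = total - qualifying         # transcripts excluded from analysis
noise_pct = share(removed, total)    # the "51% noise" figure

# The two filter stages listed in the table account for slightly fewer
# transcripts than `removed`; the gap is not attributed to either stage.
unattributed = removed - (pre_filtered + too_thin)
```

Here `removed` comes to 1,156 and `noise_pct` to 51, matching the figures above. The two filter rows in the table sum to 1,153, so 3 transcripts are not assigned to either stage in the published counts.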
2,273 calls is a lot of data. 1,117 qualifying calls is a lot of intelligence. The difference is knowing which ones to read.