Inbound leads are a good problem to have, until there are too many of them to research properly. At 20 to 30 leads a week you can afford to spend 15 minutes on each one before passing it to the sales team. At 200 a month, that's simply not happening. Something gets skipped.
For Natalie, a sales ops manager at a mid-market SaaS company, this was a growing pressure point. Their product had taken off in a sector where company size and buying intent were highly predictable from public data — but the team was forwarding leads to account executives with almost no qualification context. AEs were spending their first calls doing research that should have been done beforehand.
She needed a way to process a month's worth of inbound leads — company name, self-reported role, maybe a line about what they were looking for — and turn that into a structured qualification note that gave the AE something to work with in 30 seconds.
What she was trying to produce
Before designing the CSV or writing any prompt, Natalie wrote down exactly what a useful qualification note looked like. This step matters — it determines what columns you need and how you write the prompt.
Her AEs needed four things per lead:
- ICP match score (1–5): How closely does this company fit the ideal customer profile?
- Company size estimate: SMB, mid-market, or enterprise, inferred from the available signals.
- Key pain point: One sentence on the most likely reason they signed up, based on their described role and context.
- Recommended first question: An opening discovery question tailored to their situation.
Having the output defined before the prompt made writing the prompt straightforward. The output format becomes the instruction.
Building the spreadsheet
Her CRM export had more columns than she needed. She trimmed it to the five that actually contained useful signal:
| company_name | company_description | contact_role | company_size | signup_context |
|---|---|---|---|---|
| Meridian Legal | UK-based litigation and commercial law firm, 80 fee earners across three offices. | Head of Business Development | 51–200 | Looking to improve how we follow up with prospective clients after pitch meetings |
| Apex Facilities Group | Facilities management contractor serving NHS trusts and local authorities across England. | Operations Director | 201–500 | We handle a lot of contract renewals and need a better system for tracking them |
A few decisions worth noting:
- The signup_context column, the free-text field from the sign-up form, was the most valuable input. Even a single sentence of self-reported intent gave the model enough to identify the likely pain point.
- company_size was the band from the sign-up dropdown, not a verified headcount. It's a rough signal but useful for the ICP scoring.
- She deliberately excluded the contact's name and email. The batch job was about company-level qualification, not personalisation.
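The trimming step can be scripted if the CRM export is a CSV. The following is a minimal sketch using Python's standard `csv` module; the function name `trim_export` and the file names in the usage comment are hypothetical, and it assumes the export's header row uses the same column names as the table above.

```python
import csv

# The five columns kept in the table above.
KEEP = [
    "company_name",
    "company_description",
    "contact_role",
    "company_size",
    "signup_context",
]


def trim_export(src, dst, keep=KEEP):
    """Copy only the `keep` columns from src to dst (open text files).

    Returns the number of data rows written.
    """
    reader = csv.DictReader(src)
    missing = [c for c in keep if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"export is missing columns: {missing}")
    # extrasaction="ignore" silently drops every column not listed in `keep`,
    # which is how contact names and emails are left out.
    writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
    writer.writeheader()
    rows = 0
    for row in reader:
        writer.writerow(row)
        rows += 1
    return rows


# Hypothetical usage; the file names are placeholders:
# with open("crm_export.csv", newline="", encoding="utf-8") as src, \
#      open("leads_trimmed.csv", "w", newline="", encoding="utf-8") as dst:
#     trim_export(src, dst)
```

Failing loudly on a missing column is a deliberate choice here: a renamed CRM field would otherwise produce a batch with an empty input column.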
Writing the prompt
The prompt had two jobs: define the ICP so the model could score against it, and specify the exact output format so the results could be imported cleanly.
Prompt used:
You are a sales qualification analyst. Review the lead information below and produce a structured qualification note for the account executive who will take the first call.
Our ideal customer profile (ICP):
— UK-based professional services or regulated sector businesses
— 50–500 employees
— A defined business development, operations, or client management function
— Pain points typically around: client follow-up, pipeline tracking, contract management, or reporting
Output exactly this format — no extra text, no preamble:
ICP_SCORE: [1–5, where 5 = perfect fit]
SIZE_BAND: [SMB / Mid-market / Enterprise]
KEY_PAIN: [one sentence, max 20 words, based on role and signup_context]
OPENING_QUESTION: [one discovery question tailored to their situation]
Scoring guide:
5 — Matches sector, size, and pain point exactly
4 — Matches 2 of 3 criteria strongly
3 — Partial match, worth a call
2 — Outside ICP but not a hard no
1 — Clear mismatch on most criteria
The scoring guide matters. Without it, the model tends to cluster around 3–4 for everything — the ICP definition alone isn't enough to calibrate the scale. Defining what a 5 and a 1 look like forces meaningful differentiation.
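How each row's columns are attached to the prompt depends on the batch tool, and the details are not covered here. As an illustration only, one plausible arrangement is to append the five columns as labelled lines beneath the instructions. The helper `build_prompt` and the "Lead information:" heading are assumptions, not part of Natalie's setup.

```python
LEAD_FIELDS = [
    "company_name",
    "company_description",
    "contact_role",
    "company_size",
    "signup_context",
]


def build_prompt(instructions: str, row: dict) -> str:
    """Join the fixed instructions with one lead's columns.

    `instructions` is the full prompt text (ICP, output format, scoring
    guide); `row` is one spreadsheet row keyed by column name.
    """
    lines = []
    for name in LEAD_FIELDS:
        # Missing or empty cells become blank values rather than errors.
        value = (row.get(name) or "").strip()
        lines.append(f"{name}: {value}")
    return instructions.strip() + "\n\nLead information:\n" + "\n".join(lines)
```

Keeping the instructions identical for every row and varying only the labelled block underneath means the scoring guide is applied the same way to each lead.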
Running the batch
She ran all 500 rows on Gemini 2.5 Flash. The inputs are short and structured and the output format is constrained, so Flash handles this well. The cost for the full run was under £1.
The batch completed in about 45 minutes.
What the output looked like
| company_name | contact_role | AI responses |
|---|---|---|
| Meridian Legal | Head of Business Development | ICP_SCORE: 5 SIZE_BAND: Mid-market KEY_PAIN: Needs a structured follow-up system for post-pitch prospect engagement. OPENING_QUESTION: How are you currently tracking where each prospective client is after a pitch — and what typically falls through the cracks? |
| Apex Facilities Group | Operations Director | ICP_SCORE: 4 SIZE_BAND: Mid-market KEY_PAIN: Managing contract renewal timelines across a large public-sector client base. OPENING_QUESTION: When a contract renewal is coming up, who owns the process today — and how far in advance does that typically start? |
Natalie imported this back into the CRM by mapping the AI responses column. Her team split the structured text into four separate fields using a simple formula. AEs now have all four data points on the lead record before they pick up the phone.
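The team's formula is not shown, but the same split can be sketched in Python. This is one possible approach, assuming the response contains the four labels from the prompt in any order, on one line or several; `parse_note` is a hypothetical helper name.

```python
import re

FIELDS = ("ICP_SCORE", "SIZE_BAND", "KEY_PAIN", "OPENING_QUESTION")

# Matches any of the four labels followed by a colon, capturing the label.
_LABEL = re.compile(r"\b(" + "|".join(FIELDS) + r"):\s*")


def parse_note(text: str) -> dict:
    """Split one model response into its four labelled fields.

    Raises ValueError if a field is missing or the score is not 1-5,
    so malformed rows can be reviewed by hand instead of imported.
    """
    # With a capturing group, re.split returns
    # [preamble, label, value, label, value, ...].
    parts = _LABEL.split(text.strip())
    note = {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}
    missing = [f for f in FIELDS if f not in note]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    score = int(note["ICP_SCORE"])  # int() raises ValueError on non-numeric text
    if not 1 <= score <= 5:
        raise ValueError(f"ICP_SCORE out of range: {score}")
    note["ICP_SCORE"] = score
    return note
```

Calling `parse_note` on the Meridian Legal response above yields a dictionary with an integer `ICP_SCORE` and three string fields, ready to map onto separate CRM columns.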
Tips from running this at scale
Define your ICP precisely before writing the prompt. Vague ICP definitions ("mid-sized companies in B2B") produce vague scores. The more specific you can be about sector, size band, and pain points, the more useful the scoring becomes.
The opening question output pays dividends. This was the detail that most surprised Natalie. AEs initially assumed they'd ignore it and run their standard discovery framework — but after a few calls, several said the suggested questions were better calibrated to the prospect's actual situation than their default openers. The model had read the signup context and inferred something they hadn't noticed.
Handle leads scoring 1–2 separately. After the first batch, Natalie filtered the output into three tiers: leads scoring 4–5 went straight to AEs, leads scoring 3 went to a weekly review call, and leads scoring 1–2 were deprioritised into a nurture sequence. This alone saved the sales team about four hours a week of calls they wouldn't have won.
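The three-tier split is simple enough to express as a small function. This is a sketch of the rule as described; the tier labels (`"ae"`, `"weekly_review"`, `"nurture"`) are placeholder names, not CRM values from the original setup.

```python
def route(score: int) -> str:
    """Map an ICP score (1-5) to one of the three follow-up tiers."""
    if not 1 <= score <= 5:
        raise ValueError(f"ICP score must be 1-5, got {score}")
    if score >= 4:
        return "ae"             # 4-5: straight to an account executive
    if score == 3:
        return "weekly_review"  # 3: discussed on the weekly review call
    return "nurture"            # 1-2: deprioritised into a nurture sequence
```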
Refresh the ICP definition quarterly. The prompt encodes your current ICP. If your ideal customer changes — which it will — update the scoring guide before the next batch. The prompt is your source of truth for what "good" means.
What this replaced
Before, an SDR spent roughly 10–15 minutes per lead doing the same research manually: checking LinkedIn, reading the company website, writing a short note. For 500 leads a month that was roughly 83–125 hours, or two to three full working weeks every month, on a task that now takes two hours of data prep and runs overnight.
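The arithmetic behind the manual-research estimate can be checked in a few lines. The 40-hour working week is an assumption used only to convert hours into weeks.

```python
def manual_hours(leads: int, minutes_per_lead: float) -> float:
    """Total research time in hours for a month's leads."""
    return leads * minutes_per_lead / 60


HOURS_PER_WEEK = 40  # assumed length of a working week

low = manual_hours(500, 10)   # lower bound: 10 minutes per lead
high = manual_hours(500, 15)  # upper bound: 15 minutes per lead

print(round(low, 1), round(high, 1))  # 83.3 125.0
print(round(low / HOURS_PER_WEEK, 1), round(high / HOURS_PER_WEEK, 1))  # 2.1 3.1
```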
The qualification quality also improved. Manual notes depended heavily on which SDR wrote them and how much time they had that day. The batch output is consistent — same format, same scoring criteria, applied the same way to every row.