Customer reviews contain more useful signal than most businesses ever extract. Sentiment, recurring complaints, specific product issues, things customers love — it's all there, buried in unstructured text across Trustpilot, Google, Amazon, and your own site.
The problem is volume. Reading and classifying 500 reviews takes a day; 5,000 takes weeks. By the time you've finished, the insight is already stale.
AI batch processing changes this: you export your reviews, write your classification instructions once, and get a fully coded, summarised CSV back — ready to pivot and report on. This guide shows you how.
Step 1: Export your reviews
Most platforms have an export function:
- Trustpilot: Business account → Reviews → Export. Exports CSV with review text, rating, date, and reviewer details.
- Google Business Profile: Download via the Google Business Profile API (formerly the Google My Business API), or use a third-party export tool like Local Falcon or BrightLocal.
- Amazon Seller Central: Reports → Business Reports → Reviews. Or use the Amazon Seller API for more control.
- Shopify product reviews: Shopify's native review app and Judge.me both have CSV export options in the app dashboard.
Your export will likely have more columns than you need. Before uploading, trim it to the essentials:
| review_id | review_text | rating | product_name | review_date |
|---|---|---|---|---|
| R8821 | Love this serum but the pump stopped working after two weeks — had to pour it out | 3 | Vitamin C Brightening Serum | 2026-04-12 |
| R8822 | Best moisturiser I've ever used. My skin hasn't been this clear in years. | 5 | Daily Hydration Cream | 2026-04-14 |
Keep the rating column — it's useful context for the model and gives you a quick sanity check on whether the AI-assigned sentiment matches the star rating.
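If your export is too large to trim by hand in a spreadsheet, a short script can do it. The sketch below assumes your export already uses the column names from the table above; real exports will almost certainly name them differently, so adjust `KEEP` (a hypothetical constant) to match. The function name `trim_export` is also just illustrative.

```python
import csv

# Columns to keep, matching the trimmed table above.
# Assumption: your export uses these exact header names.
KEEP = ["review_id", "review_text", "rating", "product_name", "review_date"]


def trim_export(src, dst, keep=KEEP):
    """Copy only the `keep` columns from src to dst; return the row count."""
    reader = csv.DictReader(src)
    missing = [col for col in keep if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"export is missing columns: {missing}")

    # extrasaction="ignore" silently drops every column not listed in `keep`.
    writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count
```

With real files, open both with `newline=""` (for example `open("export.csv", newline="", encoding="utf-8")`) so the `csv` module handles line endings inside quoted review text correctly.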
Step 2: Define your themes
Before writing your batch instructions, decide what categories you want. The right categories depend on what decisions you're trying to make. Common options:
- PRODUCT_QUALITY — comments about the product itself, ingredients, efficacy
- PACKAGING — packaging aesthetics, practicality, damage in transit
- DELIVERY — speed, tracking, courier issues
- CUSTOMER_SERVICE — interactions with support, returns, responses
- VALUE — price, perceived value for money
- REPURCHASE_INTENT — explicit statements about buying again
- OTHER — doesn't fit cleanly into the above
Keep your category list focused. More than 10 categories introduces ambiguity and classification errors. If you have 15 things you want to track, group the related ones.
Step 3: Write your batch instructions
Sample batch instructions:
You are a customer insight analyst. Classify the customer review below.
Output exactly this format — no extra text:
PRIMARY_THEME: [one of the categories listed below]
SECONDARY_THEME: [one of the categories, or NONE]
SENTIMENT: [positive / neutral / negative / mixed]
ACTION_REQUIRED: [yes / no]
SUMMARY: [one sentence, max 15 words, third person]
Categories:
PRODUCT_QUALITY — efficacy, ingredients, texture, smell, results
PACKAGING — packaging design, practicality, pump/applicator, transit damage
DELIVERY — shipping speed, tracking, courier issues
CUSTOMER_SERVICE — support interactions, returns, complaint handling
VALUE — price, perceived value, subscription or bundle comments
REPURCHASE_INTENT — explicit "will buy again" or "won't buy again" statements
OTHER — doesn't fit any category clearly
Rules:
— ACTION_REQUIRED is "yes" if the review describes a product defect, a delivery failure, or a customer service problem that likely needs a response
— SENTIMENT reflects the reviewer's overall feeling — a 5-star review that mentions one minor complaint is still positive
— Do not infer meaning not present in the review text
— If the review is too short to classify reliably, set PRIMARY_THEME to OTHER
The ACTION_REQUIRED column is worth including even if it's not in your original brief. It lets you filter immediately for reviews that need a reply — without reading every row.
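If you ever handle the raw model responses yourself rather than receiving a finished CSV, it helps to check each one against the format you asked for. The sketch below is one way to do that, assuming each response arrives as plain text with one `FIELD: value` pair per line, as the sample instructions request. `parse_response` is a hypothetical helper, and the allowed values are copied from the instructions above.

```python
CATEGORIES = {
    "PRODUCT_QUALITY", "PACKAGING", "DELIVERY", "CUSTOMER_SERVICE",
    "VALUE", "REPURCHASE_INTENT", "OTHER",
}

# Allowed values per field, taken from the sample batch instructions.
ALLOWED = {
    "PRIMARY_THEME": CATEGORIES,
    "SECONDARY_THEME": CATEGORIES | {"NONE"},
    "SENTIMENT": {"positive", "neutral", "negative", "mixed"},
    "ACTION_REQUIRED": {"yes", "no"},
}
FIELDS = [*ALLOWED, "SUMMARY"]


def parse_response(text):
    """Parse one model response into a dict; raise ValueError if malformed."""
    result = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so a summary may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            result[key] = value.strip()

    missing = [field for field in FIELDS if field not in result]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for field, allowed in ALLOWED.items():
        if result[field] not in allowed:
            raise ValueError(f"unexpected {field}: {result[field]!r}")
    if len(result["SUMMARY"].split()) > 15:
        raise ValueError("SUMMARY exceeds 15 words")
    return result
```

The check is deliberately strict: a response with `Positive` instead of `positive` is rejected rather than silently normalised. Whether to reject or normalise is a judgment call; rejecting makes it easier to spot when the instructions need tightening.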
Step 4: Run the batch and use the output
Gemini 2.5 Flash handles review classification well. Reviews are typically short and the classification task doesn't require deep reasoning — Flash is fast, cheap, and produces clean output for this use case.
When the output CSV arrives, the immediate useful analyses are:
- Pivot by PRIMARY_THEME + SENTIMENT to see where negative sentiment concentrates. If PACKAGING is 40% negative and PRODUCT_QUALITY is 5%, packaging is your priority.
- Filter ACTION_REQUIRED = yes to get your reply queue without reading every row.
- Filter by product_name — per-product breakdowns to identify which SKUs have recurring issues.
- Trend over time — if you run the analysis monthly, you can track whether issues are improving or worsening after changes you've made.
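A spreadsheet pivot table covers all of these, but the first two are also easy to script if you rerun the analysis regularly. This is a minimal sketch, assuming the output CSV has been loaded into a list of dicts (for example with `csv.DictReader`) and that its columns are named after the fields in the batch instructions. Both function names are hypothetical.

```python
from collections import Counter, defaultdict


def negative_share_by_theme(rows):
    """Return {theme: fraction of that theme's reviews that are negative}."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["PRIMARY_THEME"]][row["SENTIMENT"]] += 1
    return {
        # A Counter returns 0 for sentiments that never occurred.
        theme: sentiments["negative"] / sum(sentiments.values())
        for theme, sentiments in counts.items()
    }


def reply_queue(rows):
    """Return only the rows flagged as needing a response."""
    return [row for row in rows if row["ACTION_REQUIRED"] == "yes"]
```

Sorting the result of `negative_share_by_theme` by value, highest first, gives a rough priority list. Treat shares based on only a handful of reviews with caution.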
The SUMMARY column earns its place in reporting. Rather than presenting raw review text to a stakeholder, you can show them the one-line summary — they get the signal without having to read 500 verbatims.
What to watch out for
Short reviews are harder to classify. "Great product!" and "Didn't work for me" give the model very little to work with. The sentiment is usually clear, but the theme often is not, so these will frequently end up as OTHER, as the batch instructions tell the model to do. That's correct behaviour: the theme is genuinely ambiguous.
Mixed reviews need the mixed sentiment option. A review that says "love the product but the packaging is terrible" is neither positive nor negative. If your batch instructions don't include "mixed" as a sentiment option, the model will be forced to pick one — and it'll be inconsistent. Include it.
Rerun after you update your categories. If you find that 30% of your reviews are landing in OTHER, your categories probably aren't covering the right topics. Revise them and rerun — the cost is low enough that iteration is practical.
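Two quick checks can tell you when a rerun is worth it: the share of reviews landing in OTHER, and the number of rows where the AI-assigned sentiment contradicts the star rating (the sanity check mentioned in Step 1). The sketch below assumes the output rows are dicts that still carry the original `rating` column. The function names and the rating cut-offs (4 stars and above, 2 stars and below) are illustrative assumptions, not fixed rules.

```python
def other_share(rows):
    """Return the fraction of rows whose PRIMARY_THEME is OTHER."""
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(row["PRIMARY_THEME"] == "OTHER" for row in rows) / len(rows)


def rating_mismatches(rows, high=4, low=2):
    """Return rows where star rating and sentiment point in opposite directions.

    Assumption: ratings are on a 1-5 scale; `high` and `low` are adjustable.
    """
    flagged = []
    for row in rows:
        rating = float(row["rating"])  # accepts "5" as well as "5.0"
        sentiment = row["SENTIMENT"]
        if (rating >= high and sentiment == "negative") or (
            rating <= low and sentiment == "positive"
        ):
            flagged.append(row)
    return flagged
```

A mismatch is not necessarily a model error. Some reviewers give five stars and then describe a serious fault, so these rows are worth reading rather than automatically correcting.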