About a year ago we integrated ChatGPT into a Bitrix24 CRM for a client. The goal: automatic qualification of incoming leads without human intervention for the first touchpoint. Here's how it went.
The problem
The client had a steady stream of inbound leads — form submissions, calls, chat messages — all landing in Bitrix24. Sales managers were spending the first 20-30 minutes of every lead just gathering basic information: budget, timeline, scope, decision-maker. Repetitive, low-value work that killed energy for the actual selling conversations.
What we built
The architecture was straightforward: a webhook from Bitrix24 fires when a new lead is created, our middleware service picks it up, calls the OpenAI API with a structured prompt, and writes the qualification results back into the CRM as deal fields and an activity note.
The prompt was the hard part. A generic "qualify this lead" prompt produced generic results. We ended up with a structured system prompt that:
- Described the client's product and typical customer profile
- Listed specific fields to extract (budget range, urgency, role of the contact)
- Specified output format as JSON for reliable parsing
- Included fallback values when information wasn't present
GPT-4o-mini was fast enough and cheap enough to run on every lead. We didn't need GPT-4 for this — the task was extraction and classification, not reasoning.
Results
First-touch qualification time dropped from 20-30 minutes to near-zero for structured leads. Managers still reviewed the AI output, but they went into conversations already knowing the context.
The edge cases were predictable: vague leads ("I want something"), very short messages, non-Russian text. We handled these with confidence scores — if confidence was below threshold, it flagged for manual review rather than auto-qualifying.
What to watch out for
The biggest mistake in projects like this: treating AI output as ground truth. It's a starting point. Managers need to understand they're reviewing a draft, not reading facts. Training this mindset took longer than building the integration.
Also: prompt versioning. Keep your prompts in version control. When you change a prompt, it changes behavior for every future lead. You want to be able to roll back and compare.
The integration cost roughly two weeks of development time. It paid for itself in the first month in time saved.