Overview

Progrid AI Lead Research (progrid_ai_research) is a CRM module that automatically discovers, qualifies, and imports business leads from the web. It is designed for sales teams, marketing departments, and business development professionals who need a scalable way to identify new prospects without manual research.

Purpose

Traditional lead research involves visiting dozens of websites, copying contact details into spreadsheets, and manually assessing whether a business is active. This module automates that entire workflow by combining web search APIs, content extraction, and large language model (LLM) analysis into a single pipeline.

Each research job goes through six phases automatically:

  1. Search – Query web search providers to find relevant business URLs

  2. Fetch – Download and extract content from discovered web pages

  3. Normalize – Use AI to extract structured business data (name, emails, phones, services)

  4. Score – Evaluate each business’s activity level with a confidence percentage

  5. Store – Save results with supporting evidence and metadata

  6. Deliver – Create CRM leads, enrich partner records, or export to CSV
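
The phase order above can be pictured as a simple sequential runner. The following is a minimal sketch, not the module's actual code: the names Phase and run_pipeline are hypothetical, and each phase is assumed to be a plain function that takes the previous phase's output.

```python
from enum import IntEnum
from typing import Any, Callable, Dict, Optional, Tuple


class Phase(IntEnum):
    """The six phases, in execution order (names are illustrative)."""
    SEARCH = 1
    FETCH = 2
    NORMALIZE = 3
    SCORE = 4
    STORE = 5
    DELIVER = 6


def run_pipeline(
    handlers: Dict[Phase, Callable[[Any], Any]], payload: Any
) -> Tuple[Optional[Phase], Any, Optional[str]]:
    """Run one handler per phase, in order.

    Returns (last completed phase, payload, error message). On failure the
    error message is prefixed with the name of the phase that raised.
    """
    completed: Optional[Phase] = None
    for phase in Phase:  # Enum iteration follows definition order
        try:
            payload = handlers[phase](payload)
        except Exception as exc:
            return completed, payload, f"{phase.name.lower()}: {exc}"
        completed = phase
    return completed, payload, None
```

In this sketch a failure reports both the last phase that completed and the phase that raised, which is the kind of information that makes a job status easy to interpret.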

Target users

This module is built for two primary audiences:

Sales and marketing teams

Sales representatives and marketing professionals use AI Lead Research to discover new prospects in specific industries, regions, or niches. Typical use cases include:

  • Finding local service providers who may need a new website or software solution

  • Identifying competitors in a given market

  • Discovering businesses with outdated web presences that could benefit from modernization

System administrators

Administrators configure search providers, manage API keys, set rate limits, and monitor cache utilization. They also control which users can create research jobs and which can only view results.

Provider ecosystem

The module integrates with three external service providers: two web search APIs and one LLM API. All communication happens over HTTPS.

Search providers

Two web search providers are supported. You can use either one or both simultaneously.

Brave Search

  • Description – Privacy-focused web search API with strong coverage of business directories and local results.

  • Free tier – 2,000-5,000 queries/month

  • Configuration – CRM ‣ AI Research ‣ Configuration ‣ Settings

Tavily

  • Description – AI-optimized search API designed for LLM applications. Returns pre-processed content snippets.

  • Free tier – 1,000 credits/month

  • Configuration – CRM ‣ AI Research ‣ Configuration ‣ Settings

LLM provider

Groq

  • Description – High-speed LLM inference using Llama 3.1 70B. Handles both data extraction (normalize phase) and activity scoring (score phase).

  • Free tier – Generous free tier with rate limits

  • Configuration – CRM ‣ AI Research ‣ Configuration ‣ Settings

Pipeline concept

Every research job follows the same six-phase pipeline. Understanding these phases helps you interpret job status and troubleshoot issues.

Phase 1: Search

The configured web search providers (Brave Search, Tavily, or both) are queried to find URLs of relevant businesses. These URLs are the input to the fetch phase.

Phase 2: Fetch

Each discovered URL is fetched and its content extracted. The module uses the Trafilatura library for intelligent content extraction, with a regex-based fallback for pages that resist standard parsing. A 7-day content cache (model progrid.fetch.cache) prevents redundant downloads of the same URL, which reduces outbound requests and speeds up subsequent jobs targeting similar businesses.
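
The fetch behaviour described above (Trafilatura first, regex fallback, 7-day cache) might look roughly like the sketch below. It is an illustration under assumptions, not the module's code: FetchCache is an in-memory stand-in for the cache model, the download callable is supplied by the caller, and the fallback regexes are one simple choice among many. The only Trafilatura call used is trafilatura.extract, and the sketch still runs if the library is not installed.

```python
import re
from datetime import datetime, timedelta
from html import unescape
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL = timedelta(days=7)  # the 7-day lifetime mentioned in the text


def regex_extract(html: str) -> str:
    """Crude fallback: drop script/style blocks and tags, collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", unescape(text)).strip()


def extract_text(html: str) -> str:
    """Prefer Trafilatura; use the regex fallback if it is missing or finds nothing."""
    try:
        import trafilatura
        text = trafilatura.extract(html)
    except Exception:
        text = None
    return text or regex_extract(html)


class FetchCache:
    """Hypothetical in-memory cache keyed by URL (stand-in for the real model)."""

    def __init__(self, ttl: timedelta = CACHE_TTL) -> None:
        self.ttl = ttl
        self._entries: Dict[str, Tuple[datetime, str]] = {}

    def get(self, url: str, now: datetime) -> Optional[str]:
        entry = self._entries.get(url)
        if entry and now - entry[0] < self.ttl:
            return entry[1]
        return None  # missing or expired

    def put(self, url: str, text: str, now: datetime) -> None:
        self._entries[url] = (now, text)


def fetch_content(
    url: str,
    download: Callable[[str], str],
    cache: FetchCache,
    now: Optional[datetime] = None,
) -> str:
    """Return extracted text for url, downloading only on a cache miss."""
    now = now or datetime.now()
    cached = cache.get(url, now)
    if cached is not None:
        return cached
    text = extract_text(download(url))
    cache.put(url, text, now)
    return text
```

Passing the download function and the current time as arguments keeps the sketch testable without network access; a second request for the same URL within seven days never reaches the downloader.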

Phase 3: Normalize

The extracted page content is sent to the Groq LLM with a structured prompt. The AI extracts:

  • Business name – The official name of the company

  • Email addresses – All contact emails found on the page

  • Phone numbers – Phone and fax numbers

  • Services offered – What the business provides

  • Activity signals – Evidence of recent activity (blog posts, news, updated copyright dates)
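
To make the extracted fields concrete, here is a minimal sketch of how an LLM reply could be turned into a structured record. It assumes the model is prompted to answer with a JSON object; the field names, the BusinessRecord class, and parse_llm_reply are hypothetical and chosen only to mirror the list above.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class BusinessRecord:
    """Structured output of the normalize phase (illustrative schema)."""
    name: str = ""
    emails: List[str] = field(default_factory=list)
    phones: List[str] = field(default_factory=list)
    services: List[str] = field(default_factory=list)
    activity_signals: List[str] = field(default_factory=list)


def _as_clean_list(value: Any) -> List[str]:
    """Accept a string or list; strip blanks and case-insensitive duplicates."""
    if isinstance(value, str):
        value = [value]
    if not isinstance(value, list):
        return []
    seen, out = set(), []
    for item in value:
        text = str(item).strip()
        if text and text.lower() not in seen:
            seen.add(text.lower())
            out.append(text)
    return out


def parse_llm_reply(reply: str) -> BusinessRecord:
    """Parse the outermost JSON object in reply, tolerating surrounding chatter."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in LLM reply")
    data = json.loads(reply[start:end + 1])
    return BusinessRecord(
        name=str(data.get("name") or "").strip(),
        emails=[e.lower() for e in _as_clean_list(data.get("emails"))],
        phones=_as_clean_list(data.get("phones")),
        services=_as_clean_list(data.get("services")),
        activity_signals=_as_clean_list(data.get("activity_signals")),
    )
```

Treating missing or malformed fields as empty lists, rather than failing, is a design choice of this sketch: a page with no phone number still yields a usable record.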

Phase 4: Score

A second LLM call evaluates the extracted data and assigns:

  • Confidence score (0-100%) – How confident the system is that the extracted data is accurate

  • Activity status – One of three categories:

    • active – Clear evidence of recent business activity

    • unclear – Insufficient information to determine status

    • inactive – Signs the business may be closed or dormant
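
Because LLM output is free-form, a scoring step typically validates what comes back before storing it. The sketch below shows one way to coerce a raw reply into the 0-100 range and the three statuses listed above. The names Score and coerce_score are hypothetical, and the fallbacks (0 for an unparseable confidence, "unclear" for an unknown status) are assumptions of this example, not documented module behaviour.

```python
import math
from dataclasses import dataclass
from typing import Any

ACTIVITY_STATUSES = ("active", "unclear", "inactive")


@dataclass(frozen=True)
class Score:
    confidence: float  # percentage, clamped to 0-100
    status: str        # one of ACTIVITY_STATUSES


def coerce_score(raw_confidence: Any, raw_status: Any) -> Score:
    """Turn loosely formatted LLM output into a valid Score."""
    try:
        confidence = float(str(raw_confidence).strip().rstrip("%"))
    except ValueError:
        confidence = 0.0  # assumption: unparseable means no confidence
    if not math.isfinite(confidence):
        confidence = 0.0  # reject NaN and infinities
    confidence = max(0.0, min(100.0, confidence))

    status = str(raw_status or "").strip().lower()
    if status not in ACTIVITY_STATUSES:
        status = "unclear"  # assumption: unknown labels are treated as unclear
    return Score(confidence, status)
```

For example, coerce_score("87%", "Active") yields a confidence of 87.0 with status "active", while an out-of-range value such as 140 is clamped to 100.0.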

Phase 5: Store

Results are saved as progrid.research.result records linked to the parent job. Each result includes the extracted business data, confidence score, activity status, and the raw evidence text that the LLM used for its evaluation.

Phase 6: Deliver

Based on the job’s configured deliverables, the system can:

  • Create CRM leads – Results with confidence above 50% are automatically converted into CRM leads (crm.lead records) with pre-filled contact information

  • Enrich existing partners – If a matching partner is found by email or website domain, the existing record is updated rather than creating a duplicate

  • Export to CSV – Generate a downloadable CSV file with all results for external use

Note

Deduplication during the deliver phase uses email address and website domain matching to prevent creating duplicate leads or partner records in your CRM.
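
The delivery rules above (confidence above 50%, matching on email address or website domain, CSV export) can be sketched in plain Python as follows. This is an illustration only: results are assumed to be dictionaries with the hypothetical keys name, website, emails, confidence, and status, and a matched result is simply skipped here, whereas the module is described as enriching the existing partner instead.

```python
import csv
import io
from typing import Dict, Iterable, List, Set, Tuple
from urllib.parse import urlparse

LEAD_THRESHOLD = 50.0  # "confidence above 50%" per the text

Key = Tuple[str, str]


def website_domain(url: str) -> str:
    """Lower-cased host without a leading 'www.'; empty string if none."""
    if not url:
        return ""
    host = urlparse(url if "://" in url else "//" + url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def dedup_keys(result: Dict) -> Set[Key]:
    """Keys used for matching: every email plus the website domain."""
    keys = {("email", e.strip().lower()) for e in result.get("emails", []) if e.strip()}
    domain = website_domain(result.get("website", ""))
    if domain:
        keys.add(("domain", domain))
    return keys


def select_new_leads(results: Iterable[Dict], existing_keys: Set[Key]) -> List[Dict]:
    """Keep results above the threshold that match nothing seen so far."""
    seen = set(existing_keys)
    leads = []
    for result in results:
        if result.get("confidence", 0) <= LEAD_THRESHOLD:
            continue
        keys = dedup_keys(result)
        if keys & seen:
            continue  # duplicate of an existing record or an earlier result
        seen |= keys
        leads.append(result)
    return leads


def to_csv(results: Iterable[Dict]) -> str:
    """Serialize results to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["name", "website", "emails", "confidence", "status"])
    for r in results:
        writer.writerow([r.get("name", ""), r.get("website", ""),
                         ";".join(r.get("emails", [])),
                         r.get("confidence", ""), r.get("status", "")])
    return buf.getvalue()
```

Normalizing domains before comparison means that https://www.acme.example/ and http://acme.example/contact count as the same business, which is the point of domain-based matching.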

Module information

  • Technical name – progrid_ai_research

  • Version – 18.0.1.1.0

  • Category – CRM

  • Dependencies – crm, mail, base, base_setup, utm