Scoring Thousands of Influencers with AI: An Automated Enrichment Pipeline

A company I worked for had thousands of alumni. Each one was a potential influencer and affiliate partner, but the challenge was figuring out which ones actually had reach and engagement. Manually reviewing that many LinkedIn and Instagram profiles would take months. So I built a pipeline that does it in hours.

The Problem

The contact database in HubSpot had names, emails, and program completion dates. That's it. To build an influencer and affiliate program, we needed:

LinkedIn follower and connection counts
Current job titles and company affiliations
Instagram handles and estimated reach
Whether they're actively creating content in the industry
A tier classification (A/B/C) for outreach prioritization

Pipeline Architecture

Step 1: Export contacts from HubSpot (n8n HTTP Request)
Step 2: Enrich LinkedIn via Apify (harvestapi actor)
Step 3: Extract Instagram handles from LinkedIn bios
Step 4: Enrich Instagram via Crawl4AI
Step 5: Score & Tier Classification (n8n Code Node)
Step 6: Write enriched data back to HubSpot
Step 7: Sync to NocoDB for dashboard view
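
Step 3 is the piece most easily shown in isolation. The sketch below is one possible way to pull an Instagram handle out of free-text bio content, written in Python for readability. The two patterns (a profile URL, or a labeled "IG: @handle" mention) and the function name extract_instagram_handle are assumptions for illustration; the patterns the original workflow matched are not stated here.

```python
import re
from typing import Optional

# Instagram usernames use letters, digits, periods, and underscores (max 30 chars).
_HANDLE = r"[A-Za-z0-9_.]{1,30}"

# Assumed pattern 1: a profile URL such as instagram.com/some_handle
_URL_PATTERN = re.compile(r"instagram\.com/(" + _HANDLE + r")", re.IGNORECASE)

# Assumed pattern 2: a labeled mention such as "IG: @some_handle".
# The "@" is required so phrases like "Instagram marketing" do not match.
_LABEL_PATTERN = re.compile(
    r"\b(?:instagram|insta|ig)\b\s*[:\-]?\s*@(" + _HANDLE + r")",
    re.IGNORECASE,
)


def extract_instagram_handle(bio: Optional[str]) -> Optional[str]:
    """Return a lowercase handle found in the bio, or None if nothing matches."""
    if not bio:
        return None
    for pattern in (_URL_PATTERN, _LABEL_PATTERN):
        match = pattern.search(bio)
        if match:
            # Drop a trailing period picked up from sentence punctuation.
            return match.group(1).rstrip(".").lower()
    return None
```

A bare "@handle" with no Instagram label is deliberately ignored in this sketch, since it could just as easily belong to another platform.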

The Scoring Algorithm

The Code node in step 5 implements a weighted scoring model across five dimensions:

Reach carries the most weight and uses follower thresholds to assign points. Someone with a large following gets the maximum score; smaller accounts earn fewer points at each lower threshold.

Connections adds a secondary signal for LinkedIn network size, which correlates with influence in professional communities even when follower counts are modest.

Profile completeness rewards verified accounts, filled-out bios, and listed skills. A polished profile signals someone actively managing their professional presence.

Geography adds a small bonus for US-based contacts, since the company's primary market was domestic.

Relevant titles checks each contact's job title for keywords that indicate active practitioners in the industry (consultants, facilitators, trainers, and so on).

The total score maps to a tier: A (high-value, prioritize outreach), B (medium value, nurture sequence), or C (low priority, awareness only).
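
To make the model concrete, here is a minimal sketch of a five-dimension weighted score with a tier mapping, in Python. Every number in it (point values, follower and connection thresholds, tier cutoffs), the keyword list, and the Profile field names are illustrative assumptions, not the values used in the actual Code node.

```python
from dataclasses import dataclass
from typing import Optional

# All weights, thresholds, and cutoffs below are illustrative assumptions.
FOLLOWER_TIERS = [(10_000, 40), (5_000, 30), (1_000, 20), (500, 10)]  # reach, max 40
CONNECTION_TIERS = [(500, 20), (250, 10)]                             # connections, max 20
TITLE_KEYWORDS = ("consultant", "facilitator", "trainer")
TIER_CUTOFFS = [(70, "A"), (40, "B")]  # anything below the last cutoff is "C"


@dataclass
class Profile:
    followers: Optional[int] = None
    connections: Optional[int] = None
    verified: bool = False
    about: Optional[str] = None
    top_skills: Optional[list] = None
    country: Optional[str] = None
    title: Optional[str] = None


def threshold_points(value: Optional[int], tiers: list) -> int:
    """Return the points for the highest threshold the value reaches, else 0."""
    for minimum, points in tiers:
        if (value or 0) >= minimum:
            return points
    return 0


def score_profile(p: Profile) -> int:
    reach = threshold_points(p.followers, FOLLOWER_TIERS)
    connections = threshold_points(p.connections, CONNECTION_TIERS)
    # 5 points each for a verified account, a filled-out bio, and listed skills.
    completeness = 5 * sum([bool(p.verified), bool(p.about), bool(p.top_skills)])
    geography = 5 if (p.country or "").upper() in ("US", "USA", "UNITED STATES") else 0
    title = (p.title or "").lower()
    relevance = 20 if any(keyword in title for keyword in TITLE_KEYWORDS) else 0
    return reach + connections + completeness + geography + relevance  # max 100


def tier_for(score: int) -> str:
    for cutoff, tier in TIER_CUTOFFS:
        if score >= cutoff:
            return tier
    return "C"
```

Keeping the thresholds in plain lists at the top means the weighting can be retuned without touching the scoring functions, which is useful once a first pass shows how the tiers actually distribute.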

Handling Null Data Gracefully

Real-world LinkedIn data is messy. Many profiles returned null for "about," "topSkills," or "title." The scoring logic handles every null case:

Null fields default to 0 points (not penalized, just not rewarded)
Missing Instagram handles skip the IG enrichment step entirely
Profiles with no current position still get scored on connections and verification
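
One way to get this behavior is to normalize each raw record before scoring, so that a null and a missing key look the same downstream. The sketch below assumes a raw profile arrives as a dictionary. The keys "about", "topSkills", and "title" come from the fields mentioned above; "followerCount", "connectionsCount", "verified", and "instagramHandle" are hypothetical names used only for illustration.

```python
from typing import Any


def as_int(value: Any) -> int:
    """Coerce a count to int; null or unparseable values become 0."""
    try:
        return int(value)
    except (TypeError, ValueError, OverflowError):
        return 0


def as_text(value: Any) -> str:
    """Return stripped text, or an empty string for null or non-string values."""
    return value.strip() if isinstance(value, str) else ""


def normalize_profile(raw: dict) -> dict:
    """Replace nulls and missing keys with neutral defaults that score 0 points."""
    return {
        "about": as_text(raw.get("about")),
        "title": as_text(raw.get("title")),
        "top_skills": raw.get("topSkills") or [],
        "followers": as_int(raw.get("followerCount")),        # hypothetical key
        "connections": as_int(raw.get("connectionsCount")),   # hypothetical key
        "verified": bool(raw.get("verified")),                # hypothetical key
        # Keep None (not "") so the pipeline can skip Instagram enrichment.
        "instagram_handle": as_text(raw.get("instagramHandle")) or None,
    }


def should_enrich_instagram(profile: dict) -> bool:
    """Only run the Instagram step when a handle was actually found."""
    return profile["instagram_handle"] is not None
```

With this in place, a profile that has no title or bio still carries its connection count and verification flag into scoring, and the Instagram step is skipped rather than called with an empty handle.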

Results

The distribution followed a predictable pattern: a small percentage of contacts landed in Tier A with high follower counts, active industry titles, and evidence of content creation. A larger middle tier showed moderate reach and relevant backgrounds but wasn't actively creating content. The majority fell into Tier C with low or no public presence.

The Tier A list became the seed for the company's influencer outreach program. These contacts were entered into a dedicated HubSpot sequence with personalized messaging based on their enriched profile data, using their current title, company, and follower count to craft relevant offers.

Edward Chalupa is a digital marketing specialist and founder of Whtnxt, a digital marketing and automation consultancy. Connect with him on LinkedIn or explore more at echalupa.com.
