
Scoring 5,000 Influencers with AI: An Automated Enrichment Pipeline

How I built an n8n workflow that scrapes LinkedIn and Instagram data for 5,000+ contacts and classifies them into influencer tiers.

The coaching certification company I worked for has over 5,000 graduates. Each one is a potential influencer and affiliate partner - but the challenge is figuring out which ones actually have reach and engagement. Manually reviewing 5,000 LinkedIn and Instagram profiles would take months. So I built a pipeline that does it in hours.

The Problem

The graduate database in HubSpot had names, emails, and certification dates. That's it. To build an influencer and affiliate program, we needed:

  • LinkedIn follower and connection counts
  • Current job titles and company affiliations
  • Instagram handles and estimated reach
  • Whether they're actively creating content about coaching
  • A tier classification (A/B/C) for outreach prioritization

Pipeline Architecture

Step 1: Export grad contacts from HubSpot (n8n HTTP Request)
Step 2: Enrich LinkedIn via Apify (harvestapi actor)
Step 3: Extract Instagram handles from LinkedIn bios
Step 4: Enrich Instagram via Crawl4AI
Step 5: Score & Tier Classification (n8n Code Node)
Step 6: Write enriched data back to HubSpot
Step 7: Sync to NocoDB for dashboard view
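
Step 3 is the least obvious link in the chain, so here is a minimal sketch of how handle extraction could work in a Code node. The function name extractInstagramHandle and both regular expressions are illustrative assumptions, not the exact logic in the workflow. It looks first for a profile URL, then for a labelled mention such as "IG: @handle".

```javascript
// Hypothetical helper (not taken from the workflow): pull an Instagram
// handle out of free-form LinkedIn bio text. Returns null if none is found.
function extractInstagramHandle(bio) {
  if (!bio) return null; // null or empty bio: nothing to extract

  // Strip trailing periods picked up from sentence punctuation, then lowercase.
  const clean = (h) => h.replace(/\.+$/, '').toLowerCase() || null;

  // Case 1: a profile URL such as instagram.com/some.handle
  const urlMatch = bio.match(/instagram\.com\/([A-Za-z0-9._]{1,30})/i);
  if (urlMatch) return clean(urlMatch[1]);

  // Case 2: a labelled mention such as "IG: @handle" or "Instagram | handle".
  // A separator or "@" is required so that plain phrases like
  // "Instagram marketing" do not produce a false handle.
  const labelMatch = bio.match(
    /\b(?:ig|insta|instagram)\b\s*(?:[:\-|]\s*@?|@)([A-Za-z0-9._]{1,30})/i
  );
  if (labelMatch) return clean(labelMatch[1]);

  return null;
}

console.log(extractInstagramHandle('Leadership coach | IG: @Jane_Coaches')); // jane_coaches
```

Returning null rather than an empty string makes it easy for a downstream node to skip contacts with no handle.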

The Scoring Algorithm

The Code node in step 5 implements a weighted scoring model:

let score = 0; // running total, maximum 100

// Reach (0-40 pts)
if (followers >= 5000) score += 40;
else if (followers >= 2000) score += 25;
else if (followers >= 500) score += 10;

// Connections (0-20 pts)
if (connections >= 3000) score += 20;
else if (connections >= 1000) score += 10;

// Profile completeness (0-20 pts)
if (verified) score += 10;
if (hasAbout) score += 5;
if (hasSkills) score += 5;

// US-based bonus (0-10 pts)
if (country === 'US') score += 10;

// Coach/relevant title (0-10 pts)
const keywords = ['coach', 'leadership', 'facilitator', 'trainer'];
const safeTitle = (title || '').toLowerCase(); // title is often null; avoid a TypeError
if (keywords.some(k => safeTitle.includes(k))) score += 10;

// Tier assignment
let tier;
if (score >= 70) tier = 'A';      // High-value, prioritize outreach
else if (score >= 45) tier = 'B';  // Medium value, nurture sequence
else tier = 'C';                   // Low priority, awareness only
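
To make the arithmetic concrete, here is the same set of thresholds wrapped in a standalone function, with two worked examples. This is a sketch: the function name scoreProfile and the shape of the input object are assumptions for illustration, and the sample profiles are made up.

```javascript
// Sketch: the thresholds above as a reusable function.
// The input object shape is an assumption made for this example.
function scoreProfile(profile) {
  const followers = profile.followers ?? 0;
  const connections = profile.connections ?? 0;
  const title = (profile.title || '').toLowerCase(); // null-safe
  let score = 0;

  // Reach (0-40 pts)
  if (followers >= 5000) score += 40;
  else if (followers >= 2000) score += 25;
  else if (followers >= 500) score += 10;

  // Connections (0-20 pts)
  if (connections >= 3000) score += 20;
  else if (connections >= 1000) score += 10;

  // Profile completeness (0-20 pts)
  if (profile.verified) score += 10;
  if (profile.hasAbout) score += 5;
  if (profile.hasSkills) score += 5;

  // US-based bonus (0-10 pts)
  if (profile.country === 'US') score += 10;

  // Coach/relevant title (0-10 pts)
  const keywords = ['coach', 'leadership', 'facilitator', 'trainer'];
  if (keywords.some(k => title.includes(k))) score += 10;

  const tier = score >= 70 ? 'A' : score >= 45 ? 'B' : 'C';
  return { score, tier };
}

// 40 (reach) + 10 (connections) + 10 (verified) + 5 (about) + 10 (US) + 10 (title) = 85
console.log(scoreProfile({
  followers: 6200, connections: 1500, verified: true,
  hasAbout: true, hasSkills: false, country: 'US', title: 'Executive Coach',
})); // { score: 85, tier: 'A' }

// 10 (reach) + 10 (connections) + 5 (about) + 5 (skills) = 30; null title adds nothing
console.log(scoreProfile({
  followers: 800, connections: 1200, verified: false,
  hasAbout: true, hasSkills: true, country: 'CA', title: null,
})); // { score: 30, tier: 'C' }
```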

Handling Null Data Gracefully

Real-world LinkedIn data is messy. Many profiles returned null for "about," "topSkills," or "title." The scoring logic handles every null case:

  • Null fields default to 0 points (not penalized, just not rewarded)
  • Missing Instagram handles skip the IG enrichment step entirely
  • Profiles with no current position still get scored on connections and verification
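
One way to get this behavior is to normalize each raw record before it reaches the scoring step. The sketch below assumes the field names about, topSkills, and title mentioned above. The other raw field names (followerCount, connectionsCount, verified, countryCode) are hypothetical stand-ins for whatever the LinkedIn actor actually returns.

```javascript
// Sketch: coerce a raw, possibly sparse profile into safe scoring inputs.
// Raw field names other than about/topSkills/title are assumptions.
function normalizeProfile(raw) {
  const p = raw ?? {};
  return {
    // Missing or non-numeric counts become 0, so they earn no points
    followers: Number(p.followerCount) || 0,
    connections: Number(p.connectionsCount) || 0,
    verified: p.verified === true,
    // "about" counts only if it is a non-blank string
    hasAbout: typeof p.about === 'string' && p.about.trim().length > 0,
    // topSkills may arrive as an array or a string; either way, empty means false
    hasSkills: Array.isArray(p.topSkills) ? p.topSkills.length > 0 : Boolean(p.topSkills),
    country: p.countryCode ?? '',
    title: p.title ?? '',
  };
}

// A profile with no current position and no "about" still yields usable inputs
console.log(normalizeProfile({ connectionsCount: 1400, verified: true, about: null, title: null }));
```

With every field coerced to a number, boolean, or string, the scoring comparisons never see null, so a sparse profile simply earns fewer points instead of throwing an error.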

Results

  • Tier A: ~8% of graduates (400 contacts) - high follower counts, active coaching titles, content creators
  • Tier B: ~22% of graduates (1,100 contacts) - moderate reach, relevant but not actively creating
  • Tier C: ~70% of graduates (3,500 contacts) - low or no public presence

The Tier A list became the seed for the company's influencer outreach program. These contacts were entered into a dedicated HubSpot sequence with messaging personalized from their enriched profile data: current title, company, and follower count were used to craft relevant offers.

Download the Workflow

The Influencer Enrichment Pipeline workflow is available as a ready-to-import n8n JSON file. It includes the HubSpot contact pull, Apify LinkedIn enrichment via the harvestapi actor, the weighted scoring algorithm Code node, tier classification logic, and NocoDB dashboard integration. All credentials and API keys have been replaced with placeholders.

Download Influencer Enrichment Workflow

Requires: HubSpot OAuth2, Apify API token (harvestapi LinkedIn actor), NocoDB API token.


Edward Chalupa is a digital marketing specialist and founder of Whtnxt, a digital marketing and automation consultancy. Connect with him on LinkedIn or explore more at echalupa.com.