
Send interactions from post URL to HubSpot notes

Tags: #linkedin #hubspot #openai #interactions #post #url #send #notes
Author: Florent Ravenel
Last update: 2023-08-16 (Created: 2023-08-16)
Description: This notebook sends people's interactions (likes or comments) on a LinkedIn post to contact notes in HubSpot. If a person doesn't already exist in HubSpot, a new contact is created with their first name, last name, occupation, and LinkedIn URL. A prompt also categorizes each person by ICP (ideal customer profile), which enriches the HubSpot contact. This helps you track and score the targets acquired through your LinkedIn post campaigns.
Disclaimer: This code is in no way affiliated with, authorized, maintained, sponsored or endorsed by LinkedIn or any of its affiliates or subsidiaries. It uses an independent and unofficial API. Use at your own risk.
This project violates LinkedIn's User Agreement Section 8.2, and because of this, LinkedIn may (and will) temporarily or permanently ban your account. We are not responsible for your account being banned.

Input

Import libraries

import naas
from naas_drivers import linkedin, hubspot
import pandas as pd
import openai
import requests
from datetime import datetime, timezone
from difflib import SequenceMatcher
# Set the display option for max column width to ensure the link is fully displayed
pd.set_option('display.max_colwidth', None)

Setup variables

Mandatory
  • li_at: Cookie used to authenticate Members and API clients.
  • JSESSIONID: Cookie used for Cross Site Request Forgery (CSRF) protection and URL signature validation.
  • linkedin_url: URL of the LinkedIn post.
  • hs_access_token: Your HubSpot access token, used to authenticate requests to the HubSpot API.
Optional
  • exclude_profiles: List of LinkedIn profile URLs to exclude from processing, for example yourself and your team.
  • hubspot_owner_id: HubSpot owner ID assigned to newly created contacts.
  • contact_properties: List of HubSpot contact properties (internal names) to retrieve.
  • hs_linkedin_url: HubSpot property (internal name) that contains the LinkedIn profile URL.
  • custom_properties: HubSpot properties to set when a new contact is created. It must be a dictionary with the HubSpot internal name as key and a value HubSpot accepts for that property as value; otherwise, the contact won't be created.
  • openai_api_key: Your OpenAI API key, used to authenticate requests to the OpenAI API.
  • prompt: String used as the prompt for OpenAI's text generation API. It helps you classify the people who interacted with your post.
  • icp_hubspot: Dictionary with the HubSpot internal name as key and the list of values expected in HubSpot as value. This property has to match the categories named in your prompt.
# Mandatory
li_at = naas.secret.get("LINKEDIN_LI_AT") or "YOUR_LINKEDIN_LI_AT"
JSESSIONID = naas.secret.get("LINKEDIN_JSESSIONID") or "YOUR_LINKEDIN_JSESSIONID"
linkedin_url = input("Post URL:")
hs_access_token = naas.secret.get("HS_ACCESS_TOKEN") or "YOUR_HS_ACCESS_TOKEN"
# Optional
exclude_profiles = [
    "https://www.linkedin.com/in/ACoAABCNSioBW3YZHc2lBHVG0E_TXYWitQkmwog",
    "https://www.linkedin.com/in/ACoAAA6EYJABlJdZG2ZQLuLkpCu2Ny8pqa065b8",
    "https://www.linkedin.com/in/ACoAAAJHE7sB5OxuKHuzguZ9L6lfDHqw--cdnJg"
]
hubspot_owner_id = "158373005"
contact_properties = [
    "hs_object_id",
    "firstname",
    "lastname",
    "email",
    "linkedinbio",
    "jobtitle"
]
hs_linkedin_url = "linkedinbio"
custom_properties = {
    "naas_target": "Yes"
}
openai_api_key = naas.secret.get("OPENAI_API_KEY") or "YOUR_OPENAI_API_KEY"
prompt = """
I am building Naas, the Universal open source data platform.
I have 2 ideal customer profiles: one is a 'data producer' with basic knowledge of Python who could use our Notebook templates to create plugins.
These plugins are then distributed via our NaasAI Chat interface.
The other one is a 'data consumer' who will enjoy using NaasAI Chat for its basic LLM integration but is also interested in having their own data available, and hence in working with the data producer.
I will give you the [OCCUPATION] from a profile I get from LinkedIn, you will return strictly and only one of the following values inside the single quotes based on the best match: 'DataProducer', 'DataConsumer', 'NotICP' or 'Don't know' if you don't find a plausible match with the first 3 values.
Don't put the results into quotes.
"""
icp_hubspot = {
    "icp_type": ["NotICP", "DataConsumer", "DataProducer"]
}
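
Because icp_hubspot has to match the categories named in the prompt, a quick check before calling OpenAI can catch a mismatch early. The sketch below is illustrative only: `missing_icp_values` is a hypothetical helper that is not part of the notebook or of naas_drivers, and the sample prompt is shortened for the example.

```python
def missing_icp_values(prompt, icp_hubspot):
    """Return, per HubSpot property, the expected values the prompt never mentions."""
    missing = {}
    for property_name, values in icp_hubspot.items():
        absent = [value for value in values if value not in prompt]
        if absent:
            missing[property_name] = absent
    return missing


# Shortened sample prompt, for illustration only
sample_prompt = "Return only one of 'DataProducer', 'DataConsumer', 'NotICP' or 'Don't know'."
sample_icp = {"icp_type": ["NotICP", "DataConsumer", "DataProducer"]}

# An empty dictionary means every expected HubSpot value appears in the prompt
print(missing_icp_values(sample_prompt, sample_icp))
```

The check is a plain substring test, so it only confirms that each value is spelled the same way in both places. It does not verify that the model will actually return it.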

Model

Get post likes

df_likes = linkedin.connect(li_at, JSESSIONID).post.get_likes(linkedin_url)
print("Number of likes: ", len(df_likes))
df_likes.head(1)

Get post comments

df_comments = linkedin.connect(li_at, JSESSIONID).post.get_comments(linkedin_url)
print("Number of comments: ", len(df_comments))
df_comments.head(1)

Create database of LinkedIn profiles

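def predict_category(
    openai_api_key,
    prompt,
    summary
):
    # Return "TBD" if no OpenAI key is set
    if not openai_api_key:
        return "TBD"
    # Profiles without a usable occupation (empty or NaN) cannot be classified
    if not isinstance(summary, str) or not summary.strip():
        return "NotICP"
    # Connect to OpenAI and inject the occupation into the prompt
    openai.api_key = openai_api_key
    prompt = prompt.replace("[OCCUPATION]", summary)
    # Legacy Completions call (requires openai<1.0)
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        temperature=0,
        max_tokens=60
    )
    # Keep only the last line of the completion as the category
    return response.choices[0].text.split("\n")[-1].strip()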
def create_profiles_db(
    df_likes,
    df_comments,
    exclude_profiles,
    openai_api_key,
    prompt
):
    # Columns shared by likes and comments that describe a profile
    to_keep = [
        "PROFILE_ID",
        "PROFILE_URL",
        "PUBLIC_ID",
        "FIRSTNAME",
        "LASTNAME",
        "FULLNAME",
        "OCCUPATION",
        "PROFILE_PICTURE",
    ]
    # Concat likes and comments on these columns and drop duplicates
    df = pd.concat([df_likes, df_comments])[to_keep].drop_duplicates(to_keep)
    # Cleaning: keep profile URLs only and drop excluded profiles
    df = df[
        (df["PROFILE_URL"].str.contains("https://www.linkedin.com/in/", regex=False, na=False)) &
        ~(df["PROFILE_URL"].isin(exclude_profiles))
    ].reset_index(drop=True)
    # Determine if profiles match with ICP
    df["ICP"] = df["OCCUPATION"].apply(lambda occupation: predict_category(openai_api_key, prompt, occupation))
    return df.reset_index(drop=True)
df_profiles = create_profiles_db(
    df_likes,
    df_comments,
    exclude_profiles,
    openai_api_key,
    prompt
)
print("Unique profiles:", len(df_profiles))
df_profiles.head(1)
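
The prompt asks for exactly one label without quotes, and a contact's ICP property is only set when the returned value is one of the values listed in icp_hubspot. A completion can still come back with stray quotes, whitespace, or different casing. The following is a minimal sketch of a cleanup step, assuming the three labels used above; `normalize_category` and `ALLOWED_LABELS` are hypothetical names that do not exist in the notebook.

```python
ALLOWED_LABELS = ("DataProducer", "DataConsumer", "NotICP")


def normalize_category(raw, allowed=ALLOWED_LABELS, fallback="Don't know"):
    """Map a raw completion to one of the allowed labels, or to the fallback."""
    # Drop surrounding whitespace and any quotes the model added anyway
    cleaned = str(raw).strip().strip("'\"").strip()
    for label in allowed:
        if cleaned.lower() == label.lower():
            return label
    return fallback


print(normalize_category(" 'dataproducer' "))
print(normalize_category("Software engineer"))
```

Applied to the ICP column, for example with `df["ICP"].apply(normalize_category)`, this would keep near-miss answers from being dropped when the value is compared with icp_hubspot.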

Get all contacts from HubSpot

df_contacts = hubspot.connect(hs_access_token).contacts.get_all(contact_properties)
print("HubSpot Contact:", len(df_contacts))
df_contacts.head(1)

Find HubSpot ID for leads

def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()
def get_hubspot_id(
    df,
    df_contacts,
    hs_linkedin_url,
    hubspot_owner_id=None,
    icp_hubspot=None,
    custom_properties=None
):
    # Init: properties shared by every contact to be created
    base_properties = dict(custom_properties or {})
    icp_hubspot = icp_hubspot or {}
    # Add HubSpot owner ID
    if hubspot_owner_id:
        base_properties["hubspot_owner_id"] = hubspot_owner_id
    # Add TBD if LinkedIn URL col is empty
    df_contacts[hs_linkedin_url] = df_contacts[hs_linkedin_url].fillna("TBD")
    # Create fullname on HubSpot contact database
    df_contacts["fullname"] = df_contacts["firstname"].fillna("TBD") + " " + df_contacts["lastname"].fillna("TBD")
    # Loop on interactions profiles
    for row in df.itertuples():
        # Init
        hs_ids = []
        index = row.Index
        firstname = row.FIRSTNAME
        lastname = row.LASTNAME
        fullname = row.FULLNAME
        profile_url = row.PROFILE_URL
        icp_type = row.ICP
        jobtitle = row.OCCUPATION
        profile_id = row.PROFILE_ID
        public_id = row.PUBLIC_ID
        print("Starting with:", fullname)
        # Find if interaction PROFILE_ID or PUBLIC_ID match with HubSpot LinkedIn bio
        for x in [profile_id, public_id]:
            # Skip missing IDs; match literally rather than as a regular expression
            if not isinstance(x, str) or not x:
                continue
            tmp_df = df_contacts[df_contacts[hs_linkedin_url].str.contains(x, regex=False)]
            if len(tmp_df) > 0:
                hs_ids += tmp_df["hs_object_id"].unique().tolist()
        # Find if interaction FULLNAME match with HubSpot first and last name
        if len(hs_ids) == 0:
            for f in df_contacts["fullname"].unique():
                ratio = similarity(f, fullname)
                if ratio > 0.9:
                    tmp_df = df_contacts[df_contacts["fullname"] == f].reset_index(drop=True)
                    hs_ids += tmp_df["hs_object_id"].unique().tolist()
        # Create contact if does not exist
        if len(hs_ids) == 0:
            print("❌ No HubSpot IDs found, contact to be created")
            # Start from a fresh copy so values from a previous contact are not carried over
            properties = dict(base_properties)
            # Add contact properties
            properties["firstname"] = firstname
            properties["lastname"] = lastname
            properties["jobtitle"] = jobtitle
            properties[hs_linkedin_url] = profile_url
            # Add ICP
            if len(icp_hubspot) > 0:
                icp_key = list(icp_hubspot.keys())[0]
                icp_values = list(icp_hubspot.values())[0]
                if icp_type in icp_values:
                    properties[icp_key] = icp_type
            # Create contact using naas drivers hubspot
            create_contact = {"properties": properties}
            # contacts.send is assumed to return the ID of the new contact
            hs_ids = [hubspot.connect(hs_access_token).contacts.send(create_contact)]
        else:
            print(f"✅ {len(hs_ids)} contact IDs found.")
        # Remove duplicates, preserving order
        hs_ids = list(dict.fromkeys(str(x) for x in hs_ids))
        # Transform list to string
        hs_ids = ",".join(hs_ids)
        # Add hubspot IDs in df
        df.loc[index, "HUBSPOT_IDS"] = hs_ids
    return df
df_leads = get_hubspot_id(
    df_profiles,
    df_contacts,
    hs_linkedin_url,
    hubspot_owner_id,
    icp_hubspot,
    custom_properties,
)
print("Leads:", len(df_leads))
df_leads.head(1)
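
When no LinkedIn ID matches, the lookup falls back to comparing full names and accepts a HubSpot contact if the similarity ratio is above 0.9. To get a feel for that threshold, here is a small standalone sketch using difflib only. The names are made up for illustration, and `is_same_person` is a hypothetical wrapper that is not defined in the notebook.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Ratio between 0.0 (nothing in common) and 1.0 (identical strings)
    return SequenceMatcher(None, a, b).ratio()


def is_same_person(a, b, threshold=0.9):
    # Same rule as the fallback above: strictly greater than the threshold
    return similarity(a, b) > threshold


for candidate in ["Jane Doe", "Jane Doee", "John Doe"]:
    ratio = similarity("Jane Doe", candidate)
    print(candidate, round(ratio, 2), is_same_person("Jane Doe", candidate))
```

The comparison is case-sensitive, so lowercasing both names before comparing may be worth considering if your HubSpot data uses inconsistent casing. A high threshold reduces false matches between different people who share a last name, at the cost of missing contacts whose names are written quite differently.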

Output

Send likes or comments to HubSpot contact notes

def get_association_from_contact(
    token,
    contact_id,
    endpoint,
):
    # Init
    results = []
    # Requests
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}"
    }
    url = f"https://api.hubapi.com/crm/v4/objects/contacts/{contact_id}/associations/{endpoint}"
    # Response: fall back to an empty list if the request fails or has no results
    res = requests.get(url, headers=headers)
    if res.status_code == 200:
        results = res.json().get("results") or []
    return results
def retrieve_object_details(
    token,
    object_id,
    object_type,
    properties=None,
):
    # Init
    data = {}
    params = {
        "archived": "false"
    }
    # Requests
    if properties:
        params["properties"] = properties
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}"
    }
    url = f"https://api.hubapi.com/crm/v3/objects/{object_type}/{object_id}"
    # Response
    res = requests.get(url, headers=headers, params=params)
    if res.status_code == 200:
        data = res.json().get("properties")
    else:
        print(res.text)
    # Return an empty DataFrame (no rows) when nothing was retrieved
    return pd.DataFrame([data]) if data else pd.DataFrame()
def create_activity_df(
    token,
    contact_id,
    activity,
    properties_dict=None,
):
    # Init
    df = pd.DataFrame()
    if not properties_dict:
        properties_dict = {
            "hs_object_id": "activity_hs_id",
            "hs_lastmodifieddate": "activity_date",
            "hs_body_preview": "activity_content",
            "hs_body_preview_html": "activity_content_html"
        }
    properties = list(properties_dict)
    # List activities
    data = get_association_from_contact(
        token,
        contact_id,
        activity
    )
    print("Data fetched:", len(data))
    for d in data:
        object_id = d.get("toObjectId")
        tmp_df = retrieve_object_details(
            token,
            object_id,
            activity,
            properties
        )
        # Skip objects that returned no properties; missing columns become NaN
        if not tmp_df.empty:
            tmp_df = tmp_df.reindex(columns=properties)
            df = pd.concat([df, tmp_df])
    if len(df) > 0:
        df = df.rename(columns=properties_dict)
        if 'activity_type' not in df:
            df.insert(loc=1, column="activity_type", value=activity.upper())
    return df.reset_index(drop=True)
def create_hubspot_note(
    api_key,
    body,
    object_datetime=None,
    contact_ids=None,
):
    # Init
    data = []
    # Get the current timestamp in UTC
    if not object_datetime:
        object_datetime = datetime.now(timezone.utc)
    # Naive datetimes are treated as UTC, then converted to epoch milliseconds
    if object_datetime.tzinfo is None:
        object_datetime = object_datetime.replace(tzinfo=timezone.utc)
    hs_timestamp = str(int(object_datetime.timestamp() * 1000))
    # Create contact associations
    contacts = []
    for contact_id in contact_ids or []:
        contacts.append(
            {
                "to": {"id": contact_id},
                "types": [
                    {
                        "associationCategory": "HUBSPOT_DEFINED",
                        "associationTypeId": 202
                    }
                ]
            }
        )
    # Requests
    payload = {
        "properties":
            {
                "hs_note_body": body,
                "hs_timestamp": hs_timestamp
            },
        "associations": contacts
    }
    headers = {
        'accept': "application/json",
        'content-type': "application/json",
        'authorization': f"Bearer {api_key}"
    }
    url = "https://api.hubapi.com/crm/v3/objects/notes"
    # Response
    res = requests.post(url, headers=headers, json=payload)
    if res.status_code == 201:
        data = res.json()
        print("✅ Note successfully created:", data.get('id'))
    else:
        print(res)
        print(res.text)
    return data
def delete_note(
    token,
    object_id,
):
    # Requests
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}"
    }
    url = f"https://api.hubapi.com/crm/v3/objects/notes/{object_id}"
    # Response
    res = requests.delete(url, headers=headers)
    if res.status_code == 204:
        print(f"✅ Note '{object_id}' successfully deleted!")
    else:
        print(res.text)
    return res
def delete_specific_note(df, p):
    object_ids = df.loc[df["activity_content_html"] == p, "activity_hs_id"].unique().tolist()
    for object_id in object_ids:
        delete_note(hs_access_token, object_id)
# Get meta from post URL
df_meta = linkedin.connect(li_at, JSESSIONID).post.get_stats(linkedin_url)
title = df_meta.loc[0, "TITLE"]
post_url = df_meta.loc[0, "POST_URL"]
author = df_meta.loc[0, "AUTHOR_NAME"]
author_url = df_meta.loc[0, "AUTHOR_URL"]
# Set to True to delete the matching existing notes found for this post
remove_note = False
for row in df_leads.itertuples():
    # Init
    fullname = row.FULLNAME
    profile_id = row.PROFILE_ID
    hs_ids = row.HUBSPOT_IDS
    print("Starting with:", fullname, hs_ids)
    # Get likes and comments
    tmp_likes = pd.DataFrame()
    if len(df_likes) > 0:
        tmp_likes = df_likes[df_likes["PROFILE_ID"] == profile_id].reset_index(drop=True)
    tmp_comments = pd.DataFrame()
    if len(df_comments) > 0:
        tmp_comments = df_comments[df_comments["PROFILE_ID"] == profile_id].reset_index(drop=True)
    print(f"-> {len(tmp_likes)} likes and {len(tmp_comments)} comments")
    # Get notes
    if hs_ids != "TO_BE_CREATED":
        hs_ids = hs_ids.split(",")
        df_notes = pd.DataFrame()
        # Get notes from contact
        for uid in hs_ids:
            tmp_notes = create_activity_df(
                hs_access_token,
                uid,
                "notes",
            )
            df_notes = pd.concat([df_notes, tmp_notes]).reset_index(drop=True)
        # Create 'Comments' notes
        if len(tmp_comments) > 0:
            for c in tmp_comments.itertuples():
                comment = c.TEXT
                create_note_comment = True
                # Skip creation if a note with this post URL and comment already exists
                if len(df_notes) > 0:
                    for p in df_notes["activity_content_html"].unique().tolist():
                        if str(post_url) in str(p) and str(comment) in str(p):
                            create_note_comment = False
                            # Delete note if needed
                            if remove_note:
                                delete_specific_note(df_notes, p)
                # Create note
                if create_note_comment:
                    timestamp = datetime.strptime(c.CREATED_TIME, "%Y-%m-%d %H:%M:%S")
                    body = f"LinkedIn interaction - Comment '{comment}' on <a href='{post_url}'>'{title}'</a> from <a href='{author_url}'>'{author}'</a>"
                    create_hubspot_note(
                        hs_access_token,
                        body,
                        timestamp,
                        hs_ids
                    )
        # Create 'Likes' notes
        if len(tmp_likes) > 0:
            create_note_like = True
            # Skip creation if a 'Like' note for this post URL already exists
            if len(df_notes) > 0:
                for p in df_notes["activity_content_html"].unique().tolist():
                    if (str(post_url) in str(p)) and ("Like" in str(p)):
                        create_note_like = False
                        # Delete note if needed
                        if remove_note:
                            delete_specific_note(df_notes, p)
            # Create note
            if create_note_like:
                # Use the CREATED_TIME of the last comment row as the note timestamp;
                # with no comments, the note is dated at creation time
                like_timestamp = None
                if len(df_comments) > 0:
                    like_timestamp = datetime.strptime(df_comments.loc[df_comments.index[-1], "CREATED_TIME"], "%Y-%m-%d %H:%M:%S")
                body = f"LinkedIn interaction - Like on <a href='{post_url}'>'{title}'</a> from <a href='{author_url}'>{author}</a>"
                create_hubspot_note(
                    hs_access_token,
                    body,
                    like_timestamp,
                    hs_ids
                )
    else:
        print('❌ Contact to be created')
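
The note timestamp is sent to HubSpot as a string of epoch milliseconds. Formatting it with `strftime("%s")` is not a documented Python directive: it depends on the platform's C library and does not take the datetime's tzinfo into account. A portable alternative is sketched below, assuming that naive datetimes (such as a parsed CREATED_TIME) are meant as UTC. `to_epoch_ms` is a hypothetical helper name that does not exist in the notebook.

```python
from datetime import datetime, timezone


def to_epoch_ms(dt=None):
    """Convert a datetime to a string of epoch milliseconds; naive values are read as UTC."""
    if dt is None:
        dt = datetime.now(timezone.utc)
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return str(int(dt.timestamp() * 1000))


# Same format as the CREATED_TIME column parsed in the loop above
created = datetime.strptime("2023-08-16 09:30:00", "%Y-%m-%d %H:%M:%S")
print(to_epoch_ms(created))
```

Because the conversion goes through `datetime.timestamp()` on a timezone-aware value, the result should not depend on the time zone of the machine running the notebook.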