Get Hot Posts From Subreddit
Tags: #reddit #subreddit #data #hottopics #rss #information #opendata #snippet #dataframe
Last update: 2023-04-12 (Created: 2021-08-16)
Description: This notebook allows users to retrieve the hottest posts from a specified subreddit on Reddit.
!pip install praw
import praw
import pandas as pd
import numpy as np
from datetime import datetime
SUBREDDIT = "Python" # example: "CryptoCurrency"
- Select “script” as the type of app.
- Name your app and give it a description.
- Set the redirect URI to http://localhost:8080.
- After you click “create app”, a box shows your "client_id" and "client_secret".
- "user_agent" is the name of your app.
If you need help setting up and getting your API credentials, see: Get Reddit API Credentials
MY_CLIENT_ID = "EtAr0o-oKbVuEnPOFbrRqQ"
MY_CLIENT_SECRET = "LmNpsZuFM-WXyZULAayVyNsOhMd_ug"
MY_USER_AGENT = "script by u/naas"
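As an alternative to writing credentials directly in the notebook, they can be read from environment variables. The sketch below is only an illustration: the helper `load_reddit_credentials` and the variable names `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET` and `REDDIT_USER_AGENT` are hypothetical choices, not something this notebook or PRAW defines.

```python
import os


def load_reddit_credentials(env=os.environ):
    """Read the three Reddit API settings from a mapping of environment variables.

    The variable names below are hypothetical; pick any names you like.
    """
    names = {
        "client_id": "REDDIT_CLIENT_ID",
        "client_secret": "REDDIT_CLIENT_SECRET",
        "user_agent": "REDDIT_USER_AGENT",
    }
    # Fail early with a clear message if any value is missing or empty
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in names.items()}


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {
    "REDDIT_CLIENT_ID": "my-id",
    "REDDIT_CLIENT_SECRET": "my-secret",
    "REDDIT_USER_AGENT": "script by u/naas",
}
credentials = load_reddit_credentials(demo_env)
print(sorted(credentials))
```

The returned dictionary uses the same keyword names that `praw.Reddit` receives in this notebook, so it could be passed as `praw.Reddit(**credentials)`.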
Connect to the Reddit API
reddit = praw.Reddit(
    client_id=MY_CLIENT_ID,
    client_secret=MY_CLIENT_SECRET,
    user_agent=MY_USER_AGENT,
)
Get the subreddit-level data
posts = []
# Collect one row of fields for each of the 50 hottest posts
for post in reddit.subreddit(SUBREDDIT).hot(limit=50):
    posts.append(
        [
            post.title,
            post.score,
            post.id,
            post.subreddit,
            post.url,
            post.num_comments,
            post.selftext,
            post.created,
        ]
    )
posts = pd.DataFrame(
    posts,
    columns=[
        "title",
        "score",
        "id",
        "subreddit",
        "url",
        "num_comments",
        "body",
        "created",
    ],
)
- If you need more variables, use the built-in "vars()" function.
- Usage: "vars(post)" returns the post-level variables.
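To illustrate what "vars()" returns, here is a minimal sketch that lists the public attribute names of an object. It uses a stand-in object rather than a real PRAW post so that it runs without API access; `public_fields` and `fake_post` are hypothetical names introduced for this example, and it assumes the attributes of interest are stored on the instance.

```python
from types import SimpleNamespace


def public_fields(obj):
    """Return the sorted names of instance attributes that do not start with an underscore."""
    return sorted(name for name in vars(obj) if not name.startswith("_"))


# Stand-in for a post object; a real post would come from the loop above
fake_post = SimpleNamespace(title="Hello", score=42, _reddit=None)
print(public_fields(fake_post))  # ['score', 'title']
```

Any name returned this way can be added to the list of fields collected for each post, together with a matching column name in the DataFrame.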
Convert the Unix timestamp to a readable date-time
posts["created"] = pd.to_datetime(posts["created"], unit="s")
posts.head()
Hint: Filter on the "created" column to keep only the hot posts from the past 24 hours.
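One possible way to apply this hint is sketched below. It assumes "created" has already been converted with pd.to_datetime and that the timestamps are in UTC; the helper name `filter_last_24_hours` and the sample data are hypothetical and only serve the demonstration.

```python
import pandas as pd


def filter_last_24_hours(posts, now=None):
    """Return the rows whose "created" value is no older than 24 hours before `now`."""
    if now is None:
        # Timezone-naive UTC timestamp, comparable with pd.to_datetime(..., unit="s") output
        now = pd.Timestamp.now(tz="UTC").tz_localize(None)
    cutoff = now - pd.Timedelta(hours=24)
    return posts[posts["created"] >= cutoff]


# Stand-in data: one post created an hour before the reference time, one much older
example = pd.DataFrame(
    {
        "title": ["recent post", "old post"],
        "created": pd.to_datetime([1_700_000_000, 1_699_800_000], unit="s"),
    }
)
reference_time = pd.Timestamp(1_700_003_600, unit="s")
recent = filter_last_24_hours(example, now=reference_time)
print(recent["title"].tolist())  # ['recent post']
```

With the notebook's DataFrame, the call would be `filter_last_24_hours(posts)`, which uses the current time as the reference.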
- More info on the PRAW package used: https://praw.readthedocs.io/en/stable/