Reddit Data API
The Reddit API provides powerful scraping and search capabilities for Reddit content, including posts, comments, users, and communities.
Overview
The Reddit service offers four distinct capabilities for interacting with Reddit data:
URL Scraping
Extract content from specific Reddit posts and comments
Post Search
Search across Reddit posts and comments
User Search
Find and retrieve information about Reddit users
Community Search
Discover Reddit communities (subreddits)
Job Types
scrapeurls
Retrieves the complete content of one or more Reddit URLs. This job type is specifically designed for extracting data from Reddit posts and comments.
Supported URLs:
- Post URLs: https://reddit.com/r/{community}/comments/{post_id}/...
- Comment URLs: https://reddit.com/r/{community}/comments/{post_id}/{title}/{comment_id}/...
Unsupported URLs:
- User profile URLs
- Community/subreddit URLs
- Reddit homepage or feed URLs
searchposts
Performs comprehensive searches across Reddit posts and comments based on keywords or phrases.
searchusers
Searches for Reddit user profiles and optionally retrieves their post history.
searchcommunities
Discovers Reddit communities (subreddits) based on search queries.
Parameters
Common Parameters
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
type | string | Yes | - | The job type: scrapeurls, searchposts, searchusers, or searchcommunities |
sort | string | No | "relevance" | Sort order for results. Options: "relevance", "hot", "top", "new", "rising", "comments" |
include_nsfw | boolean | No | false | Whether to include NSFW (Not Safe For Work) content |
after | string | No | - | ISO8601 timestamp to filter results created after this date/time |
max_items | integer | No | 10 | Maximum number of items to load in server cache |
max_results | integer | No | 10 | Maximum number of results to return per page |
next_cursor | string | No | - | Pagination cursor for retrieving subsequent pages |
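A minimal sketch of assembling the common parameters in Python. The helper name `common_params` is hypothetical, and how the resulting dictionary is sent (endpoint, authentication, request envelope) is not covered in this section, so it is left out. The one detail worth showing is that `after` expects an ISO8601 timestamp, which a timezone-aware `datetime` can produce directly.

```python
from datetime import datetime, timezone

# Sort options listed in the Common Parameters table.
SORT_OPTIONS = {"relevance", "hot", "top", "new", "rising", "comments"}


def common_params(job_type, sort="relevance", include_nsfw=False,
                  after=None, max_items=10, max_results=10, next_cursor=None):
    """Build the common parameters as a plain dict (hypothetical helper)."""
    if sort not in SORT_OPTIONS:
        raise ValueError(f"unsupported sort order: {sort!r}")
    params = {
        "type": job_type,
        "sort": sort,
        "include_nsfw": include_nsfw,
        "max_items": max_items,
        "max_results": max_results,
    }
    if after is not None:
        # isoformat() on a timezone-aware datetime yields an ISO8601 string.
        params["after"] = after.isoformat()
    if next_cursor is not None:
        params["next_cursor"] = next_cursor
    return params


params = common_params(
    "searchposts",
    sort="new",
    after=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(params["after"])  # 2024-01-01T00:00:00+00:00
```

Optional parameters are left out of the dictionary when unset so that the documented defaults apply.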
Job-Specific Parameters
scrapeurls Parameters
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
urls | array[string] | Yes | - | Array of Reddit post/comment URLs to scrape |
Search Job Parameters (searchposts, searchusers, searchcommunities)
Parameter | Type | Required | Default | Description |
---|---|---|---|---|
queries | array[string] | Yes | - | Array of search queries |
Additional Search-Specific Parameters
Parameter | Type | Applies To | Default | Description |
---|---|---|---|---|
skip_posts | boolean | searchusers | false | Skip user posts when searching users |
max_posts | integer | searchposts | 10 | Maximum posts per page |
max_comments | integer | searchposts | 10 | Maximum comments per page |
max_communities | integer | searchcommunities | 2 | Maximum communities per page |
max_users | integer | searchusers | 2 | Maximum users per page |
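The "Applies To" column can be checked on the client before a request is sent. The sketch below is one way to do that; `misplaced_params` is a hypothetical helper, and whether the service rejects or silently ignores a parameter sent with the wrong job type is not stated here.

```python
# Which job type each search-specific parameter applies to, per the table.
APPLIES_TO = {
    "skip_posts": "searchusers",
    "max_posts": "searchposts",
    "max_comments": "searchposts",
    "max_communities": "searchcommunities",
    "max_users": "searchusers",
}


def misplaced_params(params):
    """Return search-specific parameters that do not apply to params['type']."""
    job_type = params.get("type")
    return sorted(
        name for name in params
        if name in APPLIES_TO and APPLIES_TO[name] != job_type
    )


request = {
    "type": "searchusers",
    "queries": ["python"],
    "skip_posts": True,
    "max_posts": 5,  # applies to searchposts, not searchusers
}
print(misplaced_params(request))  # ['max_posts']
```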
API Examples
Scraping Reddit URLs
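A sketch of the parameters for a scrapeurls job, built as a Python dictionary and serialized to JSON. The URL is a placeholder that follows the documented post URL pattern. The endpoint, authentication, and any request envelope are not specified in this section, so sending the request is left out.

```python
import json

# Parameters for a scrapeurls job, using only fields documented above.
scrape_request = {
    "type": "scrapeurls",
    "urls": ["https://reddit.com/r/python/comments/abc123/example_post/"],
    "max_items": 10,
}

print(json.dumps(scrape_request, indent=2))
```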
Searching Reddit Posts
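A sketch of the parameters for a searchposts job. The queries and the date are illustrative values, and how the payload is submitted is an open detail not covered in this section.

```python
import json

# Two related queries combined in one request, limited to 2024 onwards.
search_posts_request = {
    "type": "searchposts",
    "queries": ["machine learning", "neural networks"],
    "sort": "top",
    "after": "2024-01-01T00:00:00Z",  # ISO8601 timestamp
    "include_nsfw": False,
    "max_posts": 10,
    "max_comments": 5,
}

print(json.dumps(search_posts_request, indent=2))
```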
Searching Reddit Users
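A sketch of the parameters for a searchusers job that needs profile information only, so `skip_posts` is set. The query is an illustrative value, and the transport details are again assumed to be handled elsewhere.

```python
import json

# Search for user profiles without loading each user's posts.
search_users_request = {
    "type": "searchusers",
    "queries": ["data science"],
    "skip_posts": True,
    "max_users": 2,
}

print(json.dumps(search_users_request, indent=2))
```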
Searching Reddit Communities
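A sketch of the parameters for a searchcommunities job. The query is an illustrative value. Only documented fields are used, and the request itself is not sent because this section does not name an endpoint.

```python
import json

# Discover subreddits matching a query, a few communities per page.
search_communities_request = {
    "type": "searchcommunities",
    "queries": ["programming"],
    "max_communities": 2,
}

print(json.dumps(search_communities_request, indent=2))
```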
Pagination
The Reddit API supports cursor-based pagination for handling large result sets. After receiving your initial results, pass the next_cursor value from the response as the next_cursor parameter of your next request to fetch subsequent pages.
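The cursor loop can be sketched as a generator. Several things here are assumptions rather than documented behavior: `fetch_page` is a caller-supplied function standing in for a real client, the response is a dictionary with `next_cursor` at the top level, a missing or empty cursor marks the last page, and the `results` key in the fake data is purely illustrative.

```python
def iter_pages(fetch_page, params, max_pages=10):
    """Yield successive response pages by following next_cursor.

    fetch_page is a hypothetical callable that sends the parameters and
    returns the decoded response as a dict.
    """
    request = dict(params)
    for _ in range(max_pages):
        page = fetch_page(request)
        yield page
        cursor = page.get("next_cursor")
        if not cursor:
            break  # assumed end-of-results signal
        request = {**params, "next_cursor": cursor}


# Stand-in for a real client so the sketch runs without network access.
FAKE_PAGES = {
    None: {"results": ["a", "b"], "next_cursor": "cursor-1"},
    "cursor-1": {"results": ["c"], "next_cursor": ""},
}


def fake_fetch(request):
    return FAKE_PAGES[request.get("next_cursor")]


pages = list(iter_pages(fake_fetch, {"type": "searchposts", "queries": ["python"]}))
print(len(pages))  # 2
```

The `max_pages` cap is a safety limit so that an unexpected cursor loop cannot run indefinitely.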
Best Practices
1. Efficient Querying
- Use specific search terms to get more relevant results
- Combine related queries in a single request when possible
- Set appropriate max_* parameters based on your needs
2. Rate Limiting
- Be mindful of API rate limits
- Implement exponential backoff for retries
- Cache results when appropriate
3. Content Filtering
- Use the after parameter to get only recent content
- Set include_nsfw appropriately for your use case
- Use skip_posts when you only need user profile information
4. URL Validation
- Ensure URLs are properly formatted Reddit URLs before scraping
- Validate that URLs point to posts or comments, not profiles or communities
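For the retry advice under Rate Limiting, a minimal sketch of exponential backoff with jitter follows. `RateLimitError` is a hypothetical exception: this section does not say how the API signals rate limiting, so a real client would need to map its own signal (for example an HTTP status) onto it. The delay values are illustrative.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a client when the API reports rate limiting."""


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                 sleep=time.sleep):
    """Run call(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))


attempts = []


def flaky():
    """Fails twice, then succeeds; stands in for a real API call."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError("slow down")
    return "ok"


# The sleep function is replaced with a no-op so the demo returns immediately.
result = with_backoff(flaky, sleep=lambda seconds: None)
print(result, len(attempts))  # ok 3
```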
Error Handling
Common error scenarios and how to handle them:
Invalid URL Format
When using scrapeurls, ensure all URLs follow the correct Reddit URL pattern for posts and comments.
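A pre-flight check along these lines can catch malformed URLs before a job is submitted. This is a sketch under assumptions: the regular expression is derived from the two URL patterns listed under Supported URLs, `classify_reddit_url` is a hypothetical helper, and accepting the `www.` hostname in addition to `reddit.com` is a guess. The service's own validation rules may differ.

```python
import re
from urllib.parse import urlparse

# Path shapes taken from the documented patterns:
#   /r/{community}/comments/{post_id}/...
#   /r/{community}/comments/{post_id}/{title}/{comment_id}/...
_PATH_RE = re.compile(
    r"^/r/(?P<community>[^/]+)/comments/(?P<post_id>[^/]+)"
    r"(?:/(?P<title>[^/]+)(?:/(?P<comment_id>[^/]+))?)?/?$"
)
# Assumption: www.reddit.com is accepted as well as reddit.com.
_HOSTS = {"reddit.com", "www.reddit.com"}


def classify_reddit_url(url):
    """Return "post", "comment", or None if the URL matches neither pattern."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in _HOSTS:
        return None
    match = _PATH_RE.match(parsed.path)
    if match is None:
        return None
    return "comment" if match.group("comment_id") else "post"


print(classify_reddit_url("https://reddit.com/r/python/comments/abc123/some_title/"))
print(classify_reddit_url("https://reddit.com/r/python/"))
```

The first call prints `post`; the second prints `None` because a community URL has no `comments/{post_id}` segment.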
Missing Required Parameters
- scrapeurls requires the urls array
- All search operations require the queries array
- These parameters are mutually exclusive: a request supplies urls or queries, not both
Invalid Job Type
Ensure the type parameter is one of: scrapeurls, searchposts, searchusers, or searchcommunities.
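The parameter checks in this section (job type, required array, mutual exclusivity) can be sketched as a single client-side validator. `validate_request` is a hypothetical helper, and treating an empty array as missing is an assumption; the error messages are illustrative and are not the service's own.

```python
JOB_TYPES = ("scrapeurls", "searchposts", "searchusers", "searchcommunities")


def validate_request(params):
    """Return a list of problems found in params; an empty list means none found."""
    job_type = params.get("type")
    if job_type not in JOB_TYPES:
        return [f"type must be one of: {', '.join(JOB_TYPES)}"]

    problems = []
    has_urls = bool(params.get("urls"))
    has_queries = bool(params.get("queries"))
    if has_urls and has_queries:
        problems.append("urls and queries are mutually exclusive")
    if job_type == "scrapeurls" and not has_urls:
        problems.append("scrapeurls requires a non-empty urls array")
    if job_type != "scrapeurls" and not has_queries:
        problems.append(f"{job_type} requires a non-empty queries array")
    return problems


print(validate_request({"type": "searchposts", "queries": ["python"]}))  # []
print(validate_request({"type": "scrapeurls"}))
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.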
Response Structure
While the exact response structure may vary, typical responses include:
- Posts: Title, content, author, timestamp, score, comments
- Comments: Content, author, timestamp, score, parent context
- Users: Username, karma, account age, recent posts (unless skip_posts is true)
- Communities: Name, description, subscriber count, creation date
Limitations
- URL scraping is limited to posts and comments only
- Search results are subject to Reddit’s search algorithm
- NSFW content is excluded by default
- Maximum limits apply to all result types to prevent excessive API usage