Reddit Data API

The Reddit Data API scrapes and searches Reddit content: posts, comments, users, and communities.

Overview

The Reddit service offers four distinct capabilities for interacting with Reddit data:

  • URL Scraping: Extract content from specific Reddit posts and comments
  • Post Search: Search across Reddit posts and comments
  • User Search: Find and retrieve information about Reddit users
  • Community Search: Discover Reddit communities (subreddits)

Job Types

scrapeurls

Retrieves the complete content of one or more Reddit URLs. This job type is specifically designed for extracting data from Reddit posts and comments.

Supported URLs:
  • Post URLs: https://reddit.com/r/{community}/comments/{post_id}/...
  • Comment URLs: https://reddit.com/r/{community}/comments/{post_id}/{title}/{comment_id}/...

Not Supported:
  • User profile URLs
  • Community/subreddit URLs
  • Reddit homepage or feed URLs

searchposts

Performs comprehensive searches across Reddit posts and comments based on keywords or phrases.

searchusers

Searches for Reddit user profiles and optionally retrieves their post history.

searchcommunities

Discovers Reddit communities (subreddits) based on search queries.

Parameters

Common Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| type | string | Yes | - | The job type: scrapeurls, searchposts, searchusers, or searchcommunities |
| sort | string | No | "relevance" | Sort order for results. Options: "relevance", "hot", "top", "new", "rising", "comments" |
| include_nsfw | boolean | No | false | Whether to include NSFW (Not Safe For Work) content |
| after | string | No | - | ISO8601 timestamp to filter results created after this date/time |
| max_items | integer | No | 10 | Maximum number of items to load in server cache |
| max_results | integer | No | 10 | Maximum number of results to return per page |
| next_cursor | string | No | - | Pagination cursor for retrieving subsequent pages |

Job-Specific Parameters

scrapeurls Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | array[string] | Yes | - | Array of Reddit post/comment URLs to scrape |

Search Job Parameters (searchposts, searchusers, searchcommunities)

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| queries | array[string] | Yes | - | Array of search queries |

Additional Search-Specific Parameters

| Parameter | Type | Applies To | Default | Description |
|---|---|---|---|---|
| skip_posts | boolean | searchusers | false | Skip user posts when searching users |
| max_posts | integer | searchposts | 10 | Maximum posts per page |
| max_comments | integer | searchposts | 10 | Maximum comments per page |
| max_communities | integer | searchcommunities | 2 | Maximum communities per page |
| max_users | integer | searchusers | 2 | Maximum users per page |

API Examples

Scraping Reddit URLs

{
  "type": "reddit",
  "arguments": {
    "type": "scrapeurls",
    "urls": [
      "https://reddit.com/r/ArtificialIntelligence/comments/1n1dwzv/what_math_should_i_focus_on_for_ai_and_why/",
      "https://reddit.com/r/programming/comments/abc123/example_post/"
    ],
    "sort": "new",
    "include_nsfw": false,
    "max_items": 100
  }
}

Searching Reddit Posts

{
  "type": "reddit",
  "arguments": {
    "type": "searchposts",
    "queries": [
      "machine learning",
      "artificial intelligence"
    ],
    "sort": "top",
    "after": "2024-01-01T00:00:00Z",
    "max_items": 100,
    "max_results": 20,
    "max_posts": 10,
    "max_comments": 10
  }
}

Searching Reddit Users

{
  "type": "reddit",
  "arguments": {
    "type": "searchusers",
    "queries": [
      "NASA",
      "SpaceX"
    ],
    "sort": "relevance",
    "skip_posts": false,
    "max_items": 50,
    "max_users": 5
  }
}

Searching Reddit Communities

{
  "type": "reddit",
  "arguments": {
    "type": "searchcommunities",
    "queries": [
      "artificial intelligence",
      "machine learning"
    ],
    "sort": "relevance",
    "max_items": 100,
    "max_results": 10,
    "max_communities": 5
  }
}
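
All four examples share the same envelope: an outer "type" of "reddit" and an "arguments" object whose own "type" names the job. The following is a minimal Python sketch of a helper that assembles that envelope. The function name build_job is hypothetical and not part of the API. Dropping None-valued parameters so that the documented defaults apply is an assumption about how a client might want to behave.

```python
import json
from typing import Any, Dict


def build_job(job_type: str, **params: Any) -> Dict[str, Any]:
    """Wrap job arguments in the {"type": "reddit", "arguments": {...}} envelope.

    build_job is a hypothetical helper, not part of the API.
    """
    # Skip unset (None) parameters so the documented defaults apply, and
    # never let a stray "type" keyword override the job type.
    extra = {k: v for k, v in params.items() if v is not None and k != "type"}
    return {"type": "reddit", "arguments": {"type": job_type, **extra}}


if __name__ == "__main__":
    job = build_job(
        "searchcommunities",
        queries=["artificial intelligence", "machine learning"],
        sort="relevance",
        max_communities=5,
    )
    print(json.dumps(job, indent=2))
```

How the resulting JSON is submitted (endpoint, authentication) is not covered here and is left to the client.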

Pagination

The Reddit API supports cursor-based pagination for handling large result sets. After receiving your initial results, use the next_cursor value from the response to fetch subsequent pages:
{
  "type": "reddit",
  "arguments": {
    "type": "searchposts",
    "queries": ["technology"],
    "max_results": 20,
    "next_cursor": "eyJvZmZzZXQiOjIwfQ=="
  }
}
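
The pagination flow above can be sketched as a loop in Python. This is an illustration under stated assumptions: submit is a hypothetical callable that sends one job and returns the parsed response as a dict, and the "results" key is a guess at where the page items live. Only next_cursor is named in this document.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional


def iter_pages(
    submit: Callable[[Dict[str, Any]], Dict[str, Any]],
    arguments: Dict[str, Any],
    max_pages: int = 10,
) -> Iterator[List[Any]]:
    """Yield one page of results at a time by following next_cursor.

    submit is a hypothetical function supplied by the caller.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):
        args = dict(arguments)  # copy so the caller's dict is not mutated
        if cursor:
            args["next_cursor"] = cursor
        response = submit({"type": "reddit", "arguments": args})
        # ASSUMPTION: page items are under "results"; adjust to the
        # actual response shape.
        yield response.get("results", [])
        cursor = response.get("next_cursor")
        if not cursor:
            break  # no cursor means there are no further pages
```

The max_pages cap is a safety limit chosen for the sketch so that a cursor that never runs out cannot loop forever.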

Best Practices

1. Efficient Querying

  • Use specific search terms to get more relevant results
  • Combine related queries in a single request when possible
  • Set appropriate max_* parameters based on your needs

2. Rate Limiting

  • Be mindful of API rate limits
  • Implement exponential backoff for retries
  • Cache results when appropriate
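
A minimal Python sketch of exponential backoff with jitter follows. RateLimitError is a hypothetical exception that your own client would raise when the API signals rate limiting. This document does not specify how rate limiting is reported, and the delay values are illustrative defaults, not documented limits.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised by your client when the API is rate limiting."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimitError with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return call()
        except RateLimitError:
            attempt += 1
            if attempt >= max_attempts:
                raise  # give up after the final attempt
            # Delay doubles each retry (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The sleep parameter is injectable so the retry logic can be exercised in tests without actually waiting.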

3. Content Filtering

  • Use the after parameter to get only recent content
  • Set include_nsfw appropriately for your use case
  • Use skip_posts when you only need user profile information
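
As an illustration of the after parameter, the Python sketch below computes an ISO8601 timestamp a given number of days in the past, in the same form as the "2024-01-01T00:00:00Z" value used in the examples. The helper name after_timestamp is hypothetical, and using UTC with a trailing Z is an assumption based on that example value.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def after_timestamp(days_back: int, now: Optional[datetime] = None) -> str:
    """Return an ISO8601 UTC timestamp days_back days before now.

    Hypothetical helper. A naive `now` is interpreted as local time
    by datetime.astimezone.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now.astimezone(timezone.utc) - timedelta(days=days_back)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


# Arguments for a search limited to roughly the last seven days.
arguments = {
    "type": "searchposts",
    "queries": ["machine learning"],
    "after": after_timestamp(7),
    "include_nsfw": False,
}
```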

4. URL Validation

  • Ensure URLs are properly formatted Reddit URLs before scraping
  • Validate that URLs point to posts or comments, not profiles or communities
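
One way to run these checks before submitting a scrapeurls job is sketched below in Python. It follows the post and comment URL patterns listed under scrapeurls. The function name classify_reddit_url is hypothetical, and accepting www.reddit.com alongside reddit.com is an assumption, since only reddit.com appears in the documented patterns.

```python
from urllib.parse import urlparse

# ASSUMPTION: only reddit.com appears in the documented patterns;
# www.reddit.com is accepted here as a convenience.
ALLOWED_HOSTS = {"reddit.com", "www.reddit.com"}


def classify_reddit_url(url: str) -> str:
    """Return "post", "comment", or "unsupported" for a URL (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return "unsupported"
    if (parsed.hostname or "").lower() not in ALLOWED_HOSTS:
        return "unsupported"
    parts = [p for p in parsed.path.split("/") if p]
    # Expected path: r/{community}/comments/{post_id}[/{title}[/{comment_id}]]
    if len(parts) < 4 or parts[0] != "r" or parts[2] != "comments":
        return "unsupported"
    # A sixth segment is the comment ID that follows the title.
    return "comment" if len(parts) >= 6 else "post"


urls = [
    "https://reddit.com/r/programming/comments/abc123/example_post/",
    "https://reddit.com/r/programming/",
]
# Keep only URLs that match a supported pattern.
scrapeable = [u for u in urls if classify_reddit_url(u) != "unsupported"]
```

This check only inspects the URL shape. It does not confirm that the post or comment exists.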

Error Handling

Common error scenarios and how to handle them:

Invalid URL Format

When using scrapeurls, ensure all URLs follow the correct Reddit URL pattern for posts and comments.

Missing Required Parameters

  • scrapeurls requires urls array
  • All search operations require queries array
  • urls and queries are mutually exclusive: send urls only with scrapeurls, and queries only with the search job types

Invalid Job Type

Ensure the type parameter is one of: scrapeurls, searchposts, searchusers, or searchcommunities.
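
The job type and required-parameter rules above can be checked on the client before a request is sent. The Python sketch below is one possible pre-flight check, not the service's own validation. The name validate_arguments is hypothetical, and treating an empty array as missing, and a urls/queries mix-up as an error, are assumptions drawn from the rules in this section.

```python
from typing import Any, Dict, List

JOB_TYPES = {"scrapeurls", "searchposts", "searchusers", "searchcommunities"}


def validate_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    job_type = arguments.get("type")
    if job_type not in JOB_TYPES:
        return [f"invalid job type: {job_type!r}"]

    # scrapeurls takes urls; every search job takes queries.
    if job_type == "scrapeurls":
        required, forbidden = "urls", "queries"
    else:
        required, forbidden = "queries", "urls"

    problems: List[str] = []
    value = arguments.get(required)
    # ASSUMPTION: an empty array is treated the same as a missing one.
    if not isinstance(value, list) or not value:
        problems.append(f"{job_type} requires a non-empty {required} array")
    if forbidden in arguments:
        problems.append(f"{forbidden} cannot be used with {job_type}")
    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with a request at once.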

Response Structure

While the exact response structure may vary, typical responses include:
  • Posts: Title, content, author, timestamp, score, comments
  • Comments: Content, author, timestamp, score, parent context
  • Users: Username, karma, account age, recent posts (unless skip_posts is true)
  • Communities: Name, description, subscriber count, creation date

Limitations

  • URL scraping is limited to posts and comments only
  • Search results are subject to Reddit’s search algorithm
  • NSFW content is excluded by default
  • Maximum limits apply to all result types to prevent excessive API usage