# Twitter Scraper

This guide will help you configure your Masa Node as an X/Twitter scraper.
## Prerequisites

- A running, staked Masa Node (see Binary Installation or Docker Setup)
- The `twitter_cookies.json.example` file from the root directory of your Masa Node, renamed to `twitter_cookies.json`
- An X/Twitter Pro Account: without a Pro account, you will not be able to scrape X/Twitter data

**Important:** A paid X/Twitter Pro account is absolutely necessary for scraping X/Twitter data. Ensure you have obtained one before proceeding with the configuration.
## Configuration Process
### Prepare X/Twitter cookies

Create a `twitter_cookies.json` file with your X/Twitter account credentials. To obtain this information:

a. Log in to X/Twitter in your web browser
b. Open the browser's developer tools (usually F12 or right-click > Inspect)
c. Go to the "Application" or "Storage" tab
d. In the left sidebar, expand "Cookies" and click on "https://twitter.com"
e. Look for the following cookie names and copy their values:
   - personalization_id
   - kdt
   - twid
   - ct0
   - auth_token
f. Use the template file `twitter_cookies.json.example` in the root directory as a guide
g. Replace only the placeholders in the "Value" field with the actual values
h. Save the file as `twitter_cookies.json` (remove `.example` from the filename)
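The finished file should mirror the structure of `twitter_cookies.json.example`. As a rough sketch only (the exact field names and layout may differ; always follow the template shipped with your node), a browser-style cookie export looks like:

```json
[
  {
    "Name": "personalization_id",
    "Value": "<your personalization_id value>",
    "Domain": ".twitter.com",
    "Path": "/"
  },
  {
    "Name": "auth_token",
    "Value": "<your auth_token value>",
    "Domain": ".twitter.com",
    "Path": "/"
  }
]
```

Repeat one entry per cookie listed in step (e), replacing only the `Value` placeholders.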
### Place the cookies file

Move the file to the location where your Masa keys are stored:

```bash
mv twitter_cookies.json ~/.masa/twitter_cookies.json
```
### Set the environment variable

Enable X/Twitter scraping in your `.env` file:

```
TWITTER_SCRAPER=true
```
### Restart your node

Restart the Masa Node to apply the changes.
### Verify the configuration

Check the node logs for the confirmation line:

```
Is TwitterScraper: true
```
### Test the X/Twitter scraper

Curl the node in local mode to confirm it returns X/Twitter data:

```bash
curl -X 'POST' \
  'http://localhost:8080/api/v1/data/twitter/tweets/recent' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "$Masa AI",
    "count": 1
  }'
```

You should receive a response similar to this:
```json
{
  "data": [
    {
      "Error": null,
      "Tweet": {
        "ConversationID": "1828797710385942907",
        "GIFs": null,
        "HTML": "<a href=\"https://twitter.com/CryptoGodJohn\">@CryptoGodJohn</a> $MASA the leading token for <a href=\"https://twitter.com/hashtag/AI\">#AI</a> and <a href=\"https://twitter.com/hashtag/Data\">#Data</a> <br><a href=\"https://twitter.com/getmasafi\">@getmasafi</a>",
        "Hashtags": [
          "AI",
          "Data"
        ],
        "ID": "1828900558452797478",
        // ... (other Tweet fields)
      }
    }
  ],
  "workerPeerId": "16Uiu2HAmSCQMh22Xmo1GMxXB73qRx3YaVqqL1UwTYn3iNvQLjPB5"
}
```

Verify that the `workerPeerId` in the response matches your node's peer ID.
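If you prefer to check the peer ID programmatically rather than by eye, you can parse the response with a short script. This is a sketch: the response shape is taken from the example above, and `MY_PEER_ID` is a placeholder for your node's actual peer ID.

```python
import json

# Example response body from the recent-tweets call above (truncated).
response_body = '''
{
  "data": [{"Error": null, "Tweet": {"ID": "1828900558452797478"}}],
  "workerPeerId": "16Uiu2HAmSCQMh22Xmo1GMxXB73qRx3YaVqqL1UwTYn3iNvQLjPB5"
}
'''

# Placeholder: substitute your own node's peer ID here.
MY_PEER_ID = "16Uiu2HAmSCQMh22Xmo1GMxXB73qRx3YaVqqL1UwTYn3iNvQLjPB5"

response = json.loads(response_body)
if response.get("workerPeerId") == MY_PEER_ID:
    print("Response served by this node")
else:
    print("Response served by a different worker:", response.get("workerPeerId"))
```

In real use you would feed the script the live output of the `curl` command instead of the hard-coded sample.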
## Security Considerations

- Ensure your `twitter_cookies.json` file has appropriate permissions (e.g., `chmod 600`).
- Keep your X/Twitter credentials secure and do not share them.
- Never commit your `twitter_cookies.json` file to version control.
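To reduce the risk of committing the cookies file by accident, you can add it to your repository's ignore rules (assuming you manage the node directory with git):

```gitignore
# Never commit X/Twitter session cookies
twitter_cookies.json
```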
## Warning: Cloud-Based Scraping

If you are running an X/Twitter scraper in the cloud, you must use a residential proxy. Without one, your scraper is likely to be blocked by X/Twitter, resulting in invalid-credentials errors. Ensure you have a reliable residential proxy service set up before deploying your scraper in a cloud environment.
## Troubleshooting

If you encounter issues:

- Verify the format of your `twitter_cookies.json` file.
- Ensure your X/Twitter credentials are valid and not expired.
- Check the node logs for any error messages related to X/Twitter scraping.
- If running in the cloud, confirm your residential proxy is correctly configured and functioning.
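For the first point, a quick way to verify that `twitter_cookies.json` is well-formed JSON and contains the expected cookies is a short script. This is a sketch under one assumption: the file is a JSON array of cookie objects with a `Name` field, as in the browser-export style of the shipped template.

```python
import json

# Cookie names the scraper setup steps ask you to copy from the browser.
REQUIRED = {"personalization_id", "kdt", "twid", "ct0", "auth_token"}

def check_cookies_file(path: str) -> set:
    """Return the set of required cookie names missing from the file.

    Assumes the file is a JSON array of objects with a "Name" key,
    matching the twitter_cookies.json.example template. Raises
    json.JSONDecodeError if the file is not valid JSON.
    """
    with open(path) as f:
        cookies = json.load(f)
    names = {cookie.get("Name") for cookie in cookies}
    return REQUIRED - names

# Example (expand ~ to your home directory in real use):
# missing = check_cookies_file("/home/you/.masa/twitter_cookies.json")
# if missing:
#     print("Missing cookies:", missing)
```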
For more detailed setup options and advanced configurations, refer to: