X/Twitter Search Indexed Data

Indexed Search

What is Indexed Search?

The indexed search endpoints provide vector-based search over previously collected X/Twitter data. Unlike live search, indexed search queries our secure vector database directly, so results come back immediately with no polling required.

Key Features

• Instant results from vector database
• Semantic similarity matching
• Hybrid search combining keywords and vectors
• Secure TEE-protected data storage

Search Types

• Similarity Search: Semantically related content
• Hybrid Search: Combine semantic and keyword matching

Use Cases

Perfect for applications needing:

• Fast historical data access
• Semantic content discovery
• Pattern and trend analysis

Indexed Search (Similarity & Hybrid)

The indexed search endpoints allow you to query stored X/Twitter data through a vector-based search engine. These endpoints suit applications that need high-speed access to previously fetched tweets, and they provide:

• Semantic similarity detection
• Hybrid relevance ranking
• Near-instant responses (no polling or job queues)

All searches operate over tweets already collected and stored in the Masa vector store (Milvus).

Similarity Search (X/Twitter)

Use this endpoint to perform semantic searches on previously indexed X/Twitter data. Results are ranked based on vector similarity.

Endpoint

POST /v1/search/similarity/twitter

Authentication

Type: Bearer Token
Header: Authorization: Bearer <API_KEY>

Request Parameters

The request parameters allow you to customize your semantic search query. The query field is used to find semantically similar content, while optional keywords can filter results. Use keyword_operator to control how multiple keywords are combined, and max_results to limit the number of returned items.

Field	Type	Required	Description
query	string	✅	The semantic search text
keywords	string[]	❌	Keywords to filter results
keyword_operator	"and" | "or"	❌	How multiple keywords are combined. Default: "and"
max_results	number (1–100)	❌	Maximum number of results to return. Defaults to the value set by environment variable, or 10
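As a minimal sketch, the constraints in the table above can be checked client-side before a request is sent. The function name `build_similarity_payload` is hypothetical and not part of the API; the field names and limits come from the table.

```python
from typing import Optional


def build_similarity_payload(
    query: str,
    keywords: Optional[list[str]] = None,
    keyword_operator: str = "and",
    max_results: Optional[int] = None,
) -> dict:
    """Build a JSON-serializable body for the similarity search endpoint.

    Hypothetical helper: it only mirrors the documented field constraints.
    """
    if not query or not query.strip():
        raise ValueError("query is required and must be non-empty")
    if keyword_operator not in ("and", "or"):
        raise ValueError('keyword_operator must be "and" or "or"')
    if max_results is not None and not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")

    payload: dict = {"query": query}
    if keywords:
        # The operator only matters when keywords are supplied.
        payload["keywords"] = list(keywords)
        payload["keyword_operator"] = keyword_operator
    if max_results is not None:
        # Leaving max_results out lets the server apply its own default.
        payload["max_results"] = max_results
    return payload
```

Calling `build_similarity_payload("open source LLM models", ["AI", "NLP"], "or", 10)` produces a dictionary with the four documented fields, ready to be serialized as the request body.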

Example Request

The following example demonstrates how to perform a semantic search for content related to “open source LLM models”, filtered to include tweets containing either “AI” or “NLP” keywords. The request limits results to 10 tweets and uses the “or” operator to match tweets containing any of the specified keywords.

curl https://data.masa.ai/api/v1/search/similarity/twitter \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -X POST \
  -d '{
    "query": "open source LLM models",
    "keywords": ["AI", "NLP"],
    "keyword_operator": "or",
    "max_results": 10
  }'
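The same call can be sketched in Python using only the standard library. The base URL and headers are taken from the curl example above; the helper names `make_search_request` and `send` are hypothetical, and the sketch assumes the response body is JSON.

```python
import json
import urllib.request

# Base URL as it appears in the curl example.
BASE_URL = "https://data.masa.ai/api"


def make_search_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request with a JSON body."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a real network call, so it is left commented out):
# request = make_search_request(
#     "/v1/search/similarity/twitter",
#     {"query": "open source LLM models", "keywords": ["AI", "NLP"],
#      "keyword_operator": "or", "max_results": 10},
#     api_key="<API_KEY>",
# )
# results = send(request)
```

Separating request construction from sending keeps the payload and headers easy to inspect or log before any network traffic happens.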

Example Response

{
  "ID": "1234567890",
  "Content": "Open source LLMs like Falcon and LLaMA are changing the game.",
  "Metadata": null,
  "Score": 0.87
}

The Metadata field is currently null. It will be populated in future versions.
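A short sketch of how a client might consume results, assuming the response body is either a single result object or a JSON array of objects with the `ID`, `Content`, `Metadata`, and `Score` fields shown in the example. The `SearchResult` class and both function names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SearchResult:
    id: str
    content: str
    metadata: Optional[Any]  # null in the documented example
    score: float


def parse_results(body: str) -> list[SearchResult]:
    """Parse a response body into SearchResult objects."""
    data = json.loads(body)
    # Assumption: tolerate both a bare object and a list of objects.
    items = data if isinstance(data, list) else [data]
    return [
        SearchResult(
            id=item["ID"],
            content=item["Content"],
            metadata=item.get("Metadata"),
            score=float(item["Score"]),
        )
        for item in items
    ]


def filter_by_score(results: list[SearchResult], min_score: float) -> list[SearchResult]:
    """Keep results at or above min_score, highest score first."""
    kept = [r for r in results if r.score >= min_score]
    return sorted(kept, key=lambda r: r.score, reverse=True)
```

Reading `Metadata` with `.get` keeps the parser working whether the field is null, absent, or populated later.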

Hybrid Search (X/Twitter)

Hybrid search combines vector similarity and keyword-based full-text search for more powerful and flexible querying. By assigning weights to both semantic similarity and keyword matching, you can precisely tune how results are ranked. This allows you to balance finding content that is conceptually related (via vector similarity) with content containing specific keywords (via full-text search) to get the most relevant results for your use case.
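To build intuition for how the two weights shift ranking, the sketch below assumes a plain weighted sum of a vector score and a text score. This formula is an assumption made purely for illustration: the service's actual scoring method is not specified here, and the per-signal scores for the two tweets are made up.

```python
def combine_scores(vector_score: float, text_score: float,
                   vector_weight: float, text_weight: float) -> float:
    """Illustrative weighted sum of two relevance scores (assumed formula)."""
    for name, weight in (("vector_weight", vector_weight),
                         ("text_weight", text_weight)):
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return vector_weight * vector_score + text_weight * text_score


# Two hypothetical tweets with made-up per-signal scores.
CANDIDATES = {
    "conceptual_match": {"vector": 0.9, "text": 0.2},  # related idea, few keywords
    "keyword_match": {"vector": 0.5, "text": 0.9},     # exact keywords, looser topic
}


def rank(vector_weight: float, text_weight: float) -> list[str]:
    """Order the hypothetical candidates by combined score, best first."""
    combined = {
        name: combine_scores(s["vector"], s["text"], vector_weight, text_weight)
        for name, s in CANDIDATES.items()
    }
    return sorted(combined, key=combined.get, reverse=True)
```

Under this assumed formula, `rank(0.7, 0.3)` places the conceptual match first, while `rank(0.3, 0.7)` places the keyword match first, which is the kind of trade-off the weights are meant to control.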

Endpoint

POST /v1/search/hybrid/twitter

Authentication

Type: Bearer Token
Header: Authorization: Bearer <API_KEY>

Request Structure

Field	Type	Required	Description
similarity_query.query	string	✅	Text for semantic matching
similarity_query.weight	number (0–1)	✅	Weight to apply to vector score
text_query.query	string	✅	Text for full-text keyword matching
text_query.weight	number (0–1)	✅	Weight to apply to text score
keywords	string[]	❌	Optional filter keywords
keyword_operator	"and" | "or"	❌	How multiple keywords are combined. Default: "and"
max_results	number (1–100)	❌	Maximum number of results to return. Defaults to the value set by environment variable, or 10
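The nested structure in the table above can be assembled with a small helper. This is a sketch: `build_hybrid_payload` is a hypothetical name, and the validation only mirrors the ranges listed in the table.

```python
from typing import Optional


def _check_weight(name: str, value: float) -> None:
    """Reject weights outside the documented 0 to 1 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1")


def build_hybrid_payload(
    similarity_text: str,
    similarity_weight: float,
    text_query: str,
    text_weight: float,
    keywords: Optional[list[str]] = None,
    keyword_operator: str = "and",
    max_results: Optional[int] = None,
) -> dict:
    """Build a JSON-serializable body for the hybrid search endpoint."""
    if not similarity_text or not text_query:
        raise ValueError("both query strings are required")
    _check_weight("similarity_query.weight", similarity_weight)
    _check_weight("text_query.weight", text_weight)
    if keyword_operator not in ("and", "or"):
        raise ValueError('keyword_operator must be "and" or "or"')
    if max_results is not None and not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")

    payload: dict = {
        "similarity_query": {"query": similarity_text, "weight": similarity_weight},
        "text_query": {"query": text_query, "weight": text_weight},
    }
    if keywords:
        payload["keywords"] = list(keywords)
        payload["keyword_operator"] = keyword_operator
    if max_results is not None:
        payload["max_results"] = max_results
    return payload
```

The table does not say whether the two weights must sum to 1, so the helper checks each weight independently rather than enforcing a sum.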

Example Request

curl https://data.masa.ai/api/v1/search/hybrid/twitter \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -X POST \
  -d '{
    "similarity_query": {
      "query": "generative AI breakthroughs",
      "weight": 0.7
    },
    "text_query": {
      "query": "open source ChatGPT competitor",
      "weight": 0.3
    },
    "keywords": ["LLaMA", "Falcon"],
    "keyword_operator": "or",
    "max_results": 20
  }'

Example Response

{
  "ID": "987654321",
  "Content": "LLaMA-3 is being positioned as a serious open source ChatGPT competitor.",
  "Metadata": null,
  "Score": 0.91
}

Choosing Between Search Types

Use Case	Similarity Search	Hybrid Search
Quick semantic matching without keywords	✅	—
Blend semantic meaning with keyword search	—	✅
Fine-tune relevance scoring	—	✅
Approximate text matching	✅	—

Tips for Better Results

• Use short, descriptive phrases in query fields.
• Use max_results: 1 for “I’m feeling lucky” style lookups.
• Try Hybrid Search if you want a balance between context and keyword targeting.
