What is Indexed Search?

The indexed search endpoints provide vector-based search over previously collected X/Twitter data. Unlike live search, indexed search queries our secure vector database directly and returns results in the same response, with no polling required.

Key Features

• Instant results from vector database

• Semantic similarity matching

• Hybrid search combining keywords and vectors

• Secure TEE-protected data storage

Search Types

Similarity Search: Semantically related content

Hybrid Search: Combine semantic and keyword matching

Use Cases

Perfect for applications needing:

  • Fast historical data access
  • Semantic content discovery
  • Pattern and trend analysis

Indexed Search (Similarity & Hybrid)

The indexed search endpoints let you query stored X/Twitter data through a vector-based search engine. They suit applications that need fast access to previously fetched tweets, and they support:

  • Semantic similarity detection
  • Hybrid relevance ranking
  • Near-instant responses (no polling or job queues)

All searches operate over tweets already collected and stored in the Masa vector store (Milvus).

Similarity Search (X/Twitter)

Use this endpoint to perform semantic searches on previously indexed X/Twitter data. Results are ranked based on vector similarity.

Endpoint

POST /v1/search/similarity/twitter

Authentication

  • Type: Bearer Token
  • Header: Authorization: Bearer <API_KEY>

Request Parameters

The request parameters allow you to customize your semantic search query. The query field is used to find semantically similar content, while optional keywords can filter results. Use keyword_operator to control how multiple keywords are combined, and max_results to limit the number of returned items.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | The semantic search text |
| keywords | string[] | No | Keywords to filter results |
| keyword_operator | "and" \| "or" | No | Default: "and" |
| max_results | number (1-100) | No | Defaults to environment variable or 10 |
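The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: `build_similarity_payload` is a hypothetical helper name, and rejecting an empty `query` locally is our assumption rather than documented server behavior.

```python
from typing import Any, Dict, List, Optional


def build_similarity_payload(
    query: str,
    keywords: Optional[List[str]] = None,
    keyword_operator: str = "and",
    max_results: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body for the similarity endpoint (hypothetical helper)."""
    if not query.strip():
        # Assumption: an empty query is never useful, so fail early.
        raise ValueError("query must be a non-empty string")
    if keyword_operator not in ("and", "or"):
        raise ValueError('keyword_operator must be "and" or "or"')

    payload: Dict[str, Any] = {"query": query}
    if keywords:
        payload["keywords"] = list(keywords)
        payload["keyword_operator"] = keyword_operator
    if max_results is not None:
        # Documented bounds for max_results are 1-100.
        if not 1 <= max_results <= 100:
            raise ValueError("max_results must be between 1 and 100")
        payload["max_results"] = max_results
    return payload


payload = build_similarity_payload(
    "open source LLM models", ["AI", "NLP"], "or", 10
)
```

Omitting `max_results` leaves the field out of the body, so the server-side default applies.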

Example Request

The following example demonstrates how to perform a semantic search for content related to “open source LLM models”, filtered to include tweets containing either “AI” or “NLP” keywords. The request limits results to 10 tweets and uses the “or” operator to match tweets containing any of the specified keywords.

curl https://data.dev.masalabs.ai/api/v1/search/similarity/twitter \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -X POST \
  -d '{
    "query": "open source LLM models",
    "keywords": ["AI", "NLP"],
    "keyword_operator": "or",
    "max_results": 10
  }'
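The same request can be expressed in Python using only the standard library. This is a sketch under stated assumptions: `build_search_request` is a hypothetical helper, the base URL is copied from the curl example, and the network call itself is left commented out so the snippet runs offline.

```python
import json
import urllib.request

# Base URL copied from the curl example; adjust for your environment.
BASE_URL = "https://data.dev.masalabs.ai/api"


def build_search_request(
    path: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request(
    "/v1/search/similarity/twitter",
    "<API_KEY>",
    {
        "query": "open source LLM models",
        "keywords": ["AI", "NLP"],
        "keyword_operator": "or",
        "max_results": 10,
    },
)

# To send the request, uncomment the lines below:
# with urllib.request.urlopen(request, timeout=30) as response:
#     results = json.loads(response.read())
```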

Example Response

[
  {
    "ID": "1234567890",
    "Content": "Open source LLMs like Falcon and LLaMA are changing the game.",
    "Metadata": null,
    "Score": 0.87
  }
]

The Metadata field is currently null. Metadata will be included in future versions.
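A client will usually want to turn the response into typed objects and drop weak matches. The sketch below assumes the endpoint returns a JSON array of objects shaped like the example, and that a higher `Score` means a closer match; neither is stated explicitly here, and `SearchResult` and `parse_results` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class SearchResult:
    id: str
    content: str
    metadata: Optional[Any]  # currently null in responses
    score: float


def parse_results(raw: str, min_score: float = 0.0) -> List[SearchResult]:
    """Parse a JSON array of results, keeping those with score >= min_score."""
    results = [
        SearchResult(
            id=item["ID"],
            content=item["Content"],
            metadata=item.get("Metadata"),
            score=float(item["Score"]),
        )
        for item in json.loads(raw)
    ]
    kept = [r for r in results if r.score >= min_score]
    # Assumption: higher scores are better, so sort in descending order.
    return sorted(kept, key=lambda r: r.score, reverse=True)


sample = """[
  {
    "ID": "1234567890",
    "Content": "Open source LLMs like Falcon and LLaMA are changing the game.",
    "Metadata": null,
    "Score": 0.87
  }
]"""

results = parse_results(sample, min_score=0.5)
```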

Hybrid Search (X/Twitter)

Hybrid search combines vector similarity and keyword-based full-text search for more powerful and flexible querying. By assigning weights to both semantic similarity and keyword matching, you can precisely tune how results are ranked. This allows you to balance finding content that is conceptually related (via vector similarity) with content containing specific keywords (via full-text search) to get the most relevant results for your use case.
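To build intuition for how the weights shift rankings, the sketch below combines two per-tweet scores with a weighted sum. This formula is an assumption chosen for illustration; the documentation does not say how the service actually fuses the vector and text scores, and the candidate scores are made-up values.

```python
from typing import Dict, List, Tuple

# Hypothetical (similarity_score, text_score) pairs for two tweets.
candidates: Dict[str, Tuple[float, float]] = {
    "tweet_a": (0.90, 0.20),  # strong semantic match, weak keyword match
    "tweet_b": (0.40, 0.95),  # weak semantic match, strong keyword match
}


def combined_score(
    similarity: float, text: float, w_sim: float, w_text: float
) -> float:
    """Illustrative weighted sum; not necessarily the service's formula."""
    return w_sim * similarity + w_text * text


def rank(w_sim: float, w_text: float) -> List[str]:
    """Order candidate names by combined score, best first."""
    return sorted(
        candidates,
        key=lambda name: combined_score(*candidates[name], w_sim, w_text),
        reverse=True,
    )


semantic_heavy = rank(0.7, 0.3)  # tweet_a scores 0.69, tweet_b scores 0.565
keyword_heavy = rank(0.3, 0.7)   # tweet_a scores 0.41, tweet_b scores 0.785
```

Under this assumed formula, favoring the similarity weight puts the semantically closer tweet first, while favoring the text weight promotes the tweet with the stronger keyword match.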

Endpoint

POST /v1/search/hybrid/twitter

Authentication

  • Type: Bearer Token
  • Header: Authorization: Bearer <API_KEY>

Request Structure

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| similarity_query.query | string |  | Text for semantic matching |
| similarity_query.weight | number (0–1) |  | Weight to apply to vector score |
| text_query.query | string |  | Text for full-text keyword matching |
| text_query.weight | number (0–1) |  | Weight to apply to text score |
| keywords | string[] | No | Optional filter keywords |
| keyword_operator | "and" \| "or" | No | Default: "and" |
| max_results | number (1–100) | No | Defaults to env value or 10 |
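The nested request structure can be assembled and validated with a small helper. This is a sketch only: `build_hybrid_payload` and `_weighted_query` are hypothetical names, and the range checks simply mirror the bounds listed in the table.

```python
from typing import Any, Dict, List, Optional


def _weighted_query(query: str, weight: float) -> Dict[str, Any]:
    """Build one {"query", "weight"} object, enforcing the 0-1 weight range."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return {"query": query, "weight": weight}


def build_hybrid_payload(
    similarity_text: str,
    similarity_weight: float,
    keyword_text: str,
    text_weight: float,
    keywords: Optional[List[str]] = None,
    keyword_operator: str = "and",
    max_results: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a request body for the hybrid endpoint (hypothetical helper)."""
    if keyword_operator not in ("and", "or"):
        raise ValueError('keyword_operator must be "and" or "or"')

    payload: Dict[str, Any] = {
        "similarity_query": _weighted_query(similarity_text, similarity_weight),
        "text_query": _weighted_query(keyword_text, text_weight),
    }
    if keywords:
        payload["keywords"] = list(keywords)
        payload["keyword_operator"] = keyword_operator
    if max_results is not None:
        if not 1 <= max_results <= 100:
            raise ValueError("max_results must be between 1 and 100")
        payload["max_results"] = max_results
    return payload


hybrid_payload = build_hybrid_payload(
    "generative AI breakthroughs", 0.7,
    "open source ChatGPT competitor", 0.3,
    keywords=["LLaMA", "Falcon"],
    keyword_operator="or",
    max_results=20,
)
```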

Example Request

curl https://data.dev.masalabs.ai/api/v1/search/hybrid/twitter \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -X POST \
  -d '{
    "similarity_query": {
      "query": "generative AI breakthroughs",
      "weight": 0.7
    },
    "text_query": {
      "query": "open source ChatGPT competitor",
      "weight": 0.3
    },
    "keywords": ["LLaMA", "Falcon"],
    "keyword_operator": "or",
    "max_results": 20
  }'

Example Response

[
  {
    "ID": "987654321",
    "Content": "LLaMA-3 is being positioned as a serious open source ChatGPT competitor.",
    "Metadata": null,
    "Score": 0.91
  }
]

Choosing Between Search Types

| Use Case | Similarity Search | Hybrid Search |
| --- | --- | --- |
| Quick semantic matching without keywords | Yes |  |
| Blend semantic meaning with keyword search |  | Yes |
| Fine-tune relevance scoring |  | Yes |
| Approximate text matching |  |  |

Tips for Better Results

• Use short, descriptive phrases in query fields.

• Use max_results: 1 for “I’m feeling lucky” style lookups.

• Try Hybrid Search if you want a balance between context and keyword targeting.