
What It Does

Instantly searches the web for real-time answers using the Perplexity AI API (Sonar Pro). Just ask anything — “What’s the weather in Tokyo?” or “Latest news on AI?” — and it returns a concise, spoken-friendly answer without symbols or jargon.

Suggested Trigger Words

  • “search”
  • “tell me about”
  • “what is”
  • “who is”
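As a minimal sketch, trigger matching could look like the following; the `TRIGGER_WORDS` tuple and `matches_trigger` helper are illustrative names, not part of the ability's actual code:

```python
TRIGGER_WORDS = ("search", "tell me about", "what is", "who is")

def matches_trigger(utterance: str) -> bool:
    """Return True if the utterance starts with any trigger phrase."""
    text = utterance.lower().strip()
    return any(text.startswith(trigger) for trigger in TRIGGER_WORDS)
```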

Key Features

Real-Time Search

Access current information from the web, not just LLM training data

Clean Answers

Automatically removes citations and symbols for natural speech

Fast Response

Optimized for quick answers with max_tokens=150

Sonar Pro Model

Uses Perplexity’s most advanced search model

How It Works

  1. User Query: The user asks a question using a trigger word.
  2. Immediate Acknowledgment: “Let me check that for you real quick”
  3. API Request: Sends the query to Perplexity Sonar Pro with search capabilities enabled.
  4. Response Cleaning: Strips citation markers [1], [2], etc. and extra whitespace.
  5. Spoken Answer: Delivers a clean, conversational answer via TTS.
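The five steps can be sketched as one pure function; `send_request` and `speak` are injected stand-ins for the real API call and TTS layer, so this illustrates the flow rather than the ability's actual code:

```python
import re

def answer_query(query, send_request, speak):
    # Step 2: acknowledge immediately so the user isn't left waiting.
    speak("Let me check that for you real quick")
    # Step 3: hand the query to the search backend.
    raw = send_request(query)
    # Step 4: strip citation markers like [1] and tidy whitespace.
    cleaned = re.sub(r"\[\d+\]", "", raw)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Step 5: deliver the cleaned answer via TTS.
    speak(f"Here's what I found: {cleaned}")
    return cleaned
```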

API Requirements

Perplexity AI API key required. Get one at perplexity.ai/settings/api

Setup Instructions

  1. Visit perplexity.ai/settings/api
  2. Create an API key
  3. Replace YOUR_API_KEY in main.py with your actual key:
api_key = "pplx-abc123..."  # Your real key here
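Hard-coding keys is fragile; as an optional alternative (the `PPLX_API_KEY` variable name and `load_api_key` helper are suggestions, not something the ability expects), you could read the key from the environment:

```python
import os

def load_api_key(env_var: str = "PPLX_API_KEY", fallback: str = "YOUR_API_KEY") -> str:
    """Prefer an environment variable over a hard-coded key."""
    return os.environ.get(env_var, fallback)

api_key = load_api_key()
```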

Code Walkthrough

API Configuration

Optimized for quick, spoken-friendly responses:
payload = {
    "model": "sonar-pro",  # sonar or sonar-pro both work
    "temperature": 0.2,
    "disable_search": False,  # keep live web search enabled
    "top_p": 0.9,
    "max_tokens": 150,
    "messages": [
        {
            "role": "system",
            "content": (
                """
                Give a short, clear answer in simple spoken language.
                Do not use symbols, citations, or abbreviations.
                """
            )
        },
        {
            "role": "user",
            "content": msg
        }
    ]
}
Model Choice: sonar-pro provides more accurate results. Use sonar for faster/cheaper queries.

API Request with Timing

Logs response time for debugging:
start_time = time.time()

# Send the request to Perplexity
response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    },
    json=payload
)

end_time = time.time()
response_time = round(end_time - start_time, 3)

self.worker.editor_logging_handler.info(f"⏱️ API Response Time: {response_time} seconds")
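If you end up logging timings in several places, the start/end pattern can be factored into a small context manager; `timed` and its `log` parameter are illustrative additions, not part of the original code:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label: str, log=print):
    """Log how long the wrapped block took, even if it raises."""
    start = time.time()
    try:
        yield
    finally:
        log(f"{label}: {round(time.time() - start, 3)} seconds")
```

With this helper, the request above could be wrapped as `with timed("⏱️ API Response Time", self.worker.editor_logging_handler.info): response = requests.post(...)`.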

Response Extraction and Cleaning

Removes citations and cleans up formatting:
try:
    result = response.json()
    self.worker.editor_logging_handler.info(f"✅ Parsed JSON Response:\n{json.dumps(result, indent=2)}")
except Exception as e:
    self.worker.editor_logging_handler.info(f"❌ Failed to parse JSON: {e}")
    result = {}

# Extract the assistant's message (final summary)
search_result = result.get("choices", [{}])[0].get("message", {}).get("content", "Sorry, I couldn't find anything.")

# Remove citation markers like [1], [2], etc.
search_result = re.sub(r"\[\d+\]", "", search_result)
search_result = re.sub(r"\s+", " ", search_result).strip()  # collapse leftover whitespace
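A quick demonstration of the cleaning step on a sample string (the `raw` value is made up for illustration):

```python
import re

raw = "Canberra is the capital of Australia.[1][2]  It was chosen in 1908."
cleaned = re.sub(r"\[\d+\]", "", raw)           # drop citation markers
cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse leftover whitespace
print(cleaned)  # Canberra is the capital of Australia. It was chosen in 1908.
```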

Delivery

# Speak the final summarized result
await self.capability_worker.speak("Here's what I found:")
await self.capability_worker.speak(search_result)

# Resume the normal workflow
self.capability_worker.resume_normal_flow()

Example Conversations

User: What's the capital of Australia?

AI: Let me check that for you real quick...

AI: Here's what I found: The capital of Australia is Canberra, not Sydney as many people assume.

Advanced Configuration

Adjusting Response Length

"max_tokens": 150,  # Shorter, punchier answers
"max_tokens": 300,  # More detailed explanations

Temperature Control

"temperature": 0.2,  # More factual, deterministic
"temperature": 0.7,  # More creative, varied responses

Model Selection

  • Best for: Accurate, detailed answers
  • Cost: Higher per query
  • Use when: The user needs reliable, in-depth information
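One way to act on this trade-off is to route queries between the two models; `pick_model` and its `detail_cues` list are hypothetical and should be tuned to your own traffic:

```python
def pick_model(query: str) -> str:
    """Use sonar-pro only when the query asks for depth; default to the cheaper sonar."""
    detail_cues = ("explain", "compare", "in depth", "detailed", "why")
    q = query.lower()
    return "sonar-pro" if any(cue in q for cue in detail_cues) else "sonar"
```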

Extending This Ability

Follow-up Questions

Add conversation loop to allow multi-turn research sessions

Source Citations

Parse and speak source URLs for fact-checking

Category Detection

Route different query types to different models or parameters

Search History

Store and recall previous searches across sessions
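A minimal in-memory version of the search-history idea (`SearchHistory` is a hypothetical helper; persisting across sessions would need a file or database instead):

```python
from collections import deque

class SearchHistory:
    """Keep the most recent query/answer pairs and recall them by keyword."""

    def __init__(self, max_items: int = 20):
        self._items = deque(maxlen=max_items)

    def add(self, query: str, answer: str) -> None:
        self._items.append((query, answer))

    def recall(self, keyword: str):
        kw = keyword.lower()
        return [(q, a) for q, a in self._items if kw in q.lower()]
```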

Why Use Perplexity Instead of LLM?

Perplexity searches the web for current information, while base LLMs only know information up to their training cutoff date. Example: “What’s the current price of Bitcoin?” needs live data.
Perplexity cites and verifies sources, reducing hallucinations for factual queries. Example: “When was the last SpaceX launch?” requires verified data.
It is perfect for queries about recent news, sports scores, stock prices, or weather. Example: “Who won the game last night?” needs current information.
Perplexity is ideal for factual, time-sensitive queries. For creative tasks or advice, the base LLM is often better.

Troubleshooting

  • Invalid API key: Your API key is invalid or expired. Check that you’ve correctly replaced YOUR_API_KEY in the code.
  • Quota exceeded: You’ve exceeded your API quota. Check your usage at perplexity.ai/settings/api.
  • Answers cut off: Try increasing max_tokens from 150 to 250 for longer responses.
  • Citations read aloud: The cleaning regex should remove [1], [2] markers. Check that this line is present:
search_result = re.sub(r"\[\d+\]", "", search_result)
