# GPT Research

GPT Research is an autonomous AI research agent that conducts comprehensive research on any topic, searches the web for real-time information, and generates detailed reports with proper citations.
Built with TypeScript and optimized for both local development and serverless deployment (Vercel, AWS Lambda, etc.).
## Features
- Multi-source Research: Integrates multiple search providers:
  - Tavily - AI-optimized search engine
  - Serper - Google Search API (2,500 free searches/month)
  - Google Custom Search - Direct Google integration
  - DuckDuckGo - Privacy-focused search
- Smart Web Scraping: Cheerio and Puppeteer for content extraction
- Multiple LLM Support: OpenAI, Anthropic, Google AI, Groq, and more
- MCP Integration: Model Context Protocol for external tool connections
- Various Report Types: Research, Detailed, Summary, Resource, Outline
- Streaming Support: Real-time updates via Server-Sent Events
- Vercel Optimized: Built for serverless deployment
- Memory Management: Tracks research context and history
- Cost Tracking: Monitor LLM usage and costs
## Quick Start

### Installation
```bash
npm install gpt-research
# or
yarn add gpt-research
# or
pnpm add gpt-research
```
### Configuration

Create a `.env` file in the root directory:
```bash
# Required
OPENAI_API_KEY=your-openai-api-key

# Optional Search Providers (at least one recommended)
TAVILY_API_KEY=your-tavily-api-key   # https://tavily.com (best for AI research)
SERPER_API_KEY=your-serper-api-key   # https://serper.dev (Google search, 2,500 free/month)
GOOGLE_API_KEY=your-google-api-key   # Google Custom Search
GOOGLE_CX=your-google-custom-search-engine-id

# Optional LLM Providers
ANTHROPIC_API_KEY=your-anthropic-api-key
GOOGLE_AI_API_KEY=your-google-ai-api-key
GROQ_API_KEY=your-groq-api-key
```
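Note that Node does not load `.env` automatically when you run a script directly. A minimal sketch using the `dotenv` package (whether gpt-research reads `.env` on its own is not documented here):

```javascript
// Load .env into process.env before constructing GPTResearch.
// Assumes dotenv is installed: npm install dotenv
require('dotenv').config();
```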
### Basic Usage
```javascript
const { GPTResearch } = require('gpt-research');
// or for TypeScript/ES modules:
// import { GPTResearch } from 'gpt-research';

async function main() {
  const researcher = new GPTResearch({
    query: 'What are the latest developments in quantum computing?',
    reportType: 'research_report',
    llmProvider: 'openai',
    apiKeys: {
      openai: process.env.OPENAI_API_KEY,
      tavily: process.env.TAVILY_API_KEY
    }
  });

  // Conduct research
  const result = await researcher.conductResearch();

  console.log(result.report);
  console.log(`Sources used: ${result.sources.length}`);
  console.log(`Cost: $${result.costs.total.toFixed(4)}`);
}

main().catch(console.error);
```
### Streaming Research
```javascript
const researcher = new GPTResearch(config);

// Stream research updates in real-time
for await (const update of researcher.streamResearch()) {
  switch (update.type) {
    case 'progress':
      console.log(`[${update.progress}%] ${update.message}`);
      break;
    case 'data':
      if (update.data?.reportChunk) {
        process.stdout.write(update.data.reportChunk);
      }
      break;
    case 'complete':
      console.log('\nResearch complete!');
      break;
  }
}
```
## Configuration Options
```typescript
interface ResearchConfig {
  // Required
  query: string;                 // Research query

  // Report Configuration
  reportType?: ReportType;       // Type of report to generate
  reportFormat?: ReportFormat;   // Output format (markdown, pdf, docx)
  tone?: Tone;                   // Writing tone

  // LLM Configuration
  llmProvider?: string;          // LLM provider (openai, anthropic, etc.)
  smartLLMModel?: string;        // Model for complex tasks
  fastLLMModel?: string;         // Model for simple tasks
  temperature?: number;          // Generation temperature
  maxTokens?: number;            // Max tokens per generation

  // Search Configuration
  defaultRetriever?: string;     // Default search provider
  maxSearchResults?: number;     // Max results per search

  // Scraping Configuration
  defaultScraper?: string;       // Default scraper (cheerio, puppeteer)
  scrapingConcurrency?: number;  // Concurrent scraping operations

  // API Keys
  apiKeys?: {
    openai?: string;
    tavily?: string;
    serper?: string;
    google?: string;
    anthropic?: string;
    groq?: string;
  };
}
```
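As a concrete illustration, a fuller configuration might look like the sketch below. The option names come from the interface above; the specific string values (retriever and scraper names, numeric limits) are illustrative assumptions rather than documented values:

```javascript
const { GPTResearch } = require('gpt-research');

// Illustrative configuration sketch. Option names follow ResearchConfig;
// the string and numeric values are assumptions to adapt to your setup.
const researcher = new GPTResearch({
  query: 'How do solid-state batteries compare to lithium-ion?',
  reportType: 'research_report',  // value used in Basic Usage above
  llmProvider: 'openai',
  temperature: 0.4,
  maxTokens: 4000,
  defaultRetriever: 'tavily',
  maxSearchResults: 8,
  defaultScraper: 'cheerio',
  scrapingConcurrency: 3,
  apiKeys: {
    openai: process.env.OPENAI_API_KEY,
    tavily: process.env.TAVILY_API_KEY
  }
});
```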
## Report Types
- ResearchReport: Comprehensive research with citations
- DetailedReport: In-depth analysis with extensive coverage
- QuickSummary: Concise overview of key points
- ResourceReport: Curated list of resources and references
- OutlineReport: Structured outline for further research
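The exact `reportType` strings for these names are not listed here; Basic Usage above uses `'research_report'`, and snake_case equivalents for the others are a reasonable guess to verify against the package's `ReportType` definition:

```javascript
// 'research_report' appears in Basic Usage; the identifier below is a
// guessed snake_case equivalent. Check ReportType to confirm it.
const researcher = new GPTResearch({
  query: 'State of WebAssembly',
  reportType: 'outline_report' // assumed identifier for OutlineReport
});
```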
## Search Providers

### Available Providers
| Provider | Best For | Free Tier | API Key Required |
|---|---|---|---|
| Tavily | AI-optimized research | 1,000/month | Yes - [Get Key](https://tavily.com) |
| Serper | Google search results | 2,500/month | Yes - [Get Key](https://serper.dev) |
| Google Custom Search | Custom search | 100/day | Yes - Setup |
| DuckDuckGo | Privacy-focused | Unlimited | No |
### Choosing the Right Provider
- Tavily: Best for AI research, academic papers, technical topics
- Serper: Best for current events, general web search, Google quality
- Google Custom Search: Best for specific domains, controlled results
- DuckDuckGo: Best for privacy-sensitive research, no API needed
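For example, to research without any search API key you could point `defaultRetriever` at DuckDuckGo, which the table above lists as requiring no key. A minimal sketch; the exact `'duckduckgo'` identifier is an assumption:

```javascript
// Keyless search sketch: DuckDuckGo needs no API key per the table above.
// The retriever identifier 'duckduckgo' is an assumed value.
const researcher = new GPTResearch({
  query: 'Privacy-preserving machine learning techniques',
  defaultRetriever: 'duckduckgo',
  apiKeys: { openai: process.env.OPENAI_API_KEY } // LLM key still required
});
```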
### Using Multiple Providers
```javascript
// Configure multiple providers for redundancy
const researcher = new GPTResearch({
  query: 'Your research topic',
  retrievers: ['tavily', 'serper'], // Falls back if one fails
  apiKeys: {
    tavily: process.env.TAVILY_API_KEY,
    serper: process.env.SERPER_API_KEY
  }
});
```
## MCP (Model Context Protocol) Support
GPT Research now supports MCP for connecting to external tools and services!
### What is MCP?
MCP (Model Context Protocol) is a standardized protocol for connecting AI systems to external tools and data sources. It enables seamless integration with various services through a unified interface.
### MCP Features
- Stdio MCP Servers - Local process spawning for NPX/binary tools (Node.js/Docker/VPS)
- HTTP MCP Servers - RESTful API connections (works everywhere including Vercel)
- WebSocket MCP - Real-time bidirectional communication (works everywhere)
- Tool Discovery - Automatic discovery of available tools from all server types
- Smart Selection - AI-powered tool selection based on research query
- Streaming Updates - Real-time progress tracking via SSE
- Mixed Mode - Combine stdio, HTTP, and WebSocket in the same application
### MCP Usage Examples

#### HTTP/WebSocket MCP (works everywhere, including Vercel)
```javascript
const researcher = new GPTResearch({
  query: "Latest AI developments",
  mcpConfigs: [
    {
      name: "research-tools",
      connectionType: "http",
      connectionUrl: "https://mcp.example.com",
      connectionToken: process.env.MCP_TOKEN
    }
  ],
  useMCP: true
});
```
#### Stdio MCP (local tools, Node.js environments)
```javascript
const researcher = new GPTResearch({
  query: "Analyze this codebase",
  mcpConfigs: [
    {
      name: "filesystem",
      connectionType: "stdio",
      command: "npx",
      args: ["@modelcontextprotocol/filesystem-server"],
      env: { READ_ONLY: "false" }
    },
    {
      name: "git",
      connectionType: "stdio",
      command: "git-mcp",
      args: ["--repo", "."]
    }
  ]
});
```
#### Mixed Mode (combine all connection types)
```javascript
const researcher = new GPTResearch({
  query: "Research topic",
  mcpConfigs: [
    // Local tools via stdio
    { name: "local-fs", connectionType: "stdio", command: "npx", args: ["fs-mcp"] },
    // Remote API via HTTP
    { name: "api", connectionType: "http", connectionUrl: "https://api.example.com/mcp" },
    // Real-time via WebSocket
    { name: "stream", connectionType: "websocket", connectionUrl: "wss://realtime.example.com" }
  ]
});
```
### MCP Deployment Compatibility
| MCP Type | Local/Node.js | Vercel | Docker | VPS/Cloud |
|---|---|---|---|---|
| HTTP Servers | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| WebSocket | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| Stdio | ✅ Full | ❌ Not Supported | ✅ Full | ✅ Full |
Stdio MCP Notes:
- Works perfectly in Node.js, Docker, VPS, and self-hosted environments
- Not supported on Vercel, AWS Lambda, or other serverless platforms
- For serverless deployments, use HTTP/WebSocket MCP or deploy a proxy server
### Popular Stdio MCP Servers
These MCP servers can be run locally via stdio:
```bash
# File System Access
npx @modelcontextprotocol/filesystem-server

# Git Repository Tools
npx @modelcontextprotocol/git-server

# Database Query Execution
npm install -g mcp-database
mcp-database

# Custom Python MCP Server
python -m mcp.server

# Shell Command Execution
cargo install mcp-shell
mcp-shell
```
### Learn More

- See `examples/demo-mcp.js` for the HTTP/WebSocket demo
- See `examples/demo-mcp-stdio.js` for the stdio demo
- Read `MCP.md` for implementation details
- Check the MCP Specification for protocol docs
## Vercel Deployment

### API Routes
Create API routes in your Next.js/Vercel project:
```javascript
// api/research/route.js
import { GPTResearch } from 'gpt-research';

export async function POST(request) {
  const { query, reportType } = await request.json();

  const researcher = new GPTResearch({
    query,
    reportType,
    apiKeys: {
      openai: process.env.OPENAI_API_KEY,
      tavily: process.env.TAVILY_API_KEY
    }
  });

  const result = await researcher.conductResearch();
  return Response.json(result);
}
```
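A client can then call this route with plain `fetch`; the `/api/research` path below is assumed from the route's file location above:

```javascript
// Client-side sketch: POST to the route defined above.
// The '/api/research' path is assumed from the route's file location.
const res = await fetch('/api/research', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'What are the latest developments in quantum computing?',
    reportType: 'research_report'
  })
});

const { report, sources } = await res.json();
console.log(report, `Sources: ${sources.length}`);
```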
### Streaming API
```javascript
// api/research/stream/route.js
import { GPTResearch } from 'gpt-research';

export async function POST(request) {
  const { query } = await request.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      const researcher = new GPTResearch({ query });

      for await (const update of researcher.streamResearch()) {
        // Encode to bytes: streamed Response bodies expect Uint8Array chunks
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(update)}\n\n`));
      }
      controller.close();
    }
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    }
  });
}
```
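Because this route expects a POST body, the browser's `EventSource` (which only issues GET requests) doesn't apply; one option is to read the response body stream directly. A minimal sketch, assuming the route is mounted at `/api/research/stream` and that each chunk arrives as whole `data: ...` events (real code should buffer partial chunks):

```javascript
// Consume the SSE-style stream produced by the route above.
// Assumes the route is mounted at '/api/research/stream'.
const res = await fetch('/api/research/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'Latest AI developments' })
});

const reader = res.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Simplified parsing: assumes each read yields complete "data: ..." lines.
  for (const line of decoder.decode(value, { stream: true }).split('\n')) {
    if (line.startsWith('data: ')) {
      const update = JSON.parse(line.slice('data: '.length));
      console.log(update.type, update.message ?? '');
    }
  }
}
```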
### Environment Variables
Add to your Vercel project settings:
```bash
OPENAI_API_KEY=your-key
TAVILY_API_KEY=your-key
SERPER_API_KEY=your-key
```
## Examples
```bash
# Basic example
npm run example

# OpenAI-only example (no web search)
npm run example:simple

# Full research with Tavily web search
npm run example:tavily

# Research using Serper (Google Search API)
npm run example:serper
```
Check the `examples/` directory for more detailed usage examples.
## Use Cases
- Market Research: Analyze competitors, trends, and market opportunities
- Academic Research: Gather and synthesize information for papers and studies
- Content Creation: Research topics thoroughly for articles and blog posts
- Technical Documentation: Research technical topics and generate comprehensive guides
- Due Diligence: Conduct thorough research on companies, people, or topics
- News Aggregation: Gather and summarize news from multiple sources
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
MIT License - see LICENSE file for details.
## Performance Considerations
- Token Limits: Automatically manages context within token limits
- Concurrent Operations: Configurable concurrency for searches and scraping
- Cost Optimization: Uses appropriate models for different tasks
- Caching: Caches scraped content to avoid redundant operations
- Memory Management: Efficient in-memory storage with export/import capabilities
## Security
- API Key Management: Never commit API keys to version control
- Input Validation: All URLs and inputs are validated
- Rate Limiting: Built-in rate limiting for API calls
- Error Handling: Comprehensive error handling and recovery
## Roadmap
- Add multi-language support
- Add more LLM providers (Cohere, Together AI)
- Implement research templates
- Add PDF and DOCX report export
## Tips
- Use Tavily for best results - It's specifically designed for AI research
- Configure multiple search providers - Automatic fallback ensures reliability
- Adjust concurrency based on your limits - Prevent rate limiting
- Use streaming for long research - Better user experience
- Monitor costs - Track LLM usage to manage expenses
## Troubleshooting

### Common Issues
**Build Errors**: Make sure you have Node.js 18+ and run `npm install`.

**API Key Errors**: Verify your API keys are correct in `.env`.

**Rate Limiting**: Reduce `scrapingConcurrency` and `maxSearchResults`.

**Memory Issues**: For large research jobs, increase Node.js memory:

```bash
node --max-old-space-size=4096 your-script.js
```
## Support
- Issues: GitHub Issues
## Show Your Support
If you find GPT Research helpful, please consider:
- Giving us a star on GitHub
- Sharing with your network
- Contributing to the project
Built with ❤️ by Pablo Schaffner

*Autonomous research for everyone*