The AI Research Lab continuously ingests papers from arXiv, embeds them into a vector database, and runs a coordinated team of specialist agents to extract structured insights — contradictions, emerging benchmarks, research frontiers, cross-paper connections.
The AI Research Lab is an agent-powered observatory for the AI literature. The goal is to surface what matters across hundreds of papers without requiring you to read them all.
arXiv papers are retrieved through the Semantic Scholar API, which supplies richer metadata (citations, open-access status, author affiliations). Quality signals come from Hugging Face, Papers with Code, and OpenReview.
Papers are filtered by recency (default 6 months), abstract length, and a minimum relevance score, then sorted by influence signals so that high-signal work surfaces first.
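The filter-then-rank step described above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Candidate` fields, the `FilterOptions` thresholds, and the use of citation count as the only influence signal are all assumptions.

```typescript
// Hypothetical record shape; field names are assumptions, not the real schema.
interface Candidate {
  id: string;
  publishedAt: Date;
  abstract: string;
  relevance: number;     // assumed to be a score in [0, 1]
  citationCount: number; // stand-in for "influence signals"
}

interface FilterOptions {
  now: Date;
  maxAgeMonths: number;     // the page states a default of 6
  minAbstractChars: number; // assumed threshold
  minRelevance: number;     // assumed threshold
}

function filterAndRank(papers: Candidate[], opts: FilterOptions): Candidate[] {
  // Compute the recency cutoff in UTC so results do not depend on the local time zone.
  const cutoff = new Date(opts.now.getTime());
  cutoff.setUTCMonth(cutoff.getUTCMonth() - opts.maxAgeMonths);

  return papers
    .filter((p) => p.publishedAt.getTime() >= cutoff.getTime())
    .filter((p) => p.abstract.length >= opts.minAbstractChars)
    .filter((p) => p.relevance >= opts.minRelevance)
    // filter() returned a fresh array, so sorting does not mutate the input.
    .sort((a, b) => b.citationCount - a.citationCount);
}
```

Filtering before sorting keeps the sort cheap, and putting the clock in `opts.now` makes the function deterministic under test.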
Each ingestion run follows four sequential steps.
Search
Query the Semantic Scholar Graph API with the topic's search terms. Paginate, filter by category and abstract length, deduplicate against stored IDs.
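As a rough sketch of the Search step, the loop below pages through the Semantic Scholar Graph API paper search endpoint, drops short abstracts, and skips IDs that are already stored. The function names, the `PageFetcher` injection, and the thresholds are assumptions made for illustration; category filtering is omitted, and the real fetcher would also need rate-limit and error handling.

```typescript
// Minimal subset of the search response used here.
interface S2Paper {
  paperId: string;
  title: string;
  abstract: string | null;
}

interface S2SearchPage {
  total: number;
  next?: number; // offset of the next page; absent on the last page
  data: S2Paper[];
}

// The HTTP call is injected so the paging logic can be exercised without a network.
type PageFetcher = (url: string) => Promise<S2SearchPage>;

const S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search";

function buildSearchUrl(query: string, offset: number, limit = 100): string {
  return (
    `${S2_SEARCH}?query=${encodeURIComponent(query)}` +
    `&offset=${offset}&limit=${limit}&fields=title,abstract`
  );
}

async function searchNewPapers(
  query: string,
  storedIds: Set<string>,
  fetchPage: PageFetcher,
  minAbstractChars = 200, // assumed threshold
  maxPages = 10,          // assumed safety cap
): Promise<S2Paper[]> {
  const seen = new Set<string>(storedIds); // copy, so the caller's set is untouched
  const fresh: S2Paper[] = [];
  let offset: number | undefined = 0;

  for (let page = 0; page < maxPages && offset !== undefined; page++) {
    const result: S2SearchPage = await fetchPage(buildSearchUrl(query, offset));
    for (const paper of result.data) {
      const abstract = paper.abstract ?? "";
      if (abstract.length < minAbstractChars || seen.has(paper.paperId)) continue;
      seen.add(paper.paperId); // also deduplicates within this run
      fresh.push(paper);
    }
    offset = result.next; // undefined ends the loop
  }
  return fresh;
}
```

In production the `fetchPage` argument would wrap an HTTP client and parse the JSON body; in tests it can be a stub that returns canned pages.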
Embed
Each paper's title and abstract are chunked into ~500-token segments with a 50-token overlap, then embedded via Gemini text-embedding-001 into 768-dimensional vectors.
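The overlapping-window chunking described above might look like the sketch below. It approximates tokens by splitting on whitespace, which is an assumption: the real pipeline presumably counts tokens with the embedding model's own tokenizer. The function names are hypothetical.

```typescript
// Split a token list into windows of `size` tokens, each sharing `overlap`
// tokens with the previous window.
function chunkTokens(tokens: string[], size = 500, overlap = 50): string[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const stride = size - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += stride) {
    chunks.push(tokens.slice(start, start + size));
    // Stop once a window reaches the end, to avoid a trailing chunk
    // that only repeats the overlap.
    if (start + size >= tokens.length) break;
  }
  return chunks;
}

// Whitespace tokenization is a stand-in for a real tokenizer.
function chunkPaper(title: string, abstract: string, size = 500, overlap = 50): string[] {
  const tokens = `${title}\n${abstract}`.split(/\s+/).filter((t) => t.length > 0);
  return chunkTokens(tokens, size, overlap).map((chunk) => chunk.join(" "));
}
```

With the defaults, the window advances 450 tokens at a time, so the last 50 tokens of one chunk are repeated at the start of the next. Text shorter than 500 tokens yields a single chunk.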
Store
Metadata into PostgreSQL. Embeddings into a pgvector column indexed with HNSW (cosine) for sub-millisecond ANN lookup.
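To make the Store step concrete, the sketch below computes cosine distance (the quantity pgvector's `<=>` operator returns) and does an exact top-k scan. An HNSW index approximates this ranking without scanning every row. The table and column names in the SQL comments, and the `StoredChunk` shape, are hypothetical.

```typescript
// Roughly equivalent pgvector setup (hypothetical table/column names):
//   CREATE INDEX ON paper_chunks USING hnsw (embedding vector_cosine_ops);
//   SELECT paper_id FROM paper_chunks ORDER BY embedding <=> $1 LIMIT 5;

interface StoredChunk {
  paperId: string;
  embedding: number[];
}

// Cosine distance = 1 - cosine similarity; 0 means same direction, 2 means opposite.
// Returns NaN if either vector has zero length.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact nearest-neighbor search by full scan: the ranking an ANN index approximates.
function nearestChunks(query: number[], rows: StoredChunk[], k: number): StoredChunk[] {
  return rows
    .map((row) => ({ row, distance: cosineDistance(query, row.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k)
    .map((x) => x.row);
}
```

The full scan is linear in the number of stored chunks, which is why an index such as HNSW matters once the corpus grows.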
Link
Papers are joined to their topic; the topic's paper count and last-sync timestamp update atomically.
Five specialist agents process the ingested papers. Each receives the full corpus for the topic and produces a structured artifact powering a dashboard tab.
Paper Analyzer
Extracts the core problem, the approach, the main result, and a plain-language takeaway for each paper. Powers the Papers tab.
Trend Mapper
Tracks how research intensity has shifted across sub-topics over time. Surfaces emerging and declining areas. Drives the Topic Evolution chart.
Contradiction Finder
Surfaces papers making conflicting empirical or methodological claims. Also identifies areas of consensus and open debates across the collection.
Benchmark Extractor
Pulls benchmark names, metrics, and scores. Warns when papers report incomparable numbers (different datasets, splits, or metrics).
Frontier Detector
Identifies paradigm shifts, breakthroughs, and underexplored gaps. Surfaces genuinely new directions over incremental improvements.
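As one illustration of the structured checks these agents perform, the Benchmark Extractor's comparability warning could be sketched as below: group extracted results by benchmark and flag any benchmark whose results use more than one dataset/split/metric setup. The `BenchmarkResult` and `ComparabilityWarning` shapes are assumptions, and the real agent presumably relies on an LLM to extract these fields in the first place.

```typescript
// Hypothetical artifact shapes; not the project's actual schema.
interface BenchmarkResult {
  paperId: string;
  benchmark: string;
  dataset: string;
  split: string;
  metric: string;
  score: number;
}

interface ComparabilityWarning {
  benchmark: string;
  setups: string[];   // distinct dataset/split/metric combinations seen
  paperIds: string[]; // papers reporting on this benchmark
}

function findIncomparable(results: BenchmarkResult[]): ComparabilityWarning[] {
  // Group results by benchmark name.
  const byBenchmark = new Map<string, BenchmarkResult[]>();
  for (const r of results) {
    const group = byBenchmark.get(r.benchmark) ?? [];
    group.push(r);
    byBenchmark.set(r.benchmark, group);
  }

  // A benchmark is flagged when its results do not all share one setup.
  const warnings: ComparabilityWarning[] = [];
  byBenchmark.forEach((group, benchmark) => {
    const setups: string[] = [];
    for (const r of group) {
      const key = `${r.dataset} / ${r.split} / ${r.metric}`;
      if (setups.indexOf(key) < 0) setups.push(key);
    }
    if (setups.length > 1) {
      warnings.push({ benchmark, setups, paperIds: group.map((r) => r.paperId) });
    }
  });
  return warnings;
}
```

Exact string matching is deliberately strict here; a real implementation would likely normalize names (case, aliases) before comparing.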
Agents are orchestrated in three phases to manage dependencies and maximize parallelism.
Phase I
Parallel foundation
Paper Analyzer · Trend Mapper
No dependencies. Run in parallel against the raw corpus.
Phase II
Builds on Phase I
Contradiction Finder · Benchmark Extractor
Use the structured summaries from the Paper Analyzer. Run in parallel with each other.
Phase III
Synthesis
Frontier Detector
Synthesizes outputs from all four prior agents. Runs last with full context.
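The three-phase schedule above maps naturally onto `Promise.all`. The sketch below is a minimal illustration under assumed names (`LabAgents`, `runPipeline`) with agents modeled as plain async functions; it omits retries, timeouts, and persistence, which a real orchestrator would need.

```typescript
// Each agent is modeled as an async function; the type parameters stand in for
// the corpus (C) and the five structured artifacts.
interface LabAgents<C, S, T, X, B, F> {
  paperAnalyzer: (corpus: C) => Promise<S>;
  trendMapper: (corpus: C) => Promise<T>;
  contradictionFinder: (corpus: C, summaries: S) => Promise<X>;
  benchmarkExtractor: (corpus: C, summaries: S) => Promise<B>;
  frontierDetector: (inputs: {
    summaries: S;
    trends: T;
    contradictions: X;
    benchmarks: B;
  }) => Promise<F>;
}

async function runPipeline<C, S, T, X, B, F>(
  corpus: C,
  agents: LabAgents<C, S, T, X, B, F>,
) {
  // Phase I: no dependencies, so both agents start at once.
  const [summaries, trends] = await Promise.all([
    agents.paperAnalyzer(corpus),
    agents.trendMapper(corpus),
  ]);

  // Phase II: both agents need the Paper Analyzer's summaries, not each other.
  const [contradictions, benchmarks] = await Promise.all([
    agents.contradictionFinder(corpus, summaries),
    agents.benchmarkExtractor(corpus, summaries),
  ]);

  // Phase III: synthesis over all four prior outputs.
  const frontier = await agents.frontierDetector({
    summaries,
    trends,
    contradictions,
    benchmarks,
  });

  return { summaries, trends, contradictions, benchmarks, frontier };
}
```

Because `Promise.all` rejects as soon as any agent fails, one failure aborts the whole phase; `Promise.allSettled` would be the alternative if partial results are acceptable.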
Next.js 15
App Router
Drizzle ORM
PostgreSQL
pgvector
HNSW ANN index
Gemini
text-embedding-001
Instruct LLM
Agent reasoning
Recharts
Trend & landscape charts
The full source code — ingestion, agents, API routes, frontend — is on GitHub. Issues and pull requests are welcome.
AI Research Lab · built by Abhi Das