Earnings Call Graph Analyst
Agent name
Earnings Call Graph Analyst
GitHub repository: mu-jeong/earnings-call-graph (GitHub)
What it does
Earnings Call Graph Analyst is a source-grounded financial and industry analysis agent for public earnings-call materials. It converts earnings-call transcripts and company-published earnings materials into a Neo4j graph of companies, source documents, chunks, entities, relation facts, and ontology concepts.
The current demo is intentionally specialized for AI infrastructure research. It helps users compare how companies describe demand, product exposure, revenue momentum, storage/networking/compute capacity, and related business outcomes across the loaded FY2026 Q2 graph.
Instead of only retrieving transcript snippets, the agent answers with graph reasoning paths such as:
AI Demand --DRIVES--> Cloud growth
Blackwell Ultra --DRIVES--> AI Demand
AI Demand --DRIVES--> Custom silicon
AI Infrastructure Solutions --DRIVES--> Product Revenue
These paths make the answer explainable: the user can see which source entity, relationship, and target entity support each conclusion, then inspect the supporting chunk or source document.
The agent is designed for analyst-style questions such as:
- Which companies are currently loaded in the graph?
- What positive AI infrastructure demand signals are companies reporting?
- Which company is directly connected to a specific AI infrastructure product category?
- What does the graph show about NVIDIA, Blackwell, and data center revenue growth?
- What are the most connected entities, relation paths, and source-backed evidence chunks in the graph?
Current dataset and demo scope
The current demo focuses on a curated FY2026 Q2 set of AI infrastructure-related earnings-call materials from:
- Cisco (CSCO)
- Marvell Technology (MRVL)
- Microsoft (MSFT)
- NetApp (NTAP)
- NVIDIA (NVDA)
- Seagate Technology (STX)
- Super Micro Computer (SMCI)
This is a demo dataset, not a hard-coded product limit. The app reads companies from source manifests and from loaded Neo4j Company nodes, so additional companies can be added by providing new Markdown files or private source-manifest entries and regenerating/loading the graph.
Why a graph fits
Earnings calls are narrative-heavy and inconsistent across companies. The same industry theme can appear under different company-specific language: Blackwell, custom silicon, Microsoft Cloud, data-center switching, enterprise storage, AI servers, cloud AI growth, or AI infrastructure revenue.
The graph preserves those company-specific terms as entities while connecting them to structured relation facts and ontology concepts. This lets the agent compare companies without flattening away the original evidence.
A simplified graph pattern is:
Company -> EarningsCall -> FiscalPeriod
SourceDocument -> EarningsCall
SourceDocument -> MarkdownChunk
MarkdownChunk <- SUPPORTED_BY - RelationFact
RelationFact -> FROM_ENTITY -> Entity
RelationFact -> TO_ENTITY -> Entity
Entity -> OntologyConcept
This structure lets the agent explain why something is true, not just return a keyword match.
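As a minimal sketch of how this pattern supports explanation, the Python below models only the RelationFact and MarkdownChunk parts of the graph in memory. The class and field names are illustrative assumptions rather than the project's actual schema properties, and the chunk text is a placeholder, not a transcript quote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkdownChunk:
    chunk_id: str
    text: str


@dataclass(frozen=True)
class RelationFact:
    from_entity: str  # FROM_ENTITY side of the fact
    relation: str     # relation type, e.g. DRIVES
    to_entity: str    # TO_ENTITY side of the fact
    supported_by: MarkdownChunk  # the SUPPORTED_BY link to the source chunk


def reasoning_path(fact: RelationFact) -> str:
    """Render a fact in the arrow notation used for the example paths."""
    return f"{fact.from_entity} --{fact.relation}--> {fact.to_entity}"


def explain(fact: RelationFact) -> dict:
    """Pair the readable path with the chunk that backs it."""
    return {
        "path": reasoning_path(fact),
        "chunk_id": fact.supported_by.chunk_id,
        "evidence": fact.supported_by.text,
    }


# Placeholder chunk: the id follows the document's example format.
chunk = MarkdownChunk(
    chunk_id="company-quarter-document-chunk-001",
    text="Placeholder paragraph standing in for a transcript excerpt.",
)
fact = RelationFact("Blackwell Ultra", "DRIVES", "AI Demand", chunk)
print(explain(fact)["path"])
```

Because every fact carries its supporting chunk, an answer can always show the path and the evidence together.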
Document parsing and chunking
Source materials can be provided in two ways:
- Place normalized local Markdown transcripts under data/source_cache/markdown/.
- Provide official source URLs in a private manifest and let the pipeline materialize the source text.
The source materializer supports PDF, DOCX, and HTML earnings-call materials. It downloads official source material, extracts text, normalizes paragraphs, and writes cached Markdown for repeatable local ingest.
During ingest, each Markdown document is split into paragraph-scoped MarkdownChunk nodes. The parser tracks headings and speaker-style lines, removes page markers and source boilerplate, and groups non-empty lines into paragraphs. Long paragraphs are split at sentence boundaries with a target maximum of 900 characters, so each chunk remains small enough for source-grounded extraction while preserving local context.
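The paragraph grouping and sentence-boundary splitting could look roughly like the sketch below. It is not the project's parser: the function names are hypothetical, the sentence-boundary regex is deliberately naive, and heading tracking, speaker detection, and boilerplate removal are left out. Only the 900-character target comes from the description above.

```python
import re

MAX_CHARS = 900  # target maximum stated in the text


def split_paragraph(paragraph: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split one paragraph at sentence boundaries into pieces of at most max_chars.

    A single sentence longer than max_chars is kept whole, so the limit is a
    target rather than a hard guarantee.
    """
    text = " ".join(paragraph.split())  # collapse line breaks and extra spaces
    if len(text) <= max_chars:
        return [text] if text else []
    # Naive boundary: whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    pieces, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            pieces.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        pieces.append(current)
    return pieces


def chunk_markdown(markdown: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Group non-empty lines into paragraphs, then split the long ones."""
    chunks = []
    for block in re.split(r"\n\s*\n", markdown):
        chunks.extend(split_paragraph(block, max_chars))
    return chunks
```

Splitting only at sentence boundaries keeps each piece readable on its own, which matters when a chunk is later shown as evidence.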
Each chunk receives a stable document-scoped id such as company-quarter-document-chunk-001, along with metadata including heading, chunk type, source line range, text hash, and document id. The chunk type can distinguish prepared remarks, Q&A sections, analyst questions, management answers, document overviews, and general speaker statements when the source exposes that structure.
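A chunk record of this shape might be assembled as in the sketch below. The helper name, the dictionary keys, and the choice of SHA-256 for the text hash are assumptions made for illustration; the source does not state which hash function the pipeline uses.

```python
import hashlib


def make_chunk_record(document_id: str, index: int, text: str, heading: str,
                      chunk_type: str, line_start: int, line_end: int) -> dict:
    """Build chunk metadata with a document-scoped id and a content hash."""
    return {
        # Zero-padded index gives ids such as company-quarter-document-chunk-001.
        "chunk_id": f"{document_id}-chunk-{index:03d}",
        "document_id": document_id,
        "heading": heading,
        "chunk_type": chunk_type,
        "source_lines": (line_start, line_end),
        # Assumed hash function: SHA-256 over the UTF-8 encoded chunk text.
        "text_hash": hashlib.sha256(text.encode("utf-8")).hexdigest(),
    }


record = make_chunk_record(
    document_id="company-quarter-document",
    index=1,
    text="Placeholder paragraph text.",
    heading="Prepared remarks",
    chunk_type="prepared_remarks",
    line_start=10,
    line_end=14,
)
print(record["chunk_id"])
```

Deriving the id from the document id and position keeps it stable across re-ingests, while the hash makes it cheap to detect when a chunk's text has changed.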
Graph extraction is chunk-scoped. The pipeline first builds a document-level ontology or canonical vocabulary from the full Markdown document, then asks Gemini to extract entities and relations from each paragraph chunk independently, optionally in batches. Relations are accepted only when supported by the current chunk, and each RelationFact is linked back to its supporting MarkdownChunk. This is what lets the agent show graph paths and source evidence instead of unsupported transcript summaries.
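One way to enforce the "supported by the current chunk" rule is a post-extraction filter like the sketch below. This is an assumption about how support could be checked, not a description of the project's validator: it assumes each extracted candidate carries a quoted evidence string and accepts it only when that quote occurs in the chunk text. The dictionary keys and the sample text are hypothetical.

```python
def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so minor formatting differences do not matter."""
    return " ".join(text.lower().split())


def filter_supported(candidates: list[dict], chunk: dict) -> list[dict]:
    """Keep candidates whose quoted evidence occurs in the chunk and tag them with its id."""
    chunk_text = _normalize(chunk["text"])
    accepted = []
    for candidate in candidates:
        evidence = _normalize(candidate.get("evidence", ""))
        if evidence and evidence in chunk_text:
            accepted.append({**candidate, "supported_by": chunk["chunk_id"]})
    return accepted


# Placeholder chunk and candidates, not real transcript content.
chunk = {
    "chunk_id": "company-quarter-document-chunk-001",
    "text": "Demand for the platform drove cloud growth this quarter.",
}
candidates = [
    {"from": "AI Demand", "relation": "DRIVES", "to": "Cloud growth",
     "evidence": "drove cloud growth"},
    {"from": "AI Demand", "relation": "PRESSURES", "to": "Margins",
     "evidence": "margins declined"},
]
print(filter_supported(candidates, chunk))
```

Only the first candidate survives, because the second one quotes text that does not appear in the chunk.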
Streamlit application
The web application is implemented with Streamlit and has four main tabs:
- Graph: Main interactive graph exploration surface for the loaded earnings-call corpus. Users can switch between company-level entity paths and ontology-grouped concept views, filter by search term, company, ontology concept, and minimum node connection count, and inspect visible source-backed relation paths. The tab also provides an LLM-generated graph overview that summarizes key points and company differences from connected referenced chunks.
- Ask: Deterministic graph question answering. Ask matches the question against entity names, relation types, evidence snippets, entity properties, and ontology concepts, then builds an answer from matched RelationFact -> MarkdownChunk evidence. The LLM summary from referenced chunks action appears directly under the Question input; when run, the matched referenced chunks become the LLM input for a source-grounded synthesis. Matched relations are shown as cards, with Support / upside and Risk / pressure separated into two columns, and ontology mappings shown on their own Ontology: line inside each card.
- Ask (Aura): Local tester for a Neo4j Aura Agent-like graph-tool workflow. Neo4j Aura supports agent-style graph tools; this project recreates a similar router -> tool execution -> answer flow in Streamlit with a LangGraph-style orchestration pattern. The user enters only a question, the router selects the tool, and the answer emphasizes Graph reasoning path and Referenced chunk instead of separate Evidence, Positive signal, or Why it matters columns.
- Key Nodes: High-signal entity exploration. Key Nodes ranks important entities from the full loaded graph and shows each selected node's source-backed relation table plus referenced transcript chunks. It also provides an LLM summary action that uses those connected referenced chunks as input, so users can get a synthesized explanation of the selected node's evidence without writing Cypher.
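The deterministic matching described for the Ask tab can be approximated with simple token overlap, as in the sketch below. The scoring rule, stopword list, and field names are assumptions for illustration. The actual app may weight fields differently or match on more properties.

```python
import re

# Small illustrative stopword list; the real matcher may use none or a different one.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "are", "what", "which",
             "in", "for", "to", "about", "does", "show"}


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def match_facts(question: str, facts: list[dict], top_k: int = 5) -> list[dict]:
    """Rank facts by how many question tokens appear in their searchable fields."""
    wanted = tokens(question)
    scored = []
    for fact in facts:
        haystack = " ".join([
            fact["from_entity"], fact["relation"], fact["to_entity"],
            fact.get("evidence", ""), fact.get("ontology", ""),
        ])
        score = len(wanted & tokens(haystack))
        if score:
            scored.append((score, fact))
    # The sort is stable, so facts with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:top_k]]


facts = [
    {"from_entity": "AI Infrastructure Solutions", "relation": "DRIVES",
     "to_entity": "Product Revenue"},
    {"from_entity": "Blackwell Ultra", "relation": "DRIVES", "to_entity": "AI Demand"},
]
for fact in match_facts("What drives AI demand?", facts):
    print(f'{fact["from_entity"]} --{fact["relation"]}--> {fact["to_entity"]}')
```

Because no model call is involved, the same question against the same graph always returns the same ranked facts.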
Agent tools
The project supports an Aura-style tool workflow in two places:
- AuraDB demo configuration: the actual AuraDB agent tool setup and screenshots are preserved in docs/auraDB/.
- Local Streamlit tester: the Ask (Aura) tab mirrors the same tool behavior so the workflows can be tested locally in the web app.
The local tester works like an agent: the user enters only a question, a router chooses the most appropriate tool, and the chosen tool runs with the normalized question. The Ask (Aura) tab does not expose a manual tool override control, so tool selection is always automatic.
Tool type split
- loaded_company_universe and frequent_entities are fixed Cypher-template tools.
- ai_positive_demand_by_company, ai_risks_constraints_by_company, company_ai_deep_dive, and product_category_evidence_map are Text2Cypher tools.
For Text2Cypher tools, the app generates or selects a tool-specific read-only Cypher query, validates it, runs it against Neo4j, and sends the returned rows into a tool-specific answer prompt or deterministic local renderer. The answer format follows the AuraDB examples at a high level: executive summary, source-backed table, and graph reasoning or cross-company takeaway. Aura tool answers intentionally avoid separate evidence-gap/caveat sections so the visible output stays focused on graph reasoning paths and referenced chunks.
In the current local implementation, Ask (Aura) is intentionally positioned as a Neo4j Aura Agent-like workflow that can be exercised before recreating the tools in Aura. The Streamlit app mimics Aura's agent-style graph-tool behavior with a LangGraph-style orchestration pattern: a router chooses the appropriate tool from the user question, the selected tool executes either a fixed Cypher template or a constrained read-only Text2Cypher query, and the answer writer renders a user-facing response. Internal Cypher diagnostics stay hidden by default. For Text2Cypher answers, the visible table shows Company (when applicable), Graph reasoning path, and Referenced chunk; redundant generated columns such as Positive signal, Evidence, and Why it matters are stripped so the user sees the source chunk supporting each graph path.
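The router step can be pictured with the rule-based sketch below. The six tool names and their split into Cypher-template and Text2Cypher kinds come from this section. The keyword rules, the fallback tool, and the function names are hypothetical, and the real router may well be LLM-driven rather than rule-based.

```python
# Tool kinds follow the tool type split described in this section.
TOOL_KINDS = {
    "loaded_company_universe": "cypher_template",
    "frequent_entities": "cypher_template",
    "ai_positive_demand_by_company": "text2cypher",
    "ai_risks_constraints_by_company": "text2cypher",
    "company_ai_deep_dive": "text2cypher",
    "product_category_evidence_map": "text2cypher",
}

# Lowercased names from the demo company list, used to spot single-company questions.
COMPANIES = ("cisco", "marvell", "microsoft", "netapp", "nvidia", "seagate", "super micro")


def route(question: str) -> str:
    """Pick a tool name from keyword cues (hypothetical rules, checked in order)."""
    q = " ".join(question.lower().split())
    if "which companies" in q:
        return "loaded_company_universe"
    if "most connected" in q or "frequent" in q:
        return "frequent_entities"
    if "product category" in q:
        return "product_category_evidence_map"
    if any(word in q for word in ("risk", "constraint", "bottleneck")):
        return "ai_risks_constraints_by_company"
    if sum(name in q for name in COMPANIES) == 1:
        return "company_ai_deep_dive"
    return "ai_positive_demand_by_company"  # assumed default for cross-company questions


def plan(question: str) -> dict:
    """Return the normalized question, the selected tool, and how that tool executes."""
    tool = route(question)
    return {"question": " ".join(question.split()), "tool": tool, "kind": TOOL_KINDS[tool]}


print(plan("Which companies are currently loaded in the earnings-call graph?"))
```

Keeping the tool kind next to the tool name lets the execution step decide between running a fixed template and entering the validated Text2Cypher path.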
The Text2Cypher path includes guardrails:
- generated Cypher must be read-only,
- generated Cypher must include RETURN and LIMIT,
- write/admin/procedure clauses are rejected,
- overly large limits are capped to the available graph size,
- cross-company tools rebalance rows by company when a first pass over-focuses on one company,
- and Neo4j syntax errors can trigger an automatic repair pass.
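Several of these guardrails can be illustrated with a small keyword-based validator. The sketch below is an assumption about how such a check might be written, not the project's implementation: the forbidden-clause list is a plausible but incomplete selection, the function name is hypothetical, and keyword matching can produce false positives (for example on a property literally named set) that a real Cypher parser would avoid. It also only recognizes numeric LIMIT values, so a parameterized limit would be rejected.

```python
import re

# Clauses treated as write/admin/procedure operations (illustrative, not exhaustive).
FORBIDDEN = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|CALL|LOAD\s+CSV|FOREACH|GRANT|DENY|REVOKE)\b",
    re.IGNORECASE,
)
LIMIT = re.compile(r"\bLIMIT\s+(\d+)\b", re.IGNORECASE)
RETURN = re.compile(r"\bRETURN\b", re.IGNORECASE)


def validate_cypher(query: str, max_rows: int) -> str:
    """Return the query with every LIMIT capped at max_rows, or raise ValueError."""
    match = FORBIDDEN.search(query)
    if match:
        raise ValueError(f"forbidden clause: {match.group(1).upper()}")
    if not RETURN.search(query):
        raise ValueError("query must include RETURN")
    if not LIMIT.search(query):
        raise ValueError("query must include LIMIT")
    # Rewrite each numeric LIMIT so it never exceeds max_rows.
    return LIMIT.sub(lambda m: f"LIMIT {min(int(m.group(1)), max_rows)}", query)


# The Company label appears in the graph description; the name property is assumed.
print(validate_cypher("MATCH (c:Company) RETURN c.name LIMIT 5000", max_rows=200))
```

Rejecting a query outright is the conservative choice here: a repair pass can regenerate it, whereas silently stripping a write clause could change what the query means.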
1. loaded_company_universe
Description:
List companies currently loaded in the graph before cross-company analysis.
What it does:
- Returns the graph's currently loaded company universe.
- Helps the agent avoid guessing which companies are available.
- Establishes the scope for follow-up cross-company analysis.
Example question:
Which companies are currently loaded in the earnings-call graph?
2. ai_positive_demand_by_company
Description:
Use Text2Cypher to find company-by-company positive demand signals for AI infrastructure.
What it does:
- Compares demand signals across the loaded company set.
- Prioritizes graph relationships that connect AI infrastructure themes to growth, revenue, product demand, orders, adoption, capacity, or production signals.
- Returns company, graph reasoning path, evidence, source entity, relation, target entity, and concept context so the answer remains auditable.
Example question:
Across the loaded earnings-call graph, what positive demand signals are companies reporting for the AI infrastructure industry?
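Cross-company tools such as this one rebalance rows by company when a first pass over-focuses on one company. One possible reading of that step is a round-robin interleave over the returned rows, sketched below. The function name and the company row key are assumptions, and the project may instead re-query per company.

```python
from collections import defaultdict
from itertools import zip_longest


def rebalance_by_company(rows: list[dict], limit: int) -> list[dict]:
    """Interleave rows round-robin by company so no single company fills the result."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["company"]].append(row)
    balanced = []
    # Each tier holds the next row from every company, in first-seen company order.
    for tier in zip_longest(*buckets.values()):
        balanced.extend(row for row in tier if row is not None)
    return balanced[:limit]


rows = [
    {"company": "A", "path": "path 1"},
    {"company": "A", "path": "path 2"},
    {"company": "A", "path": "path 3"},
    {"company": "B", "path": "path 4"},
]
print([row["company"] for row in rebalance_by_company(rows, limit=3)])
```

With a limit of three, company B's single row is kept instead of being pushed out by company A's third row.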
3. ai_risks_constraints_by_company
Description:
Use Text2Cypher to find company-by-company risks, bottlenecks, or constraints for AI infrastructure growth.
What it does:
- Compares negative or limiting AI infrastructure signals across the loaded company set.
- Prioritizes source-backed relation paths connected to risk, constraints, supply, capacity, cost, margin, or regulatory pressure.
- Keeps risk/constraint answers grounded in referenced chunks rather than unsupported market commentary.
Example question:
Across the loaded earnings-call graph, what risks or constraints are companies reporting for AI infrastructure growth?
4. company_ai_deep_dive
Description:
Use Text2Cypher to retrieve one company's AI, product, and data-center source chunks.
What it does:
- Focuses the analysis on a single company.
- Pulls together relevant AI infrastructure entities, relationships, and supporting transcript evidence for that company.
- Supports follow-up questions about product families, demand drivers, and growth commentary.
Example question:
For NVIDIA, what does the earnings-call graph show about AI demand, Blackwell, and data center revenue growth?
5. product_category_evidence_map
Description:
Use Text2Cypher to map a requested product category to company chunks and graph paths.
What it does:
- Starts from a product or category term and finds company-specific graph evidence connected to it.
- Helps compare which companies are exposed to the same infrastructure category.
- Preserves company-specific terms while mapping them back to broader product or ontology concepts.
Example question:
Across the loaded graph, summarize company evidence for the AI accelerator product category.
AuraDB screenshots / demo evidence
The following AuraDB-related materials are intentionally preserved even though the AuraDB agent configuration itself is external to the Python source tree. They document the actual AuraDB tool behavior and expected answer style.
Loaded company universe
This screenshot shows the agent identifying the companies currently available in the loaded earnings-call graph.
Positive AI infrastructure demand signals
This screenshot shows the agent summarizing positive demand signals across loaded companies with company-by-company evidence and graph reasoning paths.
Company deep dive
This screenshot shows the company-specific deep dive workflow for NVIDIA and AI/Blackwell-related evidence.
Product category evidence map
This screenshot shows the product-category workflow, mapping AI accelerator evidence across companies.
Web UI screenshots
The local Streamlit screenshots in docs/web/ document the user-facing workflow, not only the underlying graph data model. Together, they show how a user moves from graph exploration to source-grounded analysis without writing Cypher.
Graph exploration
This screenshot shows the main graph workspace: search/company/ontology filters, graph scope controls, result-limit controls, a graph overview generated from connected referenced chunks, and the interactive relation graph. It demonstrates that the application uses relationships as the analysis surface rather than treating the data as a flat document search index.
Ask: source-grounded answer workflow
This screenshot shows deterministic graph question answering. The LLM summary from referenced chunks action appears directly below the question and uses matched referenced chunks as its input. Below the summary, matched relations are split into Support / upside and Risk / pressure columns, with each card showing the graph path, a separate Ontology: line, evidence text, confidence, and chunk id. The lower section keeps matched evidence and referenced chunks inspectable.
Ask (Aura): routed graph-tool workflow
This screenshot shows the local Aura-style agent workflow. The user enters only a question; the app routes it to the best tool, runs the selected Cypher-template or Text2Cypher tool, and writes the answer from returned graph rows. The visible output emphasizes Graph reasoning path and Referenced chunk, mirroring the Aura tool workflow while keeping the analysis source-grounded.
Key Nodes: selected-node evidence summary
This screenshot shows high-signal node exploration. After selecting a key node, the UI displays source-backed relation rows, provides an LLM summary from referenced chunks action using the selected node's connected chunks, and keeps the underlying referenced chunks available for inspection.
What makes it useful
Earnings Call Graph Analyst is useful because earnings season creates a large volume of unstructured company commentary. Analysts often need to know not just what one company said, but how multiple companies across the AI infrastructure value chain describe the same demand environment.
The agent helps answer questions such as:
- Is AI infrastructure demand broad-based or concentrated in a few companies?
- Which layer of the stack is benefiting: cloud, networking, accelerators, storage, servers, or custom silicon?
- Which statements are supported by explicit graph paths and source evidence?
- Which company-specific product terms map to broader industry themes?
- Where do companies differ in the way they describe demand, capacity, or product exposure?
Limitations and caveats
- The current graph covers a curated company set, not the entire market.
- The current demo scope is FY2026 Q2-oriented and should not be interpreted as a complete multi-period history.
- The agent answers only from the loaded graph data and should not be treated as investment advice.
- Extracted graph paths are only as complete as the loaded source materials and chunk-level extraction results.
- Text2Cypher is validated and repaired, but generated queries can still miss relevant evidence or require reruns for broad questions.
- The current evidence map shows the strongest extracted graph paths, not every transcript mention.
Final submission blurb
Earnings Call Graph Analyst turns public AI-infrastructure earnings-call materials into a Neo4j graph-backed analyst agent. It helps users ask source-grounded questions about AI demand, Blackwell, custom silicon, cloud growth, networking, storage, AI servers, and company-specific exposure across the AI infrastructure value chain. The graph matters because companies use different language for similar industry themes; the agent preserves company-specific terms while exposing explainable paths such as Blackwell Ultra --DRIVES--> AI Demand and AI Demand --DRIVES--> Cloud growth. This makes the result more than transcript search: it is a relationship-driven evidence map for understanding AI infrastructure trends during earnings season.



