Open Source: Dealflow Data Platform – AI + Neo4j Integration

Hi everyone :waving_hand:

I’m Santosh Narayanan, a Senior Full Stack & Cloud Engineer specializing in Generative AI and LLM Systems.

I recently published the Dealflow Data Platform, an open-source project that integrates LangChain, Neo4j Aura (free tier) / Neo4j Desktop, and Google Cloud Run to explore AI-driven graph reasoning and Cypher query generation.

This post marks the completion of Phase 1, where I focused on building the AI–graph integration layer and end-to-end workflow for intelligent data mapping.
Phase 2 will expand into deeper reasoning, richer data models, and interactive visualizations.
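
For readers wondering how LLM-generated Cypher can be kept safe before it reaches Neo4j, here is a minimal sketch of a read-only guard. It is not taken from the repo: the function names and the keyword list are assumptions, and a check like this is best paired with a read-only database role rather than used alone.

```typescript
// Hypothetical guard: accept only read-only Cypher produced by an LLM.
// The keyword list is an assumption and deliberately conservative; it does
// not inspect CALL procedures, which can also write.
const WRITE_KEYWORDS = [
  "CREATE", "MERGE", "DELETE", "DETACH", "SET", "REMOVE", "DROP", "LOAD", "FOREACH",
];

/** Replace quoted string literals so keywords inside values are ignored. */
function stripStringLiterals(query: string): string {
  return query.replace(/'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/g, "''");
}

/** Returns true when no write keyword appears as a whole word in the query. */
function isReadOnlyCypher(query: string): boolean {
  const normalized = stripStringLiterals(query).toUpperCase();
  return !WRITE_KEYWORDS.some((kw) => new RegExp(`\\b${kw}\\b`).test(normalized));
}
```

The check errs on the side of rejecting: a label or property literally named `Set` would be refused even in a read query, which seems an acceptable trade-off for generated queries.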

:link: GitHub: https://github.com/santoshnarayanan/dealflow-data

More details are available in README.md and Technical-design.md.

:backhand_index_pointing_right: Progress update (Phases 2 to 4, plus a Phase 5 preview)

:brain: Continuing my progress on the AI-powered Dealflow Dashboard (LangChain + Neo4j + Weaviate + React)!
Since the last update, I’ve completed Phase 2, Phase 3, and Phase 4, expanding the system into a full hybrid RAG + multi-agent reasoning platform. Here’s the gist :backhand_index_pointing_down:


:rocket: Phase 2 — Vector DB + Semantic Search (Weaviate)

  • Added Weaviate with text2vec-openai for embeddings.

  • Created startup & investor vector schemas.

  • Implemented semantic search endpoints for:

    • /vector/startups?q=...

    • /vector/investors?q=...

  • Frontend semantic search UI to explore AI-powered similarity results.

:sparkles: This enabled contextual RAG workflows and improved natural-language discovery.
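
As an illustration of the request side of these endpoints, the sketch below parses a `/vector/startups?q=...` or `/vector/investors?q=...` URL into a search request. The collection names (`Startup`, `Investor`), the optional `limit` parameter, and the cap of 50 are assumptions for the example, not details from the repo.

```typescript
type VectorCollection = "Startup" | "Investor";

interface VectorSearchRequest {
  collection: VectorCollection;
  query: string;
  limit: number;
}

// Maps the endpoint paths from the post to assumed Weaviate collection names.
const COLLECTION_BY_PATH: Record<string, VectorCollection> = {
  "/vector/startups": "Startup",
  "/vector/investors": "Investor",
};

function parseVectorSearch(rawUrl: string, defaultLimit = 5): VectorSearchRequest {
  // The base URL only exists so that a relative path can be parsed.
  const url = new URL(rawUrl, "http://localhost");
  const collection = COLLECTION_BY_PATH[url.pathname];
  if (!collection) {
    throw new Error(`Unknown vector endpoint: ${url.pathname}`);
  }
  const query = (url.searchParams.get("q") ?? "").trim();
  if (query.length === 0) {
    throw new Error("Missing required query parameter: q");
  }
  // `limit` is a hypothetical extra parameter; invalid values fall back to the default.
  const limitParam = Number(url.searchParams.get("limit"));
  const limit =
    Number.isInteger(limitParam) && limitParam > 0 ? Math.min(limitParam, 50) : defaultLimit;
  return { collection, query, limit };
}
```

The resulting object would then be handed to whatever Weaviate client call performs the semantic (nearText-style) search.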


:robot: Phase 3 — Hybrid AI (Graph + Vector + LLM Synthesis)

  • Implemented 3 major LangChain pipelines:

    • askGraph → LLM → Cypher → Neo4j

    • askRag → Semantic retrieval via Weaviate

    • askHybrid → Combine Neo4j + Weaviate + LLM summarization

  • Frontend pages added:

    • AI Query, Semantic Search, Hybrid AI

  • Architecture now supports richer multi-source reasoning.

:sparkles: The system can now answer queries using BOTH graph relations and semantic context.
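
To make the askHybrid idea concrete, here is a rough sketch of how graph rows and vector hits could be merged into a single prompt for the summarization step. The types, the function name `buildHybridPrompt`, and the prompt wording are all hypothetical; the actual pipeline in the repo may combine sources differently.

```typescript
// Hypothetical shapes for results coming back from Neo4j and Weaviate.
interface GraphRow {
  [key: string]: string | number | boolean | null;
}

interface VectorHit {
  text: string;
  distance: number; // smaller means more similar
}

function buildHybridPrompt(
  question: string,
  graphRows: GraphRow[],
  vectorHits: VectorHit[],
  maxHits = 3,
): string {
  const graphSection =
    graphRows.length > 0
      ? graphRows.map((row) => `- ${JSON.stringify(row)}`).join("\n")
      : "- (no graph results)";

  // Keep only the closest hits so the prompt stays small.
  const closest = [...vectorHits].sort((a, b) => a.distance - b.distance).slice(0, maxHits);
  const vectorSection =
    closest.length > 0
      ? closest.map((hit) => `- ${hit.text}`).join("\n")
      : "- (no semantic matches)";

  return [
    "Answer the question using only the facts below.",
    "",
    "Graph facts (Neo4j):",
    graphSection,
    "",
    "Semantic context (Weaviate):",
    vectorSection,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Keeping the two sources in separately labelled sections lets the LLM (and anyone debugging the prompt) see which facts came from the graph and which from semantic retrieval.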


:robot::handshake: Phase 4 — Multi-Agent Orchestrator (Classifier → Cypher → Vector → Answer Agent)

Built a production-style multi-agent reasoning workflow:

:one: Classifier Agent — routes user queries to cypher, vector, or hybrid
:two: Cypher Agent — generates Cypher + executes on Neo4j
:three: Vector Agent — semantic search on Weaviate
:four: Answer Agent — synthesizes JSON answer + explanation
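
The routing step above can be sketched roughly as follows. This assumes the Classifier Agent returns a single label as text; the function names and the fallback to `hybrid` on unrecognized output are illustrative choices, not necessarily what the repo does.

```typescript
type Route = "cypher" | "vector" | "hybrid";
type AgentName = "cypher" | "vector" | "answer";

const ROUTES: readonly Route[] = ["cypher", "vector", "hybrid"];

// Which agents run, in order, for each route. The Answer Agent always runs last.
const AGENTS_BY_ROUTE: Record<Route, AgentName[]> = {
  cypher: ["cypher", "answer"],
  vector: ["vector", "answer"],
  hybrid: ["cypher", "vector", "answer"],
};

/** Normalize raw classifier text into a route; fall back to hybrid if unclear. */
function parseRoute(classifierOutput: string): Route {
  const label = classifierOutput.trim().toLowerCase();
  const match = ROUTES.find((route) => route === label);
  return match ?? "hybrid";
}

function planAgents(route: Route): AgentName[] {
  return AGENTS_BY_ROUTE[route];
}
```

Falling back to `hybrid` means a malformed classifier response costs an extra lookup rather than a failed request.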

Additional upgrades:

  • Prometheus metrics (/metrics)

  • Detailed pino logging

  • Health checks for Neo4j, Weaviate, Agents (/health/*)

  • New frontend module: AI Multi-Agent Chat

:sparkles: The system now intelligently selects reasoning paths and explains how it arrived at answers.
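
For the `/health/*` checks, one possible way to roll individual probe results into a single report is sketched below. The status names, the 200/503 mapping, and all type names are assumptions made for illustration.

```typescript
type Dependency = "neo4j" | "weaviate" | "agents";

interface ProbeResult {
  ok: boolean;
  latencyMs: number;
  error?: string;
}

interface HealthReport {
  status: "ok" | "degraded" | "down";
  httpStatus: 200 | 503;
  checks: Record<Dependency, ProbeResult>;
}

/** Combine per-dependency probe results into one report for a health endpoint. */
function summarizeHealth(checks: Record<Dependency, ProbeResult>): HealthReport {
  const results = [checks.neo4j, checks.weaviate, checks.agents];
  const failures = results.filter((result) => !result.ok).length;
  const status = failures === 0 ? "ok" : failures === results.length ? "down" : "degraded";
  // Any failure returns 503 so load balancers and uptime checks notice it.
  return { status, httpStatus: failures === 0 ? 200 : 503, checks };
}
```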


:crystal_ball: What’s coming in Phase 5 (preview)

Phase 5 is focused on production-grade deployment + observability + system intelligence:

:ship: Deployment (GCP)

  • Cloud Run services (frontend + backend)

  • Managed vector DB option (Weaviate Cloud / Vertex AI Vector Search)

  • Traefik / API gateway routing

:bar_chart: Observability

  • OpenTelemetry distributed tracing

  • Grafana dashboards for metrics

  • Slow-query monitoring for Neo4j & Weaviate

  • Error budget & performance thresholds
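
Slow-query monitoring could start as something as small as the timer below, before any tracing backend is wired in. This is only a sketch under assumptions: the names, the threshold semantics, and the injectable clock (used to keep it testable) are all hypothetical.

```typescript
interface SlowQueryEvent {
  store: "neo4j" | "weaviate";
  label: string;
  durationMs: number;
}

function createSlowQueryMonitor(
  thresholdMs: number,
  onSlow: (event: SlowQueryEvent) => void,
  now: () => number = () => Date.now(),
) {
  return {
    /** Start timing a query; call the returned function once it has finished. */
    start(store: SlowQueryEvent["store"], label: string): () => number {
      const startedAt = now();
      return () => {
        const durationMs = now() - startedAt;
        if (durationMs >= thresholdMs) {
          onSlow({ store, label, durationMs });
        }
        return durationMs;
      };
    },
  };
}
```

In a request handler, `const stop = monitor.start("neo4j", "topInvestors")` would go before the awaited query and `stop()` after it, with `onSlow` forwarding the event to the logger or a metrics counter.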

:brain: Multi-Agent Enhancements

  • Short-term + long-term memory (Redis)

  • Agent-level caching to reduce LLM calls

  • More domain agents (Funding Rounds, Investor Profiling)
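
To illustrate the agent-level caching idea, here is a small in-memory TTL cache keyed by agent and normalized question. It is a stand-in for the planned Redis-backed version, not a description of it; the key format, the normalization rule, and the names are assumptions.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

/** Build a cache key so trivially different phrasings share one entry. */
function cacheKeyFor(agent: string, question: string): string {
  return `${agent}:${question.trim().toLowerCase().replace(/\s+/g, " ")}`;
}

function createAgentCache<V>(ttlMs: number, now: () => number = () => Date.now()) {
  const entries = new Map<string, CacheEntry<V>>();
  return {
    get(key: string): V | undefined {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(key); // expired: drop it so the caller re-runs the agent
        return undefined;
      }
      return entry.value;
    },
    set(key: string, value: V): void {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}
```

An agent would check `get` before calling the LLM and call `set` with the result afterwards, so repeated questions within the TTL skip the LLM call entirely.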

:chart_increasing: Graph Expansion

  • Adding FundingRound relationships

  • Deeper investment patterns & reasoning queries
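
One way the FundingRound expansion might be modelled is sketched below as a parameterized Cypher upsert built from TypeScript. The labels, relationship types (`RAISED`, `PARTICIPATED_IN`), and property names are guesses for illustration only; the repo's actual graph model may differ.

```typescript
interface FundingRoundInput {
  startup: string;
  roundId: string;
  stage: string;
  amountUsd: number;
  investors: string[];
}

interface CypherStatement {
  text: string;
  parameters: {
    startup: string;
    roundId: string;
    stage: string;
    amountUsd: number;
    investors: string[];
  };
}

// Hypothetical schema: (Startup)-[:RAISED]->(FundingRound)<-[:PARTICIPATED_IN]-(Investor)
const UPSERT_FUNDING_ROUND = `
MERGE (s:Startup {name: $startup})
MERGE (r:FundingRound {id: $roundId})
SET r.stage = $stage, r.amountUsd = $amountUsd
MERGE (s)-[:RAISED]->(r)
WITH r
UNWIND $investors AS investorName
MERGE (i:Investor {name: investorName})
MERGE (i)-[:PARTICIPATED_IN]->(r)
`.trim();

function fundingRoundStatement(input: FundingRoundInput): CypherStatement {
  if (input.amountUsd < 0) {
    throw new Error("amountUsd must be non-negative");
  }
  // Trim names, drop blanks, and de-duplicate while keeping the original order.
  const investors = Array.from(
    new Set(input.investors.map((name) => name.trim()).filter((name) => name.length > 0)),
  );
  return {
    text: UPSERT_FUNDING_ROUND,
    parameters: {
      startup: input.startup,
      roundId: input.roundId,
      stage: input.stage,
      amountUsd: input.amountUsd,
      investors,
    },
  };
}
```

Using `MERGE` with parameters keeps the import idempotent and avoids string-concatenating user data into the query.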


:link: Repo: https://github.com/santoshnarayanan/dealflow-data