About my work

I build systems where models meet meaning. Over the past few years, I’ve worked across open source, devtools, consumer, and multimodal AI, from designing frameworks that help developers integrate Visual Language Models into production to building narrative-aware agents that understand long videos and movies.

My work sits at the intersection of multimodal AI (VLMs), LLM agents, and real-world product systems: not just models, but systems that actually ship and scale.

I’ve worked across open source, startups, and research systems:
Senior Research Engineer & maintainer for PyTorch Lightning (~5M+ monthly downloads), where I led distributed training features like TPU Pod support
Built evaluation and retrieval systems for long video understanding at Rumi Labs (a16z-backed), including multi-agent pipelines and multi-context RAG
Co-developed VLM Run’s vision MCP server and schema-driven extraction systems for structured data from images, PDFs, and video
Co-founded Sequels AI, an agentic AI media OS generating contextual content in real time, later licensed to an a16z-backed startup

What I can help you with:

1. Multimodal AI systems (Vision + Language)
Extract structured data from images, documents, or video
Build pipelines that actually work in production (not just demos)

2. LLM agents & workflows
LangGraph / agent orchestration
Tool-using agents, evaluation, and reliability

3. AI system architecture
Designing scalable backends (FastAPI, queues, async pipelines)
RAG systems with real-world constraints (latency, retrieval quality, cost)

4. Evaluation & reliability
Build eval systems for LLMs, agents, or multimodal models
Define metrics that actually reflect product performance

How I typically work:
Turn vague ideas into clear system designs
Identify what’s actually hard vs hype
Help you ship faster with fewer iterations

If you’re building:
AI products using real-world data
Multimodal systems (vision, video, documents)
Agent-based workflows
Developer tools for AI

I can help you go from idea → architecture → working system.

Highlights
3
Roles held
Experience

AI Consultant

Current

Reiko, 2024-12-01 – 2026-04-11

Built evaluation, retrieval, and RL systems for multimodal AI, with a focus on long video understanding, agentic pipelines, and computer use agents.
Rumi Labs (a16z-backed)
Built RUMI-EVALS, an automated evaluation framework for long video understanding that generates high-quality MCQs from scene annotations to test narrative, temporal, and character reasoning
Developed a LangGraph-based multi-agent pipeline for question synthesis, tournament-based scene ranking, and multi-context evaluation across modalities
Architected a RAG system with dual-context retrieval (temporal + semantic), query intent classification, and multi-vector embeddings (narrative/sensory/spatial/emotional)
AGI Inc (Menlo-backed)
Designed evaluation + RL systems for computer-use agents, focusing on trajectory-based evaluation, reward modeling, and measuring task success, efficiency, and robustness
VLM Run (SPC-backed)
Co-developed the MCP server for vision, enabling LLM agents to process and extract structured data from visual inputs
Built a scalable video transcription backend using queue-based pipelines for high-throughput processing
Led development of open-source tooling (SDK, hub, cookbooks) to improve developer adoption
Implemented schema-driven extraction (Pydantic + GraphQL-style filtering) with autocasting and visualization
Built agentic systems for extracting structured data from complex documents (multi-layout tables)
Guardrails AI (Zetta Ventures-backed)
Built integrations across LlamaIndex, LangChain, NVIDIA NeMo Guardrails, and Portkey
Improved traceability via LangSmith + RunnableConfig
Led open-source growth strategy and contributor experience improvements

Co-Founder & Head of AI

Sequels AI, 2023-05-11 – 2024-11-11

Licensed the tech to an a16z-backed startup.
An agentic AI television OS that aims to transform the viewing experience through contextual content generation and personalized entertainment ecosystems.
Product Vision & Leadership
Co-founded an AI-driven entertainment platform that generates contextually relevant content, adapts to users' real-time viewing and emotional states, and integrates social elements and universal remote control capabilities
Built and led a cross-functional team of 7+ engineers recruited from Meta, Twitch, IBM, OKX, ShareChat, etc.
Core AI & ML Systems
Developed the Perceiver Engine, an agentic research framework that generates contextual "bullets" for media scenes; for example, it produced 140+ diverse content blocks for complex scenes such as Einstein and Oppenheimer's first encounter
Created the Subplot Story Tree algorithm to decompose complex narratives into hierarchical subplots, enabling enhanced story comprehension and detailed character analysis
Engineered content scene mapping infrastructure using fine-tuned transformer models to associate generated content with precise timestamps, delivering contextually relevant information at exact viewing moments
Implemented the LLMOps platform, ensuring efficient, reliable, and scalable LLM utilization across production systems
Platform & Infrastructure
Built the Cosmo Observability Platform as a consumer app proxy, providing comprehensive analytics on movie titles and content performance to drive ML system optimization
Deployed a Workflow Orchestration Platform using self-hosted Prefect for automated data retrieval, ingestion, and research pipeline management
Developed a comprehensive content management platform with FastAPI backend for optimized CRUD operations and a Python client interface for seamless research pipeline integration
Product Features & Applications
Created "Catch Me Up!" interactive feature delivering personalized recaps, character insights, and thematic breakdowns for enhanced viewer engagement
Spearheaded the Sports Match Post Generator System, automating engaging content creation for pre-game, in-game, and post-game coverage across international sports events

Senior Research Engineer & Open Source Maintainer

Lightning AI, 2021-01-11 – 2023-01-11

Lightning AI is a leading MLOps startup that makes it easy to build, train, fine-tune, and deploy models.
PyTorch Lightning is a lightweight PyTorch framework for high-performance AI research (~31k stars, ~5M monthly downloads).
Collaborated on and led features with the PyTorch team at Meta, the TPU XLA team at Google, the AI Economist team at Salesforce, the Habana team at Intel, and the SageMaker team at Amazon, among others.


Pioneered the TPU Pod training feature in Lightning, making it the first PyTorch framework to offer this capability.
Spearheaded initial development of the Generative AI Muse App.
Core contributor to the new stable accelerator and strategy API.
Led the stable development for training on TPU Accelerator.
Drove integration of Habana Accelerator with Lightning.
Led the WarpDrive Lightning integration with Salesforce.
Presented at major technical conferences, including ODSC Boston, ODSC Europe, ODSC West, Geekle Data Science Summit, and FOSSUnited, among others.
Led the integration of Rich Progress Bar for PyTorch Lightning.
Co-led the initial Lightning Documentation Revamp.
Established a procedural pipeline for app development and submission with the product team.
Developed multiple Lightning apps, including Video Search and HackerNews App.
Worked on integrating the SageMaker Distributed Training Strategy.
Added support for the Object Detection Task in Lightning Flash.
Consistently contributed to Lightning projects through feature development, bug fixes, code reviews, releases, refactors, and technical blog posts.