open to senior AI roles

Dhiraj Chaudhary

AI & Data Engineer

I build LLM and multi-agent systems, along with the data platforms they run on. I lead Core Data Platforms at KKR and ship live AI products on the side.

dhiraj@dev:~$ cat about.md

About

I'm an AI & data engineer in New York. At KKR I lead the Core Data Platforms team (grew it 1 → 5), where I own a Spark/Scala ETL framework that powers 10,000+ big-data jobs — and build the multi-agent LLM systems moving our data operations from human-run to autonomous. I was one of the first there to push AI into production, back before Cursor or Copilot existed.

I care about the unglamorous parts of AI: grounding agents in real context so they don't hallucinate, structured outputs that actually validate, and human-in-the-loop designs that ship. On the side I build and run live AI products end to end — Get New Resume, TimeBrew, and more.

ai / llm
LangGraph · LiteLLM · OpenRouter · RAG · evals · Claude · GPT-4
data
Spark · Scala · Iceberg · Hadoop · Hive · ETL
cloud
AWS Lambda · Step Functions · Glue · EMR · DynamoDB · Serverless
languages
Python · SQL · Scala · TypeScript · Bash
web
Next.js · React · FastAPI · Flask · Tailwind
$ git log --oneline experience/
2022 — present

Sr Software / Data Engineer

KKR (Global Atlantic)
2021

Software Engineer Intern

Tarifica
dhiraj@dev:~$ ls ~/work

Work

KKR (Global Atlantic) · Sr Software / Data Engineer · builds & leads Core Data Platforms (1 → 5) · 2022 — present
incident-agents · ai

Multi-agent system (LangGraph) that diagnoses job failures and drafts the fix — opens a GitHub PR for code issues, a one-click workflow for infra/data. Grounded in per-job profiles; human-in-the-loop by design.

langgraph · litellm · claude
analyzer · proposer · validator
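A rough plain-Python sketch of how an analyzer, proposer, validator hand-off with a human gate could be wired. Only the three role names and the PR-versus-workflow split come from the description above; the function bodies, the `code_error_markers` profile field, and the sample log are made up for illustration, and this deliberately does not use LangGraph.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    kind: str          # "pull_request" or "workflow"
    summary: str
    approved: bool = False


def analyzer(failure_log, job_profile):
    """Classify a failure, using the per-job profile as grounding context."""
    # Hypothetical profile field: substrings that indicate a code-level bug.
    markers = job_profile["code_error_markers"]
    if any(marker in failure_log for marker in markers):
        return "code"
    return "infra_or_data"


def proposer(category, failure_log):
    """Draft a fix: a PR for code issues, a workflow for everything else."""
    kind = "pull_request" if category == "code" else "workflow"
    return Proposal(kind=kind, summary=f"Proposed fix for: {failure_log[:60]}")


def validator(proposal, human_approves):
    """Record the human decision; nothing is marked approved without it."""
    proposal.approved = bool(human_approves(proposal))
    return proposal


# Illustrative inputs only.
profile = {"code_error_markers": ["NullPointerException", "AnalysisException"]}
log = "Job failed: org.apache.spark.sql.AnalysisException: cannot resolve column"

category = analyzer(log, profile)
result = validator(proposer(category, log), human_approves=lambda p: True)
```

In a real system each role would be an LLM call with tools; the point here is only the shape of the hand-off and the approval gate.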
incident-files · ai

The precursor: an LLM engine for data-file incidents. TPAs send hand-made Excel/CSV that breaks the pipeline — a renamed column, a changed date format — and it diffs expected-vs-actual, proposes a fix-the-file-or-fix-the-code action a human approves, then ingests instantly.

llm · python · hitl
2–3 days → hours
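The expected-vs-actual diff for a renamed column can be illustrated with a small sketch. This is an assumption about one way to do it, using fuzzy string matching from the standard library rather than an LLM; `diff_columns` and the column names are hypothetical.

```python
from difflib import get_close_matches


def diff_columns(expected, actual):
    """Compare expected vs. actual column names and guess likely renames."""
    missing = [c for c in expected if c not in actual]
    candidates = [c for c in actual if c not in expected]
    renames = {}
    for col in missing:
        # Fuzzy-match each missing column against the unexplained extras.
        match = get_close_matches(col, candidates, n=1, cutoff=0.6)
        if match:
            renames[match[0]] = col
            candidates.remove(match[0])  # each extra column is used at most once
    return {
        "missing": [c for c in missing if c not in renames.values()],
        "unexpected": candidates,
        "proposed_renames": renames,
    }


# Illustrative schema: the sender renamed one column and added another.
report = diff_columns(
    expected=["policy_id", "effective_date", "premium"],
    actual=["policy_id", "effective_dt", "premium", "notes"],
)
```

The report is a proposal, not an action: a human would still approve whether to fix the file or fix the code.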
dq-chatbot · ai

Two-agent data-quality system: one inspects a table, infers its domain, and proposes 10–100 rules (with tools to fetch stats, sample rows, and test-execute candidates); a second runs daily and reasons over rule output + exploratory stats to give a plain-English health verdict.

langgraph · tools · dq
two agents · proactive DQ
deal-codegen · ai

LLM system that auto-generates the 15–20 pipeline modules each new insurance deal needs — built on a canonical medallion model and grounded in prior deals' code. Shipped before Cursor/Copilot existed.

llm · python · codegen
−60% integration · pre-Cursor
llm-migrations · ai

LLM pipelines that migrated 2,000+ Sybase procs → Redshift, 500 SAS jobs → Spark, and 1,000 Tableau dashboards via guided XML rewrites + a visual-diff system. Human-validated throughout.

llm · redshift · spark
months → days
doc-translation · ai

Secure multi-language PDF/PPTX translation built from parts — parses every text + layout element, translates through open-source models running on-prem (nothing leaves the VPC), then reassembles into an identical-looking document. Verified by a round-trip eval loop scoring semantic similarity + sentiment divergence.

nlp · on-prem · evals
on-prem models · round-trip evals
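A minimal sketch of the round-trip idea: translate, translate back, and score how much of the source survives. The metric here is a character-level stand-in from the standard library, not the semantic-similarity or sentiment scoring described above, and the two toy "models" are hypothetical word-for-word lookup tables.

```python
from difflib import SequenceMatcher


def round_trip_score(source, translate, back_translate):
    """Translate, translate back, and compare the result with the source."""
    translated = translate(source)
    restored = back_translate(translated)
    # Stand-in metric: character-level similarity, NOT semantic similarity.
    return SequenceMatcher(None, source.lower(), restored.lower()).ratio()


# Toy stand-ins for real translation models (made-up vocabulary).
FORWARD = {"net": "neto", "asset": "activo", "value": "valor"}
BACKWARD = {v: k for k, v in FORWARD.items()}


def toy_translate(text):
    return " ".join(FORWARD.get(word, word) for word in text.split())


def toy_back_translate(text):
    return " ".join(BACKWARD.get(word, word) for word in text.split())


score = round_trip_score("net asset value", toy_translate, toy_back_translate)

# A lossy back-translation should score lower than a faithful one.
lossy = round_trip_score("net asset value", toy_translate, lambda text: "net")
```

Swapping the metric for an embedding-based similarity would keep the same loop structure; only the scoring line changes.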
spark-etl-framework · platform

Config-driven Spark/Scala ETL framework I scaled into KKR's most-used internal utility. Self-service job generation, a validator producing 5M+ data-point checks, complex features behind single config flags.

spark · scala · platform
10k+ jobs · −80% runtime
json-ingestion · platform

Real-time ingestion for 300–400k tiny JSON files per load — jobs that once ran 17–18 hrs/day. Two wins: a Spark-distributed file move (5 hrs → 2 min) and RDD-level normalization that fixed list-vs-dict key drift before the DataFrame, after Lambda + pandas kept running out of memory.

spark · rdd · python
5 hrs → 2 min · 300k+ files
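To illustrate the list-vs-dict drift, here is a small per-record normalizer in plain Python. It is a sketch under assumptions: `normalize_record`, the `holders` key, and the sample payloads are hypothetical, and the production job's actual rules are not shown in this write-up.

```python
import json


def normalize_record(raw, list_keys):
    """Parse one JSON string and force the given keys to always hold a list."""
    record = json.loads(raw)
    for key in list_keys:
        value = record.get(key)
        if value is None:
            record[key] = []        # missing or null becomes an empty list
        elif not isinstance(value, list):
            record[key] = [value]   # a single object becomes a one-item list
    return record


# Illustrative payloads: the same key arrives as a dict, a list, or not at all.
raw_files = [
    '{"id": 1, "holders": {"name": "A"}}',
    '{"id": 2, "holders": [{"name": "B"}, {"name": "C"}]}',
    '{"id": 3}',
]
normalized = [normalize_record(raw, list_keys=["holders"]) for raw in raw_files]
```

Because the function works on one record at a time, the same logic could be applied with `rdd.map(...)` before building a DataFrame, so every row reaches schema inference with a consistent shape.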
serverless-platform · platform

One of the lead architects of KKR's move off 24/7 EC2 onto Glue / Lambda / EMR — designed the architecture, proved feasibility, built the v0, then enabled the org to migrate. Custom tooling where out-of-box couldn't: S3 trigger handler, dependency watcher, operational job handler.

aws · serverless · platform
8k+ hrs saved · −40% detection · real-time
data-fabric · platform

Self-service platform to migrate any source DB — Oracle, Snowflake, DB2, Redshift — into Apache Iceberg, creating a unified, cheap data fabric. JSON-driven, built on the Spark framework.

iceberg · spark · self-serve
500+ tables · 4 teams
variance-tracker · platform

Solo full-stack platform to track and sign off data variances across multiple databases — Flask end to end, with automated email notifications and approval routing. Shipped to AWS after passing the enterprise Architecture Review Board.

flask · aws · full-stack
15+ stakeholders · −70% approvals
dhiraj@dev:~$ ls ~/products

Live products

AI products I've designed, built, and shipped end to end — live on the internet, outside of work — plus the autonomous agent that runs one of them.

live

Get New Resume getnewresume.com

Turns a resume + job description into a tailored resume, ATS match score, and cover letter — with a zero-fabrication constraint. 4-stage LLM pipeline, multi-model routing, typed/validated outputs.

Next.js · Lambda · OpenRouter · DynamoDB
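One way to picture "typed/validated outputs" plus a zero-fabrication check is a validator that rejects any skill the source resume never mentions. This is a simplified sketch, not the product's pipeline: the `TailoredResume` fields, the `validate` function, and the substring check are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TailoredResume:
    summary: str
    skills: tuple[str, ...]
    match_score: int  # 0 to 100


def validate(output: dict, source_resume: str) -> TailoredResume:
    """Type-check a model's JSON output and reject skills absent from the source."""
    summary = output.get("summary")
    if not isinstance(summary, str):
        raise ValueError("summary must be a string")
    score = output.get("match_score")
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("match_score must be an integer in [0, 100]")
    skills = output.get("skills")
    if not isinstance(skills, list) or not all(isinstance(s, str) for s in skills):
        raise ValueError("skills must be a list of strings")
    # Naive fabrication check: every skill must literally appear in the source.
    source = source_resume.lower()
    invented = [s for s in skills if s.lower() not in source]
    if invented:
        raise ValueError(f"skills not found in source resume: {invented}")
    return TailoredResume(summary, tuple(skills), score)


# Illustrative input only.
resume_text = "Built Spark and Python pipelines on AWS."
ok = validate(
    {"summary": "Data engineer.", "skills": ["Spark", "Python"], "match_score": 82},
    resume_text,
)
```

A literal substring match is crude (it misses synonyms and can match inside longer words), so a real check would likely need something more careful.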
running

Milly getnewresume.com

The autonomous Claude agent that runs Get New Resume's back office on a live EC2 box — reads incoming email, decomposes it into tasks, spawns worker sub-agents, and ships a daily briefing. Role-based (dispatcher · babysitter · briefer), token-budgeted, self-healing via cron.

Claude · Node · EC2 · SES/S3/SQS
live

TimeBrew timebrew.news

Personalized AI news briefings in a 'Morning Brew' voice. A 3-stage Step Functions pipeline (curator → editor → dispatcher) across 14 Lambdas, timezone-aware scheduling, ~$0.02 per briefing.

Step Functions · Perplexity · GPT-4 · SES
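Timezone-aware scheduling boils down to converting "7:00 in the reader's timezone" into a UTC instant. A minimal sketch with the standard library follows; `next_send_utc` and the sample times are hypothetical, and how the real pipeline stores or triggers schedules is not shown here.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def next_send_utc(now_utc, tz_name, local_send_time):
    """Return the next UTC instant at which it is local_send_time in tz_name."""
    tz = ZoneInfo(tz_name)
    local_now = now_utc.astimezone(tz)
    candidate = datetime.combine(local_now.date(), local_send_time, tzinfo=tz)
    if candidate <= local_now:
        # Today's slot has already passed, so use tomorrow's.
        tomorrow = local_now.date() + timedelta(days=1)
        candidate = datetime.combine(tomorrow, local_send_time, tzinfo=tz)
    return candidate.astimezone(timezone.utc)


# Illustrative: the same 07:00 local briefing for two readers.
now = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
new_york = next_send_utc(now, "America/New_York", time(7, 0))
tokyo = next_send_utc(now, "Asia/Tokyo", time(7, 0))
```

Building the candidate in local time and converting to UTC afterwards lets `zoneinfo` handle daylight-saving offsets instead of hard-coding them.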
live

InspireInbox inspireinbox.com

LLM-powered motivational platform — personalized content tuned to your growth goals, with scheduled delivery, feedback collection, and analytics.

FastAPI · React
live

Notion Crafts notioncrafts.com

A web app hosting interactive widgets (clocks, timers, counters) and icon packs you can embed in Notion. Python/Flask backend on EC2 behind Nginx + Cloudflare.

Flask · EC2 · Nginx
dhiraj@dev:~$ ls ~/labs

Labs & open source

Apps, experiments, a published Python package, and undergrad research — the things I build to learn.

jotted

Keyboard-first 'mission control' for running many AI agents at once — drag-drop task lanes per agent, a Cmd-K command palette, and PM-grade timeline / radar / kanban views. Built it to manage my own parallel Claude runs.

next.js · zustand · agents
wip · cmd-k · vim nav
drink-water

Native iOS hydration tracker (SwiftUI) — animated progress styles, 90-day history, smart reminders, streaks, and full VoiceOver/dark-mode support. Built App-Store-ready, MVVM + OSLog.

swift · swiftui · ios
app-store-ready
student-circle

Serverless student social platform — FastAPI on AWS Lambda with Cognito auth, API Gateway, and S3, plus a Vite/React front end. Multi-stage dev/staging/prod infra via the Serverless Framework.

fastapi · lambda · react
serverless · cognito
proboabfunc

A Python package (published to PyPI) for statistical operations and plots over binomial & gaussian distributions, built with OOP on pandas / NumPy / matplotlib.

python · pypi · stats
published · pip install
research-papers

Undergrad research: a Flood-It solver algorithm in Python (presented at the Mathematical Association of America) and a bio-robotics study on hybrid artificial/biological structures (SJC research symposium).

algorithms · research · math
2 papers · presented
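For readers unfamiliar with Flood-It, here is a short greedy solver: at each step it picks the color that grows the top-left region the most. This is a textbook-style baseline written for illustration, not the algorithm from the paper, and the board is made up.

```python
def flood_region(grid):
    """Return the cells connected to the top-left cell that share its color."""
    rows, cols = len(grid), len(grid[0])
    color = grid[0][0]
    seen, stack = {(0, 0)}, [(0, 0)]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                if grid[nr][nc] == color:
                    seen.add((nr, nc))
                    stack.append((nr, nc))
    return seen


def apply_move(grid, color):
    """Recolor the flooded region and return a new grid (input is not mutated)."""
    region = flood_region(grid)
    return [
        [color if (r, c) in region else cell for c, cell in enumerate(row)]
        for r, row in enumerate(grid)
    ]


def greedy_solve(grid):
    """Repeatedly choose the color that yields the largest flooded region."""
    moves = []
    total = len(grid) * len(grid[0])
    while len(flood_region(grid)) < total:
        colors = sorted({cell for row in grid for cell in row} - {grid[0][0]})
        # Ties go to the smallest color value because the candidates are sorted.
        best = max(colors, key=lambda c: len(flood_region(apply_move(grid, c))))
        grid = apply_move(grid, best)
        moves.append(best)
    return moves


# Illustrative 3x3 board with three colors.
board = [
    [0, 1, 1],
    [2, 1, 0],
    [2, 2, 0],
]
moves = greedy_solve(board)
```

Greedy always terminates, since some neighboring color extends the region on every move, but it is not guaranteed to find the shortest solution.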
dhiraj@dev:~$ ./contact.sh

Get in touch

Building something that needs solid data & AI plumbing?

dhirajc963@gmail.com
$ cat contact.json
{
  "email": "dhirajc963@gmail.com",
  "github": "github.com/dhirajc963",
  "linkedin": "linkedin.com/in/dhiraj-kumarcdry",
  "location": "New York"
}