Data scientist who builds end to end, from PDF scraping and ETL pipelines to deployed ML systems and production web applications. Graduate-level coursework in econometrics, computational statistics, and Bayesian reasoning. Ships real things: a 6.4M-row campaign finance platform in production, a real-time Whisper-powered police scanner with async architecture, interactive crime mapping in a live newsroom, and an honors thesis analyzing 484K Indian villages.
Six core services, all built around the same idea: your data should be working for you, not the other way around.
Replace the weekly scramble of pulling numbers from five different places. I build clean, automated dashboards and reports that update themselves — so you always know where things stand.
Got years of messy spreadsheets, duplicate records, or data you've never been able to make sense of? I'll clean it, structure it, and surface the patterns that matter for your decisions.
The information you need often exists online — in PDFs, government sites, or competitor pages — but there's no download button. I build custom scrapers that collect and structure it for you automatically.
Voter file analysis, precinct-level targeting, historical performance modeling, and field strategy recommendations. Data-driven campaigns win — I help you build the infrastructure for it.
Need to know what's actually driving your results — not just what correlates? I build Monte Carlo simulations to stress-test assumptions and use causal inference methods to isolate real effects from noise, so you can make decisions based on evidence, not guesswork.
Live audio transcription, keyword detection, and alerting systems powered by Whisper speech-to-text with quantized inference, voice activity detection, and fuzzy text matching — fully async architectures built to run in production.
A few examples of real projects and their outcomes.
Real-time police scanner transcription and alerting system. Ingests live Broadcastify audio via FFmpeg, runs Whisper speech-to-text with INT8 quantization, detects critical keywords with Aho-Corasick plus fuzzy matching, and pushes alerts by email, Slack, and a WebSocket dashboard. Fully async architecture packaged as a standalone macOS app.
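For readers curious how the keyword-detection step can work, here is a minimal sketch, not the project's actual code. It pairs a small hand-written Aho-Corasick automaton (exact multi-keyword search in one pass over the transcript) with a per-word fuzzy pass. The fuzzy pass uses `difflib` from the Python standard library as a stand-in for whatever matcher the project uses. The keyword list, the sample transcript, the 0.8 cutoff, and all function names are assumptions made for illustration.

```python
from collections import deque
from difflib import get_close_matches

KEYWORDS = ["shots fired", "shooting", "robbery"]  # hypothetical alert terms


def build_automaton(keywords):
    """Build Aho-Corasick tables: trie transitions, failure links, outputs."""
    goto, fail, out = [{}], [0], [[]]
    for kw in keywords:
        state = 0
        for ch in kw:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(kw)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def exact_hits(text, automaton):
    """Return (start_index, keyword) for every exact keyword occurrence."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for kw in out[state]:
            hits.append((i - len(kw) + 1, kw))
    return hits


def fuzzy_hits(text, keywords, cutoff=0.8):
    """Match each transcript word against single-word keywords, tolerating small errors."""
    single_word = [kw for kw in keywords if " " not in kw]
    hits = []
    for word in text.split():
        match = get_close_matches(word, single_word, n=1, cutoff=cutoff)
        if match and match[0] != word:  # exact matches are left to exact_hits
            hits.append((word, match[0]))
    return hits


# Hypothetical transcript with one clean keyword and one mis-transcribed one.
transcript = "units respond to a shooting near main possible robery in progress"
automaton = build_automaton(KEYWORDS)
print(exact_hits(transcript, automaton))
print(fuzzy_hits(transcript, KEYWORDS))
```

The exact pass stays linear in transcript length however many keywords are loaded, which is the usual reason to reach for Aho-Corasick on a live stream. The fuzzy pass catches near-misses that speech-to-text tends to produce.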
Built a public campaign-finance platform that ingests Illinois state bulk filings + FEC federal data, powers searchable dashboards, and surfaces donor/network analytics.
Analyzed a 600,000-row Illinois voter file to compute precinct-level deviation from the Democratic baseline, then built field-targeting recommendations from the findings.
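As a rough illustration of what "deviation from baseline" can mean, the sketch below assumes the data has already been aggregated to precinct-level vote totals. It takes each precinct's baseline to be its pooled Democratic share across earlier races, and the deviation to be the target race's share minus that baseline. The record layout, race labels, and numbers are invented for the example; the actual project may define the baseline differently.

```python
from collections import defaultdict

# Hypothetical records: (precinct, race, dem_votes, total_votes)
records = [
    ("P-01", "2018-gov", 420, 800),
    ("P-01", "2020-pres", 450, 900),
    ("P-01", "2022-target", 380, 760),
    ("P-02", "2018-gov", 300, 1000),
    ("P-02", "2020-pres", 330, 1100),
    ("P-02", "2022-target", 360, 900),
]


def dem_share_deviation(records, target_race):
    """Target-race Democratic share minus pooled share in all other races, per precinct."""
    baseline = defaultdict(lambda: [0, 0])  # precinct -> [dem_votes, total_votes]
    target_share = {}
    for precinct, race, dem, total in records:
        if race == target_race:
            target_share[precinct] = dem / total
        else:
            baseline[precinct][0] += dem
            baseline[precinct][1] += total

    deviations = {}
    for precinct, share in target_share.items():
        dem, total = baseline[precinct]
        if total == 0:
            continue  # no earlier races to compare against
        deviations[precinct] = share - dem / total
    return deviations


deviations = dem_share_deviation(records, "2022-target")
# Sort so the precincts that most outperform their baseline come first.
for precinct, dev in sorted(deviations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{precinct}: {dev:+.3f}")
```

Positive values flag precincts that ran ahead of their own history and negative values flag ones that fell behind, which is the kind of signal a field plan can rank on.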
Designed Power Query pipelines and VBA-driven Excel tools for class-booking, patient-management, and daily operating reports. Built R-based automated reporting infrastructure and dashboards.
Scraped and standardized presidential primary election data from 100+ state-party PDF documents into a single, clean, reproducible dataset for ongoing academic research.
Built a Python scraper to extract data from 250 PDF crime logs, then analyzed and visualized findings in an interactive R-based crime map for a university newsroom.
Designed a 700-iteration Monte Carlo simulation in R to evaluate how ANOVA holds up under varying effect sizes and non-normal error distributions. Analyzed Type I error rates and statistical power across conditions.
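The shape of such a simulation can be sketched in a few lines. The original was written in R; this version is in Python and is only an illustration under stated assumptions: three groups of ten observations, exponential errors as the non-normal case, and an effect of 1.5 standard deviations for the power check are all invented here. Only the 700 iterations come from the project description. With exactly three groups the F test has 2 numerator degrees of freedom, so its p-value has the closed form (1 + 2F/df_within)^(-df_within/2); that shortcut does not apply to other group counts.

```python
import random
import statistics


def anova_f(groups):
    """One-way ANOVA F statistic and within-group degrees of freedom."""
    means = [statistics.fmean(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_within


def p_value_three_groups(f_stat, df_within):
    """Upper tail of F(2, df_within); valid only when there are exactly three groups."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


def normal_error(rng):
    return rng.gauss(0.0, 1.0)


def skewed_error(rng):
    return rng.expovariate(1.0) - 1.0  # mean 0, variance 1, right-skewed


def rejection_rate(draw_error, effect=0.0, n_per_group=10,
                   iterations=700, alpha=0.05, seed=1):
    """Share of simulated datasets in which ANOVA rejects at level alpha."""
    rng = random.Random(seed)
    shifts = (0.0, 0.0, effect)  # only the third group's mean is shifted
    rejections = 0
    for _ in range(iterations):
        groups = [[shift + draw_error(rng) for _ in range(n_per_group)]
                  for shift in shifts]
        f_stat, df_within = anova_f(groups)
        if p_value_three_groups(f_stat, df_within) < alpha:
            rejections += 1
    return rejections / iterations


type1_normal = rejection_rate(normal_error)              # null true, normal errors
type1_skewed = rejection_rate(skewed_error)              # null true, skewed errors
power_normal = rejection_rate(normal_error, effect=1.5)  # null false
print(type1_normal, type1_skewed, power_normal)
```

With no effect, the rejection rate estimates the Type I error and should sit near the nominal 0.05 if the test is behaving. With a nonzero effect, the same number estimates power.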
Full-stack application aggregating U.S. Congress and Illinois General Assembly bill activity, with searchable dashboards, CSV export, and D3.js network visualizations.
Cross-sectional analysis of 484,630 Indian villages (12 merged datasets). Multiple linear regression testing the effects of domestic, commercial, and agricultural electrification on consumption. R² = 0.82. 11 academic citations (JPE, World Development, Energy Economics).
I'm Devin Oommen — a data scientist based in the Chicago area. I graduated from Northern Illinois University in 2025 with honors in Political Science, with graduate-level coursework in econometrics (OLS, 2SLS/IV, causal inference), computational statistics (MLE, Monte Carlo, bootstrap, KDE), and Bayesian reasoning.
I build end-to-end — from PDF scraping and ETL pipelines to deployed ML systems and production web applications. Recent work includes a 6.4M-row campaign finance platform, a real-time Whisper-powered police scanner with fully async architecture, interactive crime mapping in a live newsroom, and an honors thesis analyzing 484K Indian villages.
I'm especially interested in working with small businesses that know they're sitting on useful data but don't have the time or tools to do anything with it — and with political campaigns and organizations that want to make sharper, evidence-based decisions.
MPSA Conference Presenter · ICPA News Story of the Year · ICCJA Reporter of the Year · Peters Scholarship for Public Service · Mortar Board Honor Society
Free 15-minute consultation. No pitch, no pressure — just an honest look at whether I can help.
Email Me