Advanced analytics and AI/ML leader at the University of Illinois System, delivering production-grade AI pipelines, LLM platforms, and predictive models that drive executive-level decisions.

I'm a Data Scientist at Administrative Information Technology Services (AITS), University of Illinois System, where I've served as the functional lead of Advanced Analytics and Decision Support for nearly three years. I operate at senior scope across analytics architecture, AI infrastructure, executive engagement, and stakeholder delivery in a complex, highly regulated enterprise.
My work spans the entire stack, from architecting Linux-based analytics execution environments to deploying RAG-powered legislative intelligence platforms live on Azure. I've built systems that process 79,000+ archival emails, classify research proposals in real time via 11-step NLP pipelines, and enable natural-language search across legacy institutional file archives.
I believe in independently defining ambiguous problems, designing solutions from scratch, and sustaining delivery continuity through organizational change. I've presented to executive leadership, governance committees, and the Enterprise Architecture Committee, and I've made it a point to translate complex AI into narratives that non-technical stakeholders can act on.
End-to-end AI platform that ingests, classifies, and semantically queries Illinois state and federal legislation. Extracts 12+ structured intelligence signals per bill via LLaMA, builds hybrid FAISS vector search, powers a GPT-4.1 Copilot chatbot with citation-backed responses, and deploys a live Streamlit dashboard to Azure with Z-score anomaly detection, TF-IDF trend analysis, and subcategory momentum modeling. Stakeholders include AVP Academic Affairs, IGPA, and AVP External & State Relations.
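To make the Z-score anomaly detection concrete, here is a minimal sketch of the idea rather than the platform's actual code. The function name, the threshold of 2.0, and the weekly counts are all illustrative assumptions.

```python
from statistics import mean, stdev

def zscore_anomalies(counts, threshold=2.0):
    """Return (index, z) pairs whose |z-score| exceeds the threshold."""
    mu = mean(counts)
    sigma = stdev(counts)  # sample standard deviation
    if sigma == 0:
        return []  # a flat series has no outliers
    scored = [(i, (c - mu) / sigma) for i, c in enumerate(counts)]
    return [(i, z) for i, z in scored if abs(z) > threshold]

# Hypothetical weekly counts of bills filed in one subcategory.
weekly_bill_counts = [4, 5, 3, 6, 4, 5, 21, 4]
flagged = zscore_anomalies(weekly_bill_counts)
```

With these made-up counts, only the week with 21 filings is flagged. A real deployment would likely use a rolling window so that one spike does not inflate the baseline for later weeks.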
Predictive ML system identifying students at risk of discontinuing enrollment, with ~82% classification accuracy. Uses Random Forest, Tree Ensemble, and ARIMA models, with 10-year enrollment projections by department and college. Full lifecycle ownership, from feature engineering through executive presentation to UIC leadership.
Processed 79,676 historical .eml records from a 20-year institutional archive. PII detection and anonymization via Microsoft Presidio (5 strategies), LLaMA de-identified summarization, LDA + LLM-labeled topic modeling. Scalability analysis documented the production path. Built in partnership with RIMS and the University Library.
Multimodal image moderation POC for large-scale historical collections, scaled to 12,125 images. Compared semantic similarity-based classification against LLaVA multimodal LLM at scale. Developed domain-specific sensitive content taxonomy for archival and legal risk. Inference throughput later re-engineered for a 6.22× speedup on the same NVIDIA L4 GPU — see Project 09. Active collaboration with the RIMS Committee at UIUC and the University Library cataloguing program.
15+ file-type extraction pipeline (PDF, Word, Excel, PowerPoint, images, code, HTML) with LLaVA visual intelligence and LLaMA text summarization. FAISS vector search returns Box file paths from natural-language queries, making decades of legacy institutional content discoverable for the first time. Includes image querying via LLaVA descriptions.
Predictive workforce analytics platform designed to transform EDW HR and payroll data into forward-looking strategic intelligence. Three completed POC workstreams cover internal mobility modeling, succession readiness prediction, and workforce growth forecasting. Uses Markov chain career simulation, Random Forest promotion prediction, graph-based career community detection, and O*NET-enriched skill matching. Developed in partnership with Luanne Mayorga, Assistant Chancellor for Illinois HR.
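A Markov chain career simulation can be sketched as below. This is a toy illustration under stated assumptions, not the platform's model: the job levels, the transition probabilities, and the function names are all hypothetical, and a real model would estimate transitions from EDW HR history.

```python
import random

# Hypothetical job levels with annual transition probabilities (rows sum to 1).
TRANSITIONS = {
    "Analyst": {"Analyst": 0.70, "Senior Analyst": 0.20, "Exit": 0.10},
    "Senior Analyst": {"Senior Analyst": 0.65, "Manager": 0.20, "Exit": 0.15},
    "Manager": {"Manager": 0.85, "Exit": 0.15},
    "Exit": {"Exit": 1.0},  # absorbing state
}

def simulate_career(start, years, rng):
    """Walk the chain one year at a time and return the visited states."""
    path = [start]
    for _ in range(years):
        options = TRANSITIONS[path[-1]]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        path.append(nxt)
    return path

def share_reaching(target, start, years, trials=10_000, seed=0):
    """Estimate the fraction of simulated careers that ever reach `target`."""
    rng = random.Random(seed)
    hits = sum(target in simulate_career(start, years, rng) for _ in range(trials))
    return hits / trials

manager_share = share_reaching("Manager", "Analyst", years=5)
```

Running many simulated careers turns a transition table into questions leadership can ask directly, such as what share of today's analysts are likely to reach a manager role within five years.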
Custom-designed centralized health-checking and alerting system for all scheduled analytics workflows. Detects failures, cascade errors, and silent pipelines, and generates diagnostic alert emails via Linux sendmail with full traceback detail. Runs every 30–60 min with zero manual intervention.
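The core distinction between a failed workflow and a silent one can be sketched as follows. This is an assumption-laden illustration, not the production monitor: the workflow names, the two-hour silence window, and the shape of the `last_runs` record are hypothetical, and the sendmail alerting step is left out.

```python
from datetime import datetime, timedelta

def check_workflows(last_runs, now, max_silence=timedelta(hours=2)):
    """Classify each workflow as 'ok', 'failed', or 'silent'.

    last_runs maps a workflow name to (finished_at, succeeded).
    A workflow is 'silent' when it has not reported within max_silence,
    regardless of how its last run ended.
    """
    report = {}
    for name, (finished_at, succeeded) in last_runs.items():
        if now - finished_at > max_silence:
            report[name] = "silent"
        elif not succeeded:
            report[name] = "failed"
        else:
            report[name] = "ok"
    return report

# Hypothetical snapshot of the most recent run of each scheduled job.
now = datetime(2025, 1, 1, 12, 0)
last_runs = {
    "proposal_nlp": (now - timedelta(minutes=30), True),
    "email_archive": (now - timedelta(minutes=45), False),
    "catalog_sync": (now - timedelta(hours=5), True),
}
report = check_workflows(last_runs, now)
```

Checking for silence first matters because a job that has stopped running never gets the chance to report a failure at all.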
Full-stack production pipeline connecting live Oracle databases through 11-step NLP processing (TF-IDF + KeyBERT + LDA fusion, HERDS classification) to structured Oracle writeback, running unattended on an automated schedule. Processes all SPA research proposals and awards system-wide across UIUC and UIC — foundation for the emerging Research Proposal Intelligence platform supporting the System's ~$517M annual research portfolio.
Performance engineering case study on a single NVIDIA L4 GPU. Controlled benchmark methodology isolated request serialization in the production Ollama serving layer as the throughput bottleneck — not the hardware or model. Migration to vLLM continuous batching delivered a 6.22× speedup at C=8 on the same hardware and same model class, taking the 12,125-image archival corpus from 22 hours to 3.7 hours with GPU utilization rising from 28% to 97%.
Six-stage experimental funnel benchmarking three open-weight 7–8B AWQ models (Mistral, Qwen 2.5, Llama 3.1) on a single NVIDIA L4 to defend or replace the production model choice for the Legislative Intelligence Platform. Covers chunk-size sweep (66 cells), cross-model quality validation, concurrency scaling, per-request telemetry decomposition, SLO gate analysis, and a two-round human-graded qualitative review (Experiments 3 & 3.5). Production recommendation: Llama 3.1 8B AWQ at chunk_size=1200, concurrency=12–24 — 20+ percentage-point parse-OK lead over all tested alternatives, quality invariant under concurrency (<0.6pp across c=1→24), and the only model clearing 80% parse-OK at every tested chunk size. Exp 3.5 characterized parse-OK's within-model reliability: directional for Llama, reversed for Qwen and Mistral.
End-to-end AI system for identifying cost-saving opportunities across multi-vendor purchasing catalogs. Processes 6 million catalog rows from four distributors using a three-tier pipeline: deterministic exact part-number matching (~42K matched pairs), FAISS semantic retrieval over 5.19M embeddings, and LLM equivalence reasoning with pack-size normalization. Identified $161K in defensible unit-normalized savings after correcting for 49% pack-size inflation in the raw figure. Includes a live price comparison tool (search, browse by category, browse by manufacturer) and a spending intelligence dashboard for leadership.
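Pack-size normalization is what separates a real saving from an apparent one. The sketch below shows the arithmetic on made-up numbers. The `Offer` class, the vendor names, and the prices are hypothetical and are not taken from the actual catalogs.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    vendor: str
    price: float    # price per pack, in dollars
    pack_size: int  # units per pack

    @property
    def unit_price(self) -> float:
        return self.price / self.pack_size

def savings_per_unit(current: Offer, alternative: Offer) -> float:
    """Positive means the alternative is cheaper per unit."""
    return current.unit_price - alternative.unit_price

# Hypothetical equivalent items from two distributors.
current = Offer("Vendor A", price=50.00, pack_size=10)     # $5.00 per unit
alternative = Offer("Vendor B", price=30.00, pack_size=5)  # $6.00 per unit

raw_gap = current.price - alternative.price              # pack prices only
normalized_gap = savings_per_unit(current, alternative)  # per-unit comparison
```

Here the raw pack prices suggest a $20 saving, but per unit the alternative costs $1 more. Comparing unit prices is what removes this kind of inflation from a savings estimate.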
A private memory-vault PWA that pairs photos with voice recordings and written stories, eventually feeding a personal AI that can keep speaking in your own voice to the people you leave behind. Built iPhone-first as an installable Progressive Web App with an offline-first IndexedDB architecture that is multi-tenant-ready for public launch. Stage 1 live, with backend and AI layers in progress as a side venture.
A helpful little professional chatbot that acts as a conversational introduction to me — answering questions about my background, projects, and working style using my resume, case studies, and portfolio content as its grounding corpus. Retrieval-augmented so answers stay factual; voiced in my own cadence so it feels like reading something I actually wrote. Intended to live on this portfolio site as an alternative to scrolling.
A personal project exploring the intersection of psychology, game design, and AI. An adaptive storytelling system where NPCs and narrative branches respond dynamically to a player's personality, inferred from dialogue choices mapped to the Big Five (OCEAN) framework. LLM-generated dialogue serves dual purpose: advancing story and extracting personality signals. Includes a Jungian archetype NPC system, FAISS-backed memory, and a healing/safety layer. Built modularly with FastAPI backend, vector memory, and a React chat-style UI.
Reclassified from Data Engineer to Data Scientist in Nov 2023. Functionally assumed Advanced Analytics team lead role in early 2023 following major team departures and organizational restructuring — owning architecture, delivery, prioritization, governance, and executive engagement for multiple institutional AI platforms. Moved with the analytics function to the new AI Solutions Team in Nov 2025 (team move, no formal title change).
Joined AITS as a Data Engineer on the Advanced Analytics team. Took initiative beyond the role's scope from the start — researching new techniques, developing project plans, and delivering innovative solutions independently. Functionally assumed team lead responsibilities in early 2023, prior to formal reclassification to Data Scientist later that year.
Graduated Spring 2021
Interested in analytics strategy, applied AI, or the work described here? Reach out.