Open to AI/ML Engineer Roles — San Jose, CA

Naveen Sama

6+ years building production LLM applications, RAG pipelines, and scalable AI systems across fintech, e-commerce, and healthcare.

Who I Am
About Me

I'm an AI/ML Engineer with 6+ years of experience turning complex AI research into systems that handle millions of real-world interactions. At PNC Financial Services, I architect conversational AI, LLM agents, and RAG pipelines that automate banking workflows at scale.

My work spans the full AI lifecycle, from data pipelines and model fine-tuning to production deployment, evaluation, and monitoring. I specialize in LLM agent orchestration, retrieval-augmented generation, and LLMOps infrastructure.

  • Location: San Jose, CA
  • Education: M.S. Big Data Analytics, SDSU, 2024 (GPA 3.94)
  • Email: naveensama0797@gmail.com
  • Focus: LLM Agents · RAG · MLOps · Fine-tuning
  • Python: 95%
  • LLM Agents: 93%
  • RAG / Vector DB: 88%
  • MLOps: 85%
  • PyTorch / TF: 87%
  • AWS / Cloud: 83%
Portfolio
Featured Projects
🤖
LangGraph Research Agent
Autonomous multi-step research agent with persistent memory, tool orchestration, and ReAct reasoning loops.
LangGraph · OpenAI · FastAPI · Docker
Multi-agent StateGraph with planner, researcher, summarizer, and critic nodes. Streaming responses, conversation checkpointing, and ReAct tool-use loops deployed via FastAPI + Docker.
LangGraph · OpenAI · FastAPI · Redis
View on GitHub →
🔍
RAG Document QA System
End-to-end retrieval-augmented generation pipeline for document Q&A with hybrid search and reranking.
LangChain · ChromaDB · FastAPI · Streamlit
Hybrid retrieval (dense embeddings + BM25), contextual compression reranking, and a Streamlit chat UI. Ingests PDFs and Word documents with chunk overlap and metadata filtering.
LangChain · ChromaDB · OpenAI · Streamlit
View on GitHub →
📈
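A note on the hybrid retrieval step in the RAG Document QA System: one common way to merge a dense-embedding ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below is an illustration under that assumption only; the project may fuse results differently, and the function name, the document IDs, and the constant `k=60` are hypothetical choices made for this example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1. Higher total score ranks first; ties are
    broken alphabetically by document ID so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a dense retriever and from BM25.
dense_ranking = ["a", "b", "c"]
bm25_ranking = ["b", "c", "d"]
fused = reciprocal_rank_fusion([dense_ranking, bm25_ranking])
print(fused)
```

Documents that appear in both lists accumulate score from each, so they tend to rise above documents found by only one retriever.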
LLM Time Series Forecaster
Benchmarks 8 LLM prompting strategies vs classical ML for short-term electricity load forecasting.
GPT-4o · Claude 3.5 · LSTM · Prophet
24 forecasters evaluated across 3 LLMs. Key finding: GPT-4o with statistical context + chain-of-thought achieves 3.41% MAPE, outperforming SARIMA, XGBoost, and LSTM baselines.
GPT-4o · Claude · LLaMA 3.1 · XGBoost · LSTM
View on GitHub →
🚀
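For readers unfamiliar with the metric quoted for the LLM Time Series Forecaster, MAPE (mean absolute percentage error) averages the absolute forecast error as a percentage of the actual value. The sketch below is a minimal illustration; the function name and the sample load values are hypothetical and are not taken from the project.

```python
def mape(actual, forecast):
    """Return the mean absolute percentage error, in percent.

    Assumes both sequences have the same non-zero length and that no
    actual value is zero (MAPE is undefined in that case).
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical hourly load values (illustrative numbers only).
actual_load = [100.0, 200.0]
predicted_load = [110.0, 190.0]
error_pct = mape(actual_load, predicted_load)
print(f"MAPE: {error_pct:.2f}%")
```

Because each error is divided by the actual value, MAPE is scale-independent, which makes it convenient for comparing forecasters across load levels.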
MLOps Pipeline
Production-grade end-to-end ML pipeline with experiment tracking, model registry, and automated deployment.
MLflow · Docker · AWS ECS · Grafana
Full lifecycle: data validation (Great Expectations), MLflow tracking, staging→production promotion, Dockerized FastAPI serving on AWS ECS, Prometheus + Grafana monitoring, GitHub Actions CI/CD.
MLflow · FastAPI · Docker · AWS · Prometheus
View on GitHub →
⚾
MLB Win Predictor
ML system predicting MLB home team win probability using advanced sabermetrics and automated report generation.
scikit-learn · SQL · Plotly · Docker
SQL feature extraction from Retrosheet data. Cramér's V + Tschuprow's T correlation analysis, Random Forest importance, brute-force variable selection. Auto-generates self-contained HTML analytics reports.
scikit-learn · Pandas · Plotly · SQL · Docker
View on GitHub →
🏥
Healthcare ML Suite
Predictive models for diabetes, heart disease, and cancer detection with SHAP explainability and class imbalance handling.
XGBoost · SHAP · SMOTE · Jupyter
Ensemble models (XGBoost + Random Forest) with SHAP explainability for clinical interpretability. SMOTE for class imbalance. ROC-AUC > 0.92 across all three prediction tasks.
XGBoost · scikit-learn · SHAP · SMOTE
View on GitHub →
Career
Experience
Jun 2024 — Present
AI/ML Engineer
PNC Financial Services · San Jose, CA
  • Conversational AI using LangGraph + Pinecone handling 1M+ monthly conversations at sub-2.5s p95 latency, reducing live agent escalations by 20%
  • Extended to voice banking with LiveKit WebRTC, Whisper STT, and ElevenLabs TTS, serving 50K+ monthly interactions at sub-1.5s latency
  • KYC agent with human-in-the-loop for OFAC sanctions screening, cutting review time 60%; CrewAI credit memo agent reducing drafting from days to hours
  • Document AI service: 94% F1, 1,000+ docs/day, reducing manual form processing by 70% across agents and business units
  • Internal RAG chatbot with Neo4j knowledge graph: 98.9% citation compliance, 2,000+ daily users, sub-2s p95 latency
  • Fine-tuned Llama 3 8B, Mistral, and Llama Nemotron via QLoRA/PEFT on EKS GPU nodes, cutting inference cost by 90%+ versus external APIs
  • LLM evaluation with DeepEval, RAGAS, LLM-as-a-Judge, and RLHF-informed prompt tuning; reduced hallucination rates across all production agents
Jan 2024 — May 2024
Machine Learning Engineer Intern
Blue Cross Blue Shield of Michigan · San Diego, CA
  • Member risk-scoring pipeline on SageMaker (XGBoost + SHAP) achieving 87% AUC-ROC for chronic disease progression prediction
  • LLM evaluation frameworks for ContractsGPT and SecureGPT on AWS Bedrock, reducing hallucinations by 5% and improving latency by 10%
May 2023 — Dec 2023
Data Science Intern
SDSU Research Foundation · San Diego, CA
  • Predictive model for high-priority customer classification (XGBoost, RF, LightGBM + SMOTE) at 84% recall; deployed on GCP Vertex AI
  • NLP pipeline (NLTK, spaCy, PyTorch) on 100GB+ FHIR EHR data reducing clinician review time from 5 min to 1 min; 92% F1 on 35K+ patient records
Jan 2021 — Jul 2022
Software Development Engineer
Amazon · India
  • Distributed competitor catalog scraping system processing 10M+ listings at 98% crawl success across 5+ marketplaces
  • Spark-based selection gap pipeline with BERT taxonomy classification covering 85% of catalog nodes; reduced selection gap by 5%
  • NLP query rewriting pipeline at sub-20ms latency reducing zero-result searches by 18% across multiple marketplaces
May 2019 — Jan 2021
Software Engineer
Accenture · India
  • Full-stack Java Spring Boot + React.js enterprise apps with JWT/OAuth 2.0; reduced UI defects 40%, improved developer velocity 30%
  • Data migration pipeline migrating 50M+ historical records at 99.5% accuracy with zero production downtime
  • Containerized microservices on Kubernetes with blue-green deployments; cut release cycles from 2 weeks to 2 days
Let's Connect
Get In Touch

Open to AI/ML Engineer roles. I bring production-tested expertise in LLM systems, RAG, and MLOps.