ABOUT
Hello! I'm Dev, an engineer who loves creating technology-driven solutions, especially those powered by data, machine learning, and scalable software. My journey kicked off at IIT (BHU) in India, where I first got hooked on NLP and Computer Vision—sparking my passion for building smarter systems. Since then, I've sharpened ML models during my internship, crafted scalable data solutions at Societe Generale, and even became a certified Google Cloud Professional MLE along the way.
Now, I'm diving deeper into cutting-edge AI and software engineering at Texas A&M, always eager to tackle meaningful challenges. Whether it's building something new or improving what's out there, I'm all about making tech that actually matters.
Here's what I've been working with:
- Python
- Java
- C++
- JavaScript
- SQL
EXPERIENCE
Nov 2024 - Present Student Assistant @Texas A&M Engineering Experiment Station (TEES)
Architected an AI-driven e-learning platform using AWS (DynamoDB, EC2, S3), React/NextJS, OpenAI APIs, and Pinecone, delivering personalized learning to 400+ students via real-time ML-powered features.
Developed a RAG-based TA chatbot and automated question-generation pipelines with LangChain, improving question relevance by 25%.
Engineered and deployed scalable data and AI pipelines, integrating transcript extraction and performance analytics.
- AWS
- DynamoDB
- EC2
- S3
- React
- NextJS
- OpenAI APIs
- Pinecone
June 2021 - December 2023 Software Engineer (Data) @Societe Generale
Led development of scalable data processing systems for credit risk exposure analysis, leveraging Big Data technologies to enable high-volume (Million+ rows/day) data ingestion and transformation for downstream analytics.
Won a Spot Award for being an excellent team player.
Enhanced autoscaling on Azure Data Lake to save €XX,XXX per quarter.
Developed several alerting and monitoring features and scripts using Scala and Spark, improving the team's productivity.
Accelerated system validation by 50% through automated regression testing.
Mentored new teammates and contributed to every phase of the Software Development Lifecycle, from requirements gathering and data analysis to deployment and production support.
- Java
- Spring Boot
- Apache Spark
- Kafka
- Elasticsearch
- Jenkins
- SQL
- Azure
- Scala
May 2020 - June 2020 Data Scientist Intern @Societe Generale
Developed an ML-based incident resolution recommendation system, reducing operational risks and improving response times by 38%. Worked closely with cross-functional teams to design, develop, and test the MVP, meeting strict deadlines.
- Python
- Scikit-learn
- NLTK
- Pandas
- spaCy
- NetworkX
2024 - Present M.S. Computer Science @Texas A&M University
GPA: 4.0/4.0
- Large Language Models
- Deep Learning
- Software Engineering
2017 - 2021 B.Tech. Electronics Engineering @Indian Institute of Technology (BHU)
GPA: 8.78/10.0
- Natural Language Processing
- Computer Vision
PROJECTS
PaperFormer: A Citation-Graph Enhanced Language Model for Scientific Applications
Rithik Kapoor, Dev Garg, Ruihong Huang. [Under Review at Association for Computational Linguistics (ACL 2025)]. A novel citation-aware model for research papers, built on a LoRA fine-tuned LLaMA, that achieves a 51% reduction in perplexity and a state-of-the-art improvement in summarization.
- PyTorch
- Ray
- LLaMA
News Aggregation and Recommendation System
A distributed, AI-driven platform that ingests and clusters news from multiple sources, generates real-time summaries, and delivers personalized, bias-aware recommendations based on user interactions. The system integrates MLOps for model monitoring and retraining, ensuring scalable, low-latency content delivery.
- Kafka
- Spark
- FAISS
- MLflow
- Kubeflow
- Redis
- PostgreSQL
- Elasticsearch
- ETL
Agentic Self-Corrective RAG
A multi-agent, web-search-enabled Retrieval-Augmented Generation system based on Llama 3, designed to minimize hallucinations. Achieved a 20% increase in answer relevance and a 5% improvement in faithfulness.
- LangChain
- Ollama
- AWS
- Docker
Stock Market Charting App
Perform CRUD operations on complex financial data, then analyze and visualize it through intuitive charts.
- Angular
- Spring Boot
- PostgreSQL
BLOGS
February 2023 Designing Data-Intensive Applications: Notes
My notes on the book Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems.