Master of Science: Data Science and Artificial Intelligence
Saarbrücken, Germany, 2023 - Present
Natural Language Processing, Neural Networks, Generative AI, Machine Translation
Seminar: Machine Learning for NLP - challenges and applications of large language models
Seminar: Compositionality in Language & Computation - whether neural networks learn true compositional representations or merely memorize surface-level patterns
Amity University
Bachelor of Technology: Computer Science and Engineering
Mumbai, India, 2019 - 2023
Data Structures & Algorithms, Object Oriented Programming, Database Management Systems
Minor in Animation
Bachelor Thesis: Voice-to-Image Synthesis with NLP and Sentiment Analysis
work experience
DHC Business Solutions
InnoLab, 04.25 - Present
Working Student (Werkstudent)
Applied AI Development
Designed and experimented with agentic RAG workflows using LangChain and LangGraph, including sequential, supervisor, and adaptive agent architectures.
Built internal interactive demos using Gradio to prototype and evaluate LLM-driven workflows with stakeholders.
Investigated GraphRAG-based retrieval pipelines, conducting comparative experiments that informed the transition back to optimized vector-based RAG systems.
Implemented LangGraph-based evaluation pipelines assessing RAG outputs across multiple qualitative and retrieval-based criteria.
Engineered document ingestion and chunking strategies for complex enterprise documents containing tables, images, and multi-page layouts using Docling and Unstructured.
Integrated AI systems with OpenWebUI via custom API endpoints and supported switching between local and hosted LLM deployments.
Containerized and deployed internal demos using Docker for testing and experimentation on company infrastructure.
Improved retrieval performance through reranking strategies and prompt engineering for evaluation and multimodal captioning tasks.
Developed an IT support chatbot to guide customers through structured information gathering, reducing incomplete ticket submissions.
Built a dual-index RAG architecture (Knowledge Base + Solution Archive) with multilingual German/English support using ChromaDB and Jina Reranker.
Developed LangGraph pipelines for automated Jira ticket analysis, including priority classification, root cause extraction, and workaround identification with Pydantic-structured outputs.
Built a versioned prompt management system using markdown files for reproducible LLM workflows across evaluation and ingestion tasks.
ProofRAG — Open-source Python CLI, Agent Skill, and GitHub Action for generating corpus-grounded RAG golden sets, running app predictions, evaluating retrieval and answer quality with LLM-as-judge metrics, producing HTML scorecards, and gating CI regressions.
rag-utils — Practical RAG script collection for DOCX/PDF processing, chunking, retrieval evaluation, context packing, query expansion, span tracing, SQLite embedding caching, semantic deduplication, and hybrid retrieval with reciprocal rank fusion.
LLM behavior & fine-tuning
Prescriptive Bias in LLM Sampling — Exploratory experiments measuring whether LLM samples drift from descriptive averages toward implicit ideal values across prompts, languages, temperatures, personas, and external baselines.
Program Repair and Hint Generation — Evaluated GPT-4o-mini and Phi-3-mini for Python program repair and educational hint generation on INTROPYNUS, using multi-candidate sampling, prompt engineering, LoRA fine-tuning, and multi-task learning.
Data Selection and PEFT — Combined influence-based data selection with BitFit, LoRA, and (IA)³ for MolFormer fine-tuning on molecular property prediction.
applied ML & applications
FactCheckLIAR — Hybrid BM25/FAISS fact-checking system with a fine-tuned BERT classifier, cached indexes, optional Ollama response generation, and Streamlit interface.