ivevdokimov

Senior Machine Learning Engineer

Senior Machine Learning Engineer with 5 years of experience specializing in NLP, LLMs, and AI. Proven track record of building scalable NLP pipelines, developing AI agents, and fine-tuning models using PEFT (LoRA) and RL. Deep expertise in RAG architectures and model inference optimization (vLLM, sglang, quantization). Contributor to the open-source projects verl, vllm, trl, and sglang. Seeking opportunities to build and scale generative AI products.



Experience: 5 years

Yearly salary: $120,000

Hourly rate: $70

Nationality: 🌏 Remote

Residency: 🇷🇸 Serbia


Experience

Senior Machine Learning Engineer
Sber
2024 - 2026
The project involves optimizing the bank’s internal processes using AI agents. Sber is the largest bank in Central and Eastern Europe, with 291K+ employees.
Built interactive analytics of user paths (clickstream/process mining) in Sber products: transition graphs, identification of business-significant states, search for drop-off points, and generation of hypotheses on the causes of conversion drops.
Reduced time-to-insight for the process redesign team from 2 days to 2-3 hours per process, which increased the number of optimized processes 2.4-fold (from 12 to 29 per month) and significantly increased the share of complex process redesigns.
Enhanced internal datasets, cross-validated GigaChat LLM versions and their equivalents on the bank’s business cases, and produced quality reports. Expanded the benchmark from 11 to 34 internal cases.
Stack: python, sql, pyspark, airflow, s3, openai, sglang, fastapi, docker, mlflow, streamlit, langgraph.
Machine Learning Engineer
Skoltech
2021 - 2024
Skoltech is an international technological university combining cutting-edge research with technology application, ranked top-25 in the Nature Index Rising Young Universities.
Project 1 - Expanding the Effective Context of LLM: The project studies ways to increase the effective context length of open-source LLMs. Reproduced results from scientific papers and built evaluation pipelines to run benchmarks and validate custom architectural solutions. Designed novel architectures and retrieval algorithms using SFT and RL to improve answer quality in long-context settings. Implemented production-grade code for model architectures, training, and validation. Integrated vllm, shortening validation runs so the team could run more experiments.
Stack: python, pytorch, transformers, openai, vllm, verl, chromadb.
Project 2 - AI sales assistant: The project aims to implement a real-time AI advisor to increase sales conversions. Applied statistical methods, machine learning algorithms, and LLMs to generate alerts from audio conversations. Developed an internal benchmark for evaluating Whisper quality and optimized hyperparameters, increasing RPS by 12.9% while reducing WER by 0.73%. Fine-tuned an LLM with LoRA to generate dialogue criteria, increasing sales conversion by 14.53% in relative terms (from 2.89% to 3.31%).
Stack: python, pytorch, s3, redis, kafka, openai, whisper, faststream, mlflow.
Project 3 - Dynamic rubricator: The project involves building a clustering system for complaint topics and identifying spikes in requests. Developed a request clustering and monitoring system processing 120k requests/day to surface hot issues and detect demand spikes early. Extracted semantic information and generated topic headings. Implemented automated hourly report generation and an interactive dashboard with drill-down analysis. Increased recall in early detection of request spikes by 21 percentage points (from 57% to 78%).
Stack: python, pyspark, sql, scikit-learn, sentence-transformers.

Skills

aws
nosql
open-source
python
pytorch
sql
ai
english