harshit2002

Site Reliability Engineer

Highly motivated Site Reliability Engineer with 2+ years of experience in high-availability Cloud Operations (AWS) and production incident management. Strong foundation in Linux, Bash, and Python automation. Practical experience implementing a core DevOps toolchain (Docker, Kubernetes, and CI/CD workflows), with a focus on improving system stability and deployment efficiency through automation.


Experience: 6 months

Yearly salary: $12,000

Hourly rate: $20

Nationality: 🇮🇳 India

Residency: 🇮🇳 India


Experience

Site Reliability Engineer
PrepLadder Private Limited
2025 - 2026
Performed in-depth Root Cause Analysis (RCA) for complex production incidents in high-availability systems to improve stability and reduce recurring issues. Led P1/P2 incident response, reducing Mean Time To Resolution (MTTR) by 30%. Optimised PostgreSQL queries to improve database reliability, cutting query latency by 15% during peak load. Automated key operational tasks and scheduled maintenance activities using Python and Bash scripting. Standardised incident response by developing and maintaining runbooks and Standard Operating Procedures (SOPs) to accelerate engineering fixes and knowledge transfer. Used monitoring and log analysis to drive continuous operational improvements and support the reliability of AWS cloud services.

Product Support Engineer
Exotel Techcom Private Limited
2024 - 2025
Specialised in technical diagnostics and L2 resolution for a SaaS platform, focusing on complex APIs, SIP call flows, and underlying networking issues. Performed log analysis and system diagnostics in Linux environments to isolate and resolve application, API, and network-level failures. Used debugging tools (Postman, cURL) to analyse API requests and responses and webhook logs, verifying that service integrations and communication protocols worked reliably. Served as the link between Support and Engineering, defining reproducible scenarios and providing diagnostic context to accelerate platform bug resolution. Partnered closely with Product and Engineering teams on bug validation, feature clarification, and quality assurance workflows. Created and maintained internal knowledge base articles and procedural runbooks to streamline diagnostic processes and improve team efficiency. Managed the incident lifecycle end-to-end, ensuring accurate documentation and tracking (Jira/Salesforce) to provide metrics for continuous process improvement.

Skills

ai
api
aws
ci-cd
docker
git
kubernetes
linux
python
terraform
devops
english