yiwen828
Data Engineer
Data Engineer with a Master's in Computer Science and production experience building Python ETL pipelines, statistical anomaly detection systems, and scalable data infrastructure. Built a department-wide build-performance reporting system across 15 machines where no visibility existed before; it was adopted by 3 teams within 3 months. Strong in pipeline reliability, statistical feature engineering, and data quality.
Experience: 3 years
Yearly salary: $52,000
Hourly rate: $30
Nationality: 🇹🇼 Taiwan
Residency: UK
Experience
Software Engineer (Data Engineering & Automation)
Compal Electronics 2023 - 2025
Turned zero build-performance visibility into a structured daily data asset for 15 engineers across 3 teams, as measured by 15+ build events ingested automatically per day across 15 machines with zero manual steps, by designing and deploying a production Python ETL pipeline that extracted, normalised, and loaded build metrics (time, CPU, RAM, owner, project) into Google Sheets in real time.
Protected source-of-truth data for 4 management stakeholders across 6+ months of production use, as measured by zero overwrites to the raw data layer during the entire deployment, by designing a two-layer architecture that separated write-only raw storage from the downstream reporting dashboard.
Eliminated an estimated 33% false-positive anomaly rate caused by 30× build-duration variance, as measured by build-pattern differences between senior engineers (older hardware, higher compile frequency) and junior staff, by replacing a global threshold with per-project statistical grouping.
Reduced anomaly alert noise by 70% and ensured detection validity from day one, as measured by per-project dynamic thresholds (mean + 1.5 SD) and a 10-record minimum gate per project group, by designing a detection system that adapted to each project's historical distribution and blocked signals from sparse datasets before they reached the reporting layer.
Preserved reporting accuracy for 4 management stakeholders, as measured by preventing a methodology change that would have distorted project-level build-time baselines across all 15 machines, by benchmarking the proposed weighted-average approach against live build logs and recommending rejection on data-integrity grounds.
Achieved department-wide adoption within 3 months of launch, as measured by zero-manual-step deployment across all 15 machines and 3 teams, by designing the pipeline for frictionless integration from day one.
Maintained living technical documentation adopted by 15 engineers across 3 teams, as measured by weekly updates covering API integration, trigger logic, and statistical detection methodology, by iteratively refining documentation throughout and after deployment.
Prevented 20-30 low-quality issues from entering the engineering workflow, as measured by pre-triage auditing of test versions, conditions, reproducibility, and evidence completeness, by establishing a systematic QA review step before issues were formally logged.
Accelerated third-party defect resolution by coordinating between firmware engineers and external vendors, as measured by consistent follow-up that kept stalled issues moving, by classifying reported failures into firmware, third-party logic, and environment categories to route each to the correct owner.
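The per-project detection rule described above (a dynamic threshold of mean + 1.5 SD, with a 10-record minimum gate per project group) can be illustrated with a minimal sketch. This is not the production code: the function names, the (project, duration) record shape, and the choice of sample standard deviation are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean, stdev

MIN_RECORDS = 10      # minimum history per project before any alert is allowed
SD_MULTIPLIER = 1.5   # threshold = mean + 1.5 * SD, per project

def project_thresholds(history):
    """Build {project: threshold} from (project, duration) records.

    Hypothetical helper: projects with fewer than MIN_RECORDS records
    get no threshold, so sparse groups can never raise an alert.
    """
    by_project = defaultdict(list)
    for project, duration in history:
        by_project[project].append(duration)

    thresholds = {}
    for project, durations in by_project.items():
        if len(durations) < MIN_RECORDS:
            continue  # sparse group: blocked by the minimum-record gate
        # Assumption: sample standard deviation; the original may differ.
        thresholds[project] = mean(durations) + SD_MULTIPLIER * stdev(durations)
    return thresholds

def is_anomalous(project, duration, thresholds):
    """Flag a build only if its project has a threshold and exceeds it."""
    threshold = thresholds.get(project)
    return threshold is not None and duration > threshold
```

Grouping by project before computing the statistics is what lets a slow build on a large project and a slow build on a small one be judged against different baselines, instead of one global cutoff.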
Skills
data-science
sql
python
english
chinese-mandarin