Data Scientist

Quantitative Researcher Intern

Experience: 3 years

Yearly salary: $100,000

Hourly rate: $65

Nationality: 🇺🇸 United States

Residency: 🇺🇸 United States


Experience

Capstone – Data Cleaning, Modelling & Market Detection
Bank of America
2025 - 2025
Developed an end-to-end pipeline for retrieving and processing ETH perpetual order book and trade book data from AWS S3, including restructuring Level 2 bid/ask data from JSON. Built a scalable, robust aggregation workflow with quality checks and data integration at multiple time granularities. Normalized data using a smoothed z-score. Implemented high-frequency features to capture market dynamics, using Polars to accelerate feature computation and achieving a 3x speed improvement. Used a stacked autoencoder for feature extraction and a Gaussian mixture model for market regime detection. Used ensemble reinforcement learning models (PPO, A2C) to build market-making strategies over different time intervals; the best cumulative monthly return was 1.5 after deducting trading costs.
Quantitative Researcher Intern
Sov.AI
2024 - 2024
Conducted a comparative analysis of options data platforms, evaluating data quality, coverage, and pricing, and authored a detailed report recommending the optimal data sources. Performed exploratory data analysis: studied the underlying concepts, visualized features over different time windows, handled missing values, and examined distributions. Features included historical volatility, implied volatility, the IV surface, etc. Applied K-means to cluster 340+ features into 4 groups and determined signal direction by grid search. Using a weekly rebalancing strategy, evaluated factor performance with metrics including Sharpe ratio, Sortino ratio, and maximum drawdown to select the top 20 factors within each cluster. Factor portfolio combination: checked the orthogonality of factors and replicated a paper's bottom-up and top-down factor combination methods; the bottom-up method achieved the higher Sharpe ratio, 1.54. Machine learning: employed CNN prediction models to compress features, then used LightGBM to forecast daily returns, achieving an RMSE of 0.0014 on the test set. Tuned the buy, hold, and sell thresholds with Optuna, obtaining an annual return of 1.7. Developed a Streamlit visualization tool that let the team view each factor's performance and strategy backtesting results, supporting quick decisions and strategy iteration.
Data Analyst Intern
Ipsos
2023 - 2023
Queried weekly sales data for nationwide automotive products from a database and analyzed target user behavior by region, product preference, and word of mouth across both customer and competitor products. Built and regularly maintained a dashboard using SQLite and Tableau. Communicated with client teams to identify requests, developed a Python script to automate the extraction of market research template content from SPSS files, and developed a new template interface for the enterprise data platform, making data import and report generation more flexible. Analyzed weekly product reviews (text data) by building a word cloud model and performing sentiment classification. Extracted keywords and refined the handling of neutral terms, resulting in a 4% improvement in overall classification accuracy. Added the keywords to the platform, improving visibility and customer satisfaction.

Skills

java
linux
python
pytorch
sql
data-science
english
chinese-mandarin