dandibahh

Web3 Data Scientist

AI & Machine Learning Engineer with 4+ years of experience specializing in Web3, DeFi analytics, and predictive modeling. Expert in Python, ML frameworks, LLM fine-tuning, and real-time AI systems. Proven record of developing scalable machine learning solutions that integrate blockchain data, automate crypto asset recommendations, and enhance investor decision-making. Adept at leveraging statistical analysis, LLM observability, and infrastructure optimization for AI-driven systems. Passionate about transforming decentralized data into actionable insights using tools like ChatGPT, FastAPI, Docker, and cutting-edge ML libraries.

Experience: 5 years

Yearly salary: $40,000

Hourly rate: $30

Nationality: 🇳🇬 Nigeria

Residency: 🇳🇬 Nigeria

Experience

Web3 Data Scientist

xFractal

2025 - 2025

Designed and deployed AI-based predictive systems that identify top-performing Web3 tokens using historical on-chain data and ML classification algorithms (Random Forest, XGBoost). Integrated ChatGPT and other LLM tools to automate summarization of token market trends and investor insights. Built scalable ML pipelines using Airflow and FastAPI, and automated data ingestion from MongoDB and on-chain APIs.

Performed statistical analysis and stress testing on model outputs across different crypto market conditions. Developed LLM-based tools for internal research summarization and used Hugging Face transformers for entity extraction in token metadata. Created custom investment recommendation engines that personalized signals based on investor risk profiles. Improved model observability with MLflow and integrated LLM telemetry for performance monitoring.

Engineered and optimized a SARIMAX + LightGBM stacked time series model, achieving 61.5% directional accuracy (p < 1e-22) and 16–19% error reduction (WRMSE/WZPTAE), exceeding Allora benchmark pass criteria. Integrated Optuna hyperparameter optimization (50 trials) for the LightGBM residual learner, improving ZPTAE loss and model robustness under walk-forward validation.

Developed volatility-targeted position sizing and realistic P&L simulations, resulting in a CAGR of about 2.8%, a Sharpe ratio of 0.56, and a contained maximum drawdown of -8.4% across 1,750 test periods. Enhanced computation of portfolio risk metrics (Sharpe, Sortino, Calmar, CAGR, VaR, Expected Shortfall), enabling transparent evaluation of strategies and identifying edge cases such as abnormally high CAGR values. Resolved model training bottlenecks by adapting the training code to LightGBM’s API for early stopping and evaluation sets, streamlining reproducible backtests.
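For illustration only, and not the code used at xFractal: a minimal Python sketch of how the risk metrics named above (Sharpe, Sortino, Calmar, CAGR, VaR, Expected Shortfall) are commonly computed from a series of per-period returns. The function names, the 365-periods-per-year default, and the historical (empirical) VaR convention are all assumptions made for this example.

```python
import math
from statistics import mean, stdev

# Assumption: daily returns on a market that trades every day.
PERIODS_PER_YEAR = 365


def sharpe_ratio(returns, periods=PERIODS_PER_YEAR, risk_free=0.0):
    """Annualized mean excess return divided by its sample volatility."""
    excess = [r - risk_free / periods for r in returns]
    sd = stdev(excess)
    return math.sqrt(periods) * mean(excess) / sd if sd > 0 else float("nan")


def sortino_ratio(returns, periods=PERIODS_PER_YEAR, target=0.0):
    """Like Sharpe, but only returns below the target count as risk."""
    downside = math.sqrt(sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns))
    if downside == 0:
        return float("nan")  # no downside observations in the sample
    return math.sqrt(periods) * (mean(returns) - target) / downside


def max_drawdown(returns):
    """Worst peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def cagr(returns, periods=PERIODS_PER_YEAR):
    """Compound annual growth rate of the equity curve."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    if growth <= 0:
        return -1.0  # total loss; a fractional power would not be real-valued
    years = len(returns) / periods
    return growth ** (1.0 / years) - 1.0


def calmar_ratio(returns, periods=PERIODS_PER_YEAR):
    """CAGR divided by the magnitude of the maximum drawdown."""
    mdd = abs(max_drawdown(returns))
    return cagr(returns, periods) / mdd if mdd > 0 else float("nan")


def historical_var_es(returns, level=0.95):
    """Empirical VaR and Expected Shortfall, both expressed as losses."""
    losses = sorted(-r for r in returns)
    k = min(max(math.ceil(level * len(losses)) - 1, 0), len(losses) - 1)
    var = losses[k]
    es = mean(losses[k:])  # average of losses at or beyond the VaR level
    return var, es
```

Because `cagr` raises total growth to the power `1 / years`, a short sample annualizes aggressively: a handful of good periods can translate into a very large CAGR. A sanity check on sample length before reporting the figure is one way to flag such values.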

Skills

blockchain

data-science

data-viz

machine-learning

python

pytorch

sql

english
