I am a Machine Learning Engineer working for HomeChoice International.

Skillset Summary

Here is a summary of my skillset and the tools that I am proficient with:

(last updated 2022-01-21)

Tool Approximate Hours of Usage/Experience
SQL (Vertica, MySQL, AWS Athena, Google BigQuery, PostgreSQL) 3300
R (incl. tidyverse) 2842
Python (incl. pandas, numpy, scipy, sklearn, datetime) 1360
Microsoft Excel 1100
Supervised Learning & general Machine-Learning theory (transfer learning, classification/regression: neural nets, GBM family, random forest, tree-based models, linear models, elastic-net/penalised linear models, regression splines, local linear models, KNN, bias/variance tradeoff, cross-validation etc.) 621
ML-Ops (models in production) 532
Theory of Experimental Design (null hypothesis testing, Bayesian inference, orthogonal designs, power calculation, unbiased variance reduction techniques …) 501
Recommender Engines (DCN, Wide&Deep, Factorisation Machines, Collaborative Filtering incl. Matrix Factorisation, Content-Based, cold-start recommenders) 376
TensorFlow + Keras 169
RMarkdown & LaTeX 161
Uplift Modelling (model-based estimation of heterogeneous treatment effects) 152
Image-based Models (segmentation & masking, classification, multi-class classification (auto tagging), TensorFlow/Keras, transfer learning) 150
Association Rule Mining 100
Exponential Smoothing Models (time series prediction) 70
Natural Language Processing (NLP): Word & Document Embedding Methods 70
Multi-Armed Contextual Bandit Algorithms 65
Meta-Heuristic Optimisation (genetic algorithms, simulated annealing, TABU search) 60
Web Scraping (BeautifulSoup & Selenium in Python) 53
Clustering algorithms (k-means, k-medoids, hierarchical family, CLARA, DBSCAN) 50
Financial Modelling 50
ARIMA, SARIMA, ARIMAX models (time series prediction) 50
H2O Machine Learning Framework (R library) 40
Non-Linear Dimension Reduction (t-SNE, UMAP, Isomap, Locally-Linear Embedding) 36
R Shiny 30
Multi-Variable Analysis (PCA, Factor Analysis, SVD Bi-Plots, Canonical Correlation Analysis) 30
Causal Inference: Bayesian Networks and Do-Calculus 25
HTML, CSS and JavaScript 20
Git 16
Linear Optimisation 15
State Space Models (time series prediction) 12
General Reinforcement Learning Theory 10
Multi-Objective Optimisation 10
Copulas (Financial Modelling) 5
ARCH/GARCH Models (time series variance prediction/inference) 2
TBATS (time series prediction) 2
…this list tbc 0

Thanks

Non-Statistical Interests

  • Piano

  • Jazz

  • Origami

  • Calisthenics

  • Rubik’s Cube

  • Football

  • Other hobbies that I have accepted I will only have time for again after I retire:

    • juggling
    • drawing
    • unicycling
    • writing
    • music composition