Experience & Skills

Below is a list of the algorithms, packages, and data science and engineering topics I have worked with.


Projects

End-to-end Machine Learning Platform: Design and implementation

Analytics: Churn, sales, engagement, conversion and budget prediction, customer lifetime value (CLV), explainable models via SHAP

Funnel optimisation: Predicting conversion probability of leads within the funnel

Finance: Stock price predictions on streamed data and feature engineering

Natural Language Processing (NLP): Sentiment analysis, semantic search, sequence-to-sequence models

Image classification: ImageNet classification and object detection

Mail optimisation: via multi-armed bandits

Information retrieval from knowledge graphs

Big data risk reporting: for a DAX company, using Spark, Kafka and Scala

Food intolerance detection: Causal effect estimation of food intake on bodily pain

Startup CTO tasks: Building the full tech stack, including backend, frontend, data pipelines and ML models


Models and algorithms

tree models: xgboost, catboost, lightgbm, ngboost, random forest

deep learning: TensorFlow, Keras, CNNs, LSTMs, attention models, TabNet

NLP: gensim, nltk, spacy, topic modelling

Bayesian methods: Bayesian inference, Markov Chain Monte Carlo (MCMC), Bayesian A/B tests

multi-armed bandits for stateless reinforcement learning

custom loss functions: for gradient boosted trees

hyperparameter optimisation: skopt

feature selection

model selection

embeddings
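
A minimal sketch of the multi-armed bandit entry above, assuming Bernoulli rewards (for example mail opened or not opened) and Thompson sampling. Both are illustrative choices, not necessarily the setup used in the projects listed here; the function name `thompson_sampling` and the open rates are hypothetical.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Simulate a Bernoulli bandit and return the number of pulls per arm.

    true_rates are hypothetical success probabilities, one per arm
    (e.g. open rates of mail variants). They are unknown to the
    algorithm and only used to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Sample one plausible success rate per arm from its
        # Beta(1 + successes, 1 + failures) posterior (uniform prior).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate the reward and update that arm's posterior counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls


# Three hypothetical mail variants with different open rates.
pulls = thompson_sampling([0.10, 0.20, 0.40])
print(pulls)
```

The bandit is stateless in the sense that there is no environment state to track: each round only updates the success and failure counts per arm, and traffic shifts towards the best-performing variant as evidence accumulates.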


ML Engineering

Airflow workflow management: Scheduling of batch predictions and ETL processes via Airflow

Real-time predictions: Containerised live predictions on stock data streamed via RabbitMQ

Cloud deployment: of model artifacts and pipelines

TensorFlow Extended (TFX) for deep learning pipelines

ML Metadata: For data and artifact lineage (tracking of their path through the ML pipeline)


Databases and scheduling

databases, warehouses, cloud storage: Snowflake, PostgreSQL, TimescaleDB, Neo4j, Elasticsearch, S3

scheduling: airflow, cron

Streaming: Kafka, RabbitMQ


Deployment

AWS

EC2, Lambda, SageMaker, Batch, Fargate, ECR

Google Cloud

Cloud Functions, App Engine

Others

Heroku

Engineering

Docker, Terraform, Git, Linux shell


MLOps: Monitoring, tracking

plotly for dashboarding

grafana for technical parameter monitoring

slack integration

Evidently for model drift

MLflow for experiment tracking and model registry


Data validation

pandera

great_expectations

pydantic

TensorFlow Data Validation (TFDV)


Data Exploration and visualisation

sweetviz

facets

matplotlib

seaborn

plotly


APIs

Interactive Brokers (stock data streaming and order execution)

Alpha Vantage (stock data)

Emarsys (eMarketing system)


Web development

JavaScript

React, React Native

Node.js