Hi there, my name is Sebastian Klipp and I am a Berlin-based Data Scientist and Machine Learning Engineer.
Besides my current (part-time) job at Babbel, I dedicate the rest of my professional time to personal and freelance projects. If you want to work together, feel free to reach out to me!
After graduating with a Master’s degree in physics with a focus on computational modelling and complex system simulations, I worked in the overlapping fields of Data Science, Data Engineering and Machine Learning Engineering at the Deloitte Analytics Institute, at Babbel, and at a startup project that I co-founded and led as CTO.
If you want to know more, take a peek at my LinkedIn profile and check out my GitHub.
My expertise lies in the following fields:
End-to-end Machine Learning and AutoML
I enjoy turning ideas into real-life production systems, covering all aspects of modelling and data pipeline implementation (see the short sketch after this list):
- Project scoping with the team and stakeholders
- Architecture design
- Data collection from various sources
- Feature Engineering
- Model selection, training, optimisation and evaluation
- Deployment and automation
- ML monitoring
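To give a rough idea of what the modelling core of such a pipeline can look like, here is a minimal sketch using scikit-learn; the synthetic dataset and the hyperparameter grid are placeholders rather than values from a real project:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Synthetic data as a stand-in for the data collection step.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Feature engineering and model wrapped into a single pipeline object.
pipeline = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("model", LogisticRegression(max_iter=1_000)),
    ]
)

# Model selection and optimisation via a small hyperparameter search.
search = GridSearchCV(
    pipeline,
    param_grid={"model__C": [0.1, 1.0, 10.0]},  # placeholder grid
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)

# Evaluation on held-out data before deployment.
print(classification_report(y_test, search.predict(X_test)))
```

In a real project, the same pipeline object then feeds into the deployment and monitoring steps listed above.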
The landscape of automated Machine Learning advances quickly, with tools like H2O, Google’s AutoML or DataRobot emerging. These tools provide powerful utilities to kickstart an ML project: you can quickly start evaluating in production, learn from the shortcomings and iterate over the whole ML cycle to improve business goals. I recommend using AutoML as a starting point and then successively replacing parts of the pipeline with self-managed components wherever that provides cost or customisation benefits. Tools like AWS SageMaker in particular let you add custom components to the ML pipeline while automating the remaining parts.
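As an illustration of that kickstart step, a small sketch of what an H2O AutoML baseline might look like; the data path and target column below are placeholders:

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()

# Placeholder path and target column, not from a real project.
train = h2o.import_file("s3://my-bucket/training-data.csv")
target = "churned"
train[target] = train[target].asfactor()  # treat the target as categorical
features = [col for col in train.columns if col != target]

# Let AutoML try a range of models within a fixed time budget.
aml = H2OAutoML(max_runtime_secs=600, seed=42)
aml.train(x=features, y=target, training_frame=train)

# The leaderboard is the baseline against which self-managed components
# (custom feature pipelines, hand-tuned models) can later be compared.
print(aml.leaderboard.head())
```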
Data Products & Data Analytics
I help you design your product’s Machine Learning features and analytics projects, build business cases, and communicate between the product, engineering and data science teams. To make a data project a success, one needs to understand both areas: the algorithmic ML / data world and the business / product world. It is crucial to translate business objectives (like ‘revenue increase’) into optimisable algorithmic metrics (like ‘precision’ or ‘RMSE’) and, vice versa, to communicate model and analysis results back to business stakeholders.
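As a minimal illustration of that translation layer, here is how the algorithmic side of such objectives is typically measured; the labels and predictions below are toy values:

```python
import numpy as np
from sklearn.metrics import precision_score, mean_squared_error

# Classification case: "only contact customers who are actually likely to
# churn" translates into a precision target.
y_true_cls = [1, 0, 1, 1, 0]
y_pred_cls = [1, 0, 0, 1, 1]
print("precision:", precision_score(y_true_cls, y_pred_cls))

# Regression case: "forecast revenue accurately" translates into an RMSE target.
y_true_reg = [100.0, 250.0, 80.0]
y_pred_reg = [110.0, 240.0, 95.0]
print("rmse:", np.sqrt(mean_squared_error(y_true_reg, y_pred_reg)))
```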
Machine Learning Engineering & Data Engineering
Research and model optimisation is great, but the biggest reward comes from seeing a project actually drive the business forward, which means it needs to be productionised. For this purpose, it is important to understand the technical requirements of ML models running in a production environment, such as latency, parallelism, CI/CD and code consistency. In addition, a suitable API framework and cloud service need to be selected for ML or app deployment. For data pipelines, task schedulers like Airflow need to be set up, and Big Data processing frameworks like Spark may need to be leveraged.
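As a sketch of the scheduling side, a minimal Airflow DAG that retrains and deploys a model on a weekly schedule could look roughly like this; the DAG id and the task bodies are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_features():
    ...  # pull data from the warehouse (placeholder)


def train_model():
    ...  # fit and evaluate the model (placeholder)


def deploy_model():
    ...  # push the model artefact to the serving environment (placeholder)


with DAG(
    dag_id="ml_retraining_pipeline",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_features", python_callable=extract_features)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    deploy = PythonOperator(task_id="deploy_model", python_callable=deploy_model)

    # Run the steps strictly in order: extract, then train, then deploy.
    extract >> train >> deploy
```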