mlr3 and OpenML - Moneyball use case

This use case shows how to make use of OpenML data and how to impute missing values in an ML problem.

A pipeline for the titanic data set - Advanced

This post shows how to build a Graph using the mlr3pipelines package on the "titanic" dataset. Moreover, feature engineering, data imputation and benchmarking are covered.

Tuning a stacked learner

This tutorial explains how to create and tune a multilevel stacking model using the mlr3pipelines package.

Pipelines, selectors, branches

This tutorial explains how applying different preprocessing steps on different features and branching of preprocessing steps can be achieved using the mlr3pipelines package.

Imbalanced data handling with mlr3

This use case compares different approaches to handling class imbalance on the optdigits binary classification data set using the mlr3 package.

Resampling: stratified, blocked and predefined

When evaluating machine learning algorithms through resampling, it is preferable that each train/test partition is a representative subset of the whole data set. This post covers three ways to achieve such reliable resampling procedures.
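The stratification idea behind such representative partitions can be sketched in base R — sampling a fixed fraction of each class for the test set so that class proportions are preserved (a minimal illustration with made-up data; in mlr3 itself this is handled via the `stratum` column role):

```r
# Minimal base-R sketch of stratified train/test splitting.
# Illustrative only; mlr3 provides this via the `stratum` column role.
set.seed(1)
y <- factor(rep(c("a", "b"), times = c(80, 20)))  # imbalanced target

# sample 25% of each class's indices for the test set
test_idx <- unlist(lapply(split(seq_along(y), y), function(idx) {
  sample(idx, size = round(0.25 * length(idx)))
}))
train_idx <- setdiff(seq_along(y), test_idx)

# class proportions are preserved in both partitions
prop.table(table(y[test_idx]))
prop.table(table(y[train_idx]))
```

Because a fixed fraction is drawn per class rather than from the pooled indices, even the minority class is guaranteed to appear in the test set.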

mlr3 basics on "iris" - Hello World!

Basic ML operations on iris: Train, predict, score, resample and benchmark. A simple, hands-on intro to mlr3.

A pipeline for the titanic data set - Basics

This post shows how to build a Graph using the mlr3pipelines package on the "titanic" dataset.

mlr3 basics - german credit

mlr3 is a machine learning framework for R. Together with other packages from the same developers, mostly following the naming scheme "mlr3___", it offers functionality around developing, tuning, and evaluating machine learning workflows.

mlr3pipelines tutorial - german credit

In this tutorial we will continue working with the German Credit Dataset. We already used different Learners on it in previous posts and tried to optimize their hyperparameters. To make things interesting, we introduce missing values into the dataset.

mlr3tuning tutorial - german credit

We evaluate all algorithms using 10-fold cross-validation with fixed train-test splits, i.e. the same splits for each evaluation. Otherwise, some evaluations could get unusually "hard" splits, which would make comparisons unfair.

Select uncorrelated features

The following example describes a situation where we aim to remove correlated features. In essence, this means that we drop features until no two features have a correlation higher than a given `cutoff`. This is often useful when, for example, we want to use linear models.

Tuning Over Multiple Learners

This use case shows how to tune over multiple learners for a single task.

Impute missing variables

We show how to use mlr3pipelines to augment the "mlr_learners_classif.ranger" learner with automatic imputation.
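The underlying operation — replacing missing values with, say, a column's median — can be sketched in base R with a hypothetical `impute_median` helper (the post itself builds this as a pipeline step in front of the ranger learner, trained on the training split only):

```r
# Sketch of median imputation in base R; mlr3pipelines wraps the same
# idea as a pipeline operator fitted on the training data.
impute_median <- function(df) {
  medians <- vapply(df, median, numeric(1), na.rm = TRUE)
  for (col in names(df)) {
    df[[col]][is.na(df[[col]])] <- medians[[col]]
  }
  df
}

d <- data.frame(a = c(1, NA, 3), b = c(NA, 2, 4))
impute_median(d)  # NAs replaced by the column medians 2 and 3
```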

Encode factor levels for xgboost

The package "xgboost" unfortunately does not support categorical features, so factor columns must first be converted to numerical dummy features. We show how to use "mlr3pipelines" to augment the "mlr_learners_classif.xgboost" learner with automatic factor encoding.
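What such an encoding step produces can be sketched with base R's `model.matrix()` on a small made-up factor column — one 0/1 indicator column per factor level:

```r
# Sketch: converting a factor to numerical dummy columns with base R.
# In the pipeline, an encoding operator performs this step before xgboost.
d <- data.frame(color = factor(c("red", "green", "blue")))

# "~ 0 + color" yields one 0/1 indicator column per factor level
# (no intercept, so no level is dropped)
model.matrix(~ 0 + color, data = d)
```

With the intercept removed, every level gets its own column; with an intercept, one level would be absorbed into it as the reference category.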

House prices in King County

Use case illustrating data preprocessing and model fitting via mlr3 on the "King County House Prices" dataset.
