Multi Label Classification

(What is it?) Multilabel classification is a classification problem in which each observation can be assigned several target labels at once, rather than exactly one as in multiclass classification.

Two different approaches exist for multilabel classification:

  • Problem transformation methods try to transform the multilabel classification into binary or multiclass classification problems.

  • Algorithm adaptation methods adapt multiclass algorithms so they can be applied directly to the problem.

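In other words, the two approaches are:

  • Use a classifier that supports multilabel targets natively

  • Use any classifier inside a wrapper that breaks the multilabel problem into binary or multiclass subproblems (for example, one binary classifier per label, or pairwise comparison of labels)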
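The two options above can be sketched with scikit-learn. This is only an illustration on synthetic data, not a recommendation of specific models: `KNeighborsClassifier` stands in for a classifier that accepts a label indicator matrix directly, and `OneVsRestClassifier` around `LogisticRegression` stands in for the wrapper approach (here, one binary classifier per label). The dataset sizes and hyperparameters are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data: Y is a binary indicator matrix of shape (n_samples, n_labels).
X, Y = make_multilabel_classification(
    n_samples=200, n_features=10, n_classes=4, random_state=0
)

# Approach 1: a classifier that accepts the indicator matrix directly.
native = KNeighborsClassifier(n_neighbors=5).fit(X, Y)

# Approach 2: a wrapper that fits one binary classifier per label.
wrapped = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)

# Both return one row per sample and one 0/1 column per label.
native_pred = native.predict(X)
wrapped_pred = wrapped.predict(X)
```

Either way, predictions come back in the same indicator-matrix layout as `Y`, so the same multilabel metrics can be applied to both.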

A great PDF that explains multilabel classification, and especially its metrics; part 2 here.

An excellent paper that explains all of these methods in detail, also available here.

PT1: for each sample, keep one of its labels and discard all the others.

PT2: remove every sample that has more than one label.

PT3: for every combination of labels that occurs, create a single new label, e.g. A&B, A&C, etc.

PT4: (most common) create L datasets, one per label, and learn a binary classifier for each, i.e. whether that label is present or not.

PT5: duplicate each sample once per label, so that each copy carries only one of its labels.

PT6: read the paper
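As a rough illustration of how PT3, PT4 and PT5 reshape a dataset, here is a minimal pure-Python sketch. The toy samples and the function names (`pt3_label_powerset`, `pt4_binary_relevance`, `pt5_duplicate`) are made up for this example and do not come from the paper or from any library.

```python
# Toy multilabel dataset: each sample is (features, set of labels).
samples = [
    ("x1", {"A", "B"}),
    ("x2", {"C"}),
    ("x3", {"A", "C"}),
    ("x4", {"B"}),
]
all_labels = sorted(set().union(*(labels for _, labels in samples)))


def pt3_label_powerset(data):
    """PT3: treat every distinct label combination as one single class."""
    return [(x, "&".join(sorted(labels))) for x, labels in data]


def pt4_binary_relevance(data, label_names):
    """PT4: build one binary dataset per label (label present or not)."""
    return {
        name: [(x, int(name in labels)) for x, labels in data]
        for name in label_names
    }


def pt5_duplicate(data):
    """PT5: copy each sample once per label it carries."""
    return [(x, label) for x, labels in data for label in sorted(labels)]


powerset = pt3_label_powerset(samples)
per_label = pt4_binary_relevance(samples, all_labels)
duplicated = pt5_duplicate(samples)
```

After the transformation, PT3 and PT5 yield ordinary single-label datasets, while PT4 yields one binary dataset per label, so any standard classifier can be trained on the result.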

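Other approaches handle multiple labels inside the algorithm itself; they rely on the ideas behind PT3, PT4, PT5 and PT6 implemented within the algorithm, or on other tricks.

The paper also introduces label cardinality (the average number of labels per sample) and label density (the cardinality divided by the total number of labels).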
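Both quantities are simple to compute. The sketch below assumes the usual definitions (cardinality is the mean number of labels per sample, density is cardinality divided by the number of distinct labels); the function names and the toy label sets are illustrative only.

```python
def label_cardinality(label_sets):
    """Average number of labels per sample."""
    return sum(len(labels) for labels in label_sets) / len(label_sets)


def label_density(label_sets, n_labels):
    """Label cardinality normalised by the total number of labels."""
    return label_cardinality(label_sets) / n_labels


# Four samples over the label space {A, B, C}.
label_sets = [{"A", "B"}, {"C"}, {"A", "C"}, {"B"}]

cardinality = label_cardinality(label_sets)       # 6 labels / 4 samples = 1.5
density = label_density(label_sets, n_labels=3)   # 1.5 / 3 = 0.5
```

Two datasets can share the same cardinality and still differ a lot in density, which is why it helps to report both.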

EfficientNet, part 2 - EfficientNet is based on a baseline network derived from a neural architecture search; a novel compound scaling method is then applied to build progressively larger networks, which achieve state-of-the-art accuracy on multiclass classification tasks. Compound scaling refers to increasing all three network dimensions (depth, width and input resolution) together using a novel strategy.

Multilabel confusion matrices with sklearn
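For the sklearn confusion matrices, a minimal sketch using `sklearn.metrics.multilabel_confusion_matrix` is shown here. The `y_true` and `y_pred` arrays are invented toy data; the function returns one 2x2 matrix per label, laid out as `[[TN, FP], [FN, TP]]`.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Rows are samples, columns are labels (binary indicator format).
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 1]])

# Shape is (n_labels, 2, 2); mcm[i] is the confusion matrix of label i.
mcm = multilabel_confusion_matrix(y_true, y_pred)

for label_index, matrix in enumerate(mcm):
    tn, fp, fn, tp = matrix.ravel()
    print(f"label {label_index}: TN={tn} FP={fp} FN={fn} TP={tp}")
```

Each label is scored as its own binary problem, so per-label precision and recall can be read straight off the corresponding 2x2 matrix.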

Scikit multilearn package
