Active Learning

What is Active Learning?

Active learning is a low-supervision approach to machine learning in which the learning algorithm actively seeks out the most valuable data points in a data set: the ones that offer it the greatest opportunity to learn. When it is uncertain about a data point, the algorithm can also query a human data scientist to label it.

The process is deceptively simple; a minimal code sketch follows the steps below:

  1. First, an engineer provides the active learning model with a large data set that is unlabeled save for a small portion of data points.
  2. The active learning model assesses the labeled data points, then uses that information to make predictions about the remaining data points in the set.
  3. For each prediction it makes, the model will assign a confidence score to indicate how certain it is in its assessment.
  4. Once the model has analyzed the data set, it submits the data points with the lowest confidence scores to the engineer, who applies the correct label to each one.
  5. After viewing the labels applied by the engineer, the model uses its newfound understanding to re-label other data points in the set.
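
Here is a minimal sketch of that loop in Python with scikit-learn. The synthetic data set, the logistic regression classifier, the query size of ten points per round and the ten rounds are all illustrative assumptions, and the ground-truth labels stand in for the engineer's answers.

    # Minimal pool-based active learning loop with uncertainty sampling.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Step 1: a large pool that is unlabeled except for a small seed set.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    rng = np.random.default_rng(0)
    labeled = rng.choice(len(X), size=20, replace=False)
    unlabeled = np.setdiff1d(np.arange(len(X)), labeled)

    model = LogisticRegression(max_iter=1000)

    for _ in range(10):                                 # each pass covers steps 2-5
        model.fit(X[labeled], y[labeled])               # step 2: learn from current labels
        proba = model.predict_proba(X[unlabeled])       # step 2: predict the rest of the pool
        confidence = proba.max(axis=1)                  # step 3: confidence per prediction
        query = unlabeled[np.argsort(confidence)[:10]]  # step 4: least confident points
        # Step 5: a human would supply these labels; y stands in for the engineer here.
        labeled = np.concatenate([labeled, query])
        unlabeled = np.setdiff1d(unlabeled, query)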

Just as deep learning and neural networks mimic human brain patterns, active learning mimics how we learn and engage with new information. We start with the basics, then practically apply that knowledge to solve related problems. If we encounter a question or problem that leaves us stumped, we ask the instructor for help.

What is the Role of Active Learning in Machine Learning?

The majority of training for machine learning models is passive. A model is provided with a data set, then more or less left to its own devices until the validation stage. Active learning is an alternative to this passivity.

Instead of training itself by laboriously analyzing data points it already understands, an active learning algorithm can focus on the ones it doesn’t. There are several advantages to this approach:

  • Considerably reduced labeling effort. Rather than requiring labels for massive quantities of data, the model selects only the samples that can potentially teach it the most, so far fewer data points need to be labeled by hand.
  • Improved performance. Active learning not only streamlines training, but can also lead to models that are more accurate, more reliable and better able to generalize.
  • Scalable training. Because it’s relatively low-effort and low-intensity, active learning scales incredibly well compared to other types of machine learning.
  • Better utilization of data sets. Active learning can generally achieve the same results with fewer labeled examples, making it well-suited for scenarios where labeled data is scarce (a rough comparison is sketched after this list).
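
Whether those savings materialize depends on the task, but the difference is easy to probe. The sketch below compares random selection with uncertainty-based selection at a few labeling budgets on a synthetic problem; the data set, classifier and budgets are arbitrary assumptions, not benchmark results.

    # Rough comparison of labeling effort: random vs. uncertainty-based selection.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=3000, n_features=20, random_state=1)
    X_pool, y_pool, X_test, y_test = X[:2000], y[:2000], X[2000:], y[2000:]

    def accuracy_with_budget(budget, active):
        rng = np.random.default_rng(1)
        labeled = list(rng.choice(len(X_pool), size=20, replace=False))
        model = LogisticRegression(max_iter=1000)
        while len(labeled) < budget:
            model.fit(X_pool[labeled], y_pool[labeled])
            remaining = np.setdiff1d(np.arange(len(X_pool)), labeled)
            if active:   # query the points the model is least confident about
                conf = model.predict_proba(X_pool[remaining]).max(axis=1)
                picks = remaining[np.argsort(conf)[:10]]
            else:        # query points at random
                picks = rng.choice(remaining, size=10, replace=False)
            labeled.extend(picks)
        model.fit(X_pool[labeled], y_pool[labeled])
        return model.score(X_test, y_test)

    for budget in (50, 100, 200):
        print(budget,
              round(accuracy_with_budget(budget, active=False), 3),
              round(accuracy_with_budget(budget, active=True), 3))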

Active Learning vs. Reinforcement Learning

Both active learning and reinforcement learning allow a model to train itself through a sort of trial-and-error process. However, that’s where the similarities between the two end. While active learning typically involves the intervention of a human data scientist or engineer, reinforcement learning leaves the model to its own devices.

Reinforcement learning can also be either passive or active, depending on how much guidance is provided to the model and whether or not there’s a fixed policy on which it can act.

Types of Active Learning

There are three core categories of active learning:

  • Stream-based selective sampling. Instead of a fixed data set, the model is presented with an ongoing stream of unlabeled data points, which it assesses one at a time. For each point, it decides on the spot whether to query the user for a label or let the point pass by (see the sketch after this list).
  • Pool-based sampling. The model is given a large data set and expected to assess it in its entirety before querying the user about the labels it's least certain of. Generally, the algorithm is trained on a small set of labeled data before being directed to the larger pool; this is the setting illustrated in the loop sketched earlier.
  • Query synthesis. Query synthesis is a somewhat complex approach to active learning in which the model generates its own data points to query. It is often implemented with a generative adversarial network, a neural network architecture consisting of a generator that creates instances and a discriminator that assesses them.
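
As a concrete illustration of the stream-based variant, the sketch below looks at each incoming point once and queries the annotator only when the model's confidence drops below a threshold. The confidence threshold, the SGD classifier and the simulated stream are illustrative assumptions, and the ground-truth labels again stand in for the human annotator.

    # Stream-based selective sampling: decide point by point whether to ask for a label.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    X, y = make_classification(n_samples=5000, n_features=20, random_state=2)
    model = SGDClassifier(loss="log_loss", random_state=2)
    model.partial_fit(X[:20], y[:20], classes=np.unique(y))  # small labeled seed set

    THRESHOLD = 0.7   # query whenever the model is less confident than this
    queries = 0
    for x_i, y_i in zip(X[20:], y[20:]):                  # the incoming stream
        confidence = model.predict_proba([x_i]).max()
        if confidence < THRESHOLD:                        # uncertain: ask the annotator
            model.partial_fit([x_i], [y_i])               # y_i stands in for the human label
            queries += 1
        # confident predictions pass by without a label request
    print(f"queried {queries} of {len(X) - 20} streamed points")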