🤖 Machine Learning

Updated at 2024-01-04 07:20

Machine learning tasks can be grouped into three categories:

  1. Supervised learning (SL): The dataset contains both inputs and the desired outputs.
  2. Unsupervised learning (UL): The dataset contains only inputs, and the program must discover structure in them itself.
  3. Reinforcement learning (RL): The program interacts with a dynamic environment and tries to attain a certain goal. Explicit input/output pairs are not presented.

Semi-supervised learning overlaps the first two categories: most of the training data is unlabeled, with only a small proportion of labeled data.

Supervised learning predicts; unsupervised learning transforms. Predicting means returning the label, or the probabilities of the different labels, for a given sample. Transforming means turning a sample into a different representation, a likelihood or a cluster identifier.
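The predict/transform distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `MidpointClassifier` and `Standardizer`, the `fit`/`predict`/`transform` method names, and the toy height data are all made up for this example and do not come from any particular library.

```python
from statistics import mean, pstdev


class MidpointClassifier:
    """Supervised: fit() needs inputs AND labels; predict() returns a label."""

    def fit(self, xs, ys):
        # Mean input value of each of the two classes (labels 0 and 1).
        mean0 = mean(x for x, y in zip(xs, ys) if y == 0)
        mean1 = mean(x for x, y in zip(xs, ys) if y == 1)
        # Decision threshold halfway between the two class means.
        self.threshold = (mean0 + mean1) / 2
        self.high_label = 1 if mean1 > mean0 else 0
        return self

    def predict(self, x):
        if x > self.threshold:
            return self.high_label
        return 1 - self.high_label


class Standardizer:
    """Unsupervised: fit() sees inputs only; transform() returns a new representation."""

    def fit(self, xs):
        self.mean = mean(xs)
        self.std = pstdev(xs)
        return self

    def transform(self, x):
        # Express the sample as a number of standard deviations from the mean.
        return (x - self.mean) / self.std


# Toy, made-up data: heights in cm and a binary label.
heights_cm = [150, 155, 160, 180, 185, 190]
labels = [0, 0, 0, 1, 1, 1]

clf = MidpointClassifier().fit(heights_cm, labels)
scaler = Standardizer().fit(heights_cm)

predicted_label = clf.predict(175)   # a label for an unseen sample
z_score = scaler.transform(170)      # the same kind of sample, re-represented
```

The only structural difference is what `fit` receives: the supervised model needs the labels, while the unsupervised one works from the inputs alone.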

Machine learning problems can also be categorized by the type of output:

  • Classification: the output is discrete, e.g. the gender of a person.
    • Normal classification tells whether an object is present in the input.
    • Object detection locates and classifies one or more objects within an input.
    • Semantic segmentation classifies every element of an input, e.g. each pixel of an image.
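  • Regression: the output is a continuous value, e.g. the age of a person. Classification can be framed as regression by predicting a score or probability per class, whereas turning regression into classification requires binning the output, which loses precision.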
  • Clustering: a set of inputs is to be divided into groups, but the groups are not known beforehand like in classification.
  • Density Estimation: finding the distribution of inputs in some space.
  • Dimensionality Reduction: the goal is to reduce the number of features per sample while keeping the relevant information.

Supervised Learning

Supervised learning means that the dataset consists of both features and labels. The goal is to create a program that predicts the label of an unlabeled sample.

For example, we have a dataset of images of people, and each image is annotated with that person's gender and age. We use this data to train a program to predict a person's gender and age from a new image.

Label quality is one of the most important factors in any supervised learning approach. A model can only be as good as the labels it is trained on: accurate, consistent labels make good predictions possible, while noisy labels limit what the model can learn.
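As a minimal sketch of supervised prediction, the snippet below uses a 1-nearest-neighbor rule, chosen here only because it is short; it is one of many possible supervised methods. The function name, the feature pairs and the labels are all made-up assumptions. The second call shows how a single corrupted label changes predictions near that sample.

```python
import math


def nearest_neighbor_predict(train_features, train_labels, sample):
    """Return the label of the training sample closest to `sample`."""
    distances = [math.dist(features, sample) for features in train_features]
    nearest_index = distances.index(min(distances))
    return train_labels[nearest_index]


# Toy, made-up dataset: (height_cm, weight_kg) -> label.
train_features = [(150, 45), (160, 55), (180, 80), (190, 95)]
train_labels = ["small", "small", "large", "large"]

# Predict the label of an unlabeled sample.
prediction = nearest_neighbor_predict(train_features, train_labels, (185, 85))

# Same features, but the third label has been corrupted on purpose.
noisy_labels = ["small", "small", "small", "large"]
clean_prediction = nearest_neighbor_predict(train_features, train_labels, (182, 82))
noisy_prediction = nearest_neighbor_predict(train_features, noisy_labels, (182, 82))
```

With the clean labels the sample at (182, 82) is predicted as "large"; with the corrupted label it becomes "small", although the features have not changed.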

Unsupervised Learning

Unsupervised learning means that the dataset only has features. The program has to figure out how to label the samples itself; that is to say, it needs to figure out how the samples differ from each other. Unsupervised learning is used when you have a lot of data but don't know how to create proper labels for it.

Given an audio clip with a person talking over music, separate the two tracks.

Given a video, isolate a moving object and categorize it in relation to other moving objects that have been seen so far.
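The examples above need far more machinery than fits here, but the core idea of finding structure without labels can be sketched with a simple clustering routine. The snippet below is a bare-bones one-dimensional k-means (Lloyd's algorithm); the function name, the initial centers and the data values are assumptions made for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster 1-D values around the given initial centers (Lloyd's algorithm)."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    # Final cluster identifier for every input value.
    assignments = [
        min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        for v in values
    ]
    return centers, assignments


# Toy, made-up inputs with no labels attached.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers, assignments = k_means_1d(values, centers=[0.0, 5.0])
```

No labels are passed in; the cluster identifiers in `assignments` are produced by the algorithm itself, which is the "transform" role described earlier.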

Reinforcement Learning

Reinforcement learning is inspired by behaviorist psychology. It is concerned with how a software agent ought to take actions in an environment to maximize reward or minimize punishment. Reinforcement learning is typically used when data is scarce or when you can't clearly define the ideal end state.

  1. An agent performs actions in an environment.
  2. The agent is in a state, and each action moves it from that state to a new one.
  3. Actions may result in a reward or punishment, which the environment gives to the agent.
  4. An episode is represented as a sequence of states, actions and rewards/punishments.
  5. The agent learns a policy: how to act when it is in a certain state.
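The loop described by these steps can be sketched as follows. The `Corridor` environment, its reward values and the `reset`/`step` method names are hypothetical choices for this example (loosely modeled on common RL library conventions), and the policy is fixed rather than learned.

```python
class Corridor:
    """Hypothetical environment: positions 0..4, start at 0, goal at 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """Apply action (-1 = left, +1 = right); return (state, reward, done)."""
        self.state = max(0, min(self.length - 1, self.state + action))
        done = self.state == self.length - 1
        # Small punishment for every step, reward for reaching the goal.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Collect one sequence of (state, action, reward) tuples."""
    episode = []
    state = env.reset()
    for _ in range(max_steps):
        action = policy(state)                       # the policy maps state -> action
        next_state, reward, done = env.step(action)  # the environment responds
        episode.append((state, action, reward))
        state = next_state
        if done:
            break
    return episode


def always_right(state):
    """A fixed, hand-written policy: always move right."""
    return +1


episode = run_episode(Corridor(), always_right)
total_reward = sum(reward for _, _, reward in episode)
```

A learning algorithm would replace `always_right` with a policy that is adjusted based on the rewards collected in sequences like `episode`.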

RL algorithms are either model-based or model-free:

  • Model-based algorithms try to learn or use a model of how the environment works. Model-based methods are hard to use when the possible state space is large, e.g. in Go. Dynamic programming is an example of a model-based algorithm.
  • Model-free algorithms are policy-based, value-based or hybrid.
    • Policy-based methods try to find the optimal policy directly, i.e. how to act given a certain state. Policy gradient methods such as REINFORCE are examples of policy-based algorithms.
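    • Value-based methods try to find the optimal value function, i.e. the expected return of each state or state-action pair, and derive the policy from it. Q-learning and SARSA are examples of value-based algorithms. Value iteration is also value-based, but it requires a model of the environment, so it belongs with dynamic programming rather than with model-free methods.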
    • Hybrid model-free methods try to optimize both the policy and the value function. Actor-critic methods are an example.
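As a concrete instance of a value-based, model-free method, here is a minimal tabular Q-learning sketch. The five-state corridor, its rewards, and the values of `ALPHA`, `GAMMA` and the episode count are assumptions picked to keep the example small. The agent never reads the rules inside `step`; it only observes the returned state and reward.

```python
import random

N_STATES = 5           # corridor positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)     # move left or right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Hypothetical environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, min(N_STATES - 1, state + action))
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else -0.1), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Q-table: q[state][action_index], initialised to zero.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with a uniformly random behaviour policy. Q-learning is
            # off-policy, so it still estimates the greedy policy's values.
            a = rng.randrange(len(ACTIONS))
            next_state, reward, done = step(state, ACTIONS[a])
            # Terminal states have no future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: the highest-valued action in every non-terminal state.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

With this reward structure the greedy policy should settle on moving right in every state. A policy-based method would instead adjust the action probabilities directly, without going through a table of values.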