Machine Learning

tito-kimbo edited this page Sep 10, 2018 · 8 revisions

What is Machine Learning?

Machine Learning (ML) is a field of Computer Science that enables machines to make predictions about new data based on patterns learned from previously observed data. BMLF offers access to three basic ML tasks: Classification, Regression, and Clustering. In addition, it offers data reduction capabilities for improved efficiency.

Classification

Classification is the problem of identifying which category, out of a given set, a new observation belongs to. BMLF offers several classification models, listed below, each of which performs best in specific use cases.

  1. SVM: General purpose; very effective with appropriate parameter tuning.
  2. Multi-Layer Perceptron (MLP): May outperform the other algorithms on sufficiently large datasets. It is sensitive to feature scaling and data reduction.
  3. Gaussian Naive-Bayes (Gaussian NB): Intended for use on continuous (real-valued) data.
  4. Multinomial Naive-Bayes (Multinomial NB): Intended for use when the dataset consists of counts.
  5. Bernoulli Naive-Bayes (Bernoulli NB): Intended for use when it matters whether an event has occurred or not (for example, on boolean data).
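BMLF's own API is not documented on this page, so the minimal sketch below illustrates the classification workflow with scikit-learn directly, which provides the same Gaussian Naive-Bayes model (the assumption being that BMLF's models wrap estimators of this kind):

```python
# Sketch: training a Gaussian Naive-Bayes classifier on continuous data.
# Uses scikit-learn directly; BMLF's wrapper API may differ.
from sklearn.naive_bayes import GaussianNB

# Two well-separated classes of real-valued points.
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]]
y = [0, 0, 1, 1]

clf = GaussianNB().fit(X, y)
pred = clf.predict([[0.05, 0.1], [5.1, 5.0]])
print(list(pred))  # → [0, 1]
```

The same fit/predict pattern applies to the other classifiers listed above; only the model class changes.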

Regression

Regression or regression analysis is a set of statistical processes for estimating the relationships between variables. BMLF also offers several regression models, which are listed below with their intended use cases.

  1. Linear: Intended for use when variables are linearly correlated.
  2. Elasticnet: Linear regression with combined L1 and L2 regularization. It is usually preferred over Ridge or Lasso regression.
  3. ElasticnetCV: Same as regular elasticnet but applies cross-validation to optimize the model.
  4. Bayes Ridge: Bayesian estimation applied to Ridge regression. This model may be used when some of the features are irrelevant, when features are not highly correlated, and even to perform feature selection.
  5. OrthogonalMatchingPursuit: Regularized regression used as an alternative to Lasso or Elasticnet. It is somewhat sensitive to input data.
  6. OrthogonalMatchingPursuitCV: Same as regular OMP but applies cross-validation to optimize the model.
  7. Theil: Based on the Theil-Sen estimator, it is very robust against outliers, but computationally expensive, so it is recommended only for small datasets.
  8. SGD: Efficient and powerful regression method. It requires careful hyperparameter tuning and is quite sensitive to feature scaling.
  9. Perceptron: Simple model for large-scale learning. It is not regularized, updates the model only on mistakes, and has relatively fast training times.
  10. Passive-aggressive: Intended for use in large-scale learning with regularization.
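As an illustration of the regression models above (again via scikit-learn rather than BMLF's own wrapper, which is an assumption), the cross-validated elastic net fits a regularized linear model and selects its regularization strength automatically:

```python
# Sketch: elastic-net regression with cross-validated regularization.
# Uses scikit-learn directly; BMLF's wrapper API may differ.
import numpy as np
from sklearn.linear_model import ElasticNetCV

# Noiseless data following y = 3x + 1.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 1.0

model = ElasticNetCV(cv=5).fit(X, y)
# The recovered slope is close to 3 (slightly shrunk by the regularization).
print(round(float(model.coef_[0]), 2))
```

Swapping in `LinearRegression`, `BayesianRidge`, or `OrthogonalMatchingPursuitCV` follows the same pattern.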

Clustering

Clustering or cluster analysis is the problem of grouping items into sets such that items in the same set are more similar (in some sense) to each other than to items in other sets. BMLF offers basic clustering functionality.

  1. KMeans: General purpose clustering algorithm.
  2. Affinity: Intended for use when there are few samples but many clusters of uneven size.
  3. Mean Shift: Same use cases as Affinity clustering.
  4. Agglomerative: Intended for use with many samples and many clusters.
  5. DBSCAN: Intended for use with very large sample sizes and a medium number of clusters of varying size.
  6. Birch: Intended for use with large datasets, mainly for outlier removal or data reduction purposes.
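A minimal clustering sketch, again using scikit-learn's KMeans directly (BMLF's wrapper API may differ):

```python
# Sketch: KMeans grouping four points into two clusters.
# Uses scikit-learn directly; BMLF's wrapper API may differ.
import numpy as np
from sklearn.cluster import KMeans

# Two obvious groups of points.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
# Points within the same group share a label.
print(labels[0] == labels[1], labels[2] == labels[3])  # → True True
```

Note that, unlike classification, clustering is unsupervised: no target labels are passed to `fit`.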

Reduction

Data reduction consists of a series of techniques that allow a user to reduce the amount of data and/or features available, making ML models more efficient while minimizing information loss. Currently, BMLF only supports Principal Component Analysis (PCA), which reduces the number of features by transforming linearly correlated data into a set of linearly uncorrelated variables. The supported PCA models are:

  1. Automatic PCA: Applies linear dimensionality reduction in order to reduce the number of features.
  2. Incremental PCA: Processes the data in batches, usually making it more memory-efficient than regular PCA on large datasets.
  3. Kernel PCA: Applies non-linear dimensionality reduction through the use of kernel functions.
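The reduction step can be sketched as follows, using scikit-learn's PCA directly (an assumption about the underlying model; BMLF's wrapper API may differ). Two nearly collinear features collapse into a single component with almost no information loss:

```python
# Sketch: PCA collapsing two nearly collinear features into one.
# Uses scikit-learn directly; BMLF's wrapper API may differ.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
# The second feature is (almost) a linear function of the first.
X = np.hstack([t, 2.0 * t + 0.01 * rng.normal(size=(100, 1))])

pca = PCA(n_components=1)
Z = pca.fit_transform(X)
# A single component retains almost all of the variance.
print(Z.shape, pca.explained_variance_ratio_[0] > 0.99)  # → (100, 1) True
```

The transformed data `Z` can then be fed to any of the classification, regression, or clustering models above.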
