Complete the sentences by filling in the blanks. Each correct answer earns points!
Machine Learning (ML) studies statistical algorithms that learn from data and generalize to _____ data to perform tasks without explicit programming.
Context: Machine Learning Definition and Generalization
In supervised learning, the goal is to learn a mapping from inputs to outputs, enabling classification or regression.
Context: Supervised Learning (Classification and Regression)
Unsupervised learning finds structure in data, such as groups (clustering) or lower-dimensional representations (dimensionality reduction).
Context: Unsupervised Learning (Clustering and Dimensionality Reduction)
Self-supervised learning creates supervisory signals from the data itself to learn useful representations without manual _____.
Context: Self-supervised Learning
Reinforcement learning trains agents to choose actions that maximize long-term _____ through interaction with an environment.
Context: Reinforcement Learning and Learning with Humans
A common mathematical view of many ML methods is _____ Risk Minimization (ERM), where models minimize error on training data as an approximation to expected risk.
Context: Empirical Risk Minimization (ERM)
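The idea in the sentence above can be illustrated with a minimal sketch. It assumes a one-parameter linear predictor y = w * x, squared-error loss, and a small grid of candidate slopes; the names `empirical_risk`, `erm_fit`, `train`, and `candidates` are hypothetical and chosen only for this illustration.

```python
def empirical_risk(w, data):
    """Average squared error of the predictor y = w * x on the sample."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def erm_fit(data, candidates):
    """Return the candidate slope with the lowest average training error."""
    return min(candidates, key=lambda w: empirical_risk(w, data))


# Toy training sample (made up for illustration), roughly following y = 2x.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

# Assumed hypothesis class: slopes 0.0, 0.1, ..., 4.0.
candidates = [i / 10 for i in range(0, 41)]

w_hat = erm_fit(train, candidates)
print(w_hat, empirical_risk(w_hat, train))
```

The training error is only a stand-in for the expected risk on new data, so a slope that wins on this sample is not guaranteed to be best in general.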
PAC learning is a theoretical framework that provides probabilistic guarantees on _____ and generalization.
Context: Probably Approximately Correct (PAC) Learning
The bias–variance tradeoff describes the balance between underfitting (high _____) and overfitting (high variance) that affects generalization.
Context: Bias–Variance Tradeoff
Kernel machines use similarity in transformed feature spaces, and this idea is closely related to _____ Machines and SVM.
Context: Kernel Machines and Bias–Variance Tradeoff
Model evaluation for classification often uses a _____ Matrix, which summarizes correct and incorrect predictions by class.
Context: Confusion Matrix
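As a rough sketch of the table described above, the following counts (true class, predicted class) pairs for a toy two-class problem. The helper name `confusion_matrix` and the example labels are assumptions made for illustration, not part of the quiz.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Made-up labels and predictions for five examples.
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

matrix = confusion_matrix(y_true, y_pred, ["cat", "dog"])
for label, row in zip(["cat", "dog"], matrix):
    print(label, row)
```

Entries on the diagonal are correct predictions; off-diagonal entries show which classes are mistaken for which.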
A _____ Curve plots true positive rate versus false positive rate across decision thresholds.
Context: ROC Curve
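The curve described above can be traced by sweeping a threshold over classifier scores. This is a minimal sketch that assumes binary labels (1 = positive, 0 = negative), at least one example of each class, and the rule "predict positive when score >= threshold"; `roc_points` and the sample data are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs,
    starting at (0, 0) and moving from the strictest threshold to the loosest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up scores and ground-truth labels for four examples.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]

points = roc_points(scores, labels)
print(points)
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing along the curve.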
Cause–Effect: Unsupervised learning (for example, k-means) groups similar data points without labels, which causes _____, which leads to a reduced dataset size by replacing groups with centroids.
Context: Cause–Effect chain for k-means compression
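The chain above can be sketched in one dimension: cluster the values, then store only a short centroid table plus one small index per value. This is an illustrative sketch assuming fixed starting centroids and a fixed number of iterations; `kmeans_1d`, `values`, and `codes` are hypothetical names.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on scalars: assign each value to its nearest centroid,
    then move each centroid to the mean of its assigned values."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


# Made-up data with two obvious groups.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centroids = kmeans_1d(values, centroids=[0.0, 10.0])

# Lossy representation: one centroid index per value instead of the value itself.
codes = [min(range(len(centroids)), key=lambda i: abs(v - centroids[i])) for v in values]
reconstructed = [centroids[c] for c in codes]
print(centroids, codes, reconstructed)
```

Six floating-point values are replaced by two centroids and six small integers, at the cost of losing the within-group variation.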
Cause–Effect: Predicting the posterior probabilities of a sequence given its history enables optimal data compression by applying _____ coding to the output distribution.
Context: Data Compression Connection to Learning
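The link between prediction and compression can be made concrete with the ideal code length: a symbol predicted with probability p costs about -log2(p) bits under a well-designed entropy coder. The sketch below compares two made-up predictors on a toy sequence; `ideal_code_length_bits`, `uniform`, and `skewed` are hypothetical names, and both predictors ignore the history for simplicity.

```python
import math


def ideal_code_length_bits(sequence, predict):
    """Sum of -log2 p(symbol | history) over the sequence."""
    total = 0.0
    for i, symbol in enumerate(sequence):
        probs = predict(sequence[:i])
        total += -math.log2(probs[symbol])
    return total


def uniform(history):
    # Assumes nothing about the source: both symbols equally likely.
    return {"a": 0.5, "b": 0.5}


def skewed(history):
    # Assumes "a" is much more common than "b".
    return {"a": 0.9, "b": 0.1}


seq = "aaaaaaab"
bits_uniform = ideal_code_length_bits(seq, uniform)
bits_skewed = ideal_code_length_bits(seq, skewed)
print(bits_uniform, bits_skewed)
```

The predictor that better matches the sequence assigns it higher probability and therefore a shorter code, which is why better prediction means better compression.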
Cause–Effect: If an optimal compressor is available, it can be used for prediction by selecting the symbol that compresses best given the previous history; this works because compression cost reflects _____.
Context: Cause–Effect chain for compression-based prediction
Core concept sentence: In structured prediction, models such as Bayesian networks and conditional random fields are used to predict _____ outputs.
Context: Structured Prediction and Probabilistic Graphical Models