CSCI933 - Machine Learning Algorithms and Applications

In CSCI933, we explored various facets of machine learning, spanning theoretical foundations and practical applications. Here’s a detailed look at what we covered:

Introduction to Linear Algebra

Linear algebra is foundational for understanding machine learning algorithms. The course introduced vectors and matrices, essential tools for modeling and solving complex problems. Key concepts included:

  • Vectors and Matrices: Understanding their definitions, transpositions, and practical applications in representing and manipulating data.
  • Matrix Algebra: Operations such as matrix multiplication and addition, and properties such as symmetry, skew-symmetry, and orthogonality.
  • Applications: Principal Component Analysis (PCA) for dimensionality reduction, utilizing Singular Value Decomposition (SVD) for matrix factorization.
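
To make the PCA connection concrete, here is a minimal sketch of PCA via SVD in NumPy; the data matrix and number of components are made-up examples:

```python
import numpy as np

def pca_svd(X, k):
    """Project data onto its top-k principal components using SVD."""
    X_centered = X - X.mean(axis=0)          # center each feature
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]                      # top-k right singular vectors
    return X_centered @ components.T         # projected data, shape (n_samples, k)

# Example: reduce 5-dimensional toy data to 2 components
X = np.random.randn(100, 5)
Z = pca_svd(X, k=2)
print(Z.shape)  # (100, 2)
```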

Probability Theory

Probability theory equips us to model and reason about uncertainty in data. Key topics included:

  • Events and Probabilities: Events are subsets of a sample space, with probabilities assigned to these events.
  • Conditional Probability: The probability of an event given that another event has occurred.
  • Random Variables: Functions that assign numerical values to outcomes in a sample space, along with concepts like expected value and variance.
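
As a small illustration of expected value and variance, here is a made-up discrete distribution computed in NumPy:

```python
import numpy as np

# A made-up discrete distribution: a loaded six-sided die
values = np.arange(1, 7)
probs = np.array([0.1, 0.1, 0.1, 0.1, 0.2, 0.4])

expected = np.sum(values * probs)                  # E[X] = sum of x_i * p_i
variance = np.sum((values - expected)**2 * probs)  # Var[X] = E[(X - E[X])^2]
print(expected, variance)
```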

Introduction to Machine Learning

Machine learning involves extracting patterns from data to make predictions or decisions. We covered:

  • Common Scenarios: Supervised, unsupervised, semi-supervised, and reinforcement learning, among others.
  • Tasks: Classification, regression, clustering, dimensionality reduction, natural language understanding, and more.
  • Performance and Experience: Measures of performance like accuracy and error rate, and the impact of supervised vs. unsupervised experiences on algorithm performance.

Regression Techniques

Regression is a key supervised learning method for prediction. Topics included:

  • Introduction to Regression: Historical context and its evolution.
  • Types of Regression: Linear regression, kernel ridge regression, lasso regression, and elastic net regression.
  • Regularization: Techniques to prevent overfitting, such as L1 (Lasso), L2 (Ridge), and elastic net regularization. Regularization helps improve model generalization by adding penalty terms to the loss function.
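
As a sketch of how an L2 penalty changes the fit, ridge regression admits a closed-form solution; the toy data and regularization strength below are illustrative:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Toy data: y depends on the first feature plus noise
X = np.random.randn(50, 3)
y = 2.0 * X[:, 0] + 0.1 * np.random.randn(50)
w = ridge_fit(X, y, lam=0.5)
print(w)  # weights shrunk toward zero by the L2 penalty
```

Lasso (L1) has no closed form and is usually solved iteratively, for example by coordinate descent.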

Classification and Pattern Recognition

Classification involves designing models to recognize and categorize patterns. Key points:

  • Pattern Recognition: Designing machines to classify objects based on features.
  • Classifier Design: Specifying model parameters to ensure optimal performance.
  • Bayes Decision Rule: A probabilistic method for classification that minimizes error or risk by choosing the class with the highest posterior probability.
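
A minimal sketch of the Bayes decision rule, with made-up priors and likelihoods:

```python
import numpy as np

# Hypothetical two-class problem: priors and class-conditional likelihoods
priors = np.array([0.6, 0.4])       # P(C0), P(C1)
likelihoods = np.array([0.2, 0.7])  # p(x | C0), p(x | C1) for an observed x

posteriors = priors * likelihoods   # proportional to P(C_k | x)
posteriors /= posteriors.sum()      # normalize by the evidence p(x)

decision = np.argmax(posteriors)    # pick the class with the highest posterior
print(posteriors, "-> class", decision)
```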

Neural Networks

Neural networks are computational models loosely inspired by how the brain performs tasks. Topics covered:

  • Introduction to Neural Networks: Modeling the brain’s task performance using synaptic weights.
  • Neuron Models: Basic components of a neuron, including synapses, adder, and activation functions.
  • Activation Functions: Common functions such as Threshold, Logistic (Sigmoid), ReLU, and Softmax (sketched after this list).
  • Network Architectures: Single Layer, Multilayer Feedforward, and Recurrent Networks.
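
Here are minimal NumPy versions of three of the activation functions above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])
print(sigmoid(z), relu(z), softmax(z))
```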

Advanced Neural Networks

We explored specialized neural network architectures:

  • Perceptron: A simple model that classifies inputs by creating a hyperplane as a decision boundary (see the sketch after this list).
  • Backpropagation Algorithm: A method for training multilayer perceptrons using forward and backward passes to update weights.
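
A minimal sketch of the perceptron learning rule on made-up, linearly separable data:

```python
import numpy as np

def perceptron_train(X, y, epochs=10, lr=1.0):
    """Classic perceptron rule: update weights only on misclassified points.

    y must be in {-1, +1}; the learned hyperplane is w @ x + b = 0.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:  # misclassified (or on the boundary)
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Linearly separable toy data
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_train(X, y)
print(w, b)
```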

Convolutional Neural Networks (CNNs)

CNNs are designed for processing grid-like data:

  • CNN Architecture: Includes convolutional layers, subsampling (pooling), and feature mapping (a minimal sketch follows this list).
  • Popular Architectures: LeNet-5, AlexNet, GoogLeNet, and ResNet.
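
A minimal LeNet-style sketch in PyTorch; the layer sizes here are illustrative, not the exact LeNet-5 configuration:

```python
import torch
import torch.nn as nn

# A small LeNet-style CNN for 28x28 grayscale images (e.g. MNIST)
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # feature mapping
            nn.ReLU(),
            nn.MaxPool2d(2),                            # subsampling (pooling)
            nn.Conv2d(6, 16, kernel_size=5),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 5 * 5, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

x = torch.randn(1, 1, 28, 28)  # one fake image
print(SmallCNN()(x).shape)     # torch.Size([1, 10])
```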

Recurrent Neural Networks (RNNs)

RNNs handle sequential data:

  • Time Series Data: Applications in stock prices, electricity consumption, and more.
  • RNNs: Processing sequences by maintaining an internal state.
  • LSTM and GRU: Gated variants that mitigate the vanishing-gradient problem of plain RNNs by learning what to remember and what to forget.
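
A minimal example of running a toy time series through an LSTM in PyTorch; all sizes are illustrative:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)    # predict the next value from the final state

x = torch.randn(8, 30, 1)  # batch of 8 sequences, 30 time steps, 1 feature
out, (h_n, c_n) = lstm(x)  # out: (8, 30, 16); h_n: (1, 8, 16)
pred = head(out[:, -1, :]) # use the hidden state at the last time step
print(pred.shape)          # torch.Size([8, 1])
```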

Language Modelling

Language modeling is crucial for understanding and generating human language:

  • Text Encoding: Techniques like one-hot encoding and word embeddings.
  • Neural Machine Translation (NMT): Models for translating text using encoder-decoder architectures.
  • Transformer: A powerful model utilizing multi-head attention and positional embeddings for NLP tasks.
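
At the heart of the Transformer is scaled dot-product attention; here is a single-head sketch in NumPy with made-up token representations:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Toy sequence of 4 tokens with 8-dimensional representations
Q = K = V = np.random.randn(4, 8)  # self-attention: same input for Q, K, V
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```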

Regularization and Model Improvement

Regularization techniques enhance model performance:

  • Regularization Methods: Data transformations (e.g. augmentation), network architecture adjustments (e.g. dropout), and optimization-level techniques (e.g. weight decay, early stopping).
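
As an illustration of the second and third categories, here are dropout and weight decay in PyTorch; the hyperparameters are arbitrary:

```python
import torch
import torch.nn as nn

# Dropout randomly zeroes activations during training
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # regularization via the architecture
    nn.Linear(64, 1),
)

# Weight decay adds an L2 penalty on the weights during optimization
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```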

Autoencoders and GANs

Advanced neural network models:

  • Autoencoders: Networks trained to reconstruct their inputs through a compressed bottleneck representation (see the sketch after this list).
  • Generative Adversarial Networks (GANs): Two networks (generator and discriminator) trained together to produce synthetic data.
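
A minimal autoencoder sketch in PyTorch, assuming flattened 28x28 inputs (784 dimensions); the sizes are illustrative:

```python
import torch
import torch.nn as nn

# A minimal autoencoder: compress 784-dim inputs to a 32-dim code
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.rand(16, 784)                     # batch of fake flattened images
loss = nn.functional.mse_loss(model(x), x)  # reconstruction loss
print(loss.item())
```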

Graphs and Graph Neural Networks (GNNs)

Graphs represent complex relationships:

  • Graphs: Definitions and types (undirected, directed, weighted, bipartite).
  • Graph Neural Networks: Techniques for training GNNs, including message passing and node feature updates.
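
A simplified GCN-style message-passing layer in NumPy; the toy graph and feature sizes are made up:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One simplified message-passing step: aggregate neighbors, then transform.

    A: adjacency matrix (n, n); H: node features (n, d); W: weights (d, d_out).
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # symmetric normalization
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)  # ReLU

# Toy graph: 3 nodes in a path, 4-dim features, 2-dim output
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)
W = np.random.randn(4, 2)
print(gcn_layer(A, H, W).shape)  # (3, 2)
```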

Overall Review

My review of this subject is mixed. Although I feel there is still a lot to learn in machine learning, I no longer see it as an untamable beast. Neural networks are just function approximators, and at their core, everything boils down to linear algebra. The journey towards creating conscious artificial intelligence remains uncertain—maybe it will never be achieved; who knows?

On the practical side, I feel capable of implementing machine learning models effectively. However, I don’t yet have a deep grasp of the theory behind each model. I understand their functionalities and applications, but the mathematical intricacies of how they work remain elusive. Questions about the underlying mathematics are still lingering in my mind.

I scored 70% in the course, and the theoretical aspects were particularly challenging. I should have focused more on the lectures.

Resources: Lectures · Assign1 · Assign2 · Assign3



