Computational Complexity of Machine Learning Algorithms
The first question one should ask is What is Computational Complexity? The Encyclopaedia Britannica defines computational complexity as:
Computational complexity is a measure of the amount of computing resources (time and space) that a particular algorithm consumes when it runs.
Computational complexity is further divided into two types of complexities which are:
Time complexity
Rather than measuring the time taken by an algorithm or a piece of code that introduces machine dependency, Time complexity was created. Time complexity is a function of input that eliminates machine dependency and allows us to compare different algorithms without even running them. For example, an algorithm with O(n) will always perform better than O(n²) because its growth rate is smaller than that of O(n²).
Space complexity
Like Time complexity is a function of input so is space complexity. Conceptually, it is the same as time complexity you just need to replace time with space. Wikipedia defines space complexity as:
The space complexity of an algorithm or a computer program is the amount of memory space required to solve an instance of the computational problem as a function of characteristics of the input.
Let’s check the computational complexities of different machine learning algorithms. Bounds are defined within the heading of the algorithm.
Linear Regression:
n= Number of training examples
f= no of features
Training Time complexity: O(f²n+f³)
Prediction Time complexity: O(f)
Runtime space complexity: O(f)
Logistic Regression:
n= Number of training examples
f= no of features
Training Time complexity: O(f*n)
Prediction Time complexity: O(f)
Runtime space complexity: O(f)
Support vector machines:
n= Number of training examples
f= no of features
s= no of support vectors
Training Time complexity: O(n²) to O(n³)
The training time complexity varies with different kernels.
Prediction Time complexity: O(f) to O(s*f)
For linear kernel it’s O(f), for RBF and Polynomial it is O(s*f)
Runtime space complexity: O(s)
Naive Bayes:
n= Number of training examples
f= no of features
c= no of classes
Training Time complexity: O(n*f*c)
Prediction Time complexity: O(c*f)
Runtime space complexity: O(c*f)
Decision Trees:
n= Number of training examples
f= no of features
d= depth of the tree
p= no of nodes
Training Time complexity: O(n*log(n)*f)
Prediction Time complexity: O(d)
Runtime space complexity: O(p)
Random Forests:
n= Number of training examples
f= no of features
k= no of trees
p=no of nodes in trees
d= depth of the tree
Training Time complexity: O(n*log(n)*f*k)
Prediction Time complexity: O(d*k)
Runtime space complexity: O(p*k)
K-nearest Neighbors:
n= Number of training examples
f= no of features
k= k nearest neighbors
Brute:
Training Time complexity: O(1)
Prediction Time complexity: O(n*f+k*f)
Runtime space complexity: O(n*f)
kd-tree:
Training Time complexity: O(f*n*log(n))
Prediction Time complexity: O(k*log(n))
Runtime space complexity: O(n*f)
K-means clustering
n= Number of training examples
f= no of features
k= number of clusters
i= no of iterations
Training Time complexity: O(n*f*k*i)
Runtime space complexity: O(n*f+k*f)