Computational Complexity of Machine Learning Algorithms

Rafay Qayyum
Jul 31, 2022

The first question one should ask is: what is computational complexity? The Encyclopaedia Britannica defines it as:

Computational complexity is a measure of the amount of computing resources (time and space) that a particular algorithm consumes when it runs.

Computational complexity is further divided into two kinds:

Time complexity

Measuring the actual time taken by an algorithm or a piece of code ties the measurement to a particular machine. Time complexity avoids this: it expresses the number of basic operations as a function of the input size, which eliminates machine dependency and lets us compare algorithms without even running them. For example, for sufficiently large inputs an O(n) algorithm will outperform an O(n²) one, because its running time grows more slowly.

(Figure: growth rates of common time complexities.)
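To make the growth rates concrete, here is a toy Python sketch that counts the basic operations performed by an O(n) loop against an O(n²) nested loop:

```python
def linear_ops(n):
    """O(n): one operation per element."""
    return sum(1 for _ in range(n))

def quadratic_ops(n):
    """O(n^2): one operation per pair of elements."""
    return sum(1 for _ in range(n) for _ in range(n))

for n in (10, 100, 1000):
    print(n, linear_ops(n), quadratic_ops(n))
# at n = 1000 the gap is 1,000 vs 1,000,000 operations, and it keeps widening
```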

Space complexity

Like time complexity, space complexity is a function of the input. Conceptually it is the same idea with time replaced by memory. Wikipedia defines space complexity as:

The space complexity of an algorithm or a computer program is the amount of memory space required to solve an instance of the computational problem as a function of characteristics of the input.

Let’s check the computational complexities of different machine learning algorithms. The symbols used in each algorithm’s bounds are defined under its heading.

Linear Regression:

n = number of training examples
f = number of features

Training Time complexity: O(f²n+f³)
Prediction Time complexity: O(f)
Runtime space complexity: O(f)
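The training bound corresponds to solving the least-squares problem in closed form with the normal equation: forming XᵀX costs O(f²n) and solving the resulting f×f system costs O(f³). A minimal NumPy sketch of that reading (function and variable names are my own):

```python
import numpy as np

def fit_linear_regression(X, y):
    """Closed-form least squares via the normal equation.
    X: (n, f) design matrix, y: (n,) targets."""
    XtX = X.T @ X                      # O(f^2 * n)
    Xty = X.T @ y                      # O(f * n)
    return np.linalg.solve(XtX, Xty)   # O(f^3)

def predict(w, x):
    return x @ w                       # O(f) per example
```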

Logistic Regression:

n = number of training examples
f = number of features

Training Time complexity: O(f*n)
Prediction Time complexity: O(f)
Runtime space complexity: O(f)
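The O(f*n) training bound is most naturally read as the cost of one full pass of gradient descent: each pass touches every example and every feature once (multiply by the number of passes for the total). A sketch under that assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, epochs=100):
    """Batch gradient descent; each epoch costs O(n * f)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)                 # O(n * f)
        w -= lr * X.T @ (p - y) / len(y)   # gradient step, O(n * f)
    return w

def predict(w, x):
    return sigmoid(x @ w) >= 0.5           # O(f) per example
```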

Support Vector Machines:

n = number of training examples
f = number of features
s = number of support vectors

Training Time complexity: O(n²) to O(n³)
The training time complexity varies with different kernels.
Prediction Time complexity: O(f) to O(s*f)
For a linear kernel it is O(f); for RBF and polynomial kernels it is O(s*f).
Runtime space complexity: O(s)
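The prediction bounds fall out of the decision function. With an RBF or polynomial kernel, scoring one example means evaluating the kernel against every support vector, hence O(s*f); with a linear kernel the support vectors collapse into a single weight vector, hence O(f). A sketch assuming the support vectors, dual coefficients, and bias have already been learned:

```python
import numpy as np

def rbf_decision(x, support_vectors, dual_coefs, b, gamma=1.0):
    """RBF-kernel SVM score for one example: one kernel
    evaluation per support vector, O(s * f) overall."""
    d2 = np.sum((support_vectors - x) ** 2, axis=1)  # O(s * f)
    return dual_coefs @ np.exp(-gamma * d2) + b      # O(s)

def linear_decision(x, w, b):
    """Linear kernel: a single dot product, O(f)."""
    return x @ w + b
```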

Naive Bayes:

n = number of training examples
f = number of features
c = number of classes

Training Time complexity: O(n*f*c)
Prediction Time complexity: O(c*f)
Runtime space complexity: O(c*f)
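Prediction scores every class by summing one (log-)likelihood term per feature, which is where O(c*f) comes from. A Gaussian Naive Bayes sketch, assuming per-class means, variances, and priors were already estimated during the O(n*f*c) training pass:

```python
import numpy as np

def gnb_predict(x, means, variances, log_priors):
    """Gaussian Naive Bayes for one example.
    means, variances: (c, f); log_priors: (c,). Cost: O(c * f)."""
    log_lik = -0.5 * (np.log(2 * np.pi * variances)
                      + (x - means) ** 2 / variances)   # (c, f) terms
    return np.argmax(log_priors + log_lik.sum(axis=1))  # sum over f per class
```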

Decision Trees:

n = number of training examples
f = number of features
d = depth of the tree
p = number of nodes

Training Time complexity: O(n*log(n)*f)
Prediction Time complexity: O(d)
Runtime space complexity: O(p)
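Prediction is O(d) because classifying one example is a single root-to-leaf walk, one comparison per level. A minimal node structure (my own, not any particular library's):

```python
class Node:
    def __init__(self, feature=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right
        self.value = value  # set only on leaf nodes

def tree_predict(node, x):
    """One comparison per level of the tree: O(d)."""
    while node.value is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value
```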

Random Forests:

n = number of training examples
f = number of features
k = number of trees
p = number of nodes per tree
d = depth of a tree

Training Time complexity: O(n*log(n)*f*k)
Prediction Time complexity: O(d*k)
Runtime space complexity: O(p*k)
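The forest bounds are just the single-tree bounds multiplied by k: train k trees, and at prediction time run each of them and aggregate. Reusing tree_predict from the decision-tree sketch above:

```python
from collections import Counter

def forest_predict(trees, x):
    """One O(d) traversal per tree: O(d * k) in total."""
    votes = [tree_predict(t, x) for t in trees]
    return Counter(votes).most_common(1)[0][0]  # majority vote
```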

K-nearest Neighbors:

n = number of training examples
f = number of features
k = number of nearest neighbors

Brute force:

Training Time complexity: O(1)
Prediction Time complexity: O(n*f+k*f)
Runtime space complexity: O(n*f)
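Brute-force training is O(1) because it only stores the data; the work happens at prediction time, where computing one distance per training example costs O(n*f). A NumPy sketch:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=5):
    """Brute-force k-NN for one query.
    Distances: O(n * f); picking the k nearest: O(n) via argpartition."""
    d2 = np.sum((X_train - x) ** 2, axis=1)  # O(n * f)
    nearest = np.argpartition(d2, k)[:k]     # indices of the k smallest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]         # majority vote
```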

kd-tree:

Training Time complexity: O(f*n*log(n))
Prediction Time complexity: O(k*log(n))
Runtime space complexity: O(n*f)

K-means Clustering:

n = number of training examples
f = number of features
k = number of clusters
i = number of iterations

Training Time complexity: O(n*f*k*i)
Runtime space complexity: O(n*f+k*f)
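The training bound is i iterations of Lloyd's algorithm, each assigning n points to k centers over f features; the space bound covers the data (n*f) plus the centers (k*f). A compact sketch:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm: each iteration costs O(n * k * f)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # distance from every point to every center: O(n * k * f)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):                   # recompute centers: O(n * f)
            members = X[labels == j]
            if len(members):                 # keep old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers, labels
```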
