Time and space are two of the most important concepts to understand in software and algorithm development. They are critical for improving algorithmic efficiency: developers need to make code as efficient as possible, and this matters most when working with large data sets or complex problems. Time and space are often described using Big O Notation, which, although it is simply a mathematical notation, is enormously valuable in helping developers account for time and space in their code.
Although we often think of computers as machines with unlimited capabilities, computers can only complete a finite amount of work. Developers need to optimise each task so that it uses as little time or memory as possible, and ideally both. Time complexity and space complexity are functions that describe how much time and how much memory an algorithm uses in relation to the size of its input, and they help us understand the constraints computers face in completing their work.
Time Complexity
Imagine following a recipe step by step to bake cookies. If making one batch of cookies takes, say, one hour, what happens when you need to make 100 batches? Will it take 100 times as long? Time complexity helps us answer these questions by measuring how the running time of an algorithm grows as the size of the problem grows.
A classic example is linear search. To find a name in a list, the easiest way to do it would be by checking each name until you find what you are looking for. If there are 10 names on the list, 10 checks will need to be made (in the worst case). If there are 100 names, 100 checks will be required. This is called linear time complexity and we write it as O(n), where n is the number of items in the list. If the time complexity is O(n²), then as the size of the problem doubles, the time taken for the algorithm to work quadruples. In this case, the algorithm would not be very efficient because if the data set is large, the algorithm will take much longer to work through the problem.
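To make this concrete, here is a minimal sketch of a linear search in Python (the function name and the example list are only illustrative, not taken from the text above):

```python
def linear_search(names, target):
    """Return the index of target in names, or -1 if it is absent.

    In the worst case every element is examined once, so the running
    time grows linearly with the length of the list: O(n).
    """
    for i, name in enumerate(names):
        if name == target:
            return i
    return -1

print(linear_search(["Ada", "Grace", "Alan"], "Alan"))  # 2
```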
Space Complexity
Now imagine moving to a new house, and you need to pack up all your belongings. If your room is small, you only need a few boxes. However, if you have a large house, you will need more boxes. Space complexity is a measure of how much memory an algorithm needs to do its job and just like time complexity, we want this to be as small as possible. If you wanted to search through a data set and find the largest number, one way is to write down the largest number you've seen so far and keep searching through the entire data set. No matter how big your dataset is, you only need to remember one number. This is called constant space complexity, which we write as O(1).
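As a small illustrative sketch of the "remember only one number" idea (Python's built-in max already does this; the version below exists purely for exposition and assumes a non-empty list):

```python
def largest(numbers):
    """Find the largest value while remembering only one number.

    However long the list is, the extra memory used is a single
    variable, so the space complexity is O(1).
    """
    biggest = numbers[0]          # assumes the list is non-empty
    for x in numbers[1:]:
        if x > biggest:
            biggest = x
    return biggest

print(largest([7, 42, 3, 19]))  # 42
```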
The Importance of Big O Notation
Instead of saying ‘this algorithm takes exactly 25 milliseconds’ or ‘this algorithm takes exactly 10 megabytes of memory’, we use symbols like O(n) or O(1) to give a general idea of how the algorithm handles the problem. Big O notation is a shorthand for describing the time or space complexity of an algorithm. For instance, if the space complexity of an algorithm is O(log n), then doubling the size of the problem increases the memory used by only a constant amount.
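Binary search is not mentioned above, but it is the textbook example of logarithmic growth, so a rough recursive sketch (an assumption of this illustration) may help: each call halves the remaining range, so doubling the input adds only one more halving step, giving O(log n) time and, because of the call stack, O(log n) extra space.

```python
def binary_search(sorted_items, target, lo=0, hi=None):
    """Recursive binary search over a sorted list.

    Each call halves the remaining range, so both the number of calls
    and the depth of the call stack grow like log n: doubling the input
    adds only one extra level.
    """
    if hi is None:
        hi = len(sorted_items) - 1
    if lo > hi:
        return False
    mid = (lo + hi) // 2
    if sorted_items[mid] == target:
        return True
    if sorted_items[mid] < target:
        return binary_search(sorted_items, target, mid + 1, hi)
    return binary_search(sorted_items, target, lo, mid - 1)

print(binary_search([1, 3, 5, 8, 13, 21], 8))  # True
```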
So why should you care? Imagine you are developing a new application that needs to process large amounts of data. You want it to be fast and efficient, even when processing large files. Whether you're writing code for a simple calculator application or a complex video game, by choosing the right algorithms and data structures, you can create software that is not only powerful but also fast and efficient. And this all begins with understanding time complexity, space complexity and being able to represent these in Big O Notation in order to help us measure and improve the performance of our code.
Consider the following definitions of classes of problems.
P = {all problems that can be solved in polynomial time}
EXP = {all problems that can be solved in exponential time}
NP = {all problems that can be solved in polynomial time via a “lucky” algorithm}
“Polynomial time” means a time complexity of O(n^c), and “exponential time” means a time complexity of O(2^(n^c)), for some positive constant c. A “lucky” algorithm describes a “non-deterministic model of computation” [1], which means that when the algorithm is given the same input, it can exhibit differing behaviour on different runs.
For every problem in NP, if the answer for a given input is yes, a lucky algorithm will output yes in polynomial time; if the answer is no, the correct answer of no will also be reached in polynomial time. Another characterisation of NP is as {all problems that can be verified in polynomial time}. Verification is crucially distinct from an ab initio solution. Consider a proof by mathematical induction: once a candidate formula is provided, showing that it holds for the first case and that each case implies the next verifies the whole claim; being given a candidate result is often far easier and less complex than deriving that formula in the first place. More formally, when the answer is yes, this can be proved, and it is that proof which can be checked in polynomial time. Whether P = NP has been called the “deepest unanswered problem in all of Computer Science” [2] and is one of the remaining unsolved Millennium Prize Problems. In words, it asks: “are there problems whose solutions can be checked in polynomial time but that cannot be solved in polynomial time?”. Of course, all problems that can be solved in polynomial time can also be checked in polynomial time, because solution ab initio is itself a method of checking. Consequently, P ⊆ NP; but is P ⊂ NP?
The following are some famous problems known to be in the class NP; each comes with an intuitive checking algorithm:
• Integer factorisation (verifiable in polynomial time by multiplying the claimed factors back together; see the sketch after this list)
• n × n sudoku (verifiable in polynomial time by comparing each row, column, and group of cells, treated as unordered sets, with the set of integers from 1 to n)
• Knapsack problem (where you try to maximise the total value of the items chosen to fit within the weight capacity of a container; a proposed selection is verified in polynomial time by summing its weights and values)
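To see why checking can be so much cheaper than solving, consider a hypothetical verifier for the first item above: given n and a claimed list of factors, it simply multiplies them back together. The function name and the requirement that each factor be greater than 1 are assumptions made for this sketch.

```python
def verify_factorisation(n, factors):
    """Check a claimed non-trivial factorisation of n.

    Deriving the factors may be expensive, but checking a proposed
    answer needs only one pass of multiplication: O(k) multiplications
    for k claimed factors.
    """
    if any(f <= 1 for f in factors):   # reject trivial factors (sketch-only rule)
        return False
    product = 1
    for f in factors:
        product *= f
    return product == n

print(verify_factorisation(91, [7, 13]))  # True
print(verify_factorisation(91, [3, 31]))  # False, since 3 * 31 = 93
```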
Decision problems can be further categorised. A decision problem is S-hard if its time complexity is at least that of every problem in S, where S is a complexity class. A decision problem is S-complete if it is both S-hard and in S. Therefore, C = S ∩ H, where H is the set of all S-hard problems and C is the set of all
S-complete problems. For example, the game of Tetris (with the decision problem “can you survive given the scenario?”) is NP-complete, and the game of chess (with the decision problem “does White win a given position?”) is
EXP-complete. These terms are not superfluous; they are useful definitions, even if they seem puzzling at first. For example, it is now easy to see that chess ∉ P. The inclusion P ⊂ EXP is known to be strict, so there exist problems p ∈ EXP with p ∉ P; and since chess is EXP-complete, it is at least as hard as every such p, hence chess ∉ P. A related reformulation of a decision problem ‘d’ being S-hard is that every problem in S can be “reduced” to d: ∀p ∈ S, p → d. Reduction is an immensely powerful tool in problem-solving. It entails converting a foreign problem into a more familiar one, and applying standard techniques to then implicitly solve the initial problem. A corollary of this is that if a problem p is reducible to another problem d, then d is at least as complex as p, because a solution to d can be used to solve p [1]. Reduction from a decision problem p to another decision problem d is a conversion from inputs of p to inputs of d that preserves the single-bit (yes/no) output.
Commonly, problems are formulated in terms of an equivalent graph problem, and the wide variety of graph algorithms can then be deployed to solve the initial problem [1]. Examples include converting a longest path problem into a shortest path problem by negating the weights, or converting a minimum product of weights problem to a minimum sum of weights problem by taking logarithms of the weights (log(ab) ≡ log(a) + log(b)), as sketched below.
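Here is a minimal sketch of the logarithm trick. It assumes all weights are positive (so their logarithms are defined) and, purely for illustration, represents each candidate path simply as its list of edge weights rather than as part of a full graph.

```python
import math

def path_with_min_weight_product(paths):
    """Pick the path whose product of (positive) edge weights is smallest.

    Reduction: because log(a*b) = log(a) + log(b) and log is increasing,
    minimising the product of the weights is equivalent to minimising the
    sum of their logarithms, so a sum-based comparison suffices.
    """
    return min(paths, key=lambda weights: sum(math.log(w) for w in weights))

# Two hypothetical candidate paths, each given as a list of positive edge weights.
paths = [[2.0, 3.0, 4.0], [1.5, 1.5, 9.0]]
print(path_with_min_weight_product(paths))  # [1.5, 1.5, 9.0], product 20.25 < 24
```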
Computational complexity theory is indisputably a rich and fundamental field of study. It forms the foundation for the analysis of algorithms, which is itself vital for a multitude of reasons. It raises deep and powerful questions whose answers would have far-reaching and irreversible consequences: if P = NP, most encryption systems would be largely upended, among a wide variety of other ramifications [4]. Polls show that most computer scientists believe that P ≠ NP, though others argue that researchers should explore both sides equally when investigating proofs and ought not to be prejudiced [3]. Nevertheless, whilst pioneers such as Alan Turing (UK), Alonzo Church (US), Juris Hartmanis (Latvia), and Boris Trakhtenbrot (USSR) have firmly established it as an area of research, there is undeniably significant progress yet to be made.
[1] Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, OpenCourseWare 6.006: Introduction to Algorithms, Lecture 23: Computational Complexity. https://youtu.be/moPtwq_cVH8?si=Tv31ws0zk4_bj4X6
[2] Scientific American, "The Most Important Unsolved Problem in Computer Science". https://www.scientificamerican.com/article/the-most-important-unsolved-problem-in-computer-science/
[3] Communications of the ACM, "P vs. NP poll". https://mags.acm.org/communications/201205?pg=12#pg12
[4] Wikipedia, "P versus NP problem". https://en.wikipedia.org/wiki/P_versus_NP_problem#P_=_NP