Learning to get faster performance out in Python using Big O.

When you start to get better at programming and are beginning to do more complicated things with your code, especially in Python you are going to start caring about a lot more things than “did I get this to work.” A few things you might worry about are “Could this be faster?”, “Could this be clearer?”, and “Even if this couldn’t faster, can we make it run faster?”.

I want to say quickly, I’m not an expert, and I’m just going to get you started and try to point you in the right direction, informing you about what I found was helpful to me.

Ok, so let’s say I’m feeling nostalgic and want to play a game from my childhood called cookie clicker where you click a cookie repeatedly to earn cookies that you use to buy things to buy more cookies. while playing this game, I quickly realize that I can automate using the library pyautogui and moving to mouse to the x-y coordinates of the cookie and telling Python to click repeatedly at that location for however long we want

This isn’t actually a good way to go about automating a task like this because if we move the location of the cookie on the screen it will immediately break if you actually wanted to learn how to do this properly you could try using selenium for Python a good tutorial series of which can be found here: https://www.youtube.com/watch?v=Xjv1sY630Uc.

Back to telling Python how to click a cookie using pyautogui. Pyautogui is a library for Python that allows users to easily use their mouse and keyboard through Python to interact with gui applications and be able to automate things. So as I said earlier all we are going to do is find the x,y coordinates for the cookie on our screen and move the mouse to that location and start clicking.

from pyautogui import moveTo, clickdef clickthecookie(times):
moveTo(135, 513)
for i in range(times):
click()

Ok so we have our function, it’ll move to the coordinate of the cookie on my screen at the time I made this which was (135,513). When I run this well see that our mouse will click at a consistent rate until it completes the number of “times” I specify. Before we go into making this faster I think this is a good opportunity to learn how to describe how long this function will take to run using Big O Notation.

Big O Notation

To describe how long our function is going to take all we need to do is ask ourselves, depending on our input, how many lines is this function going to run until it stops. In our case there are three lines total after the initial function declaration, the first being the line that moves our mouse. No matter what happens, the first line will be read and will move our mouse when our function is declared, it is a constant. The next line is a for loop that will loop for however many “times” we put in, in other words every line within the loop will execute n times depending on our input. The last line “click” is similar to our previous one where it is just constant, if it gets read it’ll do its job and the interpreter will move on.

Big O notation provides us with a simple way to describe the same relationships we observed above with these symbols

Constant time: O(1): O(1) or constant time means that no matter the input the output will be the same. For example:

def sayhi(name):
return 'Hi'
#this function takes a name as an arguement but it really doesnt #matter because the output is always hi. The computer doesnt really
#need to think about whatever name you give it, it will just #constantly say hi.

Linear time O(n): O(n) or Linear time means that as the inputs you put in get bigger, the amount of time the function will take will increase at a constant rate.

def countTo(number):
for num in range(1,number):
print(number)
#the higher the number you give this function the longer the for #loop will run. counting to ten is the same as counting to one ten #times this is a linear relationship

Quadratic time O(n²): Quadratic time is what happens when you do x things every time you do x thing.

def countToxxtimes(x):
for num in range(x):
for number in range(x):
print(number)
#this function is the same as the previous counting function except
#it counts to x x times for example if I pass in 10 it will count to #10 10 times where as in the countTo function if I pass in ten it is #only going to count to ten once meaning this function would take #ten times longer than the other one if we counted how long each #took to compute ten. If the first for loop will be read 5 times so #will the for loop within it meaning the lines within the second #loop will be read 5*5 times

Logarithmic time O(Log n): logarithmic time occurs when you are constantly dividing something or are incrementing exponentially over the input. For these functions as the input gets bigger your operating time will increase than start to stay the same.

def amountofsquareswithinx(x):n =2
while n is <= x:
n=n^2
return n
#as xgrows we the amount of time it will take to reach x will start #to decay because of the rate at which n will start to grow.

Linearithmic time O(n*Log n): Linearithmic algorithms are ones where you perform a logarithmic algorithm more times as you input gets bigger

Exponential time O(2^n ): As your input increases the operating time increases exponentially. ie calculating the number of unique pairings between x items. As the number of items to check increases, the number of pairings will begin to grow exponentially.

Factorial time O(n!): pretty much trying to brute force your way to an answer. The growth rate for O(n!) is ridiculous and unsustainable at high inputs as it could literally take you hundreds of years at a certain point.

Using what we learned from big O notation we can now look back at out cookie clicking function and break it down using the notation

def clickthecookie(times):#O(1)
moveTo(Cookie)#O(1)
for i in range(times):#O(n)
click()#O(1)
# using big o we can determine that for n input this funciton will take
#O(n)+O(1)+O(1)+O(1)

As mentioned in the comments of the code block above, we can calculate that the above function will take O(n)+O(1)+O(1)+O(1). Then you are looking at your functions operating complexity you are looking for the worst-case scenario meaning you should just look at the fastest growing term in your code as that’s the best indicator of what the longest times will look like. Keeping this in mind we will drop the constants and take O(n). But how accurate is this actually?

If we count to 100 and run our function with our current number for every number while timing it we can make a graph and see how well it matches up with our theory.

import matplotlib.pyplot as plttimestocompute = []
inputs = []
for i in range(100):
start=t()
pc.clickthecookie(i)
finish = t()-start
timestocompute.append(finish)
inputs.append(i)
plt.scatter(inputs,timestocompute)

And we were right as our input(0–100 on the x-axis) increases so does the operating time(seconds on the y-axis). Knowing this you should now be able to at least identify the problem parts of beefy code you make/encounter. But what if you broke it down and you cant find a way to further increase the performance of your work? I’m planning on making posts on these as well but I would look into Cython and/or threading/multiprocessing.

I hope you found this helpful and would love some feedback on why it may not have been. Thanks for reading

Data Scientist/ Machine learning engineer who loves to share about this fascinating field

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store