Don’t repeat yourself: Python functions

Learning to program is not easy. We have to learn a new language and a new way of thinking. That means learning the grammatical rules of the programming language, and also the logic of our program: how it will work its way through the code and, hopefully, give us the answer we expect.

In this post we will see how we can use functions to improve our code. We have learned about Python functions before (1, 2, 3). This time we will see how functions can be used to reduce duplicate code and make our code more readable and maintainable. This will lay the groundwork for future posts where we will explore creating our own Python classes, modules and packages.

A simple example

We want to calculate the BMI of five subjects. Here is how we could do it in Python:

# Subject data = [weight_kg, height_m]
subject1 = [80, 1.62]
subject2 = [69, 1.53]
subject3 = [80, 1.66]
subject4 = [80, 1.79]
subject5 = [72, 1.60]

bmi_subject1 = int(subject1[0] / subject1[1]**2)
print("bmi {} = {}".format('subject1', bmi_subject1))

bmi_subject2 = int(subject2[0] / subject2[1]**2)
print("bmi {} = {}".format('subject2', bmi_subject2))

bmi_subject3 = int(subject3[0] / subject3[1]**2)
print("bmi {} = {}".format('subject3', bmi_subject3))

bmi_subject4 = int(subject4[0] / subject4[1]**2)
print("bmi {} = {}".format('subject4', bmi_subject4))

bmi_subject5 = int(subject5[0] / subject5[1]**2)
print("bmi {} = {}".format('subject5', bmi_subject5))

The code produces the expected output:

bmi subject1 = 30
bmi subject2 = 29
bmi subject3 = 29
bmi subject4 = 24
bmi subject5 = 28

Using functions to avoid repeating ourselves

While the above code works, it required a lot of typing, retyping and cut-and-pasting.

It is a good idea to adhere to the DRY principle when writing computer code: don’t repeat yourself. If we cut and paste code to reuse it in various places, we may at some point realise our code contains an error and have to fix that error in every place we pasted it. Often we forget to fix one of the pasted copies and are then confused about why our code does not work. Imagine we had made a mistake in our BMI formula: we would have to fix it five times in the above code.

Below is an improved version of the code that includes a function to calculate BMI:

def bmi_calc(weight_kg, height_m):
    """Calculate BMI from weight in kg and height in meters"""
    bmi = int(weight_kg / height_m**2)
    return bmi

# Subject data = [weight_kg, height_m]
subject1 = [80, 1.62]
subject2 = [69, 1.53]
subject3 = [80, 1.66]
subject4 = [80, 1.79]
subject5 = [72, 1.60]

bmi_subject1 = bmi_calc(subject1[0], subject1[1])
print("bmi {} = {}".format('subject1', bmi_subject1))

bmi_subject2 = bmi_calc(subject2[0], subject2[1])
print("bmi {} = {}".format('subject2', bmi_subject2))

bmi_subject3 = bmi_calc(subject3[0], subject3[1])
print("bmi {} = {}".format('subject3', bmi_subject3))

bmi_subject4 = bmi_calc(subject4[0], subject4[1])
print("bmi {} = {}".format('subject4', bmi_subject4))

bmi_subject5 = bmi_calc(subject5[0], subject5[1])
print("bmi {} = {}".format('subject5', bmi_subject5))
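
To make the benefit concrete, here is a minimal sketch. It assumes, purely for illustration, that our first attempt at the formula forgot to square the height; the name bmi_calc_buggy is hypothetical and not part of the code above. Because the formula lives in one place, the fix is a single edit and every call picks it up.

```python
def bmi_calc_buggy(weight_kg, height_m):
    """Hypothetical first attempt: the height is not squared."""
    return int(weight_kg / height_m)


def bmi_calc(weight_kg, height_m):
    """Calculate BMI from weight in kg and height in meters"""
    return int(weight_kg / height_m**2)


# Subject data = [weight_kg, height_m]
subject1 = [80, 1.62]

# The calling code is the same in both cases; only the function body changed
wrong = bmi_calc_buggy(subject1[0], subject1[1])
right = bmi_calc(subject1[0], subject1[1])
print("buggy = {}, fixed = {}".format(wrong, right))
```

With five pasted copies of the formula, the same correction would have needed five separate edits.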

Avoid repeating ourselves even more

While the above code produces the same output as the first version of our code, it still required lots of cutting and pasting and changing subject numbers. Also, the print statement is the same for each subject apart from the subject name. We should be able to improve our code by including the print statement in our function.

def bmi_calc(sub_num, weight_kg, height_m):
    """Calculate BMI from weight in kg and height in meters"""
    bmi = int(weight_kg / height_m**2)
    subject = 'subject' + str(sub_num)
    print("bmi {} = {}".format(subject, bmi))

# Subject data = [weight_kg, height_m]
subject1 = [80, 1.62]
subject2 = [69, 1.53]
subject3 = [80, 1.66]
subject4 = [80, 1.79]
subject5 = [72, 1.60]

# bmi_calc() now prints the result itself and returns nothing,
# so there is no value to assign to a variable
bmi_calc(1, subject1[0], subject1[1])
bmi_calc(2, subject2[0], subject2[1])
bmi_calc(3, subject3[0], subject3[1])
bmi_calc(4, subject4[0], subject4[1])
bmi_calc(5, subject5[0], subject5[1])

A final improvement

The above code works well. However, we had to copy-and-paste the call to our function for each subject and remember to change the subject number.

The final version of our code is below. We now use a list of lists to hold our subjects’ data. Each internal list holds the data for a subject, and we have added the subject number along with their weight and height. Our code now uses a for loop to call our bmi_calc() function for each subject.

def bmi_calc(sub_num, weight_kg, height_m):
    """Calculate BMI from weight in kg and height in meters"""
    bmi = int(weight_kg / height_m**2)
    subject = 'subject' + str(sub_num)
    print("bmi {} = {}".format(subject, bmi))

# Subject data = [sub_num, weight_kg, height_m]
subjects = [[1, 80, 1.62], # subject1
            [2, 69, 1.53], # subject2
            [3, 80, 1.66], # subject3
            [4, 80, 1.79], # subject4
            [5, 72, 1.60]] # subject5

for sub in subjects:
    bmi_calc(sub[0], sub[1], sub[2])
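
One possible variation, sketched under the assumption that we want to reuse the BMI values later rather than only display them: the function returns the value and the caller decides what to do with it. The results dictionary is a hypothetical name introduced for illustration, and the loop unpacks each inner list into named variables instead of indexing with sub[0], sub[1] and sub[2].

```python
def bmi_calc(weight_kg, height_m):
    """Calculate BMI from weight in kg and height in meters"""
    return int(weight_kg / height_m**2)


# Subject data = [sub_num, weight_kg, height_m]
subjects = [[1, 80, 1.62],
            [2, 69, 1.53],
            [3, 80, 1.66],
            [4, 80, 1.79],
            [5, 72, 1.60]]

# Hypothetical container: maps each subject number to its BMI
results = {}
for sub_num, weight_kg, height_m in subjects:
    results[sub_num] = bmi_calc(weight_kg, height_m)

# Printing is now a separate step from calculating
for sub_num, bmi in results.items():
    print("bmi subject{} = {}".format(sub_num, bmi))
```

Keeping the calculation separate from the printing means the same function could feed a plot, a file or a further analysis without being changed.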

Summary

We have seen how we can write more reliable code by using functions in Python. We have also used a for loop to remove further repetition from our code. Overall, our code is shorter, easier to read, and easier to maintain. The next time you are coding and find yourself doing lots of cutting and pasting, take a moment to see if you could avoid repeating yourself by turning that code into a function.

2 comments

  • Dear Joseph,

    Thank you for reading my post and commenting.

    1. The print statement is simply there to show the work that is being done, the output. In the context of a real data set, we might return these values rather than print them.

    The function called bmi_calc() allows us to encapsulate much of the work. Specifically, someone not as familiar with our code or Python coding in general will see a function called bmi_calc() and know what it does. Also, it can be reused in this program and other programs (if we make our file into a Python module).

    Your implementation does the job, but it might not be immediately clear what it is doing. For example, the variables ‘i’ and ‘num’ are uninformative.

    2. Those numbers (i.e., 1, 2, 3, 4, 5) are subject IDs. The reason to include them in the data itself is that we might not have data for some subjects, or we might not have used a numerical series of IDs.

    Your implementation would not work correctly if the data was:

    subjects = [[80, 1.62], # subject1
                [69, 1.53], # subject2
                # data from subject3 was corrupt and not used for analysis.
                [80, 1.66], # subject4
                [80, 1.79], # subject5
                [72, 1.60]] # subject6

    Thus, it is often better to not use iterators like ‘i’ that come from ‘range(len(x))’.


  • Your final version still has room to improve.
    1. Semantically, why are you calling print() in a function called bmi_calc()?
    2. What are those 1, 2, 3, 4, 5 doing in your subject list? Are they dictionary keys or IDs?


    subjects = [[80, 1.62], # subject1
                [69, 1.53], # subject2
                [80, 1.66], # subject3
                [80, 1.79], # subject4
                [72, 1.60]] # subject5

    for i, num in zip(range(1, len(subjects) + 1),
                      [int(weight_kg / height_m ** 2) for weight_kg, height_m in subjects]):
        print(f'bmi subject{i} = {num}')
