Python: An introduction to functions, part 3

We had a short intro to functions and understand that a function works with variables defined inside or outside it. Now, let’s write a function to help with data analysis.

I needed to analyse data from 3 people who had a stroke. The way these data are structured can be typical of experimental studies (on humans): each subject has a folder containing data files from different test conditions.
Here, the folders are named by subject identifiers: ID001, ID002, ID003. Within each folder there are text data files for different trials: trial1.txt, trial2.txt, trial3.txt, trial4.txt, etc. A laborious way of analysing data organised this way would be to manually enter folder ID001, run a Python script to analyse data files, exit that folder and enter folder ID002, run the same script, exit and enter folder ID003 etc. An alternative to this is to write a function creating a data structure that can be used to index subject identifiers one at a time. This way, we can have Python enter each subject’s folder and analyse data, and repeat this action for the rest of the subjects.

The following function was written to create an ordered dictionary indicating whether the left or right side was affected by stroke, and the subject identifier:

from collections import OrderedDict

def gen_id():
    Generate OrderedDict of subject index, stroke-affected side and subject ID.

    OrderedDict of subject index (key) and list of affected side and
    subject ID (value).
    e.g., OrderedDict([(1, ['l', 'ID001'])])
    id_key = OrderedDict()
    id_key[1] = ['l', 'ID001']
    id_key[2] = ['l', 'ID002']
    id_key[3] = ['r', 'ID003']
    return id_key

This is an example of a function that does not take input values, but returns a data structure generated from information entered in the body of the function.

The function above is then used to generate the dictionary:

id = gen_id()

which produces the following output:

OrderedDict([(1, ['l', 'ID001']), (2, ['l', 'ID002']), (3, ['r', 'ID003'])])

Since the data structure is a dictionary, we can use the .values() dictionary method to index the dictionary’s values.
Here, the values are lists containing affected side and subject ID. Now that we have written this function gen_id(), we can use it here and in similar analyses in future studies to generate subject identifiers (with minimal modifications to customise the code).


We implemented concepts of defining and calling a function in an example to analyse data. The code needed to package information can be used here and reused in future similar analyses.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s