Python: An introduction to functions, part 3
We had a short intro to functions and understand that a function works with variables defined inside or outside it. Now, let’s write a function to help with data analysis.
I needed to analyse data from 3 people who had a stroke. The way these data are structured can be typical of experimental studies (on humans): each subject has a folder containing data files from different test conditions.
Here, the folders are named by subject identifiers:
ID001, ID002, ID003. Within each folder there are text data files for different trials:
trial1.txt, trial2.txt, trial3.txt, trial4.txt, etc. A laborious way of analysing data organised this way would be to manually enter folder
ID001, run a Python script to analyse data files, exit that folder and enter folder
ID002, run the same script, exit and enter folder
ID003 etc. An alternative to this is to write a function creating a data structure that can be used to index subject identifiers one at a time. This way, we can have Python enter each subject’s folder and analyse data, and repeat this action for the rest of the subjects.
The following function was written to create an ordered dictionary indicating whether the left or right side was affected by stroke, and the subject identifier:
from collections import OrderedDict def gen_id(): """ Generate OrderedDict of subject index, stroke-affected side and subject ID. Returns ------- OrderedDict of subject index (key) and list of affected side and subject ID (value). e.g., OrderedDict([(1, ['l', 'ID001'])]) """ id_key = OrderedDict() id_key = ['l', 'ID001'] id_key = ['l', 'ID002'] id_key = ['r', 'ID003'] return id_key
This is an example of a function that does not take input values, but returns a data structure generated from information entered in the body of the function.
The function above is then used to generate the dictionary:
id = gen_id() print(id)
which produces the following output:
OrderedDict([(1, ['l', 'ID001']), (2, ['l', 'ID002']), (3, ['r', 'ID003'])])
Since the data structure is a dictionary, we can use the
.values() dictionary method to index the dictionary’s values.
Here, the values are lists containing affected side and subject ID. Now that we have written this function
gen_id(), we can use it here and in similar analyses in future studies to generate subject identifiers (with minimal modifications to customise the code).
We implemented concepts of defining and calling a function in an example to analyse data. The code needed to package information can be used here and reused in future similar analyses.