
7. Functions and modules

Basile Marchand (Materials Center - Mines ParisTech / CNRS / PSL University)

7.1. Let’s organize all of this!

7.1.1. Store your code in functions

In the previous parts we saw how to handle basic Python objects, how to repeat operations, and so on. However, as you have probably noticed while doing the exercises, your Python files tend to become messy, and it is easy to get tangled up. In addition, as we saw in the section on loops, a large part of a computer program consists of repeating a series of instructions many times. To keep the code clear and easy to reuse, the ideal is to split the different series of instructions into functions, and that is what we are going to do.

7.1.2. What is a function?

In mathematics, we define a function \(f\) as a mapping which associates, to an input \(x\) living in a certain space \(E\) (\(x\in E\)), an output \(y\) living in a certain space \(V\) (\(y\in V\)).

\[x \rightarrow y = f(x)\]

Well, in computer science it is the same thing; the only difference is in the vocabulary. We can define a function f as a set of instructions which, to an argument x of a certain type (int, float, list, dict, …), associates an output y of a certain type.

y = f (x)

Likewise, just as there are functions of several variables in mathematics, computer functions can also take several arguments as input.

w = f (x, y, z)

How do you define a function in Python? Quite simply, with the def statement. The syntax is as follows:

def name_of_my_function(arg1, arg2, ..., argN):
    instruction_1
    instruction_2
    ret = ...
    return ret

For example, if we want to define a function somme (sum) that takes a list as input and returns the sum of its elements, we can write:

[1]:

## Definition of the function
def somme(ma_liste):
    s = 0
    for x in ma_liste:
        s += x
    return s

une_liste = [ x+100 for x in range(10) ]
la_somme = somme( une_liste )   ## Call of the function
print("la_somme = {}".format(la_somme))
print("la_somme = {}")
print(f"la_somme = {la_somme}")

la_somme = 1045
la_somme = {}
la_somme = 1045

Several remarks:

* We must distinguish the definition of the function (the part of the code where the function is defined by specifying the series of instructions it will carry out) from the call of the function (the part of the code where those instructions are actually executed).
* The name of the argument ma_liste in the function definition is completely independent of the name of the variable I pass when I call the function. The name ma_liste is only an identifier used to handle the input variable inside the function.
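
To make these two remarks concrete, here is a minimal sketch that reuses the somme function defined above and calls it with two variables whose names (notes and prix, chosen only for this illustration) have nothing to do with ma_liste:

```python
# Definition phase: nothing is computed yet, Python only records the function.
def somme(ma_liste):
    s = 0
    for x in ma_liste:
        s += x
    return s

# Call phase: the instructions of the function are executed.
# The variables passed in do not need to be called ma_liste.
notes = [12, 9, 17]      # hypothetical data for the illustration
prix = [1.5, 2.5]        # hypothetical data for the illustration

total_notes = somme(notes)
total_prix = somme(prix)
print(total_notes, total_prix)
```

Inside the function, each of these lists is simply known as ma_liste for the duration of the call.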

Writing a function of several variables follows the same logic. For example, if we want to implement a weighted average function moyenne_ponderee, we can proceed as follows:

[2]:

def moyenne_ponderee( valeurs, ponderations ):
    s = 0
    s_w = 0
    for x, w in zip(valeurs, ponderations):
        s += w*x
        s_w += w
    return s/s_w

notes = [12,9,17,15]
poids = [1, 2, 2, 3]

s = moyenne_ponderee(notes, poids)
print("La moyenne pondérée est : {}".format(s))

La moyenne pondérée est : 13.625
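
As a possible variation, and only as a sketch, the same computation can be written with the built-in sum function and two sanity checks. The name moyenne_ponderee_sure and the choice of raising ValueError are assumptions made for this illustration, not part of the original function:

```python
def moyenne_ponderee_sure(valeurs, ponderations):
    """Weighted average with two sanity checks (hypothetical variant)."""
    # zip stops at the shortest sequence, so a length mismatch would go unnoticed.
    if len(valeurs) != len(ponderations):
        raise ValueError("valeurs and ponderations must have the same length")
    s_w = sum(ponderations)
    # Avoid a ZeroDivisionError with a clearer message.
    if s_w == 0:
        raise ValueError("the sum of the weights must not be zero")
    s = sum(w * x for x, w in zip(valeurs, ponderations))
    return s / s_w

resultat = moyenne_ponderee_sure([12, 9, 17, 15], [1, 2, 2, 3])
print(resultat)  # 13.625
```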

7.1.3. Functions and variables: the same or not?

[3]:

mean_weight = moyenne_ponderee
s_bis = mean_weight(notes, poids)
print("La moyenne pondérée est : {}".format(s_bis))

La moyenne pondérée est : 13.625

We get the same result because both names refer to the same function object: if we look at their memory addresses, they are identical.

[4]:

print("""Adresse de moyenne_ponderee : {}
Adresse de mean_weight : {}
""".format(hex(id(moyenne_ponderee)), hex(id(mean_weight))))

Adresse de moyenne_ponderee : 0x7fd7c427f950
Adresse de mean_weight : 0x7fd7c427f950

Functions therefore behave like any other variable, which implies that we can also pass a function as an argument to another function!

[5]:

def square(x):
    return x*x

def for_each(func, iterable):
    res = []
    for x in iterable:
        res.append( func(x) )

    return res

inp = [ x**0.5 for x in range(10) ]
inp2 = for_each( square, inp )
print(inp)
print(inp2)

[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]
[0.0, 1.0, 2.0000000000000004, 2.9999999999999996, 4.0, 5.000000000000001, 5.999999999999999, 7.000000000000001, 8.000000000000002, 9.0]
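
Python's built-in functions rely on the same mechanism. As a short sketch, map applies a function to every element of an iterable (much like for_each above), and sorted accepts a function through its key argument to decide the sort order. The list valeurs is just sample data:

```python
def square(x):
    return x * x

valeurs = [3, -1, 2]

# map(func, iterable) plays the same role as for_each(func, iterable).
carres = list(map(square, valeurs))
print(carres)  # [9, 1, 4]

# sorted compares the elements through the function passed as key.
tries = sorted(valeurs, key=square)
print(tries)   # [-1, 2, 3]
```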

A question you may be asking yourself is: what if I want my function to return several output variables? The answer is simple: return a tuple containing the different values you want to get back. For example, if in the previous example I want to retrieve both the result list and its size (not very useful, you will tell me), I just modify the for_each function as follows:

[6]:

def for_each2(func, iterable):
    res = []
    for x in iterable:
        res.append( func(x) )

    return res, len(res)

inp = [ x**0.5 for x in range(4) ]
out = for_each2( square, inp )
print(out[0])
print(out[1])

[0.0, 1.0, 2.0000000000000004, 2.9999999999999996]
4

You may now object that having to handle a tuple afterwards is not very practical. I can only agree, and I would even add that it hurts the readability of the code. But don’t worry, Python is pretty well thought out: you can automatically unpack a tuple into several variables as soon as the function returns. All you have to do is call the function as follows:

[7]:

ret, taille = for_each2(square, inp)
print("liste = {} , taille = {}".format(ret, taille))

liste = [0.0, 1.0, 2.0000000000000004, 2.9999999999999996] , taille = 4

Be careful, however: the number of variables to the left of the = sign in the call must match the number of values returned by the function. If it does not, Python raises an error:

>>> ret, taille, variable_en_trop = for_each2(square, inp)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-52-ae3d4eb02d50> in <module>()
----> 1 ret, taille, variable_en_trop = for_each2(square, inp)

ValueError: not enough values to unpack (expected 3, got 2)
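
Here is a small sketch of tuple unpacking in practice. The function min_max is a hypothetical helper written for this illustration, while divmod is a built-in that returns a tuple (quotient, remainder). By convention, the name _ is often used for a value you do not intend to use:

```python
def min_max(valeurs):
    # Returns a tuple of two values.
    return min(valeurs), max(valeurs)

petit, grand = min_max([4, 8, 1, 9])
print(petit, grand)       # 1 9

# The count must still match; _ simply signals "I ignore this value".
_, plus_grand = min_max([4, 8, 1, 9])
print(plus_grand)         # 9

# Built-in functions use the same mechanism.
quotient, reste = divmod(17, 5)
print(quotient, reste)    # 3 2
```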

In the same spirit, you may wonder how to define a function whose arguments have default values. To do this, simply give a value to the arguments in question when defining the function. For example, if you want to create a function incremente which increments a number by 1 by default, but can also increment it by another value specified by the user, proceed as follows:

[8]:

def incremente(x, incr=1):
    return x+incr

a = 1
print(incremente(a))
print(incremente(a, 100))

2
101

Warning (syntax rule): arguments with default values must be placed last when defining the function. For example, the following syntax is invalid:

>>> def incremente_error(incr=1, x):
>>>     return x+incr

def incremente_error(incr=1, x):
SyntaxError: non-default argument follows default argument

When a function has several arguments with default values and you want to set one or more of them to a value other than the default, the rule for calling the function is as follows:

* either the values of the arguments are given in the same order as in the definition of the function,
* or each value is preceded by the name of the argument followed by the symbol =.

[9]:

def formule(x, a=1, b=0):
    return a*x + b

# If we specify all the arguments

print( formule( 1.876, 10., 2.) )
# or
print( formule( 1.876, a=10., b=2.) )

# Partial specification
print( formule( 1.876, 2.) )
print( formule( 1.876, b=2.) )

20.759999999999998
20.759999999999998
3.752
3.876
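
One consequence of the second rule is worth a quick sketch: once the arguments are named, their order in the call no longer matters. The values below are chosen only to keep the arithmetic easy to check by hand:

```python
def formule(x, a=1, b=0):
    return a * x + b

# Named arguments can be given in any order.
r1 = formule(2.0, b=1.0, a=3.0)
print(r1)  # 3.0 * 2.0 + 1.0 = 7.0

# Even x can be named; a keeps its default value of 1.
r2 = formule(b=1.0, x=2.0)
print(r2)  # 1 * 2.0 + 1.0 = 3.0

# Note: in a call, positional values must come before named ones,
# so formule(a=3.0, 2.0) is a SyntaxError.
```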

To finish with functions, there is one point left to cover: how to define functions taking a variable number of arguments, which can sometimes be useful. A first solution, which requires no particular syntax, is to define your function as taking a single tuple as input, in which you store all your arguments before calling the function. For example:

[10]:

def fonction_arg_variable( args ):
    print("La fonction est appelée avec {} arguments qui ont pour valeurs {}".format( len(args), args))

une_variable = 1
une_autre = False
encore_une_autre = [1,2,3]
func_args = (une_variable, une_autre, encore_une_autre)
fonction_arg_variable( func_args )
func_args_2  = (une_variable, encore_une_autre)
fonction_arg_variable( func_args_2 )

La fonction est appelée avec 3 arguments qui ont pour valeurs (1, False, [1, 2, 3])
La fonction est appelée avec 2 arguments qui ont pour valeurs (1, [1, 2, 3])

You could object that this does the job but is still not very practical, because you have to build a tuple by hand before each call. And you would be right. This is why Python provides the *args syntax, which gives the same behavior as before without the step of defining a tuple. Taking the previous example again:

[11]:

def fonction_arg_variable_star( *args ):
    print("La fonction est appelée avec {} arguments qui ont pour valeurs {}".format( len(args), args))

une_variable = 1
une_autre = False
encore_une_autre = [1,2,3]
### Call the function directly with the arguments,
### without creating a tuple
fonction_arg_variable_star( une_variable, une_autre, encore_une_autre )
fonction_arg_variable_star( une_variable, encore_une_autre )

La fonction est appelée avec 3 arguments qui ont pour valeurs (1, False, [1, 2, 3])
La fonction est appelée avec 2 arguments qui ont pour valeurs (1, [1, 2, 3])
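
If the arguments already happen to be stored in a tuple, the same star can be used at the call site to unpack it. The sketch below uses a simplified version of fonction_arg_variable_star that returns its result instead of printing it, purely to make the difference easy to observe:

```python
def fonction_arg_variable_star(*args):
    # Simplified variant: returns what it received instead of printing it.
    return len(args), args

func_args = (1, False, [1, 2, 3])

# With the star, the tuple is unpacked into three separate arguments.
n_unpacked, recu = fonction_arg_variable_star(*func_args)
print(n_unpacked)  # 3

# Without the star, the whole tuple is received as one single argument.
n_tuple, _ = fonction_arg_variable_star(func_args)
print(n_tuple)     # 1
```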

Finally, there is another way to define a function with a variable number of arguments: the **kwargs syntax. This second syntax solves a problem associated with *args: when calling a function defined with *args, the arguments must be given in the order expected by the definition of the function for it to behave as intended. Illustration:

[12]:

def fonction_args(exposant, *args):
    """ The function is implemented such that:
        exposant -> a float
        args[0] -> a boolean
        args[1:] -> floats
    """
    if len(args) < 1:
        return exposant**exposant
    if args[0] is True:
        s = 0
        for x in args[1:]:
            s += x**exposant
        return s
    else:
        s=0
        for x in args[1:]:
            s += x**(1./exposant)
        return s

### Call with only the positional argument
print( fonction_args(2.) )
### Call with all the arguments (in the right order, so correct behavior)
print( fonction_args(2., True, 1.,2.,3.,4.) )
### Call with all the arguments (the first two are swapped, so incorrect behavior)
print( fonction_args(True, 2., 1.,2.,3.,4.) )

4.0
30.0
10.0

The **kwargs syntax is used as shown below:

[13]:

def fonction_arg_variable_nommes( **kwargs ):
    print("La fonction est appelée avec {} arguments qui ont pour valeurs {}".format( len(kwargs), kwargs))
    print("kwargs est de type : {}".format(type(kwargs)))

une_variable = 1
une_autre = False
encore_une_autre = [1,2,3]
### Call the function directly with named arguments,
### without building a dictionary by hand
fonction_arg_variable_nommes( mon_arg_1=une_variable, mon_arg_2=une_autre, mon_arg_3=encore_une_autre )
fonction_arg_variable_nommes( mon_arg_1=une_variable, mon_arg_3=encore_une_autre )

La fonction est appelée avec 3 arguments qui ont pour valeurs {'mon_arg_1': 1, 'mon_arg_2': False, 'mon_arg_3': [1, 2, 3]}
kwargs est de type : <class 'dict'>
La fonction est appelée avec 2 arguments qui ont pour valeurs {'mon_arg_1': 1, 'mon_arg_3': [1, 2, 3]}
kwargs est de type : <class 'dict'>

We see that the kwargs object is a dictionary whose keys are the names given to the arguments when the function is called.
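
To come back to the ordering problem illustrated with fonction_args, here is one possible sketch of how named arguments avoid it. The function somme_puissances, the option names exposant and inverse, and their default values are all assumptions made for this example. Since kwargs is a dictionary, its get method can supply a default when an option is not given:

```python
def somme_puissances(*valeurs, **kwargs):
    # Options are read by name, so their order in the call is irrelevant.
    exposant = kwargs.get("exposant", 2.0)   # hypothetical default
    inverse = kwargs.get("inverse", False)   # hypothetical default
    puissance = 1.0 / exposant if inverse else exposant
    return sum(x ** puissance for x in valeurs)

a = somme_puissances(1., 2., 3., 4., exposant=2., inverse=False)
b = somme_puissances(1., 2., 3., 4., inverse=False, exposant=2.)
print(a, b)  # 30.0 30.0
```

Swapping the two named options no longer changes the result, unlike the positional version.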

7.1.4. Anonymous functions

There is in fact a second way to define functions in Python: anonymous functions, also called lambda functions. The syntax for defining them is as follows:

my_anonymous_function = lambda arg1, arg2, arg3: return_value

We can see that the syntax is quite different from that of the def keyword. This type of function is meant for short functions, essentially mathematical ones. With this syntax we are very close to what we would write on a sheet of paper.

For example if we program the function rms (for Root Mean Square) of three variables which is expressed mathematically by:

\[rms(x,y,z) = \left( \frac{1}{3} \left[ x^2 + y^2 + z^2 \right] \right)^{\frac{1}{2}}\]

we can write an anonymous function:

[14]:

rms = lambda x,y,z: (1./3. * (x**2+y**2+z**2) )**(1./2.)

print(rms(1,2,1))

1.4142135623730951
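
Anonymous functions are particularly handy when a function is only needed once, as an argument to another function. As a sketch, the list points below (sample data) is sorted by distance to the origin, with the distance written directly as a lambda passed to the key argument of sorted:

```python
points = [(3, 4), (1, 1), (0, 2)]

# The lambda computes the Euclidean norm of each point.
tries = sorted(points, key=lambda p: (p[0]**2 + p[1]**2) ** 0.5)
print(tries)  # [(1, 1), (0, 2), (3, 4)]

# A lambda and a def are interchangeable for such a short function.
def norme(p):
    return (p[0]**2 + p[1]**2) ** 0.5

meme_resultat = sorted(points, key=norme)
print(meme_resultat == tries)  # True
```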

7.1.5. The scope of variables

To finish this presentation of the syntax and the rules for defining a function in Python, we will look at what is called the scope of variables. First of all, the following example shows that a variable defined inside a function can only be used within that function. To the outside world, it does not exist.

>>> def add_2(a):
>>>    b = 2      ### The variable b is created inside the function
>>>    c = a + b
>>>    print( "c = {}".format(c) )

>>> une_valeur = 1.

>>> add_2( une_valeur )

c = 3.0

>>> print( b )     ### Error: outside the function, b does not exist

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-53-7c321ed8f944> in <module>()
      9 add_2( une_valeur )
     10
---> 11 print( b )     ### Error: outside the function, b does not exist

NameError: name 'b' is not defined

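The following example illustrates that, inside a function, Python can see the variables defined outside of it (here, at the top level of the file), provided they exist at the time the function is called. In this first attempt d is not defined anywhere, so the call fails: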

[15]:

def add_3(a):
    c = a + d
    print("c = {}".format(c))

une_valeur = 1

>>> add_3(une_valeur)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-54-43a5a1054963> in <module>()
      4
      5 une_valeur = 1
----> 6 add_3(une_valeur)

<ipython-input-54-43a5a1054963> in add_3(a)
      1 def add_3(a):
----> 2     c = a + d
      3     print("c = {}".format(c))
      4
      5 une_valeur = 1

NameError: name 'd' is not defined

If we define the variable d outside the function, before the call to add_3, there is no longer any error when executing the code.

[16]:

d = 10
add_3(une_valeur)

c = 11

On the other hand, a function cannot reassign a variable defined outside of it: it only sees such variables in read-only mode. In the following example, the assignment e = 0 makes e a local variable for the whole body of add_4, so reading e on the first line of the function fails when the code is executed.

def add_4(a):
    c = a + e
    print("c = {}".format(c))
    e = 0

e = 10
add_4(une_valeur)
print(e)

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-56-703da7c8669b> in <module>()
      6
      7 e = 10
----> 8 add_4(une_valeur)
      9 print(e)

<ipython-input-56-703da7c8669b> in add_4(a)
      1 def add_4(a):
----> 2     c = a + e
      3     print("c = {}".format(c))
      4     e = 0
      5

UnboundLocalError: local variable 'e' referenced before assignment

Note: Python has a keyword, global, which makes it possible to override the scoping rules presented above. We will not discuss it here, because its use is strongly discouraged: it leads to code that is hard to maintain and extend, and whose behavior can be unpredictable.
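
A common way of doing without global, sketched below, is to pass the outside value as an argument and to return the new value, which the caller then assigns itself. The function incremente_compteur and the variable compteur are hypothetical names chosen for this illustration:

```python
def incremente_compteur(compteur, pas=1):
    # The function only works on its arguments and returns a new value;
    # it never touches a variable defined outside.
    return compteur + pas

compteur = 10
compteur = incremente_compteur(compteur)          # 11
compteur = incremente_compteur(compteur, pas=5)   # 16
print(compteur)
```

The update of compteur is then visible at the place of the call, which keeps the data flow easy to follow.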

7.1.6. Let’s take it a step further and put the functions away

We have just seen how we can organize our code into functions in order to have a simple and reusable program. For small single-use codes this is quite sufficient. On the other hand you can easily imagine that for a complex program having many functions it is necessary to push the organization and the architecture of the code further.

To go further in organizing the code, the idea is to store your functions in files, and to group them sensibly by category: for example, one file for all calculation and data-processing functions, one file for all output-writing functions, one file for all display and visualization functions, etc.

Distributing your functions across files is very simple: create a file with the extension .py and place the definitions of your functions inside it.

Warning: there are some rules to follow when naming your files. First of all, do not use spaces or special characters (é, è, à, !, ?, …) in your file names. In addition, the PEP 8 convention recommends short, lowercase file names (underscores are allowed if they improve readability).

For example, let’s create a file mesFonctions.py in which we will store a number of Python functions.

[17]:

%%file mesFonctions.py

##### File : mesFonctions.py

def fonction_calcul():
    return None

##### end of file test.py

Writing mesFonctions.py

[18]:

!ls
!cat mesFonctions.py

01_introduction_et_setup.ipynb  YY_projets.ipynb
02_variables_et_controle.ipynb  _build
03_conteneurs_et_boucles.ipynb  conf.py
04_fonctions_et_modules.ipynb   data
05_classes_et_poo_base.ipynb    index.rst
06_numpy.ipynb                  media
07_scipy.ipynb                  mesFonctions.py
Makefile                        sphinx_ipypublish_all.ext.custom.json
XX_exercices.ipynb

##### File : mesFonctions.py

def fonction_calcul():
    return None

##### end of file test.py

Now the question is: how do we tell Python that there is a file mesFonctions.py containing a set of functions that I want to use in my main program? The answer is simple: use the import keyword, which has four modes of use.

The first corresponds to the syntax below. In this case you must prefix the function with the name mesFonctions each time you want to use a function contained in the file mesFonctions.py.

import mesFonctions
...
mesFonctions.aFunctionOfFile(args)

The second syntax stems from the fact that developers are generally lazy and try to type as few characters as possible. For this reason, a module can be renamed at import time.

import mesFonctions as mf
...
mf.aFunctionOfFile(args)

The third syntax lets you specify, at import time, which functions you are going to use, so that only those names are made available in your program.

from mesFonctions import aFunctionOfFile, anotherFunction
...
aFunctionOfFile(args)
...
anotherFunction(args2)

Finally, the last syntax imports all the functions contained in a file, so that you can then use them without the file name as a prefix.

from mesFonctions import *
...
aFunctionOfFile(args)
...
anotherFunction(args2)
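
To try the first three syntaxes without having to create a file, here is a sketch that applies them to the math module of the standard library instead of mesFonctions (the alias m is an arbitrary choice):

```python
# 1) Plain import: the module name is used as a prefix.
import math
r1 = math.sqrt(16.0)

# 2) Import with an alias: a shorter prefix.
import math as m
r2 = m.sqrt(16.0)

# 3) Import of selected names: no prefix at all.
from math import sqrt
r3 = sqrt(16.0)

print(r1, r2, r3)  # 4.0 4.0 4.0
print(m is math)   # True: the alias designates the very same module
```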

Warning: although this last mode of use may seem convenient, it is a bad idea to use it. A simple example: suppose two files each contain a function with the same name but different behavior. If you use the from ... import * syntax on both, one of the two functions will be overwritten by the other and therefore become inaccessible.
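
This name clash can be observed with two modules of the standard library, math and cmath, which both define a function called sqrt. The sketch below is meant to be run at the top level of a script or notebook cell, since Python only allows the star form there:

```python
from math import *
from cmath import *   # silently replaces sqrt, exp, log, ... coming from math

r = sqrt(4)
print(r)              # (2+0j): this is cmath.sqrt, not math.sqrt
print(type(r))        # <class 'complex'>

# With explicit prefixes, both functions stay accessible and unambiguous.
import math
import cmath
print(math.sqrt(4))   # 2.0
print(cmath.sqrt(4))  # (2+0j)
```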

For the import keyword to work without problems, you need to pay attention to where the file mesFonctions.py is located relative to the main file, i.e. the one containing the import line.

If the two files are side by side, there is no problem: the import will proceed without a hitch (provided there is no syntax error in the file mesFonctions.py).

On the other hand, if the file mesFonctions.py is not located in the same folder as the file script_principal.py, the import will fail if you do nothing. We must help Python find the file mesFonctions.py when it is not next to the main script. To do so, we need to extend the PYTHONPATH.

Under Linux or Mac OS, PYTHONPATH is an environment variable, and an easy way to extend it is to type the following command in a console:

export PYTHONPATH=/path/to/the/folder:$PYTHONPATH

Another, perhaps simpler, solution is to extend the module search path from within your main Python program, through the list sys.path. This is done as follows:

[19]:

import sys

sys.path.append("/chemin/vers/le/dossier/")
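
To see the mechanism at work from start to finish, here is a self-contained sketch. It writes a small module into a temporary folder, adds that folder to sys.path, and then imports the module. The module name mes_fonctions_demo and its function double are invented for this illustration, and the temporary folder stands in for your own /path/to/the/folder:

```python
import importlib
import os
import sys
import tempfile

# Create a folder that is NOT next to the main script, with a module inside.
dossier = tempfile.mkdtemp()
with open(os.path.join(dossier, "mes_fonctions_demo.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# Tell Python to also look for modules in this folder.
if dossier not in sys.path:
    sys.path.append(dossier)

# Recommended when a module file has just been created by the running program.
importlib.invalidate_caches()

import mes_fonctions_demo
print(mes_fonctions_demo.double(21))  # 42
```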

7.2. Python modules

7.2.1. What is a module and where can I find them?

We saw in the previous part that Python code can be distributed across files. Naturally, people started doing that and sharing their code on the Internet, and this is how modules were born. A module is thus a set of additional features that can be imported into Python code with the import command. Over the years, a huge library of open source modules has grown, thanks in particular to a very active Python user community.

Among all the available modules, two categories must be distinguished:

* Modules from the standard Python library: a restricted set installed by default with Python, whatever your installation.
* Other modules, which are not available by default and must be installed before you can use them.

7.2.2. The standard Python library

The standard Python library includes about 200 modules of all kinds. For the list of available modules, you can go to the official Python site (https://docs.python.org/3/library/index.html). We will of course not cover all of them, especially since many will not be useful to us. We will only focus on the few modules of the standard library that can be useful to you in everyday life.

*The math and cmath modules*

The first module that will certainly be of use to you one day is the math module. As its name suggests, it is a module defining a number of mathematical functions. The loading of this module is of course done using the import command according to one of the 4 syntaxes presented in the previous part.

Among the functions defined there are sin, cos, log, exp and many others. For an exhaustive list of the functions contained in the math module you can:

* go to the following address: https://docs.python.org/3/library/math.html
* type help(math) in a Python prompt or a notebook:

    import math
    help(math)

The math module also defines a number of mathematical constants:

[20]:

import math
print("math.pi : {}".format(math.pi))
print("math.e  : {}".format(math.e))

math.pi : 3.141592653589793
math.e  : 2.718281828459045

There is a variant of the math module dedicated to complex numbers: the cmath module.
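
The difference between the two modules shows up as soon as a result leaves the real numbers. A minimal sketch: math.sqrt refuses a negative argument, whereas cmath.sqrt returns a complex number.

```python
import math
import cmath

# math only works with real numbers: a negative argument raises an error.
try:
    math.sqrt(-1.0)
    erreur = None
except ValueError as err:
    erreur = err
print(erreur is not None)  # True

# cmath returns a complex number instead.
z = cmath.sqrt(-1.0)
print(z)                   # 1j

# Euler's identity exp(i*pi) + 1 = 0, up to floating-point rounding.
ecart = abs(cmath.exp(1j * math.pi) + 1)
print(ecart < 1e-12)       # True
```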

*The os module*

The os module allows you to interact with the computer’s operating system. The great thing about this module is that it has been designed so that no matter what operating system you are using (Windows, Mac OS or Linux) the functions of the module are the same (although at a lower level this is not the case at all). This makes it possible in particular to design programs that are cross-platform. Among the useful functions of the module there are among others:

* os.listdir, which lists all the files and folders of a directory;
* os.path.isdir, which tests whether a given path corresponds to a folder;
* os.mkdir, which creates a folder;
* and many others.
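
Here is a short sketch combining these three functions. To avoid touching your own files, it works inside a temporary folder created with the tempfile module; the names resultats and notes.txt are arbitrary:

```python
import os
import tempfile

racine = tempfile.mkdtemp()                     # empty temporary folder

# Create one sub-folder and one file inside it.
os.mkdir(os.path.join(racine, "resultats"))
with open(os.path.join(racine, "notes.txt"), "w") as f:
    f.write("demo\n")

# List the content (sorted, because listdir guarantees no particular order).
contenu = sorted(os.listdir(racine))
print(contenu)   # ['notes.txt', 'resultats']

# Keep only the entries that are folders.
dossiers = [nom for nom in contenu
            if os.path.isdir(os.path.join(racine, nom))]
print(dossiers)  # ['resultats']
```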

Among the useful features available in the os module are those related to path management. To use them, load the os.path submodule. Why bother with file paths, you may ask? Once again, for reasons of compatibility between operating systems. On Linux and Mac OS (both Unix-like systems), file and folder paths are of the form /here/a/path, while on Windows they are of the form C:\un\path\windows. The most used function of the os.path module is the join function. Below is an example of use.

[21]:

import os.path

un_chemin = os.path.join("partie_1", "partie_2")

print( un_chemin )

chemin, fichier = os.path.split("/un/chemin/vers/un_fichier.txt")
print(chemin)
print(fichier)

partie_1/partie_2
/un/chemin/vers
un_fichier.txt
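
A few other functions of os.path are often used together with join and split. The sketch below builds a path from sample folder and file names, then takes it apart again with dirname, basename and splitext:

```python
import os.path

# Build a path with the separator of the current operating system.
chemin = os.path.join("donnees", "essai_01", "mesures.csv")

dossier = os.path.dirname(chemin)            # everything before the last separator
nom = os.path.basename(chemin)               # 'mesures.csv'
racine, extension = os.path.splitext(nom)    # ('mesures', '.csv')

print(dossier)
print(nom, racine, extension)
```

Because the path is built with join, the same code gives a valid path on Linux, Mac OS and Windows.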

7.2.3. And many other things

This is only a very brief overview of the possibilities offered by the standard Python library. If you are curious, I strongly urge you to take a look at https://docs.python.org/3/library/ to get a more global view of what the language offers. You will find, among other things, modules for graphical interfaces, for setting up a TCP server, for handling program input arguments, etc.