# Day 2 Program

- Control Flow
- Conditionals
- Loops
- Dictionaries
- Arguments of Functions
- List comprehension
- Sets
- Reading and Writing Files (the hard way)

# Control Flow

In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an imperative programming language from a declarative programming language.
[https://en.wikipedia.org/wiki/Control_flow](https://en.wikipedia.org/wiki/Control_flow)

Essentially, a program should be able to "decide":
- whether to execute a set of statements or not (*conditionals*)
- whether to repeat the execution of a set of statements a number of times (*loops*)



# Boolean Expressions

A boolean expression is an expression that evaluates to either ```True``` or ```False```. These are special values that belong to the type `bool`; they are not strings.



In [1]:
a = 3
b = 5
a == b

False

In [2]:
b - a == 2

True

In [3]:
type(True)

bool

# What is True?

Any object can be tested for truth value, for use in an `if` or `while` condition or as operand of the Boolean operations below.

By default, an object is considered true unless its class defines either a `__bool__()` method that returns False or a `__len__()` method that returns zero, when called with the object. 
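The rule above can be illustrated with a minimal sketch. The classes `Basket`, `Switch` and `Plain` are hypothetical names made up for this example; they are not part of any library.

```python
class Basket:
    """Hypothetical container: its truth value follows __len__()."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Switch:
    """Hypothetical object: its truth value follows __bool__()."""
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on


class Plain:
    """Defines neither method, so instances are always considered true."""


print(bool(Basket([])))        # False: __len__() returns zero
print(bool(Basket(['egg'])))   # True
print(bool(Switch(False)))     # False: __bool__() returns False
print(bool(Plain()))           # True: default behaviour
```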

# What is False? 

Here are most of the built-in objects considered false:
* constants defined to be false: `None` and `False`
* zero of any numeric type: `0`, `0.0`, `0j`, `Decimal(0)`, `Fraction(0, 1)`
* empty sequences and collections: `''`, `()`, `[]`, `{}`, `set()`, `range(0)`

Operations and built-in functions that have a Boolean result always return `0` or `False` for false and `1` or `True` for true, unless otherwise stated. (Important exception: the Boolean operations `or` and `and` always return one of their operands.)

In [4]:
0 == False

True
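The exception noted above for `or` and `and` is easy to miss, so here is a short sketch of it. The values are arbitrary and chosen only for illustration.

```python
# `or` returns the first truthy operand, or the last operand if none is truthy
print('' or 'default')    # default
print(0 or [] or None)    # None

# `and` returns the first falsy operand, or the last operand if all are truthy
print(3 and 5)            # 5
print([] and 5)           # []

# comparisons, by contrast, always return a bool
print(type(3 < 5))        # <class 'bool'>
```

A common use of this behaviour is supplying a fallback value, as in the first line of the sketch.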

# Operators

Comparison
```python
x == y               # x is equal to y
x != y               # x is not equal to y
x > y                # x is greater than y
x < y                # x is less than y
x >= y               # x is greater than or equal to y
x <= y               # x is less than or equal to y
```


Logical

`and`, `or`, `not`


# Conditionals


```python
if condition_1:
    statement_1a
    ...
elif condition_2:
    statement_2a
    ...
else:
    statement_3a
    ...
```
- indented blocks are mutually exclusive (only one is executed)
- ```elif``` and ```else``` are optional

# Loops: While

```python
while <test>:
    <statements>
    if <test>: break        # exit loop now, skip else
    if <test>: continue     # go to top of loop now
else:
    <statements>            # if we didn't hit a break
```

1. Determine whether the condition is true or false.
1. If false, exit the while statement and continue execution at the next statement.
1. If the condition is true, run the indented block and then go back to step 1.

# Loops: For

The Python for loop begins with a header line that specifies an assignment target (or targets), along with an object you want to step through. The header is followed by a block of indented statements, which you want to repeat:

```python
for <target> in <object>:   # assign object items to target
    <statements>
    if <test>: break        # exit loop now, skip else
    if <test>: continue     # go to top of loop now
else:
    <statements>            # if we didn't hit a 'break'
```


In [5]:
L1 = [2,1,3]
L2 = []
if L1:
    print('L1 evaluates as if it were True')
if L2:
    print('L2 evaluates as if it were True')


L1 evaluates as if it were True


In [6]:
x = 3
y = 5
if x < y:
    print('x is less than y')
elif x > y:
    print('x is greater than y')
else:
    print('x and y are equal')

x is less than y


In [7]:
n = 10
while n > 0:
    print(n)
    n = n - 1
print('Blastoff!')

10
9
8
7
6
5
4
3
2
1
Blastoff!


# Loop special statements

In Python:

- `break` Jumps out of the closest enclosing loop (past the entire loop statement).
- `continue` Jumps to the top of the closest enclosing loop (to the loop’s header line).
- `pass` Does nothing at all: it’s an empty statement placeholder.
- `loop else block` Run if and only if the loop is exited normally, that is, without hitting a break.
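The four statements listed above can be seen together in one small sketch. The list `numbers` is made-up data, and treating a negative value as a stop marker is an assumption of this example only.

```python
numbers = [4, 7, 0, 9, -1, 12]   # made-up data; a negative value means "stop"

total = 0
for n in numbers:
    if n == 0:
        continue          # skip zeros and go on with the next item
    if n < 0:
        break             # leave the loop at the first negative value
    total = total + n
else:
    # not executed here, because the loop was left through break
    print('no negative value found')

print(total)              # 20

for n in numbers:
    pass                  # placeholder: the loop runs but does nothing
```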

# For as a sequence iterator

The `for` loop is a generic **sequence iterator** in Python: it can step through the items of any sequence and, more generally, of any iterable object. The `for` loop works on strings, lists, tuples, ranges, and other iterables.


In [8]:
for i in range(5):
    print(i**2)


0
1
4
9
16


A string is a sequence, so you can use `for` to get one character at a time.

In [9]:
for char in 'Donald Duck':
    print(char)

D
o
n
a
l
d
 
D
u
c
k


# The Collatz Conjecture

Start from any positive integer *n*. At every step do one of the following to get a new number:
- if the number is even, divide it by two
- if the number is odd, multiply it by three and add 1

For example, say you start from 6. 6 is even, so the next number is 6/2=3. 3 is odd so we multiply it by 3 and add 1 to get 10. 10 is even, we get 5. 5 is odd, so we get 16. Then we divide by 2, to get 8. Then again we divide by 2, to get 4 and from 4 we get 2, then 1, then back to 4 and we end up in a cycle 4, 2, 1.

The Collatz Conjecture says that no matter the number you start from, you will end up in the 4, 2, 1 cycle.

The conjecture is still unproven: no one knows whether it is true or not.

Write code that prints the sequence generated by a given starting number *n*.

In [10]:
n = 22
print(n)
while n != 1:
    if n % 2 == 0:
        n = n//2
    else:
        n = 3*n + 1
    print(n)


22
11
34
17
52
26
13
40
20
10
5
16
8
4
2
1
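The same loop can be wrapped in a function that returns the sequence instead of printing it. This is only a sketch: the name `collatz_sequence` and the check on the input are assumptions of this example.

```python
def collatz_sequence(n):
    """Return the Collatz sequence from n down to 1 (n must be a positive integer)."""
    if n < 1:
        raise ValueError('n must be a positive integer')
    sequence = [n]
    while n != 1:
        if n % 2 == 0:
            n = n // 2
        else:
            n = 3 * n + 1
        sequence.append(n)
    return sequence

seq = collatz_sequence(6)
print(seq)             # [6, 3, 10, 5, 16, 8, 4, 2, 1]
print(len(seq) - 1)    # 8 steps to reach 1
```

Returning a list makes it easy to count the steps or to compare sequences for different starting numbers.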


# Finding the first negative element in a list

Say we have a list `L` and we’re looking for the index of the first negative element of the list. Our code is:

In [11]:
L = [1, 2, -3, -6, 4]
i = 0
while L[i] >= 0:
    i = i + 1
print(i)


2


The code works, even if the negative element is the first of the list. However, it is not correct. Why?

Because if the list has no negative elements, the index `i` runs past the end of the list and an exception is raised.

In [12]:
L = [1, 2, 3, 6, 4]
i = 0
while L[i] >= 0:
    i = i + 1
print(i)


IndexError: list index out of range

We can fix this by adding a condition that guarantees the index stays in range. Because `and` short-circuits, `L[i]` is evaluated only when `i < len(L)` is true.

In [13]:
L = [1, 22, 3, 4, 6]
i = 0
while i < len(L) and L[i] >= 0:   # short-circuiting
    i = i + 1
if i < len(L):
    print(i)
else:
    print('No negative elements')

No negative elements


The "find the negative element" search can be rewritten with `for`, since a list is an iterable:

In [14]:
L1 = [1, 22, -3, 4, -6]
found = False   # flag
for el in L1:
    if (not found) and el < 0:
        print(el)
        found = True
if not found:
    print('No negative elements')

-3


But we want the index, not the element ...

In [15]:
L1 = [1, 22, -3, 4, -6]
found = False   # flag
i = 0
for el in L1:
    if (not found) and el < 0:
        print(i)
        found = True
    i = i + 1
if not found:
    print('No negative elements')

2


Better solution:

In [17]:
L1 = [1, 22, -3, 4, -6]
found = False   # flag
for indice, elemento in enumerate(L1):
    if (not found) and elemento < 0:
        print(indice)
        found = True
if not found:
    print('No negative elements')

2


With break statement

In [None]:
L1 = [1, 22, -3, 4, -6]
not_found = True   # flag
for i, el in enumerate(L1):
    if not_found and el < 0:
        print(i)
        not_found = False
        break
if not_found:
    print('No negative elements')

At first sight nothing has changed. However, this last version performs exactly the number of iterations needed (three, in this case): the scan of the list stops as soon as the first negative element is found.

A better version?

In [18]:
L1 = [1, 22, -3, 4, -6]
for i, el in enumerate(L1):
    if el < 0:
        print(i)
        break
else:
    print('No negative elements')

2


The above version does not need the flag. However, the usage (and meaning) of the `else` clause in the for loop (and in the while loop) is known to be confusing to some readers.

## Dictionaries

Dictionaries are **mutable** collections of key-value pairs. They exist in other languages too, where they are also called maps or “associative arrays”.
In Python they are enclosed by braces. Every pair, called an item, contains a key and a value.

In [19]:
d1 = {'first_name': 'Donald', 'last_name': 'Duck'}
print(d1['first_name'])     # Donald
print(d1['last_name'])      # Duck

Donald
Duck


The syntax is the same as for lists, but you use a key (which can be any hashable object, such as a string, a number, or a tuple of immutable items)
instead of an index (which must be an integer).

You may change, add and remove items using the same syntax.

- The `get` method returns the value, given the key.
- The `pop` method returns the value, given the key, and removes the item.



In [26]:
d1 = {'first_name': 'Donald', 'last_name': 'Duck'}
print(d1['first_name'])     # Donald
print(d1['last_name'])      # Duck

d1['birth_year'] = 1931
print(d1)
d1['birth_year'] = 1932
print(d1)
del(d1['birth_year'])
print(d1)
d1['birth_year'] = 1931
print(d1)

Donald
Duck
{'first_name': 'Donald', 'last_name': 'Duck', 'birth_year': 1931}
{'first_name': 'Donald', 'last_name': 'Duck', 'birth_year': 1932}
{'first_name': 'Donald', 'last_name': 'Duck'}
{'first_name': 'Donald', 'last_name': 'Duck', 'birth_year': 1931}


In [27]:
print(d1.keys())
print(d1.values())
print(d1.items())

for k,v in d1.items():
    print(f'Key: {k} -> Value: {v}')

print(d1['partner'])

dict_keys(['first_name', 'last_name', 'birth_year'])
dict_values(['Donald', 'Duck', 1931])
dict_items([('first_name', 'Donald'), ('last_name', 'Duck'), ('birth_year', 1931)])
Key: first_name -> Value: Donald
Key: last_name -> Value: Duck
Key: birth_year -> Value: 1931


KeyError: 'partner'


To avoid the above problem, one can use the method get. From [https://docs.python.org/3/library/stdtypes.html#mapping-types-dict](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict)
>get(key[, default])
>
>Return the value for key if key is in the dictionary, else default. If default is not given,  it defaults to None, so that this method never raises a KeyError.


In [28]:
print(d1.get('partner','Single'))
print(d1)
print(d1.get('first_name','Unknown'))
print(d1.pop('first_name','Unknown'))
print(d1)


Single
{'first_name': 'Donald', 'last_name': 'Duck', 'birth_year': 1931}
Donald
Donald
{'last_name': 'Duck', 'birth_year': 1931}



## More on Functions

Functions are defined with the ```def``` statement.

The first part of a function definition specifies the function name and parameter names that represent input values. The body of a function is a sequence of statements that execute when the function is called or applied. 

The standard naming convention for functions is to use lowercase letters with an underscore ( _ ) used as a word separator—for example, `read_data()`. If a function is not meant to be used directly because it’s a helper or some kind of internal implementation detail, its name usually has a single underscore prepended to it—for example, `_helper()`. These are only conventions, however. 

### Arguments

Arguments are fully evaluated left-to-right before executing the function body. For example, `add(1+1, 2+2)` is first reduced to ```add(2, 4)``` before calling the function. The order and number of arguments must match the parameters given in the function definition. If a mismatch exists, a `TypeError` exception is raised. The structure of calling a function (such as the number of required arguments) is known as the function’s call signature.

A function may have multiple `return` statements, but only one is executed.
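A minimal sketch of these points follows. The text mentions `add` without defining it, so the definition below is an assumption; `sign` is a hypothetical helper used only to show multiple `return` statements.

```python
def add(x, y):
    return x + y

print(add(1 + 1, 2 + 2))     # 6: the call is evaluated as add(2, 4)

try:
    add(1)                   # one required argument is missing
except TypeError as exc:
    print('TypeError:', exc)

def sign(x):
    # three return statements, but each call executes exactly one of them
    if x > 0:
        return 1
    if x < 0:
        return -1
    return 0

print(sign(-7), sign(0), sign(3))   # -1 0 1
```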

### Default arguments

You can attach default values to function parameters by assigning values in the function definition. For example:



In [1]:
def enroll_student(firstname, lastname, ug_university = 'Bocconi'):
    d = {}
    d['firstname'] = firstname    
    d['lastname'] = lastname
    d['ug_university'] = ug_university
    return d

d = enroll_student('Mickey', 'Mouse', 'NYU')
print(d)
d = enroll_student('Minnie', 'Mouse')
print(d)


{'firstname': 'Mickey', 'lastname': 'Mouse', 'ug_university': 'NYU'}
{'firstname': 'Minnie', 'lastname': 'Mouse', 'ug_university': 'Bocconi'}


### Defaults and optional parameters

When a function defines a parameter with a default value, *that parameter, and all the parameters that follow it, are optional*. It is not possible to specify a parameter with no default value after any parameter with a default value.
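Both statements can be checked with a short sketch. The function `greet` is a hypothetical example; the invalid definition is compiled from a string only so that the resulting `SyntaxError` can be caught without stopping the cell.

```python
def greet(name, greeting='Hello', punctuation='!'):
    return f'{greeting}, {name}{punctuation}'

print(greet('Minnie'))                 # Hello, Minnie!
print(greet('Minnie', 'Ciao'))         # Ciao, Minnie!
print(greet('Minnie', 'Ciao', '?'))    # Ciao, Minnie?

# A parameter without a default cannot follow one with a default.
bad_definition = "def greet(greeting='Hello', name):\n    pass\n"
try:
    compile(bad_definition, '<example>', 'exec')
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)                        # True
```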

Another example with datetime objects.



In [44]:
import datetime

str_date_1 = '09/12/2023'
d = datetime.datetime.strptime(str_date_1, '%d/%m/%Y')
print(type(d))
print(d.strftime('%a %d %b %Y'))
str_date_2 = '09/16/2023'
d = datetime.datetime.strptime(str_date_2, '%m/%d/%Y')
print(d.strftime('%a %d %b %Y'))

def check_date(str_date, format = '%d/%m/%Y'):
    try:
        d = datetime.datetime.strptime(str_date, format)
        return True
    except ValueError:
        return False

print(check_date(str_date_1))
print(check_date(str_date_2, format='%m/%d/%Y'))



<class 'datetime.datetime'>
Sat 09 Dec 2023
Sat 16 Sep 2023
True
True


### Evaluation of Default Parameters

Default parameter values **are evaluated once** when the function is first defined, *not each time the function is called*. This often leads to surprising behavior if mutable objects are used as a default.



In [None]:
def func(x, items=[]):
    items.append(x)
    return items

print(func(1))
print(func(2)) 
print(func(3)) 

Notice how the default argument retains the modifications made from previous invocations. To prevent this, it is better to use None and add a check.



In [None]:
def func(x, items=None):
    if items is None:
        items = []
    items.append(x)
    return items

print(func(1))
print(func(2)) 
print(func(3)) 


As a general practice, to avoid such surprises, only use immutable objects for default argument values—numbers, strings, Booleans, `None`, and so on.
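The two approaches can be compared side by side. In this sketch the functions are given different, hypothetical names (`append_shared` and `append_fresh`) so that both can live in the same cell.

```python
def append_shared(x, items=[]):      # the default list is created only once
    items.append(x)
    return items

def append_fresh(x, items=None):     # a new list is created on each call
    if items is None:
        items = []
    items.append(x)
    return items

print(append_shared(1))   # [1]
print(append_shared(2))   # [1, 2]  <- the same list as in the previous call

print(append_fresh(1))    # [1]
print(append_fresh(2))    # [2]

# a list passed explicitly is still used and extended
print(append_fresh(3, [1, 2]))   # [1, 2, 3]
```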



## Variable Number of Arguments

A function can accept a variable number of arguments if an asterisk `*` is used as a prefix on the **last** parameter name. 

In [58]:
def product(first, *args):
    result = first
    print(type(args))
    for x in args:
        print(x)
        result = result * x
    return result
import math
def product_chiara(*args):
    p = 1
    for i in range(len(args)):
        p = p * args[i]
    return p

r1 = product(10,30)
print(r1)
print(product(10, 20))
print(product(2, 3, 4, 5))
print(product_chiara(2, 3, 4, 5))

def norm(*components):
    sq = 0
    for component in components:
        sq = sq + component*component
    return math.sqrt(sq)

print(norm(0,4,3,2,3))

def inner_product(v1, v2):
    ip = 0
    for component in zip(v1, v2):
        ip = ip + component[0] * component[1]
    return ip

print(inner_product((1,2,3),(0,1,0)))

<class 'tuple'>
30
300
<class 'tuple'>
20
200
<class 'tuple'>
3
4
5
120
120
6.164414002968976
2


In this case, all of the extra arguments are placed into the `args` variable as a *tuple*. You can then work with the arguments using the standard sequence operations—iteration, slicing, unpacking, and so on.



In [56]:
# each of the following lists contains 
# the id of a customer
# their country code
# the amounts they spent
c1 = [1, 'IT', 1000, 1500]
c2 = [3, 'DE', 300]
c3 = [8, 'UK', 100, 1000, 900, 140]

def print_customer_data(cust_data):
    cust_code, cust_country, *cust_purchases = cust_data
    print(f'Customer {cust_code} from {cust_country} made the following purchases:')
    for purchase in cust_purchases:
        print(f'{purchase} euro')

print_customer_data(c1)
print_customer_data(c2)
print_customer_data(c3)

def places(country, *cities):
    print(f'Country: {country}')
    for city in cities:
        print(city)

places('Italy', 'Milan', 'Venice', 'Florence')
places('UK', 'London', 'Edinburgh', 'Glasgow', 'Manchester', 'Oxford')



Customer 1 from IT made the following purchases:
1000 euro
1500 euro
Customer 3 from DE made the following purchases:
300 euro
Customer 8 from UK made the following purchases:
100 euro
1000 euro
900 euro
140 euro
Country: Italy
Milan
Venice
Florence
Country: UK
London
Edinburgh
Glasgow
Manchester
Oxford



### Unpacking vs Slicing

When you use tuples or lists on the left side of the `=`, Python pairs objects on the right side with targets on the left and assigns them from left to right.

Combined with the asterisk, unpacking provides a sometimes better alternative to slicing.



In [None]:
ages = [0, 9, 4, 8, 7, 20, 19, 1, 6, 15]
ages_descending = sorted(ages, reverse=True)

# oldest, second_oldest = ages_descending
#
# does not work: the list on the right has more than 2 elements

# with slicing
oldest = ages_descending[0]
second_oldest = ages_descending[1]
others = ages_descending[2:]
print(oldest, second_oldest, others)

# with catch-all unpacking
oldest, second_oldest, *others = ages_descending
print(oldest, second_oldest, others)

# variation
oldest, *others, youngest = ages_descending
print(oldest, youngest, others)


### Positional vs Keyword

Function arguments can be supplied by explicitly naming each parameter and specifying a value. 

These are known as **keyword arguments**.



In [None]:
def enroll_student(firstname, lastname, ug_university = 'Bocconi'):
    d = {}
    d['firstname'] = firstname    
    d['lastname'] = lastname
    d['ug_university'] = ug_university
    return d

d = enroll_student(firstname = 'Minnie', lastname = 'Mouse') 
print('in order, all kwargs, OK', d)

d = enroll_student(lastname = 'Mouse', firstname = 'Minnie') 
print('not in order, all kwargs, OK', d)

d = enroll_student('Minnie', lastname='Mouse') 
print('positional first, then kwargs, OK', d)

d = enroll_student(firstname = 'Minnie', 'Mouse') 
print('kwargs first, then positional, ERROR', d)


With keyword arguments, the order of the arguments doesn’t matter. 

If you omit any of the required arguments or if the name of a keyword doesn’t match any of the parameter names in the function definition, you get a TypeError.

Positional arguments and keyword arguments can appear in the same function call, provided that 
- **all the positional arguments appear first**, 
- values are provided for all mandatory arguments, and 
- no argument receives more than one value. 
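The calls below each break one of these rules in a way that raises `TypeError` when the call is executed. This is a sketch that reuses `enroll_student` in a shortened form; the list `errors` is introduced only to collect the messages.

```python
def enroll_student(firstname, lastname, ug_university='Bocconi'):
    return {'firstname': firstname, 'lastname': lastname,
            'ug_university': ug_university}

errors = []

try:
    enroll_student('Minnie')                               # lastname is missing
except TypeError as exc:
    errors.append(str(exc))

try:
    enroll_student('Minnie', 'Mouse', surname='Mouse')     # unknown keyword
except TypeError as exc:
    errors.append(str(exc))

try:
    enroll_student('Minnie', 'Mouse', firstname='Minnie')  # firstname given twice
except TypeError as exc:
    errors.append(str(exc))

for message in errors:
    print('TypeError:', message)
```

Placing a positional argument after a keyword argument is different: it is rejected as a `SyntaxError` before the code runs, so it cannot be caught with `try` in the same cell.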

In [None]:
def enroll_student(firstname, lastname, ug_university = 'Bocconi'):
    d = {}
    d['firstname'] = firstname    
    d['lastname'] = lastname
    d['ug_university'] = ug_university
    return d

d = enroll_student(firstname = 'Minnie', 'Mouse') 
print('kwargs first, then positional, ERROR', d)


### Arbitrary keyword arguments

If the **last** argument of a function definition is prefixed with `**` (double asterisk), all the additional keyword arguments (those that don’t match any of the other parameter names) are placed in a dictionary and passed to the function. The order of items in this dictionary is guaranteed to match the order in which keyword arguments were provided.

Arbitrary keyword arguments might be useful for defining functions that accept a large number of potentially open-ended configuration options that would be too unwieldy to list as parameters.
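A minimal sketch of how the extra keyword arguments arrive in the function follows. The function `describe` and its options are hypothetical names used only for illustration.

```python
def describe(title, **options):
    # options is an ordinary dict holding the extra keyword arguments,
    # in the order in which they were written in the call
    print(type(options))
    return list(options.items())

print(describe('The Third Man', color='b/w', year=1949, language='English'))
# <class 'dict'>
# [('color', 'b/w'), ('year', 1949), ('language', 'English')]

print(describe('Wile E Coyote'))
# <class 'dict'>
# []
```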





In [None]:
def video(title, **parms):
    language = parms.pop('language', 'English')
    color = parms.pop('color', 'color')
    # No more options
    if parms:
        raise TypeError(f'Unsupported configuration options {list(parms)}')
    return f'Title: {title}, language: {language}, {color}'

print(video('Wile E Coyote'))
print(video('The Third Man', color='b/w'))
print(video('2001 A Space Odyssey', director='Kubrick'))


## List comprehensions


In [61]:
'''
squares = []
for x in range(10):
    squares.append(x**2)

print(squares)

squares = [x**2 for x in range(10)]

print(squares)
'''
c0 = [(x, y) for x in [1,2,3] for y in [3,1,4]]
c1 = [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]

c2 = []
for x in [1,2,3]:
    for y in [3,1,4]:
        if x != y:
            c2.append((x, y))

print(c0)
print(c1)
print(c2)

[(1, 3), (1, 1), (1, 4), (2, 3), (2, 1), (2, 4), (3, 3), (3, 1), (3, 4)]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]


In [62]:
vec = [-4, -2, 0, 2, 4]
# create a new list with the values doubled
r1 = [x*2 for x in vec]
print(r1)

# filter the list to exclude negative numbers
r2 = [x for x in vec if x >= 0]
print(r2)

# apply a function to all the elements
r3 = [abs(x) for x in vec]
print(r3)

# call a method on each element
fresh_fruit = ['  banana', '  loganberry ', 'passion fruit  ']
r4 = [name.strip() for name in fresh_fruit]
print(r4)

# create a list of 2-tuples like (number, square)
r5 = [(x, x**2) for x in range(6)]
print(r5)

# the tuple must be parenthesized, otherwise an error is raised

# flatten a list using a listcomp with two 'for'
vec = [[1,2,3], [4,5,6], [7,8,9]]
r6 = [num for elem in vec for num in elem]
print(r6)

[-8, -4, 0, 4, 8]
[0, 2, 4]
[4, 2, 0, 2, 4]
['banana', 'loganberry', 'passion fruit']
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
[1, 2, 3, 4, 5, 6, 7, 8, 9]



## Sets


In [66]:
A = {1, 3, 55, 3.14}
print(A)

A.add('Goofy')

print(A)

A.add(3)

print(A)

l1 = [1,3,3,5,7,8,7]
print(l1)
s1 = set(l1)
print(s1)
l2 = list(s1)
print(l2)

{3, 1, 3.14, 55}
{1, 3, 3.14, 'Goofy', 55}
{1, 3, 3.14, 'Goofy', 55}
[1, 3, 3, 5, 7, 8, 7]
{1, 3, 5, 7, 8}
[1, 3, 5, 7, 8]


Adding an element may look instantaneous, but there is no magic: each time, the set checks whether the element is already present, so that the same element is never stored twice!

In [None]:
import numbers
A = {1,-44, 3.14, 'Goofy'}
for el in A:
    if isinstance(el, numbers.Number) and el < 0:
        print('A contains negative numbers!')


B = set()

B.add(5)

print(B)


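Elements of a `set` must be hashable, so mutable built-in objects such as lists, dictionaries, and other sets cannot be elements of a set.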

In [None]:
A = {2,3,[1,2]}
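The cell above raises a `TypeError`. The sketch below catches the error and shows two alternatives, a tuple and a `frozenset`, that can be stored in a set.

```python
try:
    A = {2, 3, [1, 2]}          # a list is mutable and therefore unhashable
except TypeError as exc:
    print('TypeError:', exc)    # TypeError: unhashable type: 'list'

B = {2, 3, (1, 2)}              # a tuple of hashable items is fine
print((1, 2) in B)              # True

C = {frozenset({1, 2}), 2, 3}   # frozenset is the immutable version of set
print(len(C))                   # 3
```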



See also [https://en.wikipedia.org/wiki/Hash_function](https://en.wikipedia.org/wiki/Hash_function)

Adding to a list performs slightly better than adding to a set (there's no need to hash new elements).


In [69]:

import timeit

timeit.timeit(stmt='for i in range(n):' '   A.add(i)', setup='A = set(); n=1000', number=100000)

timeit.timeit(stmt='for i in range(n):' '   B.append(i)', setup='B = list(); n=1000', number=100000)


5.636775699999816

Comprehensions can also generate dictionaries and sets.



In [None]:
d = {("number" + str(i)):i for i in range(10)}

print(d)

s = {i**2 for i in range(20)}
print(s)


### Random and Pseudo-Random Numbers in Python

- The `random` module
- The `random.seed()` function. `seed(number)` initializes the random number generator with a specific "seed". The effect is that after calling `random.seed(N)`, the sequence of random numbers is the same on every run and on every machine with the same Python version (that is, it is not random at all!).
- `seed` is crucial when evaluating the performance of a randomized algorithm. Without it, the sequence of random numbers given to algorithm A will differ from that given to algorithm B, which makes a fair comparison of their performance difficult.
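The effect of the seed can be verified directly. In this sketch the seed value 42 and the dice-like range are arbitrary choices.

```python
import random

random.seed(42)
first_run = [random.randint(1, 6) for _ in range(5)]

random.seed(42)                      # same seed: the sequence restarts
second_run = [random.randint(1, 6) for _ in range(5)]

print(first_run == second_run)       # True

# a dedicated generator object avoids touching the global generator
rng = random.Random(42)
third_run = [rng.randint(1, 6) for _ in range(5)]
print(first_run == third_run)        # True
```

Using a separate `random.Random` object for each algorithm under test is one way to give both exactly the same stream of numbers.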



In [71]:

import random

# random.random() generates a pseudo-random decimal number 
# uniformly distributed in the interval [0,1)
random.random()

# random.randint(a, b) generates a pseudo-random integer 
# uniformly distributed in the interval [a, b]
random.randint(3,10)

random.seed(10)

random.random()

# random.choices() draws k elements from a sequence (with replacement)
random.choices(range(30), k=10)

# with probabilities ...
random.choices([100, 200, 300, 400], [30, 30, 30, 10], k=5)

# random.sample() draws k elements from a sequence (without replacement)
random.sample(range(30), k=10)


[12, 13, 9, 26, 21, 8, 14, 5, 25, 27]

# Reading/Writing Files

For reading or writing a file, Python uses a `file object`. 

What you do is
1. Open a communication channel with the file, obtaining a reference to a specific file object
1. (read/write) Manipulate the content of that object via ad hoc functions
1. Terminate the communication (that is, close the file object)

Operating systems (OSs) limit the number of files that can be open at the same time. The limit is usually large, but some slots are taken by the OS itself and by other programs. Bottom line: always take care of step 3 above.
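The three steps can be written out explicitly, or left to the `with` statement, which closes the file automatically. This is a sketch: the file name `notes.txt` and the temporary directory are assumptions made so that the example does not touch real data.

```python
import os
import tempfile

# hypothetical scratch file, created in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# manual version: the three steps are explicit
f = open(path, 'wt')                 # 1. open
try:
    f.write('first line\n')          # 2. write
finally:
    f.close()                        # 3. close, even if writing failed
print(f.closed)                      # True

# with-statement version: step 3 happens automatically
with open(path, 'rt') as f:
    print(f.read())
print(f.closed)                      # True
```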

## Text and Binary files

At the lowest level, Python works with two fundamental datatypes:
- *bytes* that represent raw uninterpreted data of any kind and 
- *text* that represents Unicode characters.
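The difference between the two datatypes shows up as soon as a string contains a non-ASCII character. The word used below is an arbitrary example.

```python
text = 'città'                      # a str: 5 Unicode characters
raw = text.encode('utf-8')          # a bytes object: 6 bytes ('à' takes two)

print(type(text), len(text))        # <class 'str'> 5
print(type(raw), len(raw))          # <class 'bytes'> 6
print(raw.decode('utf-8') == text)  # True
```

The same distinction applies to files: a file opened in text mode (`'rt'`) yields `str` objects, while one opened in binary mode (`'rb'`) yields `bytes`.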

# Finding files on a computer

OSs organize files in a database called a `file system`. Windows and macOS have different file systems. Thus, the way a file can be referred to depends on the OS. In general, however, the database can be represented as a tree that defines a hierarchy.

The way this hierarchy translates into a string depends on the OS. For example, the file *artists.txt* in the subdirectory *datafiles* of the current directory would be 

> *.\datafiles\artists.txt* 

in Windows and

> *./datafiles/artists.txt* 

in macOS (and Linux too). 

Python provides a level of abstraction over the OS with the `os` module.

In [None]:
import os
print(os.getcwd())

In [None]:
filename = os.path.join(os.getcwd(),'data','bridges.data')
print(filename)


In [None]:
# Read a text file all at once as a string
input_file_name = os.path.join(os.getcwd(),'data','bridges.data')
with open(input_file_name, 'rt') as input_file:
    data = input_file.read()
    print(data)

In [None]:
# Read a file line-by-line
input_file_name = os.path.join(os.getcwd(),'data','bridges.data')
with open(input_file_name, 'rt') as input_file:
    for line in input_file:
        print(line)
        # do sth with line

In [None]:
# Write to a text file
output_file_name = os.path.join(os.getcwd(),'data','out.txt')
with open(output_file_name, 'wt') as output_file:
    output_file.write('Some output\n')
    print('More output', file=output_file)

In [None]:
# read from and write to a text file
input_file_name = os.path.join(os.getcwd(),'data','bridges.data')
output_file_name = os.path.join(os.getcwd(),'data','out.txt')
with open(input_file_name, 'rt') as input_file, open(output_file_name, 'w') as output_file:
    for line in input_file:
        if line[:2] == 'E1':
            # if line starts with 'E1', copy it to the output
            output_file.write(line)


To open a file, use the built-in `open()` function. Usually, `open()` is given a filename and a file mode. It is also often used in combination with the `with` statement as a *context manager*. 

When opening a file, you need to specify a file mode. The core file modes are **'r'** for reading, **'w'** for writing, and **'a'** for appending. 'w' mode replaces any existing file with new content. 'a' opens a file for writing and positions the file pointer to the end of the file so that new data can be appended.

A special file mode of **'x'** can be used to write to a file, but only if it doesn't exist already. This is a useful way to prevent accidental overwriting of existing data. For this mode, a FileExistsError exception is raised if the file already exists.
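The four modes can be tried in sequence on a scratch file. This is a sketch: the file name `log.txt` and the temporary directory are assumptions of the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log.txt')   # hypothetical file name

with open(path, 'x') as f:        # 'x': create, fail if the file exists
    f.write('created\n')

with open(path, 'a') as f:        # 'a': keep the content, add at the end
    f.write('appended\n')

try:
    open(path, 'x')               # the file now exists
except FileExistsError:
    print('file already exists, not overwritten')

with open(path, 'r') as f:
    print(f.read())               # the two lines written so far

with open(path, 'w') as f:        # 'w': the previous content is discarded
    f.write('replaced\n')

with open(path, 'r') as f:
    content = f.read()
print(content)                    # replaced
```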

# CSV module

Rows in CSV files are naturally represented using dictionaries. The csv module makes it easy to read a CSV file into a list of dictionaries.

The csv module provides two classes for this purpose: `DictReader` and `DictWriter`. See the documentation for the csv module, especially for the `reader` function and the `DictReader` class:

[https://docs.python.org/3/library/csv.html](https://docs.python.org/3/library/csv.html)


In [33]:
import csv 
import os
input_file_name = os.path.join(os.getcwd(),'data','Artists.txt')
print(input_file_name)
# count the artists by gender
with open(input_file_name, encoding='utf-8') as f:
    linereader = csv.DictReader(f, delimiter=',', quotechar='"')
    print(linereader)
    gender = {}
    for row in linereader:
        if row['Gender'] in gender.keys():
            gender[row['Gender']] = gender[row['Gender']] + 1
        else:
            gender[row['Gender']] = 1

print(sorted(gender.items(), key=lambda item: item[1],reverse=True))


C:\github\castellanza\data\Artists.txt
<csv.DictReader object at 0x00000298F602AA80>
[('Male', 9762), ('', 3141), ('Female', 2300), ('male', 15), ('Non-Binary', 2), ('female', 1), ('Non-binary', 1)]


A similar approach works for writing CSV files: we can use the `DictWriter` class of the csv module.

[https://docs.python.org/3/library/csv.html?highlight=dictwriter#csv.DictWriter](https://docs.python.org/3/library/csv.html?highlight=dictwriter#csv.DictWriter)
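A minimal sketch of writing with `DictWriter` and reading the result back follows. The rows, the column names and the file name `characters.csv` are made up for illustration, and the file is created in a temporary directory.

```python
import csv
import os
import tempfile

# made-up rows, one dictionary per row
rows = [
    {'Name': 'Donald', 'Species': 'duck', 'Year': 1934},
    {'Name': 'Mickey', 'Species': 'mouse', 'Year': 1928},
]
path = os.path.join(tempfile.mkdtemp(), 'characters.csv')

# newline='' is recommended by the csv documentation when opening the file
with open(path, 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['Name', 'Species', 'Year'])
    writer.writeheader()          # first row: the column names
    writer.writerows(rows)        # one line per dictionary

with open(path, newline='', encoding='utf-8') as f:
    read_back = list(csv.DictReader(f))

print(read_back[0]['Name'])       # Donald
print(read_back[1]['Year'])       # 1928 (read back as the string '1928')
```

Note that values read from a CSV file are always strings, so numbers must be converted explicitly after reading.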

In [20]:
l1 = [1,2,4]

In [21]:
l1

[1, 2, 4]

In [22]:
l1.append(55)

In [23]:
l1

[1, 2, 4, 55]

In [8]:
dir(l1)

['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

In [9]:
d = {'name':'Fabrizio', 'last_name':'Iozzi'}

In [10]:
d

{'name': 'Fabrizio', 'last_name': 'Iozzi'}

In [11]:
d.keys()

dict_keys(['name', 'last_name'])

In [28]:
l2 = l1.append(22)

In [29]:
l1

[1, 2, 4, 55, 66, 22, 22]

In [31]:
l3 = list((2,44,'ddd'))

In [32]:
l3

[2, 44, 'ddd']

In [33]:
a = int(44)

In [34]:
a

44

In [35]:
a = 44.

In [36]:
type(a)

float

In [37]:
a = float(44)

In [38]:
a

44.0

In [40]:
a = complex('1+3j')

In [41]:
type(a)

complex

In [43]:
a.conjugate()

(1-3j)

In [76]:
import math
class Point():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def dist_from_O(self):
        return math.sqrt(self.x*self.x + self.y*self.y)

p1 = Point(2,3)
print(p1)
print(p1.dist_from_O())

def distance(p1, p2):
    return math.sqrt((p1.x-p2.x)**2+(p1.y-p2.y)**2)

p2 = Point(3,4)
print(distance(p1,p2))

<__main__.Point object at 0x00000258FB6AC380>
3.605551275463989
1.4142135623730951


In [74]:
dir(p1)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'dist_from_O',
 'x',
 'y']