Python has the following built-in data structures: lists, dictionaries, tuples and sets.
We have already introduced the list data structure in Chapter 3. Lists are heterogeneous containers for multiple items in Python.
Unlike sequences (like lists in Python), which are indexed by a range of numbers, dictionaries are indexed by keys. It is best to think of a dictionary as a set of key: value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key: value pairs within the braces adds initial key: value pairs to the dictionary; this is also the way dictionaries are written on output.
In other languages, dictionaries are sometimes called associative arrays.
Example: let's create a shopping list with the names of products as keys and quantities as values:
shopping_list = {'apple': 6, 'bread': 2, 'milk': 6, 'butter': 1}
print(shopping_list)
Example: let's create a dictionary with the names of countries as keys and their capital cities as values:
capitals = {
'Aland Islands': 'Mariehamn',
'Albania': 'Tirana',
'Andorra': 'Andorra la Vella',
'Armenia': 'Yerevan',
'Austria': 'Vienna',
'Azerbaijan': 'Baku',
'Belarus': 'Minsk',
'Belgium': 'Brussels',
'Bosnia and Herzegovina': 'Sarajevo',
'Bulgaria': 'Sofia',
'Croatia': 'Zagreb',
'Cyprus': 'Nicosia',
'Czech Republic': 'Prague',
'Denmark': 'Copenhagen',
'Estonia': 'Tallinn',
'Faroe Islands': 'Torshavn',
'Finland': 'Helsinki',
'France': 'Paris',
'Georgia': 'Tbilisi',
'Germany': 'Berlin',
'Gibraltar': 'Gibraltar',
'Greece': 'Athens',
'Guernsey': 'Saint Peter Port',
'Vatican City': 'Vatican City',
'Hungary': 'Budapest',
'Iceland': 'Reykjavik',
'Ireland': 'Dublin',
'Isle of Man': 'Douglas',
'Italy': 'Rome',
'Jersey': 'Saint Helier',
'Kosovo': 'Pristina',
'Latvia': 'Riga',
'Liechtenstein': 'Vaduz',
'Lithuania': 'Vilnius',
'Luxembourg': 'Luxembourg',
'Macedonia': 'Skopje',
'Malta': 'Valletta',
'Moldova': 'Chisinau',
'Monaco': 'Monaco',
'Montenegro': 'Podgorica',
'Netherlands': 'Amsterdam',
'Norway': 'Oslo',
'Poland': 'Warsaw',
'Portugal': 'Lisbon',
'Romania': 'Bucharest',
'Russia': 'Moscow',
'San Marino': 'San Marino',
'Serbia': 'Belgrade',
'Slovakia': 'Bratislava',
'Slovenia': 'Ljubljana',
'Spain': 'Madrid',
'Svalbard': 'Longyearbyen',
'Sweden': 'Stockholm',
'Switzerland': 'Bern',
'Turkey': 'Ankara',
'Ukraine': 'Kyiv',
'United Kingdom': 'London',
'Northern Cyprus': 'North Nicosia'
}
print(capitals)
Elements of a dictionary can be accessed through their key:
print(capitals['Hungary'])
Dictionaries are mutable, so the values can be modified:
capitals['Hungary'] = 'Esztergom' # capital city of Hungary from the 11th to the 13th century
print(capitals['Hungary'])
capitals['Hungary'] = 'Budapest'
print(capitals['Hungary'])
Dictionaries can also be extended with new elements (key: value pairs):
capitals['USA'] = 'Washington'
print(capitals['USA'])
Removing or deleting existing elements is also possible:
del capitals['USA'] # deleting, because not in Europe
print(capitals['USA']) # raises KeyError, because the key no longer exists
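Accessing a key that is not present raises a KeyError, which is what happens in the last print above. A minimal sketch of how this might be avoided, using the in operator and the get() and pop() methods of dictionaries. The small capitals_sample dictionary and the 'unknown' fallback value are made up for illustration:

```python
# Illustrative sample; not the full dictionary used in this chapter
capitals_sample = {'Hungary': 'Budapest', 'Austria': 'Vienna'}

# Membership test on the keys before indexing
if 'USA' in capitals_sample:
    print(capitals_sample['USA'])
else:
    print("USA is not in the dictionary")

# get() returns a default value instead of raising KeyError
usa_capital = capitals_sample.get('USA', 'unknown')
print(usa_capital)

# pop() removes a key and returns its value; the default avoids KeyError
removed = capitals_sample.pop('USA', None)
print(removed)
```

Which variant fits best depends on the situation: the membership test is explicit, while get() is shorter when a sensible default exists.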
Tuples are sequences of heterogeneous elements, similar to lists. The initial elements of a tuple are defined as a comma-separated list, surrounded by parentheses.
Note: lists are surrounded by square brackets!
neighbours = ('Austria', 'Slovakia', 'Ukraine', 'Romania', 'Serbia', 'Croatia', 'Slovenia')
print(neighbours)
The items of a tuple can be accessed by their numerical index, just like the items of a list. (The first item is indexed with zero.)
print(neighbours[0])
print(neighbours[2:5])
print(len(neighbours))
The elements of a tuple can also be fetched by tuple unpacking, which means that the items of a tuple are extracted into distinct variables:
a, b, c, d, e, f, g = neighbours
print(a, b, c, d, e, f, g)
Through tuple packing, we can create a new tuple from a comma-separated list of values:
neighbours2 = (a, b, c, d, e, f, g)
print(neighbours == neighbours2)
While lists are mutable, tuples are immutable, meaning that the elements cannot be modified:
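neighbours[0] = 'Renamed country' # raises TypeError: tuples do not support item assignment
New elements cannot be added to a tuple either, and removing existing elements is not possible.
neighbours.append('New country') # raises AttributeError: tuples have no append() method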
Though tuples may seem similar to lists, they are often used in different situations and for different purposes. Tuples are immutable, and usually contain a heterogeneous sequence of elements. Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.
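The difference in typical usage might be illustrated as follows. This is only a sketch: the tuple acts as a fixed-size record and is unpacked, while the list holds values of the same kind and is processed by iteration. The variable names are chosen for illustration; the population figures are the 2018 values for Hungary, Austria and Slovakia used later in this chapter.

```python
# A tuple as a fixed-size record: country, capital, population in 2018
country = ('Hungary', 'Budapest', 9778371)
name, capital, population = country

# A list as a homogeneous collection that is processed by iteration
populations = [9778371, 8822267, 5443120]  # Hungary, Austria, Slovakia
total = 0
for p in populations:
    total += p

print(name, capital, population)
print(total)
```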
We have seen that lists, tuples and even strings share many properties, such as indexing and slicing. They are all sequence data types.
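A short sketch of these shared operations, applied to a list, a tuple and a string with comparable content (the variable names are chosen for illustration only):

```python
# Three different sequence types holding comparable content
a_list = ['H', 'u', 'n', 'g', 'a', 'r', 'y']
a_tuple = ('H', 'u', 'n', 'g', 'a', 'r', 'y')
a_string = 'Hungary'

for seq in (a_list, a_tuple, a_string):
    # Indexing, slicing and len() work the same way on all of them
    print(type(seq).__name__, seq[0], seq[-1], seq[1:4], len(seq))
```

Note that a slice has the same type as the sequence it was taken from: slicing a list gives a list, slicing a tuple gives a tuple, and slicing a string gives a string.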
The dict() constructor builds dictionaries directly from sequences of key-value pairs. The dictionary in this case is built from a list of tuples, each tuple containing precisely two elements: a key and a value, in this order.
capitals = dict([
('Aland Islands', 'Mariehamn'),
('Albania', 'Tirana'),
('Andorra', 'Andorra la Vella'),
('Armenia', 'Yerevan'),
('Austria', 'Vienna'),
('Azerbaijan', 'Baku'),
('Belarus', 'Minsk'),
('Belgium', 'Brussels'),
('Bosnia and Herzegovina', 'Sarajevo'),
('Bulgaria', 'Sofia'),
('Croatia', 'Zagreb'),
('Cyprus', 'Nicosia'),
('Czech Republic', 'Prague'),
('Denmark', 'Copenhagen'),
('Estonia', 'Tallinn'),
('Faroe Islands', 'Torshavn'),
('Finland', 'Helsinki'),
('France', 'Paris'),
('Georgia', 'Tbilisi'),
('Germany', 'Berlin'),
('Gibraltar', 'Gibraltar'),
('Greece', 'Athens'),
('Guernsey', 'Saint Peter Port'),
('Vatican City', 'Vatican City'),
('Hungary', 'Budapest'),
('Iceland', 'Reykjavik'),
('Ireland', 'Dublin'),
('Isle of Man', 'Douglas'),
('Italy', 'Rome'),
('Jersey', 'Saint Helier'),
('Kosovo', 'Pristina'),
('Latvia', 'Riga'),
('Liechtenstein', 'Vaduz'),
('Lithuania', 'Vilnius'),
('Luxembourg', 'Luxembourg'),
('Macedonia', 'Skopje'),
('Malta', 'Valletta'),
('Moldova', 'Chisinau'),
('Monaco', 'Monaco'),
('Montenegro', 'Podgorica'),
('Netherlands', 'Amsterdam'),
('Norway', 'Oslo'),
('Poland', 'Warsaw'),
('Portugal', 'Lisbon'),
('Romania', 'Bucharest'),
('Russia', 'Moscow'),
('San Marino', 'San Marino'),
('Serbia', 'Belgrade'),
('Slovakia', 'Bratislava'),
('Slovenia', 'Ljubljana'),
('Spain', 'Madrid'),
('Svalbard', 'Longyearbyen'),
('Sweden', 'Stockholm'),
('Switzerland', 'Bern'),
('Turkey', 'Ankara'),
('Ukraine', 'Kyiv'),
('United Kingdom', 'London'),
('Northern Cyprus', 'North Nicosia')
])
print(capitals)
Since each of these tuples contains a key and a value, tuple unpacking is a convenient way to extract them into separate variables, e.g.:
pair = ('Hungary', 'Budapest')
key, value = pair
print("Key is {0}, value is {1}".format(key, value))
Accessing the list of key-value tuples can be done with the items() method of the dictionary:
print("List of key-value pairs:")
print(capitals.items())
Accessing ONLY the list of keys or values is also possible with the keys() and values() methods of the dictionary:
print("List of keys:")
print(capitals.keys())
print("List of values:")
print(capitals.values())
We can also use a for loop to iterate through the items of a dictionary:
for item in capitals.items():
    print(item)
Here we iterate through the key: value tuples of a dictionary with the item variable.
The key is the element with index 0, and the value is the element with index 1 in item:
for item in capitals.items():
    key = item[0]
    value = item[1]
    print("{0}: {1}".format(key, value))
By creating a list of tuples from the dictionary, we can fetch a single tuple by its numerical index and then extract the key and the value into separate variables with tuple unpacking:
item_10 = list(capitals.items())[10]
print(item_10)
key, value = item_10
print("Key is {0}, value is {1}".format(key, value))
We can also use a for loop to iterate through the key: value pairs in a dictionary with tuple unpacking:
for item in capitals.items():
    key, value = item
    print("{0}: {1}".format(key, value))
We do not even need a temporary item variable; the unpacking can be done directly in the for statement:
for key, value in capitals.items():
    print("{0}: {1}".format(key, value))
Now compare the two versions of iterating through the items of a dictionary and observe how tuple unpacking makes accessing the key and the value easier:
for item in capitals.items():
    key = item[0]
    value = item[1]
    print("{0}: {1}".format(key, value))
for key, value in capitals.items():
    print("{0}: {1}".format(key, value))
Note: keys in a dictionary can be any immutable type; strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. You can’t use lists as keys, since lists can be modified.
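The rule about keys can be tried out with a small sketch. The grid dictionary below, which maps (row, column) tuples to labels, is a made-up example:

```python
# Hypothetical grid: tuple (row, column) -> cell label
grid = {(0, 0): 'start', (2, 3): 'goal'}
grid[(1, 1)] = 'wall'
print(grid[(2, 3)])

# A list is mutable and therefore cannot be used as a key
try:
    grid[[4, 4]] = 'invalid'
    list_key_accepted = True
except TypeError as error:
    list_key_accepted = False
    print("Cannot use a list as a key:", error)

# A tuple that contains a list is rejected as well
try:
    grid[(4, [4])] = 'invalid'
    nested_key_accepted = True
except TypeError:
    nested_key_accepted = False

print(list_key_accepted, nested_key_accepted)
```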
Note: the same tuple unpacking was used with the enumerate() function introduced in Chapter 5:
data = [ ... ]
for idx, value in enumerate(data):
    ...
The dictionaries population_2008 and population_2018 store the population of some European countries in the respective years (2008 and 2018):
population_2008 = { 'Belgium': 10666866, 'Bulgaria': 7518002, 'Czechia': 10343422, 'Denmark': 5475791, 'Germany': 82217837, 'Estonia': 1338440, 'Ireland': 4457765, 'Greece': 11060937, 'Spain': 45668939, 'France': 64007193, 'Croatia': 4311967, 'Italy': 58652875, 'Cyprus': 776333, 'Latvia': 2191810, 'Lithuania': 3212605, 'Luxembourg': 483799, 'Hungary': 10045401, 'Malta': 407832, 'Netherlands': 16405399, 'Austria': 8307989, 'Poland': 38115641, 'Portugal': 10553339, 'Romania': 20635460, 'Slovenia': 2010269, 'Slovakia': 5376064, 'Finland': 5300484, 'Sweden': 9182927, 'United Kingdom': 61571647, 'Iceland': 315459, 'Liechtenstein': 35356, 'Norway': 4737171, 'Switzerland': 7593494, 'Montenegro': 615543, 'North Macedonia': 2045177, 'Albania': 2958266, 'Serbia': 7365507, 'Turkey': 70586256, 'Andorra': 83137, 'Belarus': 9689770, 'Bosnia and Herzegovina': 3843846, 'Kosovo': 2153139, 'Moldova': 3572703, 'San Marino': 32054, 'Ukraine': 46192309, 'Armenia': 3230086, 'Azerbaijan': 8629900, 'Georgia': 4382070 }
population_2018 = { 'Belgium': 11398589, 'Bulgaria': 7050034, 'Czechia': 10610055, 'Denmark': 5781190, 'Germany': 82792351, 'Estonia': 1319133, 'Ireland': 4830392, 'Greece': 10741165, 'Spain': 46658447, 'France': 66926166, 'Croatia': 4105493, 'Italy': 60483973, 'Cyprus': 864236, 'Latvia': 1934379, 'Lithuania': 2808901, 'Luxembourg': 602005, 'Hungary': 9778371, 'Malta': 475701, 'Netherlands': 17181084, 'Austria': 8822267, 'Poland': 37976687, 'Portugal': 10291027, 'Romania': 19530631, 'Slovenia': 2066880, 'Slovakia': 5443120, 'Finland': 5513130, 'Sweden': 10120242, 'United Kingdom': 66273576, 'Iceland': 348450, 'Liechtenstein': 38114, 'Norway': 5295619, 'Switzerland': 8484130, 'Montenegro': 622359, 'North Macedonia': 2075301, 'Albania': 2870324, 'Serbia': 7001444, 'Turkey': 80810525, 'Andorra': 74794, 'Belarus': 9491823, 'Bosnia and Herzegovina': 3502550, 'Kosovo': 1798506, 'Moldova': 3547539, 'San Marino': 34453, 'Ukraine': 42386403, 'Armenia': 2972732, 'Azerbaijan': 9898085, 'Georgia': 3729633 }
print("Population of European countries in 2008:")
print(population_2008)
print()
print("Population of European countries in 2018:")
print(population_2018)
Data source: EuroStat
Task: What was the population of Hungary in 2008 and in 2018?
print("Population of Hungary in 2008: {0}".format(population_2008["Hungary"]))
print("Population of Hungary in 2018: {0}".format(population_2018["Hungary"]))
Task: What was the population change between 2008 and 2018 in Hungary? What is the average change per year?
diff = population_2018["Hungary"] - population_2008["Hungary"]
print("Population difference for Hungary: {0}".format(diff))
diff_avg = diff // 10
print("Average population change for Hungary per year: {0}".format(diff_avg))
Task: Display for all countries the population change between 2008 and 2018!
for key in population_2008.keys():
    diff = population_2018[key] - population_2008[key]
    print("{0}: {1}".format(key, diff))
Task: Which country had the largest population growth in the given timespan? Which one had the largest population decline?
max_country = "Hungary"
max_diff = population_2018[max_country] - population_2008[max_country]
min_country = "Hungary"
min_diff = population_2018[min_country] - population_2008[min_country]
for key in population_2008.keys():
    diff = population_2018[key] - population_2008[key]
    if diff > max_diff:
        max_diff = diff
        max_country = key
    if diff < min_diff:
        min_diff = diff
        min_country = key
print("Largest growth: {0} ({1})".format(max_country, max_diff))
print("Largest decline: {0} ({1})".format(min_country, min_diff))
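As an alternative to the explicit loop, the same search could be sketched with the built-in max() and min() functions and a key function. To keep the example self-contained, it uses only a four-country excerpt of the population data above, so its result applies to this excerpt only. The names pop_2008, pop_2018 and changes are chosen for illustration:

```python
# Small excerpt of the population dictionaries, to keep the sketch self-contained
pop_2008 = {'Hungary': 10045401, 'Germany': 82217837, 'Turkey': 70586256, 'Ukraine': 46192309}
pop_2018 = {'Hungary': 9778371, 'Germany': 82792351, 'Turkey': 80810525, 'Ukraine': 42386403}

# Population change per country
changes = {}
for country in pop_2008:
    changes[country] = pop_2018[country] - pop_2008[country]

# max() and min() compare the keys by the value the key function returns for them
max_country = max(changes, key=changes.get)
min_country = min(changes, key=changes.get)

print("Largest growth: {0} ({1})".format(max_country, changes[max_country]))
print("Largest decline: {0} ({1})".format(min_country, changes[min_country]))
```

The explicit loop shows the algorithm step by step; the max() and min() variant is shorter once the differences are stored in a dictionary.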
Python also includes a data type for sets. A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
Curly braces or the set() function can be used to create sets.
Note: to create an empty set you have to use set(), not {}; the latter creates an empty dictionary and not a set.
neighbours = {'Austria', 'Slovakia', 'Ukraine', 'Romania', 'Serbia', 'Croatia', 'Slovenia'}
print(neighbours)
Since sets are unordered, Python does not support indexing for sets, meaning we cannot access an item with a numerical index:
print(neighbours[0]) # raises TypeError: sets do not support indexing
However, we can perform membership testing, evaluating whether an item is in the set or not:
print('Serbia' in neighbours)
print('Germany' in neighbours)
A new element can be added to a set with the add() method, and an existing element can be removed with the remove() method. Sets also guarantee that they contain no duplicate entries:
neighbours.add('Ukraine') # already in the set
print(neighbours)
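Removing elements is not shown above, so here is a minimal sketch of it on a small, separate set (neighbours_demo is an illustrative name). It also shows the discard() method, which is not covered in the text above: unlike remove(), it does not raise an error if the element is missing.

```python
neighbours_demo = {'Austria', 'Slovakia', 'Ukraine'}

neighbours_demo.add('Ukraine')      # no effect, already in the set
print(len(neighbours_demo))

neighbours_demo.remove('Slovakia')  # removes an existing element
print('Slovakia' in neighbours_demo)

# remove() raises KeyError if the element is not in the set ...
try:
    neighbours_demo.remove('Germany')
    missing_removed = True
except KeyError:
    missing_removed = False

# ... while discard() silently does nothing in that case
neighbours_demo.discard('Germany')
print(sorted(neighbours_demo))
```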
Demonstration of basic set operations:
german_speakers = {'Germany', 'Austria', 'Switzerland'}
print("Union: {0}".format(neighbours | german_speakers))
print("Intersection: {0}".format(neighbours & german_speakers))
print("Difference: {0}".format(neighbours - german_speakers))
print("Symmetric difference: {0}".format(neighbours ^ german_speakers))
Task: Verify whether the dictionaries population_2008 and population_2018 contain exactly the same countries as keys.
keyset_2008 = set(population_2008.keys())
keyset_2018 = set(population_2018.keys())
nomatch = keyset_2008 ^ keyset_2018
if len(nomatch) == 0:
    print("The two dictionaries contain the same countries.")
else:
    print("There are some countries only present in one of the dictionaries: {0}".format(nomatch))
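The explicit conversion with set() is not strictly needed, because the object returned by keys() already behaves like a set. A small sketch with two made-up dictionaries (first and second, with placeholder values), where only the keys matter:

```python
# Illustrative dictionaries with placeholder values; only the keys matter here
first = {'Hungary': 1, 'Austria': 2, 'Slovakia': 3}
second = {'Austria': 20, 'Hungary': 10, 'Croatia': 30}

# Key views can be compared directly, regardless of insertion order
same_keys = first.keys() == second.keys()
print(same_keys)

# Set operators also work on key views and return ordinary sets
only_in_one = first.keys() ^ second.keys()
print(sorted(only_in_one))
```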
A stack is an abstract data structure that follows the "last-in-first-out" or LIFO model: elements are added to the top of the stack and only the top element of a stack can be removed.
Some well-known real world examples for usage of stacks:
The list methods make it very easy to use a list as a stack, where the last element added is the first element retrieved (last-in, first-out). To add an item to the top of the stack, use append(). To retrieve an item from the top of the stack, use pop() without an explicit index.
stack = [1, 2, 3, 4, 5]
stack.append(6)
stack.append(7)
print(stack)
print(stack.pop())
print(stack.pop())
print(stack)
stack.append(8)
stack.append(9)
print(stack)
print("Process all the elements of the stack:")
while len(stack) > 0:
    print(stack.pop())
A queue is an abstract data structure that follows the "first-in-first-out" or FIFO model: new elements are added to the back of the queue and only the front element of the queue can be removed.
Some well-known real world examples for usage of queues:
It is also possible to use a list as a queue, where the first element added is the first element retrieved (first-in, first-out); however, lists are not efficient for this purpose. While appends and pops from the end of a list are fast, doing inserts or pops from the beginning of a list is slow (because all of the other elements have to be shifted by one).
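For completeness, the list-based approach could look like the following sketch (list_queue is an illustrative name). It produces the correct first-in, first-out order, but every pop(0) has to shift the remaining elements:

```python
# Using a plain list as a queue: correct, but pop(0) shifts every remaining element
list_queue = [1, 2, 3]
list_queue.append(4)         # add to the back of the queue
first = list_queue.pop(0)    # remove from the front of the queue
second = list_queue.pop(0)
print(first, second, list_queue)
```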
To implement a queue, use collections.deque, which was designed to have fast appends and pops from both ends.
from collections import deque
queue = deque([1, 2, 3, 4, 5])
queue.append(6)
queue.append(7)
print(queue)
print(queue.popleft())
print(queue.popleft())
print(queue)
queue.append(8)
queue.append(9)
print(queue)
print("Process all the elements of the queue:")
while len(queue) > 0:
    print(queue.popleft())
Note: deque is short for double-ended queue, because we can manage both ends of the data structure (add or remove elements).
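A brief sketch of working on both ends of a deque. In addition to append() and popleft() used above, it relies on the appendleft() and pop() methods; the variable names are chosen for illustration:

```python
from collections import deque

d = deque([2, 3, 4])
d.appendleft(1)      # add to the front
d.append(5)          # add to the back
print(d)

front = d.popleft()  # remove from the front
back = d.pop()       # remove from the back
print(front, back, d)
```

Using only append() and pop() makes the deque behave like a stack, while append() and popleft() give the queue behaviour shown earlier.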