Python supports object-oriented programming (OOP). An object encapsulates variables and functions into a single entity.
Let's assume we have a data type for rectangles. Without objects we could write the following code:
rec1_bl = (0, 2) # bl = bottom-left
rec1_ur = (6, 8) # ur = upper-right
rec2_bl = (4, 3)
rec2_ur = (7, 5)
def area(bl, ur):
    width = ur[0] - bl[0]
    height = ur[1] - bl[1]
    return width * height
print("Area of rectangle #1: {0}".format(area(rec1_bl, rec1_ur)))
print("Area of rectangle #2: {0}".format(area(rec2_bl, rec2_ur)))
As we can observe, the data (rec1_bl, rec1_ur, etc.) and the functions (area() and possible further functions) are defined separately, not encapsulated together.
Simply put, an object is a collection of data (variables) and methods (functions) that act on those data. A class is a blueprint for the object. Classes introduce new data types in Python, describing real-world things and situations. Objects are the instances of classes; the process of creating an object is called instantiation.
Let's create the Rectangle class now:
class Rectangle():
    name = 'Rectangle'
    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height
rec1 = Rectangle()
rec1.bl = (0, 2)
rec1.ur = (6, 8)
rec2 = Rectangle()
rec2.bl = (4, 3)
rec2.ur = (7, 5)
print("Area of rectangle {0}".format(rec1.area()))
print("Area of rectangle {0}".format(rec2.area()))
In this example the Rectangle class has 4 attributes: name, ur, bl and area(). Attributes may be data or methods: name is a simple string, bl and ur are tuples, while area() is a function. More specifically, functions defined in a class are called methods.
The rec1 = Rectangle() statement creates a new instance object named rec1 from the class Rectangle.
We can access the attributes of objects using the object name as a prefix, e.g. rec1.area().
Remember how we used list and dictionary functions:
numbers = [1, 4, 5, -2, 8]
numbers.sort()
shopping_list = {'apple': 6, 'bread': 2, 'milk': 6, 'butter': 1}
for item in shopping_list.items():
    print(item)
This is the same syntax: we are calling methods on objects.
self parameter
There is a self parameter in the area() function definition inside the Rectangle class, but we called the method simply as rec1.area(), without any arguments. It still worked.
This is because whenever a method is called on an object, the object itself is passed as the first argument. So, rec1.area() translates into Rectangle.area(rec1).
print("Area of rectangle #1: {0}".format(Rectangle.area(rec1)))
print("Area of rectangle #2: {0}".format(Rectangle.area(rec2)))
print("Name of all rectangles: {0}".format(Rectangle.name))
In general, calling a method with a list of arguments is equivalent to calling the corresponding function with an argument list that is created by inserting the method's object before the first argument.
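The equivalence also holds when the method takes further arguments. The following minimal sketch illustrates it with a hypothetical area_scaled() method, which is not part of the Rectangle class used in this chapter and was added only for the demonstration:

```python
class Rectangle:
    # Hypothetical method, used only to show a call with an extra argument
    def area_scaled(self, factor):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height * factor


rec1 = Rectangle()
rec1.bl = (0, 2)
rec1.ur = (6, 8)

# Method call: rec1 is passed implicitly as 'self'
via_method = rec1.area_scaled(2)
# Function call: rec1 is inserted explicitly before the first argument
via_function = Rectangle.area_scaled(rec1, 2)

print(via_method)    # 72
print(via_function)  # 72
```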
For these reasons, the first parameter of a method in a class must be the object itself. This is conventionally called self. It can be named otherwise, but following the convention is highly recommended.
In our previous example we deliberately gave a value to the bl and ur attributes of rec1 and rec2 before calling the area() method on them, so it can process those values.
What happens if we e.g. forget to initialize those attributes beforehand?
rec3 = Rectangle()
print("Area of rectangle #3: {0}".format(rec3.area()))
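Calling area() on an object whose bl and ur attributes were never set fails with an AttributeError. The sketch below reproduces the situation and catches the error so that the script keeps running; the try/except wrapper and the error_type variable are additions made only for this illustration:

```python
class Rectangle:
    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height


rec3 = Rectangle()  # neither bl nor ur has been assigned

try:
    rec3.area()
    error_type = None
except AttributeError as error:
    # The method tried to read an attribute that does not exist on rec3
    error_type = type(error).__name__
    print("Could not compute the area:", error)
```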
This issue can be addressed with a special constructor method, which is always executed when a new object is instantiated from a class.
In Python, class functions whose names begin and end with a double underscore (__) are called special functions, as they have a special meaning.
The __init__() function is of particular interest to us now. This special function gets called whenever a new object of that class is instantiated.
This type of function is also called a constructor in object-oriented programming. We normally use it to initialize all the variables.
class Rectangle():
    name = 'Rectangle'
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height
rec1 = Rectangle(0, 2, 6, 8)
rec2 = Rectangle(4, 3, 7, 5)
print("Area of rectangle #1: {0}".format(rec1.area()))
print("Area of rectangle #2: {0}".format(rec2.area()))
Now we cannot "forget" to pass all the required data to the object upon instantiation, because Python will raise a TypeError.
rec3 = Rectangle()
print("Area of rectangle #3: {0}".format(rec3.area()))
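The failing instantiation can also be observed without stopping the program. This is a minimal sketch, assuming a Rectangle class whose constructor requires four arguments; the try/except wrapper and the raised flag are illustrative additions:

```python
class Rectangle:
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)


try:
    rec3 = Rectangle()  # no arguments given for the required parameters
    raised = False
except TypeError as error:
    # Python refuses to create the object because arguments are missing
    raised = True
    print("Instantiation failed:", error)
```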
Alternatively, we could use default values for the parameters, so a Rectangle could be constructed without defining its dimensions, but still giving value to the instance attributes.
class Rectangle():
    name = 'Rectangle'
    def __init__(self, bl_x=0, bl_y=0, ur_x=0, ur_y=0):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height
rec3 = Rectangle()
print("Area of rectangle #3: {0}".format(rec3.area()))
Generally speaking, instance attributes are for data unique to each instance and class attributes are for variables and methods shared by all instances of the class.
In the example Rectangle class, the name attribute is a class variable, because it is defined as an attribute of the class.
print(Rectangle.name)
The bl and ur attributes are instance attributes (because they are assigned through the self object). This means that each rectangle can have its own bottom-left and upper-right position, but all rectangles share the same name.
rec1.bl = (-2, 1) # has no effect on rec2
print(rec1.bl)
print(rec2.bl)
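The difference between the two kinds of attributes becomes visible when they are modified. The sketch below, which uses made-up name values, shows that changing the class attribute through the class affects every instance, while assigning to the same name through one instance creates a separate instance attribute for that object only:

```python
class Rectangle:
    name = 'Rectangle'  # class attribute, shared by all instances

    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)  # instance attributes, unique per object
        self.ur = (ur_x, ur_y)


rec1 = Rectangle(0, 2, 6, 8)
rec2 = Rectangle(4, 3, 7, 5)

Rectangle.name = 'Rect'       # modifies the shared class attribute
print(rec1.name, rec2.name)   # Rect Rect

rec1.name = 'My rectangle'    # creates an instance attribute on rec1 only
print(rec1.name)              # My rectangle
print(rec2.name)              # Rect
print(Rectangle.name)         # Rect
```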
By default, the string representation of an object consists of the type name and memory address:
print(rec1)
As we discussed, methods whose names begin and end with a double underscore (__) are called special functions in Python. The __str__() method is another special function, which can compute and return the "informal" or nicely printable string representation of an object. The return value must be a string object.
class Rectangle():
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(self.bl[0], self.bl[1], self.ur[0], self.ur[1])
    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height
rec1 = Rectangle(0, 2, 6, 8)
rec2 = Rectangle(4, 3, 7, 5)
print(rec1)
print(rec2)
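The __str__() method is used not only by print(), but also by str() and by string formatting. As a minimal sketch (the variable names are illustrative), note that an object placed inside a list is still shown with its default representation, because containers use the separate __repr__() special method for their elements:

```python
class Rectangle:
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)

    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(
            self.bl[0], self.bl[1], self.ur[0], self.ur[1])


rec1 = Rectangle(0, 2, 6, 8)

text = str(rec1)                          # calls rec1.__str__()
formatted = "Shape: {0}".format(rec1)     # formatting also uses __str__()
in_list = str([rec1])                     # list elements are shown via __repr__()

print(text)       # Rectangle (0, 2, 6, 8)
print(formatted)  # Shape: Rectangle (0, 2, 6, 8)
print(in_list)    # default representation with type name and memory address
```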
Extend the Rectangle class with a perimeter() method.
Sample usage:
result = rec1.perimeter()
# result is an integer value
class Rectangle():
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(self.bl[0], self.bl[1], self.ur[0], self.ur[1])
    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height
    def perimeter(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return 2 * (width + height)
The computation of the width and height of the rectangle is now duplicated in the area() and the perimeter() methods. Eliminate the redundancy by extracting new width() and height() methods in the Rectangle class.
class Rectangle():
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(self.bl[0], self.bl[1], self.ur[0], self.ur[1])
    def width(self):
        return self.ur[0] - self.bl[0]
    def height(self):
        return self.ur[1] - self.bl[1]
    def area(self):
        return self.width() * self.height()
    def perimeter(self):
        return 2 * (self.width() + self.height())
Extend the Rectangle class with a translate() method, which moves it in the Euclidean space in the given direction.
Sample usage:
print(rec1)
rec1.translate(3, 4)
print(rec1)
class Rectangle():
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(self.bl[0], self.bl[1], self.ur[0], self.ur[1])
    def width(self):
        return self.ur[0] - self.bl[0]
    def height(self):
        return self.ur[1] - self.bl[1]
    def area(self):
        return self.width() * self.height()
    def perimeter(self):
        return 2 * (self.width() + self.height())
    def translate(self, x, y):
        self.bl = (self.bl[0] + x, self.bl[1] + y)
        self.ur = (self.ur[0] + x, self.ur[1] + y)
Extend the Rectangle class with an overlap() method, which can decide whether two rectangles overlap each other.
Sample usage:
result = rec1.overlap(rec2)
# result is a boolean value
class Rectangle():
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)
    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(self.bl[0], self.bl[1], self.ur[0], self.ur[1])
    def width(self):
        return self.ur[0] - self.bl[0]
    def height(self):
        return self.ur[1] - self.bl[1]
    def area(self):
        return self.width() * self.height()
    def perimeter(self):
        return 2 * (self.width() + self.height())
    def translate(self, x, y):
        self.bl = (self.bl[0] + x, self.bl[1] + y)
        self.ur = (self.ur[0] + x, self.ur[1] + y)
    def overlap(self, other):
        # The first comparison uses <= so that rectangles starting at the
        # same coordinate (e.g. two identical rectangles) are also detected.
        overlap_x = ((self.bl[0] <= other.bl[0] and self.ur[0] > other.bl[0]) or
                     (other.bl[0] < self.bl[0] and other.ur[0] > self.bl[0]))
        overlap_y = ((self.bl[1] <= other.bl[1] and self.ur[1] > other.bl[1]) or
                     (other.bl[1] < self.bl[1] and other.ur[1] > self.bl[1]))
        return overlap_x and overlap_y
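Hint: Two axis-parallel rectangles overlap if they overlap in both the X and the Y dimension.
Let A_X1 and A_X2 denote the smaller and the larger X coordinate of rectangle A, and B_X1 and B_X2 those of rectangle B. The rectangles overlap in the X dimension if A_X1 <= B_X1 and A_X2 > B_X1; or B_X1 < A_X1 and B_X2 > A_X1.
A similar condition applies to the Y dimension.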
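The per-dimension idea can be tried out on plain coordinate tuples. The sketch below is not the condition from the hint, but a more compact alternative: two intervals overlap exactly when each one starts before the other one ends. The helper names intervals_overlap() and rectangles_overlap() and the (x1, y1, x2, y2) tuple layout are assumptions made for this illustration:

```python
def intervals_overlap(a1, a2, b1, b2):
    # Intervals (a1, a2) and (b1, b2) overlap if each starts before the other ends
    return a1 < b2 and b1 < a2


def rectangles_overlap(a, b):
    # a and b are tuples in the form (x1, y1, x2, y2)
    overlap_x = intervals_overlap(a[0], a[2], b[0], b[2])
    overlap_y = intervals_overlap(a[1], a[3], b[1], b[3])
    # The rectangles overlap only if both dimensions overlap
    return overlap_x and overlap_y


rec1 = (0, 2, 6, 8)
rec2 = (4, 3, 7, 5)
far_away = (10, 10, 12, 12)
touching = (6, 2, 9, 8)   # shares only an edge with rec1

print(rectangles_overlap(rec1, rec2))      # True
print(rectangles_overlap(rec1, far_away))  # False
print(rectangles_overlap(rec1, rec1))      # True
print(rectangles_overlap(rec1, touching))  # False
```

With strict inequalities, rectangles that merely share an edge are not reported as overlapping; whether that is the desired behavior depends on the application.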
A relatively new feature available since Python 3.7 (released in 2018) is the data class. A data class is a class typically containing mainly data, although there aren't really any restrictions. It is created using the @dataclass decorator, as follows:
from dataclasses import dataclass
@dataclass
class Country:
    name: str
    capital: str
    area: int
    population: int
    gdp: int
    literacy: float
    region: str = 'Unknown'
    def population_density(self):
        return self.population / self.area
The benefit of using data classes is that some special methods, e.g. the __init__() constructor method, will be automatically generated and added to the class, initializing all instance variables. The generated constructor will look like:
def __init__(self, name, capital, area, population, gdp, literacy, region='Unknown'):
    self.name = name
    self.capital = capital
    self.area = area
    self.population = population
    self.gdp = gdp
    self.literacy = literacy
    self.region = region
Since data classes are just a new syntactical approach in Python for defining classes, we can instantiate objects from data classes and use them just like before:
hungary = Country('Hungary', 'Budapest', 93030, 9981334, 13900, 99.4, 'Central-Europe')
print(hungary.capital)
print(hungary.population_density())
A string representation is also predefined for data classes: the decorator generates the __repr__() special method, which print() falls back to when no __str__() method is defined:
print(hungary)
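Besides the constructor and the string representation, the decorator also generates an __eq__() method by default, so two objects with equal field values compare as equal. The following minimal sketch uses a hypothetical Point data class with made-up values, which is not part of the course material:

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical data class, used only for this illustration
    x: int
    y: int
    label: str = 'unnamed'


a = Point(1, 2)
b = Point(1, 2)
c = Point(1, 3, 'other')

print(a)        # Point(x=1, y=2, label='unnamed')
print(a == b)   # True: the generated __eq__() compares the fields
print(a == c)   # False
print(a is b)   # False: they are still two distinct objects
```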
As you may have noticed, we also defined the type of the instance variables in the data class, e.g. population: int. This is called type hinting or type annotation, and it is mandatory when defining the fields of a data class.
Type hinting has been available in Python since version 3.5 (annotations for variables since version 3.6) and can also be used elsewhere: for local variables, function parameters, return types, etc.
Note that the Python runtime does not enforce function and variable type annotations, so whether you use them or not, they will not affect how your code is executed. However, they can be used by third-party tools such as type checkers (see mypy) or integrated development environments (IDEs) to detect potential errors in your code early.
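That annotations are not enforced at runtime can be seen in a small experiment. In this sketch the double() function is a hypothetical example; it is annotated to take and return an integer, yet calling it with a string works without any error:

```python
def double(value: int) -> int:
    # The annotations document the intent but are not checked at runtime
    return value * 2


print(double(21))    # 42
print(double("ab"))  # abab (no error, although a str was passed)

# The annotations are only stored as metadata on the function object
print(double.__annotations__)
```

A static type checker would report the second call as a type error before the program is run.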
We will not use type hinting further in this course, as Jupyter Notebook itself does not perform type checking based on them.
We do not always have to start from scratch when writing a class. If the class is a specialized version of another already existing class, we can use inheritance.
When one class inherits from another, it automatically takes on all the attributes and methods of the first class. The original class is called the parent class, and the new class is the child class. The child class inherits every attribute and method from its parent class but is also free to define new attributes and methods of its own.
Let's inherit the Square class from the Rectangle class:
class Square(Rectangle):
    def __init__(self, bl_x, bl_y, width):
        self.bl = (bl_x, bl_y)
        self.ur = (bl_x + width, bl_y + width)
s1 = Square(5, 10, 3)
print("Area of square #1: {0}".format(s1.area()))
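The parent-child relationship can be inspected with the built-in isinstance() and issubclass() functions. The sketch below repeats a reduced version of the two classes so that it runs on its own: a Square object counts as a Rectangle as well, but not the other way around.

```python
class Rectangle:
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)

    def area(self):
        width = self.ur[0] - self.bl[0]
        height = self.ur[1] - self.bl[1]
        return width * height


class Square(Rectangle):
    def __init__(self, bl_x, bl_y, width):
        self.bl = (bl_x, bl_y)
        self.ur = (bl_x + width, bl_y + width)


rec1 = Rectangle(0, 2, 6, 8)
s1 = Square(5, 10, 3)

print(isinstance(s1, Square))         # True
print(isinstance(s1, Rectangle))      # True: a child object is also a parent object
print(isinstance(rec1, Square))       # False
print(issubclass(Square, Rectangle))  # True
print(s1.area())                      # 9, computed by the inherited method
```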
__init__() function in the child class
Often we would like to reuse the original __init__() function of the parent class in the child class.
This can be done with the super() function inside the child class constructor.
This is a special function that helps Python make connections between the parent and child class.
Note: The name super comes from a convention of calling the parent class a superclass and the child class a subclass.
class Square(Rectangle):
    def __init__(self, bl_x, bl_y, width):
        super().__init__(bl_x, bl_y, bl_x + width, bl_y + width)
s1 = Square(5, 10, 3)
print("Area of square #1: {0}".format(s1.area()))
Let's see what happens if we print our s1 object:
print(s1)
It shows the text "Rectangle", because the __str__() special function was defined this way in the Rectangle parent class and now the Square child class inherited it.
We can override any method from the parent class that does not fit the model of the child class. To achieve this, we simply define a method in the child class with the same name as the method we want to override in the parent class. Python will disregard the parent class method and only pay attention to the method defined in the child class.
class Square(Rectangle):
    def __init__(self, bl_x, bl_y, width):
        self.bl = (bl_x, bl_y)
        self.ur = (bl_x + width, bl_y + width)
    def __str__(self):
        return "Square ({0}, {1}, width = {2})".format(self.bl[0], self.bl[1], self.ur[0] - self.bl[0])
s1 = Square(5, 10, 3)
print("Area of square #1: {0}".format(s1.area()))
print(s1)
Note: the super() function can be used in any overriding child class method.
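As a minimal sketch of this, an overriding __str__() method can call the parent's version through super() and build on its result instead of replacing it entirely. The wording of the returned text is only an illustrative choice:

```python
class Rectangle:
    def __init__(self, bl_x, bl_y, ur_x, ur_y):
        self.bl = (bl_x, bl_y)
        self.ur = (ur_x, ur_y)

    def __str__(self):
        return "Rectangle ({0}, {1}, {2}, {3})".format(
            self.bl[0], self.bl[1], self.ur[0], self.ur[1])


class Square(Rectangle):
    def __init__(self, bl_x, bl_y, width):
        # Reuse the parent constructor
        super().__init__(bl_x, bl_y, bl_x + width, bl_y + width)

    def __str__(self):
        # Extend the parent's string representation instead of rewriting it
        return "Square, stored as " + super().__str__()


s1 = Square(5, 10, 3)
text = str(s1)
print(text)  # Square, stored as Rectangle (5, 10, 8, 13)
```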
Child classes may also extend the functionality of their parent class by adding new methods to themselves.
class Square(Rectangle):
    def __init__(self, bl_x, bl_y, width):
        self.bl = (bl_x, bl_y)
        self.ur = (bl_x + width, bl_y + width)
    def __str__(self):
        return "Square ({0}, {1}, width = {2})".format(self.bl[0], self.bl[1], self.ur[0] - self.bl[0])
    def side(self):
        # Side length: difference of the X coordinates of the two corners
        return self.ur[0] - self.bl[0]
s1 = Square(5, 10, 3)
print(s1.side())
The Rectangle class does not have this new side() method, so the following call raises an AttributeError:
print(rec1.side())