Python OOP: Classes, Dataclasses, and Dunder Methods
Python OOP isn't Java OOP: from __init__ to dataclasses to __slots__, here's what actually matters
Abstract Algorithms
Why Every Java Developer Writes Un-Pythonic Classes on Day One
Imagine a developer, let's call him Daniel, who has written Java for six years. He sits down to write his first Python class and produces this:
class BankAccount:
    def __init__(self):
        self.__balance = 0.0   # private field
        self.__owner = ""      # private field

    def getBalance(self):      # getter
        return self.__balance

    def setBalance(self, value):  # setter with no validation
        self.__balance = value

    def getOwner(self):
        return self.__owner

    def setOwner(self, name):
        self.__owner = name
This code runs. It even looks familiar. But it is profoundly un-Pythonic, and every seasoned Python developer will tell you to throw it out.
The core problem is that Daniel is fighting two assumptions baked into Python: attributes are public by default, and data validation belongs to properties, not setters. In Java, encapsulation is enforced by the compiler via private. In Python, encapsulation is a social contract: by convention, a single underscore prefix (_balance) signals "internal use", and a double underscore (__balance) invokes name mangling to prevent accidental overrides in subclasses. Neither prevents access from outside the class.
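A quick sketch makes the "social contract" concrete. Name mangling only renames the attribute to _ClassName__name; it does not make it private:

```python
class Account:
    def __init__(self):
        self._internal = 1   # single underscore: "internal use" convention only
        self.__mangled = 2   # double underscore: stored as _Account__mangled

a = Account()
print(a._internal)          # 1 -- nothing stops outside access
print(a._Account__mangled)  # 2 -- mangling renames, it does not hide
```

Both attributes remain reachable from outside the class; the underscores communicate intent, nothing more.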
The idiomatic replacement uses @property:
class BankAccount:
    def __init__(self, owner: str, balance: float = 0.0):
        self._owner = owner
        self._balance = balance

    @property
    def balance(self) -> float:
        return self._balance

    @balance.setter
    def balance(self, value: float) -> None:
        if value < 0:
            raise ValueError("Balance cannot be negative")
        self._balance = value

    @property
    def owner(self) -> str:
        return self._owner
Now callers write account.balance = 500 instead of account.setBalance(500), and the validation is invisible to the caller. The attribute behaves like a simple field on the outside but runs arbitrary logic on the inside. This is the Pythonic way.
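To see the caller's view in action, here is a condensed, self-contained version of the class above with a usage check:

```python
class BankAccount:
    def __init__(self, owner: str, balance: float = 0.0):
        self._owner = owner
        self._balance = balance

    @property
    def balance(self) -> float:
        return self._balance

    @balance.setter
    def balance(self, value: float) -> None:
        if value < 0:
            raise ValueError("Balance cannot be negative")
        self._balance = value

account = BankAccount("Alice", 100.0)
account.balance = 500        # plain attribute syntax, validated setter runs
try:
    account.balance = -10    # validation fires transparently
except ValueError as exc:
    print(exc)               # Balance cannot be negative
```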
The second Java habit that breaks in Python is interface-based programming. In Java, you define an interface Drawable and every class that can be drawn must declare implements Drawable. In Python, this is replaced by duck typing: if an object has a draw() method, it is drawable, no declaration required. The name comes from the phrase "if it walks like a duck and quacks like a duck, it is a duck." You write code that calls obj.draw() and trust that the caller passes something with a draw() method. If they do not, an AttributeError surfaces at runtime. For cases where you truly want compile-time-style contracts, Python provides Abstract Base Classes (ABC) and the Protocol class, but these are opt-in, not the default.
Understanding these two shifts (properties over getters/setters, duck typing over interfaces) is the foundation for writing Python that looks like Python rather than Java with different syntax.
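Duck typing can be sketched in a few lines. Two unrelated classes share only a method name, and the calling code works with both (the class and function names here are illustrative, not from the original post):

```python
class Circle:
    def draw(self):
        return "drawing a circle"

class Report:  # unrelated class, same method name
    def draw(self):
        return "rendering a report"

def render(obj):
    # no interface declaration -- anything with a .draw() method works
    return obj.draw()

print(render(Circle()))  # drawing a circle
print(render(Report()))  # rendering a report
```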
Anatomy of a Python Class: __init__, self, and the Difference Between Instance and Class State
Before diving into inheritance and dunder methods, it helps to be precise about what a Python class actually is and where data lives.
A class is a callable object. When you call BankAccount("Alice", 100), Python calls BankAccount.__init__ with a freshly created object as the first argument (by convention named self). The __init__ method does not create the object (that is __new__'s job) but it initialises the instance's attributes. The distinction matters when you customise object creation, which we will cover in the Deep Dive section.
Instance attributes belong to an individual object. They live in obj.__dict__, a plain Python dictionary. Every instance gets its own copy:
class Counter:
    def __init__(self, start: int = 0):
        self.count = start  # instance attribute, stored in self.__dict__

a = Counter(10)
b = Counter(20)
print(a.count)     # 10
print(b.count)     # 20
print(a.__dict__)  # {'count': 10}
Class attributes belong to the class itself, not any instance. All instances share the same value until an instance overrides it locally:
class Counter:
    default_step = 1  # class attribute, stored in Counter.__dict__

    def __init__(self, start: int = 0):
        self.count = start

    def increment(self):
        self.count += self.default_step

c = Counter()
c.increment()
print(c.count)  # 1
Counter.default_step = 5
c.increment()
print(c.count)  # 6 -- Counter.default_step changed for everyone
c.default_step = 10  # now c has its own instance-level shadow
c.increment()
print(c.count)  # 16 -- uses c's own shadow, not the class attribute
This shadowing behaviour surprises many developers. The rule is simple: Python looks up attributes in the instance __dict__ first, then the class __dict__, then base classes in MRO order. The first match wins.
The self parameter is not a keyword in Python. It is merely a convention; any name works. What matters is that Python automatically passes the calling instance as the first positional argument to every regular method. The name self is the community standard and you should follow it.
Inheritance, super(), and How Python Decides Which Method to Call
Python supports both single inheritance (one parent) and multiple inheritance (multiple parents). Multiple inheritance is powerful but introduces ambiguity when two parent classes define the same method. Python resolves this with the Method Resolution Order (MRO), computed using the C3 linearisation algorithm.
The MRO for any class is a deterministic, left-to-right depth-first ordering with the constraint that a parent never appears before all of its children. You can inspect it:
class A:
    def hello(self): return "A"

class B(A):
    def hello(self): return "B"

class C(A):
    def hello(self): return "C"

class D(B, C):
    pass

print(D.mro())
# [<class 'D'>, <class 'B'>, <class 'C'>, <class 'A'>, <class 'object'>]
d = D()
print(d.hello())  # "B"
The MRO tells you: start at D, then check B (found hello here, so stop). If B had not defined hello, Python would check C next, then A, then object.
super() is the idiomatic way to call a method from the next class in the MRO. It does not mean "my parent class"; it means "the next class in the MRO". This distinction matters in multiple inheritance:
class Animal:
    def __init__(self, name: str):
        self.name = name

class Flyable:
    def __init__(self):
        self.can_fly = True

class Bird(Animal, Flyable):
    def __init__(self, name: str):
        super().__init__(name)  # calls Animal.__init__ per MRO
        # Animal.__init__ does not chain to super(), so Flyable.__init__
        # never runs -- Bird has to set can_fly itself
        self.can_fly = True
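The Bird example above only reaches Animal.__init__; a cooperative variant (a common pattern, sketched here as an assumption rather than taken from the original code) chains super() in every __init__ with **kwargs, so the entire MRO runs:

```python
class Animal:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)  # keep the MRO chain going
        self.name = name

class Flyable:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.can_fly = True

class Bird(Animal, Flyable):
    def __init__(self, name):
        super().__init__(name=name)  # runs Animal, then Flyable, then object

b = Bird("sparrow")
print(b.name, b.can_fly)  # sparrow True
```

Because each __init__ forwards leftover keyword arguments, every class in Bird's MRO gets a chance to initialise its own state.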
Python provides three types of methods, and choosing the right one is a common source of confusion:
| Method type | Decorator | First argument | When to use |
| --- | --- | --- | --- |
| Instance method | (none) | self, the instance | Needs access to instance state |
| Class method | @classmethod | cls, the class itself | Factory constructors, class-level operations |
| Static method | @staticmethod | (none) | Pure utility with no state dependency |
class Temperature:
    def __init__(self, celsius: float):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> "Temperature":
        """Factory: alternative constructor using @classmethod."""
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def absolute_zero_celsius() -> float:
        """Static: no self, no cls needed."""
        return -273.15

    def to_fahrenheit(self) -> float:
        """Instance: reads self.celsius."""
        return self.celsius * 9 / 5 + 32

t = Temperature.from_fahrenheit(212)
print(t.celsius)                            # 100.0
print(Temperature.absolute_zero_celsius())  # -273.15
print(t.to_fahrenheit())                    # 212.0
isinstance(obj, SomeClass) checks whether obj is an instance of SomeClass or any subclass. issubclass(Sub, Base) checks the class hierarchy without needing an object. Both are indispensable for writing polymorphic code safely.
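A minimal check of both functions, including the tuple form isinstance accepts:

```python
class Shape: ...
class Circle(Shape): ...

c = Circle()
print(isinstance(c, Shape))         # True -- subclass instances count
print(issubclass(Circle, Shape))    # True -- no instance needed
print(isinstance(c, (int, Shape)))  # True -- a tuple checks any of several types
```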
The diagram below shows the MRO resolution path for a diamond inheritance scenario, which is the classic case where the C3 algorithm earns its keep.
flowchart TD
D[D - search starts here] --> B[B - found hello - STOP]
D --> C[C - would check next if B had no hello]
B --> A[A - base class]
C --> A
A --> OBJ[object - Python root]
In this flowchart, D.hello() resolves to B.hello because B appears before C in D's MRO. The diamond is resolved by visiting each class at most once, in left-to-right depth-first order, guaranteeing a deterministic and consistent lookup without ambiguity.
Deep Dive: The Python Object Model and Its Performance Implications
The Internals of the Python Object Model
Every Python object has three core attributes: an identity (id(obj), which in CPython is its memory address), a type (type(obj), its class), and a value (stored in obj.__dict__ for regular objects). Understanding how Python looks up attributes via __dict__ explains almost every OOP behaviour you encounter.
Attribute lookup chain: When you write obj.attr, Python executes roughly this sequence:
- Check
type(obj).__mro__for a data descriptor (an object with both__get__and__set__). Data descriptors win over instance dictionaries. - Check
obj.__dict__for an instance attribute. - Check
type(obj).__mro__for a non-data descriptor (only__get__) or plain class attribute. - Raise
AttributeErrorif nothing is found.
@property is a data descriptor. It lives in the class's __dict__ as a property object with __get__, __set__, and __delete__ methods. When you write obj.balance, Python finds the property object in the class first (step 1), calls its __get__, which calls your getter function. This is why @property can intercept attribute access even though the caller writes plain obj.balance, with no method-call syntax required.
You can verify this directly:
class Circle:
    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value <= 0:
            raise ValueError("Radius must be positive")
        self._radius = value

# The property object lives in the class __dict__:
print(type(Circle.__dict__['radius']))  # <class 'property'>
print(Circle.__dict__['radius'].fget)   # <function Circle.radius ...>
__slots__ is Python's mechanism for trading flexibility for memory and speed. Normally, every instance carries a __dict__, a hash table that can hold any attribute. For classes with many instances and fixed attribute sets, this wastes 200–400 bytes per object. Declaring __slots__ replaces __dict__ with a compact fixed-size array:
class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
# p.__dict__ does not exist -- accessing it raises AttributeError
# p.z = 3.0 would raise AttributeError -- no new attributes allowed
A Point with __slots__ uses approximately 56 bytes; the same class without __slots__ uses roughly 256 bytes. At one million instances, that is around 200 MB saved, which is meaningful in data-intensive applications.
__class__ is a writable reference to the object's type. Python uses it for isinstance checks and dynamic method dispatch. Changing it is rarely justified outside of migration code, but knowing it exists helps you understand how frameworks do runtime patching (mocking, proxying).
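A minimal sketch of what "writable" means here, with two hypothetical classes that have compatible layouts (reassigning __class__ only works between ordinary classes like these):

```python
class Draft:
    def status(self):
        return "draft"

class Published:
    def status(self):
        return "published"

doc = Draft()
doc.__class__ = Published   # rebind the object's type at runtime
print(isinstance(doc, Published))  # True
print(doc.status())                # published
```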
Performance Analysis: __slots__, Dataclasses, and the __hash__/__eq__ Contract
__slots__ memory savings are most impactful in tight loops or large collections:
import sys

class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

pd = PointDict(1, 2)
ps = PointSlots(1, 2)
print(sys.getsizeof(pd) + sys.getsizeof(pd.__dict__))  # ~360 bytes
print(sys.getsizeof(ps))                               # ~56 bytes
@dataclass vs manual __init__: @dataclass auto-generates __init__, __repr__, and __eq__ based on declared fields. The generated code is equivalent to what you would write by hand but cannot be forgotten or mistyped. For classes with five or more fields, @dataclass is always faster to write and less error-prone than manual __init__.
The __hash__ / __eq__ contract is critical: if a == b, then hash(a) must equal hash(b). Python enforces a corollary: if you define __eq__ without __hash__, Python sets __hash__ = None, making the class unhashable. This prevents bugs where equal objects hash differently and silently produce incorrect set/dict behaviour. To make a class usable as a dictionary key or set member, you must define both, or use @dataclass(frozen=True) which handles both automatically.
from dataclasses import dataclass

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float
    # frozen=True generates __hash__ and makes instances immutable

c1 = Coordinate(51.5, -0.12)
c2 = Coordinate(51.5, -0.12)
print(c1 == c2)              # True
print(hash(c1) == hash(c2))  # True
visited = {c1, c2}
print(len(visited))          # 1 -- same coordinate, deduplicated
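The trap side of the contract is easy to demonstrate: defining __eq__ alone silently disables hashing (the Tag class below is illustrative):

```python
class Tag:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Tag) and self.name == other.name

print(Tag.__hash__)    # None -- Python disabled hashing automatically
try:
    {Tag("python")}    # sets require hashable members
except TypeError as exc:
    print(exc)         # unhashable type: 'Tag'
```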
Visualising the Shape Hierarchy: A Real Domain Class Diagram
The following diagram models a geometry domain with a Shape base class and two concrete subclasses, Circle and Rectangle. It shows inheritance relationships, shared dunder methods, and the contract each class fulfils.
classDiagram
class Shape {
+str color
+__init__(color: str)
+area() float
+perimeter() float
+__str__() str
+__eq__(other) bool
+__hash__() int
}
class Circle {
+float radius
+__init__(color: str, radius: float)
+area() float
+perimeter() float
+__str__() str
}
class Rectangle {
+float width
+float height
+__init__(color: str, width: float, height: float)
+area() float
+perimeter() float
+__str__() str
}
Shape <|-- Circle : inherits
Shape <|-- Rectangle : inherits
Shape defines the interface contract: any concrete subclass must implement area() and perimeter(). The __eq__ and __hash__ methods live on Shape so that two shapes with the same colour and dimensions compare as equal and can be stored in sets without duplicates. Circle and Rectangle each override __str__ to produce human-readable output while inheriting the comparison logic from Shape; this is the open/closed principle at work in Python.
Python OOP in the Wild: Dataclasses, Context Managers, and Custom Iterators
Beyond basic class definitions, Python's real power comes from implementing specific dunder methods that let your objects integrate seamlessly with built-in language constructs.
@dataclass for clean data containers eliminates boilerplate for pure data classes. Instead of writing __init__, __repr__, and __eq__ manually, you declare fields with type annotations and let the decorator generate them:
from dataclasses import dataclass, field
from typing import List

@dataclass
class Order:
    order_id: str
    customer: str
    items: List[str] = field(default_factory=list)
    total: float = 0.0

    def add_item(self, item: str, price: float) -> None:
        self.items.append(item)
        self.total += price

o = Order("ORD-001", "Alice")
o.add_item("Widget", 9.99)
print(o)
# Order(order_id='ORD-001', customer='Alice', items=['Widget'], total=9.99)
Context managers with __enter__ and __exit__ make your class work with with statements, guaranteeing cleanup even when exceptions occur. This is how Python handles files, database connections, locks, and any other resource that must be released:
class ManagedConnection:
    def __init__(self, host: str):
        self.host = host
        self.conn = None

    def __enter__(self):
        print(f"Connecting to {self.host}")
        self.conn = {"host": self.host, "active": True}  # simulate a connection
        return self.conn

    def __exit__(self, exc_type, exc_val, exc_tb):
        print(f"Closing connection to {self.host}")
        self.conn["active"] = False
        return False  # do not suppress exceptions

with ManagedConnection("db.prod.example.com") as conn:
    print(f"Using: {conn}")
# Prints: Connecting, Using, Closing -- even if an exception is raised inside the block
Custom iterators with __iter__ and __next__ let your objects work with for loops, list(), sum(), and any other construct that consumes iterables:
class Countdown:
    def __init__(self, start: int):
        self.current = start

    def __iter__(self):
        return self  # the iterator is the object itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

for n in Countdown(5):
    print(n, end=" ")  # 5 4 3 2 1
__repr__ vs __str__ serve different audiences. __str__ is for end users: it should be human-readable and can omit internal detail. __repr__ is for developers and debugging: it should be unambiguous, ideally eval()-able to reconstruct the object. When only one is defined, Python falls back to __repr__ for both. The convention: always define __repr__; only add __str__ when you need a different display for users.
class Vector:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"Vector({self.x!r}, {self.y!r})"  # developer-facing

    def __str__(self) -> str:
        return f"({self.x}, {self.y})"  # user-facing

v = Vector(3.0, 4.0)
print(repr(v))  # Vector(3.0, 4.0)
print(str(v))   # (3.0, 4.0)
Choosing the Right Python Data Container: dataclass vs namedtuple vs TypedDict vs dict
Python offers four main ways to represent structured data, and the right choice depends on mutability, type safety, performance, and serialization needs:
| Construct | Mutable | Type-checked | Hashable by default | Best for |
| --- | --- | --- | --- | --- |
| dict | Yes | No | No | Ad-hoc data, JSON payloads |
| TypedDict | Yes | IDE only (no runtime) | No | Typed dicts for APIs, JSON |
| namedtuple | No | No | Yes | Lightweight immutable records |
| @dataclass | Yes (default) | IDE + __annotations__ | No (yes if frozen=True) | Domain models, config objects |
| @dataclass(frozen=True) | No | IDE + __annotations__ | Yes | Immutable value objects, dict keys |
When to use @dataclass(frozen=True): whenever the object represents a value (a point, a currency amount, a date range) rather than an entity. Values are equal if their fields are equal; entities are equal only if their identities match. Freezing also enables hashing, making the object usable in sets and as dict keys.
ABC vs Protocol for interfaces: ABC (Abstract Base Class) is nominal: you must explicitly inherit from it. Protocol (from typing) is structural: any class with the required methods satisfies the protocol without declaring it. Use ABC when you control the class hierarchy and want enforced overrides. Use Protocol for duck-typed checks, third-party classes, and type annotations in libraries.
from abc import ABC, abstractmethod
from typing import Protocol

# ABC: explicit, enforced at instantiation time
class Drawable(ABC):
    @abstractmethod
    def draw(self) -> None: ...

# Protocol: structural, checked only by type checkers (mypy, pyright)
class Printable(Protocol):
    def print_summary(self) -> str: ...
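The enforcement difference shows up at runtime. An ABC rejects incomplete subclasses at instantiation, while a Protocol marked @runtime_checkable (a stdlib feature, used here as an illustrative extension of the snippet above) lets isinstance perform a structural check:

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable

class Drawable(ABC):
    @abstractmethod
    def draw(self) -> None: ...

class Incomplete(Drawable):  # forgets to implement draw()
    pass

try:
    Incomplete()             # ABC enforcement happens here
except TypeError as exc:
    print(type(exc).__name__)  # TypeError

@runtime_checkable
class Printable(Protocol):
    def print_summary(self) -> str: ...

class Invoice:               # never mentions Printable
    def print_summary(self) -> str:
        return "Invoice #1"

print(isinstance(Invoice(), Printable))  # True -- structural match
```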
Python OOP Decision Guide: Matching Your Need to the Right Construct
| I need to... | Use this |
| --- | --- |
| Store data with type hints and auto-generated __init__ / __repr__ | @dataclass |
| Create an immutable, hashable value object (usable as a dict key) | @dataclass(frozen=True) |
| Return multiple values from a function without a full class | namedtuple or tuple |
| Type-annotate a dict payload (e.g. a JSON body) | TypedDict |
| Validate data at construction time with rich error messages | pydantic.BaseModel |
| Enforce a method contract on subclasses | ABC with @abstractmethod |
| Apply a structural interface check for duck typing | typing.Protocol |
| Reduce memory usage for millions of instances | __slots__ |
| Add custom iteration to an object | __iter__ + __next__ |
| Integrate with with blocks for resource cleanup | __enter__ + __exit__ |
| Control how an object displays in print / logs | __str__ and __repr__ |
| Make objects comparable with == and sortable with < | __eq__, __lt__, __le__ |
| Make objects usable in sets or as dict keys | __hash__ (+ __eq__) |
Building a Payment Domain in Python: A Complete Runnable Example
This section demonstrates a realistic payment processing domain model. It combines inheritance, abstract base classes, dataclasses, dunder methods, and properties into a single runnable Python module. The goal is to show how these concepts compose in production-shaped code, not toy examples. Pay attention to how @abstractmethod enforces the process() contract, how __repr__ makes debugging easy, and how __eq__ and __hash__ let payments be deduplicated in a set.
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# -- Value Object --------------------------------------------------------------

@dataclass(frozen=True)
class Money:
    """Immutable value object representing an amount in a given currency."""
    amount: float
    currency: str = "USD"

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError(f"Amount cannot be negative: {self.amount}")

    def __add__(self, other: Money) -> Money:
        if self.currency != other.currency:
            raise ValueError("Cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)

    def __str__(self) -> str:
        return f"{self.currency} {self.amount:.2f}"

# -- Abstract Base -------------------------------------------------------------

class Payment(ABC):
    """Base class for all payment types. Enforces the process() contract."""

    def __init__(self, payment_id: str, amount: Money):
        self._payment_id = payment_id
        self._amount = amount
        self._processed_at: datetime | None = None

    @property
    def payment_id(self) -> str:
        return self._payment_id

    @property
    def amount(self) -> Money:
        return self._amount

    @property
    def is_processed(self) -> bool:
        return self._processed_at is not None

    @abstractmethod
    def process(self) -> bool:
        """Execute the payment. Returns True on success."""
        ...

    @abstractmethod
    def _provider_name(self) -> str:
        """Human-readable provider name for repr/logging."""
        ...

    def __repr__(self) -> str:
        status = "processed" if self.is_processed else "pending"
        return (
            f"{self.__class__.__name__}("
            f"id={self._payment_id!r}, "
            f"amount={self._amount}, "
            f"status={status!r})"
        )

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Payment):
            return NotImplemented
        return self._payment_id == other._payment_id

    def __hash__(self) -> int:
        return hash(self._payment_id)

# -- Concrete Implementations --------------------------------------------------

class CreditCardPayment(Payment):
    """Payment via credit card with last-four-digits masking."""

    def __init__(self, payment_id: str, amount: Money, card_number: str):
        super().__init__(payment_id, amount)
        if len(card_number) < 4:
            raise ValueError("Invalid card number")
        self._last_four = card_number[-4:]

    @property
    def masked_card(self) -> str:
        return f"****-****-****-{self._last_four}"

    def process(self) -> bool:
        print(f"Charging {self._amount} to card {self.masked_card}")
        self._processed_at = datetime.now()
        return True

    def _provider_name(self) -> str:
        return "CreditCard"

class PayPalPayment(Payment):
    """Payment routed through PayPal with email association."""

    def __init__(self, payment_id: str, amount: Money, paypal_email: str):
        super().__init__(payment_id, amount)
        self._email = paypal_email

    def process(self) -> bool:
        print(f"Sending {self._amount} via PayPal to {self._email}")
        self._processed_at = datetime.now()
        return True

    def _provider_name(self) -> str:
        return "PayPal"

# -- Payment Processor (works with any Payment subclass) -----------------------

@dataclass
class PaymentProcessor:
    """Processes a batch of payments. Works with any Payment subclass."""
    payments: List[Payment] = field(default_factory=list)

    def add(self, payment: Payment) -> None:
        self.payments.append(payment)

    def process_all(self) -> None:
        for payment in self.payments:
            if not payment.is_processed:
                success = payment.process()
                if success:
                    print(f"  OK: {payment}")
                else:
                    print(f"  FAILED: {payment}")

    def total(self) -> float:
        return sum(p.amount.amount for p in self.payments if p.is_processed)

# -- Demo ----------------------------------------------------------------------

if __name__ == "__main__":
    processor = PaymentProcessor()
    processor.add(CreditCardPayment("PAY-001", Money(49.99), "4111111111111234"))
    processor.add(PayPalPayment("PAY-002", Money(19.99), "alice@example.com"))
    processor.add(CreditCardPayment("PAY-003", Money(99.00), "5500005555555559"))
    processor.process_all()
    print(f"\nTotal processed: USD {processor.total():.2f}")

    # Deduplication via __eq__ and __hash__
    p1 = CreditCardPayment("PAY-001", Money(49.99), "4111111111111234")
    p2 = CreditCardPayment("PAY-001", Money(49.99), "4111111111111234")
    unique = {p1, p2}
    print(f"\nDuplicate check - unique payments in set: {len(unique)}")  # 1
Running this produces:
Charging USD 49.99 to card ****-****-****-1234
  OK: CreditCardPayment(id='PAY-001', amount=USD 49.99, status='processed')
Sending USD 19.99 via PayPal to alice@example.com
  OK: PayPalPayment(id='PAY-002', amount=USD 19.99, status='processed')
Charging USD 99.00 to card ****-****-****-5559
  OK: CreditCardPayment(id='PAY-003', amount=USD 99.00, status='processed')

Total processed: USD 168.98

Duplicate check - unique payments in set: 1
Every element of this example earns its place: @dataclass(frozen=True) on Money ensures currency arithmetic is safe and value-equal amounts hash identically. @abstractmethod on process() guarantees a TypeError at instantiation time if a subclass forgets to implement it. __eq__ and __hash__ on Payment use payment_id as the natural key, enabling set-based deduplication without any additional bookkeeping.
Production-Grade Python Objects: pydantic, attrs, and stdlib dataclasses
Three major libraries build on Python's object model to make data validation, immutability, and schema generation faster and more reliable than hand-rolled classes.
stdlib dataclasses (Python 3.7+) is the zero-dependency baseline. It auto-generates __init__, __repr__, __eq__, and optionally __hash__ via frozen=True. Use it for internal data models where you control all inputs. Limitation: no runtime validation; setting amount = -5 on a @dataclass succeeds silently unless you add a __post_init__ check.
from dataclasses import dataclass

@dataclass
class Config:
    host: str
    port: int = 8080
    debug: bool = False
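Closing that validation gap with __post_init__ looks like this (the port-range rule is an illustrative assumption, not part of the original Config):

```python
from dataclasses import dataclass

@dataclass
class Config:
    host: str
    port: int = 8080
    debug: bool = False

    def __post_init__(self):
        # dataclasses call this hook right after the generated __init__
        if not (0 < self.port < 65536):
            raise ValueError(f"Invalid port: {self.port}")

print(Config("localhost"))  # Config(host='localhost', port=8080, debug=False)
try:
    Config("localhost", port=-1)
except ValueError as exc:
    print(exc)              # Invalid port: -1
```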
pydantic (v2) is the validation powerhouse. It enforces types at construction time, coerces compatible values (e.g. "42" becomes 42 for an int field), and generates JSON Schema for free. Stripe, FastAPI, and dozens of production Python services use Pydantic models as their API request/response layer.
from pydantic import BaseModel, field_validator

class PaymentRequest(BaseModel):
    payment_id: str
    amount: float
    currency: str = "USD"

    @field_validator("amount")
    @classmethod
    def amount_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            raise ValueError("Amount must be positive")
        return v

req = PaymentRequest(payment_id="PAY-001", amount=49.99)
print(req.model_dump())  # {'payment_id': 'PAY-001', 'amount': 49.99, 'currency': 'USD'}
attrs predates dataclasses and remains popular in performance-sensitive and library code. It supports validators, converters, and __slots__ generation via a simple decorator API. The cattrs companion library provides serialisation and deserialisation:
import attrs

@attrs.define
class Endpoint:
    host: str = attrs.field(validator=attrs.validators.instance_of(str))
    port: int = attrs.field(default=443)
    tls: bool = attrs.field(default=True)

e = Endpoint(host="api.example.com")
print(e)  # Endpoint(host='api.example.com', port=443, tls=True)
For a deeper look at Pydantic in the context of FastAPI request validation and schema generation, see the planned Python web frameworks follow-up post in this series.
Lessons Learned from Writing Pythonic Classes
Start with @dataclass, upgrade when you feel friction. Plain @dataclass handles 80% of use cases. Add frozen=True when you need hashability or immutability. Graduate to pydantic when you need runtime validation of external inputs (API payloads, config files). Add __slots__ only when profiling confirms memory is the bottleneck.
Do not port Java getters and setters. Direct attribute access is idiomatic Python. Use @property only when you genuinely need to intercept access for validation, lazy computation, or deprecation warnings, not as a default encapsulation strategy.
__repr__ is non-negotiable. Every class that you will ever inspect in a debugger or REPL should have __repr__. The three minutes it takes to write it saves hours of print(obj.__dict__) archaeology. @dataclass and Pydantic generate it for free.
Multiple inheritance requires conscious design. Python makes multiple inheritance syntactically trivial but semantically complex. If you are building a library, prefer composition and Protocol over deep inheritance trees. If you must use multiple inheritance, inspect .mro() and document the intended resolution order explicitly.
The __hash__/__eq__ contract is a trap for the unwary. Defining __eq__ without __hash__ makes your class unhashable (Python sets __hash__ = None). Always define both together, or use @dataclass(frozen=True) to handle the contract automatically.
__slots__ is a last resort, not a best practice. It complicates inheritance (each class in the hierarchy must declare its own __slots__) and prevents __dict__, which breaks some introspection tools. Use it only after profiling confirms memory savings are worth the complexity cost.
Python OOP: What Actually Matters
TLDR: Python OOP is attribute-lookup chains, not access modifiers. Use @property instead of getters/setters, @dataclass instead of manual __init__, and dunder methods to make your objects integrate with Python's built-in constructs. For validation, reach for pydantic. For memory-critical inner loops, reach for __slots__. For everything else, plain @dataclass, with frozen=True where immutability matters, is the Pythonic default.
| Concept | Python idiom |
| --- | --- |
| Encapsulation | @property with _underscore convention |
| Interface | ABC (enforced) or Protocol (structural) |
| Data class | @dataclass or pydantic.BaseModel |
| Immutable value | @dataclass(frozen=True) |
| Memory optimisation | __slots__ |
| Iterator protocol | __iter__ + __next__ |
| Resource management | __enter__ + __exit__ |
| Equality and hashing | __eq__ + __hash__ (always together) |
Python OOP Practice Quiz
Test your understanding of the concepts covered in this post.
1. A Python class has both A and B as parents (in that order) via class C(A, B). Both A and B define a method greet(). Which version does C().greet() call, and why?

Correct Answer: A.greet(), because the MRO places A before B in the resolution order. Python's C3 linearisation visits parents left to right, and the first match wins.

2. What is the difference between defining __str__ and __repr__ on a class? Which is called when you place an object inside a list and print() the list?

Correct Answer: __str__ is user-facing and human-readable; __repr__ is developer-facing and should be unambiguous (ideally eval()-able). When printing a list, Python calls __repr__ on each element, not __str__.

3. If you define __eq__ on a class, what happens to __hash__ by default? How do you fix it if you need instances to be usable as dictionary keys?

Correct Answer: Python sets __hash__ = None, making the class unhashable. To restore hashability, either explicitly define __hash__ alongside __eq__, or use @dataclass(eq=True, frozen=True), which generates a consistent __hash__ automatically.

4. What is the output of the following code, and why?

class Animal:
    legs = 4

dog = Animal()
dog.legs = 2
print(Animal.legs)
print(dog.legs)

Correct Answer: Animal.legs prints 4; dog.legs prints 2. Assigning dog.legs = 2 creates an instance attribute that shadows the class attribute for that specific instance. The class attribute itself is unchanged.

5. When would you choose typing.Protocol over abc.ABC as the mechanism for defining an interface in Python?

Correct Answer: Use Protocol when you want structural (duck-typed) type checking without requiring classes to explicitly inherit from the interface, which is especially useful for third-party classes or when you cannot modify the class hierarchy. Use ABC when you want enforced inheritance and instantiation-time errors if abstract methods are not implemented.

6. Open-ended challenge: Take the Payment domain model from the Practical section. Add a BankTransferPayment class that introduces a routing_number attribute with @property validation (routing numbers must be exactly 9 digits). Implement the process() method to simulate a bank transfer, and verify that the __hash__/__eq__ behaviour holds for two BankTransferPayment objects with the same payment_id.