All Posts

Python OOP: Classes, Dataclasses, and Dunder Methods

Python OOP isn't Java OOP โ€” from __init__ to dataclasses to __slots__, here's what actually matters

Abstract AlgorithmsAbstract Algorithms
ยทยท22 min read
Share
AI Share on X / Twitter
AI Share on LinkedIn
Copy link

AI-assisted content. This post may have been written or enhanced with the help of AI tools. While efforts are made to ensure accuracy, the content may contain errors or inaccuracies. Please verify critical information independently.


๐Ÿ“– Why Every Java Developer Writes Un-Pythonic Classes on Day One

Imagine a developer โ€” let's call him Daniel โ€” who has written Java for six years. He sits down to write his first Python class and produces this:

class BankAccount:
    def __init__(self):
        self.__balance = 0.0        # private field
        self.__owner = ""           # private field

    def getBalance(self):           # getter
        return self.__balance

    def setBalance(self, value):    # setter with no validation
        self.__balance = value

    def getOwner(self):
        return self.__owner

    def setOwner(self, name):
        self.__owner = name

This code runs. It even looks familiar. But it is profoundly un-Pythonic, and every seasoned Python developer will tell you to throw it out.

The core problem is that Daniel is fighting two assumptions baked into Python: attributes are public by default, and data validation belongs to properties, not setters. In Java, encapsulation is enforced by the compiler via private. In Python, encapsulation is a social contract โ€” by convention, a single underscore prefix (_balance) signals "internal use", and a double underscore (__balance) invokes name-mangling to prevent accidental overrides in subclasses. Neither prevents access from outside the class.

The idiomatic replacement uses @property:

class BankAccount:
    def __init__(self, owner: str, balance: float = 0.0):
        self._owner = owner
        self._balance = balance

    @property
    def balance(self) -> float:
        return self._balance

    @balance.setter
    def balance(self, value: float) -> None:
        if value < 0:
            raise ValueError("Balance cannot be negative")
        self._balance = value

    @property
    def owner(self) -> str:
        return self._owner

Now callers write account.balance = 500 instead of account.setBalance(500), and the validation is invisible to the caller. The attribute behaves like a simple field on the outside but runs arbitrary logic on the inside. This is the Pythonic way.

The second Java habit that breaks in Python is interface-based programming. In Java, you define an interface Drawable and every class that can be drawn must declare implements Drawable. In Python, this is replaced by duck typing: if an object has a draw() method, it is drawable โ€” no declaration required. The name comes from the phrase "if it walks like a duck and quacks like a duck, it is a duck." You write code that calls obj.draw() and trust that the caller passes something with a draw() method. If they do not, a AttributeError surfaces at runtime. For cases where you truly want compile-time-style contracts, Python provides Abstract Base Classes (ABC) and the Protocol class โ€” but these are opt-in, not the default.

Understanding these two shifts โ€” properties over getters/setters, duck typing over interfaces โ€” is the foundation for writing Python that looks like Python rather than Java with different syntax.


๐Ÿ” Anatomy of a Python Class: init, self, and the Difference Between Instance and Class State

Before diving into inheritance and dunder methods, it helps to be precise about what a Python class actually is and where data lives.

A class is a callable object. When you call BankAccount("Alice", 100), Python calls BankAccount.__init__ with a freshly created object as the first argument (by convention named self). The __init__ method does not create the object โ€” __new__ does that โ€” but it initialises the instance's attributes. The distinction matters when you customise object creation, which we will cover in the Deep Dive section.

Instance attributes belong to an individual object. They live in obj.__dict__, a plain Python dictionary. Every instance gets its own copy:

class Counter:
    def __init__(self, start: int = 0):
        self.count = start      # instance attribute โ€” stored in self.__dict__

a = Counter(10)
b = Counter(20)
print(a.count)  # 10
print(b.count)  # 20
print(a.__dict__)  # {'count': 10}

Class attributes belong to the class itself, not any instance. All instances share the same value until an instance overrides it locally:

class Counter:
    default_step = 1            # class attribute โ€” stored in Counter.__dict__

    def __init__(self, start: int = 0):
        self.count = start

    def increment(self):
        self.count += self.default_step

c = Counter()
c.increment()
print(c.count)          # 1
Counter.default_step = 5
c.increment()
print(c.count)          # 6  โ€” Counter.default_step changed for everyone
c.default_step = 10     # now c has its own instance-level shadow
c.increment()
print(c.count)          # 16 โ€” uses c's own shadow, not the class attribute

This shadowing behaviour surprises many developers. The rule is simple: Python looks up attributes in the instance __dict__ first, then the class __dict__, then base classes in MRO order. The first match wins.

The self parameter is not a keyword in Python. It is merely a convention โ€” any name works. What matters is that Python automatically passes the calling instance as the first positional argument to every regular method. The name self is the community standard and you should follow it.


โš™๏ธ Inheritance, super(), and How Python Decides Which Method to Call

Python supports both single inheritance (one parent) and multiple inheritance (multiple parents). Multiple inheritance is powerful but introduces ambiguity when two parent classes define the same method. Python resolves this with the Method Resolution Order (MRO), computed using the C3 linearisation algorithm.

The MRO for any class is a deterministic, left-to-right depth-first ordering with the constraint that a parent never appears before all of its children. You can inspect it:

class A:
    def hello(self): return "A"

class B(A):
    def hello(self): return "B"

class C(A):
    def hello(self): return "C"

class D(B, C):
    pass

print(D.mro())
# [<class 'D'>, <class 'B'>, <class 'C'>, <class 'A'>, <class 'object'>]

d = D()
print(d.hello())  # "B"

The MRO tells you: start at D, then check B (found hello here, so stop). If B had not defined hello, Python would check C next, then A, then object.

super() is the idiomatic way to call a method from the next class in the MRO. It does not mean "my parent class" โ€” it means "the next class in the MRO". This distinction matters in multiple inheritance:

class Animal:
    def __init__(self, name: str):
        self.name = name

class Flyable:
    def __init__(self):
        self.can_fly = True

class Bird(Animal, Flyable):
    def __init__(self, name: str):
        super().__init__(name)      # calls Animal.__init__ per MRO
        self.can_fly = True

Python provides three types of methods, and choosing the right one is a common source of confusion:

Method TypeDecoratorFirst argumentWhen to use
Instance method(none)self โ€” the instanceNeeds access to instance state
Class method@classmethodcls โ€” the class itselfFactory constructors, class-level operations
Static method@staticmethod(none)Pure utility with no state dependency
class Temperature:
    def __init__(self, celsius: float):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> "Temperature":
        """Factory: alternative constructor using @classmethod."""
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def absolute_zero_celsius() -> float:
        """Static: no self, no cls needed."""
        return -273.15

    def to_fahrenheit(self) -> float:
        """Instance: reads self.celsius."""
        return self.celsius * 9 / 5 + 32

t = Temperature.from_fahrenheit(212)
print(t.celsius)                        # 100.0
print(Temperature.absolute_zero_celsius())  # -273.15
print(t.to_fahrenheit())               # 212.0

isinstance(obj, SomeClass) checks whether obj is an instance of SomeClass or any subclass. issubclass(Sub, Base) checks the class hierarchy without needing an object. Both are indispensable for writing polymorphic code safely.

The diagram below shows the MRO resolution path for a diamond inheritance scenario, which is the classic case where the C3 algorithm earns its keep.

flowchart TD
    D[D - search starts here] --> B[B - found hello - STOP]
    D --> C[C - would check next if B had no hello]
    B --> A[A - base class]
    C --> A
    A --> OBJ[object - Python root]

In this flowchart, D.hello() resolves to B.hello because B appears before C in D's MRO. The diamond is resolved by visiting each class at most once, in left-to-right depth-first order, guaranteeing a deterministic and consistent lookup without ambiguity.


๐Ÿง  Deep Dive: The Python Object Model and Its Performance Implications

The Internals of Python Object Model

Every Python object has three core attributes: an identity (id(obj) โ€” its memory address), a type (type(obj) โ€” its class), and a value (stored in obj.__dict__ for regular objects). Understanding how Python looks up attributes via __dict__ explains almost every OOP behaviour you encounter.

Attribute lookup chain: When you write obj.attr, Python executes roughly this sequence:

  1. Check type(obj).__mro__ for a data descriptor (an object with both __get__ and __set__). Data descriptors win over instance dictionaries.
  2. Check obj.__dict__ for an instance attribute.
  3. Check type(obj).__mro__ for a non-data descriptor (only __get__) or plain class attribute.
  4. Raise AttributeError if nothing is found.

@property is a data descriptor. It lives in the class's __dict__ as a property object with __get__, __set__, and __delete__ methods. When you write obj.balance, Python finds the property object in the class first (step 1), calls its __get__, which calls your getter function. This is why @property can intercept attribute access even though the caller writes plain obj.balance โ€” no method call syntax required.

You can verify this directly:

class Circle:
    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value <= 0:
            raise ValueError("Radius must be positive")
        self._radius = value

# The property object lives in the class __dict__:
print(type(Circle.__dict__['radius']))  # <class 'property'>
print(Circle.__dict__['radius'].fget)   # <function Circle.radius ...>

__slots__ is Python's mechanism for trading flexibility for memory and speed. Normally, every instance carries a __dict__ โ€” a hash table that can hold any attribute. For classes with many instances and fixed attribute sets, this wastes 200โ€“400 bytes per object. Declaring __slots__ replaces __dict__ with a compact fixed-size array:

class Point:
    __slots__ = ('x', 'y')

    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
# p.__dict__ does not exist โ€” AttributeError
# p.z = 3.0  would raise AttributeError โ€” no new attributes allowed

A Point with __slots__ uses approximately 56 bytes; the same class without __slots__ uses roughly 256 bytes. At one million instances, that is 200 MB saved โ€” meaningful in data-intensive applications.

__class__ is a writable reference to the object's type. Python uses it for isinstance checks and dynamic method dispatch. Changing it is rarely justified outside of migration code, but knowing it exists helps you understand how frameworks do runtime patching (mocking, proxying).

Performance Analysis: slots, Dataclasses, and the hash/eq Contract

__slots__ memory savings are most impactful in tight loops or large collections:

import sys

class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y

pd = PointDict(1, 2)
ps = PointSlots(1, 2)
print(sys.getsizeof(pd) + sys.getsizeof(pd.__dict__))   # ~360 bytes
print(sys.getsizeof(ps))                                  # ~56 bytes

@dataclass vs manual __init__: @dataclass auto-generates __init__, __repr__, and __eq__ based on declared fields. The generated code is equivalent to what you would write by hand but cannot be forgotten or mistyped. For classes with five or more fields, @dataclass is always faster to write and less error-prone than manual __init__.

The __hash__ / __eq__ contract is critical: if a == b, then hash(a) must equal hash(b). Python enforces a corollary: if you define __eq__ without __hash__, Python sets __hash__ = None, making the class unhashable. This prevents bugs where equal objects hash differently and silently produce incorrect set/dict behaviour. To make a class usable as a dictionary key or set member, you must define both, or use @dataclass(frozen=True) which handles both automatically.

from dataclasses import dataclass

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float

# frozen=True generates __hash__ and makes instances immutable
c1 = Coordinate(51.5, -0.12)
c2 = Coordinate(51.5, -0.12)
print(c1 == c2)       # True
print(hash(c1) == hash(c2))   # True
visited = {c1, c2}
print(len(visited))   # 1 โ€” same coordinate, deduplicated

๐Ÿ“Š Visualising the Shape Hierarchy: A Real Domain Class Diagram

The following diagram models a geometry domain with a Shape base class and two concrete subclasses, Circle and Rectangle. It shows inheritance relationships, shared dunder methods, and the contract each class fulfils.

classDiagram
    class Shape {
        +str color
        +__init__(color: str)
        +area() float
        +perimeter() float
        +__str__() str
        +__eq__(other) bool
        +__hash__() int
    }
    class Circle {
        +float radius
        +__init__(color: str, radius: float)
        +area() float
        +perimeter() float
        +__str__() str
    }
    class Rectangle {
        +float width
        +float height
        +__init__(color: str, width: float, height: float)
        +area() float
        +perimeter() float
        +__str__() str
    }
    Shape <|-- Circle : inherits
    Shape <|-- Rectangle : inherits

Shape defines the interface contract: any concrete subclass must implement area() and perimeter(). The __eq__ and __hash__ methods live on Shape so that two shapes with the same colour and dimensions compare as equal and can be stored in sets without duplicates. Circle and Rectangle each override __str__ to produce human-readable output while inheriting the comparison logic from Shape โ€” this is the open/closed principle at work in Python.


๐ŸŒ Python OOP in the Wild: Dataclasses, Context Managers, and Custom Iterators

Beyond basic class definitions, Python's real power comes from implementing specific dunder methods that let your objects integrate seamlessly with built-in language constructs.

@dataclass for clean data containers eliminates boilerplate for pure data classes. Instead of writing __init__, __repr__, and __eq__ manually, you declare fields with type annotations and let the decorator generate them:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Order:
    order_id: str
    customer: str
    items: List[str] = field(default_factory=list)
    total: float = 0.0

    def add_item(self, item: str, price: float) -> None:
        self.items.append(item)
        self.total += price

o = Order("ORD-001", "Alice")
o.add_item("Widget", 9.99)
print(o)
# Order(order_id='ORD-001', customer='Alice', items=['Widget'], total=9.99)

Context managers with __enter__ and __exit__ make your class work with with statements, guaranteeing cleanup even when exceptions occur. This is how Python handles files, database connections, locks โ€” any resource that must be released:

class ManagedConnection:
    def __init__(self, host: str):
        self.host = host
        self.conn = None

    def __enter__(self):
        print(f"Connecting to {self.host}")
        self.conn = {"host": self.host, "active": True}  # simulate connection
        return self.conn

    def __exit__(self, exc_type, exc_val, exc_tb):
        print(f"Closing connection to {self.host}")
        self.conn["active"] = False
        return False   # do not suppress exceptions

with ManagedConnection("db.prod.example.com") as conn:
    print(f"Using: {conn}")
# Prints: Connecting, Using, Closing โ€” even if an exception is raised inside the block

Custom iterators with __iter__ and __next__ let your objects work with for loops, list(), sum(), and any other construct that consumes iterables:

class Countdown:
    def __init__(self, start: int):
        self.current = start

    def __iter__(self):
        return self       # the iterator is the object itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

for n in Countdown(5):
    print(n, end=" ")   # 5 4 3 2 1

__repr__ vs __str__ serve different audiences. __str__ is for end users โ€” it should be human-readable and can omit internal detail. __repr__ is for developers and debugging โ€” it should be unambiguous, ideally eval()-able to reconstruct the object. When only one is defined, Python falls back to __repr__ for both. The convention: always define __repr__; only add __str__ when you need a different display for users.

class Vector:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"Vector({self.x!r}, {self.y!r})"   # developer-facing

    def __str__(self) -> str:
        return f"({self.x}, {self.y})"              # user-facing

v = Vector(3.0, 4.0)
print(repr(v))   # Vector(3.0, 4.0)
print(str(v))    # (3.0, 4.0)

โš–๏ธ Choosing the Right Python Data Container: dataclass vs namedtuple vs TypedDict vs dict

Python offers four main ways to represent structured data, and the right choice depends on mutability, type safety, performance, and serialization needs:

ConstructMutableType-checkedHashable by defaultBest for
dictYesNoNoAd-hoc data, JSON payloads
TypedDictYesIDE only (no runtime)NoTyped dicts for APIs, JSON
namedtupleNoNoYesLightweight immutable records
@dataclassYes (default)IDE + __annotations__No (yes if frozen=True)Domain models, config objects
@dataclass(frozen=True)NoIDE + __annotations__YesImmutable value objects, dict keys

When to use @dataclass(frozen=True): whenever the object represents a value (a point, a currency amount, a date range) rather than an entity. Values are equal if their fields are equal; entities are equal only if their identities match. Freezing also enables hashing, making the object usable in sets and as dict keys.

ABC vs Protocol for interfaces: ABC (Abstract Base Class) is nominal โ€” you must explicitly inherit from it. Protocol (from typing) is structural โ€” any class with the required methods satisfies the protocol without declaring it. Use ABC when you control the class hierarchy and want enforced overrides. Use Protocol for duck-typed checks, third-party classes, and type annotations in libraries.

from abc import ABC, abstractmethod
from typing import Protocol

# ABC: explicit, enforced at instantiation time
class Drawable(ABC):
    @abstractmethod
    def draw(self) -> None: ...

# Protocol: structural, checked only by type checkers (mypy, pyright)
class Printable(Protocol):
    def print_summary(self) -> str: ...

๐Ÿงญ Python OOP Decision Guide: Matching Your Need to the Right Construct

I need to...Use this
Store data with type hints and auto-generated __init__ / __repr__@dataclass
Create an immutable, hashable value object (usable as dict key)@dataclass(frozen=True)
Return multiple values from a function without a full classnamedtuple or tuple
Type-annotate a dict payload (e.g. JSON body)TypedDict
Validate data at construction time with rich error messagespydantic.BaseModel
Enforce a method contract on subclassesABC with @abstractmethod
Apply a structural interface check for duck typingtyping.Protocol
Reduce memory usage for millions of instances__slots__
Add custom iteration to an object__iter__ + __next__
Integrate with with blocks for resource cleanup__enter__ + __exit__
Control how an object displays in print / logs__str__ and __repr__
Make objects comparable with == and sortable with <__eq__, __lt__, __le__
Make objects usable in sets or as dict keys__hash__ (+ __eq__)

๐Ÿงช Building a Payment Domain in Python: A Complete Runnable Example

This section demonstrates a realistic payment processing domain model. It combines inheritance, abstract base classes, dataclasses, dunder methods, and properties into a single runnable Python module. The goal is to show how these concepts compose in production-shaped code โ€” not toy examples. Pay attention to how @abstractmethod enforces the process() contract, how __repr__ makes debugging easy, and how __eq__ and __hash__ let payments be deduplicated in a set.

from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# โ”€โ”€ Value Object โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

@dataclass(frozen=True)
class Money:
    """Immutable value object representing an amount in a given currency."""
    amount: float
    currency: str = "USD"

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError(f"Amount cannot be negative: {self.amount}")

    def __add__(self, other: Money) -> Money:
        if self.currency != other.currency:
            raise ValueError("Cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)

    def __str__(self) -> str:
        return f"{self.currency} {self.amount:.2f}"

# โ”€โ”€ Abstract Base โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

class Payment(ABC):
    """Base class for all payment types. Enforces process() contract."""

    def __init__(self, payment_id: str, amount: Money):
        self._payment_id = payment_id
        self._amount = amount
        self._processed_at: datetime | None = None

    @property
    def payment_id(self) -> str:
        return self._payment_id

    @property
    def amount(self) -> Money:
        return self._amount

    @property
    def is_processed(self) -> bool:
        return self._processed_at is not None

    @abstractmethod
    def process(self) -> bool:
        """Execute the payment. Returns True on success."""
        ...

    @abstractmethod
    def _provider_name(self) -> str:
        """Human-readable provider name for repr/logging."""
        ...

    def __repr__(self) -> str:
        status = "processed" if self.is_processed else "pending"
        return (
            f"{self.__class__.__name__}("
            f"id={self._payment_id!r}, "
            f"amount={self._amount}, "
            f"status={status!r})"
        )

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Payment):
            return NotImplemented
        return self._payment_id == other._payment_id

    def __hash__(self) -> int:
        return hash(self._payment_id)

# โ”€โ”€ Concrete Implementations โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

class CreditCardPayment(Payment):
    """Payment via credit card with last-four-digits masking."""

    def __init__(self, payment_id: str, amount: Money, card_number: str):
        super().__init__(payment_id, amount)
        if len(card_number) < 4:
            raise ValueError("Invalid card number")
        self._last_four = card_number[-4:]

    @property
    def masked_card(self) -> str:
        return f"****-****-****-{self._last_four}"

    def process(self) -> bool:
        print(f"Charging {self._amount} to card {self.masked_card}")
        self._processed_at = datetime.now()
        return True

    def _provider_name(self) -> str:
        return "CreditCard"

class PayPalPayment(Payment):
    """Payment routed through PayPal with email association."""

    def __init__(self, payment_id: str, amount: Money, paypal_email: str):
        super().__init__(payment_id, amount)
        self._email = paypal_email

    def process(self) -> bool:
        print(f"Sending {self._amount} via PayPal to {self._email}")
        self._processed_at = datetime.now()
        return True

    def _provider_name(self) -> str:
        return "PayPal"

# โ”€โ”€ Payment Processor (uses duck typing, not isinstance) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

@dataclass
class PaymentProcessor:
    """Processes a batch of payments. Works with any Payment subclass."""
    payments: List[Payment] = field(default_factory=list)

    def add(self, payment: Payment) -> None:
        self.payments.append(payment)

    def process_all(self) -> None:
        for payment in self.payments:
            if not payment.is_processed:
                success = payment.process()
                if success:
                    print(f"  OK: {payment}")
                else:
                    print(f"  FAILED: {payment}")

    def total(self) -> float:
        return sum(p.amount.amount for p in self.payments if p.is_processed)

# โ”€โ”€ Demo โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

if __name__ == "__main__":
    processor = PaymentProcessor()

    processor.add(CreditCardPayment(
        "PAY-001", Money(49.99), "4111111111111234"
    ))
    processor.add(PayPalPayment(
        "PAY-002", Money(19.99), "alice@example.com"
    ))
    processor.add(CreditCardPayment(
        "PAY-003", Money(99.00), "5500005555555559"
    ))

    processor.process_all()
    print(f"\nTotal processed: USD {processor.total():.2f}")

    # Deduplication via __eq__ and __hash__
    p1 = CreditCardPayment("PAY-001", Money(49.99), "4111111111111234")
    p2 = CreditCardPayment("PAY-001", Money(49.99), "4111111111111234")
    unique = {p1, p2}
    print(f"\nDuplicate check โ€” unique payments in set: {len(unique)}")  # 1

Running this produces:

Charging USD 49.99 to card ****-****-****-1234
  OK: CreditCardPayment(id='PAY-001', amount=USD 49.99, status='processed')
Sending USD 19.99 via PayPal to alice@example.com
  OK: PayPalPayment(id='PAY-002', amount=USD 19.99, status='processed')
Charging USD 99.00 to card ****-****-****-5559
  OK: CreditCardPayment(id='PAY-003', amount=USD 99.00, status='processed')

Total processed: USD 168.98

Duplicate check โ€” unique payments in set: 1

Every element of this example earns its place: @dataclass(frozen=True) on Money ensures currency arithmetic is safe and value-equal amounts hash identically. @abstractmethod on process() guarantees a TypeError at instantiation time if a subclass forgets to implement it. __eq__ and __hash__ on Payment use payment_id as the natural key, enabling set-based deduplication without any additional bookkeeping.


๐Ÿ› ๏ธ Production-Grade Python Objects: pydantic, attrs, and stdlib dataclasses

Three major libraries build on Python's object model to make data validation, immutability, and schema generation faster and more reliable than hand-rolled classes.

stdlib dataclasses (Python 3.7+) is the zero-dependency baseline. It auto-generates __init__, __repr__, __eq__, and optionally __hash__ via frozen=True. Use it for internal data models where you control all inputs. Limitation: no runtime validation โ€” setting amount = -5 on a @dataclass succeeds silently unless you add a __post_init__ check.

from dataclasses import dataclass

@dataclass
class Config:
    host: str
    port: int = 8080
    debug: bool = False

pydantic (v2) is the validation powerhouse. It enforces types at construction time, coerces compatible values (e.g. "42" โ†’ 42 for an int field), and generates JSON Schema for free. Stripe, FastAPI, and dozens of production Python services use Pydantic models as their API request/response layer.

from pydantic import BaseModel, field_validator

class PaymentRequest(BaseModel):
    payment_id: str
    amount: float
    currency: str = "USD"

    @field_validator("amount")
    @classmethod
    def amount_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            raise ValueError("Amount must be positive")
        return v

req = PaymentRequest(payment_id="PAY-001", amount=49.99)
print(req.model_dump())   # {'payment_id': 'PAY-001', 'amount': 49.99, 'currency': 'USD'}

attrs predates dataclasses and remains popular in performance-sensitive and library code. It supports validators, converters, and __slots__ generation via a simple decorator API. The cattrs companion library provides serialisation and deserialisation:

import attrs

@attrs.define
class Endpoint:
    host: str = attrs.field(validator=attrs.validators.instance_of(str))
    port: int = attrs.field(default=443)
    tls: bool = attrs.field(default=True)

e = Endpoint(host="api.example.com")
print(e)   # Endpoint(host='api.example.com', port=443, tls=True)

For a deeper look at Pydantic in the context of FastAPI request validation and schema generation, see the planned Python web frameworks follow-up post in this series.


๐Ÿ“š Lessons Learned from Writing Pythonic Classes

Start with @dataclass, upgrade when you feel friction. Plain @dataclass handles 80% of use cases. Add frozen=True when you need hashability or immutability. Graduate to pydantic when you need runtime validation of external inputs (API payloads, config files). Add __slots__ only when profiling confirms memory is the bottleneck.

Do not port Java getters and setters. Direct attribute access is idiomatic Python. Use @property only when you genuinely need to intercept access for validation, lazy computation, or deprecation warnings โ€” not as a default encapsulation strategy.

__repr__ is non-negotiable. Every class that you will ever inspect in a debugger or REPL should have __repr__. The three minutes it takes to write it saves hours of print(obj.__dict__) archaeology. @dataclass and Pydantic generate it for free.

Multiple inheritance requires conscious design. Python makes multiple inheritance syntactically trivial but semantically complex. If you are building a library, prefer composition and Protocol over deep inheritance trees. If you must use multiple inheritance, inspect .mro() and document the intended resolution order explicitly.

The __hash__/__eq__ contract is a trap for the unwary. Defining __eq__ without __hash__ makes your class unhashable (Python sets __hash__ = None). Always define both together, or use @dataclass(frozen=True) to handle the contract automatically.

__slots__ is a last resort, not a best practice. It complicates inheritance (each class in the hierarchy must declare its own __slots__) and prevents __dict__, which breaks some introspection tools. Use it only after profiling confirms memory savings are worth the complexity cost.


๐Ÿ“Œ Python OOP: What Actually Matters

TLDR: Python OOP is attribute-lookup chains, not access modifiers. Use @property instead of getters/setters, @dataclass instead of manual __init__, and dunder methods to make your objects integrate with Python's built-in constructs. For validation, reach for pydantic. For memory-critical inner loops, reach for __slots__. For everything else, plain @dataclass with frozen=True where immutability matters is the Pythonic default.

ConceptPython idiom
Encapsulation@property with _underscore convention
InterfaceABC (enforced) or Protocol (structural)
Data class@dataclass or pydantic.BaseModel
Immutable value@dataclass(frozen=True)
Memory optimisation__slots__
Iterator protocol__iter__ + __next__
Resource management__enter__ + __exit__
Equality and hashing__eq__ + __hash__ (always together)

๐Ÿ“ Python OOP Practice Quiz

Test your understanding of the concepts covered in this post.

  1. A Python class has both A and B as parents (in that order) via class C(A, B). Both A and B define a method greet(). Which version does C().greet() call, and why?

    Correct Answer: A.greet() โ€” because the MRO places A before B in the resolution order. Python's C3 linearisation visits parents left-to-right, and the first match wins.

  2. What is the difference between defining __str__ and __repr__ on a class? Which is called when you place an object inside a list and print() the list?

    Correct Answer: __str__ is user-facing and human-readable; __repr__ is developer-facing and should be unambiguous (ideally eval()-able). When printing a list, Python calls __repr__ on each element, not __str__.

  3. If you define __eq__ on a class, what happens to __hash__ by default? How do you fix it if you need instances to be usable as dictionary keys?

    Correct Answer: Python sets __hash__ = None, making the class unhashable. To restore hashability, either explicitly define __hash__ alongside __eq__, or use @dataclass(eq=True, frozen=True) which generates a consistent __hash__ automatically.

  4. What is the output of the following code, and why?

    class Animal:
        legs = 4
    
    dog = Animal()
    dog.legs = 2
    print(Animal.legs)
    print(dog.legs)
    

    Correct Answer: Animal.legs prints 4; dog.legs prints 2. Assigning dog.legs = 2 creates an instance attribute that shadows the class attribute for that specific instance. The class attribute itself is unchanged.

  5. When would you choose typing.Protocol over abc.ABC as the mechanism for defining an interface in Python?

    Correct Answer: Use Protocol when you want structural (duck-typed) type checking without requiring classes to explicitly inherit from the interface โ€” especially useful for third-party classes or when you cannot modify the class hierarchy. Use ABC when you want enforced inheritance and instantiation-time errors if abstract methods are not implemented.

  6. Open-ended challenge: Take the Payment domain model from the Practical section. Add a BankTransferPayment class that introduces a routing_number attribute with @property validation (routing numbers must be exactly 9 digits). Implement the process() method to simulate a bank transfer and verify the __hash__/__eq__ behaviour holds for two BankTransferPayment objects with the same payment_id.


Abstract Algorithms

Written by

Abstract Algorithms

@abstractalgorithms