Multiprocessing, Multithreading and Asyncio in Python Part 1 - Basic Concept

Python’s performance bottlenecks were criticized for years,
but thanks to the hard work of its developers,
Asyncio was introduced in Python 3.4 to improve performance in specific scenarios.
With Python 3.13, the free-threaded build (PEP 703) arrived,
allowing the GIL to be optionally disabled.
Together with the long-standing Multiprocessing and Multithreading,
I have compiled some notes on the principles, differences, and use cases of these three technologies.
This first post briefly introduces the basic concepts of each and the scenarios it suits.

Multiprocessing

A single program can create multiple processes that execute in parallel.
Each process has its own independent memory space and can therefore completely bypass the limitations of Python’s GIL (Global Interpreter Lock).
This means that regardless of which version of Python you are using, you can run multiple processes in parallel on a multi-core CPU, independently and without interference.

Use Case:
CPU-bound tasks, such as extensive mathematical calculations, data processing, image recognition, etc.
It can effectively utilize the computing power of multi-core CPUs.

Additionally, because each process has an isolated memory space,
a crash in a single process won’t affect other running processes or the main program.

import multiprocessing
import time

def cpu_bound_task(n):
    count = 0
    for i in range(n):
        count += i
    print(f"Finished task with {n}")

if __name__ == '__main__':
    start_time = time.time()
    processes = []
    for i in range(4):
        p = multiprocessing.Process(target=cpu_bound_task, args=(10**7,))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    end_time = time.time()
    print(f"Multiprocessing took {end_time - start_time:.2f} seconds.")
  • Pros:
    • Can achieve true parallelism using multi-core CPUs.
    • Not limited by the GIL.
    • Independent memory space between processes leads to high stability and less likelihood of Race Conditions.
  • Cons:
    • Creating independent processes requires more resources (CPU, memory).
    • Inter-process communication (IPC) is more complex, requiring mechanisms like Queue, Pipe, or shared memory, which results in higher latency.
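The Queue mechanism mentioned in the cons can be sketched in a few lines; this is a minimal, illustrative example (the worker payload is arbitrary):

```python
import multiprocessing

def worker(q):
    # Compute in the child process and send the result
    # back to the parent through the queue.
    q.put(sum(range(1000)))

def run_ipc_demo():
    # The "fork" start method keeps this demo simple on POSIX systems;
    # on Windows you would use the default "spawn" instead.
    ctx = multiprocessing.get_context("fork")
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q,))
    p.start()
    result = q.get()  # blocks until the child puts a value
    p.join()
    return result

if __name__ == '__main__':
    print(run_ipc_demo())  # 499500
```

Even this tiny round trip shows the extra moving parts (queue creation, blocking reads, join) that shared-memory threads don’t need.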

Multithreading

Multiple threads are created within a single process.
These threads share the same memory space (Heap) used by the process, allowing for easy data sharing and exchange.

In versions before Python 3.13, unlike languages such as C/C++,
Python’s multithreading was limited by the GIL (Global Interpreter Lock).
Even when running on a multi-core CPU,
Python threads could not achieve true parallel computation.

GIL (Global Interpreter Lock): The GIL is a mechanism in CPython (the reference Python implementation) that protects the interpreter’s internal state (such as reference counts) and Python objects (like dicts, lists, etc.) from concurrent corruption.
This mechanism ensures that only one thread can execute Python bytecode at any given time.
This means that for CPU-bound tasks, even on a multi-core CPU,
Python’s multithreading can only execute on a single core at a time,
failing to achieve true parallelism for speedup.

Historically, under the GIL, when a thread encountered an I/O operation (like reading/writing a file or a network request), it would release the GIL,
allowing other threads a chance to run.
Therefore, Multithreading was traditionally used for handling I/O-bound tasks.

As Python’s user base grew, demands like PEP-703 emerged.
Starting from Python 3.13, an experimental feature to optionally disable the GIL (free-threaded mode) was included,
and Python 3.14 promoted the free-threaded build to an officially supported feature.
In these versions, Python’s Multithreading can finally break through the GIL and handle CPU-bound tasks in parallel across multiple cores, avoiding the IPC overhead of Multiprocessing.
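To check which mode an interpreter is actually running in, CPython 3.13 added a (private) query function; a small sketch with a fallback for older versions, where the GIL is unconditionally enabled:

```python
import sys

def gil_enabled() -> bool:
    # sys._is_gil_enabled() was added in CPython 3.13; on older
    # versions the GIL is always on, so we fall back to True.
    check = getattr(sys, "_is_gil_enabled", None)
    return check() if check is not None else True

print(f"Running with GIL: {gil_enabled()}")
```

On a free-threaded build started without the GIL, this prints `Running with GIL: False`.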

Use Case:

  • Before Python 3.13: I/O-bound tasks, such as web scraping, file downloads, API requests, etc.
  • Python 3.13+ (with Free-threaded mode): All types of tasks, including CPU-bound ones.
import threading
import requests
import time

def io_bound_task(url):
    try:
        response = requests.get(url)
        print(f"Downloaded {url} with status {response.status_code}")
    except Exception as e:
        print(f"Error downloading {url}: {e}")

if __name__ == '__main__':
    urls = ["https://www.google.com"] * 5
    start_time = time.time()
    threads = []
    for url in urls:
        t = threading.Thread(target=io_bound_task, args=(url,))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()

    end_time = time.time()
    print(f"Multithreading took {end_time - start_time:.2f} seconds.")
  • Pros:
    • The overhead of creating a thread is smaller than that of a process.
    • Shared memory makes data exchange convenient.
  • Cons:
    • Versions before Python 3.13 are limited by the GIL and cannot utilize multi-core CPUs for CPU-bound tasks.
    • Requires handling thread synchronization issues, such as using Lock to avoid Race conditions.
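The Lock usage mentioned in the cons can be sketched as follows; `counter += 1` is a read-modify-write, so without the lock concurrent threads can lose updates:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:  # serialize the read-modify-write on counter
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 every time; without the lock it can come up short
```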

Asyncio (Asynchronous I/O) and Coroutines

Asyncio is a standard library module introduced in Python 3.4.
Conceptually, it uses an Event Loop and Coroutines to achieve concurrency on a single thread.

A Coroutine can be seen as an extremely lightweight, program-managed unit of execution.
It pauses at points where it must wait for I/O (await),
returning control to the event loop so other coroutines can run.
When the condition it is waiting on is met (e.g., the awaited I/O operation completes),
the event loop resumes that coroutine.

Besides letting other coroutines run during the pause,
this also avoids OS-level thread context switches,
which can significantly improve performance under heavy concurrency.
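A minimal sketch of this pause-and-resume behavior, using asyncio.sleep to stand in for real I/O (the names are illustrative):

```python
import asyncio
import time

async def worker(name, delay):
    await asyncio.sleep(delay)  # pause here; control returns to the event loop
    return name

async def main():
    start = time.perf_counter()
    # Both coroutines wait concurrently, so total time is ~0.2s, not 0.4s.
    results = await asyncio.gather(worker("a", 0.2), worker("b", 0.2))
    print(results, f"{time.perf_counter() - start:.2f}s")
    return results

results = asyncio.run(main())
```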

Use Case:
Highly concurrent I/O-bound tasks,
especially scenarios that require handling a large number of network connections simultaneously (like Web servers, chat applications, or massive API requests).

import asyncio
import aiohttp
import time

async def async_io_bound_task(session, url):
    try:
        async with session.get(url) as response:
            print(f"Downloaded {url} with status {response.status}")
    except Exception as e:
        print(f"Error downloading {url}: {e}")

async def main():
    urls = ["https://www.google.com"] * 5
    start_time = time.time()
    async with aiohttp.ClientSession() as session:
        tasks = [async_io_bound_task(session, url) for url in urls]
        await asyncio.gather(*tasks)
    end_time = time.time()
    print(f"Asyncio took {end_time - start_time:.2f} seconds.")

if __name__ == '__main__':
    asyncio.run(main())
  • Pros:
    • Extremely low context switching overhead, capable of handling a large number of I/O operations with high efficiency.
    • Operates on a single thread, so there are no OS-level race condition issues (though application-layer race conditions are still possible if not careful).
  • Cons:
    • Not suitable for CPU-bound tasks; a single CPU-bound task can block the entire Event Loop.
    • Requires using async/await syntax and corresponding asynchronous library support (like aiohttp, asyncpg).
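When a CPU-bound step is unavoidable inside an asyncio program, a common workaround is to offload it to a process pool via run_in_executor so the event loop stays responsive; a sketch:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # A plain (non-async) CPU-bound function.
    return sum(range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # The event loop remains free to run other coroutines
        # while the pool does the heavy work.
        result = await loop.run_in_executor(pool, cpu_heavy, 10**6)
    print(result)
    return result

if __name__ == "__main__":
    asyncio.run(main())
```

This combines the strengths of the two models: asyncio for the I/O-bound parts, a process pool for the CPU-bound parts.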

Comparison Summary

| Feature | Multiprocessing | Multithreading | Asyncio |
|---|---|---|---|
| Basic Unit | Process | Thread | Coroutine |
| Memory Space | Independent | Shared | Shared (single-threaded) |
| GIL Impact | None (bypassed) | Restricted in old versions; avoidable in 3.13+ | None (single-threaded) |
| Parallelism/Concurrency | Parallelism | Old: concurrency; 3.13+: parallelism | Concurrency |
| Use Case | CPU-bound, high fault tolerance/isolation | General I/O-bound | Massive, highly concurrent I/O-bound |
| Pros | Multi-core utilization, high stability | Shared memory, low overhead | Extremely high I/O throughput, low overhead |
| Cons | High resource overhead, complex IPC | Pre-3.13 limited by GIL; race-condition/Lock complexity | Not for CPU-bound tasks (blocks the Event Loop) |
  • If your task is CPU-bound, requiring heavy CPU computation, then multiprocessing can handle it on any Python version by fully utilizing multi-core CPUs. multithreading on Python 3.13+ is also a viable option, avoiding the per-process overhead and IPC complexity of multiprocessing.
  • If your task is I/O-bound with relatively simple logic and a moderate number of connections, multithreading is a lightweight and straightforward choice.
  • If your task is I/O-bound and requires handling a large number of concurrent connections (e.g., developing a Web server, API, or microservice), then asyncio provides the highest performance and throughput.