Tips and Tricks #72: Parallelize CPU-Bound Work with ProcessPoolExecutor

Bypass the GIL and utilize all CPU cores for compute-intensive tasks.

Code Snippet

from concurrent.futures import ProcessPoolExecutor
import multiprocessing

def cpu_intensive_task(data):
    """Heavy computation that benefits from parallelization."""
    return sum(x * x for x in range(data))

def process_batch(items):
    # One worker process per available core
    workers = multiprocessing.cpu_count()

    with ProcessPoolExecutor(max_workers=workers) as executor:
        # map() returns results in the same order as items
        results = list(executor.map(cpu_intensive_task, items))

    return results

# The guard is required: with the "spawn" start method (the default on
# Windows and macOS) each worker re-imports this module.
if __name__ == "__main__":
    # Process 1000 items across all cores
    items = list(range(1000, 2000))
    results = process_batch(items)

Why This Helps

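  • True parallelism: each worker is a separate process with its own interpreter and its own GIL
  • Near-linear speedup for CPU-bound workloads, provided each task is large enough to outweigh process startup and pickling overhead
  • Simple API that mirrors ThreadPoolExecutor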
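
To illustrate the API similarity, here is a minimal sketch in which the same helper runs a task on either a thread pool or a process pool. The names square_sum and run_with are hypothetical and not part of the original snippet, and the worker count of 4 is an arbitrary choice.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def square_sum(n):
    """CPU-bound stand-in task (hypothetical, for illustration only)."""
    return sum(x * x for x in range(n))

def run_with(executor_cls, fn, items, max_workers=4):
    """Run fn over items using whichever executor class is passed in."""
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(fn, items))

if __name__ == "__main__":
    items = [10_000, 20_000, 30_000]
    # Same call shape; only the executor class changes.
    threaded = run_with(ThreadPoolExecutor, square_sum, items)
    multiproc = run_with(ProcessPoolExecutor, square_sum, items)
    print(threaded == multiproc)
```

Both calls return the same results in the same order. For a CPU-bound task like this one, only the process pool can run the work on several cores at once.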

How to Test

  • Compare wall-clock time against a sequential run over the same items
  • Monitor CPU utilization across cores (for example with top, htop, or Task Manager) while the batch runs
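
One way to run the wall-clock comparison is sketched below. The helper names (run_sequential, run_parallel, timed) and the workload size are assumptions chosen for illustration; actual timings depend on the machine and on how heavy each task is.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def cpu_intensive_task(n):
    """Same kind of task as in the tip: sum of squares below n."""
    return sum(x * x for x in range(n))

def run_sequential(items):
    """Baseline: process every item in the current process."""
    return [cpu_intensive_task(n) for n in items]

def run_parallel(items):
    """Process items in a pool; max_workers defaults to the CPU count."""
    with ProcessPoolExecutor() as executor:
        return list(executor.map(cpu_intensive_task, items))

def timed(fn, items):
    """Return (elapsed_seconds, result) for a single call of fn(items)."""
    start = time.perf_counter()
    result = fn(items)
    return time.perf_counter() - start, result

if __name__ == "__main__":
    items = [200_000] * 32  # arbitrary workload, adjust to taste
    seq_time, seq_result = timed(run_sequential, items)
    par_time, par_result = timed(run_parallel, items)
    # Both paths must produce identical results before timings mean anything.
    assert seq_result == par_result
    print(f"sequential: {seq_time:.2f}s  parallel: {par_time:.2f}s")
```

If the parallel run is not faster, the individual tasks are probably too small relative to the cost of starting workers and moving data between processes.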

When to Use

Data transformation, image processing, numerical computations, batch processing.

Performance/Security Notes

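Arguments and return values are pickled to cross process boundaries, so keep payloads small and pass only picklable objects (module-level functions, not lambdas). For I/O-bound tasks, threads or asyncio are usually a better fit, since the extra processes add overhead without speeding up the waiting.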
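
To get a rough feel for payload cost, the sketch below measures how many bytes an object occupies once pickled. pickled_size is a hypothetical helper, and plain pickle.dumps is used here only as an approximation of what the pool sends between processes.

```python
import pickle

def pickled_size(obj):
    """Approximate number of bytes needed to send obj to a worker."""
    return len(pickle.dumps(obj))

if __name__ == "__main__":
    # Sending the full list costs far more than sending its bounds
    # and rebuilding the range inside the worker.
    big_payload = list(range(1_000_000))
    small_payload = (0, 1_000_000)
    print("list of ints:", pickled_size(big_payload), "bytes")
    print("range bounds:", pickled_size(small_payload), "bytes")
```

When a task only needs a description of its input (bounds, a file path, an index), passing that description instead of the data itself keeps the pickling overhead low.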

Try this tip in your next project and share your results in the comments!

