Tips and Tricks #136: Parallelize CPU-Bound Work with ProcessPoolExecutor

Bypass the GIL and utilize all CPU cores for compute-intensive tasks.

Code Snippet

from concurrent.futures import ProcessPoolExecutor
import multiprocessing

def cpu_intensive_task(data):
    """Heavy computation that benefits from parallelization."""
    return sum(x * x for x in range(data))

def process_batch(items):
    # Use all available cores
    workers = multiprocessing.cpu_count()
    
    with ProcessPoolExecutor(max_workers=workers) as executor:
        results = list(executor.map(cpu_intensive_task, items))
    
    return results

# Guard the entry point: with the default "spawn" start method on
# Windows and macOS, child processes re-import this module, so the
# pool must not be created at import time.
if __name__ == "__main__":
    # Process 1000 items across all cores
    items = list(range(1000, 2000))
    results = process_batch(items)

Why This Helps

  • True parallelism by bypassing Python’s GIL
  • Near-linear speedup for CPU-bound workloads, within the limits of core count and serialization overhead
  • Simple, high-level API that mirrors ThreadPoolExecutor (see the swap sketch below)
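Because ProcessPoolExecutor and ThreadPoolExecutor implement the same Executor interface, switching between them is a one-line change, which also makes it easy to measure what the GIL costs you. A minimal sketch of the swap (the run_with helper and the worker count of 4 are illustrative choices, not part of the snippet above):

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_intensive_task(data):
    """Same worker as in the main snippet above."""
    return sum(x * x for x in range(data))

def run_with(executor_cls, items):
    """Run the worker over items with either executor class."""
    with executor_cls(max_workers=4) as executor:
        return list(executor.map(cpu_intensive_task, items))

if __name__ == "__main__":
    items = list(range(1000, 2000))
    # Same call, different executor: threads share the GIL, processes do not.
    thread_results = run_with(ThreadPoolExecutor, items)
    process_results = run_with(ProcessPoolExecutor, items)
    assert thread_results == process_results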

How to Test

  • Compare wall-clock time against a sequential run (see the timing sketch below)
  • Monitor CPU utilization across cores
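A simple harness for the wall-clock comparison might look like the following; it redefines the worker so it runs standalone, and the item range matches the snippet above. Keep in mind that when per-item work is very small, process startup and pickling overhead can swamp the gain, so you may need heavier inputs to see a clear speedup:

import time
from concurrent.futures import ProcessPoolExecutor

def cpu_intensive_task(data):
    return sum(x * x for x in range(data))

if __name__ == "__main__":
    items = list(range(1000, 2000))

    # Sequential baseline
    start = time.perf_counter()
    sequential = [cpu_intensive_task(n) for n in items]
    seq_time = time.perf_counter() - start

    # Parallel run across all cores
    start = time.perf_counter()
    with ProcessPoolExecutor() as executor:
        parallel = list(executor.map(cpu_intensive_task, items))
    par_time = time.perf_counter() - start

    assert sequential == parallel
    print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s, "
          f"speedup: {seq_time / par_time:.1f}x")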

When to Use

Data transformation, image processing, numerical computations, and batch processing.
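For batch jobs where you want to handle each result as soon as it finishes rather than in input order, executor.submit combined with as_completed is the usual pattern. A minimal sketch, with transform standing in as a placeholder for whatever per-record work you actually do:

from concurrent.futures import ProcessPoolExecutor, as_completed

def transform(record):
    """Placeholder for a CPU-heavy transformation of one record."""
    return record * record

if __name__ == "__main__":
    records = range(10)
    with ProcessPoolExecutor() as executor:
        # submit() returns one Future per item; as_completed() yields
        # futures in completion order, not submission order.
        futures = {executor.submit(transform, r): r for r in records}
        for future in as_completed(futures):
            print(f"record {futures[future]} -> {future.result()}")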

Performance/Security Notes

Arguments and results are pickled when they cross the process boundary, so keep payloads small and make sure the worker function and its data are picklable (top-level functions only). Wrap the entry point in an if __name__ == "__main__": guard so worker processes can re-import the module safely. Not suitable for I/O-bound work; use ThreadPoolExecutor or asyncio for that instead.
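When the iterable is large and each item is cheap, per-item pickling round trips can dominate. One mitigation is the chunksize argument of executor.map (available since Python 3.5), which ships items to workers in batches; the square worker and the chunk size of 1000 below are illustrative choices:

from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

if __name__ == "__main__":
    numbers = range(100_000)
    with ProcessPoolExecutor() as executor:
        # Items are sent to workers in chunks of 1000, so the pool makes
        # roughly 100 pickling round trips instead of 100,000.
        results = list(executor.map(square, numbers, chunksize=1000))
    print(results[:5])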

Try this tip in your next project and share your results in the comments!

