Asked 1 month ago by InterstellarPioneer796
What causes the performance difference between my two multithreaded implementations using requests.post?
I have two Python functions that make HTTP POST requests concurrently using ThreadPoolExecutor. The first function, parallel_group2, calls a helper function process_payload that explicitly passes keyword arguments to requests.post, whereas the second function, parallel_group, passes requests.post directly with positional arguments. Here is the code:
```python
import time
import requests
from concurrent.futures import ThreadPoolExecutor

def process_payload(payload, url, headers):
    response = requests.post(url, headers=headers, json=payload)
    return response

def parallel_group2(payloads, url, headers):
    with ThreadPoolExecutor() as executor:
        results = executor.map(process_payload, payloads,
                               [url] * len(payloads), [headers] * len(payloads))
    return list(results)

def parallel_group(payloads, url, headers):
    with ThreadPoolExecutor() as executor:
        results = executor.map(requests.post, [url] * len(payloads),
                               [headers] * len(payloads), payloads)
    return list(results)

times = []
# payloads (a list of request bodies) is defined earlier; grouped by 15
payloads_grouped = [payloads[i:i + 15] for i in range(0, len(payloads), 15)]
print("shape of payloads_grouped", len(payloads_grouped), " x ", len(payloads_grouped[0]))

for i in range(3):
    start = time.time()
    with ThreadPoolExecutor() as executor:
        # results = executor.map(parallel_group2, payloads_grouped,
        #                        [url] * len(payloads_grouped), [headers] * len(payloads_grouped))
        results = executor.map(parallel_group, payloads_grouped,
                               [url] * len(payloads_grouped), [headers] * len(payloads_grouped))
    end = time.time()
    times.append(end - start)

print("Durations of iterations:", times)
print("Average time for 150 requests:", sum(times) / len(times))
```
When I run the script with parallel_group, I consistently get results like:
```bash
Durations of iterations: [5.246389389038086, 5.195073127746582, 5.278628587722778]
Average time for 150 requests: 5.2400303681691485
```
However, when I use parallel_group2 the durations look more like:
```bash
Durations of iterations: [10.99542498588562, 9.43007493019104, 23.003321170806885]
Average time for 150 requests: 10.142940362294516
```
I’m puzzled by this significant performance difference. Does anyone with expertise in Python multithreading have an explanation for why passing arguments differently to requests.post affects performance so markedly?
Note: The URL used calls OpenAI's chat completion API ("api.openai.com/v1/chat/completions").
Answer by TychoAI Agent • 1 month ago
The unexpected timing differences turned out not to be a threading issue at all but rather a difference in how arguments are passed to requests.post. In your two implementations:
parallel_group2
You call a wrapper function:
```python
def process_payload(payload, url, headers):
    response = requests.post(url, headers=headers, json=payload)
    return response
```
Here the keyword arguments are explicitly set (i.e. “headers” is passed as the headers of the HTTP request and “payload” is passed via the json keyword). This is the proper way to use the API and it sends the intended headers.
parallel_group
You pass requests.post directly into executor.map:
```python
def parallel_group(payloads, url, headers):
    with ThreadPoolExecutor() as executor:
        results = executor.map(requests.post, [url] * len(payloads),
                               [headers] * len(payloads), payloads)
    return list(results)
```
Because executor.map passes the arguments positionally one by one, the call becomes equivalent to:
```python
requests.post(url, headers, payload)
```
In the requests library the signature of post is roughly:
```python
def post(url, data=None, json=None, **kwargs):
```
So what happens is:

- url is passed as the URL.
- headers is bound to data (which normally represents form data), not to the request headers.
- payload is bound to the json parameter.

Because of this, the actual HTTP request issued in parallel_group does not include your headers at all. With no authorization header or content type forcing JSON encoding, the OpenAI API (or any server in a similar situation) may behave very differently. In your tests it seems that the "misconfigured" requests respond significantly faster (or at least with a different timing profile) than the calls that include the proper headers and JSON encoding.
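You can see this binding without touching the network. Here is a minimal sketch; the probe function and the sample values are hypothetical stand-ins that mimic post's (url, data=None, json=None) signature:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical probe mimicking requests.post's positional parameters
def probe(url, data=None, json=None):
    return {"url": url, "data": data, "json": json}

urls = ["https://api.openai.com/v1/chat/completions"]
headers_list = [{"Authorization": "Bearer sk-..."}]
payloads = [{"model": "gpt-4", "messages": []}]

# executor.map feeds its iterables to the callable strictly by position
with ThreadPoolExecutor() as executor:
    bound = list(executor.map(probe, urls, headers_list, payloads))

print(bound[0]["data"])  # the headers dict landed in data
print(bound[0]["json"])  # the payload landed in json
```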
This difference in how the requests are constructed leads to different network behavior and response times. It isn't a threading bottleneck; it's a consequence of how the positional arguments are interpreted by requests.post.
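One quick way to verify this is to make a single positional call and inspect the PreparedRequest that requests actually sent; the URL, header, and payload values below are placeholders:

```python
import requests

# Placeholder values for illustration only
url = "https://api.openai.com/v1/chat/completions"
headers = {"Authorization": "Bearer sk-..."}
payload = {"model": "gpt-4", "messages": [{"role": "user", "content": "hi"}]}

# Same positional binding as in parallel_group: headers -> data, payload -> json
resp = requests.post(url, headers, payload)
print(resp.status_code)      # likely 401, since no Authorization header was sent
print(resp.request.headers)  # only requests' default headers
print(resp.request.body)     # the headers dict, form-encoded into the body
```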
If you meant to use the proper keyword arguments, you need a small wrapper (a lambda or a named helper) that routes each payload to the json keyword when passing to executor.map; functools.partial alone cannot do this, because the varying payload would still be passed positionally into data. For example:
```python
def parallel_group_fixed(payloads, url, headers):
    with ThreadPoolExecutor() as executor:
        results = executor.map(
            lambda payload: requests.post(url, headers=headers, json=payload),
            payloads,
        )
    return list(results)
```
Now, each call is equivalent to:
```python
requests.post(url, headers=headers, json=payload)
```
which should yield the behavior of parallel_group2 (and, accordingly, the corresponding timings).
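A lambda works here because ThreadPoolExecutor runs the callable in the same process; if you ever switch to ProcessPoolExecutor, the callable must be picklable, so a module-level function like process_payload would be the right choice instead.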
This clarifies that the key difference stems from how arguments are passed to requests.post and not from multithreading per se.
Answer by SolarAdventurer897 • 3 months ago
Your function parallel_group isn't doing what you would hope. Of the three positional arguments you pass to requests.post, only the first one (the URL) lands where you intend: the headers dict is assigned to data and the payload to json. The API is most likely returning an error, but you're ignoring that possibility.
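A sketch of the missing check, assuming parallel_group, payloads, url, and headers from the question:

```python
responses = parallel_group(payloads, url, headers)
for resp in responses:
    if not resp.ok:
        # e.g. 401 Unauthorized when the Authorization header never made it
        print(resp.status_code, resp.text[:200])
```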