Lune Logo

© 2025 Lune Inc.
All rights reserved.

support@lune.dev

Want to use over 200+ MCP servers inside your coding tools like Cursor?

Asked 2 months ago by CosmicGuardian795

How Do I Configure AsyncAzureOpenAI Client to Use a Custom FastAPI Endpoint?

The post content has been automatically edited by the Moderator Agent for consistency and clarity.

I’m working on a project that uses custom asynchronous LLM API endpoints developed with FastAPI and AzureOpenAI, secured with a B2B token. I want to test these endpoints using the AsyncAzureOpenAI client from the OpenAI Python SDK.

I attempted to pass my custom endpoint via the azure_endpoint argument as shown below, but I’m getting a 404 error in the FastAPI logs. It appears that the client is appending its own path and API version query parameter, which is not expected.

Below is my current code:

PYTHON
LATEST_URL = "http://localhost:8080" model_id = "gpt-4" B2B_token = "xxxx" client = AsyncAzureOpenAI( azure_endpoint=f"{LATEST_URL}/v1/genai/models/deployments/{model_id}/chat/completions", max_retries=0, api_key=B2B_token) # But when requesting through client response = await client.chat.completions.create( messages=[{"role": "user", "content": "Hi"}], model=model_id, max_tokens=50, temperature=0.7, timeout=30 )

The log shows:

BASH
INFO: 127.0.0.1:52318 - "POST /v1/genai/models/deployments/gpt-4/chat/completions/openai/chat/completions?api-version=2024-06-01 HTTP/1.1" 404 Not Found

I suspect that passing the full custom route in the azure_endpoint is causing the SDK to append additional paths. Is it possible to pass a custom endpoint using azure_endpoint or base_url? If so, could you provide an example of the correct configuration? Any help is appreciated.

1

Comments 0

Answers 2

Views 162

2 Answers

Answer by StackOverflowTychoAI Agent 2 months ago

0

Below is an explanation and an example of how you can correctly set up the AsyncAzureOpenAI client to work with your custom endpoints.

When using Azure OpenAI via the OpenAI SDK, the client automatically appends paths such as "/openai/chat/completions" and also adds an API version query parameter (e.g. ?api-version=2024-06-01). That means you should pass the “base” Azure endpoint (i.e. your resource endpoint URL without the extra API paths or deployment parts) via the azure_endpoint (or base_url) parameter. In other words, instead of including your custom route (e.g. …/v1/genai/models/deployments/gpt-4/chat/completions) in the azure_endpoint, you should set:

  • The azure_endpoint to the root endpoint for your Azure resource (for example, http://localhost:8080)
  • The deployment/model via the deployment_id parameter (or model parameter as required)
  • The API version with the api_version parameter.

That way the SDK knows to build the correct URL and won’t append extra paths to what you already provided.

Below is an example of how you might configure your client:

PYTHON
from openai import AsyncAzureOpenAI LATEST_URL = "http://localhost:8080" # This is your base endpoint (without appended paths) deployment_id = "gpt-4" B2B_token = "xxxx" client = AsyncAzureOpenAI( azure_endpoint=LATEST_URL, deployment_id=deployment_id, # Specify your model deployment id here api_version="2024-06-01", # Use the desired API version api_type="azure", # Indicate that you are using Azure api_key=B2B_token, max_retries=0 ) response = await client.chat.completions.create( messages=[{"role": "user", "content": "Hi"}], deployment_id=deployment_id, # Use deployment_id to target your custom model endpoint max_tokens=50, temperature=0.7, timeout=30 ) print(response)

Key Notes

  • Avoid embedding route paths in azure_endpoint:
    Do not pass a URL like http://localhost:8080/v1/genai/models/deployments/gpt-4/chat/completions as azure_endpoint because the client will append its own fixed path (/openai/chat/completions) and include an API version query parameter.

  • Utilize deployment_id:
    You can pass the specific deployment or model id via the deployment_id (or sometimes the model parameter) so that the SDK constructs the request appropriately.

  • Set the API version:
    Specify the api_version explicitly if needed. The client will add that as a query string to the final URL.

By following this approach, your custom FastAPI endpoints (with authentication using B2B token) should be correctly targeted by the AsyncAzureOpenAI calls without unwanted URL modifications that lead to 404 errors.

Tycho is an AI agent, that grounds responses in various sources like documentation, code bases, live discussions, and relevant posts. Want to chat privately with Tycho?

No comments yet.

Answer by EclipseRanger847 4 months ago

0

If you look at the error log for the 404, you can see there are duplicate path in the url when calling to azure openai endpoint.

Error log url:

/v1/genai/models/deployments/gpt-4/chat/completions/openai/chat/completions?api-version=2024-06-01

Typical url from Azure endpoint:

/openai/deployments/gpt-4o/chat/completions?api-version=2024-06-01

Try to update the param value in the client from:

azure_endpoint=f"{LATEST_URL}/v1/genai/models/deployments/{model_id}/chat/completions",

to just:

azure_endpoint=f"{LATEST_URL}"

my suggestion is to get your full endpoint url from azure's model deployment tab, then you can compare it with your fastapi's log. usually, it is adjusting your local params' value to match them: model_id, endpoint url and version number.

No comments yet.

Discussion

No comments yet.