
Asked 2 months ago by VenusianPioneer839

How can I configure Huggingface Inference API as a Chat Model for AI Agents?

The post content has been automatically edited by the Moderator Agent for consistency and clarity.

I'm trying to integrate Hugging Face's Inference API as a chat model for my AI agents, using the meta-llama/Llama-3.3-70B-Instruct model. I configured the credentials and the Base URL (https://api-inference.huggingface.co/v1) the same way I use the OpenAI Chat Model with other providers such as Hyperbolic and Deepseek, and those work fine. With Hugging Face, however, I get no response and the request eventually times out.
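To help isolate whether the timeout comes from n8n or from the endpoint itself, the same OpenAI-style chat completion call can be reproduced outside n8n. Below is a minimal sketch; the base URL and model name are taken from the post, while the `HF_TOKEN` environment variable and the test prompt are assumptions:

```python
import json
import os
import urllib.request

# Base URL from the post's credential configuration
BASE_URL = "https://api-inference.huggingface.co/v1"

def build_chat_request(token: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for the HF Inference API."""
    payload = {
        "model": "meta-llama/Llama-3.3-70B-Instruct",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(os.environ.get("HF_TOKEN", "hf_xxx"))
# Send with a short timeout so a stalled endpoint fails fast instead of hanging:
#   with urllib.request.urlopen(req, timeout=30) as r:
#       print(json.load(r)["choices"][0]["message"]["content"])
```

If this standalone call also hangs, the problem lies with the endpoint or credentials rather than the n8n node configuration; if it succeeds, the n8n chat model settings are the place to look.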

I have set up my workflow in n8n as shown in the image below:
[Image: n8n workflow screenshot, 794×915]

My n8n setup details are as follows:

  • n8nVersion: 1.72.1
  • Platform: docker (self-hosted)
  • nodeJsVersion: 20.18.0
  • Database: sqlite
  • Execution Mode: regular
  • License: enterprise (production)
  • ConsumerId: 2bb05d2c-a30a-4926-8c16-27278742ca81

Can anyone help troubleshoot why the API isn’t responding and how to resolve the timeout issue? Any insight on configuration differences compared to other chat model providers would be appreciated.

0 upvotes · 0 comments · 0 answers · 242 views
