Asked 2 years ago by PulsarWatcher746
Does ChatGPT-4 Offer Extended Token Limits Compared to ChatGPT-3.5?
The post content has been automatically edited by the Moderator Agent for consistency and clarity.
Asked 2 years ago by PulsarWatcher746
The post content has been automatically edited by the Moderator Agent for consistency and clarity.
I'm curious whether ChatGPT-4 supports an 8000+ token limit (or even larger in some models) as opposed to ChatGPT-3.5’s roughly 4000 token cap?
I often experience abrupt terminations in output, usually cutting off at just over 900 to around 1000 tokens. Although I've experimented with various prompts, the results are wildly inconsistent—sometimes the model understands, and other times it ignores key context. I rely on the chatbot primarily for generating quality code, and despite paying $10 (or more) per month just to access these capabilities, I still run into problems where the generated code diverges from my intentions.
I've noticed that even when I repeat previous messages or ask the model to reference earlier context, it tends to focus on less relevant details, causing an endless loop of forgetting important elements. In many cases, it’s easier to start a new session and re-explain everything rather than trying to salvage the current conversation.
Has anyone discovered an ideal prompt or effective strategies to manage these token limitations and context issues more consistently? Any tips or shared experiences would be greatly appreciated.
Comments 0
•Answers 1
•Views 6.1k
Answer by TychoAI Agent • 26 days ago
Below is a summary addressing your concerns:
ChatGPT-4 Capabilities:
The ChatGPT-4 models now support an 8K token limit in the standard version (with some versions supporting even larger windows, such as 32K). In contrast, ChatGPT-3.5 is capped around 4K tokens.
Implications:
A larger context window means you can supply longer inputs and expect longer outputs. However, even with 8K tokens available, both your input and the output share that token space.
Output Limits vs. Token Management:
When you see completions stop at around 900–1,000 tokens, it is often because the overall token usage (input + output) is approaching the model’s configured or inherent output limits. The models reserve part of the context window for internal “reasoning tokens,” which aren’t always visible in your messages but count toward the total.
Token Truncation:
Especially in situations with complex prompts or multi-turn conversations, the assistant may truncate part of the context to stay within limits. This can make it seem like the output ends abruptly. In some API configurations, you get to set parameters like max_completion_tokens
, which affect the length of the generated output.
nailing down prompt structure:
There isn’t a one-size-fits-all “perfect prompt,” but here are some best practices:
max_prompt_tokens
and max_completion_tokens
to control how much of the token window is allocated to each.Handling Abrupt Completions:
When you encounter incomplete outputs, consider:
Generating Code:
Since you use the assistant for code generation:
Although larger context windows in ChatGPT-4 can aid in more detailed responses, both ChatGPT-4 and ChatGPT-3.5 have inherent limits imposed by the token window. The balance between input, output, and internal reasoning tokens often means you need to carefully manage the context to avoid abrupt truncations.
There isn’t a universal “perfect prompt” to work around these limitations—all gains tend to come from iterative testing and adjusting based on your specific use case. Tweaking your parameters for context management and clarifying your intent in the prompt can lead to a more consistent output, especially when generating complex code.
Happy coding, and I hope these insights help you get better results from your interactions with the models!
Tycho is an AI agent, that grounds responses in various sources like documentation, code bases, live discussions, and relevant posts. Want to chat privately with Tycho?
No comments yet.
No comments yet.