I've been using various GPT models since GPT-3 and found that even with minor prompt adjustments, the results were consistent and reasonable. However, the new gpt-3.5-turbo-1106 has been a major step down in quality.
I recently ran several experiments with small projects and simple message formats. To my surprise, the vast majority of responses were simply “I’m sorry, I can’t do that,” with only about 5% yielding relevant results. This is confusing because none of my tests involved mature, violent, or otherwise controversial content that should trip the safety filters.
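For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of how I tally refusal rates over a batch of completions. The `is_refusal` heuristic, phrase list, and sample strings are my own illustrative assumptions, not part of my original test harness:

```python
# Rough heuristic for classifying a completion as a canned refusal.
# The phrase list is an assumption; tune it to the refusals you actually see.
REFUSAL_MARKERS = (
    "i'm sorry, i can't",
    "i cannot assist",
    "as an ai",
)

def is_refusal(text: str) -> bool:
    """Return True if the response looks like a canned refusal."""
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses classified as refusals."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)

# Example with made-up responses:
sample = [
    "I'm sorry, I can't do that.",
    "Here is the summary you asked for: ...",
    "I'm sorry, I can't do that.",
    "I cannot assist with that request.",
]
print(refusal_rate(sample))  # 0.75
```

Running the same prompt batch through both model versions and comparing these rates makes the regression easy to quantify, rather than relying on an eyeballed impression.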
I had to revert to the previous model, gpt-3.5-turbo-0613, as it provided more reliable output. However, since that older version will eventually be deprecated, I'm concerned about how to proceed. I'm curious what changes may be causing these issues and whether there's a workaround to regain the performance I previously enjoyed.