Here are the key improvements in Grok 4.1 compared with its predecessors:
▸ Higher user preference: In blind pairwise tests during rollout, Grok 4.1 was preferred 64.78% of the time over the previous production model.
▸ Enhanced emotional and interpersonal ability: It scores higher on emotional-intelligence benchmarks (e.g., EQ-Bench) and gives more nuanced, empathetic responses.
▸ Improved creative writing and style: On creative-writing benchmarks, it shows a more engaging, coherent personality and better stylistic fluency.
▸ Reduced factual errors (hallucinations): Post-training emphasized real-world information-seeking prompts, and the hallucination rate on sampled production queries decreased.
▸ Maintains strong reasoning and general capability: While improving style and interaction, it retains the “razor-sharp intelligence and reliability” of its predecessors.
▸ Better alignment of style, personality, and helpfulness: Training methods were updated to optimise non-verifiable reward signals (style, alignment, personality) more deeply, using advanced agent-based reasoning models as reward models.


