DeepHermes 3

Intuitive Responses and Deep Reasoning, in One Model.

DeepHermes 3 from Nous Research is a Llama-3.1 8B based LLM with a toggleable reasoning mode for complex tasks. It combines fast, intuitive responses with deep, chain-of-thought reasoning.

Zac Zuo
Hi everyone! Sharing DeepHermes 3, a new LLM from Nous Research. Most LLMs give you fast, "intuitive" answers. DeepHermes 3 can do that too, but it can also activate a "deep thinking" mode, where it uses long chains of thought to reason through complex problems.

Key Features:

  • 🧠 Dual Modes: Fast, intuitive responses and deliberate, chain-of-thought reasoning.

  • 🦙 Llama-3.1 Base: Built on the Llama-3.1 8B model.

  • 🛠️ Agentic Capabilities: Supports function calling and JSON mode.

  • 🗣️ Strong Conversationalist: Good at roleplaying and multi-turn conversations.

  • 📜 Open Source: Weights are available.

This is an experimental preview, so expect some quirks, but the potential for controllable reasoning is huge.
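
If you want to try the toggle locally, here's a rough sketch of what switching between the two modes can look like with Hugging Face transformers. Treat the repo name, the activating system prompt, and the <think>-tag behaviour below as placeholders, and check the model card for the exact system prompt Nous recommends.

```python
# Rough sketch: toggling DeepHermes 3's reasoning mode via the system prompt.
# The repo id and the activation prompt below are placeholders; see the model
# card for the exact prompt that switches on long chain-of-thought reasoning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/DeepHermes-3-Llama-3-8B-Preview"  # placeholder repo name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def ask(question: str, deep_thinking: bool = False) -> str:
    messages = []
    if deep_thinking:
        # Placeholder system prompt: assumed to switch the model into
        # long chain-of-thought mode, with reasoning wrapped in <think> tags.
        messages.append({
            "role": "system",
            "content": "You are a deep thinking AI. Reason step by step inside "
                       "<think></think> tags before giving your final answer.",
        })
    messages.append({"role": "user", "content": question})

    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        input_ids, max_new_tokens=2048, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

# Fast, intuitive answer:
print(ask("What is the capital of France?"))
# Deliberate, chain-of-thought answer:
print(ask("If a train leaves at 3pm at 60 mph, when does it cover 150 miles?", deep_thinking=True))
```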
Xi.Z

This is an interesting evolution in LLMs! Building on Llama-3.1 8B with a toggleable reasoning mode is a clever approach to balancing quick responses with deeper analysis.

The innovation here seems to be the dual-mode capability:

  • Fast, intuitive responses for simple tasks

  • Chain-of-thought reasoning for complex problems

Being open source and built on a relatively compact 8B parameter model makes this particularly accessible for developers and researchers.

Really curious about:

  • What prompted the decision to make reasoning toggleable?

  • How does performance compare to larger models in each mode?

  • What types of tasks show the biggest benefits from the dual-mode approach?

Quick questions:

  • How do you determine when to switch modes?

  • What's the performance overhead of the reasoning mode?

  • Any plans for task-specific optimizations?

The launch engagement suggests there's significant interest in more flexible, efficient LLM architectures. You're essentially creating an adaptable model that can switch between quick and deep thinking modes!

Keep pushing the boundaries of LLM architecture - you're showing how models can be both efficient and thorough! 🧠⚡

Looking forward to seeing how the community builds on this. This feels like an important step in making LLMs more practically useful! 🚀

P.S. The open source nature could lead to interesting forks optimized for specific use cases.