Launched this week

Humalike
Give your AI agents the social intelligence they're missing
692 followers
Today's models are capable enough. Smart enough. Fast enough. But we still feel they don’t fit in the room. Humalike is building the behavioral infrastructure for humanlike AI agents. The social skills & proactiveness your agents have been missing. APIs, models, benchmarks.

Humalike
Hey PH 👋 Martí here, co-founder of Humalike.
What is Humalike? The behavioral infrastructure for humanlike AI agents. The social skills your agents have been missing.
The problem
A few months ago we built an AI community manager. The second it hit a group chat, everyone knew it was a bot. It talked over people, never knew when to shut up. More features didn't fix it. Today's models are capable enough. Smart enough. Fast enough. But we still feel they don’t fit in the room.
The solution: 7 behavioral APIs
Turn-Taking (Flagship): Knows when to speak and when to stay silent (bundles all other APIs in one).
Theory of Mind: It gives your agent a sense of what people really think and feel.
Norms: Reads the group’s tone and responds the way it’s accepted here.
Persona: Improves personality so your agent is opinionated and takes sides, backed by real community data.
Social Memory: It gives your agent a memory for people, who they are and what matters to them.
Social Signals: Catches the pause before sending, a removed reaction, and an edited message.
Social Observability: Sees who’s engaged, who’s bored, and who’s annoyed.
Model, use-case and stack agnostic, built for groups, not just 1:1.
Extra highlights
💸 $20 in free tokens to start building
🔌 One-shot integrations with Hermes, WhatsApp & Telegram
📄 Backed by in-house research: LoSoNA (social-norm benchmark) + HUMA (a human-passing group facilitator)
🔒 SOC 2 / ISO 27001 in progress
Who it's for: Anyone building agents that must feel human: AI companions, NPCs, tutors, voice agents, groups, humanoids. If you've ever shipped an agent that was smart but felt wrong to use, Humalike is for you.
What we'd love from you: Grab your $20 in tokens and tell us how our APIs improved the experience. Try them with Hermes, Openclaw, or any agent you have deployed! We'll be here all day reading every comment; your feedback shapes what we ship!
Backed by the first investors in ElevenLabs, Revolut & more.
Built by a tiny 🇪🇸×🇵🇱 team that hasn't slept much :))
@mcarmonas the turn-taking problem is so underrated. everyone's focused on making agents smarter but the thing that breaks trust in group settings is way more basic, it's the agent that won't shut up or doesn't read when the room has moved on. curious how you're handling conflict between norms and persona, like when the group tone is reserved but the persona is configured to be opinionated?
jared.so
@rnagulapalle
totally, that's the underrated part, the agent that won't shut up or misses that the room has moved on is what breaks trust.
On norms vs persona: the norms moderate how the agent behaves, but the personality stays the same. A reserved room doesn't change who the agent is, just how and when it expresses it. The persona still takes a side, it's just more measured and better timed. Like a person who's opinionated but reads the room first.
Humalike
@rnagulapalle 100% agree + good intuition :)) tysm for the support
Visla
@mcarmonas This is a really interesting problem. Everyone keeps trying to make agents smarter, but half the battle is just making them less "awkward", not trying to anthropomorphize them. But if they are indeed going to be "teammates" of the future, as some think, then knowing when to talk, when to wait, and when to stay quiet matters a lot more than people realize.
Congrats on the launch, excited to see where this goes.
@mcarmonas @mogabr Thanks for the support! We see a world a few years from now where AI is on every platform, every communication channel, and every physical location. These agents will interact with humans and groups of humans just like we interact with each other. AI doesn't need to become "just like a human", but it has to adapt to the social setting to work alongside us.
Humalike
@mogabr tysm for the supp Gabe! 100% agree with you
Refocus
The Turn-Taking API is the part that jumps out at me. I build voice AI that calls elderly parents every day, and the single hardest thing has been the bot cutting people off. Older folks pause mid-sentence to find a word, and every VAD setup I've tried reads that silence as their turn ending. How does Turn-Taking handle long, uneven pauses? Is it purely acoustic timing, or does it factor in whether the thought is actually complete? Following to see where the benchmarks land.
@igorgurovich It is a hard problem, not solved well by anyone yet, especially in group conversations. One hard thing about it is that you can't rely only on what the other person is doing (e.g. whether there is a moment of silence); turn-taking also needs to take into account your agent's personality, its goals, its relationship with the human, its memory, etc. With this launch we tackle this problem for text and online chat first, while we work on an end-to-end model for turn-taking in voice.
Humalike
@igorgurovich tysm for the supp Igor!
Looks like the "groups, not 1:1" framing is the one most people might underrate. Really good! Turn-taking in a 2-person chat is mostly a latency problem, but the second there are 4 people in the room the agent has to decide whether to speak at all, which is a completely different thing.
I wonder, though: if you're stack-agnostic, how do you actually capture a deleted draft or a pulled reaction? I guess on most platforms that event never leaves the client.
jared.so
@artstavenka1 Hey, you nailed it with that question!
The platform forwards those events (edits, removed reactions, a typing indicator that stops) to you, you send them to us, and we do the interpretation. The hard part is understanding what these signals actually mean. For example, if a person removed a reaction from a message, it could mean a change of heart or nothing at all, depending on the context. Right now agents are not able to interpret these signals, and that's where we come in.
Happy to go deeper on any of this, just ask!
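To make the "platform forwards, you send, we interpret" flow concrete, here is a minimal sketch of the forwarding half only. Everything in it is an assumption for illustration: the raw event shape (`type`, `user`, `ts`, `message_id`), the `SocialSignal` type, and the `normalize` and `to_payload` helpers are hypothetical and are not Humalike's actual API. It only shows one way to normalize platform events and bundle them with recent context, since the same signal can mean different things depending on that context.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional

# Hypothetical signal names, taken from the examples in the reply above.
SUPPORTED_KINDS = {"message_edited", "reaction_removed", "typing_stopped"}


@dataclass(frozen=True)
class SocialSignal:
    kind: str
    user_id: str
    timestamp: float
    message_id: Optional[str] = None


def normalize(event: dict) -> Optional[SocialSignal]:
    """Map a raw platform event (assumed shape) to a SocialSignal.

    Returns None for event types this sketch does not treat as social signals.
    """
    kind = event.get("type")
    if kind not in SUPPORTED_KINDS:
        return None
    return SocialSignal(
        kind=kind,
        user_id=str(event["user"]),
        timestamp=float(event["ts"]),
        message_id=event.get("message_id"),
    )


def to_payload(signal: SocialSignal, recent_messages: List[str]) -> str:
    """Bundle the signal with recent chat context as a JSON string.

    The payload layout is an assumption; interpretation happens elsewhere.
    """
    return json.dumps(
        {"signal": asdict(signal), "context": list(recent_messages)},
        sort_keys=True,
    )
```

Whatever the real payload looks like, the design point is that the signal travels with its surrounding messages rather than on its own.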
Humalike
@artstavenka1 100%!! Thanks for your support Art
The "groups, not 1:1" framing is what got me.
I run a 2k-person Discord and tried putting an agent in the busy channels. In a 1:1 DM it's fine — but the moment 5 people are going back and forth, it either spams every message or freezes and says nothing. There's no in-between.
So my question on Turn-Taking: in a fast group thread, is it scoring "should I speak right now" per incoming message? Or does it hold a running read of the whole conversation and wait for a real opening?
And can I bias it toward "lurk more" — for a channel where I only want it to chime in occasionally?
Humalike
@rudratosh We would love you to connect your agent again to your community but now using Humalike :)) tysm for the supp!
@rudratosh It's funny that you bring up Discord, as it was our first use case when we began working on Humalike. We hit the same issues you described and decided that there's no point in building agents for Discord until turn-taking and the social aspects are solved.
It doesn't respond to every message; it notices if people are still sending messages. It waits for the opening and then addresses everything it has seen so far.
You can tune lurking by adjusting your agent's personality and passing it to the turn-taking component!
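The "hold a running read, wait for an opening, then address everything seen so far" behavior can be sketched with a simple timing heuristic. This is not Humalike's method, which the thread describes as drawing on personality, memory, and other components. The `OpeningGate` class, the `quiet_window` threshold, and the `min_pending` knob (a crude stand-in for "lurk more") are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OpeningGate:
    # Seconds of silence that count as an opening (assumed value).
    quiet_window: float = 4.0
    # Minimum unanswered messages before speaking; raise it to lurk more.
    min_pending: int = 1
    pending: List[str] = field(default_factory=list)
    last_seen: float = float("-inf")

    def observe(self, text: str, ts: float) -> None:
        """Record an incoming message without replying to it."""
        self.pending.append(text)
        self.last_seen = ts

    def take_turn(self, now: float) -> List[str]:
        """Return the batch to address if there is an opening, else []."""
        room_is_quiet = now - self.last_seen >= self.quiet_window
        if room_is_quiet and len(self.pending) >= self.min_pending:
            batch, self.pending = self.pending, []
            return batch
        return []
```

For example, with `quiet_window=4.0`, messages at t=0 and t=1 produce no turn at t=2 (people are still talking), while a check at t=5 returns both messages as one batch to respond to.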
The community manager anecdote is universal, I've watched it happen in Discord servers, Slack workspaces, and group chats. The failure mode isn't the bot being wrong, it's the bot being present. Silence has always been the harder signal to model because there's no reward function for "you correctly didn't do anything."
The split between Turn-Taking and Theory of Mind is what I'd want to understand better. In practice they feel related but the failure modes are different, an agent can have decent turn-taking (waits for pauses, doesn't interrupt) while still fundamentally misreading what people actually want from the conversation. And vice versa: an agent can read the room well emotionally but still fire at the wrong beat. Is Turn-Taking gated by Theory of Mind under the hood, or are they genuinely independent modules that can score high/low separately?
Rooting for this. Building social behavior as infrastructure rather than as prompt tricks is overdue.
@elias_motionfy Yes, exactly! Turn-taking is a component that benefits from all the other components, and we actually use ToM in turn-taking under the hood, nice catch. Turn-taking is the king of all components and it benefits from Social Signals, Norms, Persona, ToM and Memory, because knowing when to say something vs. stay silent requires as much context as possible, and good judgment on top of that context.
We split them because the components can still be used independently. For example, we used the ToM component internally to analyze transcript history after a chat ended, not only to guide the agent in real time.
The split also helps with thinking about Social Intelligence in general. "How do I make my AI behave better and less annoying" is the initial problem. It took us a while to categorize failure modes, understand the different dimensions of social intelligence and create solutions for them. It makes it easier to understand, debug and talk about :))
@mateusz_jacniacki The "categorize failure modes first, then build solutions" progression is exactly the shape of good infrastructure work. Feels obvious in retrospect and impossible in advance, most teams skip that step and end up with a monolithic "make it feel more human" prompt that can't be debugged when it breaks. Splitting the components so you can isolate which one failed on a bad session is the debugging superpower that only shows up if you did the taxonomy work first.
The ToM-in-turn-taking-under-the-hood detail is the honest architecture answer. Bundled feature that also ships as an independent module is the sweet spot, teams get the composed behavior by default but can dig deeper when they need to. That's the pattern I keep seeing work across infrastructure categories.
Curious about the eval side. Building a benchmark like LoSoNA for social norms feels genuinely hard, norms are context-dependent by definition, so any benchmark risks either overfitting to a specific culture or being so generic it doesn't measure anything real. How did you handle that tradeoff, is LoSoNA weighted toward a specific cultural context, or did you build it with explicit deltas per region/community type?
Humalike
@mateusz_jacniacki @elias_motionfy Really appreciate this - you described the motivation very well.
On LoSoNA: we didn’t try to build a “universal norm” benchmark, because that would miss the point. The benchmark is about whether a model can infer a local norm from the conversation and adapt to it when that norm differs from the default assistant behavior.
So the unit we test is not “does the model know what is polite globally?” but “given this group’s demonstrated behavior, can it predict what response fits here?”
That also makes the cultural tradeoff more manageable. Long-term, the right direction is definitely broader coverage across regions, communities, domains, and communication styles, but the core eval is about local group norm inference, not a fixed list of norms.
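As a rough illustration of "given this group's demonstrated behavior, can it predict what response fits here?", here is a toy scoring loop. It is not the LoSoNA format or methodology: the `NormItem` structure, the multiple-choice framing, the `accuracy` function, and the `pick_longest` baseline (a stand-in for default verbose assistant behavior) are all assumptions made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class NormItem:
    group_history: Sequence[str]  # messages showing how this group talks
    candidates: Sequence[str]     # possible replies to choose from
    fitting: int                  # index of the reply that fits the local norm


Chooser = Callable[[Sequence[str], Sequence[str]], int]


def accuracy(items: Sequence[NormItem], choose: Chooser) -> float:
    """Fraction of items where the chooser picks the locally fitting reply."""
    if not items:
        return 0.0
    correct = sum(
        1 for item in items
        if choose(item.group_history, item.candidates) == item.fitting
    )
    return correct / len(items)


def pick_longest(history: Sequence[str], candidates: Sequence[str]) -> int:
    """Baseline that ignores the group and always prefers the longest reply."""
    return max(range(len(candidates)), key=lambda i: len(candidates[i]))


ITEMS = [
    NormItem(
        group_history=["fri?", "yep", "k cool"],
        candidates=[
            "Certainly! I would be delighted to confirm that Friday works.",
            "yep fri works",
        ],
        fitting=1,
    ),
    NormItem(
        group_history=["Dear all, please find the agenda attached."],
        candidates=["Thank you, I will circulate the minutes by Monday.", "k"],
        fitting=0,
    ),
]
```

A chooser that ignores `group_history` can only be right when the local norm happens to match its default, which is the gap the reply above describes.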
Humalike
@elias_motionfy Totally!! Theory of mind can be used as a solo component, but it also complements Turn-taking perfectly! Thanks for the supp!
"APIs, models, benchmarks" implies you have a way to measure social intelligence in agents, which is actually the hardest part of this whole space to get right. What does a benchmark for humanlike behavior look like here, who's evaluating it, and how do you avoid the benchmark just measuring surface-level mimicry like filler words and pacing rather than actual social appropriateness?
@ansari_adin Agreed, evaluating social intelligence is the hardest part. And we don't have the full holy-grail benchmark for social intelligence yet. We are tackling this problem and released a paper about inferring local social norms in LLMs. It doesn't test social intelligence directly, but a model with social intelligence will do well on this benchmark, and it's not trivial to game.
paper link: https://arxiv.org/pdf/2606.14600
P.S. There are some benchmarks that claim to measure social intelligence (e.g. Sotopia) and we are not fans of those. They make big claims without being rigorous.
Triforce Todos
Theory of Mind is the hardest one to get right, honestly.
How are you evaluating whether it's actually working vs just inferring emotions in an obvious way?
@abod_rehman Hey! It's hard to evaluate Theory of Mind, like anything in the Social Intelligence problem space :)
There is a pre-existing body of research on LLMs' theory of mind capabilities, and the interesting finding is that LLMs already have some level of literal theory of mind but have a hard time using this information to adjust their own behavior, which is called functional theory of mind. We rely on that research and our in-house research when approaching this problem.
Interesting read: https://arxiv.org/abs/2509.00559
Humalike
@abod_rehman Totally! tysm for the supp Abdul :))