Humalike - Give your AI agents the social intelligence they're missing

by•3d ago

Today's models are capable enough. Smart enough. Fast enough. But we still feel they don’t fit in the room. Humalike is building the behavioral infrastructure for humanlike AI agents. The social skills & proactiveness your agents have been missing. APIs, models, benchmarks.

Replies

How are you actually measuring "humanlike" behavior beyond the benchmarks you ship, and can customers plug in their own eval scenarios to test against their specific use case?

2d ago

Maker

@ensarokunakol At the end of the day it's your agent, so you can evaluate it in whatever way you want. We want to let you forget about the interaction and social side so you can focus on what matters for your agent.

1d ago

Humalike

Maker

@ensarokunakol To be honest, that's not something we have fully figured out yet. We have some things (observability, intuition, and "simple" benchmarks), but there is still a long way to go on how to measure it. About eval scenarios, yes! The "Social Observability" API does exactly that!!

1d ago

Tried the API over the weekend and the proactive context layer actually feels useful, not gimmicky. Liked that it picks up on social cues I usually have to script by hand.

2d ago

Maker

@havavyad Tysm!

1d ago

Humalike

Maker

@havavyad That's amazing to hear Hava! Have a great day, thx for the supp

1d ago

StartupBase

The social intelligence angle is a sharp wedge. Most agent tooling optimizes for finishing the task and forgets how the interaction actually lands. In practice, are you scoring tone and context, or injecting it into the responses themselves? Feels like something multi-agent setups are going to need soon.

2d ago

Maker

@attacomsian Thanks for the support. We equip your agent with the context and judgment it needs to perform better.

1d ago

Humalike

Maker

@attacomsian Totally! One of the components is social observability, which we use ourselves to figure out how to score evals. Put simply, if you don't complain and feel satisfied with the interaction overall, you had a good experience! I think we can all relate to how annoying, impersonal, and generally unaware agents are (in any use case / product) :))

1d ago

AISA AI Skills Test

the turn-taking problem is so real. I've seen plenty of AI agents that are technically capable but socially exhausting — they jump in too fast, over-explain, and never read the room. curious how you're benchmarking this though — what does 'good' social behavior look like as a metric? is it response timing, or something more nuanced like knowing when a user is thinking vs actually done talking?

2d ago

Maker

@ozandag It's a combination of factors, and the ultimate judges are humans: do they speak with the agent or ignore it? Do they get annoyed by it? Do they trust it? But this can only be evaluated over a long time horizon, so we also look at short-term signals, as you noted.

1d ago

Humalike

Maker

@ozandag 100%! The experience should just feel right. Our metric isn't fully accurate yet, but if you don't complain and have a good experience, that means it's good :))

1d ago

the missing layer isn't intelligence, it's calibration. models can generate perfect answers but they don't know when they're supposed to be quiet. the social debt shows up the moment you drop an agent into a real slack or discord. it either lurks awkwardly or overshares. the version that reads the room first and speaks second is the one people let stay.

curious how you're benchmarking "fits in the room." vibes are hard to measure. is it deference patterns, timing, response latency to social cues? that spec is the whole product.

1d ago

Maker

@thenameisarian Hey 👋 Humans are the ultimate judge. At the end of the day, what matters is: "Do people like the agent?" "Are they annoyed by it?" "Do they socially sanction it?" These questions matter the most but can only be answered over a long time horizon.

1d ago

The benchmark piece is interesting here. For agents, “social intelligence” can get fuzzy fast. I’d want to see failure cases like interrupting too often or being too passive, not just success scores. Are you measuring those negative behaviors too?

1d ago

Maker

@xiaosong001 Hey! The Social Observability component evaluates these failure modes. We focus on failure modes even more than on success stories.
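
To make the two failure modes raised in this exchange concrete (interrupting too often, being too passive), here is a minimal sketch of how they might be counted from a labeled conversation log. It is an illustration only, not Humalike's Social Observability API: the `Turn` record, its `interrupted` and `addressed_agent` labels (assumed to come from human review), and the `failure_rates` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    """One message in a chat log (hypothetical schema)."""
    speaker: str                   # "agent" or a human's name
    interrupted: bool = False      # agent turn judged to cut someone off
    addressed_agent: bool = False  # human turn that directly asked the agent


def failure_rates(turns: List[Turn]) -> Dict[str, float]:
    """Return the share of interrupting agent turns and of missed asks."""
    agent_turns = [t for t in turns if t.speaker == "agent"]
    interruptions = sum(1 for t in agent_turns if t.interrupted)

    asks = 0
    misses = 0
    for i, turn in enumerate(turns):
        if turn.speaker != "agent" and turn.addressed_agent:
            asks += 1
            # Too passive: the agent was asked but did not take the next turn.
            next_turn = turns[i + 1] if i + 1 < len(turns) else None
            if next_turn is None or next_turn.speaker != "agent":
                misses += 1

    return {
        "interruption_rate": interruptions / len(agent_turns) if agent_turns else 0.0,
        "passive_miss_rate": misses / asks if asks else 0.0,
    }
```

Reporting the two rates side by side matters because they trade off: an agent that never speaks scores zero on interruptions while missing every ask.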

1d ago

Congratulations on the launch!
How does Turn Taking decide the best moment for an agent to join a conversation? Does it adapt differently for fast-moving group chats? Really curious about the underlying approach.

2d ago

Maker

@avery_thompson2 Thanks for the support and the question! Turn-taking uses other components, like Social Signals, to make better judgments. Social Signals keeps track of typing speeds and recognizes when a chat is dynamic vs. quiet. Turn-taking also handles interruptions, for example when the agent has already started processing and another message appears in the group chat. That way the agent is never spammy :))
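
The behavior described in this reply can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Humalike's implementation or API: the `TurnTakingGate` class, its thresholds, and its method names are hypothetical, and message arrival rate stands in for the typing-speed signal, whose details the thread does not give.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class TurnTakingGate:
    """Hypothetical gate deciding whether a drafted reply should be sent."""
    window_s: float = 10.0   # assumed look-back window for chat activity
    busy_threshold: int = 3  # assumed: this many recent messages means "dynamic"
    arrivals: Deque[float] = field(default_factory=deque)
    last_message_at: Optional[float] = None
    draft_started_at: Optional[float] = None

    def on_message(self, ts: float) -> None:
        """Record a human message arriving at time ts (seconds)."""
        self.arrivals.append(ts)
        self.last_message_at = ts

    def is_dynamic(self, now: float) -> bool:
        """True when the chat has been busy within the look-back window."""
        while self.arrivals and now - self.arrivals[0] > self.window_s:
            self.arrivals.popleft()
        return len(self.arrivals) >= self.busy_threshold

    def start_draft(self, now: float) -> None:
        """Mark the moment the agent began processing a reply."""
        self.draft_started_at = now

    def should_send(self, now: float) -> bool:
        """Send only if nobody spoke since drafting began and the chat is quiet."""
        if self.draft_started_at is None:
            return False
        interrupted = (
            self.last_message_at is not None
            and self.last_message_at > self.draft_started_at
        )
        return not interrupted and not self.is_dynamic(now)
```

A caller would invoke `on_message` for each incoming message, `start_draft` when generation begins, and check `should_send` before posting; a `False` result means the draft is stale or the room is too busy, so the agent should re-read the chat instead of replying.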

2d ago

Humalike

Maker

@avery_thompson2 tysm!!

2d ago

Hello Inbox

Interesting... may test this with the RAG system I built. Good luck with the launch!

2d ago

Humalike

Maker

@ismaelyws Amazing :)) Get back to us if you encounter any issue / weird behavior!

2d ago

Social intelligence is exactly the layer that separates a demo from something a business will actually put on the phone. In production the failures are almost never 'wrong answer' - they're tone, over-promising, or not knowing when to shut up and hand off to a human. How are you measuring 'social' correctness? That's the part that's brutal to eval. Congrats on the launch.

2d ago

Humalike

Maker

@david_marko Thanks for the supp David!

2d ago

Maker

@david_marko Yes, it's hard to measure and evaluate, especially in an automated way, since it usually takes a human to judge what is appropriate in a social setting. We have an in-house research team working on evals, and we open-sourced one of them, which targets how well an LLM adjusts to the group it is speaking to.

Paper link: https://arxiv.org/pdf/2606.14600

By the way, I see you are the maker of Worvi, which could benefit from Humalike APIs. If you have more feedback or ideas, please let us know!

2d ago

The behavioral angle feels really fresh, not just another wrapper around an LLM. The proactivity piece is what stood out when I poked around, agents actually initiating instead of waiting on prompts.

2d ago

Humalike

Maker

@nihal412209 Indeed! tysm for the supp!

2d ago
