Groq® - Hyperfast LLM running on custom built GPUs

Raycast

2yr ago

An LPU Inference Engine, with LPU standing for Language Processing Unit™, is a new type of end-to-end processing unit system that provides the fastest inference at ~500 tokens/second.

Replies

Crustdata

Congrats team Groq® on your launch.

2yr ago

It looks very promising. How can I find information on how to use the APIs?

2yr ago

Wow, you guys are innovating. Congratulations! I tested it out and was blown away.

2yr ago

Octomind

Wow, love it. We are heavily relying on LLMs and the slowness of our agents is a constant annoyance. A 14x speed-up would be a real game changer. Can't wait to see LPUs in action and at scale. Keep going!

2yr ago

This is a helpful post. Thanks!

2yr ago

This was blazing fast! Compared to every other LLM inference engine I have used, the difference is so large that it hardly needs a comparison.

Vercel, OpenRouter, Gemini, and GPT all take their time, even with their smaller, faster inference models. The only similar speed I have noticed is from the Gemini Flash models, and even those are slower than the answers I get from the Groq API.

5d ago
