BERT, Tokenizer, Python, WordPiece, pybind11, C++, Flash, Trie

🚀 **FlashTokenizer: World's Fastest CPU Tokenizer!** ⚡ 8-15x faster than `BertTokenizerFast` 🛠️ High-performance C++ core 🔄 Parallelized with OpenMP 📦 Easy pip install 💻 Cross-platform (Windows/macOS/Linux) ▶️ Demo: https://youtu.be/a_sTiAXeSE0
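For context on the `BertTokenizerFast` comparison, here is a minimal timing sketch. The Hugging Face half uses the standard `transformers` API; the commented-out FlashTokenizer half is an assumption for illustration only (the import, class name, constructor arguments, and `encode` call are guesses), so check the project's README for the actual interface.

```python
# Minimal timing sketch for comparing tokenizers on the same texts.
import time
from transformers import BertTokenizerFast

texts = ["FlashTokenizer speeds up BERT-style WordPiece tokenization."] * 10_000

hf_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")

start = time.perf_counter()
for t in texts:
    hf_tok.encode(t)
print(f"BertTokenizerFast: {time.perf_counter() - start:.2f}s")

# Hypothetical FlashTokenizer usage -- names below are assumptions, not the documented API:
# from flash_tokenizer import BertTokenizerFlash
# flash_tok = BertTokenizerFlash("vocab.txt", do_lower_case=True)
# start = time.perf_counter()
# for t in texts:
#     flash_tok.encode(t)
# print(f"FlashTokenizer: {time.perf_counter() - start:.2f}s")
```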

rowen (Maker)
👋 Hi Product Hunters! We're excited to launch **FlashTokenizer**, the world's fastest CPU tokenizer, optimized specifically for BERT-style language models. We built it to significantly speed up NLP inference, achieving **8-15x faster performance** than traditional tokenizers. Key features include:

- ⚡ Ultra-fast tokenization
- 🛠️ Optimized C++ performance
- 📦 Simple pip installation
- 💻 Cross-platform compatibility (Windows, macOS, Ubuntu)

We'd love your feedback, thoughts, and questions. Let's discuss! 🚀
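For anyone wondering what the WordPiece and Trie tags refer to, here is a conceptual sketch of trie-backed greedy longest-match WordPiece tokenization in plain Python. It is not FlashTokenizer's actual implementation (the real engine is C++ with OpenMP); every name here is illustrative only.

```python
# Conceptual sketch: greedy longest-match WordPiece using two tries,
# one for word-initial pieces and one for "##" continuation pieces.

class TrieNode:
    __slots__ = ("children", "is_token")
    def __init__(self):
        self.children = {}
        self.is_token = False

def build_trie(tokens):
    root = TrieNode()
    for token in tokens:
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.is_token = True
    return root

def wordpiece(word, init_trie, cont_trie, unk="[UNK]"):
    pieces, start = [], 0
    while start < len(word):
        root = init_trie if start == 0 else cont_trie
        node, end, match_end = root, start, -1
        # Walk the trie to find the longest vocabulary piece starting at `start`.
        while end < len(word) and word[end] in node.children:
            node = node.children[word[end]]
            end += 1
            if node.is_token:
                match_end = end
        if match_end == -1:
            return [unk]  # no piece matches: the whole word is unknown
        piece = word[start:match_end]
        pieces.append(piece if start == 0 else "##" + piece)
        start = match_end
    return pieces

# Toy vocabulary purely for demonstration.
vocab = {"token", "##izer", "fla", "##sh"}
init_trie = build_trie(t for t in vocab if not t.startswith("##"))
cont_trie = build_trie(t[2:] for t in vocab if t.startswith("##"))

print(wordpiece("tokenizer", init_trie, cont_trie))  # ['token', '##izer']
print(wordpiece("flash", init_trie, cont_trie))      # ['fla', '##sh']
```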