BERT, Tokenizer, Python, WordPiece, pybind11, C++, Flash, Trie

🚀 **FlashTokenizer: World's Fastest CPU Tokenizer!** ⚡ 8-15x faster than `BertTokenizerFast` 🛠️ High-performance C++ core 🔄 Parallelized with OpenMP 📦 Easy pip install 💻 Cross-platform (Windows/macOS/Linux) ▶️ Demo: https://youtu.be/a_sTiAXeSE0
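For context on the `BertTokenizerFast` comparison, here is a minimal timing sketch. The Hugging Face half uses the standard `transformers` API; the commented-out FlashTokenizer half is an assumption for illustration only (the import, class name, constructor arguments, and `encode` call are guesses), so check the project's README for the actual interface.

```python
# Minimal timing sketch for comparing tokenizers on the same texts.
import time
from transformers import BertTokenizerFast

texts = ["FlashTokenizer speeds up BERT-style WordPiece tokenization."] * 10_000

hf_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")

start = time.perf_counter()
for t in texts:
    hf_tok.encode(t)
print(f"BertTokenizerFast: {time.perf_counter() - start:.2f}s")

# Hypothetical FlashTokenizer usage -- names below are assumptions, not the documented API:
# from flash_tokenizer import BertTokenizerFlash
# flash_tok = BertTokenizerFlash("vocab.txt", do_lower_case=True)
# start = time.perf_counter()
# for t in texts:
#     flash_tok.encode(t)
# print(f"FlashTokenizer: {time.perf_counter() - start:.2f}s")
```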

rowen (Maker)
👋 Hi Product Hunters! We're excited to launch **FlashTokenizer**, the world's fastest CPU tokenizer, optimized specifically for BERT-style language models. We built it to significantly speed up NLP inference, achieving **8-15x faster performance** than traditional tokenizers. Key features include:

- ⚡ Ultra-fast tokenization
- 🛠️ Optimized C++ performance
- 📦 Simple pip installation
- 💻 Cross-platform compatibility (Windows, macOS, Ubuntu)

We'd love your feedback, thoughts, and questions. Let's discuss! 🚀
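For anyone wondering what the WordPiece and Trie tags refer to, here is a conceptual sketch of trie-backed greedy longest-match WordPiece tokenization in plain Python. It is not FlashTokenizer's actual implementation (the real engine is C++ with OpenMP); every name here is illustrative only.

```python
# Conceptual sketch: greedy longest-match WordPiece using two tries,
# one for word-initial pieces and one for "##" continuation pieces.

class TrieNode:
    __slots__ = ("children", "is_token")
    def __init__(self):
        self.children = {}
        self.is_token = False

def build_trie(tokens):
    root = TrieNode()
    for token in tokens:
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.is_token = True
    return root

def wordpiece(word, init_trie, cont_trie, unk="[UNK]"):
    pieces, start = [], 0
    while start < len(word):
        root = init_trie if start == 0 else cont_trie
        node, end, match_end = root, start, -1
        # Walk the trie to find the longest vocabulary piece starting at `start`.
        while end < len(word) and word[end] in node.children:
            node = node.children[word[end]]
            end += 1
            if node.is_token:
                match_end = end
        if match_end == -1:
            return [unk]  # no piece matches: the whole word is unknown
        piece = word[start:match_end]
        pieces.append(piece if start == 0 else "##" + piece)
        start = match_end
    return pieces

# Toy vocabulary purely for demonstration.
vocab = {"token", "##izer", "fla", "##sh"}
init_trie = build_trie(t for t in vocab if not t.startswith("##"))
cont_trie = build_trie(t[2:] for t in vocab if t.startswith("##"))

print(wordpiece("tokenizer", init_trie, cont_trie))  # ['token', '##izer']
print(wordpiece("flash", init_trie, cont_trie))      # ['fla', '##sh']
```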