rowen•

10mo ago

GitHub - BERT, Tokenizer, Python, WordPiece, pybind11, C++, Flash, Trie

🚀 **FlashTokenizer: World's Fastest CPU Tokenizer!** ⚡ 8-15x faster than `BertTokenizerFast` 🛠️ High-performance C++ 🔥 Parallel with OpenMP 📦 Easy pip install 💻 Cross-platform (Win/Mac/Linux) ▶️ Demo: https://youtu.be/a_sTiAXeSE0
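
For readers unfamiliar with the WordPiece and Trie tags above, here is a minimal Python sketch of greedy longest-match WordPiece splitting backed by a trie. It is not FlashTokenizer's actual code or API. The toy vocabulary, the two-trie layout (one for word-initial pieces, one for `##` continuations), and all function names are assumptions made purely for illustration.

```python
class TrieNode:
    """One trie node: child characters plus a flag marking a complete token."""
    __slots__ = ("children", "is_token")

    def __init__(self):
        self.children = {}
        self.is_token = False


def build_trie(tokens):
    """Insert every token into a fresh trie and return its root."""
    root = TrieNode()
    for token in tokens:
        node = root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.is_token = True
    return root


def longest_match(root, text, start):
    """Return the end index of the longest token starting at `start`, or -1."""
    node, best = root, -1
    for i in range(start, len(text)):
        node = node.children.get(text[i])
        if node is None:
            break
        if node.is_token:
            best = i + 1
    return best


def wordpiece(word, word_trie, suffix_trie, unk="[UNK]"):
    """Greedy longest-match-first split of a single word into WordPiece tokens."""
    pieces, pos = [], 0
    while pos < len(word):
        # Word-initial pieces and "##" continuation pieces live in separate tries.
        trie = word_trie if pos == 0 else suffix_trie
        end = longest_match(trie, word, pos)
        if end == -1:
            return [unk]  # no piece fits here: the whole word becomes unknown
        piece = word[pos:end]
        pieces.append(piece if pos == 0 else "##" + piece)
        pos = end
    return pieces


# Hypothetical toy vocabulary (not taken from any real BERT vocab file).
vocab = ["un", "play", "##aff", "##able", "##ing", "##ed"]
word_trie = build_trie(t for t in vocab if not t.startswith("##"))
suffix_trie = build_trie(t[2:] for t in vocab if t.startswith("##"))

print(wordpiece("unaffable", word_trie, suffix_trie))
print(wordpiece("playing", word_trie, suffix_trie))
```

Each character of the word is visited once per matched piece, so the trie avoids the repeated substring hashing that a naive "try every suffix length" lookup would do. This is only one of several ways to speed up WordPiece on CPU.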