Pull requests: huggingface/tokenizers

Pull requests list

RUSTSEC-2024-0436 - replace paste with pastey
#1834 opened Jul 25, 2025 by nystromjd
Unused Unicode Character Filter
#1832 opened Jul 23, 2025 by sanderland
remove stray comment
#1831 opened Jul 22, 2025 by sanderland
Add enforce_utf8_boundaries option to BpeTrainer
#1830 opened Jul 22, 2025 by sanderland
Faster Whitespace PreTokenizer (Drop-in Replacement)
#1822 opened Jul 7, 2025 by 8ria
Add 3.13t CI using pytest-run-parallel
#1809 opened Jun 23, 2025 by ngoldbaum
Fix typo in README
#1808 opened Jun 23, 2025 by aisk
Track lockfile
#1806 opened Jun 22, 2025 by sftse
Adding multiprocessing for sentencepiece_extractor
#1804 opened Jun 19, 2025 by AamodThakur
add group capture to replace
#1788 opened Jun 3, 2025 by cboseak
Add Truncate pre-tokenizer
#1783 opened May 27, 2025 by ArthurZucker Draft
Update decode stream api
#1780 opened May 27, 2025 by ArthurZucker
Make unigram cache optional
#1763 opened Apr 18, 2025 by wangrunji0408
Implement Append normalizer
#1755 opened Mar 24, 2025 by austinleedavis
Add FxHash and ShortStringOptimization.
#1733 opened Feb 10, 2025 by MeetThePatel
3 of 4 tasks
Does windows aarch work ?
#1719 opened Jan 10, 2025 by Narsil
Draft backtrack
#1712 opened Jan 3, 2025 by ArthurZucker Draft
Fast regex
#1605 opened Aug 8, 2024 by ArthurZucker Draft