Last updated: 2025-02-14 (Fri) 10:15:01 (154d)

transformers.LlamaTokenizer

Notes

The LLaMA tokenizer is a BPE model based on SentencePiece.
One quirk of SentencePiece is that when decoding a sequence whose first token is the start of a word (e.g. “Banana”), the tokenizer does not prepend the prefix space to the decoded string.
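
The quirk is easiest to see with a toy decoder. The sketch below is plain Python and does not call transformers or sentencepiece. The name `toy_decode` and the hand-written piece lists are hypothetical, and the only thing assumed from SentencePiece is its convention of marking a word start with "▁" (U+2581). It imitates the behavior described above and is not the library's actual implementation.

```python
# Toy illustration only: imitates SentencePiece-style detokenization.
# toy_decode and the piece lists are hypothetical, not a library API.
WORD_START = "\u2581"  # "▁", the marker SentencePiece puts where a space preceded the piece


def toy_decode(pieces):
    """Join pieces, turn word-start markers into spaces, drop the leading space."""
    text = "".join(pieces).replace(WORD_START, " ")
    # The quirk: the prefix space of the first token is not kept.
    return text[1:] if text.startswith(" ") else text


pieces = [WORD_START + "Hello", WORD_START + "Ban", "ana"]

whole = toy_decode(pieces)        # decoded in one call
head = toy_decode(pieces[:1])     # first word only
tail = toy_decode(pieces[1:])     # starts with the word-start token for "Banana"

print(repr(whole))        # 'Hello Banana'
print(repr(tail))         # 'Banana' (no leading space)
print(repr(head + tail))  # 'HelloBanana'
```

In this toy model, decoding a text in separate chunks and concatenating the results loses the space at each chunk boundary, so a caller that decodes incrementally would have to restore that space itself.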