Last updated: 2025-01-22 (Wed) 00:37:08 (17d)
main_prefill
https://github.com/AXERA-TECH/ax-llm/blob/prefill/src/main.cpp
This is the binary that appeared in run_deepseek-r1_1.5B_ax630c.sh.
 usage: ./main_prefill --prompt=string [options] ...
 options:
   -p, --prompt                        prompt (string)
       --template_filename_axmodel     axmodel path template (string [=tinyllama-int8/tinyllama_l%d.axmodel])
       --filename_post_axmodel         post axmodel path (string [=tinyllama-int8/tinyllama_post.axmodel])
       --tokenizer_type                tokenizer type 0:LLaMa 1:Qwen 2:HTTP 3:Phi3 4:MINICPM (int [=0])
       --filename_tokenizer_model      tokenizer model path (string [=tokenizer.model])
       --filename_tokens_embed         tokens embed path (string [=tinyllama.model.embed_tokens.weight.bfloat16.bin])
       --use_topk                      (bool [=0])
       --bos                           (bool [=1])
       --eos                           (bool [=0])
       --axmodel_num                   num of axmodel(for template) (int [=22])
       --tokens_embed_num              tokens embed num (int [=32000])
       --tokens_embed_size             tokens embed size (int [=2048])
       --use_mmap_load_embed           it can save os memory (bool [=0])
       --dynamic_load_axmodel_layer    it can save cmm memory (bool [=0])
       --live_print                    print in live if set true, else print in end (bool [=0])
       --continue                      continuous dialogue (bool [=0])
   -?, --help                          print this message
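As a rough illustration of how some of these defaults relate to each other, the Python sketch below expands the `%d` template from --template_filename_axmodel into one path per layer using --axmodel_num, and estimates the size of the file passed to --filename_tokens_embed from --tokens_embed_num and --tokens_embed_size. The helper names `layer_paths` and `embed_file_bytes` are hypothetical, and two things are assumptions rather than facts from the usage text: that layer numbering starts at 0, and that the embedding file is a dense bfloat16 table (2 bytes per value) with no header. Check main.cpp before relying on either.

```python
# Defaults copied from the usage text of main_prefill.
TEMPLATE = "tinyllama-int8/tinyllama_l%d.axmodel"
AXMODEL_NUM = 22
TOKENS_EMBED_NUM = 32000
TOKENS_EMBED_SIZE = 2048
BYTES_PER_BFLOAT16 = 2


def layer_paths(template: str, num: int, first_index: int = 0) -> list[str]:
    """Expand a printf-style template into one path per layer.

    first_index=0 is an assumption; the real start index is defined in main.cpp.
    """
    return [template % i for i in range(first_index, first_index + num)]


def embed_file_bytes(num_tokens: int, embed_size: int) -> int:
    """Size of a dense bfloat16 embedding table without a header (assumed layout)."""
    return num_tokens * embed_size * BYTES_PER_BFLOAT16


paths = layer_paths(TEMPLATE, AXMODEL_NUM)
print(paths[0], "...", paths[-1])
print(embed_file_bytes(TOKENS_EMBED_NUM, TOKENS_EMBED_SIZE), "bytes")
```

Under these assumptions the default embedding table comes to roughly 131 MB, which may be why the usage text describes --use_mmap_load_embed as a way to save OS memory.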