Last updated: 2025-01-22 (Wed) 00:37:08

main_prefill

https://github.com/AXERA-TECH/ax-llm/blob/prefill/src/main.cpp

The one found in run_deepseek-r1_1.5B_ax630c.sh.

  • usage: ./main_prefill --prompt=string [options] ...
    options:
      -p, --prompt                        prompt (string)
          --template_filename_axmodel     axmodel path template (string [=tinyllama-int8/tinyllama_l%d.axmodel])
          --filename_post_axmodel         post axmodel path (string [=tinyllama-int8/tinyllama_post.axmodel])
          --tokenizer_type                tokenizer type 0:LLaMa 1:Qwen 2:HTTP 3:Phi3 4:MINICPM (int [=0])
          --filename_tokenizer_model      tokenizer model path (string [=tokenizer.model])
          --filename_tokens_embed         tokens embed path (string [=tinyllama.model.embed_tokens.weight.bfloat16.bin])
          --use_topk                       (bool [=0])
          --bos                            (bool [=1])
          --eos                            (bool [=0])
          --axmodel_num                   num of axmodel(for template) (int [=22])
          --tokens_embed_num              tokens embed num (int [=32000])
          --tokens_embed_size             tokens embed size (int [=2048])
          --use_mmap_load_embed           it can save os memory (bool [=0])
          --dynamic_load_axmodel_layer    it can save cmm memory (bool [=0])
          --live_print                    print in live if set true, else print in end (bool [=0])
          --continue                      continuous dialogue (bool [=0])
      -?, --help                          print this message
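
The defaults of --template_filename_axmodel, --axmodel_num, --tokens_embed_num and --tokens_embed_size are related: the template contains a printf-style %d that is filled in once per layer, and the embed file should hold tokens_embed_num x tokens_embed_size values. The following is a minimal Python sketch of that relationship, not code from main.cpp. The helper names layer_paths and embed_file_bytes are hypothetical. The 0-based layer index and the 2 bytes per value (inferred from "bfloat16" in the default embed file name) are assumptions.

```python
def layer_paths(template, axmodel_num):
    """Expand a printf-style template once per layer.

    Assumption: layer indices start at 0 and run to axmodel_num - 1.
    """
    return [template % i for i in range(axmodel_num)]


def embed_file_bytes(tokens_embed_num, tokens_embed_size, bytes_per_value=2):
    """Expected size of the token embedding file in bytes.

    Assumption: values are stored as bfloat16, i.e. 2 bytes each.
    """
    return tokens_embed_num * tokens_embed_size * bytes_per_value


if __name__ == "__main__":
    # Default values copied from the help output above.
    paths = layer_paths("tinyllama-int8/tinyllama_l%d.axmodel", 22)
    print(paths[0])    # first per-layer model file
    print(paths[-1])   # last per-layer model file
    print(embed_file_bytes(32000, 2048))
```

Comparing the computed byte count with the actual size of the embed file is a quick way to check that --tokens_embed_num and --tokens_embed_size match the model being loaded, under the bfloat16 assumption above.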