# TODO

- ChatML format (still need to add the special tokens)
- Vicuna dataset merge (yahma/alpaca-cleaned)
- Phi-2 fine-tuning
- Quantize with llama.cpp
- Make the custom component use llama.cpp + ChatML
- Continued synthetic dataset improvements (there are a bunch of TODOs in there)
- Licenses + attributions
- Finish README/docs for initial release
- Function calling as JSON
- Multi-turn prompts; better instruct dataset like dolphin/wizardlm?
- Fine-tune a Phi-1.5 version
- Build llama-cpp-python wheels for `llama-cpp-python>=0.2.24`
- Prime the KV cache with the current "state" so that requests are faster
- Build a proper evaluation framework, not just loss; it should test accuracy on function calling
- Add more remote backends
  - LocalAI (OpenAI compatible)
  - Ollama
  - Support the chat completions API (might fix Ollama + adds support for text-gen-ui characters)
- More config options for the prompt template (allow formats other than ChatML)
- Publish a snapshot of the dataset on HF
- Figure out DPO for refusals + fixing incorrect entity IDs
- Mixtral + prompting (no fine-tuning)
- Use varied system prompts to add behaviors
- Set up GitHub Actions to build wheels optimized for RPis
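Several items above (the ChatML format and function calling as JSON) combine into one flow: wrap turns in ChatML special tokens and parse a JSON service call out of the assistant's reply. A minimal sketch of that flow; the service/entity names and JSON schema here are hypothetical examples, not the project's actual format:

```python
import json


def build_chatml_prompt(system: str, user: str) -> str:
    """Wrap a system + user turn in ChatML special tokens,
    leaving the prompt open for the assistant's reply."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


def parse_function_call(model_output: str) -> dict:
    """Parse a JSON function call emitted by the model."""
    return json.loads(model_output.strip())


prompt = build_chatml_prompt(
    "You control smart home devices. Respond with a JSON service call.",
    "Turn on the kitchen light.",
)
# Example model output (hypothetical schema):
call = parse_function_call(
    '{"service": "light.turn_on", "entity_id": "light.kitchen"}'
)
```

Note that `<|im_start|>` and `<|im_end|>` must be registered as special tokens in the tokenizer (the first TODO item) so they encode as single token IDs rather than being split.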

## More complicated ideas

  • "context requests"
    • basically just let the model decide what RAG/extra context it wants
    • the model predicts special tokens as the first few tokens of its output
    • the requested content is added to the context after the request tokens and then generation continues
    • needs more complicated training b/c multi-turn + there will be some weird masking going on for training the responses properly
  • RAG for getting info for setting up new devices
    • set up vectordb
    • ingest home assistant docs
    • "context request" from above to initiate a RAG search