Hacker News

Does anyone have a favorite FIM-capable model? I've been using codellama-13b through ollama with a vim extension I wrote, and it's okay but not amazing. Most of the time I get better code out of Gemma-27b, but it has no FIM (and for some reason codellama-34b has broken inference for me).


I use deepseek-coder-7b-instruct-v1.5 and DeepSeek-Coder-V2-Lite-Instruct when I want speed, and codestral-22B-v0.1 when I want smartness.

All of those are FIM capable, but deepseek-v2-lite in particular is very picky about its prompt template, so make sure you use it correctly...
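For reference, my understanding of the deepseek-coder FIM template looks like this (the full-width ｜ and ▁ characters are part of the special tokens; double-check against the model's tokenizer config before trusting this):

    <｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>

Getting any of those characters wrong tends to fail silently with much worse completions rather than erroring out.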

Depending on your hardware, codestral-22B might be fast enough for everything, but for me it's a bit too slow...

If you can run it, deepseek v2 non-Lite is amazing, but it requires loads of VRAM.


IIRC the codestral FIM tokens aren't properly implemented in llama.cpp/ollama; what backend are you using to run them? I'd probably have to drop down to iq2_xxs or something for the full-fat deepseek, but I'll definitely look into codestral. I'm a big fan of mixtral, so hopefully a MoE code model with FIM comes along soon.

EDIT: nvm, my mistake, looks like it works fine: https://github.com/ollama/ollama/issues/5403
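For anyone else poking at this, my (unverified) understanding of Codestral's FIM format is that the suffix comes first, roughly:

    <s>[SUFFIX]{suffix}[PREFIX]{prefix}

with the model generating the middle after [PREFIX]; worth checking against mistral's own tokenizer docs before relying on it.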


Is the extension you wrote public?


No, but it's a super janky and simple hodgepodge of Stack Overflow and gemma:27b-generated code, so I'll just put it in this comment. You just need curl on your PATH and a vim that's compiled with some specific flag.

    " Return up to a:n lines of context above and below the cursor.
    " Note: the cursor line itself is excluded from both halves.
    function! GetSurroundingLines(n)
        let l:current_line = line('.')
        let l:start_line = max([1, l:current_line - a:n])
        let l:end_line = min([line('$'), l:current_line + a:n])

        let l:lines_before = getline(l:start_line, l:current_line - 1)
        let l:lines_after = getline(l:current_line + 1, l:end_line)

        return [l:lines_before, l:lines_after]
    endfunction

    " Build a CodeLlama infill prompt from the surrounding lines, ask a
    " local ollama server for a completion, and insert it at the cursor.
    function! AIComplete()
        let l:n = 256
        let [l:lines_before, l:lines_after] = GetSurroundingLines(l:n)

        " CodeLlama's infill format: <PRE> {prefix} <SUF>{suffix} <MID>
        let l:prompt = '<PRE> ' . join(l:lines_before, "\n") . ' <SUF>' . join(l:lines_after, "\n") . ' <MID>'

        let l:json_data = json_encode({
            \ 'model': 'codellama:13b-code-q6_K',
            \ 'keep_alive': '30m',
            \ 'stream': v:false,
            \ 'prompt': l:prompt
        \ })

        let l:response = system('curl -s -X POST -H "Content-Type: application/json" -d ' . shellescape(l:json_data) . ' http://localhost:11434/api/generate')
        if v:shell_error
            echoerr 'AIComplete: curl failed (is ollama running?)'
            return
        endif

        let l:completion = json_decode(l:response)['response']
        " Toggle paste mode so autoindent doesn't mangle the inserted text.
        let l:paste_mode = &paste
        set paste
        execute "normal! a" . l:completion
        let &paste = l:paste_mode
    endfunction

    nnoremap <leader>c :call AIComplete()<CR>


Thanks!



