I might have missed something, but is it possible to unload a model if it hasn't been used for X minutes? Ollama has something like that, freeing up VRAM for other things (image generation, etc.) when the LLM isn't currently in use.

Replies: 1 comment

I forgot to add: my question is about llama-server, which you can spin up to create an OpenAI-compatible endpoint.
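
As far as I know, llama-server doesn't have a built-in idle timeout the way Ollama does with its keep_alive setting, so one workaround is a small external wrapper that starts the server on demand and terminates the process after a period of inactivity. Below is a minimal sketch of that idea in Python; the model path, ports, timeout, and the overall approach are assumptions for illustration, not llama.cpp functionality, and it only proxies plain (non-streaming) POST requests.

```python
# Hypothetical idle-unload wrapper for llama-server -- NOT a built-in feature,
# just a sketch of the behavior described above. It starts llama-server on
# demand, proxies POST requests to it, and terminates the process (freeing
# VRAM) once it has sat idle longer than IDLE_TIMEOUT.
import subprocess
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

LLAMA_CMD = ["llama-server", "-m", "model.gguf", "--port", "8081"]  # assumed paths
UPSTREAM = "http://127.0.0.1:8081"
IDLE_TIMEOUT = 10 * 60  # seconds; the "X minutes" from the question

proc = None
last_used = time.monotonic()
lock = threading.Lock()

def ensure_running():
    """Start llama-server if it isn't already running."""
    global proc
    with lock:
        if proc is None or proc.poll() is not None:
            proc = subprocess.Popen(LLAMA_CMD)
            time.sleep(10)  # crude wait for model load; polling /health is more robust

def reaper():
    """Terminate the server once it has been idle longer than IDLE_TIMEOUT."""
    global proc
    while True:
        time.sleep(30)
        with lock:
            idle = time.monotonic() - last_used
            if proc is not None and proc.poll() is None and idle > IDLE_TIMEOUT:
                proc.terminate()  # unloads the model, releasing VRAM
                proc.wait()
                proc = None

class Proxy(BaseHTTPRequestHandler):
    # Only handles buffered (non-streaming) POSTs such as /v1/chat/completions.
    def do_POST(self):
        global last_used
        ensure_running()
        last_used = time.monotonic()
        length = int(self.headers.get("Content-Length", 0))
        req = urllib.request.Request(
            UPSTREAM + self.path,
            data=self.rfile.read(length),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            payload = resp.read()
            status = resp.status
        last_used = time.monotonic()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

threading.Thread(target=reaper, daemon=True).start()
ThreadingHTTPServer(("127.0.0.1", 8080), Proxy).serve_forever()
```

Two caveats with this kind of wrapper: it buffers whole responses, so streaming ("stream": true) requests won't work as written, and the first request after an unload pays the full model-load latency again. In exchange, VRAM is free for other workloads (image generation, etc.) whenever the LLM sits idle, which is the behavior asked about.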