feat(api): migrate from llama-cli subprocess to llama-server REST API
author Svjatoslav Agejenko <svjatoslav@svjatoslav.eu>
Sun, 17 May 2026 20:22:45 +0000 (23:22 +0300)
committer Svjatoslav Agejenko <svjatoslav@svjatoslav.eu>
Sun, 17 May 2026 20:22:45 +0000 (23:22 +0300)
commit 5e240c25fecd7ee236f8a3081d6ae118471cb89b
tree 4a620ef10cb4812faa53ea064036cf5c2fa552e6
parent 8b1fcb8d3dcf2963df1694b77db875c91d44c03e

Replace local llama-cli binary execution with HTTP calls to a running
llama-server instance. This enables KV cache reuse across turns, which is
essential for future multi-turn agent mode.

Key changes:
- TaskProcess now POSTs to /v1/chat/completions instead of spawning a
  llama-cli process
- Skill format changed from a raw prompt template to a system_prompt field
- Configuration simplified: removed llamaCliPath, thread counts, modelsDirectory,
  and all sampling defaults; added server_url
- Wizard stripped of model discovery and llama-cli path validation
- Added --once flag to process command for one-shot batch execution
- Removed all getEffective* sampling parameter methods from Task
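
The request described in the first bullet could look roughly like the sketch below. It is an illustration only, not the code from this commit: the class name `ChatRequestSketch`, its helper methods, the ten-minute timeout, and the example URL `http://localhost:8080` are all assumptions, and it presumes Java 11+ for `java.net.http`. It only builds the request; sending it would need a running llama-server.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical sketch: builds (but does not send) a chat completion request.
public class ChatRequestSketch {

    /** Escapes a string so it can be embedded in a JSON string literal. */
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body: the skill's system_prompt plus the task text as the user message. */
    public static String buildBody(String systemPrompt, String userInput) {
        return "{\"messages\":["
                + "{\"role\":\"system\",\"content\":\"" + jsonEscape(systemPrompt) + "\"},"
                + "{\"role\":\"user\",\"content\":\"" + jsonEscape(userInput) + "\"}"
                + "]}";
    }

    /** Builds a POST request against serverUrl (assumed to have no trailing slash). */
    public static HttpRequest buildRequest(String serverUrl, String systemPrompt, String userInput) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serverUrl + "/v1/chat/completions"))
                .timeout(Duration.ofMinutes(10)) // assumed value; generation can be slow
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(systemPrompt, userInput)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "http://localhost:8080", "You are a concise summarizer.", "Summarize this text.");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send the request, `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` would return the server's JSON response. A real implementation would more likely use a JSON library than hand-rolled escaping.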

doc/examples/skills/default.yaml
doc/examples/skills/summary.yaml
src/main/java/eu/svjatoslav/alyverkko_cli/commands/ListModelsCommand.java
src/main/java/eu/svjatoslav/alyverkko_cli/commands/WizardCommand.java
src/main/java/eu/svjatoslav/alyverkko_cli/commands/task_processor/Task.java
src/main/java/eu/svjatoslav/alyverkko_cli/commands/task_processor/TaskPriorityQueue.java
src/main/java/eu/svjatoslav/alyverkko_cli/commands/task_processor/TaskProcess.java
src/main/java/eu/svjatoslav/alyverkko_cli/commands/task_processor/TaskProcessorCommand.java
src/main/java/eu/svjatoslav/alyverkko_cli/configuration/Configuration.java
src/main/java/eu/svjatoslav/alyverkko_cli/configuration/SkillConfig.java