Svjatoslav Agejenko [Sun, 17 May 2026 21:40:29 +0000 (00:40 +0300)]
refactor: remove model abstraction and obsolete commands
Remove the Model class and all model-related configuration, switching
the task processor to use a single server-side model. This eliminates
the local model selection hierarchy (TOCOMPUTE > skill > default).
Deleted commands:
- ListModelsCommand — no longer needed without local model registry
- JoinFilesCommand — superseded by other workflows
Updated AddTaskHeaderCommand to stop prompting for a model alias and
generating model= in TOCOMPUTE headers.
Updated Task, TaskProcess, and TaskProcessorCommand to remove the model
field and the model-specific timeout fallback.
Added @JsonIgnoreProperties to Configuration for forward compatibility
with old config files that still contain a models key.
Svjatoslav Agejenko [Sun, 17 May 2026 20:22:45 +0000 (23:22 +0300)]
feat(api): migrate from llama-cli subprocess to llama-server REST API
Replace local llama-cli binary execution with HTTP calls to a running
llama-server instance. This enables KV cache reuse across turns, which is
essential for future multi-turn agent mode.
Key changes:
- TaskProcess now POSTs to /v1/chat/completions instead of spawning a process
- Skill format changed from raw prompt template to system_prompt field
- Configuration simplified: removed llamaCliPath, thread counts, modelsDirectory,
and all sampling defaults; added server_url
- Wizard stripped of model discovery and llama-cli path validation
- Added --once flag to process command for one-shot batch execution
- Removed all getEffective* sampling parameter methods from Task
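For illustration, the request described above might be built roughly as follows. This is a minimal sketch, not the project's actual TaskProcess code: the class name ChatRequestSketch, the helpers buildBody and buildRequest, the 10-minute timeout, and the hand-rolled JSON escaping are all assumptions made for the example. Only the /v1/chat/completions path, the server URL setting, and the system prompt field come from the commit message. The sketch builds the request without sending it, so it runs without a live server.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical sketch: builds a chat completion request for llama-server.
public class ChatRequestSketch {

    // Minimal JSON string escaping: backslash, quote, and control characters.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // The skill's system prompt becomes the "system" message;
    // the task text becomes the "user" message.
    static String buildBody(String systemPrompt, String userContent) {
        return "{\"messages\":["
            + "{\"role\":\"system\",\"content\":\"" + jsonEscape(systemPrompt) + "\"},"
            + "{\"role\":\"user\",\"content\":\"" + jsonEscape(userContent) + "\"}]}";
    }

    // serverUrl stands in for the configured server URL (assumed to have no trailing slash).
    static HttpRequest buildRequest(String serverUrl, String systemPrompt, String userContent) {
        return HttpRequest.newBuilder()
            .uri(URI.create(serverUrl + "/v1/chat/completions"))
            .timeout(Duration.ofMinutes(10)) // assumed value, not taken from the project
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(buildBody(systemPrompt, userContent)))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
            "http://localhost:8081", "You are a concise assistant.", "Summarize this file.");
        // Sending would use java.net.http.HttpClient#send; omitted here.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because no model name appears in the body, the server answers with whichever model it was started with, which fits the single server-side model approach.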
Svjatoslav Agejenko [Sat, 16 May 2026 23:48:38 +0000 (02:48 +0300)]
chore: add local Phi-4 LLM server launcher for TCP testing
Add tools/llama-server-phi-4 script to start a fast local llama.cpp server
with the Phi-4 model (Q4_K_M, ~8.3 GiB) on port 8081.
Intended for gradual transition to using LLM over TCP in the project.
Runs at low priority with single slot and 16K context.
Svjatoslav Agejenko [Sat, 16 May 2026 23:27:00 +0000 (02:27 +0300)]
Initial commit