
Commit 8133a3e

Add LLM orchestration and presentation prep post
1 parent 9001ea6 commit 8133a3e

1 file changed: +38 -0 lines changed


content/posts/llm-orchestration.md

Lines changed: 38 additions & 0 deletions
+++
date = '2025-07-11'
draft = true
title = 'LLM Orchestration and System Readiness'
tags = ["AI", "LLM", "SonarQube", "Refactoring", "Agentic AI", "WatsonX", "Ollama", "vLLM", "UCL", "IBM"]
+++
<a href="#tldr" class="btn">Jump to TL;DR</a>
Following last week’s groundwork in static analysis and semantic retrieval, our focus this week has shifted toward multi-model orchestration, LLM infrastructure, and preparing for our upcoming IBM showcase on the 16th.
## 1. Unified Provider Interface
To enable flexible experimentation and future-proofing, we've implemented a unified interface for interacting with multiple LLM providers.
Our abstraction currently supports **Ollama**, **vLLM**, and **WatsonX**, and, following the Open-Closed Principle, it is designed so that new providers can be added without modifying existing code.
This interface lays the foundation for benchmarking, adaptive routing, and graceful fallback based on each provider's availability and performance, and it lets us plug new models into any provider with minimal effort.
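
To make this concrete, here is a minimal sketch of what such an abstraction might look like. The names (`LLMProvider`, `OllamaProvider`, `FallbackProvider`) and the fallback logic are illustrative, not our actual implementation:

```python
from abc import ABC, abstractmethod

import requests


class LLMProvider(ABC):
    """Common contract that every provider adapter implements."""

    @abstractmethod
    def generate(self, prompt: str, model: str, **options) -> str:
        """Send a prompt to the backing service and return the completion text."""


class OllamaProvider(LLMProvider):
    """Adapter for a local Ollama server (default port 11434)."""

    def __init__(self, base_url: str = "http://localhost:11434"):
        self.base_url = base_url

    def generate(self, prompt: str, model: str, **options) -> str:
        # Non-streaming call to Ollama's /api/generate endpoint.
        resp = requests.post(
            f"{self.base_url}/api/generate",
            json={"model": model, "prompt": prompt, "stream": False, **options},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["response"]


class FallbackProvider(LLMProvider):
    """Composite provider: try each backend in order until one succeeds."""

    def __init__(self, providers: list[LLMProvider]):
        self.providers = providers

    def generate(self, prompt: str, model: str, **options) -> str:
        last_error = None
        for provider in self.providers:
            try:
                return provider.generate(prompt, model, **options)
            except requests.RequestException as err:  # provider down or timed out
                last_error = err
        raise RuntimeError("All providers failed") from last_error
```

Adding a vLLM or WatsonX adapter is then just another subclass; the agents only ever see `LLMProvider`, which is what keeps core logic untouched when a provider changes.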
## 2. vLLM Integration for Local Scaling
We've begun setting up [vLLM](https://docs.vllm.ai/en/latest/) as a high-throughput, low-latency alternative to Ollama for serving models locally.
Running it on specialised hardware gives us faster inference and more efficient memory use, both of which are crucial for scaling to real-world workloads.
vLLM now acts as a drop-in alternative to Ollama, helping us compare local model performance under different configurations.
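
For illustration, here's how a batch of prompts might run through vLLM's offline Python API; the model checkpoint below is just an example, not necessarily the one we deploy:

```python
from vllm import LLM, SamplingParams

# Load a model for offline batch inference (example checkpoint).
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(["Explain what a code smell is."], params)
print(outputs[0].outputs[0].text)  # first completion of the first prompt
```

For server-style use, vLLM also exposes an OpenAI-compatible HTTP endpoint (e.g. via `vllm serve <model>`), which is what lets it slot in behind our provider interface just like Ollama.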
## 3. Preparing for the IBM Presentation
With the system architecture stabilising (see [our Agentic Workflow post]({{< relref "agentic-workflow/index.md" >}})), we've started preparing for our upcoming presentation to IBM next Wednesday (July 16th).
This has involved refining our workflows, scripting sample runs, and ensuring every component, especially our Scanner and Strategist agents, works seamlessly across models.
## TL;DR
This week we've focused on orchestration, flexibility, and preparing for next week's presentation:
- **Unified LLM Interface** to switch between Ollama, WatsonX, and vLLM without modifying core agent logic.
- **vLLM Integration Underway** to enable high-performance inference running on dedicated hardware.
- **IBM Presentation Preparations** are in full swing, as we polish our end-to-end flows and solidify our system.
