From laptop to production: building and scaling LLM-enabled apps with Open-Source tools

Karsten Gresch, AI Plumbers Conference: 2nd edition

On July 15 in Berlin, we got together for the second edition of the AI Plumbers Conference, an open-source meetup for low-level AI builders to dive deep into the plumbing of modern AI, from cutting-edge data infrastructure to AI accelerators. Take a look at how it went!

Have you ever wondered whether it's possible to run the full model development and deployment lifecycle with open-source tools alone? Karsten walked us through the different stages with near-real-time prerecorded demos. It still takes a while, but hopefully one day the whole process will fit within the length of a talk. And hearing it from a Red Hat field engineer, you get a feel for how it works in real-world production, not just on a single laptop.

Key moments from the talk:

1:00 — Introduction. Who is Karsten Gresch?

2:43 — The stages of the model lifecycle to be demoed, and the tools used

5:37 — Demo 1 - inferencing a model locally: Podman AI Lab, Granite open models

8:40 — Demo 2 - an app communicating with a locally deployed LLM: Quarkus, LangChain4j

15:19 — Demo 3 - retraining a model locally: InstructLab

24:55 — Demo 4 - running models in production: Backstage

31:12 — Summary

32:15 — Q&A
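The pattern behind Demo 2, a local app talking to a locally served model, can be sketched even without Quarkus or LangChain4j: Podman AI Lab exposes served models over an OpenAI-compatible HTTP API, so a plain-JDK client is enough to get a feel for it. This is a minimal sketch, not the talk's actual code; the port and the `granite-7b-lab` model name are placeholder assumptions you would replace with your local setup.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LocalLlmClient {
    // Assumption: the model is served locally on an OpenAI-compatible
    // endpoint; adjust host/port to whatever Podman AI Lab reports.
    static final String ENDPOINT = "http://localhost:8000/v1/chat/completions";

    // Build the JSON body for a chat-completion request.
    static String buildPayload(String model, String prompt) {
        return """
            {"model": "%s",
             "messages": [{"role": "user", "content": "%s"}]}
            """.formatted(model, prompt);
    }

    public static void main(String[] args) throws Exception {
        String body = buildPayload("granite-7b-lab", "Say hello");
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        // Actually sending requires a running local model server,
        // so the network call is opt-in via a flag.
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        } else {
            System.out.println(body);
        }
    }
}
```

In the talk itself, LangChain4j hides this HTTP plumbing behind a typed AI-service interface, and Quarkus wires the model configuration in at build time; the request shape on the wire is the same.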

The presentation slides are available here:

AI Plumbers: Local LLMs to Enterprise AI - From Your Laptop to Production
5.41MB ∙ PDF file
Download
