After a full day of talks in the AI DevRoom at FOSDEM and the talks track at AI Plumbers, this final presentation was meant to spark controversy and raise questions for the unconference session that came next.

Key moments from the talk:

0:30 - Picking up where "The Local AI Rebellion" left off
1:15 - Inference vs. training needs; inference vs. training chips
3:03 - What kind of silicon architecture would work best
4:25 - Frameworks, compilers, and the "stuff" that makes a system
6:18 - A history lesson on inference frameworks: how we ended up where we are
18:00 - An ancient history lesson: recognizing similar patterns
19:57 - Refactoring? Why are things so complicated? What is the right level of abstraction? Figuring it out is on us!
26:13 - The tinygrad approach: RISC for op types?
30:00 - Ongoing work, predictions, and AIFoundry principles

The presentation slides are available here:
P.S. Don't miss the hot takes! To be in on all the inside jokes, look for the discussion of the main use case for local AI (if you know, you know; thank you, kobold.cpp), the "uncovering" of hidden dependencies in ML frameworks, some blackhat use cases, and much more!