<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI Inference on Jaeyoung Lee</title><link>https://sleepylee02.github.io/tags/ai-inference/</link><description>Recent content in AI Inference on Jaeyoung Lee</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 23 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://sleepylee02.github.io/tags/ai-inference/index.xml" rel="self" type="application/rss+xml"/><item><title>RT-Swap [Addressing GPU Memory Bottlenecks for Real-Time Multi-DNN Inference]</title><link>https://sleepylee02.github.io/study/2026-01-23-rt-swap/</link><pubDate>Fri, 23 Jan 2026 00:00:00 +0000</pubDate><guid>https://sleepylee02.github.io/study/2026-01-23-rt-swap/</guid><description>What I Studied: RT-Swap: Addressing GPU Memory Bottlenecks for Real-Time Multi-DNN Inference (RTAS'24). Key Takeaways: transparent design, beautiful implementation, proactive management system. Attachment: Slides (PDF).</description></item></channel></rss>