{"id":2191,"date":"2025-07-07T11:00:23","date_gmt":"2025-07-07T11:00:23","guid":{"rendered":"https:\/\/interactivehpc.dk\/?p=2191"},"modified":"2025-08-12T08:53:03","modified_gmt":"2025-08-12T08:53:03","slug":"webinar-recording-fine-tuning-and-deploying-large-language-models","status":"publish","type":"post","link":"https:\/\/interactivehpc.dk\/?p=2191","title":{"rendered":"Webinar recording: Fine-Tuning and Deploying Large Language Models"},"content":{"rendered":"\n<p class=\"gp-gutenbergpro-43fda\"><\/p>\n\n\n\n<figure class=\"wp-block-video\"><video height=\"1080\" style=\"aspect-ratio: 1920 \/ 1080;\" width=\"1920\" controls src=\"https:\/\/interactivehpc.dk\/wp-content\/uploads\/2025\/08\/video1904462930_edited_1.mp4\"><\/video><\/figure>\n\n\n\n<p class=\"gp-gutenbergpro-fb9b8\">In this video, we guide you through the complete pipeline of fine-tuning large language models (LLMs) for specialised tasks, such as medical question answering, using the NVIDIA NeMo Framework and Triton Inference Server. You will learn how to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Prepare and preprocess open-source datasets for fine-tuning.<\/li>\n\n\n\n<li>Apply&nbsp;<strong>Parameter-Efficient Fine-Tuning (PEFT)<\/strong>&nbsp;using&nbsp;<strong>LoRA<\/strong>&nbsp;with the&nbsp;<strong>NVIDIA NeMo Framework<\/strong>.<\/li>\n\n\n\n<li>Deploy optimised LLMs using&nbsp;<strong>NVIDIA<\/strong>&nbsp;<strong>Triton Inference Server<\/strong>&nbsp;and&nbsp;<strong>TensorRT-LLM<\/strong>.<\/li>\n\n\n\n<li>Generate a synthetic Q&amp;A dataset using&nbsp;<strong>Label Studio<\/strong>&nbsp;connected to a live inference backend.<\/li>\n\n\n\n<li>Fine-tune and evaluate your customised LLM for domain-specific applications.<\/li>\n<\/ul>\n\n\n\n<p class=\"gp-gutenbergpro-ae91d\">All workflows are executed inside a&nbsp;<strong>UCloud<\/strong>&nbsp;project environment with access to GPU resources.<\/p>\n\n\n\n<p class=\"gp-gutenbergpro-e1abd\"><strong>Target audience:&nbsp;<\/strong>Machine learning 
practitioners, researchers, and engineers interested in LLM customisation, domain adaptation, or scalable model deployment.<\/p>\n\n\n\n<p class=\"gp-gutenbergpro-f7620\"><strong>Technical level:<\/strong>&nbsp;Intermediate to advanced.<\/p>\n\n\n\n<p class=\"gp-gutenbergpro-a4570\">Notebooks: <a href=\"https:\/\/github.com\/emolinaro\/ucloud-workshop-28-05-2025\">https:\/\/github.com\/emolinaro\/ucloud-workshop-28-05-2025<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this video, we guide you through the complete pipeline of fine-tuning large language models (LLMs) for specialised tasks, such as medical question answering, using the NVIDIA NeMo Framework and Triton Inference Server. All workflows are executed inside a&nbsp;UCloud&nbsp;project environment with access to GPU resources. Target audience:&nbsp;Machine learning practitioners, researchers, and engineers interested in LLM customisation, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2193,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"gtp_columnspro_styling":"{}","gtp_paragraph_styling":"{\"a45705d3-a807-4e89-b750-b4eeaa091853\":\" .gp-gutenbergpro-a4570 { background-position-x: 50%;\\nbackground-position-y: 50%;\\nbackground-size: cover;\\n
}\"}","gtp_heading_styling":"{}","gtp_spacer_styling":"{}","gtp_video_styling":"{}","gtp_group_styling":"{}","gtp_cover_styling":"{}","footnotes":""},"categories":[40,38,10,47,353,11],"tags":[],"class_list":["post-2191","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-interactive-hpc","category-supercomputing","category-tutorial","category-ucloud","category-webinars-and-tutorials-video","category-workshop"],"lang":"en","translations":{"en":2191,"da":2195},"pll_sync_post":[],"_links":{"self":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts\/2191","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2191"}],"version-history":[{"count":2,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts\/2191\/revisions"}],"predecessor-version":[{"id":2222,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/posts\/2191\/revisions\/2222"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=\/wp\/v2\/media\/2193"}],"wp:attachment":[{"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2191"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2191"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/interactivehpc.dk\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2191"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}