Date: Mon, 16 Mar 2026 15:54:50 -0700
From: Vipin Sharma
To: Samiullah Khawaja
Cc: David Woodhouse, Lu Baolu, Joerg Roedel, Will Deacon, Jason Gunthorpe,
	Robin Murphy, Kevin Tian, Alex Williamson, Shuah Khan,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Saeed Mahameed, Adithya Jayachandran, Parav Pandit, Leon Romanovsky,
	William Tu, Pratyush Yadav, Pasha Tatashin, David Matlack, Andrew Morton,
	Chris Li, Pranjal Shrivastava, YiFei Zhu
Subject: Re: [PATCH 01/14] iommu: Implement IOMMU LU FLB callbacks
Message-ID: <20260316165018.GA1768676.vipinsh@google.com>
References: <20260203220948.2176157-1-skhawaja@google.com>
	<20260203220948.2176157-2-skhawaja@google.com>
In-Reply-To: <20260203220948.2176157-2-skhawaja@google.com>

On Tue, Feb 03, 2026 at 10:09:35PM +0000, Samiullah Khawaja wrote:
> +config IOMMU_LIVEUPDATE
> +	bool "IOMMU live update state preservation support"
> +	depends on LIVEUPDATE && IOMMUFD
> +	help
> +	  Enable support for preserving IOMMU state across a kexec live update.
> +
> +	  This allows devices managed by iommufd to maintain their DMA mappings
> +	  during kexec base kernel update.
> +
> +	  If unsure, say N.
> +

Do we need a separate config? Can't we just use CONFIG_LIVEUPDATE?

>  menuconfig IOMMU_SUPPORT
>  	bool "IOMMU Hardware Support"
>  	depends on MMU
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 0275821f4ef9..b3715c5a6b97 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -15,6 +15,7 @@ obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE_KUNIT_TEST) += io-pgtable-arm-selftests.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_DART) += io-pgtable-dart.o
> +obj-$(CONFIG_IOMMU_LIVEUPDATE) += liveupdate.o

It seems like there is a sorted order for CONFIG_IOMMU_* in the Makefile,
let's keep it the same if possible.

> +static void iommu_liveupdate_free_objs(u64 next, bool incoming)
> +{
> +	struct iommu_objs_ser *objs;
> +
> +	while (next) {
> +		objs = __va(next);

There are also calls to phys_to_virt() in other functions in this patch.
Should we use the same here to be consistent?

> +		next = objs->next_objs;
> +
> +		if (!incoming)
> +			kho_unpreserve_free(objs);
> +		else
> +			folio_put(virt_to_folio(objs));
> +	}
> +}

Instead of passing a boolean and calling with different arguments, I think
it will be simpler to just have two functions:
- iommu_liveupdate_unpreserve()
- iommu_liveupdate_folio_put()

> +
> +static void iommu_liveupdate_flb_free(struct iommu_lu_flb_obj *obj)
> +{
> +	if (obj->iommu_domains)
> +		iommu_liveupdate_free_objs(obj->ser->iommu_domains_phys, false);
> +
> +	if (obj->devices)
> +		iommu_liveupdate_free_objs(obj->ser->devices_phys, false);
> +
> +	if (obj->iommus)
> +		iommu_liveupdate_free_objs(obj->ser->iommus_phys, false);
> +
> +	kho_unpreserve_free(obj->ser);
> +	kfree(obj);
> +}
> +
> +static int iommu_liveupdate_flb_preserve(struct liveupdate_flb_op_args *argp)
> +{
> +	struct iommu_lu_flb_obj *obj;
> +	struct iommu_lu_flb_ser *ser;
> +	void *mem;
> +
> +	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> +	if (!obj)
> +		return -ENOMEM;
> +
> +	mutex_init(&obj->lock);
> +	mem = kho_alloc_preserve(sizeof(*ser));
> +	if (IS_ERR(mem))
> +		goto err_free;
> +
> +	ser = mem;
> +	obj->ser = ser;
> +
> +	mem = kho_alloc_preserve(PAGE_SIZE);
> +	if (IS_ERR(mem))
> +		goto err_free;
> +
> +	obj->iommu_domains = mem;
> +	ser->iommu_domains_phys = virt_to_phys(obj->iommu_domains);
> +
> +	mem = kho_alloc_preserve(PAGE_SIZE);
> +	if (IS_ERR(mem))
> +		goto err_free;
> +
> +	obj->devices = mem;
> +	ser->devices_phys = virt_to_phys(obj->devices);
> +
> +	mem = kho_alloc_preserve(PAGE_SIZE);
> +	if (IS_ERR(mem))
> +		goto err_free;
> +
> +	obj->iommus = mem;
> +	ser->iommus_phys = virt_to_phys(obj->iommus);
> +
> +	argp->obj = obj;
> +	argp->data = virt_to_phys(ser);
> +	return 0;
> +
> +err_free:
> +	iommu_liveupdate_flb_free(obj);

Generally, I have seen that each goto in a function jumps to its own error
label, which frees the corresponding allocation and all the ones which
happened before it. It is easier to read code that way. I know you are also
sharing the free path with iommu_liveupdate_flb_unpreserve(). IMHO, code
readability will be better with separate labels.

> +	return PTR_ERR(mem);
> +}
> +
> +static void iommu_liveupdate_flb_unpreserve(struct liveupdate_flb_op_args *argp)
> +{
> +	iommu_liveupdate_flb_free(argp->obj);
> +}
> +
> +static void iommu_liveupdate_flb_finish(struct liveupdate_flb_op_args *argp)
> +{
> +	struct iommu_lu_flb_obj *obj = argp->obj;
> +
> +	if (obj->iommu_domains)
> +		iommu_liveupdate_free_objs(obj->ser->iommu_domains_phys, true);

Can there be a case where obj->iommu_domains is NULL but
obj->ser->iommu_domains_phys is not? If that is not possible, I would just
simplify the patch and unconditionally call iommu_liveupdate_free_objs().

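To illustrate the unconditional call: the walk in
iommu_liveupdate_free_objs() is guarded by "while (next)", so a zero head is
already a no-op. Below is a minimal userspace sketch of that property. It
assumes the *_phys field stays zero whenever nothing was allocated, which I
have not verified for kho_alloc_preserve(). All names in it (struct
objs_page, fake_phys_to_virt(), fake_virt_to_phys(), build_chain(),
free_objs_chain()) are made up for illustration, and malloc pointers stand in
for physical addresses.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the chained page header; only the link field matters here. */
struct objs_page {
	uint64_t next_objs;	/* "physical" address of the next page, 0 ends the chain */
};

/* Userspace stand-ins for phys_to_virt()/virt_to_phys(): plain pointer casts. */
static struct objs_page *fake_phys_to_virt(uint64_t phys)
{
	return (struct objs_page *)(uintptr_t)phys;
}

static uint64_t fake_virt_to_phys(struct objs_page *page)
{
	return (uint64_t)(uintptr_t)page;
}

/* Build a chain of up to n zeroed pages and return the head "address". */
static uint64_t build_chain(unsigned int n)
{
	uint64_t head = 0;

	while (n--) {
		struct objs_page *page = calloc(1, sizeof(*page));

		if (!page)
			break;	/* return whatever was built so far */
		page->next_objs = head;
		head = fake_virt_to_phys(page);
	}
	return head;
}

/*
 * Free every page in the chain and return how many were freed.
 * A zero head never enters the loop, so callers need no NULL guard.
 */
static unsigned int free_objs_chain(uint64_t next)
{
	unsigned int freed = 0;

	while (next) {
		struct objs_page *objs = fake_phys_to_virt(next);

		next = objs->next_objs;	/* read the link before freeing */
		free(objs);
		freed++;
	}
	return freed;
}
```

With this shape, free_objs_chain(0) simply returns 0, so the "if (obj->...)"
checks in the callers would only be needed if the pointer and the serialized
address could disagree.
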
> +
> +static int iommu_liveupdate_flb_retrieve(struct liveupdate_flb_op_args *argp)
> +{
> +	struct iommu_lu_flb_obj *obj;
> +	struct iommu_lu_flb_ser *ser;
> +
> +	obj = kzalloc(sizeof(*obj), GFP_ATOMIC);
> +	if (!obj)
> +		return -ENOMEM;

Is kzalloc() failure here recoverable whereas
iommu_liveupdate_restore_objs() below is not? If it is not recoverable,
should there be a BUG_ON here?

> +
> +	mutex_init(&obj->lock);
> +	BUG_ON(!kho_restore_folio(argp->data));
> +	ser = phys_to_virt(argp->data);
> +	obj->ser = ser;
> +
> +	iommu_liveupdate_restore_objs(ser->iommu_domains_phys);
> +	obj->iommu_domains = phys_to_virt(ser->iommu_domains_phys);

Can iommu_liveupdate_restore_objs() just return the virtual address, so we
can simplify the code to:

	obj->iommu_domains = iommu_liveupdate_restore_objs(ser->iommu_domains_phys);

> +
> +	iommu_liveupdate_restore_objs(ser->devices_phys);
> +	obj->devices = phys_to_virt(ser->devices_phys);
> +
> +	iommu_liveupdate_restore_objs(ser->iommus_phys);
> +	obj->iommus = phys_to_virt(ser->iommus_phys);
> +
> +	argp->obj = obj;
> +
> +	return 0;
> +}
> +
> diff --git a/include/linux/iommu-lu.h b/include/linux/iommu-lu.h

I would recommend using the full name and not the short "lu".
iommu-liveupdate.h seems more readable and not too long.

> +#define MAX_IOMMU_SERS ((PAGE_SIZE - sizeof(struct iommus_ser)) / sizeof(struct iommu_ser))
> +#define MAX_IOMMU_DOMAIN_SERS \
> +		((PAGE_SIZE - sizeof(struct iommu_domains_ser)) / sizeof(struct iommu_domain_ser))
> +#define MAX_DEVICE_SERS ((PAGE_SIZE - sizeof(struct devices_ser)) / sizeof(struct device_ser))

This is a per-page limit, not the whole serialization limit. Maybe we can
name them something like:
- MAX_IOMMU_SERS_PER_PAGE, or
- MAX_IOMMU_SERS_PAGE_CAPACITY

> +
> +struct iommu_lu_flb_obj {
> +	struct mutex lock;
> +	struct iommu_lu_flb_ser *ser;
> +
> +	struct iommu_domains_ser *iommu_domains;
> +	struct iommus_ser *iommus;
> +	struct devices_ser *devices;
> +} __packed;
> +

I think the naming scheme used here is a little hard to absorb when we have
so many individual structs in this header file. Specifically, struct names
like:
- iommu_domains_ser vs iommu_domain_ser
- iommus_ser vs iommu_ser
- devices_ser vs device_ser
- iommu_objs_ser vs iommu_obj_ser

The first three pairs show a container and its element. The last pair does
not have that relation, even though it follows the same naming pattern.

I would recommend changing the naming scheme of the containers to something
like:

	struct iommu_domain_ser_[hdr|header|table|arr] {};
	struct iommu_ser_hdr {};
	struct device_ser_hdr {};

The individual elements of a container can keep their names. For objs,
something like:

	iommu_objs_ser -> iommu_hdr_meta
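
The per-page naming and the container/element split suggested above could
look roughly like this. It is a minimal userspace sketch: the field layouts
of struct iommu_ser and struct iommu_ser_hdr, the 4 KiB PAGE_SIZE and the
helper iommu_ser_pages_needed() are assumptions for illustration, not taken
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Element: one serialized IOMMU. Fields are hypothetical. */
struct iommu_ser {
	uint64_t token;
	uint64_t flags;
};

/* Container: per-page header followed by the elements stored in that page. */
struct iommu_ser_hdr {
	uint64_t next_phys;	/* next page in the chain, 0 if this is the last */
	uint64_t nr_entries;	/* elements currently used in this page */
	struct iommu_ser entries[];
};

/* How many elements fit in ONE page after the header, not an overall limit. */
#define MAX_IOMMU_SERS_PER_PAGE \
	((PAGE_SIZE - sizeof(struct iommu_ser_hdr)) / sizeof(struct iommu_ser))

/* Compile-time check that a page can hold at least one element. */
static_assert(MAX_IOMMU_SERS_PER_PAGE > 0, "page too small for one iommu_ser");

/* Number of chained pages needed to serialize nr elements (rounds up). */
static size_t iommu_ser_pages_needed(size_t nr)
{
	return (nr + MAX_IOMMU_SERS_PER_PAGE - 1) / MAX_IOMMU_SERS_PER_PAGE;
}
```

With the _PER_PAGE suffix it is clear at the call site that exceeding the
value means chaining another page rather than failing the serialization.
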