All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
	rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
	kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
	masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
	yoann.congal@smile.fr, mmaurer@google.com,
	roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk,
	mark.rutland@arm.com, jannh@google.com,
	vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com,
	joel.granados@kernel.org, rostedt@goodmis.org,
	anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org,
	lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
	djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de,
	lennart@poettering.net, brauner@kernel.org,
	linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com,
	parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com,
	hughd@google.com, skhawaja@google.com, chrisl@kernel.org
Subject: Re: [PATCH v7 14/22] mm: memfd_luo: allow preserving memfd
Date: Sun, 23 Nov 2025 17:47:52 +0200	[thread overview]
Message-ID: <aSMsqD5mB2mHHH9v@kernel.org> (raw)
In-Reply-To: <20251122222351.1059049-15-pasha.tatashin@soleen.com>

On Sat, Nov 22, 2025 at 05:23:41PM -0500, Pasha Tatashin wrote:
> From: Pratyush Yadav <ptyadav@amazon.de>
> 
> The ability to preserve a memfd allows userspace to use KHO and LUO to
> transfer its memory contents to the next kernel. This is useful in many
> ways. For one, it can be used with IOMMUFD as the backing store for
> IOMMU page tables. Preserving IOMMUFD is essential for performing a
> hypervisor live update with passthrough devices. memfd support provides
> the first building block for making that possible.
> 
> For another, applications with a large amount of memory that takes time
> to reconstruct, reboots to consume kernel upgrades can be very
> expensive. memfd with LUO gives those applications reboot-persistent
> memory that they can use to quickly save and reconstruct that state.
> 
> While memfd is backed by either hugetlbfs or shmem, currently only
> support on shmem is added. To be more precise, support for anonymous
> shmem files is added.
> 
> The handover to the next kernel is not transparent. All the properties
> of the file are not preserved; only its memory contents, position, and
> size. The recreated file gets the UID and GID of the task doing the
> restore, and the task's cgroup gets charged with the memory.
> 
> Once preserved, the file cannot grow or shrink, and all its pages are
> pinned to avoid migrations and swapping. The file can still be read from
> or written to.
> 
> Use vmalloc to get the buffer to hold the folios, and preserve
> it using kho_preserve_vmalloc(). This doesn't have the size limit.
> 
> Signed-off-by: Pratyush Yadav <ptyadav@amazon.de>
> Co-developed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---

...

> +static int memfd_luo_retrieve_folios(struct file *file,
> +				     struct memfd_luo_folio_ser *folios_ser,
> +				     u64 nr_folios)
> +{
> +	struct inode *inode = file_inode(file);
> +	struct address_space *mapping = inode->i_mapping;
> +	struct folio *folio;
> +	long i = 0;
> +	int err;
> +
> +	for (; i < nr_folios; i++) {
> +		const struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
> +		phys_addr_t phys;
> +		u64 index;
> +		int flags;
> +
> +		if (!pfolio->pfn)
> +			continue;
> +
> +		phys = PFN_PHYS(pfolio->pfn);
> +		folio = kho_restore_folio(phys);
> +		if (!folio) {
> +			pr_err("Unable to restore folio at physical address: %llx\n",
> +			       phys);
> +			goto put_folios;
> +		}
> +		index = pfolio->index;
> +		flags = pfolio->flags;
> +
> +		/* Set up the folio for insertion. */
> +		__folio_set_locked(folio);
> +		__folio_set_swapbacked(folio);
> +
> +		err = mem_cgroup_charge(folio, NULL, mapping_gfp_mask(mapping));
> +		if (err) {
> +			pr_err("shmem: failed to charge folio index %ld: %d\n",
> +			       i, err);
> +			goto unlock_folio;
> +		}
> +
> +		err = shmem_add_to_page_cache(folio, mapping, index, NULL,
> +					      mapping_gfp_mask(mapping));
> +		if (err) {
> +			pr_err("shmem: failed to add to page cache folio index %ld: %d\n",
> +			       i, err);
> +			goto unlock_folio;
> +		}
> +
> +		if (flags & MEMFD_LUO_FOLIO_UPTODATE)
> +			folio_mark_uptodate(folio);
> +		if (flags & MEMFD_LUO_FOLIO_DIRTY)
> +			folio_mark_dirty(folio);
> +
> +		err = shmem_inode_acct_blocks(inode, 1);
> +		if (err) {
> +			pr_err("shmem: failed to account folio index %ld: %d\n",
> +			       i, err);
> +			goto unlock_folio;
> +		}
> +
> +		shmem_recalc_inode(inode, 1, 0);
> +		folio_add_lru(folio);
> +		folio_unlock(folio);
> +		folio_put(folio);
> +	}
> +
> +	return 0;
> +
> +unlock_folio:
> +	folio_unlock(folio);
> +	folio_put(folio);
> +	i++;
 
I'd add a counter and use it int the below for loop.

> +put_folios:
> +	/*
> +	 * Note: don't free the folios already added to the file. They will be
> +	 * freed when the file is freed. Free the ones not added yet here.
> +	 */
> +	for (; i < nr_folios; i++) {
> +		const struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
> +
> +		folio = kho_restore_folio(pfolio->pfn);
> +		if (folio)
> +			folio_put(folio);
> +	}
> +
> +	return err;
> +}

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.

  reply	other threads:[~2025-11-23 15:48 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-22 22:23 [PATCH v7 00/22] Live Update Orchestrator Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 01/22] liveupdate: luo_core: " Pasha Tatashin
2025-11-23 11:12   ` Mike Rapoport
2025-11-23 12:15     ` Pasha Tatashin
2025-11-24  5:07       ` Mike Rapoport
2025-11-24 20:43         ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 02/22] liveupdate: luo_core: integrate with KHO Pasha Tatashin
2025-11-23 11:27   ` Mike Rapoport
2025-11-23 12:03     ` Pasha Tatashin
2025-11-23 14:16       ` Mike Rapoport
2025-11-23 18:23         ` Pasha Tatashin
2025-11-25 13:08           ` Mike Rapoport
2025-11-25 13:59             ` Pasha Tatashin
2025-11-24 14:21   ` Pratyush Yadav
2025-11-25 16:09     ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 03/22] kexec: call liveupdate_reboot() before kexec Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 04/22] liveupdate: luo_session: add sessions support Pasha Tatashin
2025-11-23 14:16   ` Mike Rapoport
2025-11-23 19:07     ` Pasha Tatashin
2025-11-24 14:57   ` Pratyush Yadav
2025-11-22 22:23 ` [PATCH v7 05/22] liveupdate: luo_core: add user interface Pasha Tatashin
2025-11-23 14:19   ` Mike Rapoport
2025-11-23 19:25     ` Pasha Tatashin
2025-11-24 15:11   ` Pratyush Yadav
2025-11-22 22:23 ` [PATCH v7 06/22] liveupdate: luo_file: implement file systems callbacks Pasha Tatashin
2025-11-24  8:18   ` Mike Rapoport
2025-11-25 15:13     ` Pasha Tatashin
2025-11-24 15:44   ` Pratyush Yadav
2025-11-24 15:47     ` Pratyush Yadav
2025-11-25 15:17       ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 07/22] liveupdate: luo_session: Add ioctls for file preservation Pasha Tatashin
2025-11-24  5:20   ` Mike Rapoport
2025-11-22 22:23 ` [PATCH v7 08/22] docs: add luo documentation Pasha Tatashin
2025-11-23 16:05   ` Mike Rapoport
2025-11-23 19:29     ` Pasha Tatashin
2025-11-24 15:49   ` Pratyush Yadav
2025-11-22 22:23 ` [PATCH v7 09/22] MAINTAINERS: add liveupdate entry Pasha Tatashin
2025-11-23 15:29   ` Mike Rapoport
2025-11-24 15:18   ` Pratyush Yadav
2025-11-22 22:23 ` [PATCH v7 10/22] mm: shmem: use SHMEM_F_* flags instead of VM_* flags Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 11/22] mm: shmem: allow freezing inode mapping Pasha Tatashin
2025-11-23 15:29   ` Mike Rapoport
2025-11-23 19:43     ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 12/22] mm: shmem: export some functions to internal.h Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 13/22] liveupdate: luo_file: add private argument to store runtime state Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 14/22] mm: memfd_luo: allow preserving memfd Pasha Tatashin
2025-11-23 15:47   ` Mike Rapoport [this message]
2025-11-24  3:13     ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 15/22] docs: add documentation for memfd preservation via LUO Pasha Tatashin
2025-11-23 16:07   ` Mike Rapoport
2025-11-22 22:23 ` [PATCH v7 16/22] selftests/liveupdate: Add userspace API selftests Pasha Tatashin
2025-11-24  5:24   ` Mike Rapoport
2025-11-24 15:56   ` Pratyush Yadav
2025-11-22 22:23 ` [PATCH v7 17/22] selftests/liveupdate: Add kexec-based selftest for Pasha Tatashin
2025-11-24  5:29   ` Mike Rapoport
2025-11-22 22:23 ` [PATCH v7 18/22] selftests/liveupdate: Add kexec test for multiple and empty sessions Pasha Tatashin
2025-11-24  5:30   ` Mike Rapoport
2025-11-22 22:23 ` [PATCH v7 19/22] selftests/liveupdate: add test infrastructure and scripts Pasha Tatashin
2025-11-24  7:54   ` Mike Rapoport
2025-11-25 18:42     ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 20/22] liveupdate: luo_file: Add internal APIs for file preservation Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 21/22] liveupdate: luo_flb: Introduce File-Lifecycle-Bound global state Pasha Tatashin
2025-11-24 23:45   ` David Matlack
2025-11-25 17:10     ` Pasha Tatashin
2025-11-22 22:23 ` [PATCH v7 22/22] tests/liveupdate: Add in-kernel liveupdate test Pasha Tatashin
2025-11-22 22:44 ` [PATCH v7 00/22] Live Update Orchestrator Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aSMsqD5mB2mHHH9v@kernel.org \
    --to=rppt@kernel.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=ajayachandra@nvidia.com \
    --cc=akpm@linux-foundation.org \
    --cc=aleksander.lobakin@intel.com \
    --cc=aliceryhl@google.com \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=anna.schumaker@oracle.com \
    --cc=axboe@kernel.dk \
    --cc=bartosz.golaszewski@linaro.org \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=brauner@kernel.org \
    --cc=chenridong@huawei.com \
    --cc=chrisl@kernel.org \
    --cc=corbet@lwn.net \
    --cc=cw00.choi@samsung.com \
    --cc=dakr@kernel.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=djeffery@redhat.com \
    --cc=dmatlack@google.com \
    --cc=graf@amazon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=ira.weiny@intel.com \
    --cc=jannh@google.com \
    --cc=jasonmiu@google.com \
    --cc=jgg@nvidia.com \
    --cc=joel.granados@kernel.org \
    --cc=kanie@linux.alibaba.com \
    --cc=lennart@poettering.net \
    --cc=leon@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@weissschuh.net \
    --cc=lukas@wunner.de \
    --cc=mark.rutland@arm.com \
    --cc=masahiroy@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mmaurer@google.com \
    --cc=myungjoo.ham@samsung.com \
    --cc=ojeda@kernel.org \
    --cc=parav@nvidia.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=pratyush@kernel.org \
    --cc=ptyadav@amazon.de \
    --cc=quic_zijuhu@quicinc.com \
    --cc=rafael@kernel.org \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=saeedm@nvidia.com \
    --cc=skhawaja@google.com \
    --cc=song@kernel.org \
    --cc=stuart.w.hayes@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=wagi@kernel.org \
    --cc=witu@nvidia.com \
    --cc=x86@kernel.org \
    --cc=yesanishhere@gmail.com \
    --cc=yoann.congal@smile.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.