* [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-15 18:23 [RFC v2 00/16] Live Update Orchestrator Pasha Tatashin
@ 2025-05-15 18:23 ` Pasha Tatashin
2025-05-15 23:15 ` James Houghton
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Pasha Tatashin @ 2025-05-15 18:23 UTC (permalink / raw)
To: pratyush, jasonmiu, graf, changyuanl, pasha.tatashin, rppt,
dmatlack, rientjes, corbet, rdunlap, ilpo.jarvinen, kanie, ojeda,
aliceryhl, masahiroy, akpm, tj, yoann.congal, mmaurer,
roman.gushchin, chenridong, axboe, mark.rutland, jannh,
vincent.guittot, hannes, dan.j.williams, david, joel.granados,
rostedt, anna.schumaker, song, zhangguopeng, linux, linux-kernel,
linux-doc, linux-mm, gregkh, tglx, mingo, bp, dave.hansen, x86,
hpa, rafael, dakr, bartosz.golaszewski, cw00.choi, myungjoo.ham,
yesanishhere, Jonathan.Cameron, quic_zijuhu, aleksander.lobakin,
ira.weiny, andriy.shevchenko, leon, lukas, bhelgaas, wagi,
djeffery, stuart.w.hayes, ptyadav
Introduce the framework within LUO to support preserving specific types
of file descriptors across a live update transition. This allows
stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
the new kernel.
Note: The core logic for iterating through the luo_files_list and
invoking the handler callbacks (prepare, freeze, cancel, finish)
within luo_do_files_*_calls, as well as managing the u64 data
persistence via the FDT for individual files, is currently implemented
as stubs in this patch. This patch sets up the registration, FDT layout,
and retrieval framework.
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
drivers/misc/liveupdate/Makefile | 1 +
drivers/misc/liveupdate/luo_core.c | 19 +
drivers/misc/liveupdate/luo_files.c | 563 +++++++++++++++++++++++++
drivers/misc/liveupdate/luo_internal.h | 11 +
include/linux/liveupdate.h | 62 +++
5 files changed, 656 insertions(+)
create mode 100644 drivers/misc/liveupdate/luo_files.c
diff --git a/drivers/misc/liveupdate/Makefile b/drivers/misc/liveupdate/Makefile
index df1c9709ba4f..b4cdd162574f 100644
--- a/drivers/misc/liveupdate/Makefile
+++ b/drivers/misc/liveupdate/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
obj-y += luo_core.o
+obj-y += luo_files.o
obj-y += luo_subsystems.o
diff --git a/drivers/misc/liveupdate/luo_core.c b/drivers/misc/liveupdate/luo_core.c
index 417e7f6bf36c..ab1d76221fe2 100644
--- a/drivers/misc/liveupdate/luo_core.c
+++ b/drivers/misc/liveupdate/luo_core.c
@@ -110,6 +110,10 @@ static int luo_fdt_setup(struct kho_serialization *ser)
if (ret)
goto exit_free;
+ ret = luo_files_fdt_setup(fdt_out);
+ if (ret)
+ goto exit_free;
+
ret = luo_subsystems_fdt_setup(fdt_out);
if (ret)
goto exit_free;
@@ -145,7 +149,13 @@ static int luo_do_prepare_calls(void)
{
int ret;
+ ret = luo_do_files_prepare_calls();
+ if (ret)
+ return ret;
+
ret = luo_do_subsystems_prepare_calls();
+ if (ret)
+ luo_do_files_cancel_calls();
return ret;
}
@@ -154,18 +164,26 @@ static int luo_do_freeze_calls(void)
{
int ret;
+ ret = luo_do_files_freeze_calls();
+ if (ret)
+ return ret;
+
ret = luo_do_subsystems_freeze_calls();
+ if (ret)
+ luo_do_files_cancel_calls();
return ret;
}
static void luo_do_finish_calls(void)
{
+ luo_do_files_finish_calls();
luo_do_subsystems_finish_calls();
}
static void luo_do_cancel_calls(void)
{
+ luo_do_files_cancel_calls();
luo_do_subsystems_cancel_calls();
}
@@ -436,6 +454,7 @@ static int __init luo_startup(void)
}
__luo_set_state(LIVEUPDATE_STATE_UPDATED);
+ luo_files_startup(luo_fdt_in);
luo_subsystems_startup(luo_fdt_in);
return 0;
diff --git a/drivers/misc/liveupdate/luo_files.c b/drivers/misc/liveupdate/luo_files.c
new file mode 100644
index 000000000000..953fc40db3d7
--- /dev/null
+++ b/drivers/misc/liveupdate/luo_files.c
@@ -0,0 +1,563 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin <pasha.tatashin@soleen.com>
+ */
+
+/**
+ * DOC: LUO file descriptors
+ *
+ * LUO provides the infrastructure necessary to preserve
+ * specific types of stateful file descriptors across a kernel live
+ * update transition. The primary goal is to allow workloads, such as virtual
+ * machines using vfio, memfd, or iommufd to retain access to their essential
+ * resources without interruption after the underlying kernel is updated.
+ *
+ * The framework operates based on handler registration and instance tracking:
+ *
+ * 1. Handler Registration: Kernel modules responsible for specific file
+ * types (e.g., memfd, vfio) register a &struct liveupdate_filesystem
+ * handler. This handler contains callbacks (&liveupdate_filesystem.prepare,
+ * &liveupdate_filesystem.freeze, &liveupdate_filesystem.finish, etc.)
+ * and a unique 'compatible' string identifying the file type.
+ * Registration occurs via liveupdate_register_filesystem().
+ *
+ * 2. File Instance Tracking: When a potentially preservable file needs to be
+ * managed for live update, the core LUO logic (luo_register_file()) finds a
+ * compatible registered handler using its &liveupdate_filesystem.can_preserve
+ * callback. If found, an internal &struct luo_file instance is created,
+ * assigned a unique u64 'token', and added to a list.
+ *
+ * 3. State Persistence (FDT): During the LUO prepare/freeze phases, the
+ * registered handler callbacks are invoked for each tracked file instance.
+ * These callbacks can generate a u64 data payload representing the minimal
+ * state needed for restoration. This payload, along with the handler's
+ * compatible string and the unique token, is stored in a dedicated
+ * '/file-descriptors' node within the main LUO FDT blob passed via
+ * Kexec Handover (KHO).
+ *
+ * 4. Restoration: In the new kernel, the LUO framework parses the incoming
+ * FDT to reconstruct the list of &struct luo_file instances. When the
+ * original owner requests the file, luo_retrieve_file() uses the corresponding
+ * handler's &liveupdate_filesystem.retrieve callback, passing the persisted
+ * u64 data, to recreate or find the appropriate &struct file object.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/err.h>
+#include <linux/libfdt.h>
+#include <linux/liveupdate.h>
+#include <linux/mutex.h>
+#include <linux/rwsem.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/xarray.h>
+#include "luo_internal.h"
+
+#define LUO_FILES_NODE_NAME "file-descriptors"
+#define LUO_FILES_COMPATIBLE "file-descriptors-v1"
+
+static DEFINE_XARRAY(luo_files_xa_in);
+static DEFINE_XARRAY(luo_files_xa_out);
+static bool luo_files_xa_in_recreated;
+
+/* Regestred filesystems. */
+static DECLARE_RWSEM(luo_filesystems_list_rwsem);
+static LIST_HEAD(luo_filesystems_list);
+
+static void *luo_fdt_out;
+static void *luo_fdt_in;
+
+static u64 luo_next_file_token;
+
+/**
+ * struct luo_file - Represents a file descriptor instance preserved
+ * across live update.
+ * @fs: Pointer to the &struct liveupdate_filesystems containing
+ * the implementation of prepare, freeze, cancel, and finish
+ * operations specific to this file's type.
+ * @file: A pointer to the kernel's &struct file object representing
+ * the open file descriptor that is being preserved.
+ * @private_data: Internal storage used by the live update core framework
+ * between phases.
+ * @reclaimed: Flag indicating whether this preserved file descriptor has
+ * been successfully 'reclaimed' (e.g., requested via an ioctl)
+ * by user-space or the owning kernel subsystem in the new
+ * kernel after the live update.
+ * @state: The current state of file descriptor, it is allowed to
+ * prepare, freeze, and finish FDs before the global state
+ * switch.
+ * @mutex: Lock to protect FD state, and allow independently to change
+ * the FD state compared to global state.
+ *
+ * This structure holds the necessary callbacks and context for managing a
+ * specific open file descriptor throughout the different phases of a live
+ * update process. Instances of this structure are typically allocated,
+ * populated with file-specific details (&file, &arg, callbacks, compatibility
+ * string, token), and linked into a central list managed by the LUO. The
+ * private_data field is used internally by the core logic to store state
+ * between phases.
+ */
+struct luo_file {
+ struct liveupdate_filesystem *fs;
+ struct file *file;
+ u64 private_data;
+ bool reclaimed;
+ enum liveupdate_state state;
+ struct mutex mutex;
+};
+
+/**
+ * luo_files_startup - Validates the LUO file-descriptors FDT node at startup.
+ * @fdt: Pointer to the LUO FDT blob passed from the previous kernel.
+ *
+ * This __init function checks the existence and validity of the
+ * '/file-descriptors' node in the FDT. This node is considered mandatory. It
+ * calls panic() if the node is missing, inaccessible, or invalid (e.g., missing
+ * compatible, wrong compatible string), indicating a critical configuration
+ * error for LUO.
+ */
+void __init luo_files_startup(void *fdt)
+{
+ int ret, node_offset;
+
+ node_offset = fdt_subnode_offset(fdt, 0, LUO_FILES_NODE_NAME);
+ if (node_offset < 0)
+ panic("Failed to find /file-descriptors node\n");
+
+ ret = fdt_node_check_compatible(fdt, node_offset,
+ LUO_FILES_COMPATIBLE);
+ if (ret) {
+ panic("FDT '%s' is incompatible with '%s' [%d]\n",
+ LUO_FILES_NODE_NAME, LUO_FILES_COMPATIBLE, ret);
+ }
+ luo_fdt_in = fdt;
+}
+
+static void luo_files_recreate_luo_files_xa_in(void)
+{
+ int parent_node_offset, file_node_offset;
+ const char *node_name, *fdt_compat_str;
+ struct liveupdate_filesystem *fs;
+ struct luo_file *luo_file;
+ const void *data_ptr;
+ int ret = 0;
+
+ if (luo_files_xa_in_recreated || !luo_fdt_in)
+ return;
+
+ /* Take write in order to gurantee that we re-create list once */
+ down_write(&luo_filesystems_list_rwsem);
+ if (luo_files_xa_in_recreated)
+ goto exit_unlock;
+
+ parent_node_offset = fdt_subnode_offset(luo_fdt_in, 0,
+ LUO_FILES_NODE_NAME);
+
+ fdt_for_each_subnode(file_node_offset, luo_fdt_in, parent_node_offset) {
+ bool handler_found = false;
+ u64 token;
+
+ node_name = fdt_get_name(luo_fdt_in, file_node_offset, NULL);
+ if (!node_name) {
+ panic("Skipping FDT subnode at offset %d: Cannot get name\n",
+ file_node_offset);
+ }
+
+ ret = kstrtou64(node_name, 0, &token);
+ if (ret < 0) {
+ panic("Skipping FDT node '%s': Failed to parse token\n",
+ node_name);
+ }
+
+ fdt_compat_str = fdt_getprop(luo_fdt_in, file_node_offset,
+ "compatible", NULL);
+ if (!fdt_compat_str) {
+ panic("Skipping FDT node '%s': Missing 'compatible' property\n",
+ node_name);
+ }
+
+ data_ptr = fdt_getprop(luo_fdt_in, file_node_offset, "data",
+ NULL);
+ if (!data_ptr) {
+ panic("Can't recover property 'data' for FDT node '%s'\n",
+ node_name);
+ }
+
+ list_for_each_entry(fs, &luo_filesystems_list, list) {
+ if (!strcmp(fs->compatible, fdt_compat_str)) {
+ handler_found = true;
+ break;
+ }
+ }
+
+ if (!handler_found) {
+ panic("Skipping FDT node '%s': No registered handler for compatible '%s'\n",
+ node_name, fdt_compat_str);
+ }
+
+ luo_file = kmalloc(sizeof(*luo_file),
+ GFP_KERNEL | __GFP_NOFAIL);
+ luo_file->fs = fs;
+ luo_file->file = NULL;
+ memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
+ luo_file->reclaimed = false;
+ mutex_init(&luo_file->mutex);
+ luo_file->state = LIVEUPDATE_STATE_UPDATED;
+ ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
+ GFP_KERNEL | __GFP_NOFAIL));
+ if (ret < 0) {
+ panic("Failed to store luo_file for token %llu in XArray: %d\n",
+ token, ret);
+ }
+ }
+ luo_files_xa_in_recreated = true;
+
+exit_unlock:
+ up_write(&luo_filesystems_list_rwsem);
+}
+
+/**
+ * luo_files_fdt_setup - Adds and populates the 'file-descriptors' node in the
+ * FDT.
+ * @fdt: Pointer to the LUO FDT blob.
+ *
+ * Add file-descriptors node and each FD node to the LUO FDT blob.
+ *
+ * Returns: 0 on success, negative errno on failure.
+ */
+int luo_files_fdt_setup(void *fdt)
+{
+ int ret, files_node_offset, node_offset;
+ const u64 zero_data = 0;
+ unsigned long token;
+ struct luo_file *h;
+ char token_str[19];
+
+ ret = fdt_add_subnode(fdt, 0, LUO_FILES_NODE_NAME);
+ if (ret < 0)
+ goto exit_error;
+
+ files_node_offset = ret;
+ ret = fdt_setprop_string(fdt, files_node_offset, "compatible",
+ LUO_FILES_COMPATIBLE);
+ if (ret < 0)
+ goto exit_error;
+
+ xa_for_each(&luo_files_xa_out, token, h) {
+ snprintf(token_str, sizeof(token_str), "%#0llx", (u64)token);
+
+ ret = fdt_add_subnode(fdt, files_node_offset, token_str);
+ if (ret < 0)
+ goto exit_error;
+
+ node_offset = ret;
+ ret = fdt_setprop_string(fdt, node_offset, "compatible",
+ h->fs->compatible);
+ if (ret < 0)
+ goto exit_error;
+
+ ret = fdt_setprop(fdt, node_offset, "data",
+ &zero_data, sizeof(zero_data));
+ }
+
+ luo_fdt_out = fdt;
+
+ return 0;
+exit_error:
+ pr_err("Failed to setup 'file-descriptors' node to FDT: %s\n",
+ fdt_strerror(ret));
+ return -ENOSPC;
+}
+
+/**
+ * luo_do_files_prepare_calls - Calls prepare callbacks and updates FDT
+ * if all prepares succeed. Handles cancellation on failure.
+ *
+ * Phase 1: Calls 'prepare' for all files and stores results temporarily.
+ * If any 'prepare' fails, calls 'cancel' on previously prepared files
+ * and returns the error.
+ * Phase 2: If all 'prepare' calls succeeded, writes the stored data to the FDT.
+ * If any FDT write fails, calls 'cancel' on *all* prepared files and
+ * returns the FDT error.
+ *
+ * Returns: 0 on success. Negative errno on failure.
+ */
+int luo_do_files_prepare_calls(void)
+{
+ return 0;
+}
+
+/**
+ * luo_do_files_freeze_calls - Calls freeze callbacks and updates FDT
+ * if all calls succeed. Handles cancellation on failure.
+ *
+ * Phase 1: Calls 'freeze' for all files and stores results temporarily.
+ * If any 'freeze' fails, calls 'cancel' on previously called files.
+ * and returns the error.
+ * Phase 2: If all 'freeze' calls succeeded, writes the stored data to the FDT.
+ * If any FDT write fails, calls 'cancel' on *all* files and returns the FDT
+ * error.
+ *
+ * Returns: 0 on success. Negative errno on failure.
+ */
+int luo_do_files_freeze_calls(void)
+{
+ return 0;
+}
+
+/**
+ * luo_do_files_finish_calls - Calls finish callbacks for all file descriptors.
+ *
+ * This function is called at the end of live update cycle to do the final
+ * clean-up or housekeeping of the post-live update states.
+ */
+void luo_do_files_finish_calls(void)
+{
+ luo_files_recreate_luo_files_xa_in();
+}
+
+/**
+ * luo_do_files_cancel_calls - Calls cancel callbacks for all file descriptors.
+ *
+ * This function is typically called when the live update process needs to be
+ * aborted externally, for example, after the prepare phase may have run but
+ * before actual reboot. It iterates through all registered files and calls
+ * the 'cancel' callback for those that implement it and likely completed
+ * prepare.
+ */
+void luo_do_files_cancel_calls(void)
+{
+}
+
+/**
+ * luo_register_file - Register a file descriptor for live update management.
+ * @tokenp: Return argument for the token value.
+ * @file: Pointer to the struct file to be preserved.
+ *
+ * Context: Must be called when LUO is in 'normal' state.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int luo_register_file(u64 *tokenp, struct file *file)
+{
+ struct liveupdate_filesystem *fs;
+ bool found = false;
+ int ret = -ENOENT;
+ u64 token;
+
+ luo_state_read_enter();
+ if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+ pr_warn("File can be registered only in normal or prepared state\n");
+ luo_state_read_exit();
+ return -EBUSY;
+ }
+
+ down_read(&luo_filesystems_list_rwsem);
+ list_for_each_entry(fs, &luo_filesystems_list, list) {
+ if (fs->can_preserve(file, fs->arg)) {
+ found = true;
+ break;
+ }
+ }
+
+ if (found) {
+ struct luo_file *luo_file = kmalloc(sizeof(*luo_file),
+ GFP_KERNEL);
+
+ if (!luo_file) {
+ ret = -ENOMEM;
+ goto exit_unlock;
+ }
+
+ token = luo_next_file_token;
+ luo_next_file_token++;
+
+ luo_file->private_data = 0;
+ luo_file->reclaimed = false;
+
+ luo_file->file = file;
+ luo_file->fs = fs;
+ mutex_init(&luo_file->mutex);
+ luo_file->state = LIVEUPDATE_STATE_NORMAL;
+ ret = xa_err(xa_store(&luo_files_xa_out, token, luo_file,
+ GFP_KERNEL));
+ if (ret < 0) {
+ pr_warn("Failed to store file for token %llu in XArray: %d\n",
+ token, ret);
+ kfree(luo_file);
+ goto exit_unlock;
+ }
+ *tokenp = token;
+ }
+
+exit_unlock:
+ up_read(&luo_filesystems_list_rwsem);
+ luo_state_read_exit();
+
+ return ret;
+}
+
+/**
+ * luo_unregister_file - Unregister a file instance using its token.
+ * @token: The unique token of the file instance to unregister.
+ *
+ * Finds the &struct luo_file associated with the @token in the
+ * global list and removes it. This function *only* removes the entry from the
+ * list; it does *not* free the memory allocated for the &struct luo_file
+ * itself. The caller is responsible for freeing the structure after this
+ * function returns successfully.
+ *
+ * Context: Can be called when a preserved file descriptor is closed or
+ * no longer needs live update management. Uses down_write_killable
+ * for list modification.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int luo_unregister_file(u64 token)
+{
+ struct luo_file *luo_file;
+ int ret = 0;
+
+ luo_state_read_enter();
+ if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+ pr_warn("File can be unregistered only in normal or updates state\n");
+ luo_state_read_exit();
+ return -EBUSY;
+ }
+
+ luo_file = xa_erase(&luo_files_xa_out, token);
+ if (luo_file) {
+ kfree(luo_file);
+ } else {
+ pr_warn("Failed to unregister: token %llu not found.\n",
+ token);
+ ret = -ENOENT;
+ }
+ luo_state_read_exit();
+
+ return ret;
+}
+
+/**
+ * luo_retrieve_file - Find a registered file instance by its token.
+ * @token: The unique token of the file instance to retrieve.
+ * @file: Output parameter. On success (return value 0), this will point
+ * to the retrieved "struct file".
+ *
+ * Searches the global list for a &struct luo_file matching the @token. Uses a
+ * read lock, allowing concurrent retrievals.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int luo_retrieve_file(u64 token, struct file **file)
+{
+ struct luo_file *luo_file;
+ int ret = 0;
+
+ luo_files_recreate_luo_files_xa_in();
+ luo_state_read_enter();
+ if (!liveupdate_state_updated()) {
+ pr_warn("File can be retrieved only in updated state\n");
+ luo_state_read_exit();
+ return -EBUSY;
+ }
+
+ luo_file = xa_load(&luo_files_xa_in, token);
+ if (luo_file && !luo_file->reclaimed) {
+ luo_file->reclaimed = true;
+ ret = luo_file->fs->retrieve(luo_file->fs->arg,
+ luo_file->private_data,
+ file);
+ if (!ret)
+ luo_file->file = *file;
+ } else if (luo_file && luo_file->reclaimed) {
+ pr_err("The file descriptor for token %lld has already been retrieved\n",
+ token);
+ ret = -EINVAL;
+ } else {
+ ret = -ENOENT;
+ }
+
+ luo_state_read_exit();
+
+ return ret;
+}
+
+/**
+ * liveupdate_register_filesystem - Register a filesystem handler with LUO.
+ * @fs: Pointer to a caller-allocated &struct liveupdate_filesystem.
+ * The caller must initialize this structure, including a unique
+ * 'compatible' string and a valid 'fs' callbacks. This function adds the
+ * handler to the global list of supported filesystem handlers.
+ *
+ * Context: Typically called during module initialization for filesystems or
+ * file types that support live update preservation.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int liveupdate_register_filesystem(struct liveupdate_filesystem *fs)
+{
+ struct liveupdate_filesystem *fs_iter;
+ int ret = 0;
+
+ luo_state_read_enter();
+ if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+ luo_state_read_exit();
+ return -EBUSY;
+ }
+
+ down_write(&luo_filesystems_list_rwsem);
+ list_for_each_entry(fs_iter, &luo_filesystems_list, list) {
+ if (!strcmp(fs_iter->compatible, fs->compatible)) {
+ pr_err("Filesystem handler registration failed: Compatible string '%s' already registered.\n",
+ fs->compatible);
+ ret = -EEXIST;
+ goto exit_unlock;
+ }
+ }
+
+ INIT_LIST_HEAD(&fs->list);
+ list_add_tail(&fs->list, &luo_filesystems_list);
+
+exit_unlock:
+ up_write(&luo_filesystems_list_rwsem);
+ luo_state_read_exit();
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(liveupdate_register_filesystem);
+
+/**
+ * liveupdate_unregister_filesystem - Unregister a filesystem handler.
+ * @fs: Pointer to the specific &struct liveupdate_filesystem instance
+ * that was previously returned by or passed to liveupdate_register_filesystem.
+ *
+ * Removes the specified handler instance @fs from the global list of
+ * registered filesystem handlers. This function only removes the entry from the
+ * list; it does not free the memory associated with @fs itself. The caller
+ * is responsible for freeing the structure memory after this function returns
+ * successfully.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int liveupdate_unregister_filesystem(struct liveupdate_filesystem *fs)
+{
+ int ret = 0;
+
+ luo_state_read_enter();
+ if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+ luo_state_read_exit();
+ return -EBUSY;
+ }
+
+ down_write(&luo_filesystems_list_rwsem);
+ list_del_init(&fs->list);
+ up_write(&luo_filesystems_list_rwsem);
+ luo_state_read_exit();
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(liveupdate_unregister_filesystem);
diff --git a/drivers/misc/liveupdate/luo_internal.h b/drivers/misc/liveupdate/luo_internal.h
index 63a8b93254a6..b7a0f31ddc99 100644
--- a/drivers/misc/liveupdate/luo_internal.h
+++ b/drivers/misc/liveupdate/luo_internal.h
@@ -23,6 +23,17 @@ int luo_do_subsystems_freeze_calls(void);
void luo_do_subsystems_finish_calls(void);
void luo_do_subsystems_cancel_calls(void);
+void luo_files_startup(void *fdt);
+int luo_files_fdt_setup(void *fdt);
+int luo_do_files_prepare_calls(void);
+int luo_do_files_freeze_calls(void);
+void luo_do_files_finish_calls(void);
+void luo_do_files_cancel_calls(void);
+
+int luo_retrieve_file(u64 token, struct file **file);
+int luo_register_file(u64 *token, struct file *file);
+int luo_unregister_file(u64 token);
+
extern const char *const luo_state_str[];
/* Get the current state as a string */
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 7a130680b5f2..7afe0aac5ce4 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -86,6 +86,55 @@ enum liveupdate_state {
LIVEUPDATE_STATE_UPDATED = 3,
};
+/* Forward declaration needed if definition isn't included */
+struct file;
+
+/**
+ * struct liveupdate_filesystem - Represents a handler for a live-updatable
+ * filesystem/file type.
+ * @prepare: Optional. Saves state for a specific file instance (@file,
+ * @arg) before update, potentially returning value via @data.
+ * Returns 0 on success, negative errno on failure.
+ * @freeze: Optional. Performs final actions just before kernel
+ * transition, potentially reading/updating the handle via
+ * @data.
+ * Returns 0 on success, negative errno on failure.
+ * @cancel: Optional. Cleans up state/resources if update is aborted
+ * after prepare/freeze succeeded, using the @data handle (by
+ * value) from the successful prepare. Returns void.
+ * @finish: Optional. Performs final cleanup in the new kernel using the
+ * preserved @data handle (by value). Returns void.
+ * @retrieve: Retrieve the preserved file. Must be called before finish.
+ * @can_preserve: callback to determine if @file with associated context (@arg)
+ * can be preserved by this handler.
+ * Return bool (true if preservable, false otherwise).
+ * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
+ * that uniquely identifies the filesystem or file type this
+ * handler supports. This is matched against the compatible
+ * string associated with individual &struct liveupdate_file
+ * instances.
+ * @arg: An opaque pointer to implementation-specific context data
+ * associated with this filesystem handler registration.
+ * @list: used for linking this handler instance into a global list of
+ * registered filesystem handlers.
+ *
+ * Modules that want to support live update for specific file types should
+ * register an instance of this structure. LUO uses this registration to
+ * determine if a given file can be preserved and to find the appropriate
+ * operations to manage its state across the update.
+ */
+struct liveupdate_filesystem {
+ int (*prepare)(struct file *file, void *arg, u64 *data);
+ int (*freeze)(struct file *file, void *arg, u64 *data);
+ void (*cancel)(struct file *file, void *arg, u64 data);
+ void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
+ int (*retrieve)(void *arg, u64 data, struct file **file);
+ bool (*can_preserve)(struct file *file, void *arg);
+ const char *compatible;
+ void *arg;
+ struct list_head list;
+};
+
/**
* struct liveupdate_subsystem - Represents a subsystem participating in LUO
* @prepare: Optional. Called during LUO prepare phase. Should perform
@@ -142,6 +191,9 @@ int liveupdate_register_subsystem(struct liveupdate_subsystem *h);
int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h);
int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data);
+int liveupdate_register_filesystem(struct liveupdate_filesystem *h);
+int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h);
+
#else /* CONFIG_LIVEUPDATE */
static inline int liveupdate_reboot(void)
@@ -180,5 +232,15 @@ static inline int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h,
return -ENODATA;
}
+static inline int liveupdate_register_filesystem(struct liveupdate_filesystem *h)
+{
+ return 0;
+}
+
+static inline int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h)
+{
+ return 0;
+}
+
#endif /* CONFIG_LIVEUPDATE */
#endif /* _LINUX_LIVEUPDATE_H */
--
2.49.0.1101.gccaa498523-goog
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-15 18:23 ` [RFC v2 08/16] luo: luo_files: add infrastructure for FDs Pasha Tatashin
@ 2025-05-15 23:15 ` James Houghton
2025-05-23 18:09 ` Pasha Tatashin
2025-05-26 7:55 ` Mike Rapoport
2025-06-05 15:56 ` Pratyush Yadav
2 siblings, 1 reply; 11+ messages in thread
From: James Houghton @ 2025-05-15 23:15 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, changyuanl, rppt, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, zhangguopeng, linux, linux-kernel, linux-doc, linux-mm,
gregkh, tglx, mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes, ptyadav
On Thu, May 15, 2025 at 11:23 AM Pasha Tatashin
<pasha.tatashin@soleen.com> wrote:
> +/**
> + * luo_retrieve_file - Find a registered file instance by its token.
> + * @token: The unique token of the file instance to retrieve.
> + * @file: Output parameter. On success (return value 0), this will point
> + * to the retrieved "struct file".
> + *
> + * Searches the global list for a &struct luo_file matching the @token. Uses a
> + * read lock, allowing concurrent retrievals.
> + *
> + * Return: 0 on success. Negative errno on failure.
> + */
> +int luo_retrieve_file(u64 token, struct file **file)
> +{
> + struct luo_file *luo_file;
> + int ret = 0;
> +
> + luo_files_recreate_luo_files_xa_in();
> + luo_state_read_enter();
> + if (!liveupdate_state_updated()) {
> + pr_warn("File can be retrieved only in updated state\n");
> + luo_state_read_exit();
> + return -EBUSY;
> + }
> +
> + luo_file = xa_load(&luo_files_xa_in, token);
> + if (luo_file && !luo_file->reclaimed) {
> + luo_file->reclaimed = true;
I haven't been able to pay too much attention to the series yet, and I
know this was posted as an RFC, so pardon my nit-picking.
I think you need to have xchg here for this not to be racy, so something like:
`if (luo_file && !xchg(&luo_file->reclaimed, true))`
Or maybe you meant to avoid this race some other way; IIUC,
luo_state_read_enter() is not sufficient.
Thanks!
> + ret = luo_file->fs->retrieve(luo_file->fs->arg,
> + luo_file->private_data,
> + file);
> + if (!ret)
> + luo_file->file = *file;
> + } else if (luo_file && luo_file->reclaimed) {
> + pr_err("The file descriptor for token %lld has already been retrieved\n",
> + token);
> + ret = -EINVAL;
> + } else {
> + ret = -ENOENT;
> + }
> +
> + luo_state_read_exit();
> +
> + return ret;
> +}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-15 23:15 ` James Houghton
@ 2025-05-23 18:09 ` Pasha Tatashin
0 siblings, 0 replies; 11+ messages in thread
From: Pasha Tatashin @ 2025-05-23 18:09 UTC (permalink / raw)
To: James Houghton
Cc: pratyush, jasonmiu, graf, changyuanl, rppt, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, zhangguopeng, linux, linux-kernel, linux-doc, linux-mm,
gregkh, tglx, mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes, ptyadav
On Thu, May 15, 2025 at 7:16 PM James Houghton <jthoughton@google.com> wrote:
>
> On Thu, May 15, 2025 at 11:23 AM Pasha Tatashin
> <pasha.tatashin@soleen.com> wrote:
> > +/**
> > + * luo_retrieve_file - Find a registered file instance by its token.
> > + * @token: The unique token of the file instance to retrieve.
> > + * @file: Output parameter. On success (return value 0), this will point
> > + * to the retrieved "struct file".
> > + *
> > + * Searches the global list for a &struct luo_file matching the @token. Uses a
> > + * read lock, allowing concurrent retrievals.
> > + *
> > + * Return: 0 on success. Negative errno on failure.
> > + */
> > +int luo_retrieve_file(u64 token, struct file **file)
> > +{
> > + struct luo_file *luo_file;
> > + int ret = 0;
> > +
> > + luo_files_recreate_luo_files_xa_in();
> > + luo_state_read_enter();
> > + if (!liveupdate_state_updated()) {
> > + pr_warn("File can be retrieved only in updated state\n");
> > + luo_state_read_exit();
> > + return -EBUSY;
> > + }
> > +
> > + luo_file = xa_load(&luo_files_xa_in, token);
> > + if (luo_file && !luo_file->reclaimed) {
> > + luo_file->reclaimed = true;
>
> I haven't been able to pay too much attention to the series yet, and I
> know this was posted as an RFC, so pardon my nit-picking.
>
> I think you need to have xchg here for this not to be racy, so something like:
>
> `if (luo_file && !xchg(&luo_file->reclaimed, true))`
>
> Or maybe you meant to avoid this race some other way; IIUC,
> luo_state_read_enter() is not sufficient.
Thank you for catching this. This is a bug, I actually added a per fd
mutex lock to struct luo_file that is supposed to be used here. I am
going to address this in the next version.
Thanks,
Pasha
>
> Thanks!
>
> > + ret = luo_file->fs->retrieve(luo_file->fs->arg,
> > + luo_file->private_data,
> > + file);
> > + if (!ret)
> > + luo_file->file = *file;
> > + } else if (luo_file && luo_file->reclaimed) {
> > + pr_err("The file descriptor for token %lld has already been retrieved\n",
> > + token);
> > + ret = -EINVAL;
> > + } else {
> > + ret = -ENOENT;
> > + }
> > +
> > + luo_state_read_exit();
> > +
> > + return ret;
> > +}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-15 18:23 ` [RFC v2 08/16] luo: luo_files: add infrastructure for FDs Pasha Tatashin
2025-05-15 23:15 ` James Houghton
@ 2025-05-26 7:55 ` Mike Rapoport
2025-06-05 11:56 ` Pratyush Yadav
2025-06-08 13:13 ` Pasha Tatashin
2025-06-05 15:56 ` Pratyush Yadav
2 siblings, 2 replies; 11+ messages in thread
From: Mike Rapoport @ 2025-05-26 7:55 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, changyuanl, dmatlack, rientjes, corbet,
rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm,
tj, yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, zhangguopeng,
linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav
On Thu, May 15, 2025 at 06:23:12PM +0000, Pasha Tatashin wrote:
> Introduce the framework within LUO to support preserving specific types
> of file descriptors across a live update transition. This allows
> stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
> the new kernel.
>
> Note: The core logic for iterating through the luo_files_list and
> invoking the handler callbacks (prepare, freeze, cancel, finish)
> within luo_do_files_*_calls, as well as managing the u64 data
> persistence via the FDT for individual files, is currently implemented
> as stubs in this patch. This patch sets up the registration, FDT layout,
> and retrieval framework.
>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
> drivers/misc/liveupdate/Makefile | 1 +
> drivers/misc/liveupdate/luo_core.c | 19 +
> drivers/misc/liveupdate/luo_files.c | 563 +++++++++++++++++++++++++
> drivers/misc/liveupdate/luo_internal.h | 11 +
> include/linux/liveupdate.h | 62 +++
> 5 files changed, 656 insertions(+)
> create mode 100644 drivers/misc/liveupdate/luo_files.c
>
> diff --git a/drivers/misc/liveupdate/Makefile b/drivers/misc/liveupdate/Makefile
> index df1c9709ba4f..b4cdd162574f 100644
> --- a/drivers/misc/liveupdate/Makefile
> +++ b/drivers/misc/liveupdate/Makefile
> @@ -1,3 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-y += luo_core.o
> +obj-y += luo_files.o
> obj-y += luo_subsystems.o
> diff --git a/drivers/misc/liveupdate/luo_core.c b/drivers/misc/liveupdate/luo_core.c
> index 417e7f6bf36c..ab1d76221fe2 100644
> --- a/drivers/misc/liveupdate/luo_core.c
> +++ b/drivers/misc/liveupdate/luo_core.c
> @@ -110,6 +110,10 @@ static int luo_fdt_setup(struct kho_serialization *ser)
> if (ret)
> goto exit_free;
>
> + ret = luo_files_fdt_setup(fdt_out);
> + if (ret)
> + goto exit_free;
> +
> ret = luo_subsystems_fdt_setup(fdt_out);
> if (ret)
> goto exit_free;
The duplication of files and subsystems does not look nice here and below.
Can't we make files to be a subsystem?
> @@ -145,7 +149,13 @@ static int luo_do_prepare_calls(void)
> {
> int ret;
>
> + ret = luo_do_files_prepare_calls();
> + if (ret)
> + return ret;
> +
> ret = luo_do_subsystems_prepare_calls();
> + if (ret)
> + luo_do_files_cancel_calls();
>
> return ret;
> }
> @@ -154,18 +164,26 @@ static int luo_do_freeze_calls(void)
> {
> int ret;
>
> + ret = luo_do_files_freeze_calls();
> + if (ret)
> + return ret;
> +
> ret = luo_do_subsystems_freeze_calls();
> + if (ret)
> + luo_do_files_cancel_calls();
>
> return ret;
> }
>
> static void luo_do_finish_calls(void)
> {
> + luo_do_files_finish_calls();
> luo_do_subsystems_finish_calls();
> }
>
> static void luo_do_cancel_calls(void)
> {
> + luo_do_files_cancel_calls();
> luo_do_subsystems_cancel_calls();
> }
>
> @@ -436,6 +454,7 @@ static int __init luo_startup(void)
> }
>
> __luo_set_state(LIVEUPDATE_STATE_UPDATED);
> + luo_files_startup(luo_fdt_in);
> luo_subsystems_startup(luo_fdt_in);
>
> return 0;
> diff --git a/drivers/misc/liveupdate/luo_files.c b/drivers/misc/liveupdate/luo_files.c
> new file mode 100644
> index 000000000000..953fc40db3d7
> --- /dev/null
> +++ b/drivers/misc/liveupdate/luo_files.c
> @@ -0,0 +1,563 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright (c) 2025, Google LLC.
> + * Pasha Tatashin <pasha.tatashin@soleen.com>
> + */
> +
> +/**
> + * DOC: LUO file descriptors
> + *
> + * LUO provides the infrastructure necessary to preserve
> + * specific types of stateful file descriptors across a kernel live
> + * update transition. The primary goal is to allow workloads, such as virtual
> + * machines using vfio, memfd, or iommufd to retain access to their essential
> + * resources without interruption after the underlying kernel is updated.
> + *
> + * The framework operates based on handler registration and instance tracking:
> + *
> + * 1. Handler Registration: Kernel modules responsible for specific file
> + * types (e.g., memfd, vfio) register a &struct liveupdate_filesystem
> + * handler. This handler contains callbacks (&liveupdate_filesystem.prepare,
> + * &liveupdate_filesystem.freeze, &liveupdate_filesystem.finish, etc.)
> + * and a unique 'compatible' string identifying the file type.
> + * Registration occurs via liveupdate_register_filesystem().
I wouldn't use filesystem here, as the obvious users are not really
filesystems. Maybe liveupdate_register_file_ops?
> + *
> + * 2. File Instance Tracking: When a potentially preservable file needs to be
> + * managed for live update, the core LUO logic (luo_register_file()) finds a
> + * compatible registered handler using its &liveupdate_filesystem.can_preserve
> + * callback. If found, an internal &struct luo_file instance is created,
> + * assigned a unique u64 'token', and added to a list.
> + *
> + * 3. State Persistence (FDT): During the LUO prepare/freeze phases, the
> + * registered handler callbacks are invoked for each tracked file instance.
> + * These callbacks can generate a u64 data payload representing the minimal
> + * state needed for restoration. This payload, along with the handler's
> + * compatible string and the unique token, is stored in a dedicated
> + * '/file-descriptors' node within the main LUO FDT blob passed via
> + * Kexec Handover (KHO).
> + *
> + * 4. Restoration: In the new kernel, the LUO framework parses the incoming
> + * FDT to reconstruct the list of &struct luo_file instances. When the
> + * original owner requests the file, luo_retrieve_file() uses the corresponding
> + * handler's &liveupdate_filesystem.retrieve callback, passing the persisted
> + * u64 data, to recreate or find the appropriate &struct file object.
> + */
The DOC is mostly about what luo_files does, we'd also need a description
of it's intended use, both internally in the kernel and by the userspace.
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
...
> +/**
> + * luo_register_file - Register a file descriptor for live update management.
> + * @tokenp: Return argument for the token value.
> + * @file: Pointer to the struct file to be preserved.
> + *
> + * Context: Must be called when LUO is in 'normal' state.
> + *
> + * Return: 0 on success. Negative errno on failure.
> + */
> +int luo_register_file(u64 *tokenp, struct file *file)
> +{
> + struct liveupdate_filesystem *fs;
> + bool found = false;
> + int ret = -ENOENT;
> + u64 token;
> +
> + luo_state_read_enter();
> + if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
> + pr_warn("File can be registered only in normal or prepared state\n");
> + luo_state_read_exit();
> + return -EBUSY;
> + }
> +
> + down_read(&luo_filesystems_list_rwsem);
> + list_for_each_entry(fs, &luo_filesystems_list, list) {
> + if (fs->can_preserve(file, fs->arg)) {
> + found = true;
> + break;
> + }
> + }
> +
> + if (found) {
if (!found)
goto exit_unlock;
> + struct luo_file *luo_file = kmalloc(sizeof(*luo_file),
> + GFP_KERNEL);
> +
> + if (!luo_file) {
> + ret = -ENOMEM;
> + goto exit_unlock;
> + }
> +
> + token = luo_next_file_token;
> + luo_next_file_token++;
> +
> + luo_file->private_data = 0;
> + luo_file->reclaimed = false;
> +
> + luo_file->file = file;
> + luo_file->fs = fs;
> + mutex_init(&luo_file->mutex);
> + luo_file->state = LIVEUPDATE_STATE_NORMAL;
> + ret = xa_err(xa_store(&luo_files_xa_out, token, luo_file,
> + GFP_KERNEL));
> + if (ret < 0) {
> + pr_warn("Failed to store file for token %llu in XArray: %d\n",
> + token, ret);
> + kfree(luo_file);
> + goto exit_unlock;
> + }
> + *tokenp = token;
> + }
> +
> +exit_unlock:
> + up_read(&luo_filesystems_list_rwsem);
> + luo_state_read_exit();
> +
> + return ret;
> +}
> +
> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> index 7a130680b5f2..7afe0aac5ce4 100644
> --- a/include/linux/liveupdate.h
> +++ b/include/linux/liveupdate.h
> @@ -86,6 +86,55 @@ enum liveupdate_state {
> LIVEUPDATE_STATE_UPDATED = 3,
> };
>
> +/* Forward declaration needed if definition isn't included */
> +struct file;
> +
> +/**
> + * struct liveupdate_filesystem - Represents a handler for a live-updatable
> + * filesystem/file type.
> + * @prepare: Optional. Saves state for a specific file instance (@file,
> + * @arg) before update, potentially returning value via @data.
> + * Returns 0 on success, negative errno on failure.
> + * @freeze: Optional. Performs final actions just before kernel
> + * transition, potentially reading/updating the handle via
> + * @data.
> + * Returns 0 on success, negative errno on failure.
> + * @cancel: Optional. Cleans up state/resources if update is aborted
> + * after prepare/freeze succeeded, using the @data handle (by
> + * value) from the successful prepare. Returns void.
> + * @finish: Optional. Performs final cleanup in the new kernel using the
> + * preserved @data handle (by value). Returns void.
> + * @retrieve: Retrieve the preserved file. Must be called before finish.
> + * @can_preserve: callback to determine if @file with associated context (@arg)
> + * can be preserved by this handler.
> + * Return bool (true if preservable, false otherwise).
> + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
> + * that uniquely identifies the filesystem or file type this
> + * handler supports. This is matched against the compatible
> + * string associated with individual &struct liveupdate_file
> + * instances.
> + * @arg: An opaque pointer to implementation-specific context data
> + * associated with this filesystem handler registration.
> + * @list: used for linking this handler instance into a global list of
> + * registered filesystem handlers.
> + *
> + * Modules that want to support live update for specific file types should
> + * register an instance of this structure. LUO uses this registration to
> + * determine if a given file can be preserved and to find the appropriate
> + * operations to manage its state across the update.
> + */
> +struct liveupdate_filesystem {
> + int (*prepare)(struct file *file, void *arg, u64 *data);
> + int (*freeze)(struct file *file, void *arg, u64 *data);
> + void (*cancel)(struct file *file, void *arg, u64 data);
> + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> + int (*retrieve)(void *arg, u64 data, struct file **file);
> + bool (*can_preserve)(struct file *file, void *arg);
> + const char *compatible;
> + void *arg;
> + struct list_head list;
> +};
> +
Like with subsystems, I'd split ops and make the data part private to
luo_files.c
> /**
> * struct liveupdate_subsystem - Represents a subsystem participating in LUO
> * @prepare: Optional. Called during LUO prepare phase. Should perform
> @@ -142,6 +191,9 @@ int liveupdate_register_subsystem(struct liveupdate_subsystem *h);
> int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h);
> int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data);
>
> +int liveupdate_register_filesystem(struct liveupdate_filesystem *h);
> +int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h);
int liveupdate_register_file_ops(name, ops, data, ret_token) ?
> +
> #else /* CONFIG_LIVEUPDATE */
>
> static inline int liveupdate_reboot(void)
> @@ -180,5 +232,15 @@ static inline int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h,
> return -ENODATA;
> }
>
> +static inline int liveupdate_register_filesystem(struct liveupdate_filesystem *h)
> +{
> + return 0;
> +}
> +
> +static inline int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h)
> +{
> + return 0;
> +}
> +
> #endif /* CONFIG_LIVEUPDATE */
> #endif /* _LINUX_LIVEUPDATE_H */
> --
> 2.49.0.1101.gccaa498523-goog
>
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-26 7:55 ` Mike Rapoport
@ 2025-06-05 11:56 ` Pratyush Yadav
2025-06-08 13:13 ` Pasha Tatashin
1 sibling, 0 replies; 11+ messages in thread
From: Pratyush Yadav @ 2025-06-05 11:56 UTC (permalink / raw)
To: Mike Rapoport
Cc: Pasha Tatashin, pratyush, jasonmiu, graf, changyuanl, dmatlack,
rientjes, corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, zhangguopeng, linux, linux-kernel, linux-doc, linux-mm,
gregkh, tglx, mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes
On Mon, May 26 2025, Mike Rapoport wrote:
> On Thu, May 15, 2025 at 06:23:12PM +0000, Pasha Tatashin wrote:
>> Introduce the framework within LUO to support preserving specific types
>> of file descriptors across a live update transition. This allows
>> stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
>> the new kernel.
>>
>> Note: The core logic for iterating through the luo_files_list and
>> invoking the handler callbacks (prepare, freeze, cancel, finish)
>> within luo_do_files_*_calls, as well as managing the u64 data
>> persistence via the FDT for individual files, is currently implemented
>> as stubs in this patch. This patch sets up the registration, FDT layout,
>> and retrieval framework.
>>
>> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
>> ---
[...]
>> diff --git a/drivers/misc/liveupdate/luo_core.c b/drivers/misc/liveupdate/luo_core.c
>> index 417e7f6bf36c..ab1d76221fe2 100644
>> --- a/drivers/misc/liveupdate/luo_core.c
>> +++ b/drivers/misc/liveupdate/luo_core.c
>> @@ -110,6 +110,10 @@ static int luo_fdt_setup(struct kho_serialization *ser)
>> if (ret)
>> goto exit_free;
>>
>> + ret = luo_files_fdt_setup(fdt_out);
>> + if (ret)
>> + goto exit_free;
>> +
>> ret = luo_subsystems_fdt_setup(fdt_out);
>> if (ret)
>> goto exit_free;
>
> The duplication of files and subsystems does not look nice here and below.
> Can't we make files to be a subsystem?
+1
It would also give subsystems a user.
[...]
>> diff --git a/drivers/misc/liveupdate/luo_files.c b/drivers/misc/liveupdate/luo_files.c
>> new file mode 100644
>> index 000000000000..953fc40db3d7
>> --- /dev/null
>> +++ b/drivers/misc/liveupdate/luo_files.c
>> @@ -0,0 +1,563 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +/*
>> + * Copyright (c) 2025, Google LLC.
>> + * Pasha Tatashin <pasha.tatashin@soleen.com>
>> + */
>> +
>> +/**
>> + * DOC: LUO file descriptors
>> + *
>> + * LUO provides the infrastructure necessary to preserve
>> + * specific types of stateful file descriptors across a kernel live
>> + * update transition. The primary goal is to allow workloads, such as virtual
>> + * machines using vfio, memfd, or iommufd to retain access to their essential
>> + * resources without interruption after the underlying kernel is updated.
>> + *
>> + * The framework operates based on handler registration and instance tracking:
>> + *
>> + * 1. Handler Registration: Kernel modules responsible for specific file
>> + * types (e.g., memfd, vfio) register a &struct liveupdate_filesystem
>> + * handler. This handler contains callbacks (&liveupdate_filesystem.prepare,
>> + * &liveupdate_filesystem.freeze, &liveupdate_filesystem.finish, etc.)
>> + * and a unique 'compatible' string identifying the file type.
>> + * Registration occurs via liveupdate_register_filesystem().
>
> I wouldn't use filesystem here, as the obvious users are not really
> filesystems. Maybe liveupdate_register_file_ops?
+1
[...]
--
Regards,
Pratyush Yadav
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-15 18:23 ` [RFC v2 08/16] luo: luo_files: add infrastructure for FDs Pasha Tatashin
2025-05-15 23:15 ` James Houghton
2025-05-26 7:55 ` Mike Rapoport
@ 2025-06-05 15:56 ` Pratyush Yadav
2025-06-08 13:37 ` Pasha Tatashin
2 siblings, 1 reply; 11+ messages in thread
From: Pratyush Yadav @ 2025-06-05 15:56 UTC (permalink / raw)
To: Pasha Tatashin
Cc: pratyush, jasonmiu, graf, changyuanl, rppt, dmatlack, rientjes,
corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, zhangguopeng, linux, linux-kernel, linux-doc, linux-mm,
gregkh, tglx, mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes
On Thu, May 15 2025, Pasha Tatashin wrote:
> Introduce the framework within LUO to support preserving specific types
> of file descriptors across a live update transition. This allows
> stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
> the new kernel.
>
> Note: The core logic for iterating through the luo_files_list and
> invoking the handler callbacks (prepare, freeze, cancel, finish)
> within luo_do_files_*_calls, as well as managing the u64 data
> persistence via the FDT for individual files, is currently implemented
> as stubs in this patch. This patch sets up the registration, FDT layout,
> and retrieval framework.
>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
> drivers/misc/liveupdate/Makefile | 1 +
> drivers/misc/liveupdate/luo_core.c | 19 +
> drivers/misc/liveupdate/luo_files.c | 563 +++++++++++++++++++++++++
> drivers/misc/liveupdate/luo_internal.h | 11 +
> include/linux/liveupdate.h | 62 +++
> 5 files changed, 656 insertions(+)
> create mode 100644 drivers/misc/liveupdate/luo_files.c
>
> diff --git a/drivers/misc/liveupdate/Makefile b/drivers/misc/liveupdate/Makefile
> index df1c9709ba4f..b4cdd162574f 100644
> --- a/drivers/misc/liveupdate/Makefile
> +++ b/drivers/misc/liveupdate/Makefile
> @@ -1,3 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-y += luo_core.o
> +obj-y += luo_files.o
> obj-y += luo_subsystems.o
[...]
> diff --git a/drivers/misc/liveupdate/luo_files.c b/drivers/misc/liveupdate/luo_files.c
> new file mode 100644
> index 000000000000..953fc40db3d7
> --- /dev/null
> +++ b/drivers/misc/liveupdate/luo_files.c
> @@ -0,0 +1,563 @@
[...]
> +struct luo_file {
> + struct liveupdate_filesystem *fs;
> + struct file *file;
> + u64 private_data;
> + bool reclaimed;
> + enum liveupdate_state state;
> + struct mutex mutex;
> +};
> +
> +/**
> + * luo_files_startup - Validates the LUO file-descriptors FDT node at startup.
> + * @fdt: Pointer to the LUO FDT blob passed from the previous kernel.
> + *
> + * This __init function checks the existence and validity of the
> + * '/file-descriptors' node in the FDT. This node is considered mandatory. It
Why is it mandatory? Can't a user just preserve some subsystems, and no
FDs?
> + * calls panic() if the node is missing, inaccessible, or invalid (e.g., missing
> + * compatible, wrong compatible string), indicating a critical configuration
> + * error for LUO.
> + */
> +void __init luo_files_startup(void *fdt)
> +{
> + int ret, node_offset;
> +
> + node_offset = fdt_subnode_offset(fdt, 0, LUO_FILES_NODE_NAME);
> + if (node_offset < 0)
> + panic("Failed to find /file-descriptors node\n");
> +
> + ret = fdt_node_check_compatible(fdt, node_offset,
> + LUO_FILES_COMPATIBLE);
> + if (ret) {
> + panic("FDT '%s' is incompatible with '%s' [%d]\n",
> + LUO_FILES_NODE_NAME, LUO_FILES_COMPATIBLE, ret);
> + }
> + luo_fdt_in = fdt;
> +}
> +
> +static void luo_files_recreate_luo_files_xa_in(void)
> +{
> + int parent_node_offset, file_node_offset;
> + const char *node_name, *fdt_compat_str;
> + struct liveupdate_filesystem *fs;
> + struct luo_file *luo_file;
> + const void *data_ptr;
> + int ret = 0;
> +
> + if (luo_files_xa_in_recreated || !luo_fdt_in)
> + return;
> +
> + /* Take write in order to gurantee that we re-create list once */
Typo: s/gurantee/guarantee
> + down_write(&luo_filesystems_list_rwsem);
> + if (luo_files_xa_in_recreated)
> + goto exit_unlock;
> +
> + parent_node_offset = fdt_subnode_offset(luo_fdt_in, 0,
> + LUO_FILES_NODE_NAME);
> +
> + fdt_for_each_subnode(file_node_offset, luo_fdt_in, parent_node_offset) {
> + bool handler_found = false;
> + u64 token;
> +
> + node_name = fdt_get_name(luo_fdt_in, file_node_offset, NULL);
> + if (!node_name) {
> + panic("Skipping FDT subnode at offset %d: Cannot get name\n",
> + file_node_offset);
Should failure to parse a specific FD really be a panic? Wouldn't it be
better to continue and let userspace decide if it can live with the FD
missing?
> + }
> +
> + ret = kstrtou64(node_name, 0, &token);
> + if (ret < 0) {
> + panic("Skipping FDT node '%s': Failed to parse token\n",
> + node_name);
> + }
> +
> + fdt_compat_str = fdt_getprop(luo_fdt_in, file_node_offset,
> + "compatible", NULL);
> + if (!fdt_compat_str) {
> + panic("Skipping FDT node '%s': Missing 'compatible' property\n",
> + node_name);
> + }
> +
> + data_ptr = fdt_getprop(luo_fdt_in, file_node_offset, "data",
> + NULL);
> + if (!data_ptr) {
> + panic("Can't recover property 'data' for FDT node '%s'\n",
> + node_name);
> + }
> +
> + list_for_each_entry(fs, &luo_filesystems_list, list) {
> + if (!strcmp(fs->compatible, fdt_compat_str)) {
> + handler_found = true;
> + break;
> + }
> + }
> +
> + if (!handler_found) {
> + panic("Skipping FDT node '%s': No registered handler for compatible '%s'\n",
> + node_name, fdt_compat_str);
Thinking out loud here: this means that by the time of first retrieval,
all file systems must be registered. Since this is called from
luo_do_files_finish_calls() or luo_retrieve_file(), it will come from
userspace, so all built in modules would be initialized by then. But
some loadable module might not be. I don't see much of a use case for
loadable modules to participate in LUO, so I don't think it should be a
problem.
> + }
> +
> + luo_file = kmalloc(sizeof(*luo_file),
> + GFP_KERNEL | __GFP_NOFAIL);
> + luo_file->fs = fs;
> + luo_file->file = NULL;
> + memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
Why not make sure data_ptr is exactly sizeof(u64) when we parse it, and
then simply do luo_file->private_data = (u64)*data_ptr ?
Because if the previous kernel wrote more than a u64 in data, then
something is broken and we should catch that error anyway.
> + luo_file->reclaimed = false;
> + mutex_init(&luo_file->mutex);
> + luo_file->state = LIVEUPDATE_STATE_UPDATED;
> + ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
> + GFP_KERNEL | __GFP_NOFAIL));
Should you also check if something is already at token's slot, in case
previous kernel generated wrong tokens or FDT is broken?
> + if (ret < 0) {
> + panic("Failed to store luo_file for token %llu in XArray: %d\n",
> + token, ret);
> + }
> + }
> + luo_files_xa_in_recreated = true;
> +
> +exit_unlock:
> + up_write(&luo_filesystems_list_rwsem);
> +}
> +
[...]
> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> index 7a130680b5f2..7afe0aac5ce4 100644
> --- a/include/linux/liveupdate.h
> +++ b/include/linux/liveupdate.h
> @@ -86,6 +86,55 @@ enum liveupdate_state {
> LIVEUPDATE_STATE_UPDATED = 3,
> };
>
> +/* Forward declaration needed if definition isn't included */
> +struct file;
> +
> +/**
> + * struct liveupdate_filesystem - Represents a handler for a live-updatable
> + * filesystem/file type.
> + * @prepare: Optional. Saves state for a specific file instance (@file,
> + * @arg) before update, potentially returning value via @data.
> + * Returns 0 on success, negative errno on failure.
> + * @freeze: Optional. Performs final actions just before kernel
> + * transition, potentially reading/updating the handle via
> + * @data.
> + * Returns 0 on success, negative errno on failure.
> + * @cancel: Optional. Cleans up state/resources if update is aborted
> + * after prepare/freeze succeeded, using the @data handle (by
> + * value) from the successful prepare. Returns void.
> + * @finish: Optional. Performs final cleanup in the new kernel using the
> + * preserved @data handle (by value). Returns void.
> + * @retrieve: Retrieve the preserved file. Must be called before finish.
> + * @can_preserve: callback to determine if @file with associated context (@arg)
> + * can be preserved by this handler.
> + * Return bool (true if preservable, false otherwise).
> + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
> + * that uniquely identifies the filesystem or file type this
> + * handler supports. This is matched against the compatible
> + * string associated with individual &struct liveupdate_file
> + * instances.
> + * @arg: An opaque pointer to implementation-specific context data
> + * associated with this filesystem handler registration.
> + * @list: used for linking this handler instance into a global list of
> + * registered filesystem handlers.
> + *
> + * Modules that want to support live update for specific file types should
> + * register an instance of this structure. LUO uses this registration to
> + * determine if a given file can be preserved and to find the appropriate
> + * operations to manage its state across the update.
> + */
> +struct liveupdate_filesystem {
> + int (*prepare)(struct file *file, void *arg, u64 *data);
> + int (*freeze)(struct file *file, void *arg, u64 *data);
> + void (*cancel)(struct file *file, void *arg, u64 data);
> + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> + int (*retrieve)(void *arg, u64 data, struct file **file);
> + bool (*can_preserve)(struct file *file, void *arg);
> + const char *compatible;
> + void *arg;
What is the use for this arg? I would expect one file type/system to
register one set of handlers. So they can keep their arg in a global in
their code. I don't see why a per-filesystem arg is needed.
What I do think is needed is a per-file arg. Each callback gets 'data',
which is the serialized data, but there is no place to store runtime
state, like some flags or serialization metadata. Sure, you could make
place for it somewhere in the inode, but I think it would be a lot
cleaner to be able to store it in struct luo_file.
So perhaps rename private_data in struct luo_file to say
serialized_data, and have a field called "private" that filesystems can
use for their runtime state?
Same suggestion for subsystems as well.
> + struct list_head list;
> +};
> +
> /**
> * struct liveupdate_subsystem - Represents a subsystem participating in LUO
> * @prepare: Optional. Called during LUO prepare phase. Should perform
[...]
--
Regards,
Pratyush Yadav
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
[not found] <CA+Sjx0XP2A+rsA95NVwX7czQGNREXcFT=psedA7fawwKJb5rkw@mail.gmail.com>
@ 2025-06-08 0:07 ` Pasha Tatashin
0 siblings, 0 replies; 11+ messages in thread
From: Pasha Tatashin @ 2025-06-08 0:07 UTC (permalink / raw)
To: Anish Moorthy
Cc: Jonathan.Cameron, akpm, aleksander.lobakin, aliceryhl,
andriy.shevchenko, anna.schumaker, axboe, bartosz.golaszewski,
bhelgaas, bp, changyuanl, chenridong, corbet, cw00.choi, dakr,
dan.j.williams, dave.hansen, david, djeffery, dmatlack, graf,
gregkh, hannes, hpa, ilpo.jarvinen, ira.weiny, jannh, jasonmiu,
joel.granados, kanie, leon, linux-doc, linux-kernel, linux-mm,
linux, lukas, mark.rutland, masahiroy, mingo, mmaurer,
myungjoo.ham, ojeda, pratyush, ptyadav, quic_zijuhu, rafael,
rdunlap, rientjes, roman.gushchin, rostedt, rppt, song,
stuart.w.hayes, tglx, tj, vincent.guittot, wagi, x86,
yesanishhere, yoann.congal, zhangguopeng
On Fri, Jun 6, 2025 at 6:28 PM Anish Moorthy <anish.moorthy@gmail.com> wrote:
>
> > + token = luo_next_file_token;
> > + luo_next_file_token++;
>
> This seems like it should be an atomic fetch_add: I'm only seeing read locks up till this point
>
> (Sorry if this is too nitpicky. Also for any formatting issues, I'm on mobile atm)
Thank you, this was also found by other reviewers. I have updated this
to use atomic.
Pasha
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-05-26 7:55 ` Mike Rapoport
2025-06-05 11:56 ` Pratyush Yadav
@ 2025-06-08 13:13 ` Pasha Tatashin
1 sibling, 0 replies; 11+ messages in thread
From: Pasha Tatashin @ 2025-06-08 13:13 UTC (permalink / raw)
To: Mike Rapoport
Cc: pratyush, jasonmiu, graf, changyuanl, dmatlack, rientjes, corbet,
rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm,
tj, yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, zhangguopeng,
linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes, ptyadav
On Mon, May 26, 2025 at 3:55 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Thu, May 15, 2025 at 06:23:12PM +0000, Pasha Tatashin wrote:
> > Introduce the framework within LUO to support preserving specific types
> > of file descriptors across a live update transition. This allows
> > stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
> > the new kernel.
> >
> > Note: The core logic for iterating through the luo_files_list and
> > invoking the handler callbacks (prepare, freeze, cancel, finish)
> > within luo_do_files_*_calls, as well as managing the u64 data
> > persistence via the FDT for individual files, is currently implemented
> > as stubs in this patch. This patch sets up the registration, FDT layout,
> > and retrieval framework.
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > ---
> > drivers/misc/liveupdate/Makefile | 1 +
> > drivers/misc/liveupdate/luo_core.c | 19 +
> > drivers/misc/liveupdate/luo_files.c | 563 +++++++++++++++++++++++++
> > drivers/misc/liveupdate/luo_internal.h | 11 +
> > include/linux/liveupdate.h | 62 +++
> > 5 files changed, 656 insertions(+)
> > create mode 100644 drivers/misc/liveupdate/luo_files.c
> >
> > diff --git a/drivers/misc/liveupdate/Makefile b/drivers/misc/liveupdate/Makefile
> > index df1c9709ba4f..b4cdd162574f 100644
> > --- a/drivers/misc/liveupdate/Makefile
> > +++ b/drivers/misc/liveupdate/Makefile
> > @@ -1,3 +1,4 @@
> > # SPDX-License-Identifier: GPL-2.0
> > obj-y += luo_core.o
> > +obj-y += luo_files.o
> > obj-y += luo_subsystems.o
> > diff --git a/drivers/misc/liveupdate/luo_core.c b/drivers/misc/liveupdate/luo_core.c
> > index 417e7f6bf36c..ab1d76221fe2 100644
> > --- a/drivers/misc/liveupdate/luo_core.c
> > +++ b/drivers/misc/liveupdate/luo_core.c
> > @@ -110,6 +110,10 @@ static int luo_fdt_setup(struct kho_serialization *ser)
> > if (ret)
> > goto exit_free;
> >
> > + ret = luo_files_fdt_setup(fdt_out);
> > + if (ret)
> > + goto exit_free;
> > +
> > ret = luo_subsystems_fdt_setup(fdt_out);
> > if (ret)
> > goto exit_free;
>
> The duplication of files and subsystems does not look nice here and below.
> Can't we make files to be a subsystem?
Good idea, let me work on this.
>
> > @@ -145,7 +149,13 @@ static int luo_do_prepare_calls(void)
> > {
> > int ret;
> >
> > + ret = luo_do_files_prepare_calls();
> > + if (ret)
> > + return ret;
> > +
> > ret = luo_do_subsystems_prepare_calls();
> > + if (ret)
> > + luo_do_files_cancel_calls();
> >
> > return ret;
> > }
> > @@ -154,18 +164,26 @@ static int luo_do_freeze_calls(void)
> > {
> > int ret;
> >
> > + ret = luo_do_files_freeze_calls();
> > + if (ret)
> > + return ret;
> > +
> > ret = luo_do_subsystems_freeze_calls();
> > + if (ret)
> > + luo_do_files_cancel_calls();
> >
> > return ret;
> > }
> >
> > static void luo_do_finish_calls(void)
> > {
> > + luo_do_files_finish_calls();
> > luo_do_subsystems_finish_calls();
> > }
> >
> > static void luo_do_cancel_calls(void)
> > {
> > + luo_do_files_cancel_calls();
> > luo_do_subsystems_cancel_calls();
> > }
> >
> > @@ -436,6 +454,7 @@ static int __init luo_startup(void)
> > }
> >
> > __luo_set_state(LIVEUPDATE_STATE_UPDATED);
> > + luo_files_startup(luo_fdt_in);
> > luo_subsystems_startup(luo_fdt_in);
> >
> > return 0;
> > diff --git a/drivers/misc/liveupdate/luo_files.c b/drivers/misc/liveupdate/luo_files.c
> > new file mode 100644
> > index 000000000000..953fc40db3d7
> > --- /dev/null
> > +++ b/drivers/misc/liveupdate/luo_files.c
> > @@ -0,0 +1,563 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +/*
> > + * Copyright (c) 2025, Google LLC.
> > + * Pasha Tatashin <pasha.tatashin@soleen.com>
> > + */
> > +
> > +/**
> > + * DOC: LUO file descriptors
> > + *
> > + * LUO provides the infrastructure necessary to preserve
> > + * specific types of stateful file descriptors across a kernel live
> > + * update transition. The primary goal is to allow workloads, such as virtual
> > + * machines using vfio, memfd, or iommufd to retain access to their essential
> > + * resources without interruption after the underlying kernel is updated.
> > + *
> > + * The framework operates based on handler registration and instance tracking:
> > + *
> > + * 1. Handler Registration: Kernel modules responsible for specific file
> > + * types (e.g., memfd, vfio) register a &struct liveupdate_filesystem
> > + * handler. This handler contains callbacks (&liveupdate_filesystem.prepare,
> > + * &liveupdate_filesystem.freeze, &liveupdate_filesystem.finish, etc.)
> > + * and a unique 'compatible' string identifying the file type.
> > + * Registration occurs via liveupdate_register_filesystem().
>
> I wouldn't use filesystem here, as the obvious users are not really
> filesystems. Maybe liveupdate_register_file_ops?
This corresponds to the way these structs are called in linux, so I
think the name is OK.
>
> > + *
> > + * 2. File Instance Tracking: When a potentially preservable file needs to be
> > + * managed for live update, the core LUO logic (luo_register_file()) finds a
> > + * compatible registered handler using its &liveupdate_filesystem.can_preserve
> > + * callback. If found, an internal &struct luo_file instance is created,
> > + * assigned a unique u64 'token', and added to a list.
> > + *
> > + * 3. State Persistence (FDT): During the LUO prepare/freeze phases, the
> > + * registered handler callbacks are invoked for each tracked file instance.
> > + * These callbacks can generate a u64 data payload representing the minimal
> > + * state needed for restoration. This payload, along with the handler's
> > + * compatible string and the unique token, is stored in a dedicated
> > + * '/file-descriptors' node within the main LUO FDT blob passed via
> > + * Kexec Handover (KHO).
> > + *
> > + * 4. Restoration: In the new kernel, the LUO framework parses the incoming
> > + * FDT to reconstruct the list of &struct luo_file instances. When the
> > + * original owner requests the file, luo_retrieve_file() uses the corresponding
> > + * handler's &liveupdate_filesystem.retrieve callback, passing the persisted
> > + * u64 data, to recreate or find the appropriate &struct file object.
> > + */
>
> The DOC is mostly about what luo_files does, we'd also need a description
> of it's intended use, both internally in the kernel and by the userspace.
>
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
>
> ...
>
> > +/**
> > + * luo_register_file - Register a file descriptor for live update management.
> > + * @tokenp: Return argument for the token value.
> > + * @file: Pointer to the struct file to be preserved.
> > + *
> > + * Context: Must be called when LUO is in 'normal' state.
> > + *
> > + * Return: 0 on success. Negative errno on failure.
> > + */
> > +int luo_register_file(u64 *tokenp, struct file *file)
> > +{
> > + struct liveupdate_filesystem *fs;
> > + bool found = false;
> > + int ret = -ENOENT;
> > + u64 token;
> > +
> > + luo_state_read_enter();
> > + if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
> > + pr_warn("File can be registered only in normal or prepared state\n");
> > + luo_state_read_exit();
> > + return -EBUSY;
> > + }
> > +
> > + down_read(&luo_filesystems_list_rwsem);
> > + list_for_each_entry(fs, &luo_filesystems_list, list) {
> > + if (fs->can_preserve(file, fs->arg)) {
> > + found = true;
> > + break;
> > + }
> > + }
> > +
> > + if (found) {
>
> if (!found)
> goto exit_unlock;
Done, thank you.
> > + * struct liveupdate_filesystem - Represents a handler for a live-updatable
> > + * filesystem/file type.
> > + * @prepare: Optional. Saves state for a specific file instance (@file,
> > + * @arg) before update, potentially returning value via @data.
> > + * Returns 0 on success, negative errno on failure.
> > + * @freeze: Optional. Performs final actions just before kernel
> > + * transition, potentially reading/updating the handle via
> > + * @data.
> > + * Returns 0 on success, negative errno on failure.
> > + * @cancel: Optional. Cleans up state/resources if update is aborted
> > + * after prepare/freeze succeeded, using the @data handle (by
> > + * value) from the successful prepare. Returns void.
> > + * @finish: Optional. Performs final cleanup in the new kernel using the
> > + * preserved @data handle (by value). Returns void.
> > + * @retrieve: Retrieve the preserved file. Must be called before finish.
> > + * @can_preserve: callback to determine if @file with associated context (@arg)
> > + * can be preserved by this handler.
> > + * Return bool (true if preservable, false otherwise).
> > + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
> > + * that uniquely identifies the filesystem or file type this
> > + * handler supports. This is matched against the compatible
> > + * string associated with individual &struct liveupdate_file
> > + * instances.
> > + * @arg: An opaque pointer to implementation-specific context data
> > + * associated with this filesystem handler registration.
> > + * @list: used for linking this handler instance into a global list of
> > + * registered filesystem handlers.
> > + *
> > + * Modules that want to support live update for specific file types should
> > + * register an instance of this structure. LUO uses this registration to
> > + * determine if a given file can be preserved and to find the appropriate
> > + * operations to manage its state across the update.
> > + */
> > +struct liveupdate_filesystem {
> > + int (*prepare)(struct file *file, void *arg, u64 *data);
> > + int (*freeze)(struct file *file, void *arg, u64 *data);
> > + void (*cancel)(struct file *file, void *arg, u64 data);
> > + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> > + int (*retrieve)(void *arg, u64 data, struct file **file);
> > + bool (*can_preserve)(struct file *file, void *arg);
> > + const char *compatible;
> > + void *arg;
> > + struct list_head list;
> > +};
> > +
>
> Like with subsystems, I'd split ops and make the data part private to
> luo_files.c
For simplicity, I would like to keep them together, the same as in subsystems.
>
> > /**
> > * struct liveupdate_subsystem - Represents a subsystem participating in LUO
> > * @prepare: Optional. Called during LUO prepare phase. Should perform
> > @@ -142,6 +191,9 @@ int liveupdate_register_subsystem(struct liveupdate_subsystem *h);
> > int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h);
> > int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data);
> >
> > +int liveupdate_register_filesystem(struct liveupdate_filesystem *h);
> > +int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h);
>
> int liveupdate_register_file_ops(name, ops, data, ret_token) ?
>
> > +
> > #else /* CONFIG_LIVEUPDATE */
> >
> > static inline int liveupdate_reboot(void)
> > @@ -180,5 +232,15 @@ static inline int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h,
> > return -ENODATA;
> > }
> >
> > +static inline int liveupdate_register_filesystem(struct liveupdate_filesystem *h)
> > +{
> > + return 0;
> > +}
> > +
> > +static inline int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h)
> > +{
> > + return 0;
> > +}
> > +
> > #endif /* CONFIG_LIVEUPDATE */
> > #endif /* _LINUX_LIVEUPDATE_H */
> > --
> > 2.49.0.1101.gccaa498523-goog
> >
> >
>
> --
> Sincerely yours,
> Mike.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-06-05 15:56 ` Pratyush Yadav
@ 2025-06-08 13:37 ` Pasha Tatashin
2025-06-13 15:27 ` Pratyush Yadav
0 siblings, 1 reply; 11+ messages in thread
From: Pasha Tatashin @ 2025-06-08 13:37 UTC (permalink / raw)
To: Pratyush Yadav
Cc: jasonmiu, graf, changyuanl, rppt, dmatlack, rientjes, corbet,
rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm,
tj, yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, zhangguopeng,
linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes
> > +
> > +/**
> > + * luo_files_startup - Validates the LUO file-descriptors FDT node at startup.
> > + * @fdt: Pointer to the LUO FDT blob passed from the previous kernel.
> > + *
> > + * This __init function checks the existence and validity of the
> > + * '/file-descriptors' node in the FDT. This node is considered mandatory. It
>
> Why is it mandatory? Can't a user just preserve some subsystems, and no
> FDs?
Yes, that is legal, in that case this node is going to be empty.
>
> > + * calls panic() if the node is missing, inaccessible, or invalid (e.g., missing
> > + * compatible, wrong compatible string), indicating a critical configuration
> > + * error for LUO.
> > + */
> > +void __init luo_files_startup(void *fdt)
> > +{
> > + int ret, node_offset;
> > +
> > + node_offset = fdt_subnode_offset(fdt, 0, LUO_FILES_NODE_NAME);
> > + if (node_offset < 0)
> > + panic("Failed to find /file-descriptors node\n");
> > +
> > + ret = fdt_node_check_compatible(fdt, node_offset,
> > + LUO_FILES_COMPATIBLE);
> > + if (ret) {
> > + panic("FDT '%s' is incompatible with '%s' [%d]\n",
> > + LUO_FILES_NODE_NAME, LUO_FILES_COMPATIBLE, ret);
> > + }
> > + luo_fdt_in = fdt;
> > +}
> > +
> > +static void luo_files_recreate_luo_files_xa_in(void)
> > +{
> > + int parent_node_offset, file_node_offset;
> > + const char *node_name, *fdt_compat_str;
> > + struct liveupdate_filesystem *fs;
> > + struct luo_file *luo_file;
> > + const void *data_ptr;
> > + int ret = 0;
> > +
> > + if (luo_files_xa_in_recreated || !luo_fdt_in)
> > + return;
> > +
> > + /* Take write in order to gurantee that we re-create list once */
>
> Typo: s/gurantee/guarantee
Done, thanks.
>
> > + down_write(&luo_filesystems_list_rwsem);
> > + if (luo_files_xa_in_recreated)
> > + goto exit_unlock;
> > +
> > + parent_node_offset = fdt_subnode_offset(luo_fdt_in, 0,
> > + LUO_FILES_NODE_NAME);
> > +
> > + fdt_for_each_subnode(file_node_offset, luo_fdt_in, parent_node_offset) {
> > + bool handler_found = false;
> > + u64 token;
> > +
> > + node_name = fdt_get_name(luo_fdt_in, file_node_offset, NULL);
> > + if (!node_name) {
> > + panic("Skipping FDT subnode at offset %d: Cannot get name\n",
> > + file_node_offset);
>
> Should failure to parse a specific FD really be a panic? Wouldn't it be
> better to continue and let userspace decide if it can live with the FD
> missing?
This is not safe, the memory might be DMA or owned by a sensetive
process, and if we proceed liveupdate reboot without properly handling
memory, we can get corruptions, and memory leaks. Therefore, during
liveupdate boot if there are exceptions, we should panic.
> > + }
> > +
> > + ret = kstrtou64(node_name, 0, &token);
> > + if (ret < 0) {
> > + panic("Skipping FDT node '%s': Failed to parse token\n",
> > + node_name);
> > + }
> > +
> > + fdt_compat_str = fdt_getprop(luo_fdt_in, file_node_offset,
> > + "compatible", NULL);
> > + if (!fdt_compat_str) {
> > + panic("Skipping FDT node '%s': Missing 'compatible' property\n",
> > + node_name);
> > + }
> > +
> > + data_ptr = fdt_getprop(luo_fdt_in, file_node_offset, "data",
> > + NULL);
> > + if (!data_ptr) {
> > + panic("Can't recover property 'data' for FDT node '%s'\n",
> > + node_name);
> > + }
> > +
> > + list_for_each_entry(fs, &luo_filesystems_list, list) {
> > + if (!strcmp(fs->compatible, fdt_compat_str)) {
> > + handler_found = true;
> > + break;
> > + }
> > + }
> > +
> > + if (!handler_found) {
> > + panic("Skipping FDT node '%s': No registered handler for compatible '%s'\n",
> > + node_name, fdt_compat_str);
>
> Thinking out loud here: this means that by the time of first retrieval,
> all file systems must be registered. Since this is called from
> luo_do_files_finish_calls() or luo_retrieve_file(), it will come from
> userspace, so all built in modules would be initialized by then. But
> some loadable module might not be. I don't see much of a use case for
> loadable modules to participate in LUO, so I don't think it should be a
> problem.
Yes, in practice I am against supporting liveupdate for loadable
modules for FDs and devices; however, if userspace decides to use
them, they have to be very careful in terms when data is retrieved,
and when they are loaded.
> > + }
> > +
> > + luo_file = kmalloc(sizeof(*luo_file),
> > + GFP_KERNEL | __GFP_NOFAIL);
> > + luo_file->fs = fs;
> > + luo_file->file = NULL;
> > + memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
>
> Why not make sure data_ptr is exactly sizeof(u64) when we parse it, and
> then simply do luo_file->private_data = (u64)*data_ptr ?
Because FDT alignment is 4 bytes, we can't simply assign it.
> Because if the previous kernel wrote more than a u64 in data, then
> something is broken and we should catch that error anyway.
>
> > + luo_file->reclaimed = false;
> > + mutex_init(&luo_file->mutex);
> > + luo_file->state = LIVEUPDATE_STATE_UPDATED;
> > + ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
> > + GFP_KERNEL | __GFP_NOFAIL));
>
> Should you also check if something is already at token's slot, in case
> previous kernel generated wrong tokens or FDT is broken?
Good idea, added.
>
> > + if (ret < 0) {
> > + panic("Failed to store luo_file for token %llu in XArray: %d\n",
> > + token, ret);
> > + }
> > + }
> > + luo_files_xa_in_recreated = true;
> > +
> > +exit_unlock:
> > + up_write(&luo_filesystems_list_rwsem);
> > +}
> > +
> [...]
> > diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> > index 7a130680b5f2..7afe0aac5ce4 100644
> > --- a/include/linux/liveupdate.h
> > +++ b/include/linux/liveupdate.h
> > @@ -86,6 +86,55 @@ enum liveupdate_state {
> > LIVEUPDATE_STATE_UPDATED = 3,
> > };
> >
> > +/* Forward declaration needed if definition isn't included */
> > +struct file;
> > +
> > +/**
> > + * struct liveupdate_filesystem - Represents a handler for a live-updatable
> > + * filesystem/file type.
> > + * @prepare: Optional. Saves state for a specific file instance (@file,
> > + * @arg) before update, potentially returning value via @data.
> > + * Returns 0 on success, negative errno on failure.
> > + * @freeze: Optional. Performs final actions just before kernel
> > + * transition, potentially reading/updating the handle via
> > + * @data.
> > + * Returns 0 on success, negative errno on failure.
> > + * @cancel: Optional. Cleans up state/resources if update is aborted
> > + * after prepare/freeze succeeded, using the @data handle (by
> > + * value) from the successful prepare. Returns void.
> > + * @finish: Optional. Performs final cleanup in the new kernel using the
> > + * preserved @data handle (by value). Returns void.
> > + * @retrieve: Retrieve the preserved file. Must be called before finish.
> > + * @can_preserve: callback to determine if @file with associated context (@arg)
> > + * can be preserved by this handler.
> > + * Return bool (true if preservable, false otherwise).
> > + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
> > + * that uniquely identifies the filesystem or file type this
> > + * handler supports. This is matched against the compatible
> > + * string associated with individual &struct liveupdate_file
> > + * instances.
> > + * @arg: An opaque pointer to implementation-specific context data
> > + * associated with this filesystem handler registration.
> > + * @list: used for linking this handler instance into a global list of
> > + * registered filesystem handlers.
> > + *
> > + * Modules that want to support live update for specific file types should
> > + * register an instance of this structure. LUO uses this registration to
> > + * determine if a given file can be preserved and to find the appropriate
> > + * operations to manage its state across the update.
> > + */
> > +struct liveupdate_filesystem {
> > + int (*prepare)(struct file *file, void *arg, u64 *data);
> > + int (*freeze)(struct file *file, void *arg, u64 *data);
> > + void (*cancel)(struct file *file, void *arg, u64 data);
> > + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> > + int (*retrieve)(void *arg, u64 data, struct file **file);
> > + bool (*can_preserve)(struct file *file, void *arg);
> > + const char *compatible;
> > + void *arg;
>
> What is the use for this arg? I would expect one file type/system to
> register one set of handlers. So they can keep their arg in a global in
> their code. I don't see why a per-filesystem arg is needed.
I think, arg is useful in case we support a subsystem is registered
multiple times with some differences: i.e. based on mount point, or
file types handling. Let's keep it for now, but if needed, we can
remove that in future revisions.
> What I do think is needed is a per-file arg. Each callback gets 'data',
> which is the serialized data, but there is no place to store runtime
> state, like some flags or serialization metadata. Sure, you could make
> place for it somewhere in the inode, but I think it would be a lot
> cleaner to be able to store it in struct luo_file.
>
> So perhaps rename private_data in struct luo_file to say
> serialized_data, and have a field called "private" that filesystems can
> use for their runtime state?
I am not against this, but let's make this change when it is actually
needed by a registered filesystem.
Thanks,
Pasha
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-06-08 13:37 ` Pasha Tatashin
@ 2025-06-13 15:27 ` Pratyush Yadav
2025-06-15 18:02 ` Pasha Tatashin
0 siblings, 1 reply; 11+ messages in thread
From: Pratyush Yadav @ 2025-06-13 15:27 UTC (permalink / raw)
To: Pasha Tatashin
Cc: Pratyush Yadav, jasonmiu, graf, changyuanl, rppt, dmatlack,
rientjes, corbet, rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl,
masahiroy, akpm, tj, yoann.congal, mmaurer, roman.gushchin,
chenridong, axboe, mark.rutland, jannh, vincent.guittot, hannes,
dan.j.williams, david, joel.granados, rostedt, anna.schumaker,
song, zhangguopeng, linux, linux-kernel, linux-doc, linux-mm,
gregkh, tglx, mingo, bp, dave.hansen, x86, hpa, rafael, dakr,
bartosz.golaszewski, cw00.choi, myungjoo.ham, yesanishhere,
Jonathan.Cameron, quic_zijuhu, aleksander.lobakin, ira.weiny,
andriy.shevchenko, leon, lukas, bhelgaas, wagi, djeffery,
stuart.w.hayes
On Sun, Jun 08 2025, Pasha Tatashin wrote:
[...]
>> > + down_write(&luo_filesystems_list_rwsem);
>> > + if (luo_files_xa_in_recreated)
>> > + goto exit_unlock;
>> > +
>> > + parent_node_offset = fdt_subnode_offset(luo_fdt_in, 0,
>> > + LUO_FILES_NODE_NAME);
>> > +
>> > + fdt_for_each_subnode(file_node_offset, luo_fdt_in, parent_node_offset) {
>> > + bool handler_found = false;
>> > + u64 token;
>> > +
>> > + node_name = fdt_get_name(luo_fdt_in, file_node_offset, NULL);
>> > + if (!node_name) {
>> > + panic("Skipping FDT subnode at offset %d: Cannot get name\n",
>> > + file_node_offset);
>>
>> Should failure to parse a specific FD really be a panic? Wouldn't it be
>> better to continue and let userspace decide if it can live with the FD
>> missing?
>
> This is not safe, the memory might be DMA or owned by a sensetive
> process, and if we proceed liveupdate reboot without properly handling
> memory, we can get corruptions, and memory leaks. Therefore, during
> liveupdate boot if there are exceptions, we should panic.
I don't get how it would result in memory leaks or corruptions, since
KHO would have marked that memory as preserved, and the new kernel won't
touch it until someone restores it.
So it can at most lead to loss of data, and in that case, userspace can
very well decide if it can live with that loss or not.
Or are you assuming here that even data in KHO is broken? In that case,
it would probably be a good idea to panic early.
[...]
>> > + }
>> > +
>> > + luo_file = kmalloc(sizeof(*luo_file),
>> > + GFP_KERNEL | __GFP_NOFAIL);
>> > + luo_file->fs = fs;
>> > + luo_file->file = NULL;
>> > + memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
>>
>> Why not make sure data_ptr is exactly sizeof(u64) when we parse it, and
>> then simply do luo_file->private_data = (u64)*data_ptr ?
>
> Because FDT alignment is 4 bytes, we can't simply assign it.
Hmm, good catch. Didn't think of that.
>
>> Because if the previous kernel wrote more than a u64 in data, then
>> something is broken and we should catch that error anyway.
>>
>> > + luo_file->reclaimed = false;
>> > + mutex_init(&luo_file->mutex);
>> > + luo_file->state = LIVEUPDATE_STATE_UPDATED;
>> > + ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
>> > + GFP_KERNEL | __GFP_NOFAIL));
>>
[...]
>> > +struct liveupdate_filesystem {
>> > + int (*prepare)(struct file *file, void *arg, u64 *data);
>> > + int (*freeze)(struct file *file, void *arg, u64 *data);
>> > + void (*cancel)(struct file *file, void *arg, u64 data);
>> > + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
>> > + int (*retrieve)(void *arg, u64 data, struct file **file);
>> > + bool (*can_preserve)(struct file *file, void *arg);
>> > + const char *compatible;
>> > + void *arg;
>>
>> What is the use for this arg? I would expect one file type/system to
>> register one set of handlers. So they can keep their arg in a global in
>> their code. I don't see why a per-filesystem arg is needed.
>
> I think, arg is useful in case we support a subsystem is registered
> multiple times with some differences: i.e. based on mount point, or
> file types handling. Let's keep it for now, but if needed, we can
> remove that in future revisions.
>
>> What I do think is needed is a per-file arg. Each callback gets 'data',
>> which is the serialized data, but there is no place to store runtime
>> state, like some flags or serialization metadata. Sure, you could make
>> place for it somewhere in the inode, but I think it would be a lot
>> cleaner to be able to store it in struct luo_file.
>>
>> So perhaps rename private_data in struct luo_file to say
>> serialized_data, and have a field called "private" that filesystems can
>> use for their runtime state?
>
> I am not against this, but let's make this change when it is actually
> needed by a registered filesystem.
Okay, fair enough.
--
Regards,
Pratyush Yadav
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
2025-06-13 15:27 ` Pratyush Yadav
@ 2025-06-15 18:02 ` Pasha Tatashin
0 siblings, 0 replies; 11+ messages in thread
From: Pasha Tatashin @ 2025-06-15 18:02 UTC (permalink / raw)
To: Pratyush Yadav
Cc: jasonmiu, graf, changyuanl, rppt, dmatlack, rientjes, corbet,
rdunlap, ilpo.jarvinen, kanie, ojeda, aliceryhl, masahiroy, akpm,
tj, yoann.congal, mmaurer, roman.gushchin, chenridong, axboe,
mark.rutland, jannh, vincent.guittot, hannes, dan.j.williams,
david, joel.granados, rostedt, anna.schumaker, song, zhangguopeng,
linux, linux-kernel, linux-doc, linux-mm, gregkh, tglx, mingo, bp,
dave.hansen, x86, hpa, rafael, dakr, bartosz.golaszewski,
cw00.choi, myungjoo.ham, yesanishhere, Jonathan.Cameron,
quic_zijuhu, aleksander.lobakin, ira.weiny, andriy.shevchenko,
leon, lukas, bhelgaas, wagi, djeffery, stuart.w.hayes
> > This is not safe, the memory might be DMA or owned by a sensetive
> > process, and if we proceed liveupdate reboot without properly handling
> > memory, we can get corruptions, and memory leaks. Therefore, during
> > liveupdate boot if there are exceptions, we should panic.
>
> I don't get how it would result in memory leaks or corruptions, since
> KHO would have marked that memory as preserved, and the new kernel won't
> touch it until someone restores it.
>
> So it can at most lead to loss of data, and in that case, userspace can
> very well decide if it can live with that loss or not.
>
> Or are you assuming here that even data in KHO is broken? In that case,
> it would probably be a good idea to panic early.
A broken LUO format is a catastrophic failure. It's unclear at this
point in boot whether the problem lies with KHO, LUO itself, or
mismatched interface assumptions between kernel versions. Regardless,
falling back to a cold reboot is the safest course of action, rather
than attempting to boot into a potentially broken environment. Since
VMs or any preserved userspace won't survive, the additional delay of
a full reboot should not significantly worsen the impact.
>
> [...]
> >> > + }
> >> > +
> >> > + luo_file = kmalloc(sizeof(*luo_file),
> >> > + GFP_KERNEL | __GFP_NOFAIL);
> >> > + luo_file->fs = fs;
> >> > + luo_file->file = NULL;
> >> > + memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
> >>
> >> Why not make sure data_ptr is exactly sizeof(u64) when we parse it, and
> >> then simply do luo_file->private_data = (u64)*data_ptr ?
> >
> > Because FDT alignment is 4 bytes, we can't simply assign it.
>
> Hmm, good catch. Didn't think of that.
>
> >
> >> Because if the previous kernel wrote more than a u64 in data, then
> >> something is broken and we should catch that error anyway.
> >>
> >> > + luo_file->reclaimed = false;
> >> > + mutex_init(&luo_file->mutex);
> >> > + luo_file->state = LIVEUPDATE_STATE_UPDATED;
> >> > + ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
> >> > + GFP_KERNEL | __GFP_NOFAIL));
> >>
> [...]
> >> > +struct liveupdate_filesystem {
> >> > + int (*prepare)(struct file *file, void *arg, u64 *data);
> >> > + int (*freeze)(struct file *file, void *arg, u64 *data);
> >> > + void (*cancel)(struct file *file, void *arg, u64 data);
> >> > + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> >> > + int (*retrieve)(void *arg, u64 data, struct file **file);
> >> > + bool (*can_preserve)(struct file *file, void *arg);
> >> > + const char *compatible;
> >> > + void *arg;
> >>
> >> What is the use for this arg? I would expect one file type/system to
> >> register one set of handlers. So they can keep their arg in a global in
> >> their code. I don't see why a per-filesystem arg is needed.
> >
> > I think, arg is useful in case we support a subsystem is registered
> > multiple times with some differences: i.e. based on mount point, or
> > file types handling. Let's keep it for now, but if needed, we can
> > remove that in future revisions.
> >
> >> What I do think is needed is a per-file arg. Each callback gets 'data',
> >> which is the serialized data, but there is no place to store runtime
> >> state, like some flags or serialization metadata. Sure, you could make
> >> place for it somewhere in the inode, but I think it would be a lot
> >> cleaner to be able to store it in struct luo_file.
> >>
> >> So perhaps rename private_data in struct luo_file to say
> >> serialized_data, and have a field called "private" that filesystems can
> >> use for their runtime state?
> >
> > I am not against this, but let's make this change when it is actually
> > needed by a registered filesystem.
>
> Okay, fair enough.
>
> --
> Regards,
> Pratyush Yadav
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-06-15 18:03 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CA+Sjx0XP2A+rsA95NVwX7czQGNREXcFT=psedA7fawwKJb5rkw@mail.gmail.com>
2025-06-08 0:07 ` [RFC v2 08/16] luo: luo_files: add infrastructure for FDs Pasha Tatashin
2025-05-15 18:23 [RFC v2 00/16] Live Update Orchestrator Pasha Tatashin
2025-05-15 18:23 ` [RFC v2 08/16] luo: luo_files: add infrastructure for FDs Pasha Tatashin
2025-05-15 23:15 ` James Houghton
2025-05-23 18:09 ` Pasha Tatashin
2025-05-26 7:55 ` Mike Rapoport
2025-06-05 11:56 ` Pratyush Yadav
2025-06-08 13:13 ` Pasha Tatashin
2025-06-05 15:56 ` Pratyush Yadav
2025-06-08 13:37 ` Pasha Tatashin
2025-06-13 15:27 ` Pratyush Yadav
2025-06-15 18:02 ` Pasha Tatashin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).