* [RFC PATCH 01/21] selftests/liveupdate: Build tests from the selftests/liveupdate directory
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 02/21] selftests/liveupdate: Create library of core live update ioctls Vipin Sharma
` (20 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
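Build the liveupdate selftests from the selftests/liveupdate directory.
Include ../lib.mk before the compiler flags and build rules, generate the
manual-test build rules from a BUILD_RULE_TEMPLATE, and list every test
binary in .gitignore.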
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
tools/testing/selftests/liveupdate/.gitignore | 7 ++++--
tools/testing/selftests/liveupdate/Makefile | 25 ++++++++++---------
2 files changed, 18 insertions(+), 14 deletions(-)
diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/selftests/liveupdate/.gitignore
index de7ca45d3892..da3a50a32aeb 100644
--- a/tools/testing/selftests/liveupdate/.gitignore
+++ b/tools/testing/selftests/liveupdate/.gitignore
@@ -1,2 +1,5 @@
-/liveupdate
-/luo_multi_kexec
+liveupdate
+luo_multi_kexec
+luo_multi_file
+luo_multi_session
+luo_unreclaimed
diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index 25a6dec790bb..fbcacbd1b798 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -1,10 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-KHDR_INCLUDES ?= -I../../../usr/include
-CFLAGS += -Wall -O2 -Wno-unused-function
-CFLAGS += $(KHDR_INCLUDES)
-LDFLAGS += -static
-
# --- Test Configuration (Edit this section when adding new tests) ---
LUO_SHARED_SRCS := luo_test_utils.c
LUO_SHARED_HDRS += luo_test_utils.h
@@ -25,6 +20,12 @@ TEST_GEN_PROGS := $(LUO_MAIN_TESTS)
liveupdate_SOURCES := liveupdate.c $(LUO_SHARED_SRCS)
+include ../lib.mk
+
+CFLAGS += -Wall -O2 -Wno-unused-function
+CFLAGS += $(KHDR_INCLUDES)
+LDFLAGS += -static
+
$(OUTPUT)/liveupdate: $(liveupdate_SOURCES) $(LUO_SHARED_HDRS)
$(call msg,LINK,,$@)
$(Q)$(LINK.c) $^ $(LDLIBS) -o $@
@@ -33,16 +34,16 @@ $(OUTPUT)/liveupdate: $(liveupdate_SOURCES) $(LUO_SHARED_HDRS)
$(foreach test,$(LUO_MANUAL_TESTS), \
$(eval $(test)_SOURCES := $(test).c $(LUO_SHARED_SRCS)))
+define BUILD_RULE_TEMPLATE
+$(OUTPUT)/$(1): $($(1)_SOURCES) $(LUO_SHARED_HDRS)
+ $(call msg,LINK,,$$@)
+ $(Q)$(LINK.c) $$^ $(LDLIBS) -o $$@
+ $(Q)chmod +x $$@
+endef
# This loop automatically generates an explicit build rule for each manual test.
# It includes dependencies on the shared headers and makes the output
# executable.
# Note the use of '$$' to escape automatic variables for the 'eval' command.
$(foreach test,$(LUO_MANUAL_TESTS), \
- $(eval $(OUTPUT)/$(test): $($(test)_SOURCES) $(LUO_SHARED_HDRS) \
- $(call msg,LINK,,$$@) ; \
- $(Q)$(LINK.c) $$^ $(LDLIBS) -o $$@ ; \
- $(Q)chmod +x $$@ \
- ) \
+ $(eval $(call BUILD_RULE_TEMPLATE,$(test))) \
)
-
-include ../lib.mk
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 02/21] selftests/liveupdate: Create library of core live update ioctls
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 01/21] selftests/liveupdate: Build tests from the selftests/liveupdate directory Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 03/21] selftests/liveupdate: Move do_kexec.sh script to liveupdate/lib Vipin Sharma
` (19 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Create a libliveupdate library (lib/libliveupdate.mk and
lib/liveupdate_util.c) of core live update APIs which can be shared
outside of the liveupdate selftests, for example with the VFIO selftests.
A shared library avoids the need for VFIO to define its own APIs to
interact with the liveupdate ioctls.
No functional changes intended; this patch only moves a few functions to
the library without changing their code.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
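A note on the session helpers being moved: luo_create_session() and
luo_retrieve_session() copy the caller's name with snprintf() and a "%.*s"
precision, so an over-long name is truncated rather than rejected. The
sketch below reproduces only that copy, in isolation. DEMO_NAME_LENGTH and
demo_copy_name() are hypothetical stand-ins; the real limit is
LIVEUPDATE_SESSION_NAME_LENGTH from <linux/liveupdate.h>, whose value is
not shown in this series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for LIVEUPDATE_SESSION_NAME_LENGTH. */
#define DEMO_NAME_LENGTH 8

/*
 * Copy at most DEMO_NAME_LENGTH - 1 bytes of name into dst and
 * NUL-terminate it, the same way the library helpers fill arg.name.
 */
static void demo_copy_name(char dst[DEMO_NAME_LENGTH], const char *name)
{
	assert(dst && name);
	snprintf(dst, DEMO_NAME_LENGTH, "%.*s", DEMO_NAME_LENGTH - 1, name);
}
```

With this scheme, two names that share their first DEMO_NAME_LENGTH - 1
bytes end up as the same stored name, so users of the library outside the
liveupdate tests may want to check the length before calling.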
tools/testing/selftests/liveupdate/Makefile | 6 +-
.../liveupdate/lib/include/liveupdate_util.h | 23 +++++++
.../selftests/liveupdate/lib/libliveupdate.mk | 17 +++++
.../liveupdate/lib/liveupdate_util.c | 68 +++++++++++++++++++
.../selftests/liveupdate/luo_test_utils.c | 55 +--------------
.../selftests/liveupdate/luo_test_utils.h | 10 +--
6 files changed, 114 insertions(+), 65 deletions(-)
create mode 100644 tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
create mode 100644 tools/testing/selftests/liveupdate/lib/libliveupdate.mk
create mode 100644 tools/testing/selftests/liveupdate/lib/liveupdate_util.c
diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index fbcacbd1b798..79d1c525f03c 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -26,7 +26,9 @@ CFLAGS += -Wall -O2 -Wno-unused-function
CFLAGS += $(KHDR_INCLUDES)
LDFLAGS += -static
-$(OUTPUT)/liveupdate: $(liveupdate_SOURCES) $(LUO_SHARED_HDRS)
+include lib/libliveupdate.mk
+
+$(OUTPUT)/liveupdate: $(liveupdate_SOURCES) $(LUO_SHARED_HDRS) $(LIBLIVEUPDATE_O)
$(call msg,LINK,,$@)
$(Q)$(LINK.c) $^ $(LDLIBS) -o $@
@@ -35,7 +37,7 @@ $(foreach test,$(LUO_MANUAL_TESTS), \
$(eval $(test)_SOURCES := $(test).c $(LUO_SHARED_SRCS)))
define BUILD_RULE_TEMPLATE
-$(OUTPUT)/$(1): $($(1)_SOURCES) $(LUO_SHARED_HDRS)
+$(OUTPUT)/$(1): $($(1)_SOURCES) $(LUO_SHARED_HDRS) $(LIBLIVEUPDATE_O)
$(call msg,LINK,,$$@)
$(Q)$(LINK.c) $$^ $(LDLIBS) -o $$@
$(Q)chmod +x $$@
diff --git a/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h b/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
new file mode 100644
index 000000000000..f938ce60edb7
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin <pasha.tatashin@soleen.com>
+ */
+
+#ifndef SELFTESTS_LIVEUPDATE_LIB_LIVEUPDATE_UTIL_H
+#define SELFTESTS_LIVEUPDATE_LIB_LIVEUPDATE_UTIL_H
+
+#include <linux/liveupdate.h>
+
+#define LUO_DEVICE "/dev/liveupdate"
+
+int luo_open_device(void);
+int luo_create_session(int luo_fd, const char *name);
+int luo_retrieve_session(int luo_fd, const char *name);
+
+int luo_set_session_event(int session_fd, enum liveupdate_event event);
+int luo_set_global_event(int luo_fd, enum liveupdate_event event);
+int luo_get_global_state(int luo_fd, enum liveupdate_state *state);
+
+#endif /* SELFTESTS_LIVEUPDATE_LIB_LIVEUPDATE_UTIL_H */
diff --git a/tools/testing/selftests/liveupdate/lib/libliveupdate.mk b/tools/testing/selftests/liveupdate/lib/libliveupdate.mk
new file mode 100644
index 000000000000..b3fc2580a7cf
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/lib/libliveupdate.mk
@@ -0,0 +1,17 @@
+LIBLIVEUPDATE_SRCDIR := $(selfdir)/liveupdate/lib
+
+LIBLIVEUPDATE_C := liveupdate_util.c
+
+LIBLIVEUPDATE_OUTPUT := $(OUTPUT)/libliveupdate
+
+LIBLIVEUPDATE_O := $(patsubst %.c, $(LIBLIVEUPDATE_OUTPUT)/%.o, $(LIBLIVEUPDATE_C))
+
+LIBLIVEUPDATE_O_DIRS := $(shell dirname $(LIBLIVEUPDATE_O) | uniq)
+$(shell mkdir -p $(LIBLIVEUPDATE_O_DIRS))
+
+CFLAGS += -I$(LIBLIVEUPDATE_SRCDIR)/include
+
+$(LIBLIVEUPDATE_O): $(LIBLIVEUPDATE_OUTPUT)/%.o : $(LIBLIVEUPDATE_SRCDIR)/%.c
+ $(CC) $(CFLAGS) $(CPPFLAGS) -c $< -o $@
+
+EXTRA_CLEAN += $(LIBLIVEUPDATE_OUTPUT)
\ No newline at end of file
diff --git a/tools/testing/selftests/liveupdate/lib/liveupdate_util.c b/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
new file mode 100644
index 000000000000..1e6fd9dd8fb9
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin <pasha.tatashin@soleen.com>
+ */
+
+#define _GNU_SOURCE
+
+#include <liveupdate_util.h>
+#include <linux/liveupdate.h>
+#include <errno.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+
+int luo_open_device(void)
+{
+ return open(LUO_DEVICE, O_RDWR);
+}
+
+int luo_create_session(int luo_fd, const char *name)
+{
+ struct liveupdate_ioctl_create_session arg = { .size = sizeof(arg) };
+
+ snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s",
+ LIVEUPDATE_SESSION_NAME_LENGTH - 1, name);
+ if (ioctl(luo_fd, LIVEUPDATE_IOCTL_CREATE_SESSION, &arg) < 0)
+ return -errno;
+ return arg.fd;
+}
+
+int luo_retrieve_session(int luo_fd, const char *name)
+{
+ struct liveupdate_ioctl_retrieve_session arg = { .size = sizeof(arg) };
+
+ snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s",
+ LIVEUPDATE_SESSION_NAME_LENGTH - 1, name);
+ if (ioctl(luo_fd, LIVEUPDATE_IOCTL_RETRIEVE_SESSION, &arg) < 0)
+ return -errno;
+ return arg.fd;
+}
+
+int luo_set_session_event(int session_fd, enum liveupdate_event event)
+{
+ struct liveupdate_session_set_event arg = { .size = sizeof(arg) };
+
+ arg.event = event;
+ return ioctl(session_fd, LIVEUPDATE_SESSION_SET_EVENT, &arg);
+}
+
+int luo_set_global_event(int luo_fd, enum liveupdate_event event)
+{
+ struct liveupdate_ioctl_set_event arg = { .size = sizeof(arg) };
+
+ arg.event = event;
+ return ioctl(luo_fd, LIVEUPDATE_IOCTL_SET_EVENT, &arg);
+}
+
+int luo_get_global_state(int luo_fd, enum liveupdate_state *state)
+{
+ struct liveupdate_ioctl_get_state arg = { .size = sizeof(arg) };
+
+ if (ioctl(luo_fd, LIVEUPDATE_IOCTL_GET_STATE, &arg) < 0)
+ return -errno;
+ *state = arg.state;
+ return 0;
+}
diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.c b/tools/testing/selftests/liveupdate/luo_test_utils.c
index c0840e6e66fd..0f5bc7260ccc 100644
--- a/tools/testing/selftests/liveupdate/luo_test_utils.c
+++ b/tools/testing/selftests/liveupdate/luo_test_utils.c
@@ -17,39 +17,12 @@
#include <sys/mman.h>
#include <errno.h>
#include <stdarg.h>
-
+#include <liveupdate_util.h>
#include "luo_test_utils.h"
#include "../kselftest.h"
/* The fail_exit function is now a macro in the header. */
-int luo_open_device(void)
-{
- return open(LUO_DEVICE, O_RDWR);
-}
-
-int luo_create_session(int luo_fd, const char *name)
-{
- struct liveupdate_ioctl_create_session arg = { .size = sizeof(arg) };
-
- snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s",
- LIVEUPDATE_SESSION_NAME_LENGTH - 1, name);
- if (ioctl(luo_fd, LIVEUPDATE_IOCTL_CREATE_SESSION, &arg) < 0)
- return -errno;
- return arg.fd;
-}
-
-int luo_retrieve_session(int luo_fd, const char *name)
-{
- struct liveupdate_ioctl_retrieve_session arg = { .size = sizeof(arg) };
-
- snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s",
- LIVEUPDATE_SESSION_NAME_LENGTH - 1, name);
- if (ioctl(luo_fd, LIVEUPDATE_IOCTL_RETRIEVE_SESSION, &arg) < 0)
- return -errno;
- return arg.fd;
-}
-
int create_and_preserve_memfd(int session_fd, int token, const char *data)
{
struct liveupdate_session_preserve_fd arg = { .size = sizeof(arg) };
@@ -119,32 +92,6 @@ int restore_and_verify_memfd(int session_fd, int token,
return ret;
}
-int luo_set_session_event(int session_fd, enum liveupdate_event event)
-{
- struct liveupdate_session_set_event arg = { .size = sizeof(arg) };
-
- arg.event = event;
- return ioctl(session_fd, LIVEUPDATE_SESSION_SET_EVENT, &arg);
-}
-
-int luo_set_global_event(int luo_fd, enum liveupdate_event event)
-{
- struct liveupdate_ioctl_set_event arg = { .size = sizeof(arg) };
-
- arg.event = event;
- return ioctl(luo_fd, LIVEUPDATE_IOCTL_SET_EVENT, &arg);
-}
-
-int luo_get_global_state(int luo_fd, enum liveupdate_state *state)
-{
- struct liveupdate_ioctl_get_state arg = { .size = sizeof(arg) };
-
- if (ioctl(luo_fd, LIVEUPDATE_IOCTL_GET_STATE, &arg) < 0)
- return -errno;
- *state = arg.state;
- return 0;
-}
-
void create_state_file(int luo_fd, int next_stage)
{
char buf[32];
diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.h b/tools/testing/selftests/liveupdate/luo_test_utils.h
index e30cfcb0a596..4d371b528a01 100644
--- a/tools/testing/selftests/liveupdate/luo_test_utils.h
+++ b/tools/testing/selftests/liveupdate/luo_test_utils.h
@@ -11,9 +11,9 @@
#include <errno.h>
#include <string.h>
#include <linux/liveupdate.h>
+#include <liveupdate_util.h>
#include "../kselftest.h"
-#define LUO_DEVICE "/dev/liveupdate"
#define STATE_SESSION_NAME "state_session"
#define STATE_MEMFD_TOKEN 999
@@ -30,19 +30,11 @@ struct session_info {
ksft_exit_fail_msg("[%s] " fmt " (errno: %s)\n", \
__func__, ##__VA_ARGS__, strerror(errno))
-int luo_open_device(void);
-
-int luo_create_session(int luo_fd, const char *name);
-int luo_retrieve_session(int luo_fd, const char *name);
int create_and_preserve_memfd(int session_fd, int token, const char *data);
int restore_and_verify_memfd(int session_fd, int token, const char *expected_data);
int verify_session_and_get_fd(int luo_fd, struct session_info *s);
-int luo_set_session_event(int session_fd, enum liveupdate_event event);
-int luo_set_global_event(int luo_fd, enum liveupdate_event event);
-int luo_get_global_state(int luo_fd, enum liveupdate_state *state);
-
void create_state_file(int luo_fd, int next_stage);
int restore_and_read_state(int luo_fd, int *stage);
void update_state_file(int session_fd, int next_stage);
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 03/21] selftests/liveupdate: Move do_kexec.sh script to liveupdate/lib
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 01/21] selftests/liveupdate: Build tests from the selftests/liveupdate directory Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 02/21] selftests/liveupdate: Create library of core live update ioctls Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 04/21] selftests/liveupdate: Move LUO ioctls calls to liveupdate library Vipin Sharma
` (18 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Move do_kexec.sh to the lib directory in the liveupdate selftest
directory. Add code to libliveupdate.mk to copy the script to the
generated libliveupdate directory during the build.
The script allows liveupdate library users to initiate kexec for
liveupdate test flows.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
tools/testing/selftests/liveupdate/Makefile | 2 --
.../selftests/liveupdate/{ => lib}/do_kexec.sh | 0
.../liveupdate/lib/include/liveupdate_util.h | 2 ++
.../testing/selftests/liveupdate/lib/libliveupdate.mk | 1 +
.../selftests/liveupdate/lib/liveupdate_util.c | 11 +++++++++++
tools/testing/selftests/liveupdate/luo_multi_file.c | 2 --
tools/testing/selftests/liveupdate/luo_multi_kexec.c | 2 --
.../testing/selftests/liveupdate/luo_multi_session.c | 2 --
tools/testing/selftests/liveupdate/luo_unreclaimed.c | 1 -
9 files changed, 14 insertions(+), 9 deletions(-)
rename tools/testing/selftests/liveupdate/{ => lib}/do_kexec.sh (100%)
diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index 79d1c525f03c..f203fd681afe 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -9,8 +9,6 @@ LUO_MANUAL_TESTS += luo_multi_kexec
LUO_MANUAL_TESTS += luo_multi_session
LUO_MANUAL_TESTS += luo_unreclaimed
-TEST_FILES += do_kexec.sh
-
LUO_MAIN_TESTS += liveupdate
# --- Automatic Rule Generation (Do not edit below) ---
diff --git a/tools/testing/selftests/liveupdate/do_kexec.sh b/tools/testing/selftests/liveupdate/lib/do_kexec.sh
similarity index 100%
rename from tools/testing/selftests/liveupdate/do_kexec.sh
rename to tools/testing/selftests/liveupdate/lib/do_kexec.sh
diff --git a/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h b/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
index f938ce60edb7..6ee9e124a1a4 100644
--- a/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
+++ b/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
@@ -11,10 +11,12 @@
#include <linux/liveupdate.h>
#define LUO_DEVICE "/dev/liveupdate"
+#define KEXEC_SCRIPT "libliveupdate/do_kexec.sh"
int luo_open_device(void);
int luo_create_session(int luo_fd, const char *name);
int luo_retrieve_session(int luo_fd, const char *name);
+int luo_session_preserve_fd(int session_fd, int fd, int token);
int luo_set_session_event(int session_fd, enum liveupdate_event event);
int luo_set_global_event(int luo_fd, enum liveupdate_event event);
diff --git a/tools/testing/selftests/liveupdate/lib/libliveupdate.mk b/tools/testing/selftests/liveupdate/lib/libliveupdate.mk
index b3fc2580a7cf..ddb9b1a4363b 100644
--- a/tools/testing/selftests/liveupdate/lib/libliveupdate.mk
+++ b/tools/testing/selftests/liveupdate/lib/libliveupdate.mk
@@ -8,6 +8,7 @@ LIBLIVEUPDATE_O := $(patsubst %.c, $(LIBLIVEUPDATE_OUTPUT)/%.o, $(LIBLIVEUPDATE_
LIBLIVEUPDATE_O_DIRS := $(shell dirname $(LIBLIVEUPDATE_O) | uniq)
$(shell mkdir -p $(LIBLIVEUPDATE_O_DIRS))
+$(shell cp -n $(LIBLIVEUPDATE_SRCDIR)/do_kexec.sh $(LIBLIVEUPDATE_OUTPUT))
CFLAGS += -I$(LIBLIVEUPDATE_SRCDIR)/include
diff --git a/tools/testing/selftests/liveupdate/lib/liveupdate_util.c b/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
index 1e6fd9dd8fb9..26fd6a7763a2 100644
--- a/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
+++ b/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
@@ -30,6 +30,17 @@ int luo_create_session(int luo_fd, const char *name)
return arg.fd;
}
+int luo_session_preserve_fd(int session_fd, int fd, int token)
+{
+ struct liveupdate_session_preserve_fd arg = {
+ .size = sizeof(arg),
+ .fd = fd,
+ .token = token
+ };
+
+ return ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg) < 0;
+}
+
int luo_retrieve_session(int luo_fd, const char *name)
{
struct liveupdate_ioctl_retrieve_session arg = { .size = sizeof(arg) };
diff --git a/tools/testing/selftests/liveupdate/luo_multi_file.c b/tools/testing/selftests/liveupdate/luo_multi_file.c
index ae38fe8aba4c..1a4f95046c75 100644
--- a/tools/testing/selftests/liveupdate/luo_multi_file.c
+++ b/tools/testing/selftests/liveupdate/luo_multi_file.c
@@ -7,8 +7,6 @@
#include "luo_test_utils.h"
-#define KEXEC_SCRIPT "./do_kexec.sh"
-
#define SESSION_NAME "multi_file_session"
#define TOKEN_A 101
#define TOKEN_B 102
diff --git a/tools/testing/selftests/liveupdate/luo_multi_kexec.c b/tools/testing/selftests/liveupdate/luo_multi_kexec.c
index 1f350990ee67..5cfecbc6d269 100644
--- a/tools/testing/selftests/liveupdate/luo_multi_kexec.c
+++ b/tools/testing/selftests/liveupdate/luo_multi_kexec.c
@@ -7,8 +7,6 @@
#include "luo_test_utils.h"
-#define KEXEC_SCRIPT "./do_kexec.sh"
-
#define NUM_SESSIONS 3
/* Helper to set up one session and all its files */
diff --git a/tools/testing/selftests/liveupdate/luo_multi_session.c b/tools/testing/selftests/liveupdate/luo_multi_session.c
index 9ea96d7b997f..389d4b559cb3 100644
--- a/tools/testing/selftests/liveupdate/luo_multi_session.c
+++ b/tools/testing/selftests/liveupdate/luo_multi_session.c
@@ -8,8 +8,6 @@
#include "luo_test_utils.h"
#include "../kselftest.h"
-#define KEXEC_SCRIPT "./do_kexec.sh"
-
#define NUM_SESSIONS 5
#define FILES_PER_SESSION 5
diff --git a/tools/testing/selftests/liveupdate/luo_unreclaimed.c b/tools/testing/selftests/liveupdate/luo_unreclaimed.c
index c3921b21b97b..b31bb354bfc3 100644
--- a/tools/testing/selftests/liveupdate/luo_unreclaimed.c
+++ b/tools/testing/selftests/liveupdate/luo_unreclaimed.c
@@ -8,7 +8,6 @@
#include "luo_test_utils.h"
#include "../kselftest.h"
-#define KEXEC_SCRIPT "./do_kexec.sh"
#define SESSION_NAME "unreclaimed_session"
#define TOKEN_A 100
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 04/21] selftests/liveupdate: Move LUO ioctls calls to liveupdate library
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (2 preceding siblings ...)
2025-10-18 0:06 ` [RFC PATCH 03/21] selftests/liveupdate: Move do_kexec.sh script to liveupdate/lib Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-18 0:06 ` [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator Vipin Sharma
` (17 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
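Move the liveupdate ioctl calls to the liveupdate library.
This provides a single place for LUO ioctl interactions and lets other
selftests use them. luo_session_preserve_fd() now returns 0 or -errno
instead of the result of the comparison "ioctl(...) < 0".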
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
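Note on the luo_session_preserve_fd() return value changed in this patch:
the two return styles differ only on failure, which the sketch below shows
with an ioctl that is known to fail. It is an illustration, not code from
the series: the demo_* helpers are hypothetical, and FIONREAD on fd -1 is
used only because it fails with EBADF without needing /dev/liveupdate.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

/* Old style: returns the truth value of the comparison, i.e. 1 on failure. */
static int demo_ret_compare(int fd)
{
	int n;

	return ioctl(fd, FIONREAD, &n) < 0;
}

/* New style: 0 on success, -errno on failure. */
static int demo_ret_errno(int fd)
{
	int n;

	if (ioctl(fd, FIONREAD, &n) < 0)
		return -errno;
	return 0;
}

/* Both helpers fail on an invalid fd, but only one says why. */
static void demo_check(void)
{
	assert(demo_ret_compare(-1) == 1);
	assert(demo_ret_errno(-1) == -EBADF);
}
```

Callers that only test for a nonzero result, as create_and_preserve_memfd()
does, behave the same with either style; callers that compare against a
negative value or report the error need the -errno form.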
.../liveupdate/lib/include/liveupdate_util.h | 2 ++
.../liveupdate/lib/liveupdate_util.c | 29 ++++++++++++++++++-
.../selftests/liveupdate/luo_test_utils.c | 18 ++++--------
3 files changed, 35 insertions(+), 14 deletions(-)
diff --git a/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h b/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
index 6ee9e124a1a4..a5cb034f7692 100644
--- a/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
+++ b/tools/testing/selftests/liveupdate/lib/include/liveupdate_util.h
@@ -17,6 +17,8 @@ int luo_open_device(void);
int luo_create_session(int luo_fd, const char *name);
int luo_retrieve_session(int luo_fd, const char *name);
int luo_session_preserve_fd(int session_fd, int fd, int token);
+int luo_session_unpreserve_fd(int session_fd, int token);
+int luo_session_restore_fd(int session_fd, int token);
int luo_set_session_event(int session_fd, enum liveupdate_event event);
int luo_set_global_event(int luo_fd, enum liveupdate_event event);
diff --git a/tools/testing/selftests/liveupdate/lib/liveupdate_util.c b/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
index 26fd6a7763a2..96c6c1b65043 100644
--- a/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
+++ b/tools/testing/selftests/liveupdate/lib/liveupdate_util.c
@@ -38,7 +38,34 @@ int luo_session_preserve_fd(int session_fd, int fd, int token)
.token = token
};
- return ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg) < 0;
+ if (ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg) < 0)
+ return -errno;
+ return 0;
+}
+
+int luo_session_unpreserve_fd(int session_fd, int token)
+{
+ struct liveupdate_session_unpreserve_fd arg = {
+ .size = sizeof(arg),
+ .token = token
+ };
+
+ if (ioctl(session_fd, LIVEUPDATE_SESSION_UNPRESERVE_FD, &arg) < 0)
+ return -errno;
+ return 0;
+}
+
+int luo_session_restore_fd(int session_fd, int token)
+{
+ struct liveupdate_session_restore_fd arg = {
+ .size = sizeof(arg),
+ .token = token
+ };
+
+ if (ioctl(session_fd, LIVEUPDATE_SESSION_RESTORE_FD, &arg) < 0)
+ return -errno;
+ return arg.fd;
+
}
int luo_retrieve_session(int luo_fd, const char *name)
diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.c b/tools/testing/selftests/liveupdate/luo_test_utils.c
index 0f5bc7260ccc..b1f7b5c79c07 100644
--- a/tools/testing/selftests/liveupdate/luo_test_utils.c
+++ b/tools/testing/selftests/liveupdate/luo_test_utils.c
@@ -12,7 +12,6 @@
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
-#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <errno.h>
@@ -25,7 +24,6 @@
int create_and_preserve_memfd(int session_fd, int token, const char *data)
{
- struct liveupdate_session_preserve_fd arg = { .size = sizeof(arg) };
long page_size = sysconf(_SC_PAGE_SIZE);
void *map = MAP_FAILED;
int mfd = -1, ret = -1;
@@ -44,9 +42,7 @@ int create_and_preserve_memfd(int session_fd, int token, const char *data)
snprintf(map, page_size, "%s", data);
munmap(map, page_size);
- arg.fd = mfd;
- arg.token = token;
- if (ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg) < 0)
+ if (luo_session_preserve_fd(session_fd, mfd, token))
goto out;
ret = 0; /* Success */
@@ -61,15 +57,13 @@ int create_and_preserve_memfd(int session_fd, int token, const char *data)
int restore_and_verify_memfd(int session_fd, int token,
const char *expected_data)
{
- struct liveupdate_session_restore_fd arg = { .size = sizeof(arg) };
long page_size = sysconf(_SC_PAGE_SIZE);
void *map = MAP_FAILED;
int mfd = -1, ret = -1;
- arg.token = token;
- if (ioctl(session_fd, LIVEUPDATE_SESSION_RESTORE_FD, &arg) < 0)
- return -errno;
- mfd = arg.fd;
+ mfd = luo_session_restore_fd(session_fd, token);
+ if (mfd < 0)
+ return mfd;
map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, mfd, 0);
if (map == MAP_FAILED)
@@ -134,10 +128,8 @@ int restore_and_read_state(int luo_fd, int *stage)
void update_state_file(int session_fd, int next_stage)
{
char buf[32];
- struct liveupdate_session_unpreserve_fd arg = { .size = sizeof(arg) };
- arg.token = STATE_MEMFD_TOKEN;
- if (ioctl(session_fd, LIVEUPDATE_SESSION_UNPRESERVE_FD, &arg) < 0)
+ if (luo_session_unpreserve_fd(session_fd, STATE_MEMFD_TOKEN))
fail_exit("unpreserve failed");
snprintf(buf, sizeof(buf), "%d", next_stage);
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (3 preceding siblings ...)
2025-10-18 0:06 ` [RFC PATCH 04/21] selftests/liveupdate: Move LUO ioctls calls to liveupdate library Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-31 21:24 ` David Matlack
2025-10-31 22:28 ` David Matlack
2025-10-18 0:06 ` [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev Vipin Sharma
` (16 subsequent siblings)
21 siblings, 2 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Register a VFIO live update file handler with the Live Update Orchestrator.
Provide stub implementations of the handler callbacks.
Adding live update support in VFIO will enable a VFIO PCI device to work
uninterrupted while the host kernel is being updated through a kexec
reboot.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/Makefile | 1 +
drivers/vfio/pci/vfio_pci_core.c | 1 +
drivers/vfio/pci/vfio_pci_liveupdate.c | 44 ++++++++++++++++++++++++++
drivers/vfio/pci/vfio_pci_priv.h | 6 ++++
4 files changed, 52 insertions(+)
create mode 100644 drivers/vfio/pci/vfio_pci_liveupdate.c
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index cf00c0a7e55c..929df22c079b 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -2,6 +2,7 @@
vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV_KVM) += vfio_pci_zdev.o
+vfio-pci-core-$(CONFIG_LIVEUPDATE) += vfio_pci_liveupdate.o
obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
vfio-pci-y := vfio_pci.o
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 7dcf5439dedc..0894673a9262 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2568,6 +2568,7 @@ static void vfio_pci_core_cleanup(void)
static int __init vfio_pci_core_init(void)
{
/* Allocate shared config space permission data used by all devices */
+ vfio_pci_liveupdate_init();
return vfio_pci_init_perm_bits();
}
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
new file mode 100644
index 000000000000..088f7698a72c
--- /dev/null
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Liveupdate support for VFIO devices.
+ *
+ * Copyright (c) 2025, Google LLC.
+ * Vipin Sharma <vipinsh@google.com>
+ */
+
+#include <linux/liveupdate.h>
+#include <linux/errno.h>
+
+#include "vfio_pci_priv.h"
+
+static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
+ u64 data, struct file **file)
+{
+ return -EOPNOTSUPP;
+}
+
+static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
+ struct file *file)
+{
+ return -EOPNOTSUPP;
+}
+
+static const struct liveupdate_file_ops vfio_pci_luo_fops = {
+ .retrieve = vfio_pci_liveupdate_retrieve,
+ .can_preserve = vfio_pci_liveupdate_can_preserve,
+ .owner = THIS_MODULE,
+};
+
+static struct liveupdate_file_handler vfio_pci_luo_handler = {
+ .ops = &vfio_pci_luo_fops,
+ .compatible = "vfio-v1",
+};
+
+void __init vfio_pci_liveupdate_init(void)
+{
+ int err = liveupdate_register_file_handler(&vfio_pci_luo_handler);
+
+ if (err)
+ pr_err("VFIO PCI liveupdate file handler register failed, error %d.\n", err);
+}
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index a9972eacb293..7779fd744ff5 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -107,4 +107,10 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
}
+#ifdef CONFIG_LIVEUPDATE
+void vfio_pci_liveupdate_init(void);
+#else
+static inline void vfio_pci_liveupdate_init(void) { }
+#endif /* CONFIG_LIVEUPDATE */
+
#endif
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator
2025-10-18 0:06 ` [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator Vipin Sharma
@ 2025-10-31 21:24 ` David Matlack
2025-10-31 22:28 ` David Matlack
1 sibling, 0 replies; 57+ messages in thread
From: David Matlack @ 2025-10-31 21:24 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, jgg, graf, pratyush,
gregkh, chrisl, rppt, skhawaja, parav, saeedm, kevin.tian,
jrhilke, david, jgowans, dwmw2, epetron, junaids, linux-kernel,
linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 5:07 PM Vipin Sharma <vipinsh@google.com> wrote:
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> static int __init vfio_pci_core_init(void)
> {
> /* Allocate shared config space permission data used by all devices */
> + vfio_pci_liveupdate_init();
> return vfio_pci_init_perm_bits();
The call to vfio_pci_liveupdate_init() should go before the comment
associated with vfio_pci_init_perm_bits().
> diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
> +static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
> + struct file *file)
> +{
> + return -EOPNOTSUPP;
can_preserve() returns a bool, so this should be "return false". But I
think we can just do the cdev fops check in this commit. It is a small
enough change.
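To spell out why the return type matters here: in C, any nonzero value
converted to bool becomes true, so a bool function that returns
-EOPNOTSUPP reports "can preserve" for every file. A minimal userspace
sketch of the conversion follows; the demo_* names are hypothetical and
use <stdbool.h> rather than the kernel's bool.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the stub: a bool function returning a negative errno value. */
static bool demo_can_preserve_stub(void)
{
	return -EOPNOTSUPP;	/* nonzero, so this converts to true */
}

/* What a "not supported yet" stub presumably intends. */
static bool demo_can_preserve_fixed(void)
{
	return false;
}

/* The stub claims support; the fixed version does not. */
static void demo_check(void)
{
	assert(demo_can_preserve_stub() == true);
	assert(demo_can_preserve_fixed() == false);
}
```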
> +static struct liveupdate_file_handler vfio_pci_luo_handler = {
> + .ops = &vfio_pci_luo_fops,
> + .compatible = "vfio-v1",
This should probably be something like "vfio-pci-v1"?
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator
2025-10-18 0:06 ` [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator Vipin Sharma
2025-10-31 21:24 ` David Matlack
@ 2025-10-31 22:28 ` David Matlack
1 sibling, 0 replies; 57+ messages in thread
From: David Matlack @ 2025-10-31 22:28 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, pasha.tatashin, jgg, graf, pratyush, gregkh, chrisl,
rppt, skhawaja, parav, saeedm, kevin.tian, jrhilke, david,
jgowans, dwmw2, epetron, junaids, linux-kernel, linux-pci, kvm,
linux-kselftest, Alex Williamson
On Fri, Oct 17, 2025 at 5:07 PM Vipin Sharma <vipinsh@google.com> wrote:
> +static const struct liveupdate_file_ops vfio_pci_luo_fops = {
> + .retrieve = vfio_pci_liveupdate_retrieve,
> + .can_preserve = vfio_pci_liveupdate_can_preserve,
> + .owner = THIS_MODULE,
> +};
> +
> +static struct liveupdate_file_handler vfio_pci_luo_handler = {
> + .ops = &vfio_pci_luo_fops,
> + .compatible = "vfio-v1",
> +};
> +
> +void __init vfio_pci_liveupdate_init(void)
> +{
> + int err = liveupdate_register_file_handler(&vfio_pci_luo_handler);
> +
> + if (err)
> + pr_err("VFIO PCI liveupdate file handler register failed, error %d.\n", err);
> +}
Alex and Jason, should this go in the top-level VFIO directory? And
then have all the preservation logic go through vfio_device_ops? That
would make Live Update support for VFIO cdev files extensible to other
drivers in the future.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (4 preceding siblings ...)
2025-10-18 0:06 ` [RFC PATCH 05/21] vfio/pci: Register VFIO live update file handler to Live Update Orchestrator Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-27 20:44 ` Jacob Pan
2025-10-18 0:06 ` [RFC PATCH 07/21] vfio/pci: Store VFIO PCI device preservation data in KHO for live update Vipin Sharma
` (15 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Return true in can_preserve() callback of live update file handler, if
VFIO can preserve the passed VFIO cdev file. Return -EOPNOTSUPP from
prepare() callback for now to fail any attempt to preserve VFIO cdev in
live update.
The VFIO cdev opened check ensures that the file is actually used for
VFIO cdev and not for VFIO device FD which can be obtained from the VFIO
group.
Returning true from can_preserve() tells Live Update Orchestrator that
VFIO can try to preserve the given file during live update. Actual
preservation logic will be added in future patches, therefore, for now,
prepare call will fail.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 16 +++++++++++++++-
drivers/vfio/vfio_main.c | 3 ++-
include/linux/vfio.h | 2 ++
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 088f7698a72c..2ce2c11cb51c 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -8,10 +8,17 @@
*/
#include <linux/liveupdate.h>
+#include <linux/vfio.h>
#include <linux/errno.h>
#include "vfio_pci_priv.h"
+static int vfio_pci_liveupdate_prepare(struct liveupdate_file_handler *handler,
+ struct file *file, u64 *data)
+{
+ return -EOPNOTSUPP;
+}
+
static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
u64 data, struct file **file)
{
@@ -21,10 +28,17 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
struct file *file)
{
- return -EOPNOTSUPP;
+ struct vfio_device *device = vfio_device_from_file(file);
+
+ if (!device)
+ return false;
+
+ guard(mutex)(&device->dev_set->lock);
+ return vfio_device_cdev_opened(device);
}
static const struct liveupdate_file_ops vfio_pci_luo_fops = {
+ .prepare = vfio_pci_liveupdate_prepare,
.retrieve = vfio_pci_liveupdate_retrieve,
.can_preserve = vfio_pci_liveupdate_can_preserve,
.owner = THIS_MODULE,
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 38c8e9350a60..4cb47c1564f4 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1386,7 +1386,7 @@ const struct file_operations vfio_device_fops = {
#endif
};
-static struct vfio_device *vfio_device_from_file(struct file *file)
+struct vfio_device *vfio_device_from_file(struct file *file)
{
struct vfio_device_file *df = file->private_data;
@@ -1394,6 +1394,7 @@ static struct vfio_device *vfio_device_from_file(struct file *file)
return NULL;
return df->device;
}
+EXPORT_SYMBOL_GPL(vfio_device_from_file);
/**
* vfio_file_is_valid - True if the file is valid vfio file
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index eb563f538dee..2443d24aa237 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -385,4 +385,6 @@ int vfio_virqfd_enable(void *opaque, int (*handler)(void *, void *),
void vfio_virqfd_disable(struct virqfd **pvirqfd);
void vfio_virqfd_flush_thread(struct virqfd **pvirqfd);
+struct vfio_device *vfio_device_from_file(struct file *file);
+
#endif /* VFIO_H */
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-18 0:06 ` [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev Vipin Sharma
@ 2025-10-27 20:44 ` Jacob Pan
2025-10-28 13:28 ` Jason Gunthorpe
2025-10-30 23:10 ` David Matlack
0 siblings, 2 replies; 57+ messages in thread
From: Jacob Pan @ 2025-10-27 20:44 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Jacob Pan
On Fri, 17 Oct 2025 17:06:58 -0700
Vipin Sharma <vipinsh@google.com> wrote:
> Return true in can_preserve() callback of live update file handler, if
> VFIO can preserve the passed VFIO cdev file. Return -EOPNOTSUPP from
> prepare() callback for now to fail any attempt to preserve VFIO cdev
> in live update.
>
> The VFIO cdev opened check ensures that the file is actually used for
> VFIO cdev and not for VFIO device FD which can be obtained from the
> VFIO group.
>
> Returning true from can_preserve() tells Live Update Orchestrator that
> VFIO can try to preserve the given file during live update. Actual
> preservation logic will be added in future patches, therefore, for
> now, prepare call will fail.
>
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> ---
> drivers/vfio/pci/vfio_pci_liveupdate.c | 16 +++++++++++++++-
> drivers/vfio/vfio_main.c | 3 ++-
> include/linux/vfio.h | 2 ++
> 3 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c
> b/drivers/vfio/pci/vfio_pci_liveupdate.c index
> 088f7698a72c..2ce2c11cb51c 100644 ---
> a/drivers/vfio/pci/vfio_pci_liveupdate.c +++
> b/drivers/vfio/pci/vfio_pci_liveupdate.c @@ -8,10 +8,17 @@
> */
>
> #include <linux/liveupdate.h>
> +#include <linux/vfio.h>
> #include <linux/errno.h>
>
> #include "vfio_pci_priv.h"
>
> +static int vfio_pci_liveupdate_prepare(struct
> liveupdate_file_handler *handler,
> + struct file *file, u64 *data)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> static int vfio_pci_liveupdate_retrieve(struct
> liveupdate_file_handler *handler, u64 data, struct file **file)
> {
> @@ -21,10 +28,17 @@ static int vfio_pci_liveupdate_retrieve(struct
> liveupdate_file_handler *handler, static bool
> vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler
> *handler, struct file *file) {
> - return -EOPNOTSUPP;
> + struct vfio_device *device = vfio_device_from_file(file);
> +
> + if (!device)
> + return false;
> +
> + guard(mutex)(&device->dev_set->lock);
> + return vfio_device_cdev_opened(device);
IIUC, vfio_device_cdev_opened(device) will only return true after
vfio_df_ioctl_bind_iommufd(). Where it does:
device->cdev_opened = true;
Does this imply that devices not bound to an iommufd cannot be
preserved?
If so, I am confused about your cover letter step #15
> 15. It makes usual bind iommufd and attach page table calls.
Does it mean after restoration, we have to bind iommufd again?
I have a separate question regarding noiommu devices. I’m currently
working on adding noiommu mode support for VFIO cdev under iommufd.
From my understanding, these devices should naturally be included in
your patchset, provided that I ensure the noiommu cdev follows the same
open/bind process. Is that correct?
> }
>
> static const struct liveupdate_file_ops vfio_pci_luo_fops = {
> + .prepare = vfio_pci_liveupdate_prepare,
> .retrieve = vfio_pci_liveupdate_retrieve,
> .can_preserve = vfio_pci_liveupdate_can_preserve,
> .owner = THIS_MODULE,
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index 38c8e9350a60..4cb47c1564f4 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -1386,7 +1386,7 @@ const struct file_operations vfio_device_fops =
> { #endif
> };
>
> -static struct vfio_device *vfio_device_from_file(struct file *file)
> +struct vfio_device *vfio_device_from_file(struct file *file)
> {
> struct vfio_device_file *df = file->private_data;
>
> @@ -1394,6 +1394,7 @@ static struct vfio_device
> *vfio_device_from_file(struct file *file) return NULL;
> return df->device;
> }
> +EXPORT_SYMBOL_GPL(vfio_device_from_file);
>
> /**
> * vfio_file_is_valid - True if the file is valid vfio file
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index eb563f538dee..2443d24aa237 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -385,4 +385,6 @@ int vfio_virqfd_enable(void *opaque, int
> (*handler)(void *, void *), void vfio_virqfd_disable(struct virqfd
> **pvirqfd); void vfio_virqfd_flush_thread(struct virqfd **pvirqfd);
>
> +struct vfio_device *vfio_device_from_file(struct file *file);
> +
> #endif /* VFIO_H */
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-27 20:44 ` Jacob Pan
@ 2025-10-28 13:28 ` Jason Gunthorpe
2025-10-28 17:39 ` Jacob Pan
2025-10-30 23:10 ` David Matlack
1 sibling, 1 reply; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-28 13:28 UTC (permalink / raw)
To: Jacob Pan
Cc: Vipin Sharma, bhelgaas, alex.williamson, pasha.tatashin, dmatlack,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Mon, Oct 27, 2025 at 01:44:30PM -0700, Jacob Pan wrote:
> I have a separate question regarding noiommu devices. I’m currently
> working on adding noiommu mode support for VFIO cdev under iommufd.
Oh how is that going? I was just thinking about that again..
After writing the generic pt self test it occurred to me we now have
enough infrastructure for iommufd to internally create its own
iommu_domain with an AMDv1 page table for the noiommu devices. It would
then be so easy to feed that through the existing machinery and have
all the pinning/etc work.
Then only an ioctl to read back the physical addresses from this
special domain would be needed
It actually sort of feels pretty easy..
Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-28 13:28 ` Jason Gunthorpe
@ 2025-10-28 17:39 ` Jacob Pan
2025-10-29 16:21 ` Jason Gunthorpe
0 siblings, 1 reply; 57+ messages in thread
From: Jacob Pan @ 2025-10-28 17:39 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Vipin Sharma, bhelgaas, alex.williamson, pasha.tatashin, dmatlack,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Tue, 28 Oct 2025 10:28:55 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> On Mon, Oct 27, 2025 at 01:44:30PM -0700, Jacob Pan wrote:
> > I have a separate question regarding noiommu devices. I’m currently
> > working on adding noiommu mode support for VFIO cdev under iommufd.
> >
>
> Oh how is that going? I was just thinking about that again..
>
I initially tried to create a special VFIO no-iommu iommu_domain
without an iommu driver, but I found it difficult without iommu_group
and other machinery. I also had a special vfio_device_ops
vfio_pci_noiommu_ops with special vfio_iommufd_noiommu_bind to create
iommufd_access object as in Yi's original patch.
My current approach is that I have a special noiommu driver that handles
the special iommu_domain. It seems much cleaner, though with some extra code
overhead. I have a working prototype that has:
# tree /dev/vfio/
/dev/vfio/
|-- 7
|-- devices
| `-- noiommu-vfio0
`-- vfio
And the typical:
/sys/class/iommu/noiommu/
|-- devices
| |-- 0000:00:00.0 -> ../../../../pci0000:00/0000:00:00.0
| |-- 0000:00:01.0 -> ../../../../pci0000:00/0000:00:01.0
| |-- 0000:00:02.0 -> ../../../../pci0000:00/0000:00:02.0
| |-- 0000:00:03.0 -> ../../../../pci0000:00/0000:00:03.0
| |-- 0000:00:04.0 -> ../../../../pci0000:00/0000:00:04.0
| |-- 0000:00:05.0 -> ../../../../pci0000:00/0000:00:05.0
| |-- 0000:01:00.0 -> ../../../../pci0000:00/0000:00:04.0/0000:0
The following user test can pass:
1. __iommufd = open("/dev/iommu", O_RDWR);
2. devfd = open a noiommu cdev
3. ioas_id = ioas_alloc(__iommufd)
4. iommufd_bind(__iommufd, devfd)
5. successfully do an ioas map, e.g.
ioctl(iommufd, IOMMU_IOAS_MAP, &map)
This will call pfn_reader_user_pin() but the noiommu driver does
nothing for mapping.
I am still debugging some cases, would like to have a direction check
before going too far.
> After writing the generic pt self test it occurred to me we now have
> enough infrastructure for iommufd to internally create its own
> iommu_domain with an AMDv1 page table for the noiommu devices. It would
> then be so easy to feed that through the existing machinery and have
> all the pinning/etc work.
>
Could you elaborate a little more? noiommu devices don't have page
tables. Are you saying iommufd can create its own iommu_domain w/o a
vendor iommu driver? Let me catch up with your v7 :)
> Then only an ioctl to read back the physical addresses from this
> special domain would be needed
>
Yes, that was part of your original suggestion to avoid /proc pagemap.
I have not added that yet. Do you think this warrants a new ioctl or
just return it in
struct iommu_ioas_map map = {
.size = sizeof(map),
.flags = IOMMU_IOAS_MAP_READABLE,
.ioas_id = ioas_id,
.iova = iova,
.user_va = uvaddr,
.length = size,
};
> It actually sort of feels pretty easy..
>
> Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-28 17:39 ` Jacob Pan
@ 2025-10-29 16:21 ` Jason Gunthorpe
0 siblings, 0 replies; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-29 16:21 UTC (permalink / raw)
To: Jacob Pan
Cc: Vipin Sharma, bhelgaas, alex.williamson, pasha.tatashin, dmatlack,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Tue, Oct 28, 2025 at 10:39:45AM -0700, Jacob Pan wrote:
> My current approach is that I have a special noiommu driver that handles
> the special iommu_domain. It seems much cleaner, though with some extra code
> overhead. I have a working prototype that has:
Oh interesting, maybe that is OK and reasonable.. My first worry is
that we don't well support iommu driver hot unplug, but if it is very
carefully controlled I think we can make it safe. iommufd selftests is
already doing this and I've been trying to make sure it stays safe
without races or memory leaks..
Binding is going to also need some fiddling because we don't want to
mess with the fwspec on a real struct device..
But maybe we can have some kind of direct 'bind iommu driver to struct
device' call?
> The following user test can pass:
> 1. __iommufd = open("/dev/iommu", O_RDWR);
> 2. devfd = open a noiommu cdev
> 3. ioas_id = ioas_alloc(__iommufd)
> 4. iommufd_bind(__iommufd, devfd)
> 5. successfully do an ioas map, e.g.
> ioctl(iommufd, IOMMU_IOAS_MAP, &map)
> This will call pfn_reader_user_pin() but the noiommu driver does
> nothing for mapping.
Makes sense.
So you can't have a paging iommu_domain that doesn't have a map
function - that just won't work for iommufd. What you should do is use
the iommu pt stuff and have the noiommu driver implement its paging
domain using the amdv1 format.
That will give you map/unmap/iova_to_phys and then iommufd will
immediately fully work.
Look at how that series handles the selftest, the simple selftest
iommu_domain is very close to what you need. It is pretty small code
wise.
> > After writing the generic pt self test it occurred to me we now have
> > enough infrastructure for iommufd to internally create its own
> > iommu_domain with an AMDv1 page table for the noiommu devices. It would
> > then be so easy to feed that through the existing machinery and have
> > all the pinning/etc work.
>
> Could you elaborate a little more? noiommu devices don't have page
> tables. Are you saying iommufd can create its own iommu_domain w/o a
> vendor iommu driver? Let me catch up with your v7 :)
That was my suggestion, but it seems you tried that and decided it was
too hard with groups/etc. OK.
Adding a dummy iommu driver solves that and you still get to the same
place where there is a paging iommu domain that implements an actual
page table with map/unmap/iova_to_phys. From this perspective iommufd
will be entirely happy and will do all the required pinning and
unpinning.
> > Then only an ioctl to read back the physical addresses from this
> > special domain would be needed
>
> Yes, that was part of your original suggestion to avoid /proc pagemap.
> I have not added that yet. Do you think this warrants a new ioctl or
> just return it in
I think a new ioctl is probably the right idea..
Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-27 20:44 ` Jacob Pan
2025-10-28 13:28 ` Jason Gunthorpe
@ 2025-10-30 23:10 ` David Matlack
2025-10-31 0:18 ` Pasha Tatashin
1 sibling, 1 reply; 57+ messages in thread
From: David Matlack @ 2025-10-30 23:10 UTC (permalink / raw)
To: Jacob Pan
Cc: Vipin Sharma, bhelgaas, alex.williamson, pasha.tatashin, jgg,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-27 01:44 PM, Jacob Pan wrote:
> On Fri, 17 Oct 2025 17:06:58 -0700 Vipin Sharma <vipinsh@google.com> wrote:
> > static int vfio_pci_liveupdate_retrieve(struct
> > liveupdate_file_handler *handler, u64 data, struct file **file)
> > {
> > @@ -21,10 +28,17 @@ static int vfio_pci_liveupdate_retrieve(struct
> > liveupdate_file_handler *handler, static bool
> > vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler
> > *handler, struct file *file) {
> > - return -EOPNOTSUPP;
> > + struct vfio_device *device = vfio_device_from_file(file);
> > +
> > + if (!device)
> > + return false;
> > +
> > + guard(mutex)(&device->dev_set->lock);
> > + return vfio_device_cdev_opened(device);
>
> IIUC, vfio_device_cdev_opened(device) will only return true after
> vfio_df_ioctl_bind_iommufd(). Where it does:
> device->cdev_opened = true;
>
> Does this imply that devices not bound to an iommufd cannot be
> preserved?
Even if being bound to an iommufd is required, it seems wrong to check
it in can_preserve(), as the device can just be unbound from the iommufd
before preserve().
I think can_preserve() just needs to check if this is a VFIO cdev file,
i.e. vfio_device_from_file() returns non-NULL.
>
> If so, I am confused about your cover letter step #15
> > 15. It makes usual bind iommufd and attach page table calls.
>
> Does it mean after restoration, we have to bind iommufd again?
This is still being discussed. These are the two options currently:
- When userspace retrieves the iommufd from LUO after kexec, the kernel
will internally restore all VFIO cdevs and bind them to the iommufd
in a single step.
- Userspace will retrieve the iommufd and cdevs from LUO separately,
and then bind each cdev to the iommufd like they were before kexec.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-30 23:10 ` David Matlack
@ 2025-10-31 0:18 ` Pasha Tatashin
2025-10-31 21:41 ` David Matlack
0 siblings, 1 reply; 57+ messages in thread
From: Pasha Tatashin @ 2025-10-31 0:18 UTC (permalink / raw)
To: David Matlack
Cc: Jacob Pan, Vipin Sharma, bhelgaas, alex.williamson, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Thu, Oct 30, 2025 at 7:10 PM David Matlack <dmatlack@google.com> wrote:
>
> On 2025-10-27 01:44 PM, Jacob Pan wrote:
> > On Fri, 17 Oct 2025 17:06:58 -0700 Vipin Sharma <vipinsh@google.com> wrote:
> > > static int vfio_pci_liveupdate_retrieve(struct
> > > liveupdate_file_handler *handler, u64 data, struct file **file)
> > > {
> > > @@ -21,10 +28,17 @@ static int vfio_pci_liveupdate_retrieve(struct
> > > liveupdate_file_handler *handler, static bool
> > > vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler
> > > *handler, struct file *file) {
> > > - return -EOPNOTSUPP;
> > > + struct vfio_device *device = vfio_device_from_file(file);
> > > +
> > > + if (!device)
> > > + return false;
> > > +
> > > + guard(mutex)(&device->dev_set->lock);
> > > + return vfio_device_cdev_opened(device);
> >
> > IIUC, vfio_device_cdev_opened(device) will only return true after
> > vfio_df_ioctl_bind_iommufd(). Where it does:
> > device->cdev_opened = true;
> >
> > Does this imply that devices not bound to an iommufd cannot be
> > preserved?
>
> Even if being bound to an iommufd is required, it seems wrong to check
> it in can_preserve(), as the device can just be unbound from the iommufd
> before preserve().
>
> I think can_preserve() just needs to check if this is a VFIO cdev file,
> i.e. vfio_device_from_file() returns non-NULL.
+1, can_preserve() must be fast, as it might be called on every single
FD that is being preserved, to check if type is correct.
So, simply check whether the "struct file" is a cdev, perhaps via an ops
check, and that's it. It should be a very simple operation.
>
> >
> > If so, I am confused about your cover letter step #15
> > > 15. It makes usual bind iommufd and attach page table calls.
> >
> > Does it mean after restoration, we have to bind iommufd again?
>
> This is still being discussed. These are the two options currently:
>
> - When userspace retrieves the iommufd from LUO after kexec, the kernel
> will internally restore all VFIO cdevs and bind them to the iommufd
> in a single step.
>
> - Userspace will retrieve the iommufd and cdevs from LUO separately,
> and then bind each cdev to the iommufd like they were before kexec.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev
2025-10-31 0:18 ` Pasha Tatashin
@ 2025-10-31 21:41 ` David Matlack
0 siblings, 0 replies; 57+ messages in thread
From: David Matlack @ 2025-10-31 21:41 UTC (permalink / raw)
To: Pasha Tatashin
Cc: Jacob Pan, Vipin Sharma, bhelgaas, alex.williamson, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Thu, Oct 30, 2025 at 5:19 PM Pasha Tatashin
<pasha.tatashin@soleen.com> wrote:
> On Thu, Oct 30, 2025 at 7:10 PM David Matlack <dmatlack@google.com> wrote:
> > On 2025-10-27 01:44 PM, Jacob Pan wrote:
> > > On Fri, 17 Oct 2025 17:06:58 -0700 Vipin Sharma <vipinsh@google.com> wrote:
> > > > + guard(mutex)(&device->dev_set->lock);
> > > > + return vfio_device_cdev_opened(device);
> > >
> > > IIUC, vfio_device_cdev_opened(device) will only return true after
> > > vfio_df_ioctl_bind_iommufd(). Where it does:
> > > device->cdev_opened = true;
> > >
> > > Does this imply that devices not bound to an iommufd cannot be
> > > preserved?
> >
> > Even if being bound to an iommufd is required, it seems wrong to check
> > it in can_preserve(), as the device can just be unbound from the iommufd
> > before preserve().
> >
> > I think can_preserve() just needs to check if this is a VFIO cdev file,
> > i.e. vfio_device_from_file() returns non-NULL.
>
> +1, can_preserve() must be fast, as it might be called on every single
> FD that is being preserved, to check if type is correct.
> So, simply check whether the "struct file" is a cdev, perhaps via an ops
> check, and that's it. It should be a very simple operation.
Small correction, vfio_device_from_file() checks if file->fops are
&vfio_device_fops. But device files acquired via group FDs use the
same ops. So I think we actually need to check "device &&
!device->group" here to identify VFIO cdev files, and then check
device->ops == &vfio_pci_ops to make sure this is a vfio-pci device.
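The three-part check described above can be sketched in userspace as follows. This is only a model of the logic as worded in this message: every type and name prefixed with mock_ is a hypothetical stand-in, the field layout is an assumption, and nothing here is taken from the kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct mock_file_ops { int unused; };
struct mock_device_ops { int unused; };
struct mock_group { int unused; };

struct mock_vfio_device {
	const struct mock_device_ops *ops;
	struct mock_group *group;	/* non-NULL when reached through a group */
};

struct mock_file {
	const struct mock_file_ops *f_op;
	struct mock_vfio_device *device;
};

static const struct mock_file_ops mock_vfio_device_fops = { 0 };
static const struct mock_device_ops mock_vfio_pci_ops = { 0 };

/* Step 1: only files using the VFIO device fops map to a device. */
static struct mock_vfio_device *mock_device_from_file(const struct mock_file *file)
{
	if (file->f_op != &mock_vfio_device_fops)
		return NULL;
	return file->device;
}

/* Steps 2 and 3: reject group-derived files, then require vfio-pci ops. */
static bool mock_can_preserve(const struct mock_file *file)
{
	struct mock_vfio_device *device = mock_device_from_file(file);

	return device && !device->group && device->ops == &mock_vfio_pci_ops;
}

/* Builds one mock file from three yes/no properties and runs the check. */
static bool mock_check(bool vfio_fops, bool via_group, bool pci_ops)
{
	static const struct mock_file_ops other_fops = { 0 };
	static const struct mock_device_ops other_ops = { 0 };
	struct mock_group group = { 0 };
	struct mock_vfio_device device = {
		.ops = pci_ops ? &mock_vfio_pci_ops : &other_ops,
		.group = via_group ? &group : NULL,
	};
	struct mock_file file = {
		.f_op = vfio_fops ? &mock_vfio_device_fops : &other_fops,
		.device = &device,
	};

	return mock_can_preserve(&file);
}
```

Only the combination "VFIO fops, no group, vfio-pci ops" passes. The check takes no lock and does no allocation, which fits the request that can_preserve() stay cheap.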
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 07/21] vfio/pci: Store VFIO PCI device preservation data in KHO for live update
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (5 preceding siblings ...)
2025-10-18 0:06 ` [RFC PATCH 06/21] vfio/pci: Accept live update preservation request for VFIO cdev Vipin Sharma
@ 2025-10-18 0:06 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 08/21] vfio/pci: Retrieve preserved VFIO device for Live Update Orechestrator Vipin Sharma
` (14 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:06 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Create a struct to serialize VFIO PCI data and preserve it using KHO.
Provide physical address of the folio to Live Update Orchestrator (LUO)
in prepare() callback so that LUO can give it back after kexec.
Unpreserve and free the folio in cancel() callback.
Store the PCI BDF value in the serialized data. The BDF value is unique for
each device on a host and remains the same unless hardware or firmware is
changed.
Preserving BDF value allows VFIO to find the PCI device which LUO wants
to restore in retrieve() callback after kexec. In future patches, more
meaningful data will be serialized to actually preserve working of the
device.
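For reference, the 16-bit value stored in the bdf field can be modeled as below. This is a userspace sketch, assuming the conventional PCI layout of bus in bits 15:8, device (slot) in bits 7:3 and function in bits 2:0; the helper names are hypothetical. Note that a 16-bit BDF in this layout has no room for the PCI segment (domain) number.

```c
#include <stdint.h>

/* Pack bus, slot and function into a 16-bit BDF (hypothetical helper). */
static uint16_t bdf_pack(uint8_t bus, uint8_t slot, uint8_t func)
{
	return (uint16_t)((bus << 8) | ((slot & 0x1f) << 3) | (func & 0x07));
}

/* Bus number: upper 8 bits. */
static uint8_t bdf_bus(uint16_t bdf)
{
	return (uint8_t)(bdf >> 8);
}

/* Device (slot) number: 5 bits above the function. */
static uint8_t bdf_slot(uint16_t bdf)
{
	return (uint8_t)((bdf >> 3) & 0x1f);
}

/* Function number: lowest 3 bits. */
static uint8_t bdf_func(uint16_t bdf)
{
	return (uint8_t)(bdf & 0x07);
}
```

With this layout, a device at 01:00.0 packs to 0x0100 and a device at 00:04.0 packs to 0x0020; unpacking returns the original bus, slot and function.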
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 54 +++++++++++++++++++++++++-
1 file changed, 53 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 2ce2c11cb51c..3eb4895ce475 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -10,13 +10,64 @@
#include <linux/liveupdate.h>
#include <linux/vfio.h>
#include <linux/errno.h>
+#include <linux/kexec_handover.h>
#include "vfio_pci_priv.h"
+struct vfio_pci_core_device_ser {
+ u16 bdf;
+} __packed;
+
+static int vfio_pci_lu_serialize(struct vfio_pci_core_device *vdev,
+ struct vfio_pci_core_device_ser *ser)
+{
+ ser->bdf = pci_dev_id(vdev->pdev);
+ return 0;
+}
+
static int vfio_pci_liveupdate_prepare(struct liveupdate_file_handler *handler,
struct file *file, u64 *data)
{
- return -EOPNOTSUPP;
+ struct vfio_pci_core_device_ser *ser;
+ struct vfio_pci_core_device *vdev;
+ struct vfio_device *device;
+ struct folio *folio;
+ int err;
+
+ device = vfio_device_from_file(file);
+ vdev = container_of(device, struct vfio_pci_core_device, vdev);
+
+ folio = folio_alloc(GFP_KERNEL | __GFP_ZERO, get_order(sizeof(*ser)));
+ if (!folio)
+ return -ENOMEM;
+
+ ser = folio_address(folio);
+
+ err = vfio_pci_lu_serialize(vdev, ser);
+ if (err)
+ goto err_free_folio;
+
+ err = kho_preserve_folio(folio);
+ if (err)
+ goto err_free_folio;
+
+ *data = virt_to_phys(ser);
+
+ return 0;
+
+err_free_folio:
+ folio_put(folio);
+ return err;
+}
+
+static void vfio_pci_liveupdate_cancel(struct liveupdate_file_handler *handler,
+ struct file *file, u64 data)
+{
+ struct vfio_pci_core_device_ser *ser = phys_to_virt(data);
+ struct folio *folio = virt_to_folio(ser);
+
+ WARN_ON_ONCE(kho_unpreserve_folio(folio));
+ folio_put(folio);
}
static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
@@ -39,6 +90,7 @@ static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *han
static const struct liveupdate_file_ops vfio_pci_luo_fops = {
.prepare = vfio_pci_liveupdate_prepare,
+ .cancel = vfio_pci_liveupdate_cancel,
.retrieve = vfio_pci_liveupdate_retrieve,
.can_preserve = vfio_pci_liveupdate_can_preserve,
.owner = THIS_MODULE,
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 08/21] vfio/pci: Retrieve preserved VFIO device for Live Update Orechestrator
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (6 preceding siblings ...)
2025-10-18 0:06 ` [RFC PATCH 07/21] vfio/pci: Store VFIO PCI device preservation data in KHO for live update Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-31 23:12 ` David Matlack
2025-10-18 0:07 ` [RFC PATCH 09/21] vfio/pci: Add Live Update finish callback implementation Vipin Sharma
` (13 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Retrieve VFIO device in the retrieve() callback of the LUO file handler.
Deserialize the KHO data and search in the VFIO cdev class for device
matching the BDF. Export needed functions from core VFIO module to
others.
Create an anonymous inode and file struct for the device. This is similar
to how a VFIO group returns a VFIO device FD. It differs from VFIO
cdev, where the cdev device is connected to an inode and file on devtmpfs.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 67 +++++++++++++++++++++++++-
drivers/vfio/vfio_main.c | 17 +++++++
include/linux/vfio.h | 6 +++
3 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 3eb4895ce475..cb3ff097afbf 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -10,7 +10,9 @@
#include <linux/liveupdate.h>
#include <linux/vfio.h>
#include <linux/errno.h>
+#include <linux/anon_inodes.h>
#include <linux/kexec_handover.h>
+#include <linux/file.h>
#include "vfio_pci_priv.h"
@@ -70,10 +72,73 @@ static void vfio_pci_liveupdate_cancel(struct liveupdate_file_handler *handler,
folio_put(folio);
}
+static int match_bdf(struct device *device, const void *bdf)
+{
+ struct vfio_device *core_vdev =
+ container_of(device, struct vfio_device, device);
+ struct vfio_pci_core_device *vdev =
+ container_of(core_vdev, struct vfio_pci_core_device, vdev);
+
+ return *(u16 *)bdf == pci_dev_id(vdev->pdev);
+}
+
static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
u64 data, struct file **file)
{
- return -EOPNOTSUPP;
+ struct vfio_pci_core_device_ser *ser;
+ struct vfio_device_file *df;
+ struct vfio_device *device;
+ struct folio *folio;
+ struct file *filep;
+ int err;
+
+ folio = kho_restore_folio(data);
+ if (!folio)
+ return -ENOENT;
+
+ ser = folio_address(folio);
+ device = vfio_find_device_in_cdev_class(&ser->bdf, match_bdf);
+ if (!device)
+ return -ENODEV;
+
+ df = vfio_allocate_device_file(device);
+ if (IS_ERR(df)) {
+ err = PTR_ERR(df);
+ goto err_vfio_device_file;
+ }
+
+ filep = anon_inode_getfile_fmode("[vfio-cdev]", &vfio_device_fops, df,
+ O_RDWR, FMODE_PREAD | FMODE_PWRITE);
+ if (IS_ERR(filep)) {
+ err = PTR_ERR(filep);
+ goto err_anon_inode;
+ }
+
+ /* Paired with the put in vfio_device_fops_release() */
+ if (!vfio_device_try_get_registration(device)) {
+ err = -ENODEV;
+ goto err_get_registration;
+ }
+
+ put_device(&device->device);
+
+ /*
+ * Use the pseudo fs inode on the device to link all mmaps
+ * to the same address space, allowing us to unmap all vmas
+ * associated to this device using unmap_mapping_range().
+ */
+ filep->f_mapping = device->inode->i_mapping;
+ *file = filep;
+
+ return 0;
+
+err_get_registration:
+ fput(filep);
+err_anon_inode:
+ kfree(df);
+err_vfio_device_file:
+ put_device(&device->device);
+ return err;
}
static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *handler,
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 4cb47c1564f4..90ecb3544f79 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -13,6 +13,7 @@
#include <linux/cdev.h>
#include <linux/compat.h>
#include <linux/device.h>
+#include <linux/device/class.h>
#include <linux/fs.h>
#include <linux/idr.h>
#include <linux/iommu.h>
@@ -177,6 +178,7 @@ bool vfio_device_try_get_registration(struct vfio_device *device)
{
return refcount_inc_not_zero(&device->refcount);
}
+EXPORT_SYMBOL_GPL(vfio_device_try_get_registration);
/*
* VFIO driver API
@@ -502,6 +504,7 @@ vfio_allocate_device_file(struct vfio_device *device)
return df;
}
+EXPORT_SYMBOL_GPL(vfio_allocate_device_file);
static int vfio_df_device_first_open(struct vfio_device_file *df)
{
@@ -1385,6 +1388,7 @@ const struct file_operations vfio_device_fops = {
.show_fdinfo = vfio_device_show_fdinfo,
#endif
};
+EXPORT_SYMBOL_GPL(vfio_device_fops);
struct vfio_device *vfio_device_from_file(struct file *file)
{
@@ -1716,6 +1720,19 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
}
EXPORT_SYMBOL(vfio_dma_rw);
+struct vfio_device *vfio_find_device_in_cdev_class(const void *data,
+ device_match_t match)
+{
+ struct device *device = class_find_device(vfio.device_class, NULL, data,
+ match);
+
+ if (!device)
+ return NULL;
+
+ return container_of(device, struct vfio_device, device);
+}
+EXPORT_SYMBOL_GPL(vfio_find_device_in_cdev_class);
+
/*
* Module/class support
*/
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 2443d24aa237..f98802facb24 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -386,5 +386,11 @@ void vfio_virqfd_disable(struct virqfd **pvirqfd);
void vfio_virqfd_flush_thread(struct virqfd **pvirqfd);
struct vfio_device *vfio_device_from_file(struct file *file);
+struct vfio_device *vfio_find_device_in_cdev_class(const void *data,
+ device_match_t match);
+bool vfio_device_try_get_registration(struct vfio_device *device);
+struct vfio_device_file *vfio_allocate_device_file(struct vfio_device *device);
+
+extern const struct file_operations vfio_device_fops;
#endif /* VFIO_H */
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 08/21] vfio/pci: Retrieve preserved VFIO device for Live Update Orechestrator
2025-10-18 0:07 ` [RFC PATCH 08/21] vfio/pci: Retrieve preserved VFIO device for Live Update Orechestrator Vipin Sharma
@ 2025-10-31 23:12 ` David Matlack
0 siblings, 0 replies; 57+ messages in thread
From: David Matlack @ 2025-10-31 23:12 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, jgg, graf, pratyush,
gregkh, chrisl, rppt, skhawaja, parav, saeedm, kevin.tian,
jrhilke, david, jgowans, dwmw2, epetron, junaids, linux-kernel,
linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 5:07 PM Vipin Sharma <vipinsh@google.com> wrote:
> static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
> u64 data, struct file **file)
> {
...
> + filep = anon_inode_getfile_fmode("[vfio-cdev]", &vfio_device_fops, df,
> + O_RDWR, FMODE_PREAD | FMODE_PWRITE);
It's a little weird that we have to use an anonymous inode when
restoring cdev file descriptors. Do we not care about the association
between VFIO cdev files and their inodes?
If we wanted to have the cdev inode we could have the user pass a file
path to ioctl(LIVEUPDATE_SESSION_RESTORE_FD)? File handlers can use
that to find the inode to use when creating a struct file. This would
avoid the anonymous inode and also ensure that restoring the fd obeys
the same filesystem permissions as opening a new fd (I think?).
Pasha this would be a uAPI change to LUO. What do you think?
Sami, Jason, what are you planning to do for iommufd?
> + if (IS_ERR(filep)) {
> + err = PTR_ERR(filep);
> + goto err_anon_inode;
> + }
> +
> + /* Paired with the put in vfio_device_fops_release() */
> + if (!vfio_device_try_get_registration(device)) {
> + err = -ENODEV;
> + goto err_get_registration;
> + }
> +
> + put_device(&device->device);
> +
> + /*
> + * Use the pseudo fs inode on the device to link all mmaps
> + * to the same address space, allowing us to unmap all vmas
> + * associated to this device using unmap_mapping_range().
> + */
> + filep->f_mapping = device->inode->i_mapping;
Most of this code already exists in vfio_device_fops_cdev_open(). I'll
work on sharing the code in the next version.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 09/21] vfio/pci: Add Live Update finish callback implementation
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (7 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 08/21] vfio/pci: Retrieve preserved VFIO device for Live Update Orechestrator Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 10/21] PCI: Add option to skip Bus Master Enable reset during kexec Vipin Sharma
` (12 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Add a finish() callback implementation to the LUO file handler to free the
restored folio. Reset the VFIO device if it is not reclaimed by userspace.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 33 ++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index cb3ff097afbf..8e0ee01127b3 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -82,6 +82,38 @@ static int match_bdf(struct device *device, const void *bdf)
return *(u16 *)bdf == pci_dev_id(vdev->pdev);
}
+static void vfio_pci_liveupdate_finish(struct liveupdate_file_handler *handler,
+ struct file *file, u64 data, bool reclaimed)
+{
+ struct vfio_pci_core_device_ser *ser;
+ struct vfio_pci_core_device *vdev;
+ struct vfio_device *device;
+ struct folio *folio;
+
+ if (reclaimed) {
+ folio = virt_to_folio(phys_to_virt(data));
+ goto out_folio_put;
+ } else {
+ folio = kho_restore_folio(data);
+ }
+
+ if (!folio)
+ return;
+
+ ser = folio_address(folio);
+
+ device = vfio_find_device_in_cdev_class(&ser->bdf, match_bdf);
+ if (!device)
+ goto out_folio_put;
+
+ vdev = container_of(device, struct vfio_pci_core_device, vdev);
+ pci_try_reset_function(vdev->pdev);
+ put_device(&device->device);
+
+out_folio_put:
+ folio_put(folio);
+}
+
static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
u64 data, struct file **file)
{
@@ -156,6 +188,7 @@ static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *han
static const struct liveupdate_file_ops vfio_pci_luo_fops = {
.prepare = vfio_pci_liveupdate_prepare,
.cancel = vfio_pci_liveupdate_cancel,
+ .finish = vfio_pci_liveupdate_finish,
.retrieve = vfio_pci_liveupdate_retrieve,
.can_preserve = vfio_pci_liveupdate_can_preserve,
.owner = THIS_MODULE,
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 10/21] PCI: Add option to skip Bus Master Enable reset during kexec
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (8 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 09/21] vfio/pci: Add Live Update finish callback implementation Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device " Vipin Sharma
` (11 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Add bit field 'skip_kexec_clear_master' to struct pci_dev{}. When it is
set, skip clearing the Bus Master Enable bit on the PCI device during
kexec reboot.
Devices preserved using live update might be performing a DMA
transaction during kexec. Not clearing this bit allows a device to
continue DMA while live update is in progress.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/pci/pci-driver.c | 6 ++++--
include/linux/pci.h | 2 ++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 302d61783f6c..6aab358dc27a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -513,11 +513,13 @@ static void pci_device_shutdown(struct device *dev)
/*
* If this is a kexec reboot, turn off Bus Master bit on the
* device to tell it to not continue to do DMA. Don't touch
- * devices in D3cold or unknown states.
+ * devices in D3cold or unknown states. Don't clear the bit
+ * if device has explicitly asked to skip it.
* If it is not a kexec reboot, firmware will hit the PCI
* devices with big hammer and stop their DMA any way.
*/
- if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
+ if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot) &&
+ !pci_dev->skip_kexec_clear_master)
pci_clear_master(pci_dev);
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e..8ce2d4528193 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -400,6 +400,8 @@ struct pci_dev {
decoding during BAR sizing */
unsigned int wakeup_prepared:1;
unsigned int skip_bus_pm:1; /* Internal: Skip bus-level PM */
+ unsigned int skip_kexec_clear_master:1; /* Don't clear the Bus Master
+ Enable bit on kexec reboot */
unsigned int ignore_hotplug:1; /* Ignore hotplug events */
unsigned int hotplug_user_indicators:1; /* SlotCtl indicators
controlled exclusively by
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device during kexec
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (9 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 10/21] PCI: Add option to skip Bus Master Enable reset during kexec Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 7:09 ` Lukas Wunner
2025-10-18 0:07 ` [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device Vipin Sharma
` (10 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Set skip_kexec_clear_master on live update prepare() so that the device
participating in live update can continue to perform DMA during kexec
phase.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 8e0ee01127b3..789b52665e35 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -54,6 +54,7 @@ static int vfio_pci_liveupdate_prepare(struct liveupdate_file_handler *handler,
goto err_free_folio;
*data = virt_to_phys(ser);
+ vdev->pdev->skip_kexec_clear_master = true;
return 0;
@@ -67,7 +68,12 @@ static void vfio_pci_liveupdate_cancel(struct liveupdate_file_handler *handler,
{
struct vfio_pci_core_device_ser *ser = phys_to_virt(data);
struct folio *folio = virt_to_folio(ser);
+ struct vfio_pci_core_device *vdev;
+ struct vfio_device *device;
+ device = vfio_device_from_file(file);
+ vdev = container_of(device, struct vfio_pci_core_device, vdev);
+ vdev->pdev->skip_kexec_clear_master = false;
WARN_ON_ONCE(kho_unpreserve_folio(folio));
folio_put(folio);
}
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device during kexec
2025-10-18 0:07 ` [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device " Vipin Sharma
@ 2025-10-18 7:09 ` Lukas Wunner
2025-10-18 22:19 ` Vipin Sharma
0 siblings, 1 reply; 57+ messages in thread
From: Lukas Wunner @ 2025-10-18 7:09 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 05:07:03PM -0700, Vipin Sharma wrote:
> Set skip_kexec_clear_master on live update prepare() so that the device
> participating in live update can continue to perform DMA during kexec
> phase.
Instead of introducing the skip_kexec_clear_master flag,
could you introduce a function to check whether a device
participates in live update and call that in pci_device_shutdown()?
I think that would be cleaner. Otherwise someone reading
the code has to chase down the meaning of skip_kexec_clear_master,
i.e. search for places where the bit is set.
When the device is unbound from vfio-pci, don't you have to
clear the skip_kexec_clear_master flag? I'm not seeing this
in your patches but maybe I'm missing something. That problem
would solve itself if you follow the suggestion above.
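A minimal user-space sketch of the suggestion above, not kernel code.
The struct names, the 'lu' pointer and pci_dev_liveupdate_preserved()
are hypothetical stand-ins; only the shape of the shutdown condition is
taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum pci_power_state { PCI_D0, PCI_D1, PCI_D2, PCI_D3hot, PCI_D3cold, PCI_UNKNOWN };

struct liveupdate_state {		/* hypothetical per-device state */
	bool preserved;
};

struct pci_dev_sketch {
	enum pci_power_state current_state;
	struct liveupdate_state *lu;	/* NULL when not participating */
};

/* Hypothetical helper: the single place that answers "is this device preserved?" */
static bool pci_dev_liveupdate_preserved(const struct pci_dev_sketch *pdev)
{
	return pdev->lu && pdev->lu->preserved;
}

/* Mirrors the condition in pci_device_shutdown(), with the flag replaced by the helper. */
static bool should_clear_master(const struct pci_dev_sketch *pdev, bool kexec_in_progress)
{
	assert(pdev != NULL);
	return kexec_in_progress &&
	       pdev->current_state <= PCI_D3hot &&
	       !pci_dev_liveupdate_preserved(pdev);
}
```

With this shape the state lives with the live update participation
itself, so there is no separate bit to clear on unbind.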
Thanks,
Lukas
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device during kexec
2025-10-18 7:09 ` Lukas Wunner
@ 2025-10-18 22:19 ` Vipin Sharma
0 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 22:19 UTC (permalink / raw)
To: Lukas Wunner
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-18 09:09:06, Lukas Wunner wrote:
> On Fri, Oct 17, 2025 at 05:07:03PM -0700, Vipin Sharma wrote:
> > Set skip_kexec_clear_master on live update prepare() so that the device
> > participating in live update can continue to perform DMA during kexec
> > phase.
>
> Instead of introducing the skip_kexec_clear_master flag,
> could you introduce a function to check whether a device
> participates in live update and call that in pci_device_shutdown()?
>
> I think that would be cleaner. Otherwise someone reading
> the code has to chase down the meaning of skip_kexec_clear_master,
> i.e. search for places where the bit is set.
That is one way to do it. In our internal implementation we have an API
which checks for the device participation in the live update, similar to
what you have suggested.
The PCI series posted by Chris [1] provides a different way to know
whether a device participates in live update. There, pci_dev has a new
struct which contains the participation information.
In this VFIO series, my intention is to make minimal changes to PCI or
any other subsystem. I opted for a simple variable to check what device
should do during kexec reboot.
My hunch is that we will end up needing some state information in the
struct pci_dev{} which denotes device participation and whatever that
ends up being, we can use that here.
[1] https://lore.kernel.org/linux-pci/20250916-luo-pci-v2-0-c494053c3c08@kernel.org/
>
> When the device is unbound from vfio-pci, don't you have to
> clear the skip_kexec_clear_master flag? I'm not seeing this
> in your patches but maybe I'm missing something. That problem
> would solve itself if you follow the suggestion above.
The VFIO subsystem blocks removal from vfio-pci while there is still a
reference to the device (references are taken and dropped when the
device is opened and closed; see vfio_unregister_group_dev()). LUO also
does an fget() on the VFIO FD, so the release callback for the VFIO FD
does not run until LUO drops that reference, even if userspace closes
its own file descriptor.
So, prior to kexec, LUO drops its reference only if live update is
cancelled, and that is when this patch series resets the flag.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (10 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live update device " Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-20 21:29 ` David Matlack
2025-10-18 0:07 ` [RFC PATCH 13/21] vfio/pci: Preserve VFIO PCI config space through live update Vipin Sharma
` (9 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Store the restored serialized data in struct vfio_pci_core_device{}.
Skip clearing the bus master bit on the restored VFIO devices when
opened for the first time after live update reboot.
In the live update finish, clean up the pointer to the restored KHO
data. Warn if the device open count is 0, which indicates that userspace
might not have opened and restored the device.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_core.c | 8 ++++++--
drivers/vfio/pci/vfio_pci_liveupdate.c | 19 ++++++++++++++-----
include/linux/vfio_pci_core.h | 1 +
3 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 0894673a9262..29236b015242 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -475,8 +475,12 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
return ret;
}
- /* Don't allow our initial saved state to include busmaster */
- pci_clear_master(pdev);
+ /*
+ * Don't allow our initial saved state to include busmaster. However, if
+ * device is participating in liveupdate then don't change this bit.
+ */
+ if (!vdev->liveupdate_restore)
+ pci_clear_master(pdev);
ret = pci_enable_device(pdev);
if (ret)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 789b52665e35..6cc94d9a0386 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -96,12 +96,10 @@ static void vfio_pci_liveupdate_finish(struct liveupdate_file_handler *handler,
struct vfio_device *device;
struct folio *folio;
- if (reclaimed) {
+ if (reclaimed)
folio = virt_to_folio(phys_to_virt(data));
- goto out_folio_put;
- } else {
+ else
folio = kho_restore_folio(data);
- }
if (!folio)
return;
@@ -113,7 +111,14 @@ static void vfio_pci_liveupdate_finish(struct liveupdate_file_handler *handler,
goto out_folio_put;
vdev = container_of(device, struct vfio_pci_core_device, vdev);
- pci_try_reset_function(vdev->pdev);
+ if (reclaimed) {
+ guard(mutex)(&device->dev_set->lock);
+ if (!vfio_device_cdev_opened(device))
+ pci_err(vdev->pdev, "Open count is 0, userspace might not have restored the device.\n");
+ vdev->liveupdate_restore = NULL;
+ } else {
+ pci_try_reset_function(vdev->pdev);
+ }
put_device(&device->device);
out_folio_put:
@@ -124,6 +129,7 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
u64 data, struct file **file)
{
struct vfio_pci_core_device_ser *ser;
+ struct vfio_pci_core_device *vdev;
struct vfio_device_file *df;
struct vfio_device *device;
struct folio *folio;
@@ -167,6 +173,9 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
*/
filep->f_mapping = device->inode->i_mapping;
*file = filep;
+ vdev = container_of(device, struct vfio_pci_core_device, vdev);
+ guard(mutex)(&device->dev_set->lock);
+ vdev->liveupdate_restore = ser;
return 0;
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index f541044e42a2..8c3fe2db7eb3 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -94,6 +94,7 @@ struct vfio_pci_core_device {
struct vfio_pci_core_device *sriov_pf_core_dev;
struct notifier_block nb;
struct rw_semaphore memory_lock;
+ void *liveupdate_restore;
};
/* Will be exported for vfio pci drivers usage */
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device
2025-10-18 0:07 ` [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device Vipin Sharma
@ 2025-10-20 21:29 ` David Matlack
2025-10-20 22:39 ` Vipin Sharma
0 siblings, 1 reply; 57+ messages in thread
From: David Matlack @ 2025-10-20 21:29 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, jgg, graf, pratyush,
gregkh, chrisl, rppt, skhawaja, parav, saeedm, kevin.tian,
jrhilke, david, jgowans, dwmw2, epetron, junaids, linux-kernel,
linux-pci, kvm, linux-kselftest
On 2025-10-17 05:07 PM, Vipin Sharma wrote:
> @@ -167,6 +173,9 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
> */
> filep->f_mapping = device->inode->i_mapping;
> *file = filep;
> + vdev = container_of(device, struct vfio_pci_core_device, vdev);
> + guard(mutex)(&device->dev_set->lock);
> + vdev->liveupdate_restore = ser;
FYI, this causes a build failure for me:
drivers/vfio/pci/vfio_pci_liveupdate.c:381:3: error: cannot jump from this goto statement to its label
381 | goto err_get_registration;
| ^
drivers/vfio/pci/vfio_pci_liveupdate.c:394:2: note: jump bypasses initialization of variable with __attribute__((cleanup))
394 | guard(mutex)(&device->dev_set->lock);
| ^
It seems you cannot jump past a guard(). Replacing the guard with
lock/unlock fixes it, and so does putting the guard into its own inner
statement.
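A small user-space sketch of the "own inner statement" variant, assuming
a compiler that supports __attribute__((cleanup)) (gcc and clang do).
fake_mutex, FAKE_GUARD() and retrieve_sketch() are hypothetical
stand-ins, not the kernel's guard() implementation:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a mutex: it only counts lock depth. */
struct fake_mutex {
	int depth;
	int acquisitions;
};

static void fake_lock(struct fake_mutex *m)
{
	m->depth++;
	m->acquisitions++;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void fake_unlock(struct fake_mutex **m)
{
	(*m)->depth--;
}

/* Simplified guard: locks now, unlocks when the variable leaves scope. */
#define FAKE_GUARD(m) \
	struct fake_mutex *guard_ __attribute__((cleanup(fake_unlock))) = \
		(fake_lock(m), (m))

static int retrieve_sketch(struct fake_mutex *lock, int fail_early, int *state)
{
	int err;

	if (fail_early) {
		err = -ENODEV;
		goto err_out;	/* no guard is in scope between here and the label */
	}

	{
		/* The guard has its own block, so no goto jumps over its initialization. */
		FAKE_GUARD(lock);
		assert(lock->depth == 1);
		*state = 1;
	}
	return 0;

err_out:
	return err;
}
```

The lock is released at the closing brace of the inner block, before
the function returns, and the error label stays outside the guard's
scope.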
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device
2025-10-20 21:29 ` David Matlack
@ 2025-10-20 22:39 ` Vipin Sharma
0 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-20 22:39 UTC (permalink / raw)
To: David Matlack
Cc: bhelgaas, alex.williamson, pasha.tatashin, jgg, graf, pratyush,
gregkh, chrisl, rppt, skhawaja, parav, saeedm, kevin.tian,
jrhilke, david, jgowans, dwmw2, epetron, junaids, linux-kernel,
linux-pci, kvm, linux-kselftest
On 2025-10-20 21:29:47, David Matlack wrote:
> On 2025-10-17 05:07 PM, Vipin Sharma wrote:
>
> > @@ -167,6 +173,9 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
> > */
> > filep->f_mapping = device->inode->i_mapping;
> > *file = filep;
> > + vdev = container_of(device, struct vfio_pci_core_device, vdev);
> > + guard(mutex)(&device->dev_set->lock);
> > + vdev->liveupdate_restore = ser;
>
> FYI, this causes a build failure for me:
>
> drivers/vfio/pci/vfio_pci_liveupdate.c:381:3: error: cannot jump from this goto statement to its label
> 381 | goto err_get_registration;
> | ^
> drivers/vfio/pci/vfio_pci_liveupdate.c:394:2: note: jump bypasses initialization of variable with __attribute__((cleanup))
> 394 | guard(mutex)(&device->dev_set->lock);
> | ^
>
> It seems you cannot jump past a guard(). Replacing the guard with
> lock/unlock fixes it, and so does putting the guard into its own inner
> statement.
I didn't get this error in my builds. I used:
make -j$(nproc) bzImage
After your email, I tried with clang, using:
LLVM=1 make -j$(nproc) bzImage
This one indeed fails with the error you mentioned. Thanks for catching
it. I wonder why gcc is not complaining about it. Maybe I need to pass
some options to enable this build error on gcc.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 13/21] vfio/pci: Preserve VFIO PCI config space through live update
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (11 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 12/21] vfio/pci: Skip clearing bus master on live update restored device Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 14:59 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 14/21] vfio/pci: Skip device reset on live update restored device Vipin Sharma
` (8 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Save and restore vconfig, pci_config_map, and rbar members of the struct
vfio_pci_core_device{} during live update. Use the max size of PCI
config space i.e. 4096 bytes for storing vconfig and pci_config_map
irrespective of the exact size. Store the current config size which is
present in the struct pci_dev{} also, to know how much actual data is
present in the vconfig and the pci_config_map.
vconfig represents virtual PCI config used by VFIO to virtualize certain
bits of the config space in the PCI device. This should be preserved as
those virtualized bits cannot be retrieved from reading hardware.
pci_config_map is used to identify the starting point of a capability.
It does not strictly need to be preserved and can be recreated after
kexec, but saving it in KHO reduces the code change. Currently,
pci_config_map is populated in the same code where vconfig gets
initialized. If pci_config_map is not saved then a separate flow needs to
be added just for populating pci_config_map.
rbar is used to restore BARs after a reset. This value needs to be
preserved as reset will lose this information.
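As an illustration of the fixed-size layout described above, here is a
reduced user-space sketch. The struct and function names are
hypothetical, only vconfig is shown, and the upper-bound check on the
restore path is an extra assumption that the patch itself does not make:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define CFG_SPACE_MAX 4096	/* the 4096-byte maximum named above */

/* Hypothetical, reduced serialized form: always sized for the maximum. */
struct dev_ser_sketch {
	uint32_t cfg_size;	/* how many bytes of vconfig are valid */
	uint8_t vconfig[CFG_SPACE_MAX];
};

/* Hypothetical, reduced device state. */
struct dev_sketch {
	uint32_t cfg_size;
	uint8_t vconfig[CFG_SPACE_MAX];
};

static void serialize_config(const struct dev_sketch *dev,
			     struct dev_ser_sketch *ser)
{
	assert(dev->cfg_size <= CFG_SPACE_MAX);
	ser->cfg_size = dev->cfg_size;
	/* Copy only the bytes that are actually in use. */
	memcpy(ser->vconfig, dev->vconfig, ser->cfg_size);
}

static int deserialize_config(struct dev_sketch *dev,
			      const struct dev_ser_sketch *ser)
{
	/* Refuse data that does not match the device seen after kexec. */
	if (ser->cfg_size > CFG_SPACE_MAX || dev->cfg_size != ser->cfg_size)
		return -EINVAL;

	memcpy(dev->vconfig, ser->vconfig, ser->cfg_size);
	return 0;
}
```

Storing cfg_size next to the buffers lets the restore side reject a
mismatch instead of copying stale bytes.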
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_config.c | 17 ++++++++++++
drivers/vfio/pci/vfio_pci_liveupdate.c | 38 ++++++++++++++++++++++++++
drivers/vfio/pci/vfio_pci_priv.h | 5 ++++
3 files changed, 60 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index 8f02f236b5b4..36a71fc3d526 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -1756,6 +1756,23 @@ int vfio_config_init(struct vfio_pci_core_device *vdev)
vdev->pci_config_map = map;
vdev->vconfig = vconfig;
+ if (vdev->liveupdate_restore) {
+ ret = vfio_pci_liveupdate_restore_config(vdev);
+ if (ret)
+ goto out;
+ /*
+ * Liveupdate might have started after userspace writes to BARs
+ * but before VFIO sanitizes them which happens when BARs are
+ * read next time.
+ *
+ * Assume BARs are dirty so that VFIO will sanitize them
+ * unconditionally next time and avoid giving userspace wrong
+ * value.
+ */
+ vdev->bardirty = true;
+ return 0;
+ }
+
memset(map, PCI_CAP_ID_BASIC, PCI_STD_HEADER_SIZEOF);
memset(map + PCI_STD_HEADER_SIZEOF, PCI_CAP_ID_INVALID,
pdev->cfg_size - PCI_STD_HEADER_SIZEOF);
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 6cc94d9a0386..824dba2750fe 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -18,12 +18,43 @@
struct vfio_pci_core_device_ser {
u16 bdf;
+ u32 cfg_size;
+ u8 pci_config_map[PCI_CFG_SPACE_EXP_SIZE];
+ u8 vconfig[PCI_CFG_SPACE_EXP_SIZE];
+ u32 rbar[7];
} __packed;
+static int vfio_pci_liveupdate_deserialize_config(struct vfio_pci_core_device *vdev,
+ struct vfio_pci_core_device_ser *ser)
+{
+ struct pci_dev *pdev = vdev->pdev;
+
+ if (WARN_ON_ONCE(pdev->cfg_size != ser->cfg_size)) {
+ dev_err(&pdev->dev, "Config size in serialized (%d) not matching the one pci_dev (%d)",
+ ser->cfg_size, pdev->cfg_size);
+ return -EINVAL;
+ }
+
+ memcpy(vdev->pci_config_map, ser->pci_config_map, ser->cfg_size);
+ memcpy(vdev->vconfig, ser->vconfig, ser->cfg_size);
+ memcpy(vdev->rbar, ser->rbar, sizeof(vdev->rbar));
+ return 0;
+}
+
+static void vfio_pci_liveupdate_serialize_config(struct vfio_pci_core_device *vdev,
+ struct vfio_pci_core_device_ser *ser)
+{
+ ser->cfg_size = vdev->pdev->cfg_size;
+ memcpy(ser->pci_config_map, vdev->pci_config_map, ser->cfg_size);
+ memcpy(ser->vconfig, vdev->vconfig, ser->cfg_size);
+ memcpy(ser->rbar, vdev->rbar, sizeof(vdev->rbar));
+}
+
static int vfio_pci_lu_serialize(struct vfio_pci_core_device *vdev,
struct vfio_pci_core_device_ser *ser)
{
ser->bdf = pci_dev_id(vdev->pdev);
+ vfio_pci_liveupdate_serialize_config(vdev, ser);
return 0;
}
@@ -221,3 +252,10 @@ void __init vfio_pci_liveupdate_init(void)
if (err)
pr_err("VFIO PCI liveupdate file handler register failed, error %d.\n", err);
}
+
+int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
+{
+ struct vfio_pci_core_device_ser *ser = vdev->liveupdate_restore;
+
+ return vfio_pci_liveupdate_deserialize_config(vdev, ser);
+}
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index 7779fd744ff5..0d5aca6c2471 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -109,8 +109,13 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
#ifdef CONFIG_LIVEUPDATE
void vfio_pci_liveupdate_init(void);
+int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev);
#else
static inline void vfio_pci_liveupdate_init(void) { }
+int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
+{
+ return -EINVAL;
+}
#endif /* CONFIG_LIVEUPDATE */
#endif
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 13/21] vfio/pci: Preserve VFIO PCI config space through live update
2025-10-18 0:07 ` [RFC PATCH 13/21] vfio/pci: Preserve VFIO PCI config space through live update Vipin Sharma
@ 2025-10-18 14:59 ` Vipin Sharma
0 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 14:59 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-17 17:07:05, Vipin Sharma wrote:
> --- a/drivers/vfio/pci/vfio_pci_priv.h
> +++ b/drivers/vfio/pci/vfio_pci_priv.h
> @@ -109,8 +109,13 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
>
> #ifdef CONFIG_LIVEUPDATE
> void vfio_pci_liveupdate_init(void);
> +int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev);
> #else
> static inline void vfio_pci_liveupdate_init(void) { }
> +int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
This should be static inline
> +{
> + return -EINVAL;
> +}
> #endif /* CONFIG_LIVEUPDATE */
>
> #endif
> --
> 2.51.0.858.gf9c4a03a3a-goog
>
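For reference, a minimal user-space sketch of the stub pattern meant here. It assumes CONFIG_LIVEUPDATE is left undefined and treats the device struct as opaque. Without static inline, every file that includes the header emits its own external definition of the stub, and the link fails with duplicate symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Opaque for this sketch; the real definition lives in the VFIO headers. */
struct vfio_pci_core_device;

#ifdef CONFIG_LIVEUPDATE
/* Real implementation is provided by vfio_pci_liveupdate.c. */
int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev);
#else
/* Header-only stub: static inline keeps one private copy per including file. */
static inline int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
{
	(void)vdev;
	return -EINVAL;
}
#endif /* CONFIG_LIVEUPDATE */
```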
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 14/21] vfio/pci: Skip device reset on live update restored device.
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (12 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 13/21] vfio/pci: Preserve VFIO PCI config space through live update Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public Vipin Sharma
` (7 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Do not reset the device when a live update preserved VFIO PCI device is
opened for the first time after kexec.
Save 'reset_works' to the device serialized state. If not saved then
this value can only be restored by performing an actual reset, which is
not desired during live update. If a device can be reset before live
update then most likely it can be reset after live update unless some
reset methods have been removed. In that case when actual reset is tried
it will return an error.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_core.c | 15 ++++++++++-----
drivers/vfio/pci/vfio_pci_liveupdate.c | 9 +++++++++
drivers/vfio/pci/vfio_pci_priv.h | 2 ++
3 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 29236b015242..186a669b68a4 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -486,12 +486,17 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
if (ret)
goto out_power;
- /* If reset fails because of the device lock, fail this path entirely */
- ret = pci_try_reset_function(pdev);
- if (ret == -EAGAIN)
- goto out_disable_device;
+ if (vdev->liveupdate_restore) {
+ vfio_pci_liveupdate_restore_device(vdev);
+ } else {
+ /* If reset fails because of the device lock, fail this path entirely */
+ ret = pci_try_reset_function(pdev);
+ if (ret == -EAGAIN)
+ goto out_disable_device;
+
+ vdev->reset_works = !ret;
+ }
- vdev->reset_works = !ret;
pci_save_state(pdev);
vdev->pci_saved_state = pci_store_saved_state(pdev);
if (!vdev->pci_saved_state)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 824dba2750fe..82ff9f178fdc 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -22,6 +22,7 @@ struct vfio_pci_core_device_ser {
u8 pci_config_map[PCI_CFG_SPACE_EXP_SIZE];
u8 vconfig[PCI_CFG_SPACE_EXP_SIZE];
u32 rbar[7];
+ u8 reset_works;
} __packed;
static int vfio_pci_liveupdate_deserialize_config(struct vfio_pci_core_device *vdev,
@@ -55,6 +56,7 @@ static int vfio_pci_lu_serialize(struct vfio_pci_core_device *vdev,
{
ser->bdf = pci_dev_id(vdev->pdev);
vfio_pci_liveupdate_serialize_config(vdev, ser);
+ ser->reset_works = vdev->reset_works;
return 0;
}
@@ -259,3 +261,10 @@ int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
return vfio_pci_liveupdate_deserialize_config(vdev, ser);
}
+
+void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev)
+{
+ struct vfio_pci_core_device_ser *ser = vdev->liveupdate_restore;
+
+ vdev->reset_works = ser->reset_works;
+}
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index 0d5aca6c2471..ee1c7c229020 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -110,12 +110,14 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
#ifdef CONFIG_LIVEUPDATE
void vfio_pci_liveupdate_init(void);
int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev);
+void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev);
#else
static inline void vfio_pci_liveupdate_init(void) { }
int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
{
return -EINVAL;
}
+void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev) { }
#endif /* CONFIG_LIVEUPDATE */
#endif
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (13 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 14/21] vfio/pci: Skip device reset on live update restored device Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 7:17 ` Lukas Wunner
2025-10-18 0:07 ` [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device Vipin Sharma
` (6 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
linux/pci.h so that they are available to code outside of the PCI core.
These structs will be used in subsequent commits to serialize and
deserialize PCI state across Live Update.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/pci/pci.c | 5 -----
drivers/pci/pci.h | 7 -------
include/linux/pci.h | 13 +++++++++++++
3 files changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..b68bf3e820ce 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1884,11 +1884,6 @@ void pci_restore_state(struct pci_dev *dev)
}
EXPORT_SYMBOL(pci_restore_state);
-struct pci_saved_state {
- u32 config_space[16];
- struct pci_cap_saved_data cap[];
-};
-
/**
* pci_store_saved_state - Allocate and return an opaque struct containing
* the device saved state.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 09476a467cc0..973fcdf7898d 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -197,13 +197,6 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
int __pci_reset_bus(struct pci_bus *bus);
-struct pci_cap_saved_data {
- u16 cap_nr;
- bool cap_extended;
- unsigned int size;
- u32 data[];
-};
-
struct pci_cap_saved_state {
struct hlist_node next;
struct pci_cap_saved_data cap;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8ce2d4528193..70c9b12c8c02 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1448,6 +1448,19 @@ void pci_disable_rom(struct pci_dev *pdev);
void __iomem __must_check *pci_map_rom(struct pci_dev *pdev, size_t *size);
void pci_unmap_rom(struct pci_dev *pdev, void __iomem *rom);
+
+struct pci_cap_saved_data {
+ u16 cap_nr;
+ bool cap_extended;
+ unsigned int size;
+ u32 data[];
+};
+
+struct pci_saved_state {
+ u32 config_space[16];
+ struct pci_cap_saved_data cap[];
+};
+
/* Power management related routines */
int pci_save_state(struct pci_dev *dev);
void pci_restore_state(struct pci_dev *dev);
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-18 0:07 ` [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public Vipin Sharma
@ 2025-10-18 7:17 ` Lukas Wunner
2025-10-18 22:36 ` Vipin Sharma
0 siblings, 1 reply; 57+ messages in thread
From: Lukas Wunner @ 2025-10-18 7:17 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> linux/pci.h so that they are available to code outside of the PCI core.
>
> These structs will be used in subsequent commits to serialize and
> deserialize PCI state across Live Update.
That's not sufficient as a justification to make these public in my view.
There are already pci_store_saved_state() and pci_load_saved_state()
helpers to serialize PCI state. Why do you need anything more?
(Honest question.)
Thanks,
Lukas
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-18 7:17 ` Lukas Wunner
@ 2025-10-18 22:36 ` Vipin Sharma
2025-10-18 23:11 ` Jason Gunthorpe
2025-10-19 8:15 ` Lukas Wunner
0 siblings, 2 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 22:36 UTC (permalink / raw)
To: Lukas Wunner
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-18 09:17:33, Lukas Wunner wrote:
> On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> > Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> > linux/pci.h so that they are available to code outside of the PCI core.
> >
> > These structs will be used in subsequent commits to serialize and
> > deserialize PCI state across Live Update.
>
> That's not sufficient as a justification to make these public in my view.
>
> There are already pci_store_saved_state() and pci_load_saved_state()
> helpers to serialize PCI state. Why do you need anything more?
> (Honest question.)
>
In the LUO ecosystem, we currently do not have a solid solution for
proper serialization/deserialization of structs along with versioning
between different kernel versions. This work is still being discussed.
Here, I created separate structs (exactly the same as the original ones) to
have a little bit of control over what gets saved in the serialized state and
to make sure it is deserialized correctly after kexec.
For example, if I use the existing structs and do not create my own,
I cannot just do a blind memcpy() of the whole PCI state from before
kexec into the PCI state after kexec. The layout might have changed in
the new kernel, e.g. through the addition or removal of a field.
With __packed in my version of the struct, I can build validation such as
hardcoded member offsets. I can add a version number (not added in this
series) to the struct for checking compatibility during serialization and
deserialization. Overall, it provides some freedom in how data is passed
to the next kernel without changing or modifying the PCI state structs.
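A minimal sketch of that kind of validation, in user-space C. All names here (example_dev_ser, EXAMPLE_SER_VERSION, example_deserialize) are hypothetical and not part of this series; the point is only to show hardcoded offsets plus a version check guarding a blob copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_SER_VERSION 1u	/* hypothetical; no version field exists in this series */

/* Hypothetical serialized layout; the fields are illustrative only. */
struct example_dev_ser {
	uint32_t version;
	uint16_t bdf;
	uint8_t  reset_works;
	uint32_t rbar[7];
} __attribute__((packed));

/* Hardcoded offsets: an accidental layout change fails the build. */
static_assert(offsetof(struct example_dev_ser, version) == 0, "version moved");
static_assert(offsetof(struct example_dev_ser, bdf) == 4, "bdf moved");
static_assert(offsetof(struct example_dev_ser, reset_works) == 6, "reset_works moved");
static_assert(offsetof(struct example_dev_ser, rbar) == 7, "rbar moved");
static_assert(sizeof(struct example_dev_ser) == 35, "size changed");

/* Returns 0 on success, -1 if the blob has the wrong size or version. */
static int example_deserialize(const void *blob, size_t len,
			       struct example_dev_ser *out)
{
	if (len != sizeof(*out))
		return -1;
	memcpy(out, blob, sizeof(*out));
	if (out->version != EXAMPLE_SER_VERSION)
		return -1;
	return 0;
}
```

The compile-time checks catch layout drift within one kernel build, while the runtime version check is what would reject state handed over by an incompatible previous kernel.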
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-18 22:36 ` Vipin Sharma
@ 2025-10-18 23:11 ` Jason Gunthorpe
2025-10-20 23:49 ` Vipin Sharma
2025-10-19 8:15 ` Lukas Wunner
1 sibling, 1 reply; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-18 23:11 UTC (permalink / raw)
To: Vipin Sharma
Cc: Lukas Wunner, bhelgaas, alex.williamson, pasha.tatashin, dmatlack,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> Having __packed in my version of struct, I can build validation like
> hardcoded offset of members. I can add version number (not added in this
> series) for checking compatibility in the struct for serialization and
> deserialization. Overall, it is providing some freedom to how to pass
> data to next kernel without changing or modifying the PCI state
> structs.
I keep saying this, and this series really strongly shows why, we need
to have a dedicated header directory for LUO "ABI" structs. Putting
this random struct in some random header and then declaring it is part
of the luo ABI is really bad.
All the information in the abi headers needs to have detailed comments
explaining what it is and so on so people can evaluate if it is
suitable or not.
But, it is also not clear why pci serialization structs should leak
out of the PCI layer.
The design of luo was to allow each layer to contribute its own
tags/etc to the serialization so there is no reason to have vfio
piggyback on pci structs or something.
Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-18 23:11 ` Jason Gunthorpe
@ 2025-10-20 23:49 ` Vipin Sharma
2025-10-22 17:45 ` David Matlack
2025-10-22 17:53 ` Jason Gunthorpe
0 siblings, 2 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-20 23:49 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Lukas Wunner, bhelgaas, alex.williamson, pasha.tatashin, dmatlack,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-18 20:11:26, Jason Gunthorpe wrote:
> On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
>
> > Having __packed in my version of struct, I can build validation like
> > hardcoded offset of members. I can add version number (not added in this
> > series) for checking compatibility in the struct for serialization and
> > deserialization. Overall, it is providing some freedom to how to pass
> > data to next kernel without changing or modifying the PCI state
> > structs.
>
> I keep saying this, and this series really strongly shows why, we need
> to have a dedicated header directory for LUO "ABI" structs. Putting
> this random struct in some random header and then declaring it is part
> of the luo ABI is really bad.
Now that we have the PCI, IOMMU, and VFIO series out, what should be the
strategy for LUO "ABI" structs? I would like some more clarity on how
you are envisioning this.
Are you suggesting that each subsystem create a separate header file for
their serialization structs or we can have one common header file used
by all subsystems as dumping ground for their structs?
>
> All the information in the abi headers needs to have detailed comments
> explaining what it is and so on so people can evaluate if it is
> suitable or not.
I agree. I should have at least written comments in my *_ser structs on
why that particular field is there and what it is enabling. I will do
that in next version.
>
> But, it is also not clear why pci serialization structs should leak
> out of the PCI layer.
>
When a PCI device is opened for the first time, the VFIO driver asks PCI for
this state and saves it in the struct vfio_pci_core_device{.pci_saved_state}
field. It loads this value back into the PCI device after the last device FD
is closed.
The PCI layer will not have access to this value, as it can be changed once
VFIO has started using this device. Therefore, I thought this should be
saved.
Maybe the serialization and deserialization logic can be put in PCI, and
that way it can stay in PCI?
> The design of luo was to allow each layer to contribute its own
> tags/etc to the serialization so there is no reason to have vfio
> piggyback on pci structs or something.
>
> Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-20 23:49 ` Vipin Sharma
@ 2025-10-22 17:45 ` David Matlack
2025-10-22 17:51 ` Jason Gunthorpe
2025-10-22 17:53 ` Jason Gunthorpe
1 sibling, 1 reply; 57+ messages in thread
From: David Matlack @ 2025-10-22 17:45 UTC (permalink / raw)
To: Vipin Sharma
Cc: Jason Gunthorpe, Lukas Wunner, bhelgaas, alex.williamson,
pasha.tatashin, graf, pratyush, gregkh, chrisl, rppt, skhawaja,
parav, saeedm, kevin.tian, jrhilke, david, jgowans, dwmw2,
epetron, junaids, linux-kernel, linux-pci, kvm, linux-kselftest
On Mon, Oct 20, 2025 at 4:49 PM Vipin Sharma <vipinsh@google.com> wrote:
>
> On 2025-10-18 20:11:26, Jason Gunthorpe wrote:
> > On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> >
> > > Having __packed in my version of struct, I can build validation like
> > > hardcoded offset of members. I can add version number (not added in this
> > > > series) for checking compatibility in the struct for serialization and
> > > deserialization. Overall, it is providing some freedom to how to pass
> > > data to next kernel without changing or modifying the PCI state
> > > structs.
> >
> > I keep saying this, and this series really strongly shows why, we need
> > to have a dedicated header directory for LUO "ABI" structs. Putting
> > this random struct in some random header and then declaring it is part
> > of the luo ABI is really bad.
>
> Now that we have the PCI, IOMMU, and VFIO series out, what should be the
> strategy for LUO "ABI" structs? I would like some more clarity on how
> you are envisioning this.
>
> Are you suggesting that each subsystem create a separate header file for
> their serialization structs or we can have one common header file used
> by all subsystems as dumping ground for their structs?
I think we should have multiple header files in one directory, that
way we can assign separate MAINTAINERS for each file as needed.
Jason Miu proposed the first such header for KHO in
https://lore.kernel.org/lkml/CALzav=eqwTdzFhZLi_mWWXGuDBRwWQdBxQrzr4tN28ag8Zr_8Q@mail.gmail.com/.
Following that example we can add vfio_pci.h and pci.h to that
directory for VFIO and PCI ABI structs respectively.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-22 17:45 ` David Matlack
@ 2025-10-22 17:51 ` Jason Gunthorpe
0 siblings, 0 replies; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-22 17:51 UTC (permalink / raw)
To: David Matlack
Cc: Vipin Sharma, Lukas Wunner, bhelgaas, alex.williamson,
pasha.tatashin, graf, pratyush, gregkh, chrisl, rppt, skhawaja,
parav, saeedm, kevin.tian, jrhilke, david, jgowans, dwmw2,
epetron, junaids, linux-kernel, linux-pci, kvm, linux-kselftest
On Wed, Oct 22, 2025 at 10:45:31AM -0700, David Matlack wrote:
> On Mon, Oct 20, 2025 at 4:49 PM Vipin Sharma <vipinsh@google.com> wrote:
> >
> > On 2025-10-18 20:11:26, Jason Gunthorpe wrote:
> > > On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> > >
> > > > Having __packed in my version of struct, I can build validation like
> > > > hardcoded offset of members. I can add version number (not added in this
> > > > > series) for checking compatibility in the struct for serialization and
> > > > deserialization. Overall, it is providing some freedom to how to pass
> > > > data to next kernel without changing or modifying the PCI state
> > > > structs.
> > >
> > > I keep saying this, and this series really strongly shows why, we need
> > > to have a dedicated header directory for LUO "ABI" structs. Putting
> > > this random struct in some random header and then declaring it is part
> > > of the luo ABI is really bad.
> >
> > Now that we have the PCI, IOMMU, and VFIO series out, what should be the
> > strategy for LUO "ABI" structs? I would like some more clarity on how
> > you are envisioning this.
> >
> > Are you suggesting that each subsystem create a separate header file for
> > their serialization structs or we can have one common header file used
> > by all subsystems as dumping ground for their structs?
>
> I think we should have multiple header files in one directory, that
> way we can assign separate MAINTAINERS for each file as needed.
>
> Jason Miu proposed the first such header for KHO in
> https://lore.kernel.org/lkml/CALzav=eqwTdzFhZLi_mWWXGuDBRwWQdBxQrzr4tN28ag8Zr_8Q@mail.gmail.com/.
>
> Following that example we can add vfio_pci.h and pci.h to that
> directory for VFIO and PCI ABI structs respectively.
Seems like a good idea to me.
Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-20 23:49 ` Vipin Sharma
2025-10-22 17:45 ` David Matlack
@ 2025-10-22 17:53 ` Jason Gunthorpe
1 sibling, 0 replies; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-22 17:53 UTC (permalink / raw)
To: Vipin Sharma
Cc: Lukas Wunner, bhelgaas, alex.williamson, pasha.tatashin, dmatlack,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Mon, Oct 20, 2025 at 04:49:34PM -0700, Vipin Sharma wrote:
> Maybe the serialization and deserialization logic can be put in PCI, and
> that way it can stay in PCI?
This does seem better
vfio should call something and get back a token it can store.
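A rough user-space sketch of what such a token interface could look like. Everything here is hypothetical (the function names, the fixed slot table, and the use of a plain integer as the token); it only illustrates that the caller stores an opaque value while the saved-state layout stays private to the layer that owns it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_SAVED 4

/* Private to the "PCI layer" in this sketch; callers never see the layout. */
struct example_pci_state {
	int in_use;
	uint32_t config_space[16];
};

static struct example_pci_state example_slots[EXAMPLE_MAX_SAVED];

/* Preserve state and return a nonzero opaque token, or 0 if no slot is free. */
static uint64_t example_pci_preserve_state(const uint32_t config_space[16])
{
	for (int i = 0; i < EXAMPLE_MAX_SAVED; i++) {
		if (!example_slots[i].in_use) {
			example_slots[i].in_use = 1;
			memcpy(example_slots[i].config_space, config_space,
			       sizeof(example_slots[i].config_space));
			return (uint64_t)i + 1;
		}
	}
	return 0;
}

/* Redeem a token once; returns 0 on success, -1 for an unknown token. */
static int example_pci_restore_state(uint64_t token, uint32_t config_space[16])
{
	if (token == 0 || token > EXAMPLE_MAX_SAVED ||
	    !example_slots[token - 1].in_use)
		return -1;
	memcpy(config_space, example_slots[token - 1].config_space,
	       sizeof(example_slots[token - 1].config_space));
	example_slots[token - 1].in_use = 0;
	return 0;
}
```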
Jason
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-18 22:36 ` Vipin Sharma
2025-10-18 23:11 ` Jason Gunthorpe
@ 2025-10-19 8:15 ` Lukas Wunner
2025-10-20 23:54 ` Vipin Sharma
2025-10-30 23:55 ` David Matlack
1 sibling, 2 replies; 57+ messages in thread
From: Lukas Wunner @ 2025-10-19 8:15 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> On 2025-10-18 09:17:33, Lukas Wunner wrote:
> > On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> > > Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> > > linux/pci.h so that they are available to code outside of the PCI core.
> > >
> > > These structs will be used in subsequent commits to serialize and
> > > deserialize PCI state across Live Update.
> >
> > That's not sufficient as a justification to make these public in my view.
> >
> > There are already pci_store_saved_state() and pci_load_saved_state()
> > helpers to serialize PCI state. Why do you need anything more?
> > (Honest question.)
>
> In LUO ecosystem, currently, we do not have a solid solution to do
> proper serialization/deserialization of structs along with versioning
> between different kernel versions. This work is still being discussed.
>
> Here, I created separate structs (exactly same as the original one) to
> have little bit control on what gets saved in serialized state and
> correctly gets deserialized after kexec.
>
> For example, if I am using existing structs and not creating my own
> structs then I cannot just do a blind memcpy() between whole of the PCI state
> prior to kexec to PCI state after the kexec. In the new kernel
> layout might have changed like addition or removal of a field.
The last time we changed those structs was in 2013 by fd0f7f73ca96.
So changes are extremely rare.
What could change in theory is the layout of the individual
capabilities (the data[] in struct pci_cap_saved_data).
E.g. maybe we decide that we need to save an additional register.
But that's also rare. Normally we add all the mutable registers
when a new capability is supported and have no need to amend that
afterwards.
So I think you're preparing for an eventuality that's very unlikely
to happen. Question is whether that justifies the additional
complexity and duplication. (Probably not.)
Note that struct pci_cap_saved_state was made private in 2021 by
f0ab00174eb7. We try to prevent other subsystems or drivers fiddling
with structures internal to the PCI core. For LUO to find acceptance,
it needs to respect subsystems' desire to keep private what's private
and it needs to be as non-intrusive as possible. If necessary,
helpers needed by LUO (e.g. to determine the size of saved PCI state)
should probably live in the PCI core and be #ifdef'ed to LUO being enabled.
Thanks,
Lukas
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-19 8:15 ` Lukas Wunner
@ 2025-10-20 23:54 ` Vipin Sharma
2025-10-30 23:55 ` David Matlack
1 sibling, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-20 23:54 UTC (permalink / raw)
To: Lukas Wunner
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-19 10:15:19, Lukas Wunner wrote:
> On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> > On 2025-10-18 09:17:33, Lukas Wunner wrote:
> > > On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> > > > Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> > > > linux/pci.h so that they are available to code outside of the PCI core.
> > > >
> > > > These structs will be used in subsequent commits to serialize and
> > > > deserialize PCI state across Live Update.
> > >
> > > That's not sufficient as a justification to make these public in my view.
> > >
> > > There are already pci_store_saved_state() and pci_load_saved_state()
> > > helpers to serialize PCI state. Why do you need anything more?
> > > (Honest question.)
> >
> > In LUO ecosystem, currently, we do not have a solid solution to do
> > proper serialization/deserialization of structs along with versioning
> > between different kernel versions. This work is still being discussed.
> >
> > Here, I created separate structs (exactly same as the original one) to
> > have little bit control on what gets saved in serialized state and
> > correctly gets deserialized after kexec.
> >
> > For example, if I am using existing structs and not creating my own
> > structs then I cannot just do a blind memcpy() between whole of the PCI state
> > prior to kexec to PCI state after the kexec. In the new kernel
> > layout might have changed like addition or removal of a field.
>
> The last time we changed those structs was in 2013 by fd0f7f73ca96.
> So changes are extremely rare.
>
> What could change in theory is the layout of the individual
> capabilities (the data[] in struct pci_cap_saved_data).
> E.g. maybe we decide that we need to save an additional register.
> But that's also rare. Normally we add all the mutable registers
> when a new capability is supported and have no need to amend that
> afterwards.
>
> So I think you're preparing for an eventuality that's very unlikely
> to happen. Question is whether that justifies the additional
> complexity and duplication. (Probably not.)
>
> Note that struct pci_cap_saved_state was made private in 2021 by
> f0ab00174eb7. We try to prevent other subsystems or drivers fiddling
> with structures internal to the PCI core. For LUO to find acceptance,
> it needs to respect subsystems' desire to keep private what's private
> and it needs to be as non-intrusive as possible. If necessary,
> helpers needed by LUO (e.g. to determine the size of saved PCI state)
> should probably live in the PCI core and be #ifdef'ed to LUO being enabled.
>
Sounds good, I will create helpers in the PCI core and ifdef them for the
things we end up agreeing need to be saved.
But I also think we need some guardrails to detect if they change,
otherwise we might end up with some hard-to-catch data corruption. I
think this ties in with what Jason is also saying: we need to define the LUO ABI.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-19 8:15 ` Lukas Wunner
2025-10-20 23:54 ` Vipin Sharma
@ 2025-10-30 23:55 ` David Matlack
2025-10-31 0:06 ` David Matlack
1 sibling, 1 reply; 57+ messages in thread
From: David Matlack @ 2025-10-30 23:55 UTC (permalink / raw)
To: Lukas Wunner
Cc: Vipin Sharma, bhelgaas, alex.williamson, pasha.tatashin, jgg,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-19 10:15 AM, Lukas Wunner wrote:
> On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> > On 2025-10-18 09:17:33, Lukas Wunner wrote:
> > > On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> > > > Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> > > > linux/pci.h so that they are available to code outside of the PCI core.
> > > >
> > > > These structs will be used in subsequent commits to serialize and
> > > > deserialize PCI state across Live Update.
> > >
> > > That's not sufficient as a justification to make these public in my view.
> > >
> > > There are already pci_store_saved_state() and pci_load_saved_state()
> > > helpers to serialize PCI state. Why do you need anything more?
> > > (Honest question.)
> >
> > In LUO ecosystem, currently, we do not have a solid solution to do
> > proper serialization/deserialization of structs along with versioning
> > between different kernel versions. This work is still being discussed.
> >
> > Here, I created separate structs (exactly same as the original one) to
> > have little bit control on what gets saved in serialized state and
> > correctly gets deserialized after kexec.
> >
> > For example, if I am using existing structs and not creating my own
> > structs then I cannot just do a blind memcpy() between whole of the PCI state
> > prior to kexec to PCI state after the kexec. In the new kernel
> > layout might have changed like addition or removal of a field.
>
> The last time we changed those structs was in 2013 by fd0f7f73ca96.
> So changes are extremely rare.
>
> What could change in theory is the layout of the individual
> capabilities (the data[] in struct pci_cap_saved_data).
> E.g. maybe we decide that we need to save an additional register.
> But that's also rare. Normally we add all the mutable registers
> when a new capability is supported and have no need to amend that
> afterwards.
Yeah, that has me worried. A totally innocuous commit that adds, removes,
or reorders a register stashed in data[] could lead to a broken device when
VFIO does pci_restore_state() after a Live Update.
Turning pci_save_state into an actual ABI would probably require adding the
registers into the save state, rather than assuming their order.
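One way to read "adding the registers into the save state" is to tag each saved value with its register offset, so that restore does not depend on the order in which values were stored. A minimal user-space sketch under that assumption, with hypothetical names that do not exist in the PCI core:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry: a register value tagged with its offset in the capability. */
struct example_saved_reg {
	uint16_t offset;
	uint32_t value;
};

/*
 * Look up a saved register by offset instead of by position.
 * Returns 0 and fills *value on success, -1 if the register was not saved.
 */
static int example_find_reg(const struct example_saved_reg *regs, size_t n,
			    uint16_t offset, uint32_t *value)
{
	for (size_t i = 0; i < n; i++) {
		if (regs[i].offset == offset) {
			*value = regs[i].value;
			return 0;
		}
	}
	return -1;
}
```

With this shape, a kernel that saves registers in a different order, or saves an extra one, still restores the same values for the offsets both sides know about, and a missing register is reported instead of being silently misread.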
But... I wonder if we truly need to preserve the PCI save state
across Live Update.
Based on this comment in drivers/vfio/pci/vfio_pci_core.c, the PCI
save/restore stuff in VFIO is for cleaning up devices that do not
support resets:
	/*
	 * If we have saved state, restore it. If we can reset the device,
	 * even better. Resetting with current state seems better than
	 * nothing, but saving and restoring current state without reset
	 * is just busy work.
	 */
	if (pci_load_and_free_saved_state(pdev, &vdev->pci_saved_state)) {
		pci_info(pdev, "%s: Couldn't reload saved state\n", __func__);

		if (!vdev->reset_works)
			goto out;

		pci_save_state(pdev);
	}
So if we just limit Live Update support to devices with reset_works,
then we don't have to deal with preserving the save state.
I will have to double check that reset_works is true for all the devices
we care about supporting for Live Update, but I imagine it will be.
They're all relatively modern PCIe devices.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public
2025-10-30 23:55 ` David Matlack
@ 2025-10-31 0:06 ` David Matlack
0 siblings, 0 replies; 57+ messages in thread
From: David Matlack @ 2025-10-31 0:06 UTC (permalink / raw)
To: Lukas Wunner
Cc: Vipin Sharma, bhelgaas, alex.williamson, pasha.tatashin, jgg,
graf, pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Thu, Oct 30, 2025 at 4:55 PM David Matlack <dmatlack@google.com> wrote:
>
> On 2025-10-19 10:15 AM, Lukas Wunner wrote:
> > On Sat, Oct 18, 2025 at 03:36:20PM -0700, Vipin Sharma wrote:
> > > On 2025-10-18 09:17:33, Lukas Wunner wrote:
> > > > On Fri, Oct 17, 2025 at 05:07:07PM -0700, Vipin Sharma wrote:
> > > > > Move struct pci_saved_state{} and struct pci_cap_saved_data{} to
> > > > > linux/pci.h so that they are available to code outside of the PCI core.
> > > > >
> > > > > These structs will be used in subsequent commits to serialize and
> > > > > deserialize PCI state across Live Update.
> > > >
> > > > That's not sufficient as a justification to make these public in my view.
> > > >
> > > > There are already pci_store_saved_state() and pci_load_saved_state()
> > > > helpers to serialize PCI state. Why do you need anything more?
> > > > (Honest question.)
> > >
> > > In LUO ecosystem, currently, we do not have a solid solution to do
> > > proper serialization/deserialization of structs along with versioning
> > > between different kernel versions. This work is still being discussed.
> > >
> > > Here, I created separate structs (exactly same as the original one) to
> > > have little bit control on what gets saved in serialized state and
> > > correctly gets deserialized after kexec.
> > >
> > > For example, if I am using existing structs and not creating my own
> > > structs then I cannot just do a blind memcpy() between whole of the PCI state
> > > prior to kexec to PCI state after the kexec. In the new kernel
> > > layout might have changed like addition or removal of a field.
> >
> > The last time we changed those structs was in 2013 by fd0f7f73ca96.
> > So changes are extremely rare.
> >
> > What could change in theory is the layout of the individual
> > capabilities (the data[] in struct pci_cap_saved_data).
> > E.g. maybe we decide that we need to save an additional register.
> > But that's also rare. Normally we add all the mutable registers
> > when a new capability is supported and have no need to amend that
> > afterwards.
>
> Yeah that has me worried. A totally innocuous commit that adds, removes,
> or reorders a register stashed in data[] could lead to a broken device when
> VFIO does pci_restore_state() after a Live Update.
>
> Turning pci_save_state into an actual ABI would require adding the
> registers into the save state probably, rather than assuming their
> order.
>
> But... I wonder if we truly need to preserve the PCI save state
> across Live Update.
>
> Based on this comment in drivers/vfio/pci/vfio_pci_core.c, the PCI
> save/restore stuff in VFIO is for cleaning up devices that do not
> support resets:
Err, no, I misread that comment. But I guess my question still stands
whether we truly need to preserve the pci_save_state across Live
Update. Maybe there is a simpler way for VFIO to clean up the device
in vfio_pci_core_disable() if we make certain restrictions on which
devices we support.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (14 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 15/21] PCI: Make PCI saved state and capability structs public Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 7:25 ` Lukas Wunner
2025-10-18 15:02 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 17/21] vfio/pci: Disable interrupts before going live update kexec Vipin Sharma
` (5 subsequent siblings)
21 siblings, 2 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Save and restore the PCI state of the VFIO device which in the normal
flow is recorded by VFIO when the device FD is opened for the first time
and then reapplied to PCI device when the last opened device FD is
closed.
Introduce "_ser" versions of struct pci_saved_state{} and struct
pci_cap_saved_data{} to serialize the saved PCI state for liveupdate. Store
the PCI state in a separate folio, because its size is not known at build
time and so space for it cannot be reserved in struct
vfio_pci_core_device_ser{}.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_core.c | 9 +-
drivers/vfio/pci/vfio_pci_liveupdate.c | 176 ++++++++++++++++++++++++-
drivers/vfio/pci/vfio_pci_priv.h | 8 +-
3 files changed, 187 insertions(+), 6 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 186a669b68a4..44ea3ac8da16 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -487,7 +487,9 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
goto out_power;
if (vdev->liveupdate_restore) {
- vfio_pci_liveupdate_restore_device(vdev);
+ ret = vfio_pci_liveupdate_restore_device(vdev);
+ if (ret)
+ goto out_disable_device;
} else {
/* If reset fails because of the device lock, fail this path entirely */
ret = pci_try_reset_function(pdev);
@@ -495,10 +497,11 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
goto out_disable_device;
vdev->reset_works = !ret;
+
+ pci_save_state(pdev);
+ vdev->pci_saved_state = pci_store_saved_state(pdev);
}
- pci_save_state(pdev);
- vdev->pci_saved_state = pci_store_saved_state(pdev);
if (!vdev->pci_saved_state)
pci_dbg(pdev, "%s: Couldn't store saved state\n", __func__);
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index 82ff9f178fdc..caef023d007a 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -13,9 +13,22 @@
#include <linux/anon_inodes.h>
#include <linux/kexec_handover.h>
#include <linux/file.h>
+#include <linux/pci.h>
#include "vfio_pci_priv.h"
+struct pci_cap_saved_data_ser {
+ u16 cap_nr;
+ bool cap_extended;
+ unsigned int size;
+ u32 data[];
+} __packed;
+
+struct pci_saved_state_ser {
+ u32 config_space[16];
+ struct pci_cap_saved_data_ser cap[];
+} __packed;
+
struct vfio_pci_core_device_ser {
u16 bdf;
u32 cfg_size;
@@ -23,6 +36,7 @@ struct vfio_pci_core_device_ser {
u8 vconfig[PCI_CFG_SPACE_EXP_SIZE];
u32 rbar[7];
u8 reset_works;
+ u64 pci_saved_state_phys;
} __packed;
static int vfio_pci_liveupdate_deserialize_config(struct vfio_pci_core_device *vdev,
@@ -51,12 +65,150 @@ static void vfio_pci_liveupdate_serialize_config(struct vfio_pci_core_device *vd
memcpy(ser->rbar, vdev->rbar, sizeof(vdev->rbar));
}
+static size_t pci_saved_state_size(struct pci_saved_state *state)
+{
+ struct pci_cap_saved_data *cap;
+ size_t size;
+
+ /* One empty cap to denote end. */
+ size = sizeof(struct pci_saved_state) + sizeof(struct pci_cap_saved_data);
+
+ cap = state->cap;
+ while (cap->size) {
+ size_t len = sizeof(struct pci_cap_saved_data) + cap->size;
+
+ size += len;
+ cap = (struct pci_cap_saved_data *)((u8 *)cap + len);
+ }
+
+ return size;
+}
+
+static size_t pci_saved_state_size_from_ser(struct pci_saved_state_ser *state)
+{
+ struct pci_cap_saved_data_ser *cap;
+ size_t size;
+
+ /* One empty cap to denote end. */
+ size = sizeof(struct pci_saved_state) + sizeof(struct pci_cap_saved_data);
+
+ cap = state->cap;
+ while (cap->size) {
+ size_t len = sizeof(*cap) + cap->size;
+
+ size += sizeof(struct pci_cap_saved_data) + cap->size;
+ cap = (struct pci_cap_saved_data_ser *)((u8 *)cap + len);
+ }
+
+ return size;
+}
+
+static void serialize_pci_cap_saved_data(struct pci_saved_state *state,
+ struct pci_saved_state_ser *state_ser)
+{
+ struct pci_cap_saved_data_ser *cap_ser = state_ser->cap;
+ struct pci_cap_saved_data *cap = state->cap;
+
+ while (cap->size) {
+ cap_ser->cap_nr = cap->cap_nr;
+ cap_ser->cap_extended = cap->cap_extended;
+ cap_ser->size = cap->size;
+ memcpy(cap_ser->data, cap->data, cap_ser->size);
+
+ cap = (void *)cap + sizeof(*cap) + cap->size;
+ cap_ser = (void *)cap_ser + sizeof(*cap_ser) + cap_ser->size;
+ }
+}
+
+static void deserialize_pci_cap_saved_data(struct pci_saved_state *state,
+ struct pci_saved_state_ser *state_ser)
+{
+ struct pci_cap_saved_data_ser *cap_ser = state_ser->cap;
+ struct pci_cap_saved_data *cap = state->cap;
+
+ while (cap_ser->size) {
+ cap->cap_nr = cap_ser->cap_nr;
+ cap->cap_extended = cap_ser->cap_extended;
+ cap->size = cap_ser->size;
+ memcpy(cap->data, cap_ser->data, cap_ser->size);
+
+ cap = (void *)cap + sizeof(*cap) + cap->size;
+ cap_ser = (void *)cap_ser + sizeof(*cap_ser) + cap_ser->size;
+ }
+}
+
+static int serialize_pci_saved_state(struct vfio_pci_core_device *vdev,
+ struct vfio_pci_core_device_ser *ser)
+{
+ struct pci_saved_state *state = vdev->pci_saved_state;
+ struct pci_saved_state_ser *state_ser;
+ struct folio *folio;
+ size_t size;
+ int ret;
+
+ if (!state)
+ return 0;
+
+ size = pci_saved_state_size(state);
+
+ folio = folio_alloc(GFP_KERNEL | __GFP_ZERO, get_order(size));
+ if (!folio)
+ return -ENOMEM;
+
+ state_ser = folio_address(folio);
+
+ memcpy(state_ser->config_space, state->config_space,
+ sizeof(state_ser->config_space));
+
+ serialize_pci_cap_saved_data(state, state_ser);
+
+ ret = kho_preserve_folio(folio);
+ if (ret) {
+ folio_put(folio);
+ return ret;
+ }
+
+ ser->pci_saved_state_phys = virt_to_phys(state_ser);
+
+ return 0;
+}
+
+static int deserialize_pci_saved_state(struct vfio_pci_core_device *vdev,
+ struct vfio_pci_core_device_ser *ser)
+{
+ struct pci_saved_state_ser *state_ser;
+ struct pci_saved_state *state;
+ size_t size;
+
+ if (!ser->pci_saved_state_phys)
+ return 0;
+
+ state_ser = phys_to_virt(ser->pci_saved_state_phys);
+ size = pci_saved_state_size_from_ser(state_ser);
+ state = kzalloc(size, GFP_KERNEL);
+ if (!state)
+ return -ENOMEM;
+
+ memcpy(state->config_space, state_ser->config_space,
+ sizeof(state_ser->config_space));
+
+ deserialize_pci_cap_saved_data(state, state_ser);
+ vdev->pci_saved_state = state;
+ return 0;
+}
+
static int vfio_pci_lu_serialize(struct vfio_pci_core_device *vdev,
struct vfio_pci_core_device_ser *ser)
{
+ int err;
+
ser->bdf = pci_dev_id(vdev->pdev);
vfio_pci_liveupdate_serialize_config(vdev, ser);
ser->reset_works = vdev->reset_works;
+ err = serialize_pci_saved_state(vdev, ser);
+ if (err)
+ return err;
+
return 0;
}
@@ -101,12 +253,18 @@ static void vfio_pci_liveupdate_cancel(struct liveupdate_file_handler *handler,
{
struct vfio_pci_core_device_ser *ser = phys_to_virt(data);
struct folio *folio = virt_to_folio(ser);
+ struct folio *pci_saved_state_folio;
struct vfio_pci_core_device *vdev;
struct vfio_device *device;
device = vfio_device_from_file(file);
vdev = container_of(device, struct vfio_pci_core_device, vdev);
vdev->pdev->skip_kexec_clear_master = false;
+ if (ser->pci_saved_state_phys) {
+ pci_saved_state_folio = virt_to_folio(phys_to_virt(ser->pci_saved_state_phys));
+ WARN_ON_ONCE(kho_unpreserve_folio(pci_saved_state_folio));
+ folio_put(pci_saved_state_folio);
+ }
WARN_ON_ONCE(kho_unpreserve_folio(folio));
folio_put(folio);
}
@@ -139,6 +297,9 @@ static void vfio_pci_liveupdate_finish(struct liveupdate_file_handler *handler,
ser = folio_address(folio);
+ if (!reclaimed && ser->pci_saved_state_phys)
+ kho_restore_folio(ser->pci_saved_state_phys);
+
device = vfio_find_device_in_cdev_class(&ser->bdf, match_bdf);
if (!device)
goto out_folio_put;
@@ -155,6 +316,8 @@ static void vfio_pci_liveupdate_finish(struct liveupdate_file_handler *handler,
put_device(&device->device);
out_folio_put:
+ if (ser->pci_saved_state_phys)
+ folio_put(virt_to_folio(phys_to_virt(ser->pci_saved_state_phys)));
folio_put(folio);
}
@@ -174,6 +337,11 @@ static int vfio_pci_liveupdate_retrieve(struct liveupdate_file_handler *handler,
return -ENOENT;
ser = folio_address(folio);
+ if (ser->pci_saved_state_phys) {
+ if (!kho_restore_folio(ser->pci_saved_state_phys))
+ return -ENOENT;
+ }
+
device = vfio_find_device_in_cdev_class(&ser->bdf, match_bdf);
if (!device)
return -ENODEV;
@@ -262,9 +430,15 @@ int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
return vfio_pci_liveupdate_deserialize_config(vdev, ser);
}
-void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev)
+int vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev)
{
struct vfio_pci_core_device_ser *ser = vdev->liveupdate_restore;
+ int err;
+
+ err = deserialize_pci_saved_state(vdev, ser);
+ if (err)
+ return err;
vdev->reset_works = ser->reset_works;
+ return 0;
}
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index ee1c7c229020..9d692e4d0cf7 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -110,14 +110,18 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
#ifdef CONFIG_LIVEUPDATE
void vfio_pci_liveupdate_init(void);
int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev);
-void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev);
+int vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev);
#else
static inline void vfio_pci_liveupdate_init(void) { }
int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
{
return -EINVAL;
}
-void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev) { }
+int vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev)
+{
+ return -EOPNOTSUPP;
+}
+
#endif /* CONFIG_LIVEUPDATE */
#endif
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device
2025-10-18 0:07 ` [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device Vipin Sharma
@ 2025-10-18 7:25 ` Lukas Wunner
2025-10-18 22:44 ` Vipin Sharma
2025-10-18 15:02 ` Vipin Sharma
1 sibling, 1 reply; 57+ messages in thread
From: Lukas Wunner @ 2025-10-18 7:25 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 05:07:08PM -0700, Vipin Sharma wrote:
> Save and restore the PCI state of the VFIO device which in the normal
> flow is recorded by VFIO when the device FD is opened for the first time
> and then reapplied to PCI device when the last opened device FD is
> closed.
>
> Introduce "_ser" versions of struct pci_saved_state{} and struct
> pci_cap_saved_data{} to serialize the saved PCI state for liveupdate. Store
> the PCI state in a separate folio, because its size is not known at build
> time and so space for it cannot be reserved in struct
> vfio_pci_core_device_ser{}.
Unfortunately this commit message is of the type "summarize the code
changes without explaining the reason for these changes".
Comparing the pci_saved_state_ser and pci_cap_saved_data_ser structs
which you're introducing here with the existing pci_saved_state and
pci_cap_saved_data structs, the only difference seems to be that
you're adding __packed to your new structs. Is that all? Is that
the only reason why these structs need to be duplicated? Maybe
it would make more sense to add __packed to the existing structs,
though the gain seems minimal.
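For reference, the layout effect of __packed on these headers can be checked in user space. The sketch below is illustrative only and rests on assumptions: struct cap_hdr and struct cap_hdr_packed are stand-ins that mirror the field list shown in the patch (without the trailing data[]), and count_packed_caps() is a hypothetical helper. On common ABIs the packed header is one byte shorter than the unpacked one, so code that walks a packed buffer has to step by the packed header size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the unpacked header: the compiler may pad before 'size'. */
struct cap_hdr {
	uint16_t cap_nr;
	bool cap_extended;
	unsigned int size;
};

/* Stand-in for the packed "_ser" header: no padding between fields. */
struct cap_hdr_packed {
	uint16_t cap_nr;
	bool cap_extended;
	unsigned int size;
} __attribute__((packed));

/*
 * Count the entries in a buffer of packed headers. Each header is
 * followed by 'size' bytes of data; a header with size == 0 ends the list.
 */
static size_t count_packed_caps(const uint8_t *buf)
{
	size_t n = 0;

	assert(buf);

	for (;;) {
		struct cap_hdr_packed hdr;

		/* Copy out to avoid unaligned access through a cast. */
		memcpy(&hdr, buf, sizeof(hdr));
		if (!hdr.size)
			break;

		n++;
		/* Step by the packed header size, not sizeof(struct cap_hdr). */
		buf += sizeof(hdr) + hdr.size;
	}

	return n;
}
```

Stepping by sizeof(struct cap_hdr) instead would land one byte past each following header on ABIs where the two sizes differ, which is why the walker and the writer must agree on which of the two layouts is in the buffer.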
Thanks,
Lukas
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device
2025-10-18 7:25 ` Lukas Wunner
@ 2025-10-18 22:44 ` Vipin Sharma
0 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 22:44 UTC (permalink / raw)
To: Lukas Wunner
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-18 09:25:30, Lukas Wunner wrote:
> On Fri, Oct 17, 2025 at 05:07:08PM -0700, Vipin Sharma wrote:
> > Save and restore the PCI state of the VFIO device which in the normal
> > flow is recorded by VFIO when the device FD is opened for the first time
> > and then reapplied to PCI device when the last opened device FD is
> > closed.
> >
> > Introduce "_ser" versions of struct pci_saved_state{} and struct
> > pci_cap_saved_data{} to serialize the saved PCI state for liveupdate. Store
> > the PCI state in a separate folio, because its size is not known at build
> > time and so space for it cannot be reserved in struct
> > vfio_pci_core_device_ser{}.
>
> Unfortunately this commit message is of the type "summarize the code
> changes without explaining the reason for these changes".
>
> Comparing the pci_saved_state_ser and pci_cap_saved_data_ser structs
> which you're introducing here with the existing pci_saved_state and
> pci_cap_saved_data structs, the only difference seems to be that
> you're adding __packed to your new structs. Is that all? Is that
> the only reason why these structs need to be duplicated? Maybe
> it would make more sense to add __packed to the existing structs,
> though the gain seems minimal.
>
It allows us (in the future) to build more validation and compatibility
handling for layout changes of the structs across kernel versions. We can
add more fields to the *_ser versions that act as metadata to support
deserialization.
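As a rough illustration of the kind of metadata meant here, the user-space sketch below puts a small header in front of the serialized state and validates it before anything is deserialized. Everything in it is an assumption made for the example: struct ser_hdr, SER_MAGIC, SER_VERSION and ser_hdr_check() are hypothetical and are not part of this series or of LUO.

```c
#include <assert.h>
#include <stdint.h>

#define SER_MAGIC	0x50435353u	/* hypothetical tag for this blob type */
#define SER_VERSION	1u		/* hypothetical layout version */

/* Hypothetical header placed in front of the serialized PCI state. */
struct ser_hdr {
	uint32_t magic;		/* identifies the blob type */
	uint32_t version;	/* layout version written by the old kernel */
	uint32_t payload_len;	/* bytes of serialized state that follow */
} __attribute__((packed));

/*
 * Validate a header before touching the payload.
 * Returns 0 if the blob can be deserialized, a negative value otherwise.
 */
static int ser_hdr_check(const struct ser_hdr *hdr, uint32_t max_len)
{
	assert(hdr);

	if (hdr->magic != SER_MAGIC)
		return -1;	/* not the blob we expected */
	if (hdr->version != SER_VERSION)
		return -2;	/* layout this kernel does not understand */
	if (hdr->payload_len > max_len)
		return -3;	/* payload would run past the preserved memory */

	return 0;
}
```

A new kernel that still understands an older layout could accept several version values and convert, which is the compatibility hook the plain structs do not offer.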
I do agree that in the current form (with the assumption of no layout
changes) we can get away with using the existing structs. I also think
this should be taken care of by the PCI series instead of the VFIO series.
Let's see what others think. I am open to not adding these *_ser
structs if we should wait for proper support for struct serialization
and work under the assumption that these won't change.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device
2025-10-18 0:07 ` [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device Vipin Sharma
2025-10-18 7:25 ` Lukas Wunner
@ 2025-10-18 15:02 ` Vipin Sharma
1 sibling, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 15:02 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-17 17:07:08, Vipin Sharma wrote:
> --- a/drivers/vfio/pci/vfio_pci_priv.h
> +++ b/drivers/vfio/pci/vfio_pci_priv.h
> @@ -110,14 +110,18 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
> #ifdef CONFIG_LIVEUPDATE
> void vfio_pci_liveupdate_init(void);
> int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev);
> -void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev);
> +int vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev);
> #else
> static inline void vfio_pci_liveupdate_init(void) { }
> int vfio_pci_liveupdate_restore_config(struct vfio_pci_core_device *vdev)
> {
> return -EINVAL;
> }
> -void vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev) { }
> +int vfio_pci_liveupdate_restore_device(struct vfio_pci_core_device *vdev)
> +{
> + return -EOPNOTSUPP;
> +}
> +
This should also be static inline.
> #endif /* CONFIG_LIVEUPDATE */
>
> #endif
> --
> 2.51.0.858.gf9c4a03a3a-goog
>
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 17/21] vfio/pci: Disable interrupts before going live update kexec
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (15 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 16/21] vfio/pci: Save and restore the PCI state of the VFIO device Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests Vipin Sharma
` (4 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Disable the VFIO interrupts configured on the device in the live update
freeze callback. Since there is no way for those interrupts to be handled
during kexec, stop them and let userspace reconfigure
them after kexec.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
drivers/vfio/pci/vfio_pci_liveupdate.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_liveupdate.c b/drivers/vfio/pci/vfio_pci_liveupdate.c
index caef023d007a..5d786ace6bde 100644
--- a/drivers/vfio/pci/vfio_pci_liveupdate.c
+++ b/drivers/vfio/pci/vfio_pci_liveupdate.c
@@ -248,6 +248,22 @@ static int vfio_pci_liveupdate_prepare(struct liveupdate_file_handler *handler,
return err;
}
+static int vfio_pci_liveupdate_freeze(struct liveupdate_file_handler *handler,
+ struct file *file, u64 *data)
+{
+ struct vfio_pci_core_device *vdev;
+ struct vfio_device *device;
+
+ device = vfio_device_from_file(file);
+ vdev = container_of(device, struct vfio_pci_core_device, vdev);
+
+ guard(mutex)(&vdev->igate);
+ if (vdev->irq_type == VFIO_PCI_NUM_IRQS)
+ return 0;
+ return vfio_pci_set_irqs_ioctl(vdev, VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER,
+ vdev->irq_type, 0, 0, NULL);
+}
+
static void vfio_pci_liveupdate_cancel(struct liveupdate_file_handler *handler,
struct file *file, u64 data)
{
@@ -403,6 +419,7 @@ static bool vfio_pci_liveupdate_can_preserve(struct liveupdate_file_handler *han
static const struct liveupdate_file_ops vfio_pci_luo_fops = {
.prepare = vfio_pci_liveupdate_prepare,
+ .freeze = vfio_pci_liveupdate_freeze,
.cancel = vfio_pci_liveupdate_cancel,
.finish = vfio_pci_liveupdate_finish,
.retrieve = vfio_pci_liveupdate_retrieve,
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (16 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 17/21] vfio/pci: Disable interrupts before going live update kexec Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-20 20:50 ` David Matlack
2025-10-18 0:07 ` [RFC PATCH 19/21] vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD Vipin Sharma
` (3 subsequent siblings)
21 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Import and build the liveupdate selftest library in VFIO selftests.
This allows VFIO selftests to use the liveupdate ioctls.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
tools/testing/selftests/vfio/Makefile | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/vfio/Makefile b/tools/testing/selftests/vfio/Makefile
index 324ba0175a33..c7f271884cb4 100644
--- a/tools/testing/selftests/vfio/Makefile
+++ b/tools/testing/selftests/vfio/Makefile
@@ -6,16 +6,24 @@ TEST_GEN_PROGS += vfio_pci_driver_test
TEST_PROGS_EXTENDED := run.sh
include ../lib.mk
include lib/libvfio.mk
+include ../liveupdate/lib/libliveupdate.mk
CFLAGS += -I$(top_srcdir)/tools/include
CFLAGS += -MD
CFLAGS += $(EXTRA_CFLAGS)
-$(TEST_GEN_PROGS): %: %.o $(LIBVFIO_O)
- $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $< $(LIBVFIO_O) $(LDLIBS) -o $@
+LIBS_O := $(LIBVFIO_O)
+LIBS_O += $(LIBLIVEUPDATE_O)
+
+TEST_GEN_ALL_PROGS := $(TEST_GEN_PROGS)
+TEST_GEN_ALL_PROGS += $(TEST_GEN_PROGS_EXTENDED)
+
+$(TEST_GEN_ALL_PROGS): %: %.o $(LIBS_O)
+ $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH) $< $(LIBS_O) $(LDLIBS) -o $@
TEST_GEN_PROGS_O = $(patsubst %, %.o, $(TEST_GEN_PROGS))
-TEST_DEP_FILES = $(patsubst %.o, %.d, $(TEST_GEN_PROGS_O) $(LIBVFIO_O))
+TEST_GEN_PROGS_O += $(patsubst %, %.o, $(TEST_GEN_PROGS_EXTENDED))
+TEST_DEP_FILES = $(patsubst %.o, %.d, $(TEST_GEN_PROGS_O) $(LIBS_O))
-include $(TEST_DEP_FILES)
EXTRA_CLEAN += $(TEST_GEN_PROGS_O) $(TEST_DEP_FILES)
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests
2025-10-18 0:07 ` [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests Vipin Sharma
@ 2025-10-20 20:50 ` David Matlack
2025-10-20 23:55 ` Vipin Sharma
0 siblings, 1 reply; 57+ messages in thread
From: David Matlack @ 2025-10-20 20:50 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, jgg, graf, pratyush,
gregkh, chrisl, rppt, skhawaja, parav, saeedm, kevin.tian,
jrhilke, david, jgowans, dwmw2, epetron, junaids, linux-kernel,
linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 5:07 PM Vipin Sharma <vipinsh@google.com> wrote:
> +TEST_GEN_ALL_PROGS := $(TEST_GEN_PROGS)
> +TEST_GEN_ALL_PROGS += $(TEST_GEN_PROGS_EXTENDED)
The TEST_GEN_PROGS_EXTENDED support should go in the commit that first
needs it, or in its own commit.
^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests
2025-10-20 20:50 ` David Matlack
@ 2025-10-20 23:55 ` Vipin Sharma
0 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-20 23:55 UTC (permalink / raw)
To: David Matlack
Cc: bhelgaas, alex.williamson, pasha.tatashin, jgg, graf, pratyush,
gregkh, chrisl, rppt, skhawaja, parav, saeedm, kevin.tian,
jrhilke, david, jgowans, dwmw2, epetron, junaids, linux-kernel,
linux-pci, kvm, linux-kselftest
On 2025-10-20 13:50:45, David Matlack wrote:
> On Fri, Oct 17, 2025 at 5:07 PM Vipin Sharma <vipinsh@google.com> wrote:
>
> > +TEST_GEN_ALL_PROGS := $(TEST_GEN_PROGS)
> > +TEST_GEN_ALL_PROGS += $(TEST_GEN_PROGS_EXTENDED)
>
> The TEST_GEN_PROGS_EXTENDED support should go in the commit that first
> needs it, or in its own commit.
Yeah, this can be extracted out from this commit.
^ permalink raw reply [flat|nested] 57+ messages in thread
* [RFC PATCH 19/21] vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (17 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 18/21] vfio: selftests: Build liveupdate library in VFIO selftests Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 20/21] vfio: selftests: Add VFIO live update test Vipin Sharma
` (2 subsequent siblings)
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Use the given VFIO cdev FD to initialize vfio_pci_device in VFIO
selftests. Add an assertion to make sure that the passed cdev FD is not
used with legacy VFIO APIs. If a VFIO cdev FD is provided, do not open
the device; instead use the FD for all interaction with the device.
This API allows writing selftests where the VFIO device FD is preserved
using liveupdate and retrieved later using the liveupdate ioctl after kexec.
Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
.../selftests/vfio/lib/include/vfio_util.h | 1 +
.../selftests/vfio/lib/vfio_pci_device.c | 33 +++++++++++++++----
2 files changed, 28 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/vfio/lib/include/vfio_util.h b/tools/testing/selftests/vfio/lib/include/vfio_util.h
index ed31606e01b7..8ec60a62a0d1 100644
--- a/tools/testing/selftests/vfio/lib/include/vfio_util.h
+++ b/tools/testing/selftests/vfio/lib/include/vfio_util.h
@@ -203,6 +203,7 @@ const char *vfio_pci_get_cdev_path(const char *bdf);
extern const char *default_iommu_mode;
struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_mode);
+struct vfio_pci_device *vfio_pci_device_init_fd(int vfio_cdev_fd);
void vfio_pci_device_cleanup(struct vfio_pci_device *device);
void vfio_pci_device_reset(struct vfio_pci_device *device);
diff --git a/tools/testing/selftests/vfio/lib/vfio_pci_device.c b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
index 0921b2451ba5..cab9c74d2de8 100644
--- a/tools/testing/selftests/vfio/lib/vfio_pci_device.c
+++ b/tools/testing/selftests/vfio/lib/vfio_pci_device.c
@@ -486,13 +486,18 @@ static void vfio_device_attach_iommufd_pt(int device_fd, u32 pt_id)
ioctl_assert(device_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &args);
}
-static void vfio_pci_iommufd_setup(struct vfio_pci_device *device, const char *bdf)
+static void vfio_pci_iommufd_setup(struct vfio_pci_device *device,
+ const char *bdf, int vfio_cdev_fd)
{
- const char *cdev_path = vfio_pci_get_cdev_path(bdf);
- device->fd = open(cdev_path, O_RDWR);
+ if (vfio_cdev_fd > 0) {
+ device->fd = vfio_cdev_fd;
+ } else {
+ const char *cdev_path = vfio_pci_get_cdev_path(bdf);
+ device->fd = open(cdev_path, O_RDWR);
+ free((void *)cdev_path);
+ }
VFIO_ASSERT_GE(device->fd, 0);
- free((void *)cdev_path);
/*
* Require device->iommufd to be >0 so that a simple non-0 check can be
@@ -507,7 +512,9 @@ static void vfio_pci_iommufd_setup(struct vfio_pci_device *device, const char *b
vfio_device_attach_iommufd_pt(device->fd, device->ioas_id);
}
-struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_mode)
+struct vfio_pci_device *__vfio_pci_device_init(const char *bdf,
+ const char *iommu_mode,
+ int vfio_cdev_fd)
{
struct vfio_pci_device *device;
@@ -518,10 +525,13 @@ struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_
device->iommu_mode = lookup_iommu_mode(iommu_mode);
+ VFIO_ASSERT_FALSE(device->iommu_mode->container_path != NULL && vfio_cdev_fd > 0,
+ "Provide either container path or VFIO cdev FD, not both.\n");
+
if (device->iommu_mode->container_path)
vfio_pci_container_setup(device, bdf);
else
- vfio_pci_iommufd_setup(device, bdf);
+ vfio_pci_iommufd_setup(device, bdf, vfio_cdev_fd);
vfio_pci_device_setup(device);
vfio_pci_driver_probe(device);
@@ -529,6 +539,17 @@ struct vfio_pci_device *vfio_pci_device_init(const char *bdf, const char *iommu_
return device;
}
+struct vfio_pci_device *vfio_pci_device_init(const char *bdf,
+ const char *iommu_mode)
+{
+ return __vfio_pci_device_init(bdf, iommu_mode, -1);
+}
+
+struct vfio_pci_device *vfio_pci_device_init_fd(int vfio_cdev_fd)
+{
+ return __vfio_pci_device_init(NULL, "iommufd", vfio_cdev_fd);
+}
+
void vfio_pci_device_cleanup(struct vfio_pci_device *device)
{
int i;
--
2.51.0.858.gf9c4a03a3a-goog
^ permalink raw reply related [flat|nested] 57+ messages in thread
* [RFC PATCH 20/21] vfio: selftests: Add VFIO live update test
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (18 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 19/21] vfio: selftests: Initialize vfio_pci_device using a VFIO cdev FD Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 0:07 ` [RFC PATCH 21/21] vfio: selftests: Validate vconfig preservation of VFIO PCI device during live update Vipin Sharma
2025-10-18 17:21 ` [RFC PATCH 00/21] VFIO live update support Jason Gunthorpe
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Add a test that exercises VFIO live update support on the device whose
BDF is passed to it. The test behaves differently depending on the host
live update state (NORMAL or UPDATED).

When the test is executed in the NORMAL state, initialize a VFIO PCI
device and set its Bus Master Enable bit by writing to the PCI command
register. Create a live update session and pass the VFIO device FD to it
for preservation. Preserve the session and then send the global live
update prepare event. If everything succeeds up to this point, reboot
the kernel using kexec.

When the test is executed in the UPDATED state, retrieve the session
from the Live Update Orchestrator and restore the VFIO FD from the
session. Use the restored FD to initialize vfio_pci_device in the
selftest. Move the host to the NORMAL state and verify that the Bus
Master Enable bit is still set on the VFIO device.

The test is not run automatically; it is only built, and the user runs
it manually with the command:

  ./run.sh -d 0000:6a:01.0 ./vfio_pci_liveupdate_test

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
tools/testing/selftests/vfio/Makefile | 1 +
.../selftests/vfio/vfio_pci_liveupdate_test.c | 106 ++++++++++++++++++
2 files changed, 107 insertions(+)
create mode 100644 tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c
diff --git a/tools/testing/selftests/vfio/Makefile b/tools/testing/selftests/vfio/Makefile
index c7f271884cb4..949b7fcc091e 100644
--- a/tools/testing/selftests/vfio/Makefile
+++ b/tools/testing/selftests/vfio/Makefile
@@ -3,6 +3,7 @@ TEST_GEN_PROGS += vfio_dma_mapping_test
TEST_GEN_PROGS += vfio_iommufd_setup_test
TEST_GEN_PROGS += vfio_pci_device_test
TEST_GEN_PROGS += vfio_pci_driver_test
+TEST_GEN_PROGS_EXTENDED += vfio_pci_liveupdate_test
TEST_PROGS_EXTENDED := run.sh
include ../lib.mk
include lib/libvfio.mk
diff --git a/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c b/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c
new file mode 100644
index 000000000000..9fd0061348e0
--- /dev/null
+++ b/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c
@@ -0,0 +1,106 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Vipin Sharma <vipinsh@google.com>
+ */
+
+#include <linux/liveupdate.h>
+#include <liveupdate_util.h>
+#include <vfio_util.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+
+#define SESSION_NAME "multi_file_session"
+#define TOKEN 1234
+
+static void run_pre_kexec(int luo_fd, const char *bdf)
+{
+ struct vfio_pci_device *device;
+ int session_fd;
+ u16 command;
+
+ device = vfio_pci_device_init(bdf, "iommufd");
+
+ command = vfio_pci_config_readw(device, PCI_COMMAND);
+ VFIO_ASSERT_FALSE(command & PCI_COMMAND_MASTER);
+
+ vfio_pci_config_writew(device, PCI_COMMAND,
+ command | PCI_COMMAND_MASTER);
+
+ session_fd = luo_create_session(luo_fd, SESSION_NAME);
+ VFIO_ASSERT_GE(session_fd, 0, "Failed to create session %s",
+ SESSION_NAME);
+ VFIO_ASSERT_EQ(luo_session_preserve_fd(session_fd, device->fd, TOKEN),
+ 0, "Failed to preserve VFIO device");
+ VFIO_ASSERT_EQ(luo_set_global_event(luo_fd, LIVEUPDATE_PREPARE), 0,
+ "Failed to set global PREPARE event");
+
+ VFIO_ASSERT_EQ(system(KEXEC_SCRIPT), 0, "kexec script failed");
+
+ sleep(10); /* Should not be reached */
+ vfio_pci_device_cleanup(device);
+ exit(EXIT_FAILURE);
+}
+
+static void run_post_kexec(int luo_fd, const char *bdf)
+{
+ int session_fd;
+ int vfio_fd;
+ struct vfio_pci_device *device;
+ u16 command;
+
+
+ session_fd = luo_retrieve_session(luo_fd, SESSION_NAME);
+ VFIO_ASSERT_GE(session_fd, 0, "Failed to retrieve session %s",
+ SESSION_NAME);
+
+ vfio_fd = luo_session_restore_fd(session_fd, TOKEN);
+ if (vfio_fd < 0) {
+ printf("Failed to restore VFIO device, error %d\n", vfio_fd);
+ exit(1);
+ }
+
+ device = vfio_pci_device_init_fd(vfio_fd);
+
+ if (luo_set_global_event(luo_fd, LIVEUPDATE_FINISH) < 0) {
+ printf("Failed to set global FINISH event\n");
+ exit(1);
+ }
+
+ close(session_fd);
+
+ command = vfio_pci_config_readw(device, PCI_COMMAND);
+ VFIO_ASSERT_TRUE(command & PCI_COMMAND_MASTER);
+ vfio_pci_device_cleanup(device);
+}
+
+int main(int argc, char *argv[])
+{
+ enum liveupdate_state state;
+ const char *device_bdf;
+ int luo_fd;
+
+ device_bdf = vfio_selftests_get_bdf(&argc, argv);
+
+ luo_fd = luo_open_device();
+ VFIO_ASSERT_GE(luo_fd, 0, "Failed to open %s", LUO_DEVICE);
+ VFIO_ASSERT_EQ(luo_get_global_state(luo_fd, &state), 0, "Failed to get LUO state.");
+
+ switch (state) {
+ case LIVEUPDATE_STATE_NORMAL:
+ printf("Running pre-kexec actions.\n");
+ run_pre_kexec(luo_fd, device_bdf);
+ break;
+ case LIVEUPDATE_STATE_UPDATED:
+ printf("Running post-kexec actions.\n");
+ run_post_kexec(luo_fd, device_bdf);
+ break;
+ default:
+ printf("Test started in an unexpected state: %d\n", state);
+ }
+
+ close(luo_fd);
+}
--
2.51.0.858.gf9c4a03a3a-goog
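
For readers without the selftest library at hand, the Bus Master Enable check that the test performs before and after kexec can be sketched in plain C. This is only an illustration under stated assumptions: `struct fake_cfg` and the `cfg_*`, `pre_kexec_enable_bme` and `post_kexec_bme_preserved` helpers are hypothetical stand-ins for the VFIO selftest config accessors, not part of the patch. Only the `PCI_COMMAND` offset and `PCI_COMMAND_MASTER` bit are the standard PCI values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI values, as found in <linux/pci_regs.h>. */
#define PCI_COMMAND        0x04 /* offset of the 16-bit command register */
#define PCI_COMMAND_MASTER 0x4  /* Bus Master Enable bit */

/* Hypothetical stand-in for a device's 256-byte config space. */
struct fake_cfg {
	uint8_t bytes[256];
};

/* Read a little-endian 16-bit register. */
static uint16_t cfg_readw(const struct fake_cfg *cfg, unsigned int off)
{
	assert(off + 1 < sizeof(cfg->bytes));
	return (uint16_t)(cfg->bytes[off] | (cfg->bytes[off + 1] << 8));
}

/* Write a little-endian 16-bit register. */
static void cfg_writew(struct fake_cfg *cfg, unsigned int off, uint16_t val)
{
	assert(off + 1 < sizeof(cfg->bytes));
	cfg->bytes[off] = (uint8_t)(val & 0xff);
	cfg->bytes[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Pre-kexec step: the bit is expected to be clear, then it is set with a
 * read-modify-write so the other command bits are left untouched.
 */
static bool pre_kexec_enable_bme(struct fake_cfg *cfg)
{
	uint16_t command = cfg_readw(cfg, PCI_COMMAND);

	if (command & PCI_COMMAND_MASTER)
		return false;

	cfg_writew(cfg, PCI_COMMAND, command | PCI_COMMAND_MASTER);
	return true;
}

/* Post-kexec step: the bit must still be set on the restored device. */
static bool post_kexec_bme_preserved(const struct fake_cfg *cfg)
{
	return (cfg_readw(cfg, PCI_COMMAND) & PCI_COMMAND_MASTER) != 0;
}
```

In the real test the two helpers run in different kernels, one on each side of the kexec, so the second check only passes if the device state survived the reboot.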
* [RFC PATCH 21/21] vfio: selftests: Validate vconfig preservation of VFIO PCI device during live update
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (19 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 20/21] vfio: selftests: Add VFIO live update test Vipin Sharma
@ 2025-10-18 0:07 ` Vipin Sharma
2025-10-18 17:21 ` [RFC PATCH 00/21] VFIO live update support Jason Gunthorpe
21 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 0:07 UTC (permalink / raw)
To: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, jgg, graf
Cc: pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest, Vipin Sharma
Test preservation of a VFIO PCI device's virtual config (vconfig in
struct vfio_pci_core_device{}) across a live update. Write an arbitrary
value to the PCI_INTERRUPT_LINE register, which is virtualized by VFIO,
and verify that the same value is read back after kexec.

Certain bits in the config space are virtualized by VFIO, so writes to
them do not reach the device's PCI config space; instead they are stored
in memory. After a live update, vconfig should hold the same values as
before kexec, which means vconfig must be saved in KHO and later
retrieved to restore the device.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
.../testing/selftests/vfio/vfio_pci_liveupdate_test.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c b/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c
index 9fd0061348e0..2d80fdcb1ef7 100644
--- a/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c
+++ b/tools/testing/selftests/vfio/vfio_pci_liveupdate_test.c
@@ -15,12 +15,14 @@
#define SESSION_NAME "multi_file_session"
#define TOKEN 1234
+#define RANDOM_DATA 0x12
static void run_pre_kexec(int luo_fd, const char *bdf)
{
struct vfio_pci_device *device;
int session_fd;
u16 command;
+ u8 data;
device = vfio_pci_device_init(bdf, "iommufd");
@@ -30,6 +32,10 @@ static void run_pre_kexec(int luo_fd, const char *bdf)
vfio_pci_config_writew(device, PCI_COMMAND,
command | PCI_COMMAND_MASTER);
+ vfio_pci_config_writeb(device, PCI_INTERRUPT_LINE, RANDOM_DATA);
+ data = vfio_pci_config_readb(device, PCI_INTERRUPT_LINE);
+ VFIO_ASSERT_EQ(data, RANDOM_DATA);
+
session_fd = luo_create_session(luo_fd, SESSION_NAME);
VFIO_ASSERT_GE(session_fd, 0, "Failed to create session %s",
SESSION_NAME);
@@ -51,6 +57,7 @@ static void run_post_kexec(int luo_fd, const char *bdf)
int vfio_fd;
struct vfio_pci_device *device;
u16 command;
+ u8 data;
session_fd = luo_retrieve_session(luo_fd, SESSION_NAME);
@@ -74,6 +81,9 @@ static void run_post_kexec(int luo_fd, const char *bdf)
command = vfio_pci_config_readw(device, PCI_COMMAND);
VFIO_ASSERT_TRUE(command & PCI_COMMAND_MASTER);
+
+ data = vfio_pci_config_readb(device, PCI_INTERRUPT_LINE);
+ VFIO_ASSERT_EQ(data, RANDOM_DATA);
vfio_pci_device_cleanup(device);
}
--
2.51.0.858.gf9c4a03a3a-goog
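
The idea of a virtualized config register can be illustrated with a small user-space model. This is a simplified sketch, not the VFIO implementation: `struct fake_vdev`, its per-byte `virt` mask and the `vdev_*` helpers are hypothetical names chosen for the example, and the save/restore pair merely stands in for preserving vconfig across kexec. `PCI_INTERRUPT_LINE` is the standard config offset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE           256
#define PCI_INTERRUPT_LINE 0x3c /* standard PCI config offset */

/* Hypothetical model of a device with partly virtualized config space. */
struct fake_vdev {
	uint8_t hw[CFG_SIZE];      /* stands in for the real device registers */
	uint8_t vconfig[CFG_SIZE]; /* in-memory shadow of virtualized bits */
	uint8_t virt[CFG_SIZE];    /* per-byte mask: set bits are virtualized */
};

/* Virtualized bits land in the shadow; the remaining bits go to hardware. */
static void vdev_writeb(struct fake_vdev *d, unsigned int off, uint8_t val)
{
	uint8_t mask;

	assert(off < CFG_SIZE);
	mask = d->virt[off];
	d->vconfig[off] = (uint8_t)((d->vconfig[off] & ~mask) | (val & mask));
	d->hw[off] = (uint8_t)((d->hw[off] & mask) | (val & ~mask));
}

/* Reads merge the shadow (virtualized bits) with hardware (the rest). */
static uint8_t vdev_readb(const struct fake_vdev *d, unsigned int off)
{
	uint8_t mask;

	assert(off < CFG_SIZE);
	mask = d->virt[off];
	return (uint8_t)((d->vconfig[off] & mask) | (d->hw[off] & ~mask));
}

/* Stand-ins for saving the shadow before kexec and restoring it after. */
static void vdev_save(const struct fake_vdev *d, uint8_t *saved)
{
	memcpy(saved, d->vconfig, CFG_SIZE);
}

static void vdev_restore(struct fake_vdev *d, const uint8_t *saved)
{
	memcpy(d->vconfig, saved, CFG_SIZE);
}
```

Because the written value never reaches the hardware array in this model, a fresh `struct fake_vdev` reads back zero until the saved shadow is restored, which is the failure the selftest is designed to catch.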
* Re: [RFC PATCH 00/21] VFIO live update support
2025-10-18 0:06 [RFC PATCH 00/21] VFIO live update support Vipin Sharma
` (20 preceding siblings ...)
2025-10-18 0:07 ` [RFC PATCH 21/21] vfio: selftests: Validate vconfig preservation of VFIO PCI device during live update Vipin Sharma
@ 2025-10-18 17:21 ` Jason Gunthorpe
2025-10-18 22:53 ` Vipin Sharma
21 siblings, 1 reply; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-18 17:21 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Fri, Oct 17, 2025 at 05:06:52PM -0700, Vipin Sharma wrote:
> 2. Integration with IOMMUFD and PCI series for complete workflow where a
> device continues DMA while undergoing a live update.

It is a bit confusing, this series has PCI components so how does it
relate to the PCI series? Is this self-contained for at least limited PCI
topologies?

Jason
* Re: [RFC PATCH 00/21] VFIO live update support
2025-10-18 17:21 ` [RFC PATCH 00/21] VFIO live update support Jason Gunthorpe
@ 2025-10-18 22:53 ` Vipin Sharma
2025-10-18 23:06 ` Jason Gunthorpe
0 siblings, 1 reply; 57+ messages in thread
From: Vipin Sharma @ 2025-10-18 22:53 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-18 14:21:30, Jason Gunthorpe wrote:
> On Fri, Oct 17, 2025 at 05:06:52PM -0700, Vipin Sharma wrote:
> > 2. Integration with IOMMUFD and PCI series for complete workflow where a
> > device continues DMA while undergoing a live update.
>
> It is a bit confusing, this series has PCI components so how does it
> relate to the PCI series? Is this self-contained for at least limited PCI
> topologies?

This series has very minimal PCI support. For example, it skips
DMA disable on the VFIO PCI device during kexec reboot and saves the initial
PCI state during the first open (bind) of the device.

We do need proper PCI support. A few examples:

- Not disabling the DMA bit on bridges upstream of the leaf VFIO PCI device node.
- Not writing to PCI config space during device enumeration.
- Not autobinding devices to their default driver. My testing works on
  devices that don't have a driver built into the kernel, so there is no
  probing by other drivers.
- Support for PCI enable and disable calls.

I think these things should be solved in the PCI series.

* Re: [RFC PATCH 00/21] VFIO live update support
2025-10-18 22:53 ` Vipin Sharma
@ 2025-10-18 23:06 ` Jason Gunthorpe
2025-10-20 23:30 ` Vipin Sharma
0 siblings, 1 reply; 57+ messages in thread
From: Jason Gunthorpe @ 2025-10-18 23:06 UTC (permalink / raw)
To: Vipin Sharma
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On Sat, Oct 18, 2025 at 03:53:09PM -0700, Vipin Sharma wrote:
> On 2025-10-18 14:21:30, Jason Gunthorpe wrote:
> > On Fri, Oct 17, 2025 at 05:06:52PM -0700, Vipin Sharma wrote:
> > > 2. Integration with IOMMUFD and PCI series for complete workflow where a
> > > device continues DMA while undergoing a live update.
> >
> > It is a bit confusing, this series has PCI components so how does it
> > relate to the PCI series? Is this self-contained for at least limited PCI
> > topologies?
>
> This series has very minimal PCI support. For example, it skips
> DMA disable on the VFIO PCI device during kexec reboot and saves the initial
> PCI state during the first open (bind) of the device.
>
> We do need proper PCI support. A few examples:
>
> - Not disabling the DMA bit on bridges upstream of the leaf VFIO PCI device node.

So limited to topologies without bridges

> - Not writing to PCI config space during device enumeration.

I think this should be included here

> - Not autobinding devices to their default driver. My testing works on
>   devices that don't have a driver built into the kernel, so there is no
>   probing by other drivers.

Good enough for now, easy to not build in such drivers.

> - Support for PCI enable and disable calls.

?? Shouldn't vfio restore skip calling pci enable? Seems like there
should be some solution here.

Jason
* Re: [RFC PATCH 00/21] VFIO live update support
2025-10-18 23:06 ` Jason Gunthorpe
@ 2025-10-20 23:30 ` Vipin Sharma
0 siblings, 0 replies; 57+ messages in thread
From: Vipin Sharma @ 2025-10-20 23:30 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: bhelgaas, alex.williamson, pasha.tatashin, dmatlack, graf,
pratyush, gregkh, chrisl, rppt, skhawaja, parav, saeedm,
kevin.tian, jrhilke, david, jgowans, dwmw2, epetron, junaids,
linux-kernel, linux-pci, kvm, linux-kselftest
On 2025-10-18 20:06:41, Jason Gunthorpe wrote:
> On Sat, Oct 18, 2025 at 03:53:09PM -0700, Vipin Sharma wrote:
> >
> > This series has very minimal PCI support. For example, it skips
> > DMA disable on the VFIO PCI device during kexec reboot and saves the initial
> > PCI state during the first open (bind) of the device.
> >
> > We do need proper PCI support. A few examples:
> >
> > - Not disabling the DMA bit on bridges upstream of the leaf VFIO PCI device node.
>
> So limited to topologies without bridges
>
> > - Not writing to PCI config space during device enumeration.
>
> I think this should be included here
>
> > - Not autobinding devices to their default driver. My testing works on
> >   devices that don't have a driver built into the kernel, so there is no
> >   probing by other drivers.
>
> Good enough for now, easy to not build in such drivers.
>
> > - Support for PCI enable and disable calls.
>
> ?? Shouldn't vfio restore skip calling pci enable? Seems like there
> should be some solution here.

I think the PCI subsystem, when it restores/enumerates a preserved device
after kexec, should enable the device, and VFIO can skip calling this. By
default, enable mostly does the following:

1. Increments enable_cnt.
2. Enables bus mastering on upstream bridges.
3. Resets the INTx Disable bit in the command register.
4. Enables the I/O and Memory Space bits in the command register.
5. Applies fixups.
6. Sets the power state to D0.

On a preserved and restored device, I think only item 1 needs to happen;
items 2-6 should remain the same if the device config space is not written
during enumeration and state is recreated by reading values from config space.

I believe this should be part of the PCI preservation and restoration
series. VFIO can assume that the device is enabled and skip the call, or
check whether it is enabled and fail the restoration if it is not.
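
The split proposed above can be sketched as a toy model. Everything here is hypothetical and only mirrors the reasoning in this message: `struct fake_pci_dev`, `fake_enable_normal`, `fake_enable_preserved` and `fake_vfio_restore` are illustrative names, not kernel APIs, and the model covers only items 1, 3 and 4 of the list (bridges, fixups and power state are omitted). The command-register bit values are the standard PCI ones.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI command register bits, as found in <linux/pci_regs.h>. */
#define PCI_COMMAND_IO           0x1
#define PCI_COMMAND_MEMORY       0x2
#define PCI_COMMAND_MASTER       0x4
#define PCI_COMMAND_INTX_DISABLE 0x400

/* Hypothetical model of the state touched by a device enable. */
struct fake_pci_dev {
	int enable_cnt;             /* item 1: enable reference count */
	uint16_t command;           /* modeled PCI command register */
	unsigned int config_writes; /* counts writes to config space */
};

/* Every modeled config write is counted so the test can observe it. */
static void fake_write_command(struct fake_pci_dev *dev, uint16_t val)
{
	dev->command = val;
	dev->config_writes++;
}

/* Normal enable: bump the count, then program the command register. */
static void fake_enable_normal(struct fake_pci_dev *dev)
{
	uint16_t command = dev->command;

	if (++dev->enable_cnt > 1)
		return; /* already enabled by an earlier caller */

	command &= (uint16_t)~PCI_COMMAND_INTX_DISABLE;
	command |= PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
	fake_write_command(dev, command);
}

/* Preserved device: only the count changes; config space is not written. */
static void fake_enable_preserved(struct fake_pci_dev *dev)
{
	dev->enable_cnt++;
}

/* VFIO-side check: fail the restoration if the device is not enabled. */
static int fake_vfio_restore(const struct fake_pci_dev *dev)
{
	return dev->enable_cnt > 0 ? 0 : -1;
}
```

The point of the model is the asymmetry: the normal path writes the command register, while the preserved path leaves it exactly as the previous kernel programmed it and only restores the bookkeeping.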