* [PATCH v1 0/3] liveupdate: suppress TCP RST during post-kexec restore window
@ 2026-01-30 14:51 Li Chen
  2026-01-30 14:51 ` [PATCH v1 1/3] liveupdate: track " Li Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Li Chen @ 2026-01-30 14:51 UTC (permalink / raw)
  To: Pasha Tatashin, Mike Rapoport, Eric Dumazet, Neal Cardwell,
	David S. Miller, David Ahern, Jakub Kicinski, Paolo Abeni,
	Jonathan Corbet, netdev, linux-doc, linux-kernel
  Cc: Pratyush Yadav, Kuniyuki Iwashima, Simon Horman

LUO (Live Update Orchestrator) supports kexec-based live update, in which
userspace restores preserved state after the new kernel boots. For established
TCP connections that are restored in userspace (e.g. by CRIU), there is an
unavoidable window in which the new kernel has no socket yet but the peer can
still send packets. The TCP no-socket path replies with a RST to segments that
have ACK set, causing the peer to tear down the connection immediately.

This series tracks a bounded "restore window" in which incoming LUO sessions
are expected to be retrieved and finished. The window ends automatically when
all incoming sessions are retrieved and finished (or their session file
descriptors are closed), can be terminated early via the global
LIVEUPDATE_IOCTL_RESTORE_DONE, and is also bounded by a hard timeout (15
minutes) so it cannot remain open indefinitely.
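
A minimal userspace sketch of the window accounting described above, assuming
C11 atomics as a stand-in for the kernel's atomic_long_t. The names
sessions_left, dec_if_positive(), session_finished() and restore_done() are
illustrative only and are not the symbols used by the series; the hard timeout
is modelled as just another caller of restore_done().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative stand-in for the count of incoming sessions still outstanding. */
static atomic_long sessions_left;

/* The restore window is open while at least one session is outstanding. */
static bool restore_in_progress(void)
{
	return atomic_load(&sessions_left) > 0;
}

/*
 * Decrement only when the value is positive. Returns the old value minus
 * one, so a negative result means nothing was decremented.
 */
static long dec_if_positive(atomic_long *v)
{
	long old = atomic_load(v);

	while (old > 0 &&
	       !atomic_compare_exchange_weak(v, &old, old - 1))
		;	/* a failed exchange reloads 'old'; retry */
	return old - 1;
}

/* A retrieved session was finished (or its file descriptor was closed). */
static void session_finished(void)
{
	dec_if_positive(&sessions_left);
}

/* Explicit "restore done" request or timeout expiry: force the window shut. */
static void restore_done(void)
{
	atomic_store(&sessions_left, 0);
}
```

Starting from two sessions, one session_finished() followed by restore_done()
closes the window, and a later session_finished() leaves the counter at zero
instead of driving it negative.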

On top of that, an optional cmdline knob, liveupdate_tcp_rst_suppress=on,
drops segments for established connections (ACK && !SYN) that would otherwise
trigger a RST while the restore window is open. The knob defaults to off and
takes effect only with liveupdate=on.
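
The drop condition can be sketched as a pure predicate. This is a simplified
model of the decision only, not the kernel code: struct rst_ctx, enum
no_socket_action and no_socket_verdict() are hypothetical names, and the
checksum and policy checks that precede the no-socket RST are left out.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs consulted before sending a RST. */
struct rst_ctx {
	bool suppress_enabled;		/* liveupdate=on and the knob set to on */
	bool restore_in_progress;	/* restore window still open */
	bool ack;			/* ACK flag of the incoming segment */
	bool syn;			/* SYN flag of the incoming segment */
};

enum no_socket_action { SEND_RST, DROP_SILENTLY };

/* Decide what to do with a segment that matched no socket. */
static enum no_socket_action no_socket_verdict(const struct rst_ctx *c)
{
	if (c->suppress_enabled && c->restore_in_progress &&
	    c->ack && !c->syn)
		return DROP_SILENTLY;
	return SEND_RST;
}
```

Segments with SYN set still get a RST in this model, so new connection
attempts to a closed port are refused as usual during the window.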

Tested with an established TCP stream across LUO kexec + CRIU restore; calling
RESTORE_DONE after restore avoids TCP RST and the connection continues.
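
For illustration, a userspace agent might issue the call roughly as follows.
This is a sketch under assumptions: struct restore_done_arg is a local mirror
of the two-field struct added in patch 2, luo_restore_done() is a hypothetical
helper, and the request number is passed in by the caller because its value
comes from the patched uapi header, which is not reproduced here.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the argument struct: a size field plus a reserved field. */
struct restore_done_arg {
	uint32_t size;		/* sizeof(struct restore_done_arg) */
	uint32_t reserved;	/* must be zero */
};

/*
 * fd:      open descriptor for /dev/liveupdate
 * request: LIVEUPDATE_IOCTL_RESTORE_DONE as defined by the patched header
 * Returns 0 on success or a negative errno value on failure.
 */
static int luo_restore_done(int fd, unsigned long request)
{
	struct restore_done_arg arg;

	memset(&arg, 0, sizeof(arg));
	arg.size = sizeof(arg);

	if (ioctl(fd, request, &arg) < 0)
		return -errno;
	return 0;
}
```

The agent would call this once per live update, after the state it needs has
been restored.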

Li Chen (3):
  liveupdate: track post-kexec restore window
  liveupdate: bound and control post-kexec restore window
  liveupdate: suppress TCP RST during post-kexec restore window

 .../admin-guide/kernel-parameters.txt         | 10 +++
 include/linux/liveupdate.h                    | 26 ++++++
 include/uapi/linux/liveupdate.h               | 34 ++++++++
 kernel/liveupdate/luo_core.c                  | 31 +++++++
 kernel/liveupdate/luo_internal.h              |  5 ++
 kernel/liveupdate/luo_session.c               | 85 ++++++++++++++++++-
 net/ipv4/tcp_ipv4.c                           |  5 ++
 net/ipv6/tcp_ipv6.c                           |  5 ++
 8 files changed, 198 insertions(+), 3 deletions(-)

-- 
2.52.0

* [PATCH v1 1/3] liveupdate: track post-kexec restore window
  2026-01-30 14:51 [PATCH v1 0/3] liveupdate: suppress TCP RST during post-kexec restore window Li Chen
@ 2026-01-30 14:51 ` Li Chen
  2026-01-30 14:51 ` [PATCH v1 2/3] liveupdate: bound and control " Li Chen
  2026-01-30 14:51 ` [PATCH v1 3/3] liveupdate: suppress TCP RST during " Li Chen
  2 siblings, 0 replies; 8+ messages in thread
From: Li Chen @ 2026-01-30 14:51 UTC (permalink / raw)
  To: Pasha Tatashin, Mike Rapoport, Pratyush Yadav, linux-kernel; +Cc: Li Chen

A kexec-based live update introduces a window after the new kernel boots
where userspace needs to retrieve and restore preserved sessions.

Provide liveupdate_restore_in_progress() backed by a counter of incoming
sessions left, and make session finishing idempotent so that a double
finish cannot underflow the counter.

Signed-off-by: Li Chen <me@linux.beauty>
---
 include/linux/liveupdate.h       | 11 +++++++++
 kernel/liveupdate/luo_core.c     |  2 ++
 kernel/liveupdate/luo_internal.h |  4 ++++
 kernel/liveupdate/luo_session.c  | 41 +++++++++++++++++++++++++++++++-
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index ed81e7b31a9f..406a6e2dd4a1 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -217,6 +217,12 @@ struct liveupdate_flb {
 /* Return true if live update orchestrator is enabled */
 bool liveupdate_enabled(void);
 
+/*
+ * Return true during a kexec-based live update boot while userspace is still
+ * restoring preserved sessions/resources.
+ */
+bool liveupdate_restore_in_progress(void);
+
 /* Called during kexec to tell LUO that entered into reboot */
 int liveupdate_reboot(void);
 
@@ -238,6 +244,11 @@ static inline bool liveupdate_enabled(void)
 	return false;
 }
 
+static inline bool liveupdate_restore_in_progress(void)
+{
+	return false;
+}
+
 static inline int liveupdate_reboot(void)
 {
 	return 0;
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index 7a9ef16b37d8..19c91843fbdb 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -128,6 +128,8 @@ static int __init luo_early_startup(void)
 	if (err)
 		return err;
 
+	luo_session_restore_window_init();
+
 	err = luo_flb_setup_incoming(luo_global.fdt_in);
 
 	return err;
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 6115d6a4054d..8aa4c5b0101b 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -72,6 +72,8 @@ struct luo_file_set {
  *              previous kernel) sessions.
  * @retrieved:  A boolean flag indicating whether this session has been
  *              retrieved by a consumer in the new kernel.
+ * @finished:   A boolean flag indicating whether this session has been
+ *              successfully finished in the new kernel.
  * @file_set:   A set of files that belong to this session.
  * @mutex:      protects fields in the luo_session.
  */
@@ -80,6 +82,7 @@ struct luo_session {
 	struct luo_session_ser *ser;
 	struct list_head list;
 	bool retrieved;
+	bool finished;
 	struct luo_file_set file_set;
 	struct mutex mutex;
 };
@@ -88,6 +91,7 @@ int luo_session_create(const char *name, struct file **filep);
 int luo_session_retrieve(const char *name, struct file **filep);
 int __init luo_session_setup_outgoing(void *fdt);
 int __init luo_session_setup_incoming(void *fdt);
+void __init luo_session_restore_window_init(void);
 int luo_session_serialize(void);
 int luo_session_deserialize(void);
 bool luo_session_quiesce(void);
diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index dbdbc3bd7929..e1d1ab795c40 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -50,6 +50,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include <linux/atomic.h>
 #include <linux/anon_inodes.h>
 #include <linux/cleanup.h>
 #include <linux/err.h>
@@ -117,6 +118,31 @@ static struct luo_session_global luo_session_global = {
 	},
 };
 
+static atomic_long_t liveupdate_incoming_sessions_left = ATOMIC_LONG_INIT(0);
+
+bool liveupdate_restore_in_progress(void)
+{
+	return atomic_long_read(&liveupdate_incoming_sessions_left) > 0;
+}
+
+void __init luo_session_restore_window_init(void)
+{
+	struct luo_session_header *sh = &luo_session_global.incoming;
+	u64 count;
+
+	if (!sh->active)
+		return;
+
+	count = sh->header_ser->count;
+	if (count > LUO_SESSION_MAX) {
+		pr_warn("incoming session count %llu exceeds max %lu\n",
+			count, LUO_SESSION_MAX);
+		count = LUO_SESSION_MAX;
+	}
+
+	atomic_long_set(&liveupdate_incoming_sessions_left, (long)count);
+}
+
 static struct luo_session *luo_session_alloc(const char *name)
 {
 	struct luo_session *session = kzalloc(sizeof(*session), GFP_KERNEL);
@@ -182,8 +208,21 @@ static void luo_session_remove(struct luo_session_header *sh,
 
 static int luo_session_finish_one(struct luo_session *session)
 {
+	int err;
+
 	guard(mutex)(&session->mutex);
-	return luo_file_finish(&session->file_set);
+	if (session->finished)
+		return 0;
+
+	err = luo_file_finish(&session->file_set);
+	if (err)
+		return err;
+
+	session->finished = true;
+	if (session->retrieved)
+		atomic_long_dec(&liveupdate_incoming_sessions_left);
+
+	return 0;
 }
 
 static void luo_session_unfreeze_one(struct luo_session *session,
-- 
2.52.0


* [PATCH v1 2/3] liveupdate: bound and control post-kexec restore window
  2026-01-30 14:51 [PATCH v1 0/3] liveupdate: suppress TCP RST during post-kexec restore window Li Chen
  2026-01-30 14:51 ` [PATCH v1 1/3] liveupdate: track " Li Chen
@ 2026-01-30 14:51 ` Li Chen
  2026-01-30 14:51 ` [PATCH v1 3/3] liveupdate: suppress TCP RST during " Li Chen
  2 siblings, 0 replies; 8+ messages in thread
From: Li Chen @ 2026-01-30 14:51 UTC (permalink / raw)
  To: Pasha Tatashin, Mike Rapoport, Pratyush Yadav, linux-kernel; +Cc: Li Chen

The restore window can remain open indefinitely if userspace never
finishes some incoming sessions. Bound the window with a hard timeout and
add a RESTORE_DONE ioctl on /dev/liveupdate so an orchestrator can end it
explicitly once restoration is complete.

Signed-off-by: Li Chen <me@linux.beauty>
---
 include/linux/liveupdate.h       |  4 +++
 include/uapi/linux/liveupdate.h  | 34 ++++++++++++++++++
 kernel/liveupdate/luo_core.c     | 15 ++++++++
 kernel/liveupdate/luo_internal.h |  1 +
 kernel/liveupdate/luo_session.c  | 59 ++++++++++++++++++++++++++------
 5 files changed, 103 insertions(+), 10 deletions(-)

diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 406a6e2dd4a1..301d3e94516e 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -220,6 +220,10 @@ bool liveupdate_enabled(void);
 /*
  * Return true during a kexec-based live update boot while userspace is still
  * restoring preserved sessions/resources.
+ *
+ * The restore window ends automatically once all incoming sessions have been
+ * retrieved and finished. Userspace can also terminate the window explicitly
+ * (and early) via LIVEUPDATE_IOCTL_RESTORE_DONE.
  */
 bool liveupdate_restore_in_progress(void);
 
diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
index 30bc66ee9436..357137d9a78c 100644
--- a/include/uapi/linux/liveupdate.h
+++ b/include/uapi/linux/liveupdate.h
@@ -51,6 +51,7 @@ enum {
 	LIVEUPDATE_CMD_BASE = 0x00,
 	LIVEUPDATE_CMD_CREATE_SESSION = LIVEUPDATE_CMD_BASE,
 	LIVEUPDATE_CMD_RETRIEVE_SESSION = 0x01,
+	LIVEUPDATE_CMD_RESTORE_DONE = 0x02,
 };
 
 /* ioctl commands for session file descriptors */
@@ -118,6 +119,39 @@ struct liveupdate_ioctl_retrieve_session {
 #define LIVEUPDATE_IOCTL_RETRIEVE_SESSION \
 	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RETRIEVE_SESSION)
 
+/**
+ * struct liveupdate_ioctl_restore_done - ioctl(LIVEUPDATE_IOCTL_RESTORE_DONE)
+ * @size:     Input; sizeof(struct liveupdate_ioctl_restore_done)
+ * @reserved: Input; Must be zero. Reserved for future use.
+ *
+ * Marks the completion of the post-kexec restore window.
+ *
+ * After a live update (kexec), userspace needs time to restore preserved
+ * sessions/resources. Some kernel subsystems may apply temporary compatibility
+ * behavior during this window. Userspace should call this ioctl once it has
+ * completed restoration and wants normal kernel behavior to resume, even if
+ * some incoming sessions are left unused.
+ *
+ * This ioctl updates global state and is not tied to a specific live update
+ * session file descriptor. A typical userspace agent calls it once per live
+ * update, after it has restored the required state.
+ *
+ * Note that the restore window may also end automatically once all incoming
+ * sessions have been retrieved and finished (or their session file
+ * descriptors have been closed). This ioctl is intended for cases where an
+ * agent intentionally does not retrieve all incoming sessions or does not want
+ * to wait for them to be finished/closed.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+struct liveupdate_ioctl_restore_done {
+	__u32		size;
+	__u32		reserved;
+};
+
+#define LIVEUPDATE_IOCTL_RESTORE_DONE \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RESTORE_DONE)
+
 /* Session specific IOCTLs */
 
 /**
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index 19c91843fbdb..fb6a73c08979 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -340,6 +340,18 @@ static int luo_ioctl_retrieve_session(struct luo_ucmd *ucmd)
 	return err;
 }
 
+static int luo_ioctl_restore_done(struct luo_ucmd *ucmd)
+{
+	struct liveupdate_ioctl_restore_done *argp = ucmd->cmd;
+
+	if (argp->reserved)
+		return -EINVAL;
+
+	luo_session_restore_done();
+
+	return luo_ucmd_respond(ucmd, sizeof(*argp));
+}
+
 static int luo_open(struct inode *inodep, struct file *filep)
 {
 	struct luo_device_state *ldev = container_of(filep->private_data,
@@ -371,6 +383,7 @@ static int luo_release(struct inode *inodep, struct file *filep)
 union ucmd_buffer {
 	struct liveupdate_ioctl_create_session create;
 	struct liveupdate_ioctl_retrieve_session retrieve;
+	struct liveupdate_ioctl_restore_done restore_done;
 };
 
 struct luo_ioctl_op {
@@ -395,6 +408,8 @@ static const struct luo_ioctl_op luo_ioctl_ops[] = {
 		 struct liveupdate_ioctl_create_session, name),
 	IOCTL_OP(LIVEUPDATE_IOCTL_RETRIEVE_SESSION, luo_ioctl_retrieve_session,
 		 struct liveupdate_ioctl_retrieve_session, name),
+	IOCTL_OP(LIVEUPDATE_IOCTL_RESTORE_DONE, luo_ioctl_restore_done,
+		 struct liveupdate_ioctl_restore_done, reserved),
 };
 
 static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 8aa4c5b0101b..03a5b27498be 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -96,6 +96,7 @@ int luo_session_serialize(void);
 int luo_session_deserialize(void);
 bool luo_session_quiesce(void);
 void luo_session_resume(void);
+void luo_session_restore_done(void);
 
 int luo_preserve_file(struct luo_file_set *file_set, u64 token, int fd);
 void luo_file_unpreserve_files(struct luo_file_set *file_set);
diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index e1d1ab795c40..2c7dd3b12303 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -58,6 +58,7 @@
 #include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/io.h>
+#include <linux/jiffies.h>
 #include <linux/kexec_handover.h>
 #include <linux/kho/abi/luo.h>
 #include <linux/libfdt.h>
@@ -67,6 +68,7 @@
 #include <linux/rwsem.h>
 #include <linux/slab.h>
 #include <linux/unaligned.h>
+#include <linux/workqueue.h>
 #include <uapi/linux/liveupdate.h>
 #include "luo_internal.h"
 
@@ -118,8 +120,28 @@ static struct luo_session_global luo_session_global = {
 	},
 };
 
+/*
+ * Count of incoming sessions that still keep the post-kexec restore window
+ * open. This is initialized from the serialized session header and decremented
+ * when a retrieved session is finished (or closed). Userspace can also end the
+ * restore window explicitly via LIVEUPDATE_IOCTL_RESTORE_DONE.
+ */
 static atomic_long_t liveupdate_incoming_sessions_left = ATOMIC_LONG_INIT(0);
 
+#define LIVEUPDATE_RESTORE_WINDOW_TIMEOUT	(15 * 60 * HZ)
+
+static void liveupdate_restore_window_timeout(struct work_struct *work)
+{
+	if (!liveupdate_restore_in_progress())
+		return;
+
+	pr_warn_once("liveupdate restore window timed out\n");
+	atomic_long_set(&liveupdate_incoming_sessions_left, 0);
+}
+
+static DECLARE_DELAYED_WORK(liveupdate_restore_window_timeout_work,
+			    liveupdate_restore_window_timeout);
+
 bool liveupdate_restore_in_progress(void)
 {
 	return atomic_long_read(&liveupdate_incoming_sessions_left) > 0;
@@ -141,6 +163,16 @@ void __init luo_session_restore_window_init(void)
 	}
 
 	atomic_long_set(&liveupdate_incoming_sessions_left, (long)count);
+
+	if (count > 0)
+		schedule_delayed_work(&liveupdate_restore_window_timeout_work,
+				      LIVEUPDATE_RESTORE_WINDOW_TIMEOUT);
+}
+
+void luo_session_restore_done(void)
+{
+	atomic_long_set(&liveupdate_incoming_sessions_left, 0);
+	cancel_delayed_work_sync(&liveupdate_restore_window_timeout_work);
 }
 
 static struct luo_session *luo_session_alloc(const char *name)
@@ -209,18 +241,26 @@ static void luo_session_remove(struct luo_session_header *sh,
 static int luo_session_finish_one(struct luo_session *session)
 {
 	int err;
+	bool cancel_timeout = false;
 
-	guard(mutex)(&session->mutex);
-	if (session->finished)
-		return 0;
+	{
+		guard(mutex)(&session->mutex);
+		if (session->finished)
+			return 0;
 
-	err = luo_file_finish(&session->file_set);
-	if (err)
-		return err;
+		err = luo_file_finish(&session->file_set);
+		if (err)
+			return err;
 
-	session->finished = true;
-	if (session->retrieved)
-		atomic_long_dec(&liveupdate_incoming_sessions_left);
+		session->finished = true;
+		if (session->retrieved) {
+			if (atomic_long_dec_if_positive(&liveupdate_incoming_sessions_left) == 0)
+				cancel_timeout = true;
+		}
+	}
+
+	if (cancel_timeout)
+		cancel_delayed_work_sync(&liveupdate_restore_window_timeout_work);
 
 	return 0;
 }
@@ -545,7 +585,6 @@ int __init luo_session_setup_incoming(void *fdt_in)
 	luo_session_global.incoming.header_ser = header_ser;
 	luo_session_global.incoming.ser = (void *)(header_ser + 1);
 	luo_session_global.incoming.active = true;
-
 	return 0;
 }
 
-- 
2.52.0


* [PATCH v1 3/3] liveupdate: suppress TCP RST during post-kexec restore window
  2026-01-30 14:51 [PATCH v1 0/3] liveupdate: suppress TCP RST during post-kexec restore window Li Chen
  2026-01-30 14:51 ` [PATCH v1 1/3] liveupdate: track " Li Chen
  2026-01-30 14:51 ` [PATCH v1 2/3] liveupdate: bound and control " Li Chen
@ 2026-01-30 14:51 ` Li Chen
  2026-01-31  1:05   ` Jakub Kicinski
  2 siblings, 1 reply; 8+ messages in thread
From: Li Chen @ 2026-01-30 14:51 UTC (permalink / raw)
  To: Jonathan Corbet, Pasha Tatashin, Mike Rapoport, Pratyush Yadav,
	Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Andrew Morton, Borislav Petkov (AMD), Randy Dunlap, Pawan Gupta,
	Petr Mladek, Feng Tang, Kees Cook, Li RongQing, Arnd Bergmann,
	Askar Safin, Frank van der Linden, linux-doc, linux-kernel,
	netdev
  Cc: Li Chen

During a kexec-based live update, userspace may restore established TCP
connections after the new kernel has booted (e.g. via CRIU). Any packet
arriving for a not-yet-restored socket will hit the no-socket path and
trigger a TCP RST, causing the peer to immediately drop the connection.

Add an optional cmdline knob, liveupdate_tcp_rst_suppress=, to drop such
packets while liveupdate_restore_in_progress() is true. Only segments
with ACK set and SYN clear are dropped, and the default behavior remains
unchanged.

Document the liveupdate_tcp_rst_suppress cmdline parameter.

Signed-off-by: Li Chen <me@linux.beauty>
---
 Documentation/admin-guide/kernel-parameters.txt | 10 ++++++++++
 include/linux/liveupdate.h                      | 11 +++++++++++
 kernel/liveupdate/luo_core.c                    | 14 ++++++++++++++
 kernel/liveupdate/luo_session.c                 |  1 +
 net/ipv4/tcp_ipv4.c                             |  5 +++++
 net/ipv6/tcp_ipv6.c                             |  5 +++++
 6 files changed, 46 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3097e4266d76..b73347a0aefd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3442,6 +3442,16 @@ Kernel parameters
 			If there are multiple matching configurations changing
 			the same attribute, the last one is used.
 
+	liveupdate_tcp_rst_suppress=	[KNL,EARLY]
+			Format: <bool>
+			When enabled, drop packets for established connections
+			(ACK set, SYN clear) that would otherwise trigger a RST
+			in the LUO post-kexec restore window.
+			This is useful when userspace restores sockets after
+			kexec (e.g. via CRIU).
+			Requires liveupdate=on.
+			Default: off.
+
 	lockd.nlm_grace_period=P  [NFS] Assign grace period.
 			Format: <integer>
 
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 301d3e94516e..6ca740ec19d4 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -227,6 +227,12 @@ bool liveupdate_enabled(void);
  */
 bool liveupdate_restore_in_progress(void);
 
+/*
+ * Return true when TCP RST suppression is enabled for the post-kexec restore
+ * window.
+ */
+bool liveupdate_tcp_rst_suppress_enabled(void);
+
 /* Called during kexec to tell LUO that entered into reboot */
 int liveupdate_reboot(void);
 
@@ -253,6 +259,11 @@ static inline bool liveupdate_restore_in_progress(void)
 	return false;
 }
 
+static inline bool liveupdate_tcp_rst_suppress_enabled(void)
+{
+	return false;
+}
+
 static inline int liveupdate_reboot(void)
 {
 	return 0;
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index fb6a73c08979..0ed5c9ce1421 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -64,6 +64,7 @@
 
 static struct {
 	bool enabled;
+	bool tcp_rst_suppress;
 	void *fdt_out;
 	void *fdt_in;
 	u64 liveupdate_num;
@@ -75,6 +76,13 @@ static int __init early_liveupdate_param(char *buf)
 }
 early_param("liveupdate", early_liveupdate_param);
 
+static int __init early_liveupdate_tcp_rst_suppress_param(char *buf)
+{
+	return kstrtobool(buf, &luo_global.tcp_rst_suppress);
+}
+early_param("liveupdate_tcp_rst_suppress",
+	    early_liveupdate_tcp_rst_suppress_param);
+
 static int __init luo_early_startup(void)
 {
 	phys_addr_t fdt_phys;
@@ -259,6 +267,12 @@ bool liveupdate_enabled(void)
 	return luo_global.enabled;
 }
 
+bool liveupdate_tcp_rst_suppress_enabled(void)
+{
+	return liveupdate_enabled() && luo_global.tcp_rst_suppress;
+}
+EXPORT_SYMBOL_GPL(liveupdate_tcp_rst_suppress_enabled);
+
 /**
  * DOC: LUO ioctl Interface
  *
diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index 2c7dd3b12303..427ae74061ba 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -146,6 +146,7 @@ bool liveupdate_restore_in_progress(void)
 {
 	return atomic_long_read(&liveupdate_incoming_sessions_left) > 0;
 }
+EXPORT_SYMBOL_GPL(liveupdate_restore_in_progress);
 
 void __init luo_session_restore_window_init(void)
 {
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f8a9596e8f4d..9a95f3dbf39a 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -56,6 +56,7 @@
 #include <linux/fips.h>
 #include <linux/jhash.h>
 #include <linux/init.h>
+#include <linux/liveupdate.h>
 #include <linux/times.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
@@ -2349,6 +2350,10 @@ int tcp_v4_rcv(struct sk_buff *skb)
 bad_packet:
 		__TCP_INC_STATS(net, TCP_MIB_INERRS);
 	} else {
+		if (liveupdate_tcp_rst_suppress_enabled() &&
+		    liveupdate_restore_in_progress() &&
+		    th->ack && !th->syn)
+			goto discard_it;
 		tcp_v4_send_reset(NULL, skb, sk_rst_convert_drop_reason(drop_reason));
 	}
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 280fe5978559..c2e680eba041 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -40,6 +40,7 @@
 #include <linux/icmpv6.h>
 #include <linux/random.h>
 #include <linux/indirect_call_wrapper.h>
+#include <linux/liveupdate.h>
 
 #include <net/aligned_data.h>
 #include <net/tcp.h>
@@ -1900,6 +1901,10 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 bad_packet:
 		__TCP_INC_STATS(net, TCP_MIB_INERRS);
 	} else {
+		if (liveupdate_tcp_rst_suppress_enabled() &&
+		    liveupdate_restore_in_progress() &&
+		    th->ack && !th->syn)
+			goto discard_it;
 		tcp_v6_send_reset(NULL, skb, sk_rst_convert_drop_reason(drop_reason));
 	}
 
-- 
2.52.0


* Re: [PATCH v1 3/3] liveupdate: suppress TCP RST during post-kexec restore window
  2026-01-30 14:51 ` [PATCH v1 3/3] liveupdate: suppress TCP RST during " Li Chen
@ 2026-01-31  1:05   ` Jakub Kicinski
  2026-02-01  1:44     ` Li Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2026-01-31  1:05 UTC (permalink / raw)
  To: Li Chen
  Cc: Jonathan Corbet, Pasha Tatashin, Mike Rapoport, Pratyush Yadav,
	Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Paolo Abeni, Simon Horman, Andrew Morton,
	Borislav Petkov (AMD), Randy Dunlap, Pawan Gupta, Petr Mladek,
	Feng Tang, Kees Cook, Li RongQing, Arnd Bergmann, Askar Safin,
	Frank van der Linden, linux-doc, linux-kernel, netdev

On Fri, 30 Jan 2026 22:51:19 +0800 Li Chen wrote:
> During a kexec-based live update, userspace may restore established TCP
> connections after the new kernel has booted (e.g. via CRIU). Any packet
> arriving for a not-yet-restored socket will hit the no-socket path and
> trigger a TCP RST, causing the peer to immediately drop the connection.

Can you not add a filter to simply drop those packets until workload is
running again? It'd actually be less racy than this hac^w patch ...

* Re: [PATCH v1 3/3] liveupdate: suppress TCP RST during post-kexec restore window
  2026-01-31  1:05   ` Jakub Kicinski
@ 2026-02-01  1:44     ` Li Chen
  2026-02-03  0:53       ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Li Chen @ 2026-02-01  1:44 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jonathan Corbet, Pasha Tatashin, Mike Rapoport, Pratyush Yadav,
	Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Paolo Abeni, Simon Horman, Andrew Morton,
	Borislav Petkov, Randy Dunlap, Pawan Gupta, Petr Mladek,
	Feng Tang, Kees Cook, Li RongQing, Arnd Bergmann, Askar Safin,
	Frank van der Linden, linux-doc, linux-kernel, netdev

Hi Jakub,

 > On Fri, 30 Jan 2026 22:51:19 +0800 Li Chen wrote:
 > > During a kexec-based live update, userspace may restore established TCP
 > > connections after the new kernel has booted (e.g. via CRIU). Any packet
 > > arriving for a not-yet-restored socket will hit the no-socket path and
 > > trigger a TCP RST, causing the peer to immediately drop the connection.
 > 
 > Can you not add a filter to simply drop those packets until workload is
 > running again? It'd actually be less racy than this hac^w patch ...
 > 

Thanks for the suggestion.

When you say "add a filter", do you mean installing a temporary drop rule
(nftables/iptables/tc) in the network domain which does not get rebooted by
kexec (e.g. LB/ToR/host firewall), so packets never reach the new kernel
until the workload is restored and ready?

If you meant a filter inside the kexec'ed kernel, I'm worried it won't cover
the critical window: kexec resets the ruleset, so we'd have to install the
drop rule extremely early (initramfs) before any packets hit the no-socket
path, which still seems inherently racy.

If the expectation is to drain/blackhole traffic externally and re-enable it
once the workload is running again, I can rework the series to keep only the
restore-window tracking plus a clear "restore done" control plane, and rely
on the external filter for the data plane.

Regards
Li

* Re: [PATCH v1 3/3] liveupdate: suppress TCP RST during post-kexec restore window
  2026-02-01  1:44     ` Li Chen
@ 2026-02-03  0:53       ` Jakub Kicinski
  2026-02-03  3:15         ` Li Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2026-02-03  0:53 UTC (permalink / raw)
  To: Li Chen
  Cc: Jonathan Corbet, Pasha Tatashin, Mike Rapoport, Pratyush Yadav,
	Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Paolo Abeni, Simon Horman, Andrew Morton,
	Borislav Petkov, Randy Dunlap, Pawan Gupta, Petr Mladek,
	Feng Tang, Kees Cook, Li RongQing, Arnd Bergmann, Askar Safin,
	Frank van der Linden, linux-doc, linux-kernel, netdev

On Sun, 01 Feb 2026 09:44:27 +0800 Li Chen wrote:
>  > On Fri, 30 Jan 2026 22:51:19 +0800 Li Chen wrote:  
>  > > During a kexec-based live update, userspace may restore established TCP
>  > > connections after the new kernel has booted (e.g. via CRIU). Any packet
>  > > arriving for a not-yet-restored socket will hit the no-socket path and
>  > > trigger a TCP RST, causing the peer to immediately drop the connection.  
>  > 
>  > Can you not add a filter to simply drop those packets until workload is
>  > running again? It'd actually be less racy than this hac^w patch ...
>  >   
> 
> Thanks for the suggestion.
> 
> When you say "add a filter", do you mean installing a temporary drop rule
> (nftables/iptables/tc) in the network domain which does not get rebooted by
> kexec (e.g. LB/ToR/host firewall), so packets never reach the new kernel
> until the workload is restored and ready?
> 
> If you meant a filter inside the kexec'ed kernel, I'm worried it won't cover
> the critical window: kexec resets the ruleset, so we'd have to install the
> drop rule extremely early (initramfs) before any packets hit the no-socket
> path, which still seems inherently racy.

I'm not sure what your flow is exactly, but I assume you drive 
the workload restore from user space already?

* Re: [PATCH v1 3/3] liveupdate: suppress TCP RST during post-kexec restore window
  2026-02-03  0:53       ` Jakub Kicinski
@ 2026-02-03  3:15         ` Li Chen
  0 siblings, 0 replies; 8+ messages in thread
From: Li Chen @ 2026-02-03  3:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jonathan Corbet, Pasha Tatashin, Mike Rapoport, Pratyush Yadav,
	Eric Dumazet, Neal Cardwell, Kuniyuki Iwashima, David S. Miller,
	David Ahern, Paolo Abeni, Simon Horman, Andrew Morton,
	Borislav Petkov, Randy Dunlap, Pawan Gupta, Petr Mladek,
	Feng Tang, Kees Cook, Li RongQing, Arnd Bergmann, Askar Safin,
	Frank van der Linden, linux-doc, linux-kernel, netdev

Hi Jakub,

 ---- On Tue, 03 Feb 2026 08:53:20 +0800  Jakub Kicinski <kuba@kernel.org> wrote --- 
 > On Sun, 01 Feb 2026 09:44:27 +0800 Li Chen wrote:
 > >  > On Fri, 30 Jan 2026 22:51:19 +0800 Li Chen wrote:  
 > >  > > During a kexec-based live update, userspace may restore established TCP
 > >  > > connections after the new kernel has booted (e.g. via CRIU). Any packet
 > >  > > arriving for a not-yet-restored socket will hit the no-socket path and
 > >  > > trigger a TCP RST, causing the peer to immediately drop the connection.  
 > >  > 
 > >  > Can you not add a filter to simply drop those packets until workload is
 > >  > running again? It'd actually be less racy than this hac^w patch ...
 > >  >   
 > > 
 > > Thanks for the suggestion.
 > > 
 > > When you say "add a filter", do you mean installing a temporary drop rule
 > > (nftables/iptables/tc) in the network domain which does not get rebooted by
 > > kexec (e.g. LB/ToR/host firewall), so packets never reach the new kernel
 > > until the workload is restored and ready?
 > > 
 > > If you meant a filter inside the kexec'ed kernel, I'm worried it won't cover
 > > the critical window: kexec resets the ruleset, so we'd have to install the
 > > drop rule extremely early (initramfs) before any packets hit the no-socket
 > > path, which still seems inherently racy.
 > 
 > I'm not sure what your flow is exactly, but I assume you drive 
 > the workload restore from user space already?
 > 

Yes, in our PoC setup the post-kexec restore flow is driven from initramfs /
early userspace.

We pass an initramfs via kexec --initrd and install a temporary iptables INPUT
DROP rule from a dracut pre-mount hook (keyed by a cmdline like
luo_tcp_drop_port=...). In our external-peer test this avoids the early TCP RST
window; the peer just retransmits or times out until CRIU restore recreates the
socket.

The downside is that it makes the initramfs heavier (iptables userspace plus
the required xtables extensions, and it relies on legacy iptables filter
support being available early). I'm not sure this is a great general solution,
but it can work when the initramfs is under our control.
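
As a rough illustration of the hook logic, the sketch below extracts the port
from a command line string and formats a matching drop rule. It rests on
several assumptions: luo_tcp_drop_port= is taken to carry a single decimal
port, the rule shown is one plausible form rather than the exact rule from the
PoC, and parse_drop_port() and format_drop_rule() are hypothetical helpers.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DROP_KEY "luo_tcp_drop_port="

/*
 * Look for luo_tcp_drop_port=<port> in a kernel command line string.
 * Returns the port (1-65535), or 0 if the key is absent or malformed.
 */
static unsigned int parse_drop_port(const char *cmdline)
{
	const size_t klen = strlen(DROP_KEY);
	const char *p = cmdline;

	while ((p = strstr(p, DROP_KEY)) != NULL) {
		/* Accept the key only at the start of a word. */
		if (p == cmdline || p[-1] == ' ') {
			const char *val = p + klen;
			char *end;
			unsigned long v;

			if (!isdigit((unsigned char)*val))
				return 0;
			v = strtoul(val, &end, 10);
			if ((*end == '\0' || *end == ' ' || *end == '\n') &&
			    v >= 1 && v <= 65535)
				return (unsigned int)v;
			return 0;
		}
		p += klen;
	}
	return 0;
}

/* Format a temporary INPUT drop rule; returns snprintf()'s result. */
static int format_drop_rule(char *buf, size_t len, unsigned int port)
{
	return snprintf(buf, len,
			"iptables -I INPUT -p tcp --dport %u -j DROP", port);
}
```

A hook built this way would run the formatted command before the network
comes up and delete the rule again once the restore has completed.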

Regards,
Li

