* [PATCH v5 0/5] liveupdate: serialization safety and race fixes
@ 2026-05-18 12:54 Pasha Tatashin
  2026-05-18 12:54 ` [PATCH v5 1/5] liveupdate: skip serialization for context-preserving kexec Pasha Tatashin
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 12:54 UTC (permalink / raw)
  To: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, pasha.tatashin, rafael.j.wysocki, piliu, kexec,
	pratyush, skhawaja, graf, mario.limonciello

This series addresses several issues related to the synchronization
between the reboot process and LUO session management.

Changes in v5:
- Collected Acked-by from Mike Rapoport.
- In "block session mutations during reboot" (#3):
  - Moved down_read(&luo_session_serialize_rwsem) after luo_session_alloc()
    to minimize the critical section, and simplify cleanup.
  - Replaced scoped_guard() with explicit mutex_lock/unlock in
    luo_session_create() for consistency.

1. Skip LUO serialization for context-preserving kexec: A
preserve_context kexec returns to the current kernel, which is unrelated
to live update where state is passed to the next kernel. Skipping
serialization avoids unnecessary work and prevents sessions from being
left in a frozen state upon return.

2. Fix TOCTOU race in luo_session_retrieve(): Extend the rwsem lock
scope to prevent a session from being released between lookup and
mutex acquisition.

3. Block session mutations during reboot: During the reboot() syscall,
user processes may still be running concurrently and attempting to
mutate sessions. To prevent this, we introduce luo_session_serialize_rwsem.
All mutation operations (create, retrieve, release, ioctl) hold the
read lock. The serialization process holds the write lock indefinitely
on success, effectively freezing the subsystem.

4. Fix use-after-free in luo_file_unpreserve_files(): Reorder module_put()
to ensure the file handler module remains pinned while its operations
are being accessed during cleanup.

5. Remove unused ser field from struct luo_session: Clean up the
session structure by removing a field that was never utilized.

Tree: git.kernel.org/pub/scm/linux/kernel/git/tatashin/linux.git
Branch: luo-reboot-sync/v5

Pasha Tatashin (5):
  liveupdate: skip serialization for context-preserving kexec
  liveupdate: fix TOCTOU race in luo_session_retrieve()
  liveupdate: block session mutations during reboot
  liveupdate: fix u-a-f in luo_file_unpreserve_files() and
    luo_file_finish()
  liveupdate: Remove unused ser field from struct luo_session

 kernel/kexec_core.c              |  8 +++++---
 kernel/liveupdate/luo_file.c     |  5 +++--
 kernel/liveupdate/luo_internal.h |  2 --
 kernel/liveupdate/luo_session.c  | 35 ++++++++++++++++++++++++--------
 4 files changed, 34 insertions(+), 16 deletions(-)


base-commit: b1378127003b61930ce30064328640503ad3ef6d
-- 
2.53.0




* [PATCH v5 1/5] liveupdate: skip serialization for context-preserving kexec
  2026-05-18 12:54 [PATCH v5 0/5] liveupdate: serialization safety and race fixes Pasha Tatashin
@ 2026-05-18 12:54 ` Pasha Tatashin
  2026-05-18 12:54 ` [PATCH v5 2/5] liveupdate: fix TOCTOU race in luo_session_retrieve() Pasha Tatashin
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 12:54 UTC (permalink / raw)
  To: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, pasha.tatashin, rafael.j.wysocki, piliu, kexec,
	pratyush, skhawaja, graf, mario.limonciello

A preserve_context kexec returns to the current kernel, which is
unrelated to live update where the state is passed to the next kernel.
Skip liveupdate_reboot() in this case to avoid serialization and prevent
sessions from being left in a frozen state upon return.

Fixes: db8bed8082dc ("kexec: call liveupdate_reboot() before kexec")
Reported-by: Oskar Gerlicz Kowalczuk <oskar@gerlicz.space>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 kernel/kexec_core.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index a43d2da0fe3e..dc770b9a6d05 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1146,9 +1146,11 @@ int kernel_kexec(void)
 		goto Unlock;
 	}
 
-	error = liveupdate_reboot();
-	if (error)
-		goto Unlock;
+	if (!kexec_image->preserve_context) {
+		error = liveupdate_reboot();
+		if (error)
+			goto Unlock;
+	}
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
-- 
2.53.0




* [PATCH v5 2/5] liveupdate: fix TOCTOU race in luo_session_retrieve()
  2026-05-18 12:54 [PATCH v5 0/5] liveupdate: serialization safety and race fixes Pasha Tatashin
  2026-05-18 12:54 ` [PATCH v5 1/5] liveupdate: skip serialization for context-preserving kexec Pasha Tatashin
@ 2026-05-18 12:54 ` Pasha Tatashin
  2026-05-18 16:13   ` Pratyush Yadav
  2026-05-18 12:54 ` [PATCH v5 3/5] liveupdate: block session mutations during reboot Pasha Tatashin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 12:54 UTC (permalink / raw)
  To: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, pasha.tatashin, rafael.j.wysocki, piliu, kexec,
	pratyush, skhawaja, graf, mario.limonciello

Extend the scope of the rwsem_read lock in luo_session_retrieve() to
overlap with the acquisition of the session mutex. This prevents a
concurrent thread from releasing and freeing the session between the
lookup and the mutex lock.

Fixes: 0153094d03df ("liveupdate: luo_session: add sessions support")
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 kernel/liveupdate/luo_session.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index a3327a28fc1f..59b37d17db6b 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -415,12 +415,11 @@ int luo_session_retrieve(const char *name, struct file **filep)
 	struct luo_session *it;
 	int err;
 
-	scoped_guard(rwsem_read, &sh->rwsem) {
-		list_for_each_entry(it, &sh->list, list) {
-			if (!strncmp(it->name, name, sizeof(it->name))) {
-				session = it;
-				break;
-			}
+	guard(rwsem_read)(&sh->rwsem);
+	list_for_each_entry(it, &sh->list, list) {
+		if (!strncmp(it->name, name, sizeof(it->name))) {
+			session = it;
+			break;
 		}
 	}
 
-- 
2.53.0




* [PATCH v5 3/5] liveupdate: block session mutations during reboot
  2026-05-18 12:54 [PATCH v5 0/5] liveupdate: serialization safety and race fixes Pasha Tatashin
  2026-05-18 12:54 ` [PATCH v5 1/5] liveupdate: skip serialization for context-preserving kexec Pasha Tatashin
  2026-05-18 12:54 ` [PATCH v5 2/5] liveupdate: fix TOCTOU race in luo_session_retrieve() Pasha Tatashin
@ 2026-05-18 12:54 ` Pasha Tatashin
  2026-05-18 16:31   ` Pratyush Yadav
  2026-05-18 12:54 ` [PATCH v5 4/5] liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish() Pasha Tatashin
  2026-05-18 12:54 ` [PATCH v5 5/5] liveupdate: Remove unused ser field from struct luo_session Pasha Tatashin
  4 siblings, 1 reply; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 12:54 UTC (permalink / raw)
  To: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, pasha.tatashin, rafael.j.wysocki, piliu, kexec,
	pratyush, skhawaja, graf, mario.limonciello

During the reboot() syscall, user processes may still be running
concurrently and attempting to mutate sessions (e.g., creating,
retrieving, or releasing sessions). To prevent this, introduce
luo_session_serialize_rwsem to synchronize mutations with the
serialization process.

All session mutation operations (create, retrieve, release, ioctl) take
the read lock. The serialization process (luo_session_serialize) takes
the write lock and holds it indefinitely on success. This effectively
freezes the LUO session subsystem during the transition to the new
kernel. If serialization fails, the lock is released to allow recovery.

Fixes: 0153094d03df ("liveupdate: luo_session: add sessions support")
Reported-by: Oskar Gerlicz Kowalczuk <oskar@gerlicz.space>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 kernel/liveupdate/luo_session.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index 59b37d17db6b..812f09152ec1 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -75,6 +75,13 @@
 		sizeof(struct luo_session_header_ser)) /		\
 		sizeof(struct luo_session_ser))
 
+/*
+ * Protects session mutations during serialization. All session mutation
+ * operations must hold the read lock. The serialization process holds the write
+ * lock indefinitely on success to block all concurrent and future mutations.
+ */
+static DECLARE_RWSEM(luo_session_serialize_rwsem);
+
 /**
  * struct luo_session_header - Header struct for managing LUO sessions.
  * @count:      The number of sessions currently tracked in the @list.
@@ -205,6 +212,7 @@ static int luo_session_release(struct inode *inodep, struct file *filep)
 	struct luo_session *session = filep->private_data;
 	struct luo_session_header *sh;
 
+	guard(rwsem_read)(&luo_session_serialize_rwsem);
 	/* If retrieved is set, it means this session is from incoming list */
 	if (session->retrieved) {
 		int err = luo_session_finish_one(session);
@@ -354,6 +362,7 @@ static long luo_session_ioctl(struct file *filep, unsigned int cmd,
 	if (ret)
 		return ret;
 
+	guard(rwsem_read)(&luo_session_serialize_rwsem);
 	return op->execute(session, &ucmd);
 }
 
@@ -389,14 +398,17 @@ int luo_session_create(const char *name, struct file **filep)
 	if (IS_ERR(session))
 		return PTR_ERR(session);
 
+	down_read(&luo_session_serialize_rwsem);
 	err = luo_session_insert(&luo_session_global.outgoing, session);
 	if (err)
 		goto err_free;
 
-	scoped_guard(mutex, &session->mutex)
-		err = luo_session_getfile(session, filep);
+	mutex_lock(&session->mutex);
+	err = luo_session_getfile(session, filep);
+	mutex_unlock(&session->mutex);
 	if (err)
 		goto err_remove;
+	up_read(&luo_session_serialize_rwsem);
 
 	return 0;
 
@@ -404,6 +416,7 @@ int luo_session_create(const char *name, struct file **filep)
 	luo_session_remove(&luo_session_global.outgoing, session);
 err_free:
 	luo_session_free(session);
+	up_read(&luo_session_serialize_rwsem);
 
 	return err;
 }
@@ -415,6 +428,7 @@ int luo_session_retrieve(const char *name, struct file **filep)
 	struct luo_session *it;
 	int err;
 
+	guard(rwsem_read)(&luo_session_serialize_rwsem);
 	guard(rwsem_read)(&sh->rwsem);
 	list_for_each_entry(it, &sh->list, list) {
 		if (!strncmp(it->name, name, sizeof(it->name))) {
@@ -582,7 +596,8 @@ int luo_session_serialize(void)
 	int i = 0;
 	int err;
 
-	guard(rwsem_write)(&sh->rwsem);
+	down_write(&luo_session_serialize_rwsem);
+	down_write(&sh->rwsem);
 	list_for_each_entry(session, &sh->list, list) {
 		err = luo_session_freeze_one(session, &sh->ser[i]);
 		if (err)
@@ -593,6 +608,7 @@ int luo_session_serialize(void)
 		i++;
 	}
 	sh->header_ser->count = sh->count;
+	up_write(&sh->rwsem);
 
 	return 0;
 
@@ -602,6 +618,8 @@ int luo_session_serialize(void)
 		luo_session_unfreeze_one(session, &sh->ser[i]);
 		memset(sh->ser[i].name, 0, sizeof(sh->ser[i].name));
 	}
+	up_write(&sh->rwsem);
+	up_write(&luo_session_serialize_rwsem);
 
 	return err;
 }
-- 
2.53.0




* [PATCH v5 4/5] liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish()
  2026-05-18 12:54 [PATCH v5 0/5] liveupdate: serialization safety and race fixes Pasha Tatashin
                   ` (2 preceding siblings ...)
  2026-05-18 12:54 ` [PATCH v5 3/5] liveupdate: block session mutations during reboot Pasha Tatashin
@ 2026-05-18 12:54 ` Pasha Tatashin
  2026-05-18 16:24   ` Pratyush Yadav
  2026-05-18 12:54 ` [PATCH v5 5/5] liveupdate: Remove unused ser field from struct luo_session Pasha Tatashin
  4 siblings, 1 reply; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 12:54 UTC (permalink / raw)
  To: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, pasha.tatashin, rafael.j.wysocki, piliu, kexec,
	pratyush, skhawaja, graf, mario.limonciello

In luo_file_unpreserve_files() and luo_file_finish(), reorder
module_put() and xa_erase() to ensure the file handler module remains
pinned while its operations are being accessed.

Specifically, luo_get_id() dereferences fh->ops->get_id, so the module
reference must be held until after xa_erase() (which calls luo_get_id)
completes.

For luo_file_finish(), this requires moving the module_put() call out of
the luo_file_finish_one() helper and into the main loop of
luo_file_finish() itself.
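
As an illustration only (not part of the patch), the ordering rule can be
sketched in plain C. The names struct handler, handler_put() and
release_file() are hypothetical; dropping the last reference clears the ops
pointer to model a module whose code may go away once it is unpinned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a file handler and its operations table. */
struct handler_ops {
	unsigned long (*get_id)(const void *file);
};

struct handler {
	const struct handler_ops *ops;
	int refcount;
};

/* Example get_id callback: derive an id from the file pointer. */
static unsigned long file_id(const void *file)
{
	return (unsigned long)(uintptr_t)file;
}

/*
 * Drop one reference. When the count reaches zero the ops table is treated
 * as gone, the way a module's code may be unloaded after the last
 * module_put().
 */
static void handler_put(struct handler *h)
{
	if (--h->refcount == 0)
		h->ops = NULL;
}

/*
 * Correct order: call through ops first, drop the reference last.
 * Swapping the two statements would dereference h->ops after it may have
 * become invalid, which is the use-after-free pattern being fixed.
 */
static unsigned long release_file(struct handler *h, const void *file)
{
	unsigned long id = h->ops->get_id(file);

	handler_put(h);
	return id;
}
```

In the sketch the id is computed while the reference is still held, matching
the patch's move of module_put() to after xa_erase().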

Fixes: 00d0b372374f ("liveupdate: prevent double management of files")
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 kernel/liveupdate/luo_file.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c
index a0a419085e28..208987502f73 100644
--- a/kernel/liveupdate/luo_file.c
+++ b/kernel/liveupdate/luo_file.c
@@ -385,10 +385,11 @@ void luo_file_unpreserve_files(struct luo_file_set *file_set)
 		args.private_data = luo_file->private_data;
 		luo_file->fh->ops->unpreserve(&args);
 		luo_flb_file_unpreserve(luo_file->fh);
-		module_put(luo_file->fh->ops->owner);
 
 		xa_erase(&luo_preserved_files,
 			 luo_get_id(luo_file->fh, luo_file->file));
+		module_put(luo_file->fh->ops->owner);
+
 		list_del(&luo_file->list);
 		file_set->count--;
 
@@ -677,7 +678,6 @@ static void luo_file_finish_one(struct luo_file_set *file_set,
 
 	luo_file->fh->ops->finish(&args);
 	luo_flb_file_finish(luo_file->fh);
-	module_put(luo_file->fh->ops->owner);
 }
 
 /**
@@ -738,6 +738,7 @@ int luo_file_finish(struct luo_file_set *file_set)
 				 luo_get_id(luo_file->fh, luo_file->file));
 			fput(luo_file->file);
 		}
+		module_put(luo_file->fh->ops->owner);
 		list_del(&luo_file->list);
 		file_set->count--;
 		mutex_destroy(&luo_file->mutex);
-- 
2.53.0




* [PATCH v5 5/5] liveupdate: Remove unused ser field from struct luo_session
  2026-05-18 12:54 [PATCH v5 0/5] liveupdate: serialization safety and race fixes Pasha Tatashin
                   ` (3 preceding siblings ...)
  2026-05-18 12:54 ` [PATCH v5 4/5] liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish() Pasha Tatashin
@ 2026-05-18 12:54 ` Pasha Tatashin
  2026-05-18 16:24   ` Pratyush Yadav
  4 siblings, 1 reply; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 12:54 UTC (permalink / raw)
  To: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, pasha.tatashin, rafael.j.wysocki, piliu, kexec,
	pratyush, skhawaja, graf, mario.limonciello

The ser field in struct luo_session was intended to point to the
serialized data for a session, but it was never actually utilized in the
implementation. All serialization and deserialization logic consistently
uses the pointers maintained in struct luo_session_header.

Remove the dead field to clean up the structure.

Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 kernel/liveupdate/luo_internal.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 875844d7a41d..dd53d4a7277e 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -59,7 +59,6 @@ struct luo_file_set {
  * struct luo_session - Represents an active or incoming Live Update session.
  * @name:       A unique name for this session, used for identification and
  *              retrieval.
- * @ser:        Pointer to the serialized data for this session.
  * @list:       A list_head member used to link this session into a global list
  *              of either outgoing (to be preserved) or incoming (restored from
  *              previous kernel) sessions.
@@ -70,7 +69,6 @@ struct luo_file_set {
  */
 struct luo_session {
 	char name[LIVEUPDATE_SESSION_NAME_LENGTH];
-	struct luo_session_ser *ser;
 	struct list_head list;
 	bool retrieved;
 	struct luo_file_set file_set;
-- 
2.53.0




* Re: [PATCH v5 2/5] liveupdate: fix TOCTOU race in luo_session_retrieve()
  2026-05-18 12:54 ` [PATCH v5 2/5] liveupdate: fix TOCTOU race in luo_session_retrieve() Pasha Tatashin
@ 2026-05-18 16:13   ` Pratyush Yadav
  0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Yadav @ 2026-05-18 16:13 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, rafael.j.wysocki, piliu, kexec, pratyush, skhawaja,
	graf, mario.limonciello

On Mon, May 18 2026, Pasha Tatashin wrote:

> Extend the scope of the rwsem_read lock in luo_session_retrieve() to
> overlap with the acquisition of the session mutex. This prevents a
> concurrent thread from releasing and freeing the session between the
> lookup and the mutex lock.
>
> Fixes: 0153094d03df ("liveupdate: luo_session: add sessions support")
> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

I was about to comment that perhaps we should drop session header lock
after getting session lock, but I think the session retrieval is pretty
fast so this should not matter.

So,

Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>

-- 
Regards,
Pratyush Yadav



* Re: [PATCH v5 4/5] liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish()
  2026-05-18 12:54 ` [PATCH v5 4/5] liveupdate: fix u-a-f in luo_file_unpreserve_files() and luo_file_finish() Pasha Tatashin
@ 2026-05-18 16:24   ` Pratyush Yadav
  0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Yadav @ 2026-05-18 16:24 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, rafael.j.wysocki, piliu, kexec, pratyush, skhawaja,
	graf, mario.limonciello

On Mon, May 18 2026, Pasha Tatashin wrote:

> In luo_file_unpreserve_files() and luo_file_finish(), reorder
> module_put() and xa_erase() to ensure the file handler module remains
> pinned while its operations are being accessed.
>
> Specifically, luo_get_id() dereferences fh->ops->get_id, so the module
> reference must be held until after xa_erase() (which calls luo_get_id)
> completes.
>
> For luo_file_finish(), this requires moving the module_put() call out of
> the luo_file_finish_one() helper and into the main loop of
> luo_file_finish() itself.
>
> Fixes: 00d0b372374f ("liveupdate: prevent double management of files")
> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>

-- 
Regards,
Pratyush Yadav



* Re: [PATCH v5 5/5] liveupdate: Remove unused ser field from struct luo_session
  2026-05-18 12:54 ` [PATCH v5 5/5] liveupdate: Remove unused ser field from struct luo_session Pasha Tatashin
@ 2026-05-18 16:24   ` Pratyush Yadav
  0 siblings, 0 replies; 13+ messages in thread
From: Pratyush Yadav @ 2026-05-18 16:24 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, rafael.j.wysocki, piliu, kexec, pratyush, skhawaja,
	graf, mario.limonciello

On Mon, May 18 2026, Pasha Tatashin wrote:

> The ser field in struct luo_session was intended to point to the
> serialized data for a session, but it was never actually utilized in the
> implementation. All serialization and deserialization logic consistently
> uses the pointers maintained in struct luo_session_header.
>
> Remove the dead field to clean up the structure.
>
> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>

-- 
Regards,
Pratyush Yadav



* Re: [PATCH v5 3/5] liveupdate: block session mutations during reboot
  2026-05-18 12:54 ` [PATCH v5 3/5] liveupdate: block session mutations during reboot Pasha Tatashin
@ 2026-05-18 16:31   ` Pratyush Yadav
  2026-05-18 23:15     ` Pasha Tatashin
  0 siblings, 1 reply; 13+ messages in thread
From: Pratyush Yadav @ 2026-05-18 16:31 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: rppt, sourabhjain, jbouron, akpm, bhe, linux-kernel,
	dan.carpenter, rafael.j.wysocki, piliu, kexec, pratyush, skhawaja,
	graf, mario.limonciello

On Mon, May 18 2026, Pasha Tatashin wrote:

> During the reboot() syscall, user processes may still be running
> concurrently and attempting to mutate sessions (e.g., creating,
> retrieving, or releasing sessions). To prevent this, introduce
> luo_session_serialize_rwsem to synchronize mutations with the
> serialization process.
>
> All session mutation operations (create, retrieve, release, ioctl) take
> the read lock. The serialization process (luo_session_serialize) takes
> the write lock and holds it indefinitely on success. This effectively
> freezes the LUO session subsystem during the transition to the new
> kernel. If serialization fails, the lock is released to allow recovery.

Good idea I think.

But, do we need a new mutex? Can't we use luo_session_header->rwsem?
Session creation and release take the header rwsem at one point anyway,
so perhaps we can just reuse that?

Also, do we need to block incoming sessions? They won't participate in
serialization, so perhaps we can leave those alone, and all the outgoing
sessions get protected by the outgoing session header rwsem?

>
> Fixes: 0153094d03df ("liveupdate: luo_session: add sessions support")
> Reported-by: Oskar Gerlicz Kowalczuk <oskar@gerlicz.space>
> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>

-- 
Regards,
Pratyush Yadav



* Re: [PATCH v5 3/5] liveupdate: block session mutations during reboot
  2026-05-18 16:31   ` Pratyush Yadav
@ 2026-05-18 23:15     ` Pasha Tatashin
  2026-05-22 12:52       ` Pratyush Yadav
  0 siblings, 1 reply; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-18 23:15 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: Pasha Tatashin, rppt, sourabhjain, jbouron, akpm, bhe,
	linux-kernel, dan.carpenter, rafael.j.wysocki, piliu, kexec,
	skhawaja, graf, mario.limonciello

On 05-18 18:31, Pratyush Yadav wrote:
> On Mon, May 18 2026, Pasha Tatashin wrote:
> 
> > During the reboot() syscall, user processes may still be running
> > concurrently and attempting to mutate sessions (e.g., creating,
> > retrieving, or releasing sessions). To prevent this, introduce
> > luo_session_serialize_rwsem to synchronize mutations with the
> > serialization process.
> >
> > All session mutation operations (create, retrieve, release, ioctl) take
> > the read lock. The serialization process (luo_session_serialize) takes
> > the write lock and holds it indefinitely on success. This effectively
> > freezes the LUO session subsystem during the transition to the new
> > kernel. If serialization fails, the lock is released to allow recovery.
> 
> Good idea I think.

Hi Pratyush,

> 
> But, do we need a new mutex? Can't we use luo_session_header->rwsem?
> Session creation and release take the header rwsem at one point anyway,
> so perhaps we can just reuse that?

The sh->rwsem is for protecting the session list. We only take it as
a writer when modifying the list (insert/remove) and as a reader when 
traversing it. Also, we drop sh->rwsem as soon as we've acquired the 
per-session mutex to allow other list operations to proceed while a 
session is being modified.

Because of this, many session mutation operations (specifically ioctl
calls) don't touch sh->rwsem at all; they jump straight to the session
state via the file's private_data. To use sh->rwsem to block
these mutations, we would be forced to add down_read(&sh->rwsem) to 
every ioctl path. This would be a layering violation, coupling list 
management to per-session data mutations, and would introduce a global
bottleneck for operations that are otherwise independent.

The only other way to prevent mutations without a new global lock would 
be for the reboot process to acquire every individual session mutex. 
However, since LUO_SESSION_MAX can be large, this would exceed lockdep's 
maximum lock tracking limit and trigger failures. The 
luo_session_serialize_rwsem provides a dedicated signal to freeze the 
entire subsystem without messing with the existing fine-grained locking 
logic.

> 
> Also, do we need to block incoming sessions? They won't participate in
> serialization, so perhaps we can leave those alone, and all the outgoing
> sessions get protected by the outgoing session header rwsem?

Incoming sessions don't participate in serialization, but blocking them 
makes the code more robust. This provides a level of future-proofing: if
new ioctls or operations are added later, we won't accidentally miss a
path that should have been frozen during reboot. It's safer to treat the 
subsystem as a single unit that freezes entirely once the transition 
begins.

Pasha



* Re: [PATCH v5 3/5] liveupdate: block session mutations during reboot
  2026-05-18 23:15     ` Pasha Tatashin
@ 2026-05-22 12:52       ` Pratyush Yadav
  2026-05-27 20:06         ` Pasha Tatashin
  0 siblings, 1 reply; 13+ messages in thread
From: Pratyush Yadav @ 2026-05-22 12:52 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: Pratyush Yadav, rppt, sourabhjain, jbouron, akpm, bhe,
	linux-kernel, dan.carpenter, rafael.j.wysocki, piliu, kexec,
	skhawaja, graf, mario.limonciello

On Mon, May 18 2026, Pasha Tatashin wrote:

> On 05-18 18:31, Pratyush Yadav wrote:
>> On Mon, May 18 2026, Pasha Tatashin wrote:
>> 
>> > During the reboot() syscall, user processes may still be running
>> > concurrently and attempting to mutate sessions (e.g., creating,
>> > retrieving, or releasing sessions). To prevent this, introduce
>> > luo_session_serialize_rwsem to synchronize mutations with the
>> > serialization process.
>> >
>> > All session mutation operations (create, retrieve, release, ioctl) take
>> > the read lock. The serialization process (luo_session_serialize) takes
>> > the write lock and holds it indefinitely on success. This effectively
>> > freezes the LUO session subsystem during the transition to the new
>> > kernel. If serialization fails, the lock is released to allow recovery.
>> 
>> Good idea I think.
>
> Hi Pratyush,
>
>> 
>> But, do we need a new mutex? Can't we use luo_session_header->rwsem?
>> Session creation and release take the header rwsem at one point anyway,
>> so perhaps we can just reuse that?
>
> The sh->rwsem is for protecting the session list. We only take it as
> a writer when modifying the list (insert/remove) and as a reader when 
> traversing it. Also, we drop sh->rwsem as soon as we've acquired the 
> per-session mutex to allow other list operations to proceed while a 
> session is being modified.
>
> Because of this, many session mutation operations (specifically ioctl
> calls) don't touch sh->rwsem at all; they jump straight to the session
> state via the file's private_data. To use sh->rwsem to block
> these mutations, we would be forced to add down_read(&sh->rwsem) to 
> every ioctl path. This would be a layering violation, coupling list 
> management to per-session data mutations, and would introduce a global
> bottleneck for operations that are otherwise independent.

As for the layering violation, I think we would need to change the
semantics of the lock -- it no longer protects only the list, but other
session operations as well.

But yeah, if we do this then operations like session creation would have
to wait for ongoing session operations like PRESERVE_FD. My argument was
based around the fact that session creation or removal should not be
very frequent (and don't happen in the hot path anyway) so the added
latency should not affect them as much. By doing this tradeoff we get
slightly simpler code (and simpler locking scheme).

But I see your point as well. In practice session creation and
PRESERVE_FD are independent and one should not block the other. Maybe we
get VMMs creating sessions while another VMM is preserving stuff, and
this slowing down the live update preparation? Dunno...

I suppose let's go with this patch. But, can you please document the
lock hierarchy where you explain what this lock is for?

>
> The only other way to prevent mutations without a new global lock would 
> be for the reboot process to acquire every individual session mutex. 
> However, since LUO_SESSION_MAX can be large, this would exceed lockdep's 
> maximum lock tracking limit and trigger failures. The 
> luo_session_serialize_rwsem provides a dedicated signal to freeze the 
> entire subsystem without messing with the existing fine-grained locking 
> logic.

Yeah, that would have been nice, but we shouldn't break lockdep I reckon.

>
>> 
>> Also, do we need to block incoming sessions? They won't participate in
>> serialization, so perhaps we can leave those alone, and all the outgoing
>> sessions get protected by the outgoing session header rwsem?
>
> Incoming sessions don't participate in serialization, but blocking them 
> makes the code more robust. This provides a level of future-proofing: if
> new ioctls or operations are added later, we won't accidentally miss a
> path that should have been frozen during reboot. It's safer to treat the 
> subsystem as a single unit that freezes entirely once the transition 
> begins.

Sure, makes sense.

-- 
Regards,
Pratyush Yadav



* Re: [PATCH v5 3/5] liveupdate: block session mutations during reboot
  2026-05-22 12:52       ` Pratyush Yadav
@ 2026-05-27 20:06         ` Pasha Tatashin
  0 siblings, 0 replies; 13+ messages in thread
From: Pasha Tatashin @ 2026-05-27 20:06 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: Pasha Tatashin, rppt, sourabhjain, jbouron, akpm, bhe,
	linux-kernel, dan.carpenter, rafael.j.wysocki, piliu, kexec,
	skhawaja, graf, mario.limonciello

On 05-22 14:52, Pratyush Yadav wrote:
> On Mon, May 18 2026, Pasha Tatashin wrote:
> 
> > On 05-18 18:31, Pratyush Yadav wrote:
> >> On Mon, May 18 2026, Pasha Tatashin wrote:
> >> 
> >> > During the reboot() syscall, user processes may still be running
> >> > concurrently and attempting to mutate sessions (e.g., creating,
> >> > retrieving, or releasing sessions). To prevent this, introduce
> >> > luo_session_serialize_rwsem to synchronize mutations with the
> >> > serialization process.
> >> >
> >> > All session mutation operations (create, retrieve, release, ioctl) take
> >> > the read lock. The serialization process (luo_session_serialize) takes
> >> > the write lock and holds it indefinitely on success. This effectively
> >> > freezes the LUO session subsystem during the transition to the new
> >> > kernel. If serialization fails, the lock is released to allow recovery.
> >> 
> >> Good idea I think.
> >
> > Hi Pratyush,
> >
> >> 
> >> But, do we need a new mutex? Can't we use luo_session_header->rwsem?
> >> Session creation and release take the header rwsem at one point anyway,
> >> so perhaps we can just reuse that?
> >
> > The sh->rwsem is for protecting the session list. We only take it as
> > a writer when modifying the list (insert/remove) and as a reader when 
> > traversing it. Also, we drop sh->rwsem as soon as we've acquired the 
> > per-session mutex to allow other list operations to proceed while a 
> > session is being modified.
> >
> > Because of this, many session mutation operations (specifically ioctl
> > calls) don't touch sh->rwsem at all; they jump straight to the session
> > state via the file's private_data. To use sh->rwsem to block
> > these mutations, we would be forced to add down_read(&sh->rwsem) to 
> > every ioctl path. This would be a layering violation, coupling list 
> > management to per-session data mutations, and would introduce a global
> > bottleneck for operations that are otherwise independent.
> 
> As for the layering violation, I think we would need to change the
> semantics of the lock -- it no longer protects only the list, but other
> session operations as well.
> 
> But yeah, if we do this then operations like session creation would have
> to wait for ongoing session operations like PRESERVE_FD. My argument was
> based around the fact that session creation or removal should not be
> very frequent (and don't happen in the hot path anyway) so the added
> latency should not affect them as much. By doing this tradeoff we get
> slightly simpler code (and simpler locking scheme).
> 
> But I see your point as well. In practice session creation and
> PRESERVE_FD are independent and one should not block the other. Maybe we
> get VMMs creating sessions while another VMM is preserving stuff, and
> this slowing down the live update preparation? Dunno...
> 
> I suppose let's go with this patch. But, can you please document the
> lock hierarchy where you explain what this lock is for?

SGTM, added documentation about the locking.

Pasha

