[PATCH AUTOSEL 6.15 15/21] dm vdo indexer: don't read request structure after enqueuing

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Matthew Sakai <msakai@redhat.com>,
	Ken Raeburn <raeburn@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 6.15 15/21] dm vdo indexer: don't read request structure after enqueuing
Date: Fri,  6 Jun 2025 11:41:40 -0400	[thread overview]
Message-ID: <20250606154147.546388-15-sashal@kernel.org> (raw)
In-Reply-To: <20250606154147.546388-1-sashal@kernel.org>

From: Matthew Sakai <msakai@redhat.com>

[ Upstream commit 3da732687d72078e52cc7f334a482383e84ca156 ]

The function get_volume_page_protected may place a request on
a queue for another thread to process asynchronously. When this
happens, the volume should not read the request from the original
thread. This can not currently cause problems, due to the way
request processing is handled, but it is not safe in general.

Reviewed-by: Ken Raeburn <raeburn@redhat.com>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

**YES**

This commit should be backported to stable kernel trees. Here's my
analysis:

## Bug Analysis

This commit fixes a **serious race condition and potential use-after-
free vulnerability** in the dm-vdo indexer. The issue occurs when a
request structure is accessed after being enqueued for asynchronous
processing by another thread.

### The Problem

Looking at the code changes, the original implementation had a dangerous
pattern:

```c
// In get_volume_page_protected() - BEFORE fix
if (page == NULL) {
    enqueue_page_read(volume, request, physical_page);  // Request
handed to reader thread
    mutex_unlock(&volume->read_threads_mutex);
    begin_pending_search(&volume->page_cache, physical_page,
request->zone_number);  // UNSAFE!
    //
^^^^^^^^^^^^^^
    // Accessing request after it may have been freed/modified by reader
thread
}
```

Once `enqueue_page_read()` is called, the request structure is owned by
the reader thread pool and may be:
- Freed immediately
- Modified by the reader thread
- Reused for another request

Any subsequent access to `request->zone_number` creates a race condition
that could lead to:
1. **Kernel crashes** from accessing freed memory
2. **Silent data corruption** from using wrong zone numbers
3. **System instability** in production storage environments

### The Fix

The solution is elegant and safe - cache the zone number immediately:

```c
// AFTER fix
static int get_volume_page_protected(struct volume *volume, struct
uds_request *request,
                                     u32 physical_page, struct
cached_page **page_ptr)
{
    unsigned int zone_number = request->zone_number;  // Cache
immediately

    // ... later use zone_number instead of request->zone_number
    begin_pending_search(&volume->page_cache, physical_page,
zone_number);  // SAFE!
}
```

## Backport Justification

This meets all stable tree criteria:

1. **Critical Bug Fix**: Fixes a race condition that can cause kernel
   crashes and data corruption
2. **Production Impact**: VDO is used in enterprise storage systems
   where this could cause severe problems
3. **Low Risk**: The fix is minimal (just caching a variable locally)
   with no functional changes
4. **Hard to Debug**: Race conditions are timing-dependent and extremely
   difficult to diagnose in production
5. **Clear Root Cause**: The commit message and code changes clearly
   identify and fix the specific issue

The commit message explicitly states "This can not currently cause
problems, due to the way request processing is handled, but it is not
safe in general" - indicating this is a latent bug that could be
triggered by future changes or under specific timing conditions.

This type of memory safety fix in critical storage infrastructure should
absolutely be backported to prevent potential data loss or system
crashes in production environments.

 drivers/md/dm-vdo/indexer/volume.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/md/dm-vdo/indexer/volume.c b/drivers/md/dm-vdo/indexer/volume.c
index 655453bb276be..425b3a74f4dba 100644
--- a/drivers/md/dm-vdo/indexer/volume.c
+++ b/drivers/md/dm-vdo/indexer/volume.c
@@ -754,10 +754,11 @@ static int get_volume_page_protected(struct volume *volume, struct uds_request *
 				     u32 physical_page, struct cached_page **page_ptr)
 {
 	struct cached_page *page;
+	unsigned int zone_number = request->zone_number;
 
 	get_page_from_cache(&volume->page_cache, physical_page, &page);
 	if (page != NULL) {
-		if (request->zone_number == 0) {
+		if (zone_number == 0) {
 			/* Only one zone is allowed to update the LRU. */
 			make_page_most_recent(&volume->page_cache, page);
 		}
@@ -767,7 +768,7 @@ static int get_volume_page_protected(struct volume *volume, struct uds_request *
 	}
 
 	/* Prepare to enqueue a read for the page. */
-	end_pending_search(&volume->page_cache, request->zone_number);
+	end_pending_search(&volume->page_cache, zone_number);
 	mutex_lock(&volume->read_threads_mutex);
 
 	/*
@@ -787,8 +788,7 @@ static int get_volume_page_protected(struct volume *volume, struct uds_request *
 		 * the order does not matter for correctness as it does below.
 		 */
 		mutex_unlock(&volume->read_threads_mutex);
-		begin_pending_search(&volume->page_cache, physical_page,
-				     request->zone_number);
+		begin_pending_search(&volume->page_cache, physical_page, zone_number);
 		return UDS_QUEUED;
 	}
 
@@ -797,7 +797,7 @@ static int get_volume_page_protected(struct volume *volume, struct uds_request *
 	 * "search pending" state in careful order so no other thread can mess with the data before
 	 * the caller gets to look at it.
 	 */
-	begin_pending_search(&volume->page_cache, physical_page, request->zone_number);
+	begin_pending_search(&volume->page_cache, physical_page, zone_number);
 	mutex_unlock(&volume->read_threads_mutex);
 	*page_ptr = page;
 	return UDS_SUCCESS;
@@ -849,6 +849,7 @@ static int search_cached_index_page(struct volume *volume, struct uds_request *r
 {
 	int result;
 	struct cached_page *page = NULL;
+	unsigned int zone_number = request->zone_number;
 	u32 physical_page = map_to_physical_page(volume->geometry, chapter,
 						 index_page_number);
 
@@ -858,18 +859,18 @@ static int search_cached_index_page(struct volume *volume, struct uds_request *r
 	 * invalidation by the reader thread, before the reader thread has noticed that the
 	 * invalidate_counter has been incremented.
 	 */
-	begin_pending_search(&volume->page_cache, physical_page, request->zone_number);
+	begin_pending_search(&volume->page_cache, physical_page, zone_number);
 
 	result = get_volume_page_protected(volume, request, physical_page, &page);
 	if (result != UDS_SUCCESS) {
-		end_pending_search(&volume->page_cache, request->zone_number);
+		end_pending_search(&volume->page_cache, zone_number);
 		return result;
 	}
 
 	result = uds_search_chapter_index_page(&page->index_page, volume->geometry,
 					       &request->record_name,
 					       record_page_number);
-	end_pending_search(&volume->page_cache, request->zone_number);
+	end_pending_search(&volume->page_cache, zone_number);
 	return result;
 }
 
@@ -882,6 +883,7 @@ int uds_search_cached_record_page(struct volume *volume, struct uds_request *req
 {
 	struct cached_page *record_page;
 	struct index_geometry *geometry = volume->geometry;
+	unsigned int zone_number = request->zone_number;
 	int result;
 	u32 physical_page, page_number;
 
@@ -905,11 +907,11 @@ int uds_search_cached_record_page(struct volume *volume, struct uds_request *req
 	 * invalidation by the reader thread, before the reader thread has noticed that the
 	 * invalidate_counter has been incremented.
 	 */
-	begin_pending_search(&volume->page_cache, physical_page, request->zone_number);
+	begin_pending_search(&volume->page_cache, physical_page, zone_number);
 
 	result = get_volume_page_protected(volume, request, physical_page, &record_page);
 	if (result != UDS_SUCCESS) {
-		end_pending_search(&volume->page_cache, request->zone_number);
+		end_pending_search(&volume->page_cache, zone_number);
 		return result;
 	}
 
@@ -917,7 +919,7 @@ int uds_search_cached_record_page(struct volume *volume, struct uds_request *req
 			       &request->record_name, geometry, &request->old_metadata))
 		*found = true;
 
-	end_pending_search(&volume->page_cache, request->zone_number);
+	end_pending_search(&volume->page_cache, zone_number);
 	return UDS_SUCCESS;
 }
 
-- 
2.39.5

next prev parent reply	other threads:[~2025-06-06 15:42 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-06 15:41 [PATCH AUTOSEL 6.15 01/21] cifs: Correctly set SMB1 SessionKey field in Session Setup Request Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 02/21] cifs: Fix cifs_query_path_info() for Windows NT servers Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 03/21] cifs: Fix encoding of SMB1 Session Setup NTLMSSP Request in non-UNICODE mode Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 04/21] NFSv4: Always set NLINK even if the server doesn't support it Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 05/21] NFSv4.2: fix listxattr to return selinux security label Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 06/21] NFSv4.2: fix setattr caching of TIME_[MODIFY|ACCESS]_SET when timestamps are delegated Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 07/21] mailbox: Not protect module_put with spin_lock_irqsave Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 08/21] mfd: max77541: Fix wakeup source leaks on device unbind Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 09/21] mfd: max14577: " Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 10/21] mfd: max77705: " Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 11/21] mfd: 88pm886: " Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 12/21] mfd: sprd-sc27xx: " Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 13/21] sunrpc: don't immediately retransmit on seqno miss Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 14/21] hwmon: (isl28022) Fix current reading calculation Sasha Levin
2025-06-06 15:41 ` Sasha Levin [this message]
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 16/21] leds: multicolor: Fix intensity setting while SW blinking Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 17/21] fuse: fix race between concurrent setattrs from multiple nodes Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 18/21] cxl/region: Add a dev_err() on missing target list entries Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 19/21] cxl: core/region - ignore interleave granularity when ways=1 Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 20/21] NFSv4: xattr handlers should check for absent nfs filehandles Sasha Levin
2025-06-06 15:41 ` [PATCH AUTOSEL 6.15 21/21] hwmon: (pmbus/max34440) Fix support for max34451 Sasha Levin

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:655453bb276b dfblob:425b3a74f4db )
 OR (
bs:"[PATCH AUTOSEL 6.15 15/21] dm vdo indexer: don't read request structure after enqueuing" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250606154147.546388-15-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=msakai@redhat.com \
    --cc=patches@lists.linux.dev \
    --cc=raeburn@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox