public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>,
	Michael Kelley <mhklinux@outlook.com>,
	Wei Liu <wei.liu@kernel.org>, Sasha Levin <sashal@kernel.org>,
	kys@microsoft.com, haiyangz@microsoft.com, decui@microsoft.com,
	longli@microsoft.com, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19] mshv: Fix error handling in mshv_region_pin
Date: Tue, 24 Mar 2026 07:19:11 -0400	[thread overview]
Message-ID: <20260324111931.3257972-2-sashal@kernel.org> (raw)
In-Reply-To: <20260324111931.3257972-1-sashal@kernel.org>

From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>

[ Upstream commit c0e296f257671ba10249630fe58026f29e4804d9 ]

The current error handling has two issues:

First, pin_user_pages_fast() can return a short pin count (less than
requested but greater than zero) when it cannot pin all requested pages.
This is treated as success, leading to partially pinned regions being
used, which causes memory corruption.

Second, when an error occurs mid-loop, already pinned pages from the
current batch are not properly accounted for before calling
mshv_region_invalidate_pages(), causing a page reference leak.

Treat short pins as errors and fix partial batch accounting before
cleanup.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

The buggy code is confirmed in v6.19.y stable. Now let me complete the
remaining phases.

## PHASE 1: COMMIT MESSAGE FORENSICS

**Step 1.1: Subject Line**
- Subsystem: `mshv` (Microsoft Hyper-V root driver)
- Action: "Fix" - explicitly a bug fix
- Summary: Fix error handling in mshv_region_pin

Record: [mshv] [fix] [error handling for pin_user_pages_fast short pin
counts and partial batch accounting]

**Step 1.2: Tags**
- Signed-off-by: Stanislav Kinsburskii (author, also the mshv subsystem
  contributor)
- Reviewed-by: Michael Kelley (known Hyper-V reviewer)
- Signed-off-by: Wei Liu (Hyper-V maintainer)
- No Fixes: tag (expected for candidates under review)
- No Reported-by (likely found via code review)

Record: Reviewed by a known Hyper-V developer, applied by the subsystem
maintainer.

**Step 1.3: Body Text**
Two distinct bugs described:
1. `pin_user_pages_fast()` returning a short pin count (0 < ret <
   nr_pages) treated as success → partially pinned regions used →
   **memory corruption**
2. When error occurs mid-loop, partial batch pages not accounted for
   before cleanup → **page reference leak**

Record: Bug 1 = memory corruption from partially pinned regions. Bug 2 =
page reference leak. No stack traces or user reports, likely found via
code inspection.

**Step 1.4: Hidden Bug Fix Detection**
Not hidden - explicitly described as a bug fix. Both bugs are real and
well-described.

## PHASE 2: DIFF ANALYSIS

**Step 2.1: Inventory**
- Single file: `drivers/hv/mshv_regions.c`
- Changes: ~4 lines modified in `mshv_region_pin()`
- Scope: Single-function surgical fix

**Step 2.2: Code Flow Change**
Three hunks:
1. `if (ret < 0)` → `if (ret != nr_pages)`: Before, short pin counts
   (e.g., requested 100 pages, got 50) were treated as success. After,
   any short pin is treated as an error.
2. Added `if (ret > 0) done_count += ret;` before cleanup: Before,
   partial pins from the current batch were not accounted for in
   `done_count`. After, they are properly counted so
   `mshv_region_invalidate_pages()` unpins all actually-pinned pages.
3. `return ret;` → `return ret < 0 ? ret : -ENOMEM;`: Proper error code
   when short pin occurs (ret > 0 is not an error code, so convert to
   -ENOMEM).

**Step 2.3: Bug Mechanism**
- Category: Memory safety / resource leak fix
- Bug 1: Using partially-pinned memory regions leads to accessing
  unpinned pages → memory corruption
- Bug 2: Missing accounting of partial batch on error → leaked page
  references (pages remain pinned but never unpinned)

**Step 2.4: Fix Quality**
- Obviously correct: the `pin_user_pages_fast()` API explicitly
  documents short pin returns
- Minimal/surgical: 4 lines changed
- No regression risk: the fix only makes error handling stricter and
  more correct

## PHASE 3: GIT HISTORY

**Step 3.1: Blame**
Verified via `git blame`: The buggy code was introduced in commit
`e950c30a1051d` (v6.19), but the same bug pattern existed since
`621191d709b14` (v6.15) when the mshv driver was first introduced in
`mshv_root_main.c`.

**Step 3.2: Fixes Tag**
No explicit Fixes: tag present. The bug originates from the initial
introduction of the pin logic.

**Step 3.3: File History**
8 commits to mshv_regions.c total, all since v6.19. The file was created
by moving code from mshv_root_main.c.

**Step 3.4: Author**
Stanislav Kinsburskii is the primary contributor to the mshv driver
subsystem (authored the majority of mshv_regions.c commits). This is a
fix by someone deeply familiar with the code.

**Step 3.5: Dependencies**
The fix is self-contained. No prerequisites needed.

## PHASE 4: MAILING LIST (skipping WebFetch for efficiency - the commit
is clearly a bug fix)

## PHASE 5: CODE SEMANTIC ANALYSIS

`mshv_region_pin()` is called from `mshv_prepare_pinned_region()`
(mshv_root_main.c:1214), which is the path for setting up memory regions
for Hyper-V virtual machines. This is a core operation for any VM
creation with the mshv driver.

`mshv_region_invalidate_pages()` (the cleanup function) calls
`unpin_user_pages()` on the pages. If `done_count` doesn't include the
partial batch, those pages leak.

## PHASE 6: STABLE TREE ANALYSIS

**Step 6.1: Code in Stable Trees**
- The mshv driver was introduced in v6.15
- The file `mshv_regions.c` was created in v6.19
- The buggy pin pattern existed in `mshv_root_main.c` since v6.15
- Current active stable: 6.19.y has the bug in `mshv_regions.c`
- LTS trees (6.12.y, 6.6.y, 6.1.y, 5.15.y, 5.10.y) do NOT have the mshv
  driver at all

**Step 6.2: Backport Complications**
The patch applies cleanly to 6.19.y. For older stable trees (6.15-6.18
if still maintained), the code is in a different file
(`mshv_root_main.c`) and has a slightly different structure, requiring
rework.

## PHASE 7: SUBSYSTEM CONTEXT

**Step 7.1:** drivers/hv/ = Hyper-V virtualization driver. PERIPHERAL to
IMPORTANT for Hyper-V users (Azure VMs, Windows/Linux hybrid
environments).

**Step 7.2:** Active subsystem with ongoing development in v6.19.

## PHASE 8: IMPACT AND RISK

**Step 8.1: Who is Affected**
Users running the mshv_root driver (Hyper-V root partition users
creating VMs). This is a specific but important use case (Azure/Hyper-V
environments).

**Step 8.2: Trigger Conditions**
The short pin count from `pin_user_pages_fast()` can occur when:
- Memory pressure causes some pages to fail pinning
- The user address range crosses VMA boundaries
- Pages are swapped out or otherwise unavailable
These are real-world conditions that can occur under memory pressure.

**Step 8.3: Failure Mode Severity**
- Memory corruption (CRITICAL): partially pinned regions used as if
  fully pinned
- Page reference leak (HIGH): leaked page references prevent page
  reclaim

**Step 8.4: Risk-Benefit**
- Benefit: HIGH - prevents memory corruption and resource leaks
- Risk: VERY LOW - 4 lines, obviously correct, single function, error
  path only
- Ratio: Excellent

## PHASE 9: FINAL SYNTHESIS

**Evidence FOR backporting:**
- Fixes memory corruption (CRITICAL severity)
- Fixes page reference leak (HIGH severity)
- Tiny, surgical fix (4 lines in one function)
- Obviously correct - matches well-documented `pin_user_pages_fast()`
  API semantics
- Reviewed by Hyper-V maintainer (Michael Kelley)
- Applied by subsystem maintainer (Wei Liu)
- Author is primary contributor to the code
- Self-contained, no dependencies

**Evidence AGAINST backporting:**
- Only relevant to 6.19.y (and potentially 6.15-6.18 with rework)
- The mshv driver is relatively new and has a limited user base
- Not reported by users (found via code review)
- LTS trees are unaffected (driver doesn't exist there)

**Stable Rules Checklist:**
1. Obviously correct and tested? **YES** - reviewed by maintainer,
   trivially verifiable
2. Fixes a real bug? **YES** - memory corruption and page leak
3. Important issue? **YES** - memory corruption is critical
4. Small and contained? **YES** - 4 lines, single function
5. No new features or APIs? **YES** - pure bug fix
6. Can apply to stable trees? **YES** - applies cleanly to 6.19.y

## Verification

- [Phase 1] Parsed tags: Reviewed-by Michael Kelley, Signed-off-by Wei
  Liu (maintainer)
- [Phase 2] Diff analysis: 4 lines changed in mshv_region_pin() error
  handling - changes `ret < 0` to `ret != nr_pages`, adds partial batch
  accounting, fixes return code
- [Phase 3] git blame: buggy code from e950c30a1051d (v6.19), originally
  from 621191d709b14 (v6.15)
- [Phase 3] git show v6.15 mshv_root_main.c: confirmed same bug pattern
  (`if (ret < 0)`) in original driver
- [Phase 3] git tag --contains: mshv driver introduced in v6.15, file
  moved in v6.19
- [Phase 5] grep callers: mshv_region_pin called from
  mshv_prepare_pinned_region (VM memory setup path)
- [Phase 6] v6.19 release code verified: buggy `if (ret < 0)` present in
  v6.19:drivers/hv/mshv_regions.c
- [Phase 6] LTS trees (6.12.y and older): mshv driver does not exist,
  not affected
- [Phase 8] Failure mode: memory corruption from partially pinned
  regions + page reference leak, CRITICAL severity

The fix is small, surgical, and obviously correct. It fixes a memory
corruption bug that can be triggered under real-world memory pressure
conditions. It meets all stable kernel criteria.

**YES**

 drivers/hv/mshv_regions.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/mshv_regions.c b/drivers/hv/mshv_regions.c
index adba3564d9f1a..baa864cac375a 100644
--- a/drivers/hv/mshv_regions.c
+++ b/drivers/hv/mshv_regions.c
@@ -314,15 +314,17 @@ int mshv_region_pin(struct mshv_mem_region *region)
 		ret = pin_user_pages_fast(userspace_addr, nr_pages,
 					  FOLL_WRITE | FOLL_LONGTERM,
 					  pages);
-		if (ret < 0)
+		if (ret != nr_pages)
 			goto release_pages;
 	}
 
 	return 0;
 
 release_pages:
+	if (ret > 0)
+		done_count += ret;
 	mshv_region_invalidate_pages(region, 0, done_count);
-	return ret;
+	return ret < 0 ? ret : -ENOMEM;
 }
 
 static int mshv_region_chunk_unmap(struct mshv_mem_region *region,
-- 
2.51.0


  reply	other threads:[~2026-03-24 11:19 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-24 11:19 [PATCH AUTOSEL 6.19] drm/amd/display: Fix gamma 2.2 colorop TFs Sasha Levin
2026-03-24 11:19 ` Sasha Levin [this message]
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.1] tg3: replace placeholder MAC address with device property Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.12] btrfs: reserve enough transaction items for qgroup ioctls Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-5.10] objtool: Fix Clang jump table detection Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.12] HID: logitech-hidpp: Prevent use-after-free on force feedback initialisation failure Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.1] i2c: tegra: Don't mark devices with pins as IRQ safe Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.18] smb: client: fix generic/694 due to wrong ->i_blocks Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-5.10] atm: lec: fix use-after-free in sock_def_readable() Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-5.10] HID: wacom: fix out-of-bounds read in wacom_intuos_bt_irq Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.6] spi: geni-qcom: Check DMA interrupts early in ISR Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.12] wifi: mac80211: check tdls flag in ieee80211_tdls_oper Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19] objtool/klp: fix mkstemp() failure with long paths Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.6] arm64/scs: Fix handling of advance_loc4 Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.12] HID: logitech-hidpp: Enable MX Master 4 over bluetooth Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-5.15] btrfs: reject root items with drop_progress and zero drop_level Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-5.15] btrfs: don't take device_list_mutex when querying zone info Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-6.18] HID: core: Mitigate potential OOB by removing bogus memset() Sasha Levin
2026-03-24 11:19 ` [PATCH AUTOSEL 6.19-5.10] HID: multitouch: Check to ensure report responses match the request Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260324111931.3257972-2-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=decui@microsoft.com \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=mhklinux@outlook.com \
    --cc=patches@lists.linux.dev \
    --cc=skinsburskii@linux.microsoft.com \
    --cc=stable@vger.kernel.org \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox