From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Linggang Zeng <linggang.zeng@easystack.cn>,
Mingzhe Zou <mingzhe.zou@easystack.cn>,
Coly Li <colyli@kernel.org>, Jens Axboe <axboe@kernel.dk>,
Sasha Levin <sashal@kernel.org>,
kent.overstreet@linux.dev, linux-bcache@vger.kernel.org
Subject: [PATCH AUTOSEL 6.6 03/18] bcache: fix NULL pointer in cache_set_flush()
Date: Mon, 9 Jun 2025 09:46:37 -0400 [thread overview]
Message-ID: <20250609134652.1344323-3-sashal@kernel.org> (raw)
In-Reply-To: <20250609134652.1344323-1-sashal@kernel.org>
From: Linggang Zeng <linggang.zeng@easystack.cn>
[ Upstream commit 1e46ed947ec658f89f1a910d880cd05e42d3763e ]
1. LINE#1794 - LINE#1887 is some codes about function of
bch_cache_set_alloc().
2. LINE#2078 - LINE#2142 is some codes about function of
register_cache_set().
3. register_cache_set() will call bch_cache_set_alloc() in LINE#2098.
1794 struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
1795 {
...
1860 if (!(c->devices = kcalloc(c->nr_uuids, sizeof(void *), GFP_KERNEL)) ||
1861 mempool_init_slab_pool(&c->search, 32, bch_search_cache) ||
1862 mempool_init_kmalloc_pool(&c->bio_meta, 2,
1863 sizeof(struct bbio) + sizeof(struct bio_vec) *
1864 bucket_pages(c)) ||
1865 mempool_init_kmalloc_pool(&c->fill_iter, 1, iter_size) ||
1866 bioset_init(&c->bio_split, 4, offsetof(struct bbio, bio),
1867 BIOSET_NEED_BVECS|BIOSET_NEED_RESCUER) ||
1868 !(c->uuids = alloc_bucket_pages(GFP_KERNEL, c)) ||
1869 !(c->moving_gc_wq = alloc_workqueue("bcache_gc",
1870 WQ_MEM_RECLAIM, 0)) ||
1871 bch_journal_alloc(c) ||
1872 bch_btree_cache_alloc(c) ||
1873 bch_open_buckets_alloc(c) ||
1874 bch_bset_sort_state_init(&c->sort, ilog2(c->btree_pages)))
1875 goto err;
^^^^^^^^
1876
...
1883 return c;
1884 err:
1885 bch_cache_set_unregister(c);
^^^^^^^^^^^^^^^^^^^^^^^^^^^
1886 return NULL;
1887 }
...
2078 static const char *register_cache_set(struct cache *ca)
2079 {
...
2098 c = bch_cache_set_alloc(&ca->sb);
2099 if (!c)
2100 return err;
^^^^^^^^^^
...
2128 ca->set = c;
2129 ca->set->cache[ca->sb.nr_this_dev] = ca;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
2138 return NULL;
2139 err:
2140 bch_cache_set_unregister(c);
2141 return err;
2142 }
(1) If LINE#1860 - LINE#1874 is true, then do 'goto err'(LINE#1875) and
call bch_cache_set_unregister()(LINE#1885).
(2) As (1) return NULL(LINE#1886), LINE#2098 - LINE#2100 would return.
(3) As (2) has returned, LINE#2128 - LINE#2129 would do *not* give the
value to c->cache[], it means that c->cache[] is NULL.
LINE#1624 - LINE#1665 is some codes about function of cache_set_flush().
As (1), in LINE#1885 call
bch_cache_set_unregister()
---> bch_cache_set_stop()
---> closure_queue()
-.-> cache_set_flush() (as below LINE#1624)
1624 static void cache_set_flush(struct closure *cl)
1625 {
...
1654 for_each_cache(ca, c, i)
1655 if (ca->alloc_thread)
^^
1656 kthread_stop(ca->alloc_thread);
...
1665 }
(4) In LINE#1655 ca is NULL(see (3)) in cache_set_flush() then the
kernel crash occurred as below:
[ 846.712887] bcache: register_cache() error drbd6: cannot allocate memory
[ 846.713242] bcache: register_bcache() error : failed to register device
[ 846.713336] bcache: cache_set_free() Cache set 2f84bdc1-498a-4f2f-98a7-01946bf54287 unregistered
[ 846.713768] BUG: unable to handle kernel NULL pointer dereference at 00000000000009f8
[ 846.714790] PGD 0 P4D 0
[ 846.715129] Oops: 0000 [#1] SMP PTI
[ 846.715472] CPU: 19 PID: 5057 Comm: kworker/19:16 Kdump: loaded Tainted: G OE --------- - - 4.18.0-147.5.1.el8_1.5es.3.x86_64 #1
[ 846.716082] Hardware name: ESPAN GI-25212/X11DPL-i, BIOS 2.1 06/15/2018
[ 846.716451] Workqueue: events cache_set_flush [bcache]
[ 846.716808] RIP: 0010:cache_set_flush+0xc9/0x1b0 [bcache]
[ 846.717155] Code: 00 4c 89 a5 b0 03 00 00 48 8b 85 68 f6 ff ff a8 08 0f 84 88 00 00 00 31 db 66 83 bd 3c f7 ff ff 00 48 8b 85 48 ff ff ff 74 28 <48> 8b b8 f8 09 00 00 48 85 ff 74 05 e8 b6 58 a2 e1 0f b7 95 3c f7
[ 846.718026] RSP: 0018:ffffb56dcf85fe70 EFLAGS: 00010202
[ 846.718372] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 846.718725] RDX: 0000000000000001 RSI: 0000000040000001 RDI: 0000000000000000
[ 846.719076] RBP: ffffa0ccc0f20df8 R08: ffffa0ce1fedb118 R09: 000073746e657665
[ 846.719428] R10: 8080808080808080 R11: 0000000000000000 R12: ffffa0ce1fee8700
[ 846.719779] R13: ffffa0ccc0f211a8 R14: ffffa0cd1b902840 R15: ffffa0ccc0f20e00
[ 846.720132] FS: 0000000000000000(0000) GS:ffffa0ce1fec0000(0000) knlGS:0000000000000000
[ 846.720726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 846.721073] CR2: 00000000000009f8 CR3: 00000008ba00a005 CR4: 00000000007606e0
[ 846.721426] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 846.721778] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 846.722131] PKRU: 55555554
[ 846.722467] Call Trace:
[ 846.722814] process_one_work+0x1a7/0x3b0
[ 846.723157] worker_thread+0x30/0x390
[ 846.723501] ? create_worker+0x1a0/0x1a0
[ 846.723844] kthread+0x112/0x130
[ 846.724184] ? kthread_flush_work_fn+0x10/0x10
[ 846.724535] ret_from_fork+0x35/0x40
Now, check whether that ca is NULL in LINE#1655 to fix the issue.
Signed-off-by: Linggang Zeng <linggang.zeng@easystack.cn>
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Signed-off-by: Coly Li <colyli@kernel.org>
Link: https://lore.kernel.org/r/20250527051601.74407-2-colyli@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
**YES**
This commit should be backported to stable kernel trees. Here's my
extensive analysis:
## Bug Severity and Impact
This commit fixes a **NULL pointer dereference** in the bcache subsystem
that causes a kernel crash. The crash trace shows:
```
BUG: unable to handle kernel NULL pointer dereference at
00000000000009f8
```
The crash occurs during cache set registration failure in an out-of-
memory (OOM) condition, which is a real-world scenario that can happen
in production systems under memory pressure.
## Root Cause Analysis
The commit message provides excellent debugging information showing the
precise execution flow:
1. In `bch_cache_set_alloc()`, if memory allocation fails (lines
1860-1874), it goes to the error path which calls
`bch_cache_set_unregister()` and returns NULL
2. In `register_cache_set()`, when `bch_cache_set_alloc()` returns NULL,
the function returns early (line 2100)
3. Because of the early return, `ca->set` is never assigned (lines
2128-2129), leaving `c->cache[]` as NULL
4. When `cache_set_flush()` is called via the closure mechanism, it
tries to access `ca->alloc_thread` where `ca` is NULL, causing the
crash
## Code Change Analysis
The fix is minimal and surgical:
```diff
- if (ca->alloc_thread)
+ /*
+ * If the register_cache_set() call to bch_cache_set_alloc()
failed,
+ * ca has not been assigned a value and return error.
+ * So we need check ca is not NULL during
bch_cache_set_unregister().
+ */
+ if (ca && ca->alloc_thread)
kthread_stop(ca->alloc_thread);
```
This adds a NULL check before dereferencing `ca`, which is a defensive
programming practice that prevents the crash.
## Comparison with Similar Commits
Looking at the similar commits with "Backport Status: YES":
1. **Similar commits #1, #2, #3**: All fix NULL pointer dereferences in
bcache during OOM conditions
2. **Similar commits #4, #5**: Fix potential NULL pointer issues in
btree node allocation
3. All these commits share characteristics:
- Fix real crashes with clear reproducers
- Minimal code changes (often just adding NULL checks)
- No architectural changes
- No new features
- Clear bug fixes in error paths
## Stable Tree Criteria Assessment
This commit meets all the stable tree criteria:
1. **Fixes a real bug**: Yes - NULL pointer dereference causing kernel
crash
2. **Already in Linus' tree**: Yes - has been merged by Jens Axboe
3. **Small and contained**: Yes - single line change adding NULL check
4. **No new features**: Correct - purely a bug fix
5. **Minimal risk**: Very low risk - adds defensive NULL check
6. **Clear issue**: The crash trace and detailed analysis make the issue
crystal clear
## Additional Considerations
- The bug occurs in an error path during OOM conditions, which might be
considered rare but is definitely possible in production
- The fix prevents a kernel crash, which is always high priority for
stable backports
- The bcache subsystem is used in production systems for caching, making
stability critical
- The fix pattern (adding NULL checks in error paths) is well-
established and safe
## Conclusion
This commit is an excellent candidate for stable backporting. It fixes a
real kernel crash with a minimal, safe change that adds proper NULL
checking in an error path. The detailed commit message with line-by-line
analysis demonstrates thorough understanding of the issue, and the fix
follows established patterns seen in other successful bcache stable
backports.
drivers/md/bcache/super.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 8440b56e385d5..b9dfebaa9eae5 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1739,7 +1739,12 @@ static void cache_set_flush(struct closure *cl)
mutex_unlock(&b->write_lock);
}
- if (ca->alloc_thread)
+ /*
+ * If the register_cache_set() call to bch_cache_set_alloc() failed,
+ * ca has not been assigned a value and return error.
+ * So we need check ca is not NULL during bch_cache_set_unregister().
+ */
+ if (ca && ca->alloc_thread)
kthread_stop(ca->alloc_thread);
if (c->journal.cur) {
--
2.39.5
next prev parent reply other threads:[~2025-06-09 13:46 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-09 13:46 [PATCH AUTOSEL 6.6 01/18] md/md-bitmap: fix dm-raid max_write_behind setting Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 02/18] amd/amdkfd: fix a kfd_process ref leak Sasha Levin
2025-06-09 13:46 ` Sasha Levin [this message]
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 04/18] drm/scheduler: signal scheduled fence when kill job Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 05/18] iio: pressure: zpa2326: Use aligned_s64 for the timestamp Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 06/18] um: Add cmpxchg8b_emu and checksum functions to asm-prototypes.h Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 07/18] um: use proper care when taking mmap lock during segfault Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 08/18] coresight: Only check bottom two claim bits Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 09/18] usb: dwc2: also exit clock_gating when stopping udc while suspended Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 10/18] iio: adc: ad_sigma_delta: Fix use of uninitialized status_pos Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 11/18] misc: tps6594-pfsm: Add NULL pointer check in tps6594_pfsm_probe() Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 12/18] usb: potential integer overflow in usbg_make_tpg() Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 13/18] tty: serial: uartlite: register uart driver in init Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 14/18] usb: common: usb-conn-gpio: use a unique name for usb connector device Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 15/18] usb: Add checks for snprintf() calls in usb_alloc_dev() Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 16/18] usb: cdc-wdm: avoid setting WDM_READ for ZLP-s Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 17/18] usb: typec: displayport: Receive DP Status Update NAK request exit dp altmode Sasha Levin
2025-06-09 13:46 ` [PATCH AUTOSEL 6.6 18/18] usb: typec: mux: do not return on EOPNOTSUPP in {mux, switch}_set Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250609134652.1344323-3-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=axboe@kernel.dk \
--cc=colyli@kernel.org \
--cc=kent.overstreet@linux.dev \
--cc=linggang.zeng@easystack.cn \
--cc=linux-bcache@vger.kernel.org \
--cc=mingzhe.zou@easystack.cn \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.