From: David Howells <dhowells@redhat.com>
To: Christian Brauner <christian@brauner.io>,
Steve French <smfrench@gmail.com>,
Matthew Wilcox <willy@infradead.org>
Cc: David Howells <dhowells@redhat.com>,
Jeff Layton <jlayton@kernel.org>,
Gao Xiang <hsiangkao@linux.alibaba.com>,
Dominique Martinet <asmadeus@codewreck.org>,
Marc Dionne <marc.dionne@auristor.com>,
Paulo Alcantara <pc@manguebit.com>,
Shyam Prasad N <sprasad@microsoft.com>,
Tom Talpey <tom@talpey.com>,
Eric Van Hensbergen <ericvh@kernel.org>,
Ilya Dryomov <idryomov@gmail.com>,
netfs@lists.linux.dev, linux-afs@lists.infradead.org,
linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org,
ceph-devel@vger.kernel.org, v9fs@lists.linux.dev,
linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
Max Kellermann <max.kellermann@ionos.com>,
stable@vger.kernel.org
Subject: [PATCH 01/24] fs/netfs/fscache_cookie: add missing "n_accesses" check
Date: Mon, 29 Jul 2024 17:19:30 +0100
Message-ID: <20240729162002.3436763-2-dhowells@redhat.com>
In-Reply-To: <20240729162002.3436763-1-dhowells@redhat.com>
From: Max Kellermann <max.kellermann@ionos.com>
This fixes a NULL pointer dereference bug due to a data race which
looks like this:
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 33 PID: 16573 Comm: kworker/u97:799 Not tainted 6.8.7-cm4all1-hp+ #43
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/17/2018
Workqueue: events_unbound netfs_rreq_write_to_cache_work
RIP: 0010:cachefiles_prepare_write+0x30/0xa0
Code: 57 41 56 45 89 ce 41 55 49 89 cd 41 54 49 89 d4 55 53 48 89 fb 48 83 ec 08 48 8b 47 08 48 83 7f 10 00 48 89 34 24 48 8b 68 20 <48> 8b 45 08 4c 8b 38 74 45 49 8b 7f 50 e8 4e a9 b0 ff 48 8b 73 10
RSP: 0018:ffffb4e78113bde0 EFLAGS: 00010286
RAX: ffff976126be6d10 RBX: ffff97615cdb8438 RCX: 0000000000020000
RDX: ffff97605e6c4c68 RSI: ffff97605e6c4c60 RDI: ffff97615cdb8438
RBP: 0000000000000000 R08: 0000000000278333 R09: 0000000000000001
R10: ffff97605e6c4600 R11: 0000000000000001 R12: ffff97605e6c4c68
R13: 0000000000020000 R14: 0000000000000001 R15: ffff976064fe2c00
FS: 0000000000000000(0000) GS:ffff9776dfd40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000005942c002 CR4: 00000000001706f0
Call Trace:
<TASK>
? __die+0x1f/0x70
? page_fault_oops+0x15d/0x440
? search_module_extables+0xe/0x40
? fixup_exception+0x22/0x2f0
? exc_page_fault+0x5f/0x100
? asm_exc_page_fault+0x22/0x30
? cachefiles_prepare_write+0x30/0xa0
netfs_rreq_write_to_cache_work+0x135/0x2e0
process_one_work+0x137/0x2c0
worker_thread+0x2e9/0x400
? __pfx_worker_thread+0x10/0x10
kthread+0xcc/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x30/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
Modules linked in:
CR2: 0000000000000008
---[ end trace 0000000000000000 ]---
This happened because fscache_cookie_state_machine() was slow and was
still running when another process invoked fscache_unuse_cookie().
That led to a fscache_cookie_lru_do_one() call, which set the
FSCACHE_COOKIE_DO_LRU_DISCARD flag. The still-running
fscache_cookie_state_machine() picked up the flag and withdrew the
cookie via cachefiles_withdraw_cookie(), which cleared
cookie->cache_priv.
At the same time, yet another process invoked
cachefiles_prepare_write(), which found a NULL pointer in this code
line:
struct cachefiles_object *object = cachefiles_cres_object(cres);
The next line then dereferences that NULL pointer and crashes:
struct cachefiles_cache *cache = object->volume->cache;
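The faulting address in the oops (CR2 is 0x8 while RBP is zero) is
consistent with a NULL "object" being read at a small member offset.
A minimal sketch of that arithmetic follows. It assumes a hypothetical
two-pointer stand-in layout (struct model_object and its member order
are illustrative, not the real struct cachefiles_object definition):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in: only the position of "volume" matters here.
 * The member order is an assumption made for illustration.
 */
struct model_object {
	void *cookie;	/* assumed first member, offset 0 */
	void *volume;	/* assumed second member, one pointer in */
};

/* With this layout, "volume" sits exactly one pointer into the struct. */
static_assert(offsetof(struct model_object, volume) == sizeof(void *),
	      "volume is expected to follow a single pointer");

/* Address that a load of object->volume touches for a given pointer. */
static uintptr_t model_volume_load_address(const struct model_object *object)
{
	return (uintptr_t)object + offsetof(struct model_object, volume);
}
```

Under these assumptions, passing NULL to model_volume_load_address()
yields sizeof(void *), which is 8 on a 64-bit build.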
During cachefiles_prepare_write(), the "n_accesses" counter is
non-zero (via fscache_begin_operation()). The cookie must not be
withdrawn until it drops to zero.
The counter is checked by fscache_cookie_state_machine() before
switching to FSCACHE_COOKIE_STATE_RELINQUISHING and
FSCACHE_COOKIE_STATE_WITHDRAWING (in "case
FSCACHE_COOKIE_STATE_FAILED"), but not for
FSCACHE_COOKIE_STATE_LRU_DISCARDING ("case
FSCACHE_COOKIE_STATE_ACTIVE").
This patch adds the missing check. With a non-zero access counter,
the function returns and the next fscache_end_cookie_access() call
will queue another fscache_cookie_state_machine() call to handle the
still-pending FSCACHE_COOKIE_DO_LRU_DISCARD.
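The postpone-and-retry behaviour described above can be sketched in
user space. This is a single-threaded model under stated assumptions,
not the kernel implementation: every model_* name is hypothetical,
locking is omitted, and the state machine is called directly where the
kernel would queue work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum model_state { MODEL_ACTIVE, MODEL_LRU_DISCARDING };

struct model_cookie {
	atomic_int n_accesses;		/* in-flight operations pinning the cookie */
	atomic_bool do_lru_discard;	/* models FSCACHE_COOKIE_DO_LRU_DISCARD */
	enum model_state state;
	void *cache_priv;		/* cleared when the cookie is withdrawn */
};

static void model_cookie_init(struct model_cookie *c, void *priv)
{
	atomic_init(&c->n_accesses, 0);
	atomic_init(&c->do_lru_discard, false);
	c->state = MODEL_ACTIVE;
	c->cache_priv = priv;
}

/* One pass of the state machine, ACTIVE state only. */
static void model_state_machine(struct model_cookie *c)
{
	if (c->state != MODEL_ACTIVE)
		return;
	if (atomic_load(&c->do_lru_discard)) {
		if (atomic_load(&c->n_accesses) != 0)
			return;	/* still being accessed: postpone it */
		c->state = MODEL_LRU_DISCARDING;
		c->cache_priv = NULL;	/* models the withdrawal */
	}
}

static void model_begin_access(struct model_cookie *c)
{
	atomic_fetch_add(&c->n_accesses, 1);
}

/* Dropping the last access re-runs the state machine (queued in the kernel). */
static void model_end_access(struct model_cookie *c)
{
	int prev = atomic_fetch_sub(&c->n_accesses, 1);

	assert(prev > 0);	/* catch an unbalanced end-access call */
	if (prev == 1)
		model_state_machine(c);
}
```

In this model, if the discard flag is set while an access is held, a
call to model_state_machine() leaves cache_priv intact; the withdrawal
only happens once model_end_access() drops the counter to zero.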
Fixes: 12bb21a29c19 ("fscache: Implement cookie user counting and resource pinning")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: stable@vger.kernel.org
---
fs/netfs/fscache_cookie.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/netfs/fscache_cookie.c b/fs/netfs/fscache_cookie.c
index bce2492186d0..d4d4b3a8b106 100644
--- a/fs/netfs/fscache_cookie.c
+++ b/fs/netfs/fscache_cookie.c
@@ -741,6 +741,10 @@ static void fscache_cookie_state_machine(struct fscache_cookie *cookie)
 			spin_lock(&cookie->lock);
 		}
 		if (test_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags)) {
+			if (atomic_read(&cookie->n_accesses) != 0)
+				/* still being accessed: postpone it */
+				break;
+
 			__fscache_set_cookie_state(cookie,
 						   FSCACHE_COOKIE_STATE_LRU_DISCARDING);
 			wake = true;