Linux CIFS filesystem development
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Shyam Prasad N <nspmangalore@gmail.com>
Cc: dhowells@redhat.com, Steve French <smfrench@gmail.com>,
	CIFS <linux-cifs@vger.kernel.org>
Subject: Re: null-ptr deref found in netfs code
Date: Tue, 03 Dec 2024 11:14:46 +0000	[thread overview]
Message-ID: <526707.1733224486@warthog.procyon.org.uk> (raw)
In-Reply-To: <505338.1733181074@warthog.procyon.org.uk>

Okay, I think I see the problem.

Looking at the following extraction from the trace:

> netfs_rreq_ref: R=0000290e NEW         r=1
> netfs_read: R=0000290e READAHEAD c=00000000 ni=0 s=85e00000 l=800000 sz=280000000
> netfs_folio: i=f1c2900000000000 ix=85e00-85fff read
> netfs_folio: i=f1c2900000000000 ix=86000-861ff read
> netfs_folio: i=f1c2900000000000 ix=86200-863ff read
> netfs_folio: i=f1c2900000000000 ix=86400-865ff read

We're requesting reads of four folios, each consisting of 512 pages for a
total of 8MiB.

> netfs_sreq: R=0000290e[1] DOWN SUBMT f=02 s=85e00000 0/100000 e=0
> netfs_sreq: R=0000290e[2] DOWN SUBMT f=02 s=85f00000 0/100000 e=0
> netfs_sreq: R=0000290e[3] DOWN SUBMT f=02 s=86000000 0/100000 e=0
> netfs_sreq: R=0000290e[4] DOWN SUBMT f=02 s=86100000 0/100000 e=0
> netfs_sreq: R=0000290e[5] DOWN SUBMT f=02 s=86200000 0/100000 e=0
> netfs_sreq: R=0000290e[6] DOWN SUBMT f=02 s=86300000 0/100000 e=0
> netfs_sreq: R=0000290e[7] DOWN SUBMT f=02 s=86400000 0/100000 e=0
> netfs_sreq: R=0000290e[8] DOWN SUBMT f=02 s=86500000 0/100000 e=0

That got broken down into 8 submissions, each for a 1MiB slice.

> netfs_sreq: R=0000290e[1] DOWN IO    f=02 s=85e00000 100000/100000 e=0
> netfs_progress: R=0000290e[01] s=85e00000 ct=0/100000 pa=100000/100000 sl=0
> netfs_donate: R=0000290e[01] -> [02] to-next am=100000

Subrequest 1 completed, but wasn't large enough to cover a whole folio, so it
donated its coverage forwards to subreq 2.

> netfs_sreq: R=0000290e[6] DOWN IO    f=02 s=86300000 100000/100000 e=0
> netfs_progress: R=0000290e[06] s=86300000 ct=0/100000 pa=100000/100000 sl=2
> netfs_donate: R=0000290e[06] -> [05] tail-to-prev am=100000

Subrequest 6 completed, but wasn't large enough to cover a whole folio, so it
donated its coverage backwards to subreq 5.

> netfs_sreq: R=0000290e[2] DOWN IO    f=02 s=85f00000 100000/100000 e=0
> netfs_progress: R=0000290e[02] s=85e00000 ct=0/200000 pa=200000/200000 sl=0
> netfs_folio: i=f1c2900000000000 ix=85e00-85fff read-done
> netfs_folio: i=f1c2900000000000 ix=85e00-85fff read-unlock

Subreq 2 completed, and with the donation from subreq 1, had sufficient to
unlock the first folio.

> netfs_sreq: R=0000290e[5] DOWN IO    f=02 s=86200000 100000/100000 e=0
> netfs_progress: R=0000290e[05] s=86200000 ct=0/200000 pa=200000/200000 sl=2
> netfs_folio: i=f1c2900000000000 ix=86200-863ff read-done
> netfs_folio: i=f1c2900000000000 ix=86200-863ff read-unlock

Subreq 5 completed, and with the donation from subreq 6, had sufficient to
unlock the third folio.

> netfs_sreq: R=0000290e[3] DOWN IO    f=02 s=86000000 100000/100000 e=0
> netfs_progress: R=0000290e[03] s=86000000 ct=0/100000 pa=100000/100000 sl=1
> netfs_donate: R=0000290e[03] -> [04] to-next am=100000

Subrequest 3 completed, but wasn't large enough to cover a whole folio, so it
donated its coverage forwards to subreq 4.  So far, so good.

> netfs_sreq: R=0000290e[7] DOWN IO    f=02 s=86400000 100000/100000 e=0
> netfs_progress: R=0000290e[07] s=86400000 ct=0/100000 pa=100000/100000 sl=3
> netfs_donate: R=0000290e[07] -> [04] to-prev am=0

Subreq 7 completed, but wasn't large enough to cover a whole folio, so it
donated its coverage backwards to subreq 4.  This is a bug as subreq 7 is not
contiguous with subreq 4.  It should instead have donated forwards to subreq
8.

> netfs_sreq: R=0000290e[4] DOWN IO    f=02 s=86100000 100000/100000 e=0
> netfs_sreq: R=0000290e[4] DOWN +DON  f=02 s=86000000 300000/300000 e=0
> netfs_progress: R=0000290e[04] s=86000000 ct=0/300000 pa=200000/300000 sl=1
> netfs_folio: i=f1c2900000000000 ix=86000-861ff read-done
> netfs_folio: i=f1c2900000000000 ix=86000-861ff read-unlock

Subreq 4 completed, and with the donation from subreq 2, had sufficient to
unlock the second folio.  However, it also has some excess from subreq 7 that
it can't do anything with - and this gets lost.

> netfs_sreq: R=0000290e[8] DOWN IO    f=02 s=86500000 100000/100000 e=0
> netfs_progress: R=0000290e[08] s=86500000 ct=0/100000 pa=100000/100000 sl=3
> netfs_donate: R=0000290e[08] -> [04] tail-to-prev am=100000

Here's a repeat of the bug: subreq 8 donates to subreq 4, but, again, is not
contiguous.  As these are happening concurrently, the other thread hasn't
delisted subreq 4 yet.

> netfs_sreq: R=0000290e[4] DOWN +DON  f=02 s=86000000 400000/400000 e=0
> netfs_progress: R=0000290e[04] s=86200000 ct=200000/400000 pa=200000/200000 sl=2

The request screwed at this point: subreq 4 shows the extra stuff it has been
donated, but it is unable to do anything with it.  There is no folio to
wrangle as the third slot (sl=2) in the queue has already been cleared.

(Note that this bug shouldn't happen with the patches currently on my
netfs-writeback branch as I got rid of the donation mechanism in preference
for something simpler with single-threaded collection.)

David


  parent reply	other threads:[~2024-12-03 11:14 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-02  2:16 null-ptr deref found in netfs code Shyam Prasad N
2024-12-02  8:03 ` Shyam Prasad N
2024-12-02 17:10   ` David Howells
2024-12-02 23:11   ` David Howells
2024-12-03  5:33     ` Shyam Prasad N
2024-12-03 11:14     ` David Howells [this message]
2024-12-04 16:30       ` Shyam Prasad N
2024-12-04 16:31         ` Shyam Prasad N
2024-12-04 16:52           ` David Howells
2024-12-04 20:46           ` David Howells
2024-12-04 20:50           ` ronnie sahlberg
2024-12-05  4:47             ` Shyam Prasad N
2024-12-05 10:49               ` Shyam Prasad N
2025-01-23 15:47                 ` Shyam Prasad N
2025-01-23 16:31                   ` David Howells
2025-01-23 16:34                   ` David Howells
2025-01-25 17:12                     ` Shyam Prasad N
2025-01-25 19:13                       ` David Howells
2024-12-03 16:14 ` [PATCH] netfs: Fix non-contiguous donation between completed reads David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=526707.1733224486@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=linux-cifs@vger.kernel.org \
    --cc=nspmangalore@gmail.com \
    --cc=smfrench@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox