From: Li Lingfeng <lilingfeng3@huawei.com>
To: Dai Ngo <Dai.Ngo@oracle.com>,
	Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Tom Talpey <tom@talpey.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	yangerkun <yangerkun@huawei.com>,
	"zhangyi (F)" <yi.zhang@huawei.com>, Hou Tao <houtao1@huawei.com>,
	"chengzhihao1@huawei.com" <chengzhihao1@huawei.com>,
	"yukuai (C)" <yukuai3@huawei.com>, <leo.lilong@huawei.com>,
	<wutengda2@huawei.com>
Subject: [Question] nfsd: possible reordering between nf->nf_file assignment and NFSD_FILE_PENDING clearing?
Date: Thu, 18 Sep 2025 21:57:59 +0800
Message-ID: <34bd5595-8f3f-4c52-a1d5-d782fc99efb9@huawei.com>

Recently we hit a NULL pointer dereference on a relatively old 5.10 kernel
that does not include commit c4c649ab413ba ("NFSD: Convert filecache to
rhltable"). The crash looks the same as the one described in [1]. I wonder
whether it could be caused by reordering between the assignment of
nf->nf_file and the clearing of NFSD_FILE_PENDING.

As an aside, I don't believe the analysis in [1] is entirely accurate,
since hlist_add_head_rcu() includes a write barrier.

We haven't encountered this issue on newer kernel versions, but the code
that assigns nf->nf_file and clears NFSD_FILE_PENDING appears to be the
same across versions.

The expected ordering is like this:
                 T1                                    T2
nfsd_read
  nfsd_file_acquire_gc
   nfsd_file_do_acquire
    nfsd_file_lookup_locked
     // get nfsd_file from nfsd_file_rhltable
                                         nfsd_read
                                          nfsd_file_acquire_gc
                                           nfsd_file_do_acquire
                                            nfsd_file_alloc
                                             nf->nf_flags // set NFSD_FILE_PENDING
                                            rhltable_insert // insert to nfsd_file_rhltable
                                            nf->nf_file = file // set nf_file
    wait_on_bit
    // wait NFSD_FILE_PENDING to be cleared
                                            clear_and_wake_up_bit // clear NFSD_FILE_PENDING
    // get file after being awakened
  file = nf->nf_file

Or like this:
                 T1                                    T2
nfsd_read
  nfsd_file_acquire_gc
   nfsd_file_do_acquire
    nfsd_file_lookup_locked
     // get nfsd_file from nfsd_file_rhltable
                                         nfsd_read
                                          nfsd_file_acquire_gc
                                           nfsd_file_do_acquire
                                            nfsd_file_alloc
                                             nf->nf_flags // set NFSD_FILE_PENDING
                                            rhltable_insert // insert to nfsd_file_rhltable
                                            nf->nf_file = file // set nf_file
                                            clear_and_wake_up_bit // clear NFSD_FILE_PENDING
    // get file directly
  file = nf->nf_file

But is it possible that due to reordering, it ends up like this:
                 T1                                    T2
nfsd_read
  nfsd_file_acquire_gc
   nfsd_file_do_acquire
    nfsd_file_lookup_locked
     // get nfsd_file from nfsd_file_rhltable
                                         nfsd_read
                                          nfsd_file_acquire_gc
                                           nfsd_file_do_acquire
                                            nfsd_file_alloc
                                             nf->nf_flags // set NFSD_FILE_PENDING
                                            rhltable_insert // insert to nfsd_file_rhltable
                                            clear_and_wake_up_bit // clear NFSD_FILE_PENDING
    // get file directly
  file = nf->nf_file
                                            nf->nf_file = file // set nf_file
  // Null dereference due to uninitialized file pointer.
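
The ordering that the first two diagrams rely on can be sketched in
userspace with C11 atomics. This is only an illustration under stated
assumptions, not the nfsd code: struct demo_nf, DEMO_PENDING, demo_opener(),
demo_waiter() and demo_run_once() are hypothetical names, the release
fetch-and stands in for clear_and_wake_up_bit(), and the acquire load stands
in for the flag check done by wait_on_bit(). Whether the 5.10 kernel
primitives provide the same pairing is exactly the open question here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for NFSD_FILE_PENDING. */
#define DEMO_PENDING 1u

struct demo_file { int payload; };

/* Hypothetical stand-in for struct nfsd_file: a flag word plus a plain pointer. */
struct demo_nf {
	atomic_uint flags;
	struct demo_file *file;
};

static struct demo_file demo_backing = { .payload = 42 };

/* T2: set the file pointer, then clear PENDING with release semantics. */
static void *demo_opener(void *arg)
{
	struct demo_nf *nf = arg;

	nf->file = &demo_backing;	/* plain store, like nf->nf_file = file */
	atomic_fetch_and_explicit(&nf->flags, ~DEMO_PENDING, memory_order_release);
	return NULL;
}

/* T1: wait until PENDING is clear (acquire), then read the file pointer. */
static void *demo_waiter(void *arg)
{
	struct demo_nf *nf = arg;

	/* Busy-wait for brevity; the kernel would sleep instead. */
	while (atomic_load_explicit(&nf->flags, memory_order_acquire) & DEMO_PENDING)
		;
	/* The acquire load that saw PENDING clear orders this read after the store. */
	assert(nf->file != NULL);
	return nf->file;
}

/* Run one waiter/opener pair; return the payload the waiter saw, or -1 on error. */
static int demo_run_once(void)
{
	struct demo_nf nf;
	pthread_t waiter, opener;
	void *seen = NULL;

	nf.file = NULL;
	atomic_init(&nf.flags, DEMO_PENDING);

	if (pthread_create(&waiter, NULL, demo_waiter, &nf))
		return -1;
	if (pthread_create(&opener, NULL, demo_opener, &nf)) {
		/* No opener thread: publish here so the waiter can finish. */
		demo_opener(&nf);
		pthread_join(waiter, NULL);
		return -1;
	}
	pthread_join(opener, NULL);
	pthread_join(waiter, &seen);
	return seen ? ((struct demo_file *)seen)->payload : -1;
}
```

Built with something like cc -pthread and called from main(),
demo_run_once() returns 42 with the release/acquire pair in place. If both
flag operations were relaxed instead, the C11 model would no longer
guarantee that the waiter sees the pointer store, which is the situation
shown in the third diagram.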

[1]: https://lore.kernel.org/all/20230818065507.1280625-1-haydenw.kernel@gmail.com/

Any suggestions would be appreciated.

Thanks,
Lingfeng.

