All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Mahoney <jeffm@suse.com>
To: Christoph Hellwig <hch@lst.de>, Stefan Priebe <s.priebe@profihost.ag>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, Chris Mason <clm@fb.com>
Subject: Re: btrfs regression since 4.X kernel NULL pointer dereference
Date: Fri, 11 Sep 2015 15:05:36 -0400	[thread overview]
Message-ID: <55F32600.6080201@suse.com> (raw)
In-Reply-To: <55F32395.6030702@suse.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/11/15 2:55 PM, Jeff Mahoney wrote:
> On 8/25/15 5:00 AM, Christoph Hellwig wrote:
>> I think this is btrfs using a struct block_device that doesn't 
>> have a valid queue pointer in it's gendisk for ->s_bdev.  And
>> there are some fishy looking ->s_bdev assignments in the code
>> which I suspect are related to it:
> 
>> fs/btrfs/dev-replace.c: if (fs_info->sb->s_bdev == 
>> src_device->bdev) fs/btrfs/dev-replace.c: fs_info->sb->s_bdev =
>> tgt_device->bdev; fs/btrfs/volumes.c:     if (device->bdev ==
>> root->fs_info->sb->s_bdev) fs/btrfs/volumes.c: 
>> root->fs_info->sb->s_bdev = next_device->bdev;
>> fs/btrfs/volumes.c: if (tgtdev->bdev == fs_info->sb->s_bdev)
>> fs/btrfs/volumes.c: fs_info->sb->s_bdev = next_device->bdev;
> 
> The report at https://bugzilla.kernel.org/show_bug.cgi?id=100911 
> tracks it down a bit further and it's bdev->bd_disk == NULL instead
> of the queue in the gendisk. I don't think that the s_bdev stuff
> is related, though I'd certainly love to see that bit go away.
> 
> If we're calling blk_get_backing_dev_info, that means we're
> already using an inode that has blockdev_superblock and the btrfs
> superblock isn't even involved.
> 
> We're getting there because btrfs_evict_inode -> 
> btrfs_wait_ordered_range -> btrfs_fdatawrite_range -> 
> filemap_fdatawrite_range gets called with inode->i_mapping.  That 
> mapping gets passed down through __filemap_fdatawrite_range to 
> wbc_attach_fdatawrite_inode where the inode passed is mapping->host
> -- which will be the block device inode rather than the btrfs
> device node inode.  That inode is the one ultimately checked in
> inode_to_bdi.
> 
> So it looks like we're causing writeback on an unrelated block
> device that was opened using a device node hosted on btrfs, which
> is obviously wrong.
> 
> I don't think snapshot removal is even a requirement to trigger
> this. I expect it's possible to trigger with two device nodes for
> the same block device where one is getting closed and cleaned up
> while the eviction of the other happens.  The device nodes wouldn't
> even need to be on the same fs.
> 
> Other file systems use &inode->i_data in eviction.  Is it that
> simple here?

Incidentally, this explanation also covers why I was unable to
reproduce it locally.  SLES systems use devtmpfs and I just bind
mounted it into my chroot environment like I normally would.  When I
cp'd /dev into the test environment, I was able to reproduce immediately
.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)

iQIcBAEBAgAGBQJV8yYAAAoJEB57S2MheeWy2BwP+QGdpsErIfyHJcx95LLrvsxu
n0kBoI4Jd5yfNxp8m+Ll3xgUdsd6rKHJV2Muq8aRdNEdzf1E0DFrRcE0d1W5UrJy
lPzrA8QxCVaLf5jFysFp0xygKbLKHGmOAv2KnAGYFw6exIjb344UnZb6aiw5Uekm
DqrTmEq+0Yb/mE04GVpWMylK6pkDOhgkOzFVZa1Pff0eKY4E61G5GtmA2kNAUP9v
CsoZ0FO1WdF2Fc9ONSPjq7FdZLKH+OmIVakHnaELa8EEM3W7NU+mxLRabznBV25e
L/KPjr+awzkhV1ieyAAww/dddE3bN5nmDOq+OgvA9WPgaRvvwne2tHVTFaxHoiHg
d8oHDLkC1/Z1MqINLi5dZNsSuIWMvRhIMV9Th5F2rdWxrBCSRvID7N+Z2HHh6mJC
Q9rgSOyYKclTam6IF7yX8lDWIqkAnoA6OxvOKRccgr3hS/u4DzVtRmWHO9RblEi+
a9dF2FCP+v+Lgdb8C5n7XUixrtF5H6BWHhmArgjmxD6iyeXOmphyGrgqmSLdY1s9
sakvLrSB9i3O27CKoup2OHyF6MOdgsaa90FZLPLt6BDrCTWAscd0LDy8MbaKgKCR
kjfSTiwNydzZfkJixH71U/1mGbuB9nqf6jrNWCQdE5f57MSCEwiFqQvaD1KK+Uug
ZW2Bz1VQxkOvGbYiJ4HV
=ic4+
-----END PGP SIGNATURE-----

  reply	other threads:[~2015-09-11 19:05 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-22 17:29 btrfs regression since 4.X kernel NULL pointer dereference Stefan Priebe
2015-08-25  9:00 ` Christoph Hellwig
2015-08-25  9:44   ` Stefan Priebe - Profihost AG
2015-08-25 13:51   ` Chris Mason
2015-08-31 17:32     ` Stefan Priebe - Profihost AG
2015-09-01  0:06       ` Chris Mason
2015-09-01  4:41         ` Stefan Priebe
2015-09-11 23:21           ` Christoph Biedl
2015-09-10 22:21         ` Jeff Mahoney
2015-09-11  4:55           ` Stefan Priebe
2015-09-11 18:55   ` Jeff Mahoney
2015-09-11 19:05     ` Jeff Mahoney [this message]
2015-09-11 23:31       ` Stefan Priebe
2015-09-11 19:34     ` Chris Mason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55F32600.6080201@suse.com \
    --to=jeffm@suse.com \
    --cc=clm@fb.com \
    --cc=hch@lst.de \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=s.priebe@profihost.ag \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.