linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jack Wang <jinpu.wang@profitbricks.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>
Subject: Re: [BUG]NULL pointer dereference at 0000000000000008 __blkdev_put+0x17f/0x1d0
Date: Mon, 06 Jan 2014 09:45:08 +0100	[thread overview]
Message-ID: <52CA6D14.4040300@profitbricks.com> (raw)
In-Reply-To: <20140104060925.GF10323@ZenIV.linux.org.uk>

On 01/04/2014 07:09 AM, Al Viro wrote:
> On Thu, Jan 02, 2014 at 10:36:30AM +0100, Jack Wang wrote:
> 
>>> Bug happened at line 1486, looks disk->fops is NULL here for some
>>> reason, is it reasonable to add a check like:
>>>
>>> if (disk->fops)
>>> 	if (disk->fops->release)
>>> 		ret = disk->fops->release(disk, mode);
>>>
>>>
>>> Happy New Year and Best regards:)
>>> Jack
>>>
>>
>> Ping, could you share opnions on this, attached with patch I proposaled.
> 
> Sorry, had been sick since mid-December ;-/  The patch is not a good idea -
> in the best case it's papering over a bug (and insufficiently so, at that,
> since there are other places where disk->fops->some_method is checked).
> 
> gendisk->fops should never be assigned NULL; it starts life with NULL
> ->fops, but that should be assigned a non-NULL value (and never modified
> afterwards) before anyone can see it.  Moreover, even if some driver has
> fscked up and forgot to initialize the damn thing, get_gendisk() would've
> refused to return such a thing to any callers (including __blkdev_get()).
> Note that __blkdev_get() would oops on such a thing if get_gendisk()
> somehow returned it.
> 
> Looks like something is shitting over bdev->bd_disk or bdev->bd_disk->fops.
> The offsets in the disassembled code are all wrong (including that from
> beginning of function to oopsing instruction), but the code match is good,
> so I agree that we are hitting bdev->bd_disk->fops == NULL here.  The
> question is how it has happened - that's where the real bug is...
> 
> How reproducible it is?  And which kernel, while we are at it?  This area
> didn't get a lot of changes lately, but still...
> 
Thanks Al for reply, and look into this.
We're using 3.4.71, and this happened in production, we can not
reproduce it yet. What I could see is: before this happened, we saw scsi
devices offlined, and multipath failed path, raid1 failed member device.

Possible the bug lies in drivers md-raid1, dm-multipath or sd? How could
I narrow it down? Could you teach me?

Thanks, wish you happy and healthy!

Jack

      reply	other threads:[~2014-01-06  8:45 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-30 15:55 [BUG]NULL pointer dereference at 0000000000000008 __blkdev_put+0x17f/0x1d0 Jack Wang
2014-01-02  9:36 ` Jack Wang
2014-01-04  6:09   ` Al Viro
2014-01-06  8:45     ` Jack Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52CA6D14.4040300@profitbricks.com \
    --to=jinpu.wang@profitbricks.com \
    --cc=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).