From: Paul Clements <paul.clements@steeleye.com>
To: Mike Snitzer <snitzer@gmail.com>
Cc: nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: nbd: Oops because nbd doesn't prevent NBD_CLEAR_SOCK while sock_xmit() is working on a receive
Date: Thu, 27 Mar 2008 08:35:55 -0400
Message-ID: <47EB94AB.6090608@steeleye.com>
In-Reply-To: <170fa0d20803261143s1ab258b2ra470c158ac5744a@mail.gmail.com>

Mike Snitzer wrote:

> In practice this looks like:
> 
> nbd1: NBD_DISCONNECT
> nbd1: Send control failed (result -32)
> end_request: I/O error, dev nbd1, sector 0
> end_request: I/O error, dev nbd1, sector 8032264
> md: super_written gets error=-5, uptodate=0
> raid1: Disk failure on nbd1, disabling device.
>         Operation continuing on 1 devices
> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
>  [<ffffffff88b1e125>] :nbd:sock_xmit+0x9d/0x301

> The fact that sock_xmit() in receive mode is unprotected seems to be
> the WHY a NULL pointer is possible; but I'm still trying to identify
> the HOW.

Do you know who is setting the socket NULL? Is it already NULL when you 
get to this point? Is it the nbd-client -d? Is it the original 
nbd-client/kernel that does it? Figuring that out would help narrow down 
the cause.

> But for me this begs the question: why isn't the nbd_device's socket
> always protected during sock_xmit() for both transmits and receives,
> rather than just transmits (via tx_lock)!?

It would deadlock if we held the lock over both. Generally we don't have 
to worry about receives, since they're always done in the nbd-client 
process, so we have control over when and how it exits and cleans up. 
The odd case, as you've discovered, is when another process (nbd-client 
-d) comes along and starts mucking with the queue and socket. Would 
"kill -9 <nbd-client-pid>" work for you instead? That is what I use to 
break the connection, and it's safe, as it tells the original nbd-client 
to exit (which it does cleanly and safely).
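
A minimal user-space sketch of the race discussed above, for illustration only. It is not the nbd driver's code: struct fake_nbd, fake_clear_sock(), fake_recv() and fake_race_demo() are hypothetical names, a pthread mutex stands in for tx_lock, and an int pointer stands in for the socket. It assumes the receive path takes the lock only long enough to copy the socket pointer, so the lock is never held across the receive itself.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device state; not the real struct. */
struct fake_nbd {
    pthread_mutex_t tx_lock; /* models the transmit lock named in the thread */
    int *sock;               /* models the socket pointer; NULL once cleared */
};

/* Models a second process issuing NBD_CLEAR_SOCK. */
static void fake_clear_sock(struct fake_nbd *dev)
{
    pthread_mutex_lock(&dev->tx_lock);
    dev->sock = NULL;
    pthread_mutex_unlock(&dev->tx_lock);
}

/*
 * Models the receive side. The lock is held only long enough to copy the
 * pointer, not across the (potentially blocking) receive itself.
 */
static int fake_recv(struct fake_nbd *dev)
{
    int *sock;

    pthread_mutex_lock(&dev->tx_lock);
    sock = dev->sock;
    pthread_mutex_unlock(&dev->tx_lock);

    if (sock == NULL)
        return -EINVAL; /* socket already cleared: fail instead of oopsing */
    return *sock;       /* stands in for the actual receive */
}

/* Walks through the sequence: receive, clear, receive again. */
static int fake_race_demo(void)
{
    int payload = 7;
    struct fake_nbd dev = { PTHREAD_MUTEX_INITIALIZER, &payload };

    assert(fake_recv(&dev) == 7);
    fake_clear_sock(&dev);
    return fake_recv(&dev); /* -EINVAL after the clear */
}
```

The sketch only turns the NULL dereference into an error return. It does nothing to keep the socket object alive while a receive is in progress, which a real fix in the driver would also have to address.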

--
Paul

