public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Paul Clements <Paul.Clements@SteelEye.com>
To: Lou Langholtz <ldl@aros.net>, Andrew Morton <akpm@osdl.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] 2.6.0 NBD driver: remove send/recieve race for request
Date: Fri, 08 Aug 2003 12:47:26 -0400	[thread overview]
Message-ID: <3F33D41E.89CFA182@SteelEye.com> (raw)
In-Reply-To: 3F334396.7030008@aros.net

Lou Langholtz wrote:

> >--- linux-2.6.0-test2-mm4-PRISTINE/drivers/block/nbd.c Sun Jul 27 12:58:51 2003
> >+++ linux-2.6.0-test2-mm4/drivers/block/nbd.c  Thu Aug  7 18:02:23 2003
> >@@ -416,11 +416,19 @@ void nbd_clear_que(struct nbd_device *lo
> >       BUG_ON(lo->magic != LO_MAGIC);
> > #endif
> >
> >+retry:
> >       do {
> >               req = NULL;
> >               spin_lock(&lo->queue_lock);
> >               if (!list_empty(&lo->queue_head)) {
> >                       req = list_entry(lo->queue_head.next, struct request, queuelist);
> >+                      if (req->ref_count > 1) { /* still in xmit */
> >+                              spin_unlock(&lo->queue_lock);
> >+                              printk(KERN_DEBUG "%s: request %p: still in use (%d), waiting...\n",
> >+                                  lo->disk->disk_name, req, req->ref_count);
> >+                              schedule_timeout(HZ); /* wait a second */
> >

> Isn't there something more deterministic than just waiting a second and
> hoping things clear up that you can use here? 

It's not exactly "hoping" something will happen...we're waiting until
do_nbd_request decrements the ref_count, which is completely
deterministic, but just not guaranteed to have already happened when
nbd_clear_que() starts. 

The use of the ref_count closes a race window (the one that I pointed
out in my response to Lou's original patch a few days ago). It protects
us in the following case(s):

1) do_nbd_request begins -- sock and file are valid
2) do_nbd_request queues the request and then calls nbd_send_req to send
the request out on the network
3) on another CPU, or after preempt, nbd-client gets a signal or an
"nbd-client -d" is called, which results in sock and file being set to
NULL, and the queue begins to be cleared, and requests that were on the
queue get freed (before do_nbd_request has finished accessing req for
the last time)


> How about not clearing the
> queue unless lo->sock is NULL and using whatever lock it is now that's
> protecting lo->sock. That way the queue clearing race can be eliminated too.

That makes sense. I hadn't done this (for compatibility reasons) since
"nbd-client -d" (disconnect) calls NBD_CLEAR_QUE before disconnecting
the socket. But I think we can just make NBD_CLEAR_QUE silently fail
(return 0) if lo->sock is set and basically turn that first
NBD_CLEAR_QUE into a noop, and let the nbd_clear_que() call at the end
of NBD_CLEAR_SOCK handle it (properly). Note that the race condition
above still applies, even with this lo->sock check in place. But this
check does eliminate the race in "nbd-client -d" where, e.g., requests
are being queued up faster than nbd_clear_que can dequeue them,
potentially making nbd_clear_que loop forever.

As a side note, NBD_CLEAR_QUE is actually now completely superfluous,
since NBD_DO_IT and NBD_CLEAR_SOCK (which always get called in
conjunction with NBD_CLEAR_QUE) contain the nbd_clear_que() call
themselves. We could just make NBD_CLEAR_QUE a noop in all cases, but I
guess we'd risk breaking some nbd client tool that's out there...

I'm testing the updated patch now, I'll send it out in a little while...

Thanks,
Paul

  parent reply	other threads:[~2003-08-08 16:49 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-08-05 16:51 [PATCH] 2.6.0 NBD driver: remove send/recieve race for request Lou Langholtz
2003-08-05 19:37 ` Paul Clements
2003-08-05 22:48   ` Lou Langholtz
2003-08-06  0:51     ` Paul Clements
2003-08-06  7:34       ` Lou Langholtz
2003-08-08  5:02         ` Paul Clements
2003-08-08  5:27           ` Andrew Morton
2003-08-08 17:05             ` Paul Clements
2003-08-08  6:30           ` Lou Langholtz
2003-08-08  6:43             ` Andrew Morton
2003-08-08  6:59             ` Jens Axboe
2003-08-08 15:00               ` Paul Clements
2003-08-25  9:58                 ` Jens Axboe
2003-08-08 16:47             ` Paul Clements [this message]
2003-08-08 20:07               ` [PATCH 2.6.0-test2-mm] nbd: fix send/receive/shutdown/disconnect races Paul Clements
2003-08-09 22:10                 ` [PATCH 2.4.22-pre] nbd: fix race conditions Paul Clements

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3F33D41E.89CFA182@SteelEye.com \
    --to=paul.clements@steeleye.com \
    --cc=akpm@osdl.org \
    --cc=ldl@aros.net \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox