From: Paul Clements <Paul.Clements@SteelEye.com>
To: Lou Langholtz <ldl@aros.net>, Andrew Morton <akpm@osdl.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] 2.6.0 NBD driver: remove send/recieve race for request
Date: Fri, 08 Aug 2003 12:47:26 -0400 [thread overview]
Message-ID: <3F33D41E.89CFA182@SteelEye.com> (raw)
In-Reply-To: 3F334396.7030008@aros.net
Lou Langholtz wrote:
> >--- linux-2.6.0-test2-mm4-PRISTINE/drivers/block/nbd.c Sun Jul 27 12:58:51 2003
> >+++ linux-2.6.0-test2-mm4/drivers/block/nbd.c Thu Aug 7 18:02:23 2003
> >@@ -416,11 +416,19 @@ void nbd_clear_que(struct nbd_device *lo
> > BUG_ON(lo->magic != LO_MAGIC);
> > #endif
> >
> >+retry:
> > do {
> > req = NULL;
> > spin_lock(&lo->queue_lock);
> > if (!list_empty(&lo->queue_head)) {
> > req = list_entry(lo->queue_head.next, struct request, queuelist);
> >+ if (req->ref_count > 1) { /* still in xmit */
> >+ spin_unlock(&lo->queue_lock);
> >+ printk(KERN_DEBUG "%s: request %p: still in use (%d), waiting...\n",
> >+ lo->disk->disk_name, req, req->ref_count);
> >+ schedule_timeout(HZ); /* wait a second */
> >
> Isn't there something more deterministic than just waiting a second and
> hoping things clear up that you can use here?
It's not exactly "hoping" something will happen...we're waiting until
do_nbd_request decrements the ref_count, which is completely
deterministic, but just not guaranteed to have already happened when
nbd_clear_que() starts.
The use of the ref_count closes a race window (the one that I pointed
out in my response to Lou's original patch a few days ago). It protects
us in the following case(s):
1) do_nbd_request begins -- sock and file are valid
2) do_nbd_request queues the request and then calls nbd_send_req to send
the request out on the network
3) on another CPU, or after preempt, nbd-client gets a signal or an
"nbd-client -d" is called, which results in sock and file being set to
NULL, and the queue begins to be cleared, and requests that were on the
queue get freed (before do_nbd_request has finished accessing req for
the last time)
> How about not clearing the
> queue unless lo->sock is NULL and using whatever lock it is now that's
> protecting lo->sock. That way the queue clearing race can be eliminated too.
That makes sense. I hadn't done this (for compatibility reasons) since
"nbd-client -d" (disconnect) calls NBD_CLEAR_QUE before disconnecting
the socket. But I think we can just make NBD_CLEAR_QUE silently fail
(return 0) if lo->sock is set and basically turn that first
NBD_CLEAR_QUE into a noop, and let the nbd_clear_que() call at the end
of NBD_CLEAR_SOCK handle it (properly). Note that the race condition
above still applies, even with this lo->sock check in place. But this
check does eliminate the race in "nbd-client -d" where, e.g., requests
are being queued up faster than nbd_clear_que can dequeue them,
potentially making nbd_clear_que loop forever.
As a side note, NBD_CLEAR_QUE is actually now completely superfluous,
since NBD_DO_IT and NBD_CLEAR_SOCK (which always get called in
conjunction with NBD_CLEAR_QUE) contain the nbd_clear_que() call
themselves. We could just make NBD_CLEAR_QUE a noop in all cases, but I
guess we'd risk breaking some nbd client tool that's out there...
I'm testing the updated patch now, I'll send it out in a little while...
Thanks,
Paul
next prev parent reply other threads:[~2003-08-08 16:49 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-08-05 16:51 [PATCH] 2.6.0 NBD driver: remove send/recieve race for request Lou Langholtz
2003-08-05 19:37 ` Paul Clements
2003-08-05 22:48 ` Lou Langholtz
2003-08-06 0:51 ` Paul Clements
2003-08-06 7:34 ` Lou Langholtz
2003-08-08 5:02 ` Paul Clements
2003-08-08 5:27 ` Andrew Morton
2003-08-08 17:05 ` Paul Clements
2003-08-08 6:30 ` Lou Langholtz
2003-08-08 6:43 ` Andrew Morton
2003-08-08 6:59 ` Jens Axboe
2003-08-08 15:00 ` Paul Clements
2003-08-25 9:58 ` Jens Axboe
2003-08-08 16:47 ` Paul Clements [this message]
2003-08-08 20:07 ` [PATCH 2.6.0-test2-mm] nbd: fix send/receive/shutdown/disconnect races Paul Clements
2003-08-09 22:10 ` [PATCH 2.4.22-pre] nbd: fix race conditions Paul Clements
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3F33D41E.89CFA182@SteelEye.com \
--to=paul.clements@steeleye.com \
--cc=akpm@osdl.org \
--cc=ldl@aros.net \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox