public inbox for linux-kernel@vger.kernel.org
* nbd problem.
@ 2007-05-08 19:40 Rogier Wolff
  2007-05-08 20:33 ` Satyam Sharma
  2007-05-09  5:48 ` Jens Axboe
  0 siblings, 2 replies; 7+ messages in thread
From: Rogier Wolff @ 2007-05-08 19:40 UTC
  To: linux-kernel


Hi,

The nbd client still reliably hangs when I use it. 

While looking into this, I found:


446                 req->errors = 0;
447                 spin_unlock_irq(q->queue_lock);
                   ^^^^^^^^^^^^^^^^^^^^
448 
449                 mutex_lock(&lo->tx_lock);
450                 if (unlikely(!lo->sock)) {
451                         mutex_unlock(&lo->tx_lock);
452                         printk(KERN_ERR "%s: Attempted send on closed socket\n",
453                                lo->disk->disk_name);
454                         req->errors++;
455                         nbd_end_request(req);
456                         spin_lock_irq(q->queue_lock);
457                         continue;
458                 }
459 
460                 lo->active_req = req;
461 
462                 if (nbd_send_req(lo, req) != 0) {
463                         printk(KERN_ERR "%s: Request send failed\n",
464                                         lo->disk->disk_name);
465                         req->errors++;
466                         nbd_end_request(req);
467                 } else {
468                         spin_lock(&lo->queue_lock);
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
469                         list_add(&req->queuelist, &lo->queue_head);
470                         spin_unlock(&lo->queue_lock);
471                 }
472 
473                 lo->active_req = NULL;


As far as I read things, the function is called with the lock
held and interrupts disabled. The lock can then be released and
retaken without disabling interrupts again.

Should this be fixed?

(it doesn't fix my hang though....)

	Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ


* Re: nbd problem.
  2007-05-08 19:40 nbd problem Rogier Wolff
@ 2007-05-08 20:33 ` Satyam Sharma
  2007-05-09 11:10   ` Rogier Wolff
  2007-05-09  5:48 ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Satyam Sharma @ 2007-05-08 20:33 UTC
  To: Rogier Wolff; +Cc: linux-kernel

On 5/8/07, Rogier Wolff <R.E.Wolff@bitwizard.nl> wrote:
>
> Hi,
>
> The nbd client still reliably hangs when I use it.
>
> While looking into this, I found:
>
>
> 446                 req->errors = 0;
> 447                 spin_unlock_irq(q->queue_lock);
>                    ^^^^^^^^^^^^^^^^^^^^

BTW (this may be unrelated to the original issue), can anybody ever
have a _genuine_ excuse to use spin_lock_irq / spin_unlock_irq rather
than spin_lock_irqsave / spin_unlock_irqrestore? I find the latter
primitives more tasteful even when I *know* whether something is being
called with interrupts enabled or disabled -- you never know when some
code will be re-used somewhere else and/or ripped out of one place and
put inside another ... the former API only invites trouble, if
anything.
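
To make the difference concrete, here is a minimal userspace model (not
kernel code). A single bool stands in for the CPU's interrupt-enable
state, and every model_* / section_* / irqs_after name is made up for
illustration. It only sketches the flag handling of the two API pairs,
assuming nothing else about the locks themselves:

```c
#include <stdbool.h>

/* Userspace model only: this flag stands in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* Model of spin_lock_irq(): disables interrupts unconditionally. */
static void model_lock_irq(void) { irqs_enabled = false; }

/* Model of spin_unlock_irq(): enables interrupts unconditionally. */
static void model_unlock_irq(void) { irqs_enabled = true; }

/* Model of spin_lock_irqsave(): remembers the previous state, then disables. */
static void model_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

/* Model of spin_unlock_irqrestore(): puts back whatever state was saved. */
static void model_unlock_irqrestore(bool flags) { irqs_enabled = flags; }

/* A critical section written with the _irq pair. */
static void section_irq(void)
{
    model_lock_irq();
    /* ... protected work ... */
    model_unlock_irq();
}

/* The same critical section written with the _irqsave pair. */
static void section_irqsave(void)
{
    bool flags;

    model_lock_irqsave(&flags);
    /* ... protected work ... */
    model_unlock_irqrestore(flags);
}

/* Run a section from a given starting state and report the state afterwards. */
static bool irqs_after(void (*section)(void), bool initially_enabled)
{
    irqs_enabled = initially_enabled;
    section();
    return irqs_enabled;
}
```

Started with interrupts already disabled, irqs_after(section_irq, false)
comes back with the flag set, i.e. interrupts re-enabled behind the
caller's back, while irqs_after(section_irqsave, false) leaves it
clear. In the real kernel the saved flags are an unsigned long and
spin_lock_irqsave() takes the variable by name rather than by pointer.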


* Re: nbd problem.
  2007-05-08 19:40 nbd problem Rogier Wolff
  2007-05-08 20:33 ` Satyam Sharma
@ 2007-05-09  5:48 ` Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2007-05-09  5:48 UTC
  To: Rogier Wolff; +Cc: linux-kernel

On Tue, May 08 2007, Rogier Wolff wrote:
> 
> Hi,
> 
> The nbd client still reliably hangs when I use it. 
> 
> While looking into this, I found:
> 
> 
> 446                 req->errors = 0;
> 447                 spin_unlock_irq(q->queue_lock);
>                    ^^^^^^^^^^^^^^^^^^^^
> 448 
> 449                 mutex_lock(&lo->tx_lock);
> 450                 if (unlikely(!lo->sock)) {
> 451                         mutex_unlock(&lo->tx_lock);
> 452                         printk(KERN_ERR "%s: Attempted send on closed socket\n",
> 453                                lo->disk->disk_name);
> 454                         req->errors++;
> 455                         nbd_end_request(req);
> 456                         spin_lock_irq(q->queue_lock);
> 457                         continue;
> 458                 }
> 459 
> 460                 lo->active_req = req;
> 461 
> 462                 if (nbd_send_req(lo, req) != 0) {
> 463                         printk(KERN_ERR "%s: Request send failed\n",
> 464                                         lo->disk->disk_name);
> 465                         req->errors++;
> 466                         nbd_end_request(req);
> 467                 } else {
> 468                         spin_lock(&lo->queue_lock);
>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 469                         list_add(&req->queuelist, &lo->queue_head);
> 470                         spin_unlock(&lo->queue_lock);
> 471                 }
> 472 
> 473                 lo->active_req = NULL;
> 
> 
> As far as I read things, the function is called with the lock
> held and interrupts disabled. The lock can then be released and
> retaken without disabling interrupts again.
> 
> Should this be fixed?
> 
> (it doesn't fix my hang though....)

Note lo->queue_lock vs q->queue_lock.
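
A minimal userspace model of that point, for illustration only: the
model_* types, struct observation and replay_send_path() are made up
here and are not kernel definitions. It replays the lock calls on the
successful-send path of the excerpt (lines 447, 468 and 470) and
records what is held at the list_add():

```c
#include <stdbool.h>

/* Userspace model only; none of these types are kernel definitions. */
struct model_lock { bool held; };

/* Stands in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* q->queue_lock is reached through a pointer, lo->queue_lock is embedded. */
struct model_request_queue { struct model_lock *queue_lock; };
struct model_nbd_device    { struct model_lock queue_lock; };

static void model_spin_lock_irq(struct model_lock *l)   { irqs_enabled = false; l->held = true; }
static void model_spin_unlock_irq(struct model_lock *l) { l->held = false; irqs_enabled = true; }
static void model_spin_lock(struct model_lock *l)       { l->held = true; }
static void model_spin_unlock(struct model_lock *l)     { l->held = false; }

/* What is held, and whether interrupts are on, at the list_add() on line 469. */
struct observation {
    bool q_lock_held;
    bool lo_lock_held;
    bool irqs_enabled;
};

static struct observation replay_send_path(void)
{
    struct model_lock block_layer_lock = { false };
    struct model_request_queue q = { &block_layer_lock };
    struct model_nbd_device lo = { { false } };
    struct observation at_list_add;

    model_spin_lock_irq(q.queue_lock);    /* state on entry, as described in the report */
    model_spin_unlock_irq(q.queue_lock);  /* line 447 */
    model_spin_lock(&lo.queue_lock);      /* line 468: a different lock object */

    at_list_add.q_lock_held  = q.queue_lock->held;
    at_list_add.lo_lock_held = lo.queue_lock.held;
    at_list_add.irqs_enabled = irqs_enabled;

    model_spin_unlock(&lo.queue_lock);    /* line 470 */
    return at_list_add;
}
```

In the model, the lock dropped on line 447 is never retaken on line
468; a second, unrelated lock is taken instead. Whether a plain
spin_lock() is safe there depends on lo->queue_lock never being taken
from interrupt context, which the excerpt alone does not show.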

-- 
Jens Axboe



* Re: nbd problem.
  2007-05-08 20:33 ` Satyam Sharma
@ 2007-05-09 11:10   ` Rogier Wolff
  2007-05-09 12:38     ` Rogier Wolff
  2007-05-09 13:39     ` Jens Axboe
  0 siblings, 2 replies; 7+ messages in thread
From: Rogier Wolff @ 2007-05-09 11:10 UTC
  To: Satyam Sharma; +Cc: Rogier Wolff, linux-kernel

On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> On 5/8/07, Rogier Wolff <R.E.Wolff@bitwizard.nl> wrote:
> >
> >Hi,
> >
> >The nbd client still reliably hangs when I use it.

Someone suggested using

http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary

and that fixed it.  (i.e. there is something in there that should
be merged....)

Jens, thanks for pointing out that there were different locks 
involved.

	Roger. 

(I seem to have lost all the other emails in this thread. Apparently
my delete-old-list-emails is too aggressive today...)


* Re: nbd problem.
  2007-05-09 11:10   ` Rogier Wolff
@ 2007-05-09 12:38     ` Rogier Wolff
  2007-05-09 12:52       ` Jan Engelhardt
  2007-05-09 13:39     ` Jens Axboe
  1 sibling, 1 reply; 7+ messages in thread
From: Rogier Wolff @ 2007-05-09 12:38 UTC
  To: Rogier Wolff; +Cc: Satyam Sharma, linux-kernel

On Wed, May 09, 2007 at 01:10:49PM +0200, Rogier Wolff wrote:
> On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> > On 5/8/07, Rogier Wolff <R.E.Wolff@bitwizard.nl> wrote:
> > >
> > >Hi,
> > >
> > >The nbd client still reliably hangs when I use it.
> 
> Someone suggested using
> 
> http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary
> 
> and that fixed it.  (i.e. there is something in there that should
> be merged....)

Cancel the party! It got MUCH further than before, but crashed
eventually. 

ozon:~> ps auxww | grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       110  0.4  0.0      0     0 ?        D    10:28   0:31 [pdflush]
root       112  0.0  0.0      0     0 ?        D<   10:28   0:05 [kswapd0]
root      1649  0.0  0.1   1604   108 pts/0    D+   11:17   0:03 nbd-client petisuix 1234 /dev/nd0
root      1654  0.9  4.5   4648  2816 pts/0    D+   11:17   0:44 rsync /usr/src/linux-2.6.21.ozon /mnt/test1 -av --progress
wolff     1716  0.0  0.9   1648   560 pts/1    R+   12:33   0:00 grep D
ozon:~> 

Can anybody help me figure out what these processes are waiting for?

	Roger. 


* Re: nbd problem.
  2007-05-09 12:38     ` Rogier Wolff
@ 2007-05-09 12:52       ` Jan Engelhardt
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Engelhardt @ 2007-05-09 12:52 UTC
  To: Rogier Wolff; +Cc: Satyam Sharma, linux-kernel


On May 9 2007 14:38, Rogier Wolff wrote:
>
>ozon:~> ps auxww | grep D
>USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>root       110  0.4  0.0      0     0 ?        D    10:28   0:31 [pdflush]
>root       112  0.0  0.0      0     0 ?        D<   10:28   0:05 [kswapd0]
>root      1649  0.0  0.1   1604   108 pts/0    D+   11:17   0:03 nbd-client petisuix 1234 /dev/nd0
>root      1654  0.9  4.5   4648  2816 pts/0    D+   11:17   0:44 rsync /usr/src/linux-2.6.21.ozon /mnt/test1 -av --progress
>wolff     1716  0.0  0.9   1648   560 pts/1    R+   12:33   0:00 grep D
>ozon:~> 
>
>Can anybody help me figure out what these processes are waiting for?

echo t >/proc/sysrq-trigger

dumps the state and stack trace of every task to the kernel log
(see dmesg; it usually ends up in /var/log/messages too).
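
On the listing itself: "grep D" also matches the header line and any
command with a capital D in it, so a narrower check is to look at the
state field directly. The sketch below is only an illustration (the
helper names stat_state and is_uninterruptible are made up); it assumes
the usual /proc/<pid>/stat layout of "pid (comm) state ...":

```c
#include <string.h>

/*
 * Return the one-letter task state from the contents of /proc/<pid>/stat,
 * or '?' if the line cannot be parsed. The command name is enclosed in
 * parentheses and may itself contain spaces or ')', so search from the
 * last ')' instead of splitting on whitespace.
 */
static char stat_state(const char *stat_line)
{
    const char *close = strrchr(stat_line, ')');

    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '?';
    return close[2];
}

/* Non-zero if the task is in uninterruptible sleep (state D). */
static int is_uninterruptible(const char *stat_line)
{
    return stat_state(stat_line) == 'D';
}
```

Feed it the single line read from /proc/<pid>/stat for each pid of
interest. For a task that turns out to be in state D, /proc/<pid>/wchan
may name the kernel function it is sleeping in, if the kernel provides
that file.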



Jan

* Re: nbd problem.
  2007-05-09 11:10   ` Rogier Wolff
  2007-05-09 12:38     ` Rogier Wolff
@ 2007-05-09 13:39     ` Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2007-05-09 13:39 UTC
  To: Rogier Wolff; +Cc: Satyam Sharma, linux-kernel

On Wed, May 09 2007, Rogier Wolff wrote:
> On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> > On 5/8/07, Rogier Wolff <R.E.Wolff@bitwizard.nl> wrote:
> > >
> > >Hi,
> > >
> > >The nbd client still reliably hangs when I use it.
> 
> Someone suggested using
> 
> http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary
> 
> and that fixed it.  (i.e. there is something in there that should
> be merged....)

Hmm, which branch? Most of my stuff is merged up with Linus at this
point.

> Jens, thanks for pointing out that there were different locks 
> involved.

You're welcome.

-- 
Jens Axboe

