* [PROBLEM]: hdparm strange behaviour for 2.6.21 and later
@ 2007-06-16 16:58 Thanos Kyritsis
2007-06-18 20:01 ` Mark Lord
0 siblings, 1 reply; 6+ messages in thread
From: Thanos Kyritsis @ 2007-06-16 16:58 UTC
To: linux-ide; +Cc: Bartlomiej Zolnierkiewicz
Hello,
starting with kernel 2.6.21 and up to kernel 2.6.22-rc4, I'm having the
following problem:
/etc/rc.d/rc.local contains the following:
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hda
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdb
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdc
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdd
(I'm using Slackware, no Debian-style automated hdparm.conf is running
during bootup, that's why these are in rc.local)
The above seem to somehow lock up the boot procedure just at the point
where rc.local gets executed, so the system never reaches login prompt.
All drivers (kernelspace) and system daemons (userspace) before rc.local
do normally load, but there are no strange messages in the console or in
the system logs and because I cannot login, I cannot trace it any further.
I believe the kernel is in running state because the machine responds to
ICMP pings from the ethernet, but since the login prompt is not up, the
already running sshd/telnetd do not provide any help.
The strange thing is that if I remove all the quiet options (-q) from the
above commands, everything works like it should. Furthermore, if I
comment them out from rc.local, then boot, login, and execute them by
hand (with -q), again everything works like it should. The lockup only happens if
I run 2 or more hdparm commands; if I leave only one hdparm command (it doesn't
matter which one) in rc.local (with -q), it works.
This is not happening for kernels up to 2.6.20.14 and I'm using the same
above hdparm options for over a year while the hardware hasn't changed
at all.
Speaking of hardware:
Pentium 4 HT, ICH5 IDE Controller, running on SMP/HT kernel
(ticks enabled @ 1000 Hz, PREEMPT/low-latency is on,
CONFIG_BLK_DEV_IDEDMA=y).
hda and hdb are Hard drives.
hdc and hdd are DVD drives (hdc is a recorder).
Can this be regarded as a kernel bug at all ? Can I do something to properly
debug it and help you out ?
I posted it here because I couldn't help noticing the following inside .21's Changelog:
commit 8799620400b0b1a4729d8be828b5bfb3d2a8db1a
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date: Mon Mar 26 23:03:19 2007 +0200
ide: fix locking for manual DMA enable/disable ("hdparm -d")
Since hwif->ide_dma_check and hwif->ide_dma_on never queue any commands
(ide_config_drive_speed() sets transfer mode using polling and has no error
recovery) we are safe with setting hwgroup->busy for the time while DMA
setting for a drive is changed (so it won't race against I/O commands in fly).
I audited briefly all ->ide_dma_check/->ide_dma_on/->tuneproc/->speedproc
implementations and they all look OK wrt to this change.
This patch finally allowed me to close kernel bugzilla bug #8169
(once again thanks to Patrick Horn for reporting the issue & testing patches).
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
--
Thanos Kyritsis <djart at linux.gr>
- What's your ONE purpose in life ?
- To explode, of course! ;-)
* Re: [PROBLEM]: hdparm strange behaviour for 2.6.21 and later
From: Mark Lord @ 2007-06-18 20:01 UTC
To: Thanos Kyritsis; +Cc: linux-ide, Bartlomiej Zolnierkiewicz
Thanos Kyritsis wrote:
> Hello,
>
> starting with kernel 2.6.21 and up to kernel 2.6.22-rc4, I'm having the
> following problem:
>
> /etc/rc.d/rc.local contains the following:
> /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hda
> /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdb
> /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdc
> /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdd
>
> (I'm using Slackware, no Debian-style automated hdparm.conf is running
> during bootup, that's why these are in rc.local)
>
> The above seem to somehow lock up the boot procedure just at the point
> where rc.local gets executed, so the system never reaches login prompt.
> All drivers (kernelspace) and system daemons (userspace) before rc.local
> do normally load, but there are no strange messages in the console or in
> the system logs and because I cannot login, I cannot trace it any further.
> I believe the kernel is in running state because the machine responds to
> ICMP pings from the ethernet, but since the login prompt is not up, the
> already running sshd/telnetd do not provide any help.
>
> The strange thing is that if I remove all the quiet options (-q) from the
> above commands, everything works like it should. Furthermore, if I
> comment them out from rc.local, then boot, login, and execute them by
> hand (with -q), again everything works like it should. Lockup only happens if
> I run 2 or more hdparm commands, if I leave only one (doesn't matter
> which one) hdparm command in rc.local (with -q), it works.
Sounds like a (kernel) timing issue.
The "-q" option gets rid of some intermediary printf's,
and nothing else. So with -q, the ioctl() calls happen
much closer together in time. Without -q, the intermediary
printf's likely cause a resched, giving the kernel more time
to complete anything left over from the earlier call.
????
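To make the timing argument concrete, here is a minimal user-space sketch (not hdparm's actual source) that issues the four settings from the rc.local lines as consecutive ioctl() calls. It assumes that -d, -u, -c and -k correspond to HDIO_SET_DMA, HDIO_SET_UNMASKINTR, HDIO_SET_32BIT and HDIO_SET_KEEPSETTINGS from <linux/hdreg.h>. The function name tune_drive() and the delay_us parameter are made up for illustration, and the order of the calls is arbitrary.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>

/*
 * Apply the four settings to one drive with consecutive ioctl() calls.
 * delay_us > 0 sleeps between calls (hypothetical knob, not an hdparm option).
 * Returns -1 if the device cannot be opened, otherwise the number of
 * ioctls that failed (0 means all succeeded).
 */
static int tune_drive(const char *dev, unsigned int delay_us)
{
    static const struct {
        unsigned long req;
        const char *name;
    } ops[] = {
        { HDIO_SET_DMA,          "HDIO_SET_DMA" },
        { HDIO_SET_UNMASKINTR,   "HDIO_SET_UNMASKINTR" },
        { HDIO_SET_32BIT,        "HDIO_SET_32BIT" },
        { HDIO_SET_KEEPSETTINGS, "HDIO_SET_KEEPSETTINGS" },
    };
    int failed = 0;
    int fd = open(dev, O_RDONLY | O_NONBLOCK);

    if (fd < 0) {
        fprintf(stderr, "%s: %s\n", dev, strerror(errno));
        return -1;
    }

    for (size_t i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
        /* Each setting is switched on (argument 1), as in "-d1 -u1 -c1 -k1". */
        if (ioctl(fd, ops[i].req, 1UL) < 0) {
            fprintf(stderr, "%s: %s failed: %s\n",
                    dev, ops[i].name, strerror(errno));
            failed++;
        }
        /* Optional pause, standing in for the time a printf would take. */
        if (delay_us > 0)
            usleep(delay_us);
    }

    close(fd);
    return failed;
}
```

Calling tune_drive("/dev/hda", 0) immediately followed by tune_drive("/dev/hdb", 0) approximates the -q case, while a non-zero delay_us spaces the calls out in roughly the way the intermediate printf's would.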
Any difference with a modern version of hdparm?
-ml
* Re: [PROBLEM]: hdparm strange behaviour for 2.6.21 and later
From: Thanos Kyritsis @ 2007-06-20 15:07 UTC
To: Mark Lord; +Cc: linux-ide, Bartlomiej Zolnierkiewicz
On Monday 18 June 2007, Mark Lord wrote:
> Thanos Kyritsis wrote:
[snip]
> > /etc/rc.d/rc.local contains the following:
> > /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hda
> > /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdb
[snip]
> Sounds like a (kernel) timing issue.
> The "-q" option gets rid of some intermediary printf's,
> and nothing else. So with -q, the ioctl() calls happen
> much closer together in time. Without -q, the intermediary
> printf's likely cause a resched, giving the kernel more time
> to complete anything left over from the earlier call.
>
> ????
>
> Any difference with a modern version of hdparm?
The same issue happens when using hdparm 7.4 as well as 7.5.
> -ml
--
Thanos Kyritsis <djart at linux.gr>
- What's your ONE purpose in life ?
- To explode, of course! ;-)
* Re: [PROBLEM]: hdparm strange behaviour for 2.6.21 and later
From: Bartlomiej Zolnierkiewicz @ 2007-06-23 18:28 UTC
To: Thanos Kyritsis; +Cc: Mark Lord, linux-ide
Hi,
On Wednesday 20 June 2007, Thanos Kyritsis wrote:
> On Monday 18 June 2007, Mark Lord wrote:
> > Thanos Kyritsis wrote:
> [snip]
> > > /etc/rc.d/rc.local contains the following:
> > > /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hda
> > > /usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdb
> [snip]
>
> > Sounds like a (kernel) timing issue.
> > The "-q" option gets rid of some intermediary printf's,
> > and nothing else. So with -q, the ioctl() calls happen
> > much closer together in time. Without -q, the intermediary
> > printf's likely cause a resched, giving the kernel more time
> > to complete anything left over from the earlier call.
It could be that some assumptions that I've taken when
fixing DMA tuning locking were wrong...
> > ????
> >
> > Any difference with a modern version of hdparm?
>
> The same issue happens when using hdparm 7.4 as well as 7.5.
Adding a couple of printk-s to ide.c::set_using_dma() and
ide.c::ide_spin_wait_hwgroup() will for sure help in debugging
it further.
Also could you try running UP kernel without PREEMPT and see
if it makes difference?
Thanks,
Bart
* Re: [PROBLEM]: hdparm strange behaviour for 2.6.21 and later
From: Thanos Kyritsis @ 2007-06-24 17:47 UTC
To: Bartlomiej Zolnierkiewicz; +Cc: Mark Lord, linux-ide
On Saturday 23 June 2007, Bartlomiej Zolnierkiewicz wrote:
> Hi,
Hello, thanks for answering :-D
> Also could you try running UP kernel without PREEMPT and see
> if it makes difference?
Yes, it did make a difference. I've just tried both UP and SMP kernels
(22-rc5) *without* PREEMPT. Neither of these locked!
I also tried UP with PREEMPT and it locks.
> Adding a couple of printk-s to ide.c::set_using_dma() and
> ide.c::ide_spin_wait_hwgroup() will for sure help in debugging
> it further.
I just tried that as well. I placed printk-s when entering and exiting
these 2 functions and when they use spin_lock_irq() and
spin_unlock_irq(), and then watched the output:
I didn't see anything odd. For every hdparm execution, set_using_dma()
is called, it successfully calls ide_spin_wait_hwgroup(), then
set_using_dma() finishes.
After set_using_dma() finishes, one more ide_spin_wait_hwgroup()
is called and finishes.
This pattern happens always, no matter if the kernel will eventually
lock (preempt) or not lock (no preempt), with absolutely no
differences.
entering ide.c::set_using_dma()
|
entering ide.c::ide_spin_wait_hwgroup()
ide.c::ide_spin_wait_hwgroup: LOCK
exiting ide.c::ide_spin_wait_hwgroup()
|
ide.c::set_using_dma: set ->busy flag, unlock and let it ride
ide.c::set_using_dma: UNLOCK
ide.c::set_using_dma: lock, clear ->busy flag and unlock before leaving
ide.c::set_using_dma: LOCK
ide.c::set_using_dma: UNLOCK
|
exiting ide.c::set_using_dma()
entering ide.c::ide_spin_wait_hwgroup()
ide.c::ide_spin_wait_hwgroup: LOCK
exiting ide.c::ide_spin_wait_hwgroup()
The above gets printed twice (one for the hda hdparm and one for hdb).
But I noticed something extra.
Sometimes, the ide_spin_wait_hwgroup() that runs either before the 1st
set_using_dma() or between the 1st and the 2nd (1st for hda, 2nd for
hdb) (*but never the one called last*) is waiting A LOT inside the busy
loop (while (hwgroup->busy)).
I think PREEMPT kernels always produce a lot of this while loop output,
and then print the above pattern, then lock.
Non-PREEMPT kernels don't always produce the huge while loop output,
only sometimes, but they never have locking problem.
I don't know if this is at all relevant to the problem. Perhaps it's
normal that during some of the bootups the IDE device group is busy
while during other bootups it's not busy, right ?
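For readers following along, the sequence in the trace above can be modelled in user space. The sketch below is only an analogy and not the kernel code: the names hwgroup_model, wait_hwgroup_model() and set_using_dma_model() are invented, the ide_lock spinlock is left out because the model is single-threaded, and the timeout is expressed as a spin budget parameter rather than whatever value the kernel actually uses.

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in for the per-channel state; only the busy flag is modelled. */
struct hwgroup_model {
    volatile int busy;
};

/*
 * Poll the busy flag until it clears or the spin budget runs out.
 * Returns 0 when the group is idle, -EBUSY when the budget is exhausted.
 * *spins reports how many times the loop went round.
 */
static int wait_hwgroup_model(struct hwgroup_model *g,
                              unsigned long max_spins,
                              unsigned long *spins)
{
    *spins = 0;
    while (g->busy) {
        if (*spins >= max_spins)
            return -EBUSY;
        (*spins)++;
    }
    return 0;
}

/* Wait for the group to go idle, then hold the busy flag around the change. */
static int set_using_dma_model(struct hwgroup_model *g, unsigned long max_spins)
{
    unsigned long spins;

    puts("entering set_using_dma_model()");
    if (wait_hwgroup_model(g, max_spins, &spins) != 0) {
        printf("channel still busy after %lu spins\n", spins);
        return -EBUSY;
    }

    g->busy = 1;    /* set ->busy flag so nothing else uses the channel */
    /* ... the DMA setting would be changed here ... */
    g->busy = 0;    /* clear ->busy flag before leaving */

    puts("exiting set_using_dma_model()");
    return 0;
}
```

If busy is already set and nothing clears it, set_using_dma_model() gives up with -EBUSY once the budget is spent. A long wait that eventually succeeds, as seen in the printk output, simply shows up as a large spin count.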
However, since all functions exit properly, should I try to place
printk-s in other functions as well ?
(I kind of need and appreciate guidance in order to help you, because
I've never been in the kernel hacking business before :) )
> Thanks,
> Bart
--
Thanos Kyritsis <djart at linux.gr>
- What's your ONE purpose in life ?
- To explode, of course! ;-)
* PREEMPT bug? (was: Re: [PROBLEM]: hdparm strange behaviour for 2.6.21 and later)
From: Bartlomiej Zolnierkiewicz @ 2007-06-27 19:46 UTC
To: Thanos Kyritsis; +Cc: Mark Lord, linux-ide, linux-kernel
Hi,
On Sunday 24 June 2007, Thanos Kyritsis wrote:
> On Saturday 23 June 2007, Bartlomiej Zolnierkiewicz wrote:
> > Hi,
>
> Hello, thanks for answering :-D
>
>
> > Also could you try running UP kernel without PREEMPT and see
> > if it makes difference?
>
> Yes, it did make a difference. I've just tried both UP and SMP kernels
> (22-rc5) *without* PREEMPT. Neither of these locked!
Thanks for testing.
IIRC SMP kernel without PREEMPT should also fail (or not?),
it could be that we are hitting some generic PREEMPT bug.
Cc:ed linux-kernel@ in hope that some PREEMPT guru lends us a hand.
[ original thread is here:
http://www.mail-archive.com/linux-ide@vger.kernel.org/msg07380.html ]
> I also tried UP with PREEMPT and it locks.
>
> > Adding a couple of printk-s to ide.c::set_using_dma() and
> > ide.c::ide_spin_wait_hwgroup() will for sure help in debugging
> > it further.
>
> I just tried that as well. I placed printk-s when entering and exiting
> these 2 functions and when they use spin_lock_irq() and
> spin_unlock_irq(), and then watched the output:
>
> I didn't see something odd. For every hdparm execution, set_using_dma()
> is called, it successfully calls ide_spin_wait_hwgroup(), then
> set_using_dma() finishes.
> After set_using_dma() finishes, one more ide_spin_wait_hwgroup()
> is called and finishes.
>
> This pattern happens always, no matter if the kernel will eventually
> lock (preempt) or not lock (no preempt), with absolutely no
> differences.
>
> entering ide.c::set_using_dma()
> |
> entering ide.c::ide_spin_wait_hwgroup()
> ide.c::ide_spin_wait_hwgroup: LOCK
> exiting ide.c::ide_spin_wait_hwgroup()
> |
> ide.c::set_using_dma: set ->busy flag, unlock and let it ride
> ide.c::set_using_dma: UNLOCK
> ide.c::set_using_dma: lock, clear ->busy flag and unlock before leaving
> ide.c::set_using_dma: LOCK
> ide.c::set_using_dma: UNLOCK
> |
> exiting ide.c::set_using_dma()
>
> entering ide.c::ide_spin_wait_hwgroup()
> ide.c::ide_spin_wait_hwgroup: LOCK
> exiting ide.c::ide_spin_wait_hwgroup()
>
> The above gets printed twice (one for the hda hdparm and one for hdb).
>
> But I noticed something extra.
>
> Sometimes, the ide_spin_wait_hwgroup() that runs either before the 1st
> set_using_dma() or between the 1st and the 2nd (1st for hda, 2nd for
> hdb) (*but never the one called last*) is waiting A LOT inside the busy
> loop (while (hwgroup->busy)).
>
> I think PREEMPT kernels always produce a lot of this while loop output,
> and then print the above pattern, then lock.
>
> Non-PREEMPT kernels don't always produce the huge while loop output,
> only sometimes, but they never have locking problem.
> I don't know if this is at all relevant to the problem. Perhaps it's
> normal that during some of the bootups the IDE device group is busy
> while during other bootups it's not busy, right ?
Yes, this is expected behavior (especially when you mix SMP in)
unless of course ide_spin_wait_hwgroup() fails (times out).
> However, since all functions exit properly, should I try to place
> printk-s in other functions as well ?
Try ide_do_request() and the hwgroup->busy flag, but be aware that this could
produce *a*lot* of output (a serial or net console would be required to capture
the log for the lockup case).
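One way to keep the volume manageable, offered here only as a suggestion, is to log the busy flag when it changes rather than on every pass. The helper below is a plain C sketch with invented names (busy_tracer, trace_busy). In the kernel the printf() would be a printk(), and the call sites would be wherever hwgroup->busy is read or written.

```c
#include <stdio.h>

/* Remembers the last value seen and how many repeats were skipped. */
struct busy_tracer {
    int last;
    unsigned long suppressed;
};

/*
 * Report the busy flag only when it differs from the previous sample.
 * Returns 1 if a line was printed, 0 if the sample was suppressed.
 */
static int trace_busy(struct busy_tracer *t, const char *where, int busy)
{
    if (busy == t->last) {
        t->suppressed++;
        return 0;
    }
    printf("%s: busy %d -> %d (%lu identical samples skipped)\n",
           where, t->last, busy, t->suppressed);
    t->last = busy;
    t->suppressed = 0;
    return 1;
}
```

A long run inside the while (hwgroup->busy) loop then collapses into a single line with a count, which may be enough to see whether the flag ever clears in the lockup case.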
> (I kind of need and appreciate guidance in order to help you, because
> I've never been in the kernel hacking business before :) )
No problem, happy hacking. :)
Thanks,
Bart