From: Mike Lykov <combr@yandex.ru>
To: Don Zickus <dzickus@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-watchdog@vger.kernel.org,
	kirill@shutemov.name
Subject: Re: [BUG?] false positive in soft lockup detector while unlzma initramfs on slow cpu
Date: Wed, 30 Jan 2013 13:39:23 +0400
Message-ID: <5108EA4B.7030003@yandex.ru>
In-Reply-To: <20130129153348.GR98867@redhat.com>

29.01.2013 19:33, Don Zickus wrote:

> The softlockup mechanism works scheduling a high priority task that kicks
> the softlockups.  If the unzip thread is taking too long, it could
> accidentally trip the detection.

Interestingly, decompressing an lzma -4 archive takes longer than
decompressing an lzma -9 one, which agrees with the lzma(1) man page:
"On the same hardware, the decompression speed is approximately a
constant number of bytes of compressed data per second. In other words,
the better the compression, the faster the decompression will usually
be."

I tested it by hand on the target computer:

lzma -4 compressed: time unlzma initram-alt-p6rel3-4.cpio.lzma
20.94user 1.47system 0:22:45elapsed 99%CPU (...19424maxresidents)k

lzma -9 compressed: time unlzma initram-alt-p6rel3-9.cpio.lzma
19.49user 1.92system 0:21:44elapsed 99%CPU (...241488maxresidents)k

So decompression cannot simply be "taking too long", because the image
that does not boot decompresses faster than the one that does.
Apparently it is not the time that matters, but the algorithm complexity?

>> 2. How to change watchdog_thresh parameter at boot without patching
>> sources? If it necessary (with it side effects) maybe implement it
>> as commandline parameter or config compile time parameter?
>
> I attached a patch below that allows you to set it a boot time.  Let me
> know if this works for you, then I can clean it up and post it properly.

It does not work for me. I applied the patch, rebuilt, and kept
"int __read_mostly watchdog_thresh = 10;" as in the original.
Command line:

[    0.000000] Kernel command line: initrd=initram-alt-p6rel3-9.cpio.lzma console=uart,io,0x240,115200n8 kernel.watchdog_thresh=30 BOOT_IMAGE=bzImage-3232-ml5-fwinkrn-wtdg10cmd

Full panic log:

[   28.057086] BUG: soft lockup - CPU#0 stuck for 23s! [swapper:1]
[   28.057086]
[   28.057086] Pid: 1, comm: swapper Not tainted 3.2.32VEP-01ML5-initramfs #19
[   28.057086] EIP: 0060:[<c03ab92f>] EFLAGS: 00000212 CPU: 0
[   28.057086] EIP is at rc_get_bit+0x1a/0x7c
[   28.057086] EAX: ce827f34 EBX: ce827f34 ECX: ce827f70 EDX: d481f926
[   28.057086] ESI: d481f926 EDI: ce827f70 EBP: ce827ee0 ESP: ce827ed4
[   28.057086]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[   28.057086] Process swapper (pid: 1, ti=ce802000 task=ce80b410 task.ti=ce826000)
[   28.057086] Stack:
[   28.057086]  00000001 02857802 d481f86c ce827f80 c03abd13 0b60a3e6 d481c666 00000000
[   28.057086]  00000003 cf0ea000 00000183 004f1e0a cf0ea000 00000003 00000000 00e584dd
[   28.057086]  009c2509 d481f86c d481c000 d080a000 00000012 00000002 000003dd ffffffff
[   28.057086] Call Trace:
[   28.057086]  [<c03abd13>] unlzma+0x382/0xac0
[   28.057086]  [<c03ab8ae>] ? gunzip+0x25b/0x25b
[   28.057086]  [<c039ed46>] ? initrd_load+0x3b/0x3b
[   28.057086]  [<c03ab991>] ? rc_get_bit+0x7c/0x7c
[   28.057086]  [<c039f211>] unpack_to_rootfs+0x139/0x237
[   28.057086]  [<c039ef53>] ? write_buffer+0x2c/0x2c
[   28.057086]  [<c039ed46>] ? initrd_load+0x3b/0x3b
[   28.057086]  [<c039e791>] ? do_one_initcall+0x112/0x112
[   28.057086]  [<c039f9b4>] populate_rootfs+0x42/0x85
[   28.057086]  [<c039e6ef>] do_one_initcall+0x70/0x112
[   28.057086]  [<c039f972>] ? do_header+0x1d4/0x1d4
[   28.057086]  [<c039e791>] ? do_one_initcall+0x112/0x112
[   28.057086]  [<c039e810>] kernel_init+0x7f/0xf8
[   28.057086]  [<c02eb2b6>] kernel_thread_helper+0x6/0xd
[   28.057086] Code: b6 01 41 c1 e2 08 89 4b 04 09 d0 89 43 14 5b 5d c3 55 89 e5 57 89 cf 56 89 d6 53 89 c3 81 78 18 ff ff ff 00 77 05 e8 b5 ff ff ff <8b> 4b 18 0f b7 06 89 ca c1 ea 0b 0f af c2 8b 53 14 39 c2 89 43
[   28.057086] Call Trace:
[   28.057086]  [<c03abd13>] unlzma+0x382/0xac0
[   28.057086]  [<c03ab8ae>] ? gunzip+0x25b/0x25b
[   28.057086]  [<c039ed46>] ? initrd_load+0x3b/0x3b
[   28.057086]  [<c03ab991>] ? rc_get_bit+0x7c/0x7c
[   28.057086]  [<c039f211>] unpack_to_rootfs+0x139/0x237
[   28.057086]  [<c039ef53>] ? write_buffer+0x2c/0x2c
[   28.057086]  [<c039ed46>] ? initrd_load+0x3b/0x3b
[   28.057086]  [<c039e791>] ? do_one_initcall+0x112/0x112
[   28.057086]  [<c039f9b4>] populate_rootfs+0x42/0x85
[   28.057086]  [<c039e6ef>] do_one_initcall+0x70/0x112
[   28.057086]  [<c039f972>] ? do_header+0x1d4/0x1d4
[   28.057086]  [<c039e791>] ? do_one_initcall+0x112/0x112
[   28.057086]  [<c039e810>] kernel_init+0x7f/0xf8
[   28.057086]  [<c02eb2b6>] kernel_thread_helper+0x6/0xd
[   28.057086] Kernel panic - not syncing: softlockup: hung tasks
[   28.057086] Pid: 1, comm: swapper Not tainted 3.2.32VEP-01ML5-initramfs #19
[   28.057086] Call Trace:
[   28.057086]  [<c02e91c4>] ? printk+0xf/0x11
[   28.057086]  [<c02e90c0>] panic+0x50/0x145
[   28.057086]  [<c012fa9b>] watchdog_timer_fn+0xf2/0x10f
[   28.057086]  [<c0124ae0>] hrtimer_run_queues+0x13d/0x1bc
[   28.057086]  [<c01193cf>] run_local_timers+0x8/0x14
[   28.057086]  [<c01193f6>] update_process_times+0x1b/0x4e
[   28.057086]  [<c012b6d2>] tick_periodic.clone.20+0x52/0x54
[   28.057086]  [<c012b6e1>] tick_handle_periodic+0xd/0x5b
[   28.057086]  [<c010339f>] timer_interrupt+0x13/0x1a
[   28.057086]  [<c013032b>] handle_irq_event_percpu+0x24/0xfb
[   28.057086]  [<c0131a80>] ? handle_simple_irq+0x3f/0x3f
[   28.057086]  [<c013041e>] handle_irq_event+0x1c/0x26
[   28.057086]  [<c0131aeb>] handle_level_irq+0x6b/0x75
[   28.057086]  <IRQ>  [<c0102f62>] ? do_IRQ+0x34/0x74
[   28.057086]  [<c02eb2a9>] ? common_interrupt+0x29/0x30
[   28.057086]  [<c03ab92f>] ? rc_get_bit+0x1a/0x7c
[   28.057086]  [<c03abd13>] ? unlzma+0x382/0xac0
[   28.057086]  [<c03ab8ae>] ? gunzip+0x25b/0x25b
[   28.057086]  [<c039ed46>] ? initrd_load+0x3b/0x3b
[   28.057086]  [<c03ab991>] ? rc_get_bit+0x7c/0x7c
[   28.057086]  [<c039f211>] ? unpack_to_rootfs+0x139/0x237
[   28.057086]  [<c039ef53>] ? write_buffer+0x2c/0x2c
[   28.057086]  [<c039ed46>] ? initrd_load+0x3b/0x3b
[   28.057086]  [<c039e791>] ? do_one_initcall+0x112/0x112
[   28.057086]  [<c039f9b4>] ? populate_rootfs+0x42/0x85
[   28.057086]  [<c039e6ef>] ? do_one_initcall+0x70/0x112
[   28.057086]  [<c039f972>] ? do_header+0x1d4/0x1d4
[   28.057086]  [<c039e791>] ? do_one_initcall+0x112/0x112
[   28.057086]  [<c039e810>] ? kernel_init+0x7f/0xf8
[   28.057086]  [<c02eb2b6>] ? kernel_thread_helper+0x6/0xd
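
A note on the "stuck for 23s" figure in the log above: it looks
consistent with the default threshold still being in effect rather than
the requested 30. The sketch below is only an illustration and assumes
that the soft lockup limit is twice watchdog_thresh (in seconds), as I
believe it is in 3.2-era kernel/watchdog.c. The function names
softlockup_thresh() and would_report() are hypothetical userspace
stand-ins, not kernel symbols.

```c
/*
 * Assumption (not taken from this thread): the soft lockup limit is
 * 2 * watchdog_thresh, in seconds.
 */
static int softlockup_thresh(int watchdog_thresh)
{
	return watchdog_thresh * 2;
}

/* Nonzero if a stall of stuck_secs seconds would exceed the limit. */
static int would_report(int watchdog_thresh, int stuck_secs)
{
	return stuck_secs > softlockup_thresh(watchdog_thresh);
}
```

Under that assumption, would_report(10, 23) is nonzero (23 s exceeds the
20 s limit), while would_report(30, 23) is zero (the limit would be
60 s). So a report at 23 s suggests the boot parameter did not take
effect.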


> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 75a2ab3..e448d63 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -79,6 +79,14 @@ static int __init softlockup_panic_setup(char *str)
>   }
>   __setup("softlockup_panic=", softlockup_panic_setup);
>
> +static int __init watchdog_thresh_setup(char *str)
> +{
> +	watchdog_thresh = simple_strtoul(str, NULL, 0);
> +
> +	return 1;
> +}
> +__setup("watchdog_thresh=", watchdog_thresh_setup);
> +
>   static int __init nowatchdog_setup(char *str)
>   {
>   	watchdog_enabled = 0;
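
One possible explanation for the result reported above, offered as a
guess rather than a diagnosis: the quoted patch registers the string
"watchdog_thresh=", while the command line that was tested passes
"kernel.watchdog_thresh=30". If __setup() handlers are matched against
the start of each argument, the prefixed form would never reach
watchdog_thresh_setup(). The userspace sketch below illustrates that
idea. setup_prefix_matches() and parse_watchdog_thresh() are
hypothetical helpers, strtoul() stands in for simple_strtoul(), and the
matching is simplified compared to the kernel's real parameter parser.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for early-parameter matching: a handler
 * registered for `prefix` only sees arguments that START with it.
 */
static int setup_prefix_matches(const char *arg, const char *prefix)
{
	return strncmp(arg, prefix, strlen(prefix)) == 0;
}

/*
 * Return the parsed threshold, or `fallback` when the argument does
 * not start with "watchdog_thresh=" and the handler is never called.
 */
static unsigned long parse_watchdog_thresh(const char *arg,
					   unsigned long fallback)
{
	static const char prefix[] = "watchdog_thresh=";

	if (!setup_prefix_matches(arg, prefix))
		return fallback;

	/* Base 0 accepts decimal, octal (leading 0) and hex (0x). */
	return strtoul(arg + strlen(prefix), NULL, 0);
}
```

With a fallback of 10, parse_watchdog_thresh("watchdog_thresh=30", 10)
returns 30, whereas parse_watchdog_thresh("kernel.watchdog_thresh=30", 10)
returns 10. If the assumption holds, booting with plain
"watchdog_thresh=30" would be worth trying.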

--
Mike