Re: cgroup_fj tests will stick the nort kernel


From: Li Zefan <lizefan@huawei.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Qiang Huang <h.huangqiang@huawei.com>,
	linux-rt-users <linux-rt-users@vger.kernel.org>,
	zhangwei <jovi.zhangwei@huawei.com>
Subject: Re: cgroup_fj tests will stick the nort kernel
Date: Tue, 23 Apr 2013 13:51:58 +0800
Message-ID: <5176217E.8030008@huawei.com>
In-Reply-To: <1366646447.9609.131.camel@gandalf.local.home>

On 2013/4/23 0:00, Steven Rostedt wrote:
> On Mon, 2013-04-22 at 17:39 +0800, Li Zefan wrote:
>> On 2013/4/19 15:30, Qiang Huang wrote:
>>> Hi,
>>>
>>> I ran the cgroup_fj tests on an RT kernel with PREEMPT_RT_FULL disabled. The
>>> system gets stuck when running the cpuset stress tests, and it happens every time.
>>>
>>> Here "stuck" means that there is almost no response from the system and
>>> we can hardly do anything on the terminal, but the kernel has neither crashed nor
>>> deadlocked (according to the lockdep message), and it does respond occasionally.
>>>
>>> The problem exists on all RT versions from 3.4.18-rt29 to 3.4.37-rt51 AFAIK, but
>>> without the RT patches or with PREEMPT_RT_FULL enabled, the problem does not exist.
>>>
>>> When the system is stuck, we will get the following message:
>>> # dmesg
>>> ...
>>
>> I've found the culprit after some investigation:
>>
>> From: Thomas Gleixner <tglx@linutronix.de>
>> Date: Fri, 04 Nov 2011 19:48:36 +0000
>> Subject: sched-clear-pf-thread-bound-on-fallback-rq.patch
>>
>> At system boot, when some cpus haven't come up yet, the scheduler calls
>> select_fallback_rq() and schedules tasks on other cpus, which ends up clearing
>> some kernel threads' PF_THREAD_BOUND flag...
> 
> I'm curious as to why this doesn't break when PREEMPT_RT_FULL is enabled. I
> would think it would cause issues there too.
> 

I was wrong to say that PF_THREAD_BOUND is cleared because some cpus are not
online yet. The real cause is that select_task_rq_fair() just returns prev_cpu,
i.e. task_cpu(p). That is 0 during system boot, or some other cpu after boot,
and it is not in tsk_cpus_allowed, so select_fallback_rq() is called and clears
PF_THREAD_BOUND.
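
A minimal userspace sketch of the wake-up path described above may make the
sequence easier to follow. It is not kernel code: struct task_model, the
model_* functions, MODEL_PF_THREAD_BOUND and MODEL_NR_CPUS are all made up for
illustration, and only the control flow (prev_cpu is returned, turns out not to
be in the allowed mask, and the fallback clears the flag) follows the
description.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the values are arbitrary, not the kernel's. */
#define MODEL_PF_THREAD_BOUND 0x1u
#define MODEL_NR_CPUS         16

struct task_model {
    unsigned int  flags;        /* carries MODEL_PF_THREAD_BOUND */
    unsigned long cpus_allowed; /* bit n set => cpu n is allowed */
    int           cpu;          /* models task_cpu(p), i.e. prev_cpu */
};

bool model_cpu_allowed(const struct task_model *p, int cpu)
{
    return (p->cpus_allowed >> cpu) & 1UL;
}

/* Models select_task_rq_fair() as described: it just returns prev_cpu. */
int model_select_task_rq(const struct task_model *p)
{
    return p->cpu;
}

/*
 * Models select_fallback_rq() with the RT patch applied: pick the first
 * allowed cpu and, as a side effect, clear the bound flag.
 */
int model_select_fallback_rq(struct task_model *p)
{
    p->flags &= ~MODEL_PF_THREAD_BOUND;
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (model_cpu_allowed(p, cpu))
            return cpu;
    return -1; /* no allowed cpu at all */
}

/* The fallback only runs when the selected cpu is outside the mask. */
int model_wake_up(struct task_model *p)
{
    int cpu = model_select_task_rq(p);

    if (!model_cpu_allowed(p, cpu))
        cpu = model_select_fallback_rq(p);
    p->cpu = cpu;
    return cpu;
}
```

For a task that is bound to cpu 4 but whose task_cpu is still 0,
model_wake_up() returns 4 and leaves the bound flag cleared. A task whose
prev_cpu is already in its mask never reaches the fallback and keeps the flag.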

I don't know why it didn't cause trouble for Huang Qiang when RT_FULL is enabled,
but I did encounter problems when testing on my box.

I can trigger the bug with cgroup_fj.sh, or with taskset:

  # for pid in `ps -e -o pid`; do taskset -p -c 0-15 $pid; done

The system hang or task hangs may not happen during the test itself, but they do
happen after some random operations afterwards (try compiling a kernel).

While running the test I saw lots of warnings like this:

[  146.702056] BUG: using smp_processor_id() in preemptible [00000000 00000000] code: kworker/4:0/23
[  146.702069] caller is vmstat_update+0x22/0x60
[  146.702075] Pid: 23, comm: kworker/4:0 Not tainted 3.4.24.05+ #49
[  146.702077] Call Trace:
[  146.702087]  [<ffffffff8125f685>] debug_smp_processor_id+0x145/0x150
[  146.702091]  [<ffffffff8113c872>] vmstat_update+0x22/0x60
[  146.702097]  [<ffffffff81061033>] process_one_work+0x203/0x610
[  146.702101]  [<ffffffff81060f70>] ? process_one_work+0x140/0x610
[  146.702105]  [<ffffffff81061fdd>] ? worker_thread+0x6d/0x450
[  146.702109]  [<ffffffff8113c850>] ? refresh_cpu_vm_stats+0x1d0/0x1d0
[  146.702114]  [<ffffffff81062116>] worker_thread+0x1a6/0x450
[  146.702118]  [<ffffffff81061f70>] ? manage_workers+0x250/0x250
[  146.702122]  [<ffffffff810680f6>] kthread+0xb6/0xc0
[  146.702130]  [<ffffffff81474ab4>] kernel_thread_helper+0x4/0x10
[  146.702137]  [<ffffffff81076930>] ? finish_task_switch+0x90/0x100
[  146.702142]  [<ffffffff8146bb34>] ? retint_restore_args+0x13/0x13
[  146.702145]  [<ffffffff81068040>] ? kthreadd+0x310/0x310
[  146.702149]  [<ffffffff81474ab0>] ? gs_change+0x13/0x13

After a while those warnings stopped, and warnings like this popped up instead,
even after I stopped the test:

[  252.896103] ------------[ cut here ]------------
[  252.896107] WARNING: at kernel/cpu.c:157 unpin_current_cpu+0x7d/0x90()
[  252.896110] Hardware name: Tecal RH2285
[  252.896112] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge ipv6 stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf binfmt_misc fuse loop dm_mod tpm_tis tpm coretemp crc32c_intel ghash_clmulni_intel aesni_intel sg serio_raw cryptd aes_x86_64 tpm_bios microcode i2c_i801 iTCO_wdt i2c_core bnx2 iTCO_vendor_support mptctl button usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif edd ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_generic ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal thermal_sys hwmon
[  252.896201] Pid: 9893, comm: dmesg Tainted: G        W    3.4.24.05+ #49
[  252.896203] Call Trace:
[  252.896208]  [<ffffffff810404ed>] ? unpin_current_cpu+0x7d/0x90
[  252.896212]  [<ffffffff810404ed>] ? unpin_current_cpu+0x7d/0x90
[  252.896217]  [<ffffffff8103d83f>] warn_slowpath_common+0x7f/0xc0
[  252.896221]  [<ffffffff8103d89a>] warn_slowpath_null+0x1a/0x20
[  252.896226]  [<ffffffff810404ed>] unpin_current_cpu+0x7d/0x90
[  252.896231]  [<ffffffff81078ddb>] migrate_enable+0xeb/0x1e0
[  252.896235]  [<ffffffff81146b7b>] handle_pte_fault+0x34b/0x980
[  252.896240]  [<ffffffff81076431>] ? get_parent_ip+0x11/0x50
[  252.896244]  [<ffffffff81076431>] ? get_parent_ip+0x11/0x50
[  252.896250]  [<ffffffff811472fc>] handle_mm_fault+0x14c/0x1e0
[  252.896254]  [<ffffffff8146ef47>] do_page_fault+0x257/0x550
[  252.896260]  [<ffffffff8114c995>] ? do_mmap_pgoff+0x375/0x3a0
[  252.896264]  [<ffffffff8146bfb6>] ? error_sti+0x5/0x6
[  252.896269]  [<ffffffff81259175>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[  252.896274]  [<ffffffff8146bd75>] page_fault+0x25/0x30
[  252.896277] ---[ end trace 000000000000ae6e ]---

I didn't see those warnings if !RT_FULL.


