linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
@ 2013-10-09 13:55 bugzilla-daemon
  2013-10-09 13:59 ` [Bug 62751] " bugzilla-daemon
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-09 13:55 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

            Bug ID: 62751
           Summary: INFO: rcu_sched detected stalls on CPUs/tasks: { 3}
                    (detected by 2, t=15107 jiffies, g=270766, c=270765,
                    q=0)
           Product: File System
           Version: 2.5
    Kernel Version: 3.11.3-031103-generic
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: loris.luise@gmail.com
        Regression: No

Oct  9 09:51:19 h3oserver2 kernel: [87982.718328] NMI backtrace for cpu 3
Oct  9 09:51:19 h3oserver2 kernel: [87982.718351] CPU: 3 PID: 71 Comm: kswapd0
Not tainted 3.11.3-031103-generic #201310011421
Oct  9 09:51:19 h3oserver2 kernel: [87982.718355] Hardware name: VMware, Inc.
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 01/15/2013
Oct  9 09:51:19 h3oserver2 kernel: [87982.718361] task: ffff8801b5d3c650 ti:
ffff8801b2026000 task.ti: ffff8801b2026000
Oct  9 09:51:19 h3oserver2 kernel: [87982.718363] RIP:
0010:[<ffffffff8128cfa3>]  [<ffffffff8128cfa3>] __ext4_es_shrink+0xb3/0x200
Oct  9 09:51:19 h3oserver2 kernel: [87982.869348] RSP: 0018:ffff8801b2027b78 
EFLAGS: 00010293
Oct  9 09:51:19 h3oserver2 kernel: [87982.869352] RAX: ffff8801b4a89f30 RBX:
ffff880002b29c00 RCX: ffff8801b2027ba8
Oct  9 09:51:19 h3oserver2 kernel: [87982.869353] RDX: ffff880002b29f30 RSI:
ffff880002b2af30 RDI: 000000010129c39a
Oct  9 09:51:19 h3oserver2 kernel: [87982.869357] RBP: ffff8801b2027be8 R08:
ffff880002b2af30 R09: ffffea0006933480
Oct  9 09:51:19 h3oserver2 kernel: [87982.869359] R10: ffffffff8128cdfd R11:
0000000000000002 R12: 000000000000003a
Oct  9 09:51:19 h3oserver2 kernel: [87982.869360] R13: 0000000000000000 R14:
ffff8801b0060800 R15: ffff8801b0060c10
Oct  9 09:51:19 h3oserver2 kernel: [87982.869385] FS:  0000000000000000(0000)
GS:ffff8801bfcc0000(0000) knlGS:0000000000000000
Oct  9 09:51:19 h3oserver2 kernel: [87982.869387] CS:  0010 DS: 0000 ES: 0000
CR0: 000000008005003b
Oct  9 09:51:19 h3oserver2 kernel: [87982.869388] CR2: 0000000002236028 CR3:
0000000001c0d000 CR4: 00000000000006e0
Oct  9 09:51:19 h3oserver2 kernel: [87982.869403] Stack:
Oct  9 09:51:19 h3oserver2 kernel: [87982.869406]  ffff8801b1c1bb80
ffff88006019d330 ffff88006027e730 ffff8801b2027ba8
Oct  9 09:51:19 h3oserver2 kernel: [87982.869409]  ffff8801b0060c80
00000046012f3080 ffff8801b4b2db30 ffff880002b28730
Oct  9 09:51:19 h3oserver2 kernel: [87982.869412]  ffff8801b2027fd8
ffff8801b0060be0 ffff8801b2027d18 0000000000010fa0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910096] Call Trace:
Oct  9 09:51:19 h3oserver2 kernel: [87982.910141]  [<ffffffff8128d1b8>]
ext4_es_shrink+0xc8/0x140
Oct  9 09:51:19 h3oserver2 kernel: [87982.910158]  [<ffffffff8115b6b4>]
shrink_slab+0x154/0x300
Oct  9 09:51:19 h3oserver2 kernel: [87982.910179]  [<ffffffff8172ea0e>] ?
_raw_spin_lock+0xe/0x20
Oct  9 09:51:19 h3oserver2 kernel: [87982.910191]  [<ffffffff811ad776>] ?
vmpressure+0x76/0xa0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910194]  [<ffffffff8115de8b>]
kswapd_shrink_zone+0xfb/0x1a0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910197]  [<ffffffff8115ec5c>]
balance_pgdat+0x32c/0x470
Oct  9 09:51:19 h3oserver2 kernel: [87982.910200]  [<ffffffff81166d60>] ?
vm_events_fold_cpu+0x50/0x50
Oct  9 09:51:19 h3oserver2 kernel: [87982.910203]  [<ffffffff8115eec3>]
kswapd+0x123/0x240
Oct  9 09:51:19 h3oserver2 kernel: [87982.910205]  [<ffffffff8115eda0>] ?
balance_pgdat+0x470/0x470
Oct  9 09:51:19 h3oserver2 kernel: [87982.910521]  [<ffffffff81089040>]
kthread+0xc0/0xd0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910526]  [<ffffffff81088f80>] ?
flush_kthread_worker+0xb0/0xb0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910532]  [<ffffffff81737dac>]
ret_from_fork+0x7c/0xb0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910535]  [<ffffffff81088f80>] ?
flush_kthread_worker+0xb0/0xb0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910536] Code: 8c da 00 00 00 0f 84 d4
00 00 00 0f 1f 44 00 00 48 8b 7a 18 49 39 be 20 04 00 00 48 8d 9a d0 fc ff ff
73 44 48 8b 42 08 48 8b 32 <48> 89 46 08 48 89 30 48 8b 45 c8 48 89 55 c8 48 89
0a 48 89 42

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
@ 2013-10-09 13:59 ` bugzilla-daemon
  2013-10-09 15:10 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-09 13:59 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Loris Luise <loris.luise@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |loris.luise@gmail.com

--- Comment #1 from Loris Luise <loris.luise@gmail.com> ---
Created attachment 110491
  --> https://bugzilla.kernel.org/attachment.cgi?id=110491&action=edit
Kernel.org format filed infos

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
  2013-10-09 13:59 ` [Bug 62751] " bugzilla-daemon
@ 2013-10-09 15:10 ` bugzilla-daemon
  2013-10-09 20:53 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-09 15:10 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

--- Comment #2 from Loris Luise <loris.luise@gmail.com> ---
Sorry, but I have not idea which part of kernel this bug is related,

so I chose ext4 (becaise in call trace I saw ext4_es_shrink+0xc8/0x140)

I logged several of this almost every day.

Kernel is running on a VMware Esxi 4.1 last patched Host on a HP ML350 G5
server.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
  2013-10-09 13:59 ` [Bug 62751] " bugzilla-daemon
  2013-10-09 15:10 ` bugzilla-daemon
@ 2013-10-09 20:53 ` bugzilla-daemon
  2013-10-10  6:49 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-09 20:53 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Jan Kara <jack@suse.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jack@suse.cz

--- Comment #3 from Jan Kara <jack@suse.cz> ---
Thanks. This looks like a problem in the shrinker for ext4 extent cache.
Apparently we burn too much time (the longest stall is over 2 seconds)
searching and freeing extents without giving CPU a break. Ideally we should
make the shrinker more efficient (since I don't believe it was called with so
many entries to free) but if that won't work out easily, we should at least
sprinkle some cond_resched() into the shrinker code as a band aid. I'll see
what we can do.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (2 preceding siblings ...)
  2013-10-09 20:53 ` bugzilla-daemon
@ 2013-10-10  6:49 ` bugzilla-daemon
  2013-10-10  6:56 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-10  6:49 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

--- Comment #4 from Loris Luise <loris.luise@gmail.com> ---
Update to kernel 3.11.4-generic, problem still persist, I added another attach
(bug-20131010084533.txt)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (3 preceding siblings ...)
  2013-10-10  6:49 ` bugzilla-daemon
@ 2013-10-10  6:56 ` bugzilla-daemon
  2013-10-24  8:18 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-10  6:56 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

--- Comment #5 from Loris Luise <loris.luise@gmail.com> ---
Created attachment 110571
  --> https://bugzilla.kernel.org/attachment.cgi?id=110571&action=edit
Infos about bug evidence.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (4 preceding siblings ...)
  2013-10-10  6:56 ` bugzilla-daemon
@ 2013-10-24  8:18 ` bugzilla-daemon
  2013-10-24 10:18 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-24  8:18 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

--- Comment #6 from Loris Luise <loris.luise@gmail.com> ---
Created attachment 112191
  --> https://bugzilla.kernel.org/attachment.cgi?id=112191&action=edit
Additional evidence, kernel 3.11.6

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (5 preceding siblings ...)
  2013-10-24  8:18 ` bugzilla-daemon
@ 2013-10-24 10:18 ` bugzilla-daemon
  2013-11-04 11:53 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-10-24 10:18 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Loris Luise <loris.luise@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|3.11.3-031103-generic       |3.11.6-031106-generic

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (6 preceding siblings ...)
  2013-10-24 10:18 ` bugzilla-daemon
@ 2013-11-04 11:53 ` bugzilla-daemon
  2013-11-13 13:44 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-11-04 11:53 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alan@lxorguk.ukuu.org.uk
     Kernel Version|3.11.6-031106-generic       |3.11.6

--- Comment #7 from Alan <alan@lxorguk.ukuu.org.uk> ---
Jan - its on a vmware virtual platform so it could just be that vmware ran off
with the CPU. Can you duplicate this on a native platform, and if you disable
the NMI watchdog does the kernel hang or merely stall for a moment in this
situation ?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (7 preceding siblings ...)
  2013-11-04 11:53 ` bugzilla-daemon
@ 2013-11-13 13:44 ` bugzilla-daemon
  2013-11-18  7:51 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-11-13 13:44 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (8 preceding siblings ...)
  2013-11-13 13:44 ` bugzilla-daemon
@ 2013-11-18  7:51 ` bugzilla-daemon
  2013-11-18  7:53 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-11-18  7:51 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

--- Comment #8 from Loris Luise <loris.luise@gmail.com> ---
Hello,

what can I do on my side?

I've done the following modifications:

1) Installed latest mainline kernel 3.12

2) Replaced SCSI drivers from vmw_pvsci to LSI Logic SAS (mptsas)

3) reduced number of CPUs from 8 to 4

4) enabled on ESXi (4.1) setting for this machine "CPU Hot add" feature

The issue is still affecting ths machine... other 2 almost equals machine
running on this host (same kernel) but with XFS on root are not affected and
very
stable.

I reattached another info file (bug-20131118084007.txt) which contains a lot of
infos.

Thanks.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (9 preceding siblings ...)
  2013-11-18  7:51 ` bugzilla-daemon
@ 2013-11-18  7:53 ` bugzilla-daemon
  2013-11-18  7:54 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-11-18  7:53 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Loris Luise <loris.luise@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #110491|0                           |1
        is obsolete|                            |
 Attachment #110571|0                           |1
        is obsolete|                            |
 Attachment #112191|0                           |1
        is obsolete|                            |

--- Comment #9 from Loris Luise <loris.luise@gmail.com> ---
Created attachment 115021
  --> https://bugzilla.kernel.org/attachment.cgi?id=115021&action=edit
Additional evidence, kernel 3.12

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (10 preceding siblings ...)
  2013-11-18  7:53 ` bugzilla-daemon
@ 2013-11-18  7:54 ` bugzilla-daemon
  2013-12-07 10:08 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-11-18  7:54 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Loris Luise <loris.luise@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|3.11.6                      |3.12

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (11 preceding siblings ...)
  2013-11-18  7:54 ` bugzilla-daemon
@ 2013-12-07 10:08 ` bugzilla-daemon
  2013-12-20 15:45 ` bugzilla-daemon
  2013-12-20 16:16 ` bugzilla-daemon
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-12-07 10:08 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Zheng Liu <gnehzuil.liu@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gnehzuil.liu@gmail.com

--- Comment #10 from Zheng Liu <gnehzuil.liu@gmail.com> ---
Hi Loris,

Sorry for my late reply.  I haven't had a chance to take a closer look at this
problem until now.

(In reply to Loris Luise from comment #8)
> Hello,
> 
> what can I do on my side?

I am not sure whether you use 'delalloc' on your system.  I don't see these
details from the log that you pasted.  So now only I could suggest is that you
can try to turn off the delalloc.  The step is as below:

 % sudo mount -t ext4 -o remount,nodelalloc ${DEV} ${MNT}

After this, it won't track the delayed entry on extent status tree.  So it will
avoid es shrinker to take too much time to scan delayed entries that couldn't
be reclaimed.

Then I will try to reproduce this problem on my sand box and think about how to
improve es shrinker.

Thanks,
- Zheng

> 
> I've done the following modifications:
> 
> 1) Installed latest mainline kernel 3.12
> 
> 2) Replaced SCSI drivers from vmw_pvsci to LSI Logic SAS (mptsas)
> 
> 3) reduced number of CPUs from 8 to 4
> 
> 4) enabled on ESXi (4.1) setting for this machine "CPU Hot add" feature
> 
> The issue is still affecting ths machine... other 2 almost equals machine
> running on this host (same kernel) but with XFS on root are not affected and
> very
> stable.
> 
> I reattached another info file (bug-20131118084007.txt) which contains a lot
> of infos.
> 
> Thanks.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (12 preceding siblings ...)
  2013-12-07 10:08 ` bugzilla-daemon
@ 2013-12-20 15:45 ` bugzilla-daemon
  2013-12-20 16:16 ` bugzilla-daemon
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-12-20 15:45 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

--- Comment #11 from Loris Luise <loris.luise@gmail.com> ---
Hello,

I had cloned root partition to a XFS one and curently I have no more problems.

Previously I used a mount line in fstab like this

/dev/mapper/h30server2-root /               ext4   
errors=remount-ro,noatime,nodiratime,data=writeback 0       1

and later (after your suggestion)

/dev/mapper/h30server2-root /               ext4   
errors=remount-ro,noatime,nodiratime,nodelalloc,data=writeback 0       1


but I did not test it with this new option for a sufficient amount of time to
be considered valid and come to a conclusion.

(I had to stabilize the system so I was forced to switch to XFS).

I collected more buginfos If they can be usefult I can post them here.

Thanks a lot.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug 62751] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
  2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
                   ` (13 preceding siblings ...)
  2013-12-20 15:45 ` bugzilla-daemon
@ 2013-12-20 16:16 ` bugzilla-daemon
  14 siblings, 0 replies; 16+ messages in thread
From: bugzilla-daemon @ 2013-12-20 16:16 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=62751

Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #12 from Theodore Tso <tytso@mit.edu> ---
Hi Louis,

Can you talk a bit about your workload?   What sort of applications are you
running on the file system which is having the problems?   A database?  Squid
cache, something else?   Is it a lot of random writes?    What sort of files
does it create?  Does it create and delete a lot of files?   Etc.

It looks like your system has 8GB of memory --- is that right?   How much free
memory does your machine normally has?   Also, if you still have your ext4 file
system image, could you run "e2freefrag -v /dev/mapper/XXX"?

Many thanks!!

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-12-20 16:16 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-09 13:55 [Bug 62751] New: INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0) bugzilla-daemon
2013-10-09 13:59 ` [Bug 62751] " bugzilla-daemon
2013-10-09 15:10 ` bugzilla-daemon
2013-10-09 20:53 ` bugzilla-daemon
2013-10-10  6:49 ` bugzilla-daemon
2013-10-10  6:56 ` bugzilla-daemon
2013-10-24  8:18 ` bugzilla-daemon
2013-10-24 10:18 ` bugzilla-daemon
2013-11-04 11:53 ` bugzilla-daemon
2013-11-13 13:44 ` bugzilla-daemon
2013-11-18  7:51 ` bugzilla-daemon
2013-11-18  7:53 ` bugzilla-daemon
2013-11-18  7:54 ` bugzilla-daemon
2013-12-07 10:08 ` bugzilla-daemon
2013-12-20 15:45 ` bugzilla-daemon
2013-12-20 16:16 ` bugzilla-daemon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).