All of lore.kernel.org
 help / color / mirror / Atom feed
From: Simon Kirby <sim@hostway.ca>
To: linux-mm@kvack.org, linux-nfs@vger.kernel.org
Subject: Lock contention from flush processes for NFS mounts take all cpu [2.6.36, NFSv3]
Date: Thu, 25 Nov 2010 12:04:24 -0800	[thread overview]
Message-ID: <20101125200424.GA23598@hostway.ca> (raw)

Hello!

I'm not sure what triggers this off, but at certain times, we see this on
shared hosting servers with storage over NFSv3:

top - 13:24:35 up 8 days,  6:47,  2 users,  load average: 69.23, 77.59, 85.53
Tasks: 1318 total,  11 running, 1303 sleeping,   0 stopped,   4 zombie
Cpu(s):  2.5%us, 96.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.9%si,  0.0%st
Mem:   4055836k total,  3746576k used,   309260k free,    39348k buffers
Swap:   979960k total,   104868k used,   875092k free,  1558432k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5558 root      20   0     0    0    0 R   38  0.0  22:30.23 flush-0:58
 5683 root      20   0     0    0    0 R   38  0.0  53:06.24 flush-0:76
13076 root      20   0     0    0    0 R   38  0.0  30:03.96 flush-0:62
13265 mysql     20   0 1763m 128m 5632 S   38  3.3   8:21.47 mysqld                                                                                                                                              
25419 root      20   0     0    0    0 R   38  0.0  12:17.28 flush-0:38
 5783 root      20   0     0    0    0 R   32  0.0  43:32.78 flush-0:78
12725 root      20   0     0    0    0 R   32  0.0  29:45.53 flush-0:72
 5676 root      20   0     0    0    0 R   32  0.0  41:18.64 flush-0:82
24034 root      20   0     0    0    0 R   32  0.0  30:26.86 flush-0:32
32192 root      20   0     0    0    0 R   32  0.0  24:36.93 flush-0:81
 8465 root      20   0     0    0    0 R   31  0.0  19:26.43 flush-0:83
29379 root      20   0 30228 3076 1560 R    2  0.1   0:00.25 top

# cat /proc/5676/stack
[<ffffffff81203fed>] nfs_writepages+0xed/0x160
[<ffffffff8136b320>] _atomic_dec_and_lock+0x50/0x70
[<ffffffff8111ac3c>] iput+0x2c/0x2a0
[<ffffffff8112636e>] writeback_sb_inodes+0xce/0x150
[<ffffffff81126d1c>] writeback_inodes_wb+0xcc/0x140
[<ffffffff811270bb>] wb_writeback+0x32b/0x370
[<ffffffff811271d4>] wb_do_writeback+0xd4/0x1e0
[<ffffffff811273bb>] bdi_writeback_thread+0xdb/0x240
[<ffffffff81071ae6>] kthread+0x96/0xb0
[<ffffffff8100bd24>] kernel_thread_helper+0x4/0x10
[<ffffffffffffffff>] 0xffffffffffffffff

   PerfTop:    3894 irqs/sec  kernel:99.5%  exact:  0.0% [1000Hz cycles],  (all, 4 CPUs)
----------------------------------------------------------------------------------------

     samples  pcnt function                        DSO
     _______ _____ _______________________________ _________________

    25466.00 81.1% _raw_spin_lock                  [kernel.kallsyms]          
      968.00  3.1% nfs_writepages                  [kernel.kallsyms]          
      704.00  2.2% writeback_sb_inodes             [kernel.kallsyms]          
      671.00  2.1% writeback_single_inode          [kernel.kallsyms]          
      636.00  2.0% queue_io                        [kernel.kallsyms]          
      393.00  1.3% nfs_write_inode                 [kernel.kallsyms]          
      368.00  1.2% __iget                          [kernel.kallsyms]          
      360.00  1.1% iput                            [kernel.kallsyms]          
      334.00  1.1% bit_waitqueue                   [kernel.kallsyms]          
      293.00  0.9% do_writepages                   [kernel.kallsyms]          
      206.00  0.7% redirty_tail                    [kernel.kallsyms]          
      194.00  0.6% __mark_inode_dirty              [kernel.kallsyms]          
      135.00  0.4% _atomic_dec_and_lock            [kernel.kallsyms]          
      119.00  0.4% write_cache_pages               [kernel.kallsyms]          
       65.00  0.2% __wake_up_bit                   [kernel.kallsyms]          
       62.00  0.2% is_bad_inode                    [kernel.kallsyms]          
       41.00  0.1% find_get_pages_tag              [kernel.kallsyms]          
       37.00  0.1% radix_tree_gang_lookup_tag_slot [kernel.kallsyms]          
       29.00  0.1% wake_up_bit                     [kernel.kallsyms]          
       17.00  0.1% thread_group_cputime            [kernel.kallsyms]          
       16.00  0.1% pagevec_lookup_tag              [kernel.kallsyms]          
       16.00  0.1% radix_tree_tagged               [kernel.kallsyms]          
       16.00  0.1% _cond_resched                   [kernel.kallsyms]          
       14.00  0.0% ktime_get                       [kernel.kallsyms]          
       11.00  0.0% mapping_tagged                  [kernel.kallsyms]          
	9.00  0.0% nfs_pageio_init                 [kernel.kallsyms]          
	9.00  0.0% __phys_addr                     [kernel.kallsyms]          
	8.00  0.0% nfs_pageio_doio                 [kernel.kallsyms]          
	7.00  0.0% _raw_spin_lock_irqsave          [kernel.kallsyms]          
	7.00  0.0% dso__load_sym                   /usr/local/bin/perf        

# grep ^nr_ /proc/vmstat
nr_free_pages 123806
nr_inactive_anon 52656
nr_active_anon 192039
nr_inactive_file 351876
nr_active_file 92993
nr_unevictable 0
nr_mlock 0
nr_anon_pages 234118
nr_mapped 4652
nr_file_pages 457325
nr_dirty 211
nr_writeback 0
nr_slab_reclaimable 79296
nr_slab_unreclaimable 45221
nr_page_table_pages 57483
nr_kernel_stack 2537
nr_unstable 0
nr_bounce 0
nr_vmscan_write 3387227
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 144

Looking up flush-major:minor in /proc/self/mountinfo shows that these are
all NFS mounts from a few different servers.  It doesn't seem like there
are any large write sor anything going on at this moment -- no processes
are in D state.  I started a "sync", which ran for another 10 minutes or
so, and then finally exited, and everything returned to normal.

Simon-

WARNING: multiple messages have this Message-ID (diff)
From: Simon Kirby <sim@hostway.ca>
To: linux-mm@kvack.org, linux-nfs@vger.kernel.org
Subject: Lock contention from flush processes for NFS mounts take all cpu [2.6.36, NFSv3]
Date: Thu, 25 Nov 2010 12:04:24 -0800	[thread overview]
Message-ID: <20101125200424.GA23598@hostway.ca> (raw)

Hello!

I'm not sure what triggers this off, but at certain times, we see this on
shared hosting servers with storage over NFSv3:

top - 13:24:35 up 8 days,  6:47,  2 users,  load average: 69.23, 77.59, 85.53
Tasks: 1318 total,  11 running, 1303 sleeping,   0 stopped,   4 zombie
Cpu(s):  2.5%us, 96.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.9%si,  0.0%st
Mem:   4055836k total,  3746576k used,   309260k free,    39348k buffers
Swap:   979960k total,   104868k used,   875092k free,  1558432k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5558 root      20   0     0    0    0 R   38  0.0  22:30.23 flush-0:58
 5683 root      20   0     0    0    0 R   38  0.0  53:06.24 flush-0:76
13076 root      20   0     0    0    0 R   38  0.0  30:03.96 flush-0:62
13265 mysql     20   0 1763m 128m 5632 S   38  3.3   8:21.47 mysqld                                                                                                                                              
25419 root      20   0     0    0    0 R   38  0.0  12:17.28 flush-0:38
 5783 root      20   0     0    0    0 R   32  0.0  43:32.78 flush-0:78
12725 root      20   0     0    0    0 R   32  0.0  29:45.53 flush-0:72
 5676 root      20   0     0    0    0 R   32  0.0  41:18.64 flush-0:82
24034 root      20   0     0    0    0 R   32  0.0  30:26.86 flush-0:32
32192 root      20   0     0    0    0 R   32  0.0  24:36.93 flush-0:81
 8465 root      20   0     0    0    0 R   31  0.0  19:26.43 flush-0:83
29379 root      20   0 30228 3076 1560 R    2  0.1   0:00.25 top

# cat /proc/5676/stack
[<ffffffff81203fed>] nfs_writepages+0xed/0x160
[<ffffffff8136b320>] _atomic_dec_and_lock+0x50/0x70
[<ffffffff8111ac3c>] iput+0x2c/0x2a0
[<ffffffff8112636e>] writeback_sb_inodes+0xce/0x150
[<ffffffff81126d1c>] writeback_inodes_wb+0xcc/0x140
[<ffffffff811270bb>] wb_writeback+0x32b/0x370
[<ffffffff811271d4>] wb_do_writeback+0xd4/0x1e0
[<ffffffff811273bb>] bdi_writeback_thread+0xdb/0x240
[<ffffffff81071ae6>] kthread+0x96/0xb0
[<ffffffff8100bd24>] kernel_thread_helper+0x4/0x10
[<ffffffffffffffff>] 0xffffffffffffffff

   PerfTop:    3894 irqs/sec  kernel:99.5%  exact:  0.0% [1000Hz cycles],  (all, 4 CPUs)
----------------------------------------------------------------------------------------

     samples  pcnt function                        DSO
     _______ _____ _______________________________ _________________

    25466.00 81.1% _raw_spin_lock                  [kernel.kallsyms]          
      968.00  3.1% nfs_writepages                  [kernel.kallsyms]          
      704.00  2.2% writeback_sb_inodes             [kernel.kallsyms]          
      671.00  2.1% writeback_single_inode          [kernel.kallsyms]          
      636.00  2.0% queue_io                        [kernel.kallsyms]          
      393.00  1.3% nfs_write_inode                 [kernel.kallsyms]          
      368.00  1.2% __iget                          [kernel.kallsyms]          
      360.00  1.1% iput                            [kernel.kallsyms]          
      334.00  1.1% bit_waitqueue                   [kernel.kallsyms]          
      293.00  0.9% do_writepages                   [kernel.kallsyms]          
      206.00  0.7% redirty_tail                    [kernel.kallsyms]          
      194.00  0.6% __mark_inode_dirty              [kernel.kallsyms]          
      135.00  0.4% _atomic_dec_and_lock            [kernel.kallsyms]          
      119.00  0.4% write_cache_pages               [kernel.kallsyms]          
       65.00  0.2% __wake_up_bit                   [kernel.kallsyms]          
       62.00  0.2% is_bad_inode                    [kernel.kallsyms]          
       41.00  0.1% find_get_pages_tag              [kernel.kallsyms]          
       37.00  0.1% radix_tree_gang_lookup_tag_slot [kernel.kallsyms]          
       29.00  0.1% wake_up_bit                     [kernel.kallsyms]          
       17.00  0.1% thread_group_cputime            [kernel.kallsyms]          
       16.00  0.1% pagevec_lookup_tag              [kernel.kallsyms]          
       16.00  0.1% radix_tree_tagged               [kernel.kallsyms]          
       16.00  0.1% _cond_resched                   [kernel.kallsyms]          
       14.00  0.0% ktime_get                       [kernel.kallsyms]          
       11.00  0.0% mapping_tagged                  [kernel.kallsyms]          
	9.00  0.0% nfs_pageio_init                 [kernel.kallsyms]          
	9.00  0.0% __phys_addr                     [kernel.kallsyms]          
	8.00  0.0% nfs_pageio_doio                 [kernel.kallsyms]          
	7.00  0.0% _raw_spin_lock_irqsave          [kernel.kallsyms]          
	7.00  0.0% dso__load_sym                   /usr/local/bin/perf        

# grep ^nr_ /proc/vmstat
nr_free_pages 123806
nr_inactive_anon 52656
nr_active_anon 192039
nr_inactive_file 351876
nr_active_file 92993
nr_unevictable 0
nr_mlock 0
nr_anon_pages 234118
nr_mapped 4652
nr_file_pages 457325
nr_dirty 211
nr_writeback 0
nr_slab_reclaimable 79296
nr_slab_unreclaimable 45221
nr_page_table_pages 57483
nr_kernel_stack 2537
nr_unstable 0
nr_bounce 0
nr_vmscan_write 3387227
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 144

Looking up flush-major:minor in /proc/self/mountinfo shows that these are
all NFS mounts from a few different servers.  It doesn't seem like there
are any large write sor anything going on at this moment -- no processes
are in D state.  I started a "sync", which ran for another 10 minutes or
so, and then finally exited, and everything returned to normal.

Simon-

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

             reply	other threads:[~2010-11-25 20:04 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-25 20:04 Simon Kirby [this message]
2010-11-25 20:04 ` Lock contention from flush processes for NFS mounts take all cpu [2.6.36, NFSv3] Simon Kirby

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101125200424.GA23598@hostway.ca \
    --to=sim@hostway.ca \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.