From: Arthur Marsh <arthur.marsh@internode.on.net>
To: Oleg Nesterov <oleg@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Jan Kara <jack@suse.cz>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/8] change sb_writers to use percpu_rw_semaphore
Date: Sun, 16 Aug 2015 23:17:06 +0930
Message-ID: <55D0945A.2040505@internode.on.net>
In-Reply-To: <20150814171935.GA15042@redhat.com>
Oleg Nesterov wrote on 15/08/15 02:49:
> On 08/13, Jan Kara wrote:
>>
>> Regarding the routing, ideally Al Viro should take these as a VFS
>> maintainer.
>
> Al, could you take these patches?
>
> Only cosmetic changes in V3 to address the comments from Jan, I
> preserved his acks.
>
> In case you missed all the spam I sent before, let me repeat that
> the awful (and currently unneeded) 7/8 will be reverted later. We
> need it to ensure that other percpu_rw_semaphore changes routed
> via another tree won't break fs/super.c. After that we will add
> rcu_sync_dtor(s_writers->rw_sem) into deactivate_locked_super()
> and revert this horror.
>
> 3/8 documents the lockdep problems we currently have. This is fixed
> by the patch below but it depends on xfs ILOCK fixes from Dave, so
> I will send it later. Plus another patch which removes the "trylock"
> hack in __sb_start_write().
>
> Oleg.
Would these patches address the problems I have been seeing over the
last day or so with Linus' git head kernel, such as the following?
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 4.2.0-rc6+ (root@victoria) (gcc version 5.2.1 20150808 (Debian 5.2.1-15) ) #11 SMP PREEMPT Sun Aug 16 07:27:00 ACST 2015
...
[ 6000.096107] INFO: task basename:7796 blocked for more than 120 seconds.
[ 6000.096116] Not tainted 4.2.0-rc6+ #11
[ 6000.096120] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6000.096123] basename D e7b5b180 0 7796 6936 0x00000000
[ 6000.096132] c0379a84 00000086 c11127a5 e7b5b180 e7b5b5ec 2e0a5fb9 00000557 f5f0b310
[ 6000.096143] f330b180 e7b5b180 c037a000 f5f0b300 7fffffff c0379a90 c155b740 00000000
[ 6000.096154] c0379b04 c155fa1d 00000046 c11127a5 00000246 00000000 c0379ab0 c10a569b
[ 6000.096164] Call Trace:
[ 6000.096174] [<c11127a5>] ? __delayacct_blkio_start+0x15/0x20
[ 6000.096179] [<c155b740>] schedule+0x30/0x80
[ 6000.096184] [<c155fa1d>] schedule_timeout+0x2cd/0x5c0
[ 6000.096188] [<c11127a5>] ? __delayacct_blkio_start+0x15/0x20
[ 6000.096193] [<c10a569b>] ? trace_hardirqs_on+0xb/0x10
[ 6000.096198] [<c10d701c>] ? ktime_get+0xac/0x1a0
[ 6000.096202] [<c11127a5>] ? __delayacct_blkio_start+0x15/0x20
[ 6000.096206] [<c155ad99>] io_schedule_timeout+0x89/0xf0
[ 6000.096209] [<c155beb0>] ? bit_wait+0x40/0x40
[ 6000.096213] [<c155bed5>] bit_wait_io+0x25/0x50
[ 6000.096216] [<c155bbc9>] __wait_on_bit+0x49/0x70
[ 6000.096219] [<c155beb0>] ? bit_wait+0x40/0x40
[ 6000.096223] [<c155bc4d>] out_of_line_wait_on_bit+0x5d/0x70
[ 6000.096226] [<c155beb0>] ? bit_wait+0x40/0x40
[ 6000.096230] [<c109b210>] ? autoremove_wake_function+0x40/0x40
[ 6000.096236] [<c11ed5be>] bh_submit_read+0x7e/0x90
[ 6000.096265] [<f8321354>] ext4_get_branch+0xa4/0x110 [ext4]
[ 6000.096286] [<f8321f14>] ext4_ind_map_blocks+0xd4/0xe30 [ext4]
[ 6000.096291] [<c10a6290>] ? __lock_acquire+0x910/0x16a0
[ 6000.096295] [<c10a6290>] ? __lock_acquire+0x910/0x16a0
[ 6000.096300] [<c155ed53>] ? down_read+0x33/0x50
[ 6000.096315] [<f82ddc9d>] ext4_map_blocks+0x29d/0x4f0 [ext4]
[ 6000.096319] [<c10a548b>] ? mark_held_locks+0x5b/0x90
[ 6000.096323] [<c10a55ec>] ? trace_hardirqs_on_caller+0x12c/0x1d0
[ 6000.096337] [<f82db052>] ? ext4_readpages+0x32/0x40 [ext4]
[ 6000.096358] [<f832d37b>] ext4_mpage_readpages+0x30b/0x8c0 [ext4]
[ 6000.096372] [<f82db052>] ? ext4_readpages+0x32/0x40 [ext4]
[ 6000.096377] [<c1156930>] ? __alloc_pages_nodemask+0x9c0/0xa40
[ 6000.096383] [<c107ed46>] ? preempt_count_sub+0x26/0x70
[ 6000.096397] [<f82db052>] ext4_readpages+0x32/0x40 [ext4]
[ 6000.096411] [<f82db020>] ? do_journal_get_write_access+0xb0/0xb0 [ext4]
[ 6000.096416] [<c115c376>] __do_page_cache_readahead+0x2e6/0x370
[ 6000.096420] [<c115c233>] ? __do_page_cache_readahead+0x1a3/0x370
[ 6000.096426] [<c114fb85>] filemap_fault+0x505/0x570
[ 6000.096430] [<c117bb6f>] ? __do_fault+0x2f/0x80
[ 6000.096435] [<c117bb6f>] __do_fault+0x2f/0x80
[ 6000.096439] [<c1560ca7>] ? _raw_spin_unlock+0x27/0x50
[ 6000.096443] [<c117f412>] handle_mm_fault+0xb22/0x11d0
[ 6000.096448] [<c104aa7a>] __do_page_fault+0x16a/0x500
[ 6000.096452] [<c104ae10>] ? __do_page_fault+0x500/0x500
[ 6000.096456] [<c104ae31>] do_page_fault+0x21/0x30
[ 6000.096460] [<c156282b>] error_code+0x5f/0x64
[ 6000.096464] [<c104ae10>] ? __do_page_fault+0x500/0x500
[ 6000.096468] 2 locks held by basename/7796:
[ 6000.096470] #0: (&mm->mmap_sem){++++++}, at: [<c104aa25>] __do_page_fault+0x115/0x500
[ 6000.096479] #1: (&ei->i_data_sem){++++..}, at: [<f82ddd9b>] ext4_map_blocks+0x39b/0x4f0 [ext4]
[ 6000.096500] INFO: task hddtemp:7797 blocked for more than 120 seconds.
[ 6000.096503] Not tainted 4.2.0-rc6+ #11
[ 6000.096505] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6000.096508] hddtemp D e896d100 0 7797 5140 0x00000000
[ 6000.096514] c02c3a84 00000086 e896d588 e896d100 e896d56c 00000001 c02c3a84 f5f0b310
[ 6000.096525] c176fb00 e896d100 c02c4000 f5f0b300 7fffffff c02c3a90 c155b740 00000000
[ 6000.096535] c02c3b04 c155fa1d 00000046 c11127a5 00000246 00000000 c02c3ab0 c10a569b
[ 6000.096546] Call Trace:
[ 6000.096550] [<c155b740>] schedule+0x30/0x80
[ 6000.096554] [<c155fa1d>] schedule_timeout+0x2cd/0x5c0
[ 6000.096558] [<c11127a5>] ? __delayacct_blkio_start+0x15/0x20
[ 6000.096562] [<c10a569b>] ? trace_hardirqs_on+0xb/0x10
[ 6000.096566] [<c10d701c>] ? ktime_get+0xac/0x1a0
[ 6000.096569] [<c11127a5>] ? __delayacct_blkio_start+0x15/0x20
[ 6000.096574] [<c155ad99>] io_schedule_timeout+0x89/0xf0
[ 6000.096577] [<c109ad07>] ? prepare_to_wait_exclusive+0x47/0x80
[ 6000.096581] [<c155beb0>] ? bit_wait+0x40/0x40
[ 6000.096584] [<c155bed5>] bit_wait_io+0x25/0x50
[ 6000.096587] [<c155bd12>] __wait_on_bit_lock+0x32/0x80
[ 6000.096591] [<c155bdbd>] out_of_line_wait_on_bit_lock+0x5d/0x70
[ 6000.096595] [<c155beb0>] ? bit_wait+0x40/0x40
[ 6000.096598] [<c109b210>] ? autoremove_wake_function+0x40/0x40
[ 6000.096602] [<c11ea166>] bh_uptodate_or_lock+0x66/0x70
[ 6000.096623] [<f8321349>] ext4_get_branch+0x99/0x110 [ext4]
[ 6000.096643] [<f8321f14>] ext4_ind_map_blocks+0xd4/0xe30 [ext4]
[ 6000.096647] [<c10a6290>] ? __lock_acquire+0x910/0x16a0
[ 6000.096651] [<c10a6290>] ? __lock_acquire+0x910/0x16a0
[ 6000.096656] [<c155ed53>] ? down_read+0x33/0x50
[ 6000.096671] [<f82ddc9d>] ext4_map_blocks+0x29d/0x4f0 [ext4]
[ 6000.096675] [<c10a548b>] ? mark_held_locks+0x5b/0x90
[ 6000.096679] [<c10a55ec>] ? trace_hardirqs_on_caller+0x12c/0x1d0
[ 6000.096693] [<f82db052>] ? ext4_readpages+0x32/0x40 [ext4]
[ 6000.096713] [<f832d37b>] ext4_mpage_readpages+0x30b/0x8c0 [ext4]
[ 6000.096727] [<f82db052>] ? ext4_readpages+0x32/0x40 [ext4]
[ 6000.096732] [<c1156930>] ? __alloc_pages_nodemask+0x9c0/0xa40
[ 6000.096747] [<f82db052>] ext4_readpages+0x32/0x40 [ext4]
[ 6000.096761] [<f82db020>] ? do_journal_get_write_access+0xb0/0xb0 [ext4]
[ 6000.096766] [<c115c376>] __do_page_cache_readahead+0x2e6/0x370
[ 6000.096770] [<c115c233>] ? __do_page_cache_readahead+0x1a3/0x370
[ 6000.096775] [<c114fb85>] filemap_fault+0x505/0x570
[ 6000.096779] [<c117bb6f>] ? __do_fault+0x2f/0x80
[ 6000.096783] [<c117bb6f>] __do_fault+0x2f/0x80
[ 6000.096787] [<c1560ca7>] ? _raw_spin_unlock+0x27/0x50
[ 6000.096791] [<c117f412>] handle_mm_fault+0xb22/0x11d0
[ 6000.096796] [<c104aa7a>] __do_page_fault+0x16a/0x500
[ 6000.096800] [<c104ae10>] ? __do_page_fault+0x500/0x500
[ 6000.096803] [<c104ae31>] do_page_fault+0x21/0x30
[ 6000.096807] [<c156282b>] error_code+0x5f/0x64
[ 6000.096811] [<c104ae10>] ? __do_page_fault+0x500/0x500
[ 6000.096815] 2 locks held by hddtemp/7797:
[ 6000.096817] #0: (&mm->mmap_sem){++++++}, at: [<c104aa25>] __do_page_fault+0x115/0x500
[ 6000.096825] #1: (&ei->i_data_sem){++++..}, at: [<f82ddd9b>] ext4_map_blocks+0x39b/0x4f0 [ext4]
Thread overview: 13+ messages
2015-08-14 17:19 [PATCH v3 0/8] change sb_writers to use percpu_rw_semaphore Oleg Nesterov
2015-08-14 17:19 ` [PATCH v3 1/8] introduce __sb_writers_{acquired,release}() helpers Oleg Nesterov
2015-08-14 17:19 ` [PATCH v3 2/8] fix the broken lockdep logic in __sb_start_write() Oleg Nesterov
2015-08-14 17:19 ` [PATCH v3 3/8] document rwsem_release() in sb_wait_write() Oleg Nesterov
2015-08-14 17:19 ` [PATCH v3 4/8] percpu-rwsem: introduce percpu_down_read_trylock() Oleg Nesterov
2015-08-14 17:20 ` [PATCH v3 5/8] percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire() Oleg Nesterov
2015-08-14 17:20 ` [PATCH v3 6/8] percpu-rwsem: kill CONFIG_PERCPU_RWSEM Oleg Nesterov
2015-08-14 17:20 ` [PATCH v3 7/8] shift percpu_counter_destroy() into destroy_super_work() Oleg Nesterov
2015-08-14 17:20 ` [PATCH v3 8/8] change sb_writers to use percpu_rw_semaphore Oleg Nesterov
2015-08-15 7:17 ` [PATCH v3 0/8] " Al Viro
2015-08-15 12:03 ` Oleg Nesterov
2015-08-16 13:47 ` Arthur Marsh [this message]
2015-08-17 11:35 ` Oleg Nesterov