linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [3.18rc1] btrfs triggering vm bug_on
@ 2014-10-21 21:57 Dave Jones
  2014-10-22  0:50 ` Qu Wenruo
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Jones @ 2014-10-21 21:57 UTC (permalink / raw)
  To: Linux Kernel; +Cc: clm, jbacik, linux-btrfs

page:ffffea00088aa1c0 count:4 mapcount:0 mapping:ffff88009901e2d8 index:0x0
flags: 0x2ffc0000000806(error|referenced|private)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
------------[ cut here ]------------
kernel BUG at mm/filemap.c:747!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: fuse snd_seq_dummy tun binfmt_misc usb_debug bnep rfcomm nfnetlink scsi_transport_iscsi hidp af_key llc2 can_bcm can_raw nfc caif_socket caif 
af_802154 ieee802154 phonet af_rxrpc bluetooth can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose sctp libcrc32c x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 c
fg80211 rfkill btrfs xor coretemp hwmon x86_pkg_temp_thermal kvm_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm snd_hda_intel snd_hda_controller snd_h
da_codec snd_hwdep crct10dif_pclmul snd_seq crc32c_intel ghash_clmulni_intel e1000e snd_seq_device raid6_pq snd_pcm microcode snd_timer serio_raw snd pcspkr ptp pps_core shpchp
 soundcore nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
CPU: 3 PID: 6466 Comm: trinity-c402 Not tainted 3.18.0-rc1+ #72 
task: ffff88003ec22f00 ti: ffff88006f3ac000 task.ti: ffff88006f3ac000
RIP: 0010:[<ffffffffbb1a06df>]  [<ffffffffbb1a06df>] unlock_page+0x7f/0x90
RSP: 0018:ffff88006f3afa08  EFLAGS: 00010292
RAX: 0000000000000036 RBX: 0000000000001000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: ffffffffbb843c53 RDI: ffffffffbb0e399e
RBP: ffff88006f3afa08 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000c2b
R13: 0000000000000000 R14: ffffea00088aa1c0 R15: 0000000000000000
FS:  00007f7e30942740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7e2d8b8000 CR3: 0000000098e56000 CR4: 00000000001407e0
DR0: 000000000219e000 DR1: 0000000000de8000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88006f3afb28 ffffffffc05adf78 ffffffffc05ae375 ffff88006f3afa70
 ffffffffbb03d193 ffffffffbb0b25c5 ffffffffc05adb10 0000000000000000
 00000000211ce98b 0000000000000000 ffff88006f3afbd0 ffff880222a87000
Call Trace:
 [<ffffffffc05adf78>] __do_readpage+0x468/0x7a0 [btrfs]
 [<ffffffffc05ae375>] ? __extent_read_full_page+0xc5/0xe0 [btrfs]
 [<ffffffffbb03d193>] ? prepare_ftrace_return+0x73/0xe0
 [<ffffffffbb0b25c5>] ? preempt_count_sub+0x5/0x100
 [<ffffffffc05adb10>] ? extent_write_cache_pages.isra.30.constprop.48+0x440/0x440 [btrfs]
 [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
 [<ffffffffbb0b263b>] ? preempt_count_sub+0x7b/0x100
 [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
 [<ffffffffc05ae375>] __extent_read_full_page+0xc5/0xe0 [btrfs]
 [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
 [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
 [<ffffffffc05af426>] extent_read_full_page+0x46/0x80 [btrfs]
 [<ffffffffc058d265>] btrfs_readpage+0x25/0x30 [btrfs]
 [<ffffffffc059da78>] prepare_uptodate_page+0x38/0xa0 [btrfs]
 [<ffffffffc059dbec>] prepare_pages.isra.19+0x10c/0x1a0 [btrfs]
 [<ffffffffc059ed54>] __btrfs_buffered_write+0x1e4/0x5f0 [btrfs]
 [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
 [<ffffffffc059f52e>] btrfs_file_write_iter+0x3ce/0x550 [btrfs]
 [<ffffffffbb341060>] ? cap_inode_getsecid+0x20/0x20
 [<ffffffffbb21aad0>] ? new_sync_read+0xd0/0xd0
 [<ffffffffbb21ab5e>] new_sync_write+0x8e/0xd0
 [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
 [<ffffffffbb21b53a>] vfs_write+0xba/0x1f0
 [<ffffffffbb21c048>] SyS_write+0x58/0xd0
 [<ffffffffbb854689>] tracesys_phase2+0xd4/0xd9
Code: 48 8d 0c 92 48 8d 0c 4a 48 8b 96 98 00 00 00 48 89 c6 48 8d 3c ca 31 d2 e8 df d5 f2 ff 5d c3 48 c7 c6 38 81 c7 bb e8 01 ff 02 00 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 e8 6b 60 6b 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [3.18rc1] btrfs triggering vm bug_on
  2014-10-21 21:57 [3.18rc1] btrfs triggering vm bug_on Dave Jones
@ 2014-10-22  0:50 ` Qu Wenruo
  2014-10-22  0:55   ` Dave Jones
  2014-10-22  1:55   ` Robert White
  0 siblings, 2 replies; 5+ messages in thread
From: Qu Wenruo @ 2014-10-22  0:50 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, clm, jbacik, linux-btrfs

Any reproducer?

Thanks,
Qu
-------- Original Message --------
Subject: [3.18rc1] btrfs triggering vm bug_on
From: Dave Jones <davej@redhat.com>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Date: 2014年10月22日 05:57
> page:ffffea00088aa1c0 count:4 mapcount:0 mapping:ffff88009901e2d8 index:0x0
> flags: 0x2ffc0000000806(error|referenced|private)
> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> ------------[ cut here ]------------
> kernel BUG at mm/filemap.c:747!
> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in: fuse snd_seq_dummy tun binfmt_misc usb_debug bnep rfcomm nfnetlink scsi_transport_iscsi hidp af_key llc2 can_bcm can_raw nfc caif_socket caif
> af_802154 ieee802154 phonet af_rxrpc bluetooth can pppoe pppox ppp_generic slhc irda crc_ccitt rds rose sctp libcrc32c x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 c
> fg80211 rfkill btrfs xor coretemp hwmon x86_pkg_temp_thermal kvm_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm snd_hda_intel snd_hda_controller snd_h
> da_codec snd_hwdep crct10dif_pclmul snd_seq crc32c_intel ghash_clmulni_intel e1000e snd_seq_device raid6_pq snd_pcm microcode snd_timer serio_raw snd pcspkr ptp pps_core shpchp
>   soundcore nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
> CPU: 3 PID: 6466 Comm: trinity-c402 Not tainted 3.18.0-rc1+ #72
> task: ffff88003ec22f00 ti: ffff88006f3ac000 task.ti: ffff88006f3ac000
> RIP: 0010:[<ffffffffbb1a06df>]  [<ffffffffbb1a06df>] unlock_page+0x7f/0x90
> RSP: 0018:ffff88006f3afa08  EFLAGS: 00010292
> RAX: 0000000000000036 RBX: 0000000000001000 RCX: 0000000000000006
> RDX: 0000000000000000 RSI: ffffffffbb843c53 RDI: ffffffffbb0e399e
> RBP: ffff88006f3afa08 R08: 0000000000000001 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000c2b
> R13: 0000000000000000 R14: ffffea00088aa1c0 R15: 0000000000000000
> FS:  00007f7e30942740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f7e2d8b8000 CR3: 0000000098e56000 CR4: 00000000001407e0
> DR0: 000000000219e000 DR1: 0000000000de8000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Stack:
>   ffff88006f3afb28 ffffffffc05adf78 ffffffffc05ae375 ffff88006f3afa70
>   ffffffffbb03d193 ffffffffbb0b25c5 ffffffffc05adb10 0000000000000000
>   00000000211ce98b 0000000000000000 ffff88006f3afbd0 ffff880222a87000
> Call Trace:
>   [<ffffffffc05adf78>] __do_readpage+0x468/0x7a0 [btrfs]
>   [<ffffffffc05ae375>] ? __extent_read_full_page+0xc5/0xe0 [btrfs]
>   [<ffffffffbb03d193>] ? prepare_ftrace_return+0x73/0xe0
>   [<ffffffffbb0b25c5>] ? preempt_count_sub+0x5/0x100
>   [<ffffffffc05adb10>] ? extent_write_cache_pages.isra.30.constprop.48+0x440/0x440 [btrfs]
>   [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
>   [<ffffffffbb0b263b>] ? preempt_count_sub+0x7b/0x100
>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>   [<ffffffffc05ae375>] __extent_read_full_page+0xc5/0xe0 [btrfs]
>   [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>   [<ffffffffc05af426>] extent_read_full_page+0x46/0x80 [btrfs]
>   [<ffffffffc058d265>] btrfs_readpage+0x25/0x30 [btrfs]
>   [<ffffffffc059da78>] prepare_uptodate_page+0x38/0xa0 [btrfs]
>   [<ffffffffc059dbec>] prepare_pages.isra.19+0x10c/0x1a0 [btrfs]
>   [<ffffffffc059ed54>] __btrfs_buffered_write+0x1e4/0x5f0 [btrfs]
>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>   [<ffffffffc059f52e>] btrfs_file_write_iter+0x3ce/0x550 [btrfs]
>   [<ffffffffbb341060>] ? cap_inode_getsecid+0x20/0x20
>   [<ffffffffbb21aad0>] ? new_sync_read+0xd0/0xd0
>   [<ffffffffbb21ab5e>] new_sync_write+0x8e/0xd0
>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>   [<ffffffffbb21b53a>] vfs_write+0xba/0x1f0
>   [<ffffffffbb21c048>] SyS_write+0x58/0xd0
>   [<ffffffffbb854689>] tracesys_phase2+0xd4/0xd9
> Code: 48 8d 0c 92 48 8d 0c 4a 48 8b 96 98 00 00 00 48 89 c6 48 8d 3c ca 31 d2 e8 df d5 f2 ff 5d c3 48 c7 c6 38 81 c7 bb e8 01 ff 02 00 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 e8 6b 60 6b
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [3.18rc1] btrfs triggering vm bug_on
  2014-10-22  0:50 ` Qu Wenruo
@ 2014-10-22  0:55   ` Dave Jones
  2014-10-22  1:55   ` Robert White
  1 sibling, 0 replies; 5+ messages in thread
From: Dave Jones @ 2014-10-22  0:55 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Linux Kernel, clm, jbacik, linux-btrfs

On Wed, Oct 22, 2014 at 08:50:57AM +0800, Qu Wenruo wrote:
 > Any reproducer?
 > 
 > Thanks,
 > Qu
 > -------- Original Message --------
 > Subject: [3.18rc1] btrfs triggering vm bug_on
 > From: Dave Jones <davej@redhat.com>
 > To: Linux Kernel <linux-kernel@vger.kernel.org>
 > Date: 2014年10月22日 05:57
 > > page:ffffea00088aa1c0 count:4 mapcount:0 mapping:ffff88009901e2d8 index:0x0
 > > flags: 0x2ffc0000000806(error|referenced|private)
 

Not easily. It had been running trinity for around 24hrs, and the
post-mortem file it dumped doesn't seem too helpful.

	Dave

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [3.18rc1] btrfs triggering vm bug_on
  2014-10-22  0:50 ` Qu Wenruo
  2014-10-22  0:55   ` Dave Jones
@ 2014-10-22  1:55   ` Robert White
  2014-10-22  1:59     ` Robert White
  1 sibling, 1 reply; 5+ messages in thread
From: Robert White @ 2014-10-22  1:55 UTC (permalink / raw)
  To: Qu Wenruo, Dave Jones, Linux Kernel, clm, jbacik, linux-btrfs

It _looks_ like a hard out-of-ram event and not necessarily a filesystem 
implementation problem.

One of the systems I work with has zero swap but still allows overcommit 
in the VM. It would do things like this all the time back in development 
(and I suspect it still does but the developers got rid of those pesky 
logs). Not necessarily this exact error, but lots of them with the same 
"feel".

You are actually triggering two bugs, only the second being in btrfs code.

The VM_BUG_ON_PAGE() in cleancache_get_page() (mm/cleancache.c) is going 
off first. This is presumably the ram exhaustion event, which is an 
outer problem taking place in your system (e.g. the real cause).



This floats back to the __do_readpage() which doesn't (and almost 
cannot) figure out how the page validation failed

         if (!PageUptodate(page)) {
                 if (cleancache_get_page(page) == 0) {
                         BUG_ON(blocksize != PAGE_SIZE);
                         unlock_extent(tree, start, end);
                         goto out;
                 }
         }



Which (improperly) ends with this suspect bit from the out: in 
__do_readpage() where the unlock_page() is called unconditionally.

out:
         if (!nr) {
                 if (!PageError(page))
                         SetPageUptodate(page);
                 unlock_page(page);
         }

I _think_ the call to unlock_page() _should_ have an if 
(PageLocked(page)) in front of it.

--- extent_io.c   2014-09-19 03:26:21.058266026 -0700
+++ extent_io.c 2014-10-21 18:52:45.127713115 -0700
@@ -2960,7 +2960,8 @@
         if (!nr) {
                 if (!PageError(page))
                         SetPageUptodate(page);
-               unlock_page(page);
+               if (PageLocked(page))
+                       unlock_page(page);
         }
         return 0;
  }

That said, the question of what your system is going to do with the VM 
memory exhausted is a bigger problem.

-- Rob.

On 10/21/2014 05:50 PM, Qu Wenruo wrote:
> Any reproducer?
>
> Thanks,
> Qu
> -------- Original Message --------
> Subject: [3.18rc1] btrfs triggering vm bug_on
> From: Dave Jones <davej@redhat.com>
> To: Linux Kernel <linux-kernel@vger.kernel.org>
> Date: 2014年10月22日 05:57
>> page:ffffea00088aa1c0 count:4 mapcount:0 mapping:ffff88009901e2d8
>> index:0x0
>> flags: 0x2ffc0000000806(error|referenced|private)
>> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
>> ------------[ cut here ]------------
>> kernel BUG at mm/filemap.c:747!
>> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>> Modules linked in: fuse snd_seq_dummy tun binfmt_misc usb_debug bnep
>> rfcomm nfnetlink scsi_transport_iscsi hidp af_key llc2 can_bcm can_raw
>> nfc caif_socket caif
>> af_802154 ieee802154 phonet af_rxrpc bluetooth can pppoe pppox
>> ppp_generic slhc irda crc_ccitt rds rose sctp libcrc32c x25 atm netrom
>> appletalk ipx p8023 psnap p8022 llc ax25 c
>> fg80211 rfkill btrfs xor coretemp hwmon x86_pkg_temp_thermal kvm_intel
>> snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm
>> snd_hda_intel snd_hda_controller snd_h
>> da_codec snd_hwdep crct10dif_pclmul snd_seq crc32c_intel
>> ghash_clmulni_intel e1000e snd_seq_device raid6_pq snd_pcm microcode
>> snd_timer serio_raw snd pcspkr ptp pps_core shpchp
>>   soundcore nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
>> CPU: 3 PID: 6466 Comm: trinity-c402 Not tainted 3.18.0-rc1+ #72
>> task: ffff88003ec22f00 ti: ffff88006f3ac000 task.ti: ffff88006f3ac000
>> RIP: 0010:[<ffffffffbb1a06df>]  [<ffffffffbb1a06df>]
>> unlock_page+0x7f/0x90
>> RSP: 0018:ffff88006f3afa08  EFLAGS: 00010292
>> RAX: 0000000000000036 RBX: 0000000000001000 RCX: 0000000000000006
>> RDX: 0000000000000000 RSI: ffffffffbb843c53 RDI: ffffffffbb0e399e
>> RBP: ffff88006f3afa08 R08: 0000000000000001 R09: 0000000000000001
>> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000c2b
>> R13: 0000000000000000 R14: ffffea00088aa1c0 R15: 0000000000000000
>> FS:  00007f7e30942740(0000) GS:ffff880244e00000(0000)
>> knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f7e2d8b8000 CR3: 0000000098e56000 CR4: 00000000001407e0
>> DR0: 000000000219e000 DR1: 0000000000de8000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>> Stack:
>>   ffff88006f3afb28 ffffffffc05adf78 ffffffffc05ae375 ffff88006f3afa70
>>   ffffffffbb03d193 ffffffffbb0b25c5 ffffffffc05adb10 0000000000000000
>>   00000000211ce98b 0000000000000000 ffff88006f3afbd0 ffff880222a87000
>> Call Trace:
>>   [<ffffffffc05adf78>] __do_readpage+0x468/0x7a0 [btrfs]
>>   [<ffffffffc05ae375>] ? __extent_read_full_page+0xc5/0xe0 [btrfs]
>>   [<ffffffffbb03d193>] ? prepare_ftrace_return+0x73/0xe0
>>   [<ffffffffbb0b25c5>] ? preempt_count_sub+0x5/0x100
>>   [<ffffffffc05adb10>] ?
>> extent_write_cache_pages.isra.30.constprop.48+0x440/0x440 [btrfs]
>>   [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
>>   [<ffffffffbb0b263b>] ? preempt_count_sub+0x7b/0x100
>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>   [<ffffffffc05ae375>] __extent_read_full_page+0xc5/0xe0 [btrfs]
>>   [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>   [<ffffffffc05af426>] extent_read_full_page+0x46/0x80 [btrfs]
>>   [<ffffffffc058d265>] btrfs_readpage+0x25/0x30 [btrfs]
>>   [<ffffffffc059da78>] prepare_uptodate_page+0x38/0xa0 [btrfs]
>>   [<ffffffffc059dbec>] prepare_pages.isra.19+0x10c/0x1a0 [btrfs]
>>   [<ffffffffc059ed54>] __btrfs_buffered_write+0x1e4/0x5f0 [btrfs]
>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>   [<ffffffffc059f52e>] btrfs_file_write_iter+0x3ce/0x550 [btrfs]
>>   [<ffffffffbb341060>] ? cap_inode_getsecid+0x20/0x20
>>   [<ffffffffbb21aad0>] ? new_sync_read+0xd0/0xd0
>>   [<ffffffffbb21ab5e>] new_sync_write+0x8e/0xd0
>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>   [<ffffffffbb21b53a>] vfs_write+0xba/0x1f0
>>   [<ffffffffbb21c048>] SyS_write+0x58/0xd0
>>   [<ffffffffbb854689>] tracesys_phase2+0xd4/0xd9
>> Code: 48 8d 0c 92 48 8d 0c 4a 48 8b 96 98 00 00 00 48 89 c6 48 8d 3c
>> ca 31 d2 e8 df d5 f2 ff 5d c3 48 c7 c6 38 81 c7 bb e8 01 ff 02 00 <0f>
>> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 e8 6b 60 6b
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [3.18rc1] btrfs triggering vm bug_on
  2014-10-22  1:55   ` Robert White
@ 2014-10-22  1:59     ` Robert White
  0 siblings, 0 replies; 5+ messages in thread
From: Robert White @ 2014-10-22  1:59 UTC (permalink / raw)
  To: Qu Wenruo, Dave Jones, Linux Kernel, clm, jbacik, linux-btrfs

Pre-Script :: this is not an 3.18 problem. I traced it out in 3.16.

On 10/21/2014 06:55 PM, Robert White wrote:
> It _looks_ like a hard out-of-ram event and not necessarily a filesystem
> implementation problem.
>
> One of the systems I work with has zero swap but still allows overcommit
> in the VM. It would do things like this all the time back in development
> (and I suspect it still does but the developers got rid of those pesky
> logs). Not necessarily this exact error, but lots of them with the same
> "feel".
>
> You are actually triggering two bugs, only the second being in btrfs code.
>
> The VM_BUG_ON_PAGE() in cleancache_get_page() (mm/cleancache.c) is going
> off first. This is presumably the ram exhaustion event, which is an
> outer problem taking place in your system (e.g. the real cause).
>
>
>
> This floats back to the __do_readpage() which doesn't (and almost
> cannot) figure out how the page validation failed
>
>          if (!PageUptodate(page)) {
>                  if (cleancache_get_page(page) == 0) {
>                          BUG_ON(blocksize != PAGE_SIZE);
>                          unlock_extent(tree, start, end);
>                          goto out;
>                  }
>          }
>
>
>
> Which (improperly) ends with this suspect bit from the out: in
> __do_readpage() where the unlock_page() is called unconditionally.
>
> out:
>          if (!nr) {
>                  if (!PageError(page))
>                          SetPageUptodate(page);
>                  unlock_page(page);
>          }
>
> I _think_ the call to unlock_page() _should_ have an if
> (PageLocked(page)) in front of it.
>
> --- extent_io.c   2014-09-19 03:26:21.058266026 -0700
> +++ extent_io.c 2014-10-21 18:52:45.127713115 -0700
> @@ -2960,7 +2960,8 @@
>          if (!nr) {
>                  if (!PageError(page))
>                          SetPageUptodate(page);
> -               unlock_page(page);
> +               if (PageLocked(page))
> +                       unlock_page(page);
>          }
>          return 0;
>   }
>
> That said, the question of what your system is going to do with the VM
> memory exhausted is a bigger problem.
>
> -- Rob.
>
> On 10/21/2014 05:50 PM, Qu Wenruo wrote:
>> Any reproducer?
>>
>> Thanks,
>> Qu
>> -------- Original Message --------
>> Subject: [3.18rc1] btrfs triggering vm bug_on
>> From: Dave Jones <davej@redhat.com>
>> To: Linux Kernel <linux-kernel@vger.kernel.org>
>> Date: 2014年10月22日 05:57
>>> page:ffffea00088aa1c0 count:4 mapcount:0 mapping:ffff88009901e2d8
>>> index:0x0
>>> flags: 0x2ffc0000000806(error|referenced|private)
>>> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
>>> ------------[ cut here ]------------
>>> kernel BUG at mm/filemap.c:747!
>>> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>>> Modules linked in: fuse snd_seq_dummy tun binfmt_misc usb_debug bnep
>>> rfcomm nfnetlink scsi_transport_iscsi hidp af_key llc2 can_bcm can_raw
>>> nfc caif_socket caif
>>> af_802154 ieee802154 phonet af_rxrpc bluetooth can pppoe pppox
>>> ppp_generic slhc irda crc_ccitt rds rose sctp libcrc32c x25 atm netrom
>>> appletalk ipx p8023 psnap p8022 llc ax25 c
>>> fg80211 rfkill btrfs xor coretemp hwmon x86_pkg_temp_thermal kvm_intel
>>> snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi kvm
>>> snd_hda_intel snd_hda_controller snd_h
>>> da_codec snd_hwdep crct10dif_pclmul snd_seq crc32c_intel
>>> ghash_clmulni_intel e1000e snd_seq_device raid6_pq snd_pcm microcode
>>> snd_timer serio_raw snd pcspkr ptp pps_core shpchp
>>>   soundcore nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc
>>> CPU: 3 PID: 6466 Comm: trinity-c402 Not tainted 3.18.0-rc1+ #72
>>> task: ffff88003ec22f00 ti: ffff88006f3ac000 task.ti: ffff88006f3ac000
>>> RIP: 0010:[<ffffffffbb1a06df>]  [<ffffffffbb1a06df>]
>>> unlock_page+0x7f/0x90
>>> RSP: 0018:ffff88006f3afa08  EFLAGS: 00010292
>>> RAX: 0000000000000036 RBX: 0000000000001000 RCX: 0000000000000006
>>> RDX: 0000000000000000 RSI: ffffffffbb843c53 RDI: ffffffffbb0e399e
>>> RBP: ffff88006f3afa08 R08: 0000000000000001 R09: 0000000000000001
>>> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000c2b
>>> R13: 0000000000000000 R14: ffffea00088aa1c0 R15: 0000000000000000
>>> FS:  00007f7e30942740(0000) GS:ffff880244e00000(0000)
>>> knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00007f7e2d8b8000 CR3: 0000000098e56000 CR4: 00000000001407e0
>>> DR0: 000000000219e000 DR1: 0000000000de8000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
>>> Stack:
>>>   ffff88006f3afb28 ffffffffc05adf78 ffffffffc05ae375 ffff88006f3afa70
>>>   ffffffffbb03d193 ffffffffbb0b25c5 ffffffffc05adb10 0000000000000000
>>>   00000000211ce98b 0000000000000000 ffff88006f3afbd0 ffff880222a87000
>>> Call Trace:
>>>   [<ffffffffc05adf78>] __do_readpage+0x468/0x7a0 [btrfs]
>>>   [<ffffffffc05ae375>] ? __extent_read_full_page+0xc5/0xe0 [btrfs]
>>>   [<ffffffffbb03d193>] ? prepare_ftrace_return+0x73/0xe0
>>>   [<ffffffffbb0b25c5>] ? preempt_count_sub+0x5/0x100
>>>   [<ffffffffc05adb10>] ?
>>> extent_write_cache_pages.isra.30.constprop.48+0x440/0x440 [btrfs]
>>>   [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
>>>   [<ffffffffbb0b263b>] ? preempt_count_sub+0x7b/0x100
>>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>>   [<ffffffffc05ae375>] __extent_read_full_page+0xc5/0xe0 [btrfs]
>>>   [<ffffffffc0590290>] ? btrfs_writepage_end_io_hook+0x360/0x360 [btrfs]
>>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>>   [<ffffffffc05af426>] extent_read_full_page+0x46/0x80 [btrfs]
>>>   [<ffffffffc058d265>] btrfs_readpage+0x25/0x30 [btrfs]
>>>   [<ffffffffc059da78>] prepare_uptodate_page+0x38/0xa0 [btrfs]
>>>   [<ffffffffc059dbec>] prepare_pages.isra.19+0x10c/0x1a0 [btrfs]
>>>   [<ffffffffc059ed54>] __btrfs_buffered_write+0x1e4/0x5f0 [btrfs]
>>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>>   [<ffffffffc059f52e>] btrfs_file_write_iter+0x3ce/0x550 [btrfs]
>>>   [<ffffffffbb341060>] ? cap_inode_getsecid+0x20/0x20
>>>   [<ffffffffbb21aad0>] ? new_sync_read+0xd0/0xd0
>>>   [<ffffffffbb21ab5e>] new_sync_write+0x8e/0xd0
>>>   [<ffffffffbb8567ba>] ? ftrace_graph_caller+0x5a/0x85
>>>   [<ffffffffbb21b53a>] vfs_write+0xba/0x1f0
>>>   [<ffffffffbb21c048>] SyS_write+0x58/0xd0
>>>   [<ffffffffbb854689>] tracesys_phase2+0xd4/0xd9
>>> Code: 48 8d 0c 92 48 8d 0c 4a 48 8b 96 98 00 00 00 48 89 c6 48 8d 3c
>>> ca 31 d2 e8 df d5 f2 ff 5d c3 48 c7 c6 38 81 c7 bb e8 01 ff 02 00 <0f>
>>> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 e8 6b 60 6b
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
>>> linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-10-22  1:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-21 21:57 [3.18rc1] btrfs triggering vm bug_on Dave Jones
2014-10-22  0:50 ` Qu Wenruo
2014-10-22  0:55   ` Dave Jones
2014-10-22  1:55   ` Robert White
2014-10-22  1:59     ` Robert White

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).