public inbox for linux-kernel@vger.kernel.org
* GPF in aio_migratepage
@ 2013-11-26  3:26 Dave Jones
  2013-11-26  6:01 ` Dave Jones
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Jones @ 2013-11-26  3:26 UTC (permalink / raw)
  To: Linux Kernel; +Cc: kmo

Hi Kent,

I hit the GPF below on a tree based on 8e45099e029bb6b369b27d8d4920db8caff5ecce
which has your commit e34ecee2ae791df674dfb466ce40692ca6218e43
("aio: Fix a trinity splat").  Is this another path your patch missed, or
a completely different bug to what you were chasing ?

	Dave

general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: snd_seq_dummy bridge stp tun fuse hidp bnep rfcomm ipt_ULOG can_bcm scsi_transport_iscsi nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can_raw can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds nfnetlink af_key rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 xfs libcrc32c coretemp hwmon x86_pkg_temp_thermal kvm_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec kvm snd_hwdep snd_seq snd_seq_device crct10dif_pclmul snd_pcm snd_page_alloc snd_timer snd crc32c_intel ghash_clmulni_intel shpchp usb_debug e1000e soundcore microcode pcspkr ptp pps_core serio_raw
CPU: 3 PID: 1840 Comm: trinity-child3 Not tainted 3.13.0-rc1+ #9 
task: ffff88003b3a15d0 ti: ffff88001f208000 task.ti: ffff88001f208000
RIP: 0010:[<ffffffff810ad3d1>]  [<ffffffff810ad3d1>] __lock_acquire+0x1b1/0x19f0
RSP: 0018:ffff88001f209740  EFLAGS: 00010002
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88001fbf3760
RBP: ffff88001f2097e8 R08: 0000000000000002 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b3a15d0
R13: 6b6b6b6b6b6b6b6b R14: ffff88001fbf3760 R15: 0000000000000000
FS:  00007faab2396740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4e589ba36c CR3: 000000001f2fa000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
 0000000000000006 ffffffff810a970f 0000000000000006 0000050b04f1418f
 ffff88001f209778 ffffffff8100b164 ffffffff824cb6a0 ffffffff810a970f
 0000000000000000 ffff88003b3a1cd8 0000000000000007 0000000000000006
Call Trace:
 [<ffffffff810a970f>] ? trace_hardirqs_off_caller+0x1f/0xc0
 [<ffffffff8100b164>] ? native_sched_clock+0x24/0x80
 [<ffffffff810a970f>] ? trace_hardirqs_off_caller+0x1f/0xc0
 [<ffffffff810acccb>] ? mark_held_locks+0xbb/0x140
 [<ffffffff810af3c3>] lock_acquire+0x93/0x1c0
 [<ffffffff81210596>] ? aio_migratepage+0xa6/0x150
 [<ffffffff81744b4b>] _raw_spin_lock_irqsave+0x4b/0x90
 [<ffffffff81210596>] ? aio_migratepage+0xa6/0x150
 [<ffffffff81210596>] aio_migratepage+0xa6/0x150
 [<ffffffff811abe29>] move_to_new_page+0x79/0x240
 [<ffffffff811ac8d5>] migrate_pages+0x7a5/0x850
 [<ffffffff81173c50>] ? isolate_freepages_block+0x440/0x440
 [<ffffffff81174bda>] compact_zone+0x2ba/0x510
 [<ffffffff81174ec4>] compact_zone_order+0x94/0xe0
 [<ffffffff81175201>] try_to_compact_pages+0xe1/0x110
 [<ffffffff817388bd>] __alloc_pages_direct_compact+0xac/0x1d0
 [<ffffffff81159946>] __alloc_pages_nodemask+0x996/0xb50
 [<ffffffff8119d6b1>] alloc_pages_vma+0xf1/0x1b0
 [<ffffffff811b121d>] ? do_huge_pmd_anonymous_page+0xfd/0x3a0
 [<ffffffff811b121d>] do_huge_pmd_anonymous_page+0xfd/0x3a0
 [<ffffffff810aa4a6>] ? lock_release_holdtime.part.29+0xe6/0x160
 [<ffffffff8117c279>] handle_mm_fault+0x479/0xbb0
 [<ffffffff810a9f27>] ? __lock_is_held+0x57/0x80
 [<ffffffff8117cb5e>] __get_user_pages+0x1ae/0x5f0
 [<ffffffff8117ebec>] __mlock_vma_pages_range+0x8c/0xa0
 [<ffffffff8117f360>] __mm_populate+0xc0/0x150
 [<ffffffff8116d786>] vm_mmap_pgoff+0xb6/0xc0
 [<ffffffff81181676>] SyS_mmap_pgoff+0x116/0x270
 [<ffffffff8174fa29>] ia32_do_call+0x13/0x13
Code: c2 b6 75 a2 81 31 c0 be fb 0b 00 00 48 c7 c7 00 b6 a2 81 e8 b2 6d fa ff eb a8 44 89 fa 4d 8b 6c d6 08 4d 85 ed 0f 84 cb fe ff ff <f0> 41 ff 85 98 01 00 00 8b 05 b9 28 9b 01 45 8b bc 24 00 07 00 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in aio_migratepage
  2013-11-26  3:26 GPF in aio_migratepage Dave Jones
@ 2013-11-26  6:01 ` Dave Jones
  2013-11-26  7:19   ` Kent Overstreet
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Jones @ 2013-11-26  6:01 UTC (permalink / raw)
  To: Linux Kernel, kmo

On Mon, Nov 25, 2013 at 10:26:45PM -0500, Dave Jones wrote:
 > Hi Kent,
 > 
 > I hit the GPF below on a tree based on 8e45099e029bb6b369b27d8d4920db8caff5ecce
 > which has your commit e34ecee2ae791df674dfb466ce40692ca6218e43
 > ("aio: Fix a trinity splat").  Is this another path your patch missed, or
 > a completely different bug to what you were chasing ?

And here's another from a different path, this time on 32bit.


Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: tun fuse hidp rfcomm bnep scsi_transport_iscsi l2tp_ppp l2tp_netlink l2tp_core nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can_raw can_bcm can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 p8022 psnap llc ax25 nouveau video backlight mxm_wmi wmi i2c_algo_bit ttm drm_kms_helper drm i2c_core kvm_intel kvm tg3 ptp pps_core libphy serio_raw pcspkr lpc_ich microcode mfd_core rtc_cmos parport_pc parport shpchp xfs libcrc32c raid0 floppy
CPU: 0 PID: 4517 Comm: trinity-child0 Not tainted 3.13.0-rc1+ #6 
Hardware name: Dell Inc.                 Precision WorkStation 490    /0DT031, BIOS A08 04/25/2008
task: ed899630 ti: dea22000 task.ti: dea22000
EIP: 0060:[<c11c7a02>] EFLAGS: 00010293 CPU: 0
EIP is at aio_migratepage+0xad/0x126
EAX: 00000144 EBX: f6844ed8 ECX: deaf4a84 EDX: 6b6b6b6b
ESI: f68dc508 EDI: deaf4800 EBP: dea23bcc ESP: dea23ba8
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: 6b6b707b CR3: 2c985000 CR4: 000007f0
Stack:
 00000000 00000001 deaf4a84 00000286 d709b280 00000000 f68dc508 c11c7955
 f6844ed8 dea23c0c c116aa9f 00000001 00000001 c11c7955 c1179a33 00000000
 00000000 c114166d f6844ed8 f6844ed8 c1140fc9 dea23c0c 00000000 f6844ed8
Call Trace:
 [<c11c7955>] ? free_ioctx+0x62/0x62
 [<c116aa9f>] move_to_new_page+0x63/0x1bb
 [<c11c7955>] ? free_ioctx+0x62/0x62
 [<c1179a33>] ? mem_cgroup_prepare_migration+0xc1/0x243
 [<c114166d>] ? isolate_migratepages_range+0x3fb/0x675
 [<c1140fc9>] ? isolate_freepages_block+0x316/0x316
 [<c116b319>] migrate_pages+0x614/0x72b
 [<c1140fc9>] ? isolate_freepages_block+0x316/0x316
 [<c1141c21>] compact_zone+0x294/0x475
 [<c1142065>] try_to_compact_pages+0x129/0x196
 [<c15b95e7>] __alloc_pages_direct_compact+0x91/0x197
 [<c112a25c>] __alloc_pages_nodemask+0x863/0xa55
 [<c116b68f>] get_huge_zero_page+0x52/0xf9
 [<c116ef78>] do_huge_pmd_anonymous_page+0x24e/0x39f
 [<c1171c4b>] ? __mem_cgroup_count_vm_event+0xa6/0x191
 [<c1171c64>] ? __mem_cgroup_count_vm_event+0xbf/0x191
 [<c114815c>] handle_mm_fault+0x235/0xd9a
 [<c15c7586>] ? __do_page_fault+0xf8/0x5a1
 [<c15c75ee>] __do_page_fault+0x160/0x5a1
 [<c15c7586>] ? __do_page_fault+0xf8/0x5a1
 [<c15c7a2f>] ? __do_page_fault+0x5a1/0x5a1
 [<c15c7a3c>] do_page_fault+0xd/0xf
 [<c15c4e7c>] error_code+0x6c/0x74
 [<c114007b>] ? memcg_update_all_caches+0x23/0x6b
 [<c12d0be5>] ? __copy_from_user_ll+0x30/0xdb
 [<c12d0ccf>] _copy_from_user+0x3f/0x55
 [<c1057aa2>] SyS_setrlimit+0x27/0x50
 [<c1044792>] ? SyS_gettimeofday+0x33/0x6d
 [<c12d0798>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c15cb33b>] sysenter_do_call+0x12/0x32
Code: 6e 8d 8f 84 02 00 00 89 c8 89 4d e4 e8 df bf 3f 00 89 45 e8 89 da 89 f0 e8 99 2b fa ff 8b 43 08 3b 47 54 8b 4d e4 73 06 8b 57 50 <89> 34 82 8b 55 e8 89 c8 e8 aa c1 3f 00 8b 45 ec e8 28 c1 3f 00



* Re: GPF in aio_migratepage
  2013-11-26  6:01 ` Dave Jones
@ 2013-11-26  7:19   ` Kent Overstreet
  2013-11-26 15:23     ` Benjamin LaHaise
  0 siblings, 1 reply; 17+ messages in thread
From: Kent Overstreet @ 2013-11-26  7:19 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, Sasha Levin, Benjamin LaHaise

On Tue, Nov 26, 2013 at 01:01:32AM -0500, Dave Jones wrote:
> On Mon, Nov 25, 2013 at 10:26:45PM -0500, Dave Jones wrote:
>  > Hi Kent,
>  > 
>  > I hit the GPF below on a tree based on 8e45099e029bb6b369b27d8d4920db8caff5ecce
>  > which has your commit e34ecee2ae791df674dfb466ce40692ca6218e43
>  > ("aio: Fix a trinity splat").  Is this another path your patch missed, or
>  > a completely different bug to what you were chasing ?
> 
> And here's another from a different path, this time on 32bit.

I'm pretty sure this is a different bug... it appears to be related to
aio ring buffer migration, which I don't think I've touched.

Any information on what it was doing at the time? I see exit_aio() in
the second backtrace, maybe some sort of race between migratepage and
ioctx teardown? But it is using the address space mapping, so I dunno.

I don't see what's protecting ctx->ring_pages - I imagine it's got to
have something to do with the page migration machinery but I have no
idea how that works. Ben?

> Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in: tun fuse hidp rfcomm bnep scsi_transport_iscsi l2tp_ppp l2tp_netlink l2tp_core nfc caif_socket caif af_802154 phonet af_rxrpc bluetooth rfkill can_raw can_bcm can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key rose x25 atm netrom appletalk ipx p8023 p8022 psnap llc ax25 nouveau video backlight mxm_wmi wmi i2c_algo_bit ttm drm_kms_helper drm i2c_core kvm_intel kvm tg3 ptp pps_core libphy serio_raw pcspkr lpc_ich microcode mfd_core rtc_cmos parport_pc parport shpchp xfs libcrc32c raid0 floppy
> CPU: 0 PID: 4517 Comm: trinity-child0 Not tainted 3.13.0-rc1+ #6 
> Hardware name: Dell Inc.                 Precision WorkStation 490    /0DT031, BIOS A08 04/25/2008
> task: ed899630 ti: dea22000 task.ti: dea22000
> EIP: 0060:[<c11c7a02>] EFLAGS: 00010293 CPU: 0
> EIP is at aio_migratepage+0xad/0x126
> EAX: 00000144 EBX: f6844ed8 ECX: deaf4a84 EDX: 6b6b6b6b
> ESI: f68dc508 EDI: deaf4800 EBP: dea23bcc ESP: dea23ba8
>  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> CR0: 8005003b CR2: 6b6b707b CR3: 2c985000 CR4: 000007f0
> Stack:
>  00000000 00000001 deaf4a84 00000286 d709b280 00000000 f68dc508 c11c7955
>  f6844ed8 dea23c0c c116aa9f 00000001 00000001 c11c7955 c1179a33 00000000
>  00000000 c114166d f6844ed8 f6844ed8 c1140fc9 dea23c0c 00000000 f6844ed8
> Call Trace:
>  [<c11c7955>] ? free_ioctx+0x62/0x62
>  [<c116aa9f>] move_to_new_page+0x63/0x1bb
>  [<c11c7955>] ? free_ioctx+0x62/0x62
>  [<c1179a33>] ? mem_cgroup_prepare_migration+0xc1/0x243
>  [<c114166d>] ? isolate_migratepages_range+0x3fb/0x675
>  [<c1140fc9>] ? isolate_freepages_block+0x316/0x316
>  [<c116b319>] migrate_pages+0x614/0x72b
>  [<c1140fc9>] ? isolate_freepages_block+0x316/0x316
>  [<c1141c21>] compact_zone+0x294/0x475
>  [<c1142065>] try_to_compact_pages+0x129/0x196
>  [<c15b95e7>] __alloc_pages_direct_compact+0x91/0x197
>  [<c112a25c>] __alloc_pages_nodemask+0x863/0xa55
>  [<c116b68f>] get_huge_zero_page+0x52/0xf9
>  [<c116ef78>] do_huge_pmd_anonymous_page+0x24e/0x39f
>  [<c1171c4b>] ? __mem_cgroup_count_vm_event+0xa6/0x191
>  [<c1171c64>] ? __mem_cgroup_count_vm_event+0xbf/0x191
>  [<c114815c>] handle_mm_fault+0x235/0xd9a
>  [<c15c7586>] ? __do_page_fault+0xf8/0x5a1
>  [<c15c75ee>] __do_page_fault+0x160/0x5a1
>  [<c15c7586>] ? __do_page_fault+0xf8/0x5a1
>  [<c15c7a2f>] ? __do_page_fault+0x5a1/0x5a1
>  [<c15c7a3c>] do_page_fault+0xd/0xf
>  [<c15c4e7c>] error_code+0x6c/0x74
>  [<c114007b>] ? memcg_update_all_caches+0x23/0x6b
>  [<c12d0be5>] ? __copy_from_user_ll+0x30/0xdb
>  [<c12d0ccf>] _copy_from_user+0x3f/0x55
>  [<c1057aa2>] SyS_setrlimit+0x27/0x50
>  [<c1044792>] ? SyS_gettimeofday+0x33/0x6d
>  [<c12d0798>] ? trace_hardirqs_on_thunk+0xc/0x10
>  [<c15cb33b>] sysenter_do_call+0x12/0x32
> Code: 6e 8d 8f 84 02 00 00 89 c8 89 4d e4 e8 df bf 3f 00 89 45 e8 89 da 89 f0 e8 99 2b fa ff 8b 43 08 3b 47 54 8b 4d e4 73 06 8b 57 50 <89> 34 82 8b 55 e8 89 c8 e8 aa c1 3f 00 8b 45 ec e8 28 c1 3f 00
> 


* Re: GPF in aio_migratepage
  2013-11-26  7:19   ` Kent Overstreet
@ 2013-11-26 15:23     ` Benjamin LaHaise
  2013-11-26 15:56       ` Dave Jones
  2013-11-30 15:28       ` Kristian Nielsen
  0 siblings, 2 replies; 17+ messages in thread
From: Benjamin LaHaise @ 2013-11-26 15:23 UTC (permalink / raw)
  To: Kent Overstreet; +Cc: Dave Jones, Linux Kernel, Sasha Levin

On Mon, Nov 25, 2013 at 11:19:53PM -0800, Kent Overstreet wrote:
> On Tue, Nov 26, 2013 at 01:01:32AM -0500, Dave Jones wrote:
> > On Mon, Nov 25, 2013 at 10:26:45PM -0500, Dave Jones wrote:
> >  > Hi Kent,
> >  > 
> >  > I hit the GPF below on a tree based on 8e45099e029bb6b369b27d8d4920db8caff5ecce
> >  > which has your commit e34ecee2ae791df674dfb466ce40692ca6218e43
> >  > ("aio: Fix a trinity splat").  Is this another path your patch missed, or
> >  > a completely different bug to what you were chasing ?
> > 
> > And here's another from a different path, this time on 32bit.

For Dave: what line is this bug on?  Is it the dereference of ctx when 
doing spin_lock_irqsave(&ctx->completion_lock, flags); or is the 
ctx->ring_pages[idx] = new; ?  From the 64 bit splat, I'm thinking the 
former, which is quite strange given that the clearing of 
mapping->private_data is protected by mapping->private_lock.  If it's 
the latter, we might well need to check if ctx->ring_pages is NULL during 
setup. 

Actually, is there an easy way to reproduce this with Trinity?  I can have a
look if you point me in the right direction.

> I'm pretty sure this is a different bug... it appears to be related to
> aio ring buffer migration, which I don't think I've touched.
> 
> Any information on what it was doing at the time? I see exit_aio() in
> the second backtrace, maybe some sort of race between migratepage and
> ioctx teardown? But it is using the address space mapping, so I dunno.

Teardown should be protected by mapping->private_lock (see put_aio_ring_file(),
which takes mapping->private_lock to protect aio_migratepage() against
accessing the ioctx after releasing the private file for the mapping).

		-ben

> I don't see what's protecting ctx->ring_pages - I imagine it's got to
> have something to do with the page migration machinery but I have no
> idea how that works. Ben?
> > ESI: f68dc508 EDI: deaf4800 EBP: dea23bcc ESP: dea23ba8
> >  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> > CR0: 8005003b CR2: 6b6b707b CR3: 2c985000 CR4: 000007f0
> > Stack:
> >  00000000 00000001 deaf4a84 00000286 d709b280 00000000 f68dc508 c11c7955
> >  f6844ed8 dea23c0c c116aa9f 00000001 00000001 c11c7955 c1179a33 00000000
> >  00000000 c114166d f6844ed8 f6844ed8 c1140fc9 dea23c0c 00000000 f6844ed8
> > Call Trace:
> >  [<c11c7955>] ? free_ioctx+0x62/0x62
> >  [<c116aa9f>] move_to_new_page+0x63/0x1bb
> >  [<c11c7955>] ? free_ioctx+0x62/0x62
> >  [<c1179a33>] ? mem_cgroup_prepare_migration+0xc1/0x243
> >  [<c114166d>] ? isolate_migratepages_range+0x3fb/0x675
> >  [<c1140fc9>] ? isolate_freepages_block+0x316/0x316
> >  [<c116b319>] migrate_pages+0x614/0x72b
> >  [<c1140fc9>] ? isolate_freepages_block+0x316/0x316
> >  [<c1141c21>] compact_zone+0x294/0x475
> >  [<c1142065>] try_to_compact_pages+0x129/0x196
> >  [<c15b95e7>] __alloc_pages_direct_compact+0x91/0x197
> >  [<c112a25c>] __alloc_pages_nodemask+0x863/0xa55
> >  [<c116b68f>] get_huge_zero_page+0x52/0xf9
> >  [<c116ef78>] do_huge_pmd_anonymous_page+0x24e/0x39f
> >  [<c1171c4b>] ? __mem_cgroup_count_vm_event+0xa6/0x191
> >  [<c1171c64>] ? __mem_cgroup_count_vm_event+0xbf/0x191
> >  [<c114815c>] handle_mm_fault+0x235/0xd9a
> >  [<c15c7586>] ? __do_page_fault+0xf8/0x5a1
> >  [<c15c75ee>] __do_page_fault+0x160/0x5a1
> >  [<c15c7586>] ? __do_page_fault+0xf8/0x5a1
> >  [<c15c7a2f>] ? __do_page_fault+0x5a1/0x5a1
> >  [<c15c7a3c>] do_page_fault+0xd/0xf
> >  [<c15c4e7c>] error_code+0x6c/0x74
> >  [<c114007b>] ? memcg_update_all_caches+0x23/0x6b
> >  [<c12d0be5>] ? __copy_from_user_ll+0x30/0xdb
> >  [<c12d0ccf>] _copy_from_user+0x3f/0x55
> >  [<c1057aa2>] SyS_setrlimit+0x27/0x50
> >  [<c1044792>] ? SyS_gettimeofday+0x33/0x6d
> >  [<c12d0798>] ? trace_hardirqs_on_thunk+0xc/0x10
> >  [<c15cb33b>] sysenter_do_call+0x12/0x32
> > Code: 6e 8d 8f 84 02 00 00 89 c8 89 4d e4 e8 df bf 3f 00 89 45 e8 89 da 89 f0 e8 99 2b fa ff 8b 43 08 3b 47 54 8b 4d e4 73 06 8b 57 50 <89> 34 82 8b 55 e8 89 c8 e8 aa c1 3f 00 8b 45 ec e8 28 c1 3f 00
> > 

-- 
"Thought is the essence of where you are now."


* Re: GPF in aio_migratepage
  2013-11-26 15:23     ` Benjamin LaHaise
@ 2013-11-26 15:56       ` Dave Jones
  2013-12-03  9:02         ` Gu Zheng
  2013-11-30 15:28       ` Kristian Nielsen
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Jones @ 2013-11-26 15:56 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Kent Overstreet, Linux Kernel, Sasha Levin

On Tue, Nov 26, 2013 at 10:23:37AM -0500, Benjamin LaHaise wrote:
 > On Mon, Nov 25, 2013 at 11:19:53PM -0800, Kent Overstreet wrote:
 > > On Tue, Nov 26, 2013 at 01:01:32AM -0500, Dave Jones wrote:
 > > > On Mon, Nov 25, 2013 at 10:26:45PM -0500, Dave Jones wrote:
 > > >  > Hi Kent,
 > > >  > 
 > > >  > I hit the GPF below on a tree based on 8e45099e029bb6b369b27d8d4920db8caff5ecce
 > > >  > which has your commit e34ecee2ae791df674dfb466ce40692ca6218e43
 > > >  > ("aio: Fix a trinity splat").  Is this another path your patch missed, or
 > > >  > a completely different bug to what you were chasing ?
 > > > 
 > > > And here's another from a different path, this time on 32bit.
 > 
 > For Dave: what line is this bug on?  Is it the dereference of ctx when 
 > doing spin_lock_irqsave(&ctx->completion_lock, flags); or is the 
 > ctx->ring_pages[idx] = new; ?

From the 32bit trace:

> EIP is at aio_migratepage+0xad/0x126

disasm of aio.o shows aio_migratepage at 0x6f5, which means we oopsed at 7a2...



                        ctx->ring_pages[idx] = new;
     79f:       8b 57 50                mov    0x50(%edi),%edx
     7a2:       89 34 82                mov    %esi,(%edx,%eax,4)
        raw_spin_unlock_irq(&lock->rlock);

which matches up with the Code: line.

So that's actually..

	spin_unlock_irqrestore(&ctx->completion_lock, flags);


The 64bit trace looks a little funky due to gcc optimising and moving
things around, but I think it's the same thing except this time it's
in the lock acquire path instead of lock release.

> aio_migratepage+0xa6/0x150

aio_migratepage is at 0x540, and at 0x5e6, we see...

         */
        spin_lock(&mapping->private_lock);
        ctx = mapping->private_data;
     5c3:       4d 8b ad a8 01 00 00    mov    0x1a8(%r13),%r13
        if (ctx) {
     5ca:       4d 85 ed                test   %r13,%r13
     5cd:       0f 84 85 00 00 00       je     658 <aio_migratepage+0x118>
                pgoff_t idx;
                spin_lock_irqsave(&ctx->completion_lock, flags);
     5d3:       49 8d 95 c8 02 00 00    lea    0x2c8(%r13),%rdx
     5da:       48 89 d7                mov    %rdx,%rdi
     5dd:       48 89 55 c8             mov    %rdx,-0x38(%rbp)
     5e1:       e8 00 00 00 00          callq  5e6 <aio_migratepage+0xa6>
                migrate_page_copy(new, old);
     5e6:       48 89 de                mov    %rbx,%rsi
     5e9:       4c 89 e7                mov    %r12,%rdi
         */
        spin_lock(&mapping->private_lock);
        ctx = mapping->private_data;
        if (ctx) {
                pgoff_t idx;
                spin_lock_irqsave(&ctx->completion_lock, flags);


 > Actually, is there an easy way to reproduce this with Trinity?  I can have a
 > look if you point me in the right direction.

I've not found a simple reproducer recipe yet, working on it.
So far I've just been running it for an hour and waiting. If I can narrow down
the syscalls necessary I'll let you know.
 
	Dave
 


* Re: GPF in aio_migratepage
  2013-11-26 15:23     ` Benjamin LaHaise
  2013-11-26 15:56       ` Dave Jones
@ 2013-11-30 15:28       ` Kristian Nielsen
  2013-12-02 10:10         ` Gu Zheng
  1 sibling, 1 reply; 17+ messages in thread
From: Kristian Nielsen @ 2013-11-30 15:28 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Kent Overstreet, Dave Jones, Linux Kernel, Sasha Levin

Benjamin LaHaise <bcrl@kvack.org> writes:

> For Dave: what line is this bug on?  Is it the dereference of ctx when 
> doing spin_lock_irqsave(&ctx->completion_lock, flags); or is the 
> ctx->ring_pages[idx] = new; ?  From the 64 bit splat, I'm thinking the 
> former, which is quite strange given that the clearing of 
> mapping->private_data is protected by mapping->private_lock.  If it's 
> the latter, we might well need to check if ctx->ring_pages is NULL during 
> setup. 

I think I got the same BUG (at least it looks very similar, full details
below).

The bug is on this line:

    ctx->ring_pages[idx] = new;

Disassembly:

    af7:   48 89 2c d1    mov    %rbp,(%rcx,%rdx,8)

ctx->ring_pages is 0xffffffffffffffff (this is x86_64). idx is 13.

  RCX: ffffffffffffffff  RDX: 000000000000000d
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000067

So we are de-referencing a pointer that is (page **)-1, causing the crash.

If you look closer at the 32-bit dump that Dave gave, you can see that it is
similar:

     7a2:       89 34 82                mov    %esi,(%edx,%eax,4)

  RAX: 6b6b6b6b6b6b6b6b  RDX: 0000000000000000

Though in this case ctx->ring_pages seems to be NULL and idx=old->index seems
to be 6b6b6b6b6b6b6b6b, so not completely the same (or maybe I read his dump
incorrectly).

This is 3.13-rc1. Unfortunately, I do not have a way to reproduce (so far I
only saw it this once). But I can see if it turns up again, or should I
install -rc2 and see if it goes away?

I was not doing anything special at the time, normal desktop load (I was using
the evince pdf viewer).

Let me know if there is anything else I can do to help track this down?

 - Kristian.

Full details:

I put my .config here:

    http://knielsen-hq.org/config-3.13-rc1-gpf-in-aio-migratepage.txt

BUG output:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000067
IP: [<ffffffff8113d73f>] aio_migratepage+0xb3/0xe4
PGD 0 
Oops: 0002 [#1] SMP 
Modules linked in: tun parport_pc ppdev lp parport bnep rfcomm bluetooth cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative binfmt_misc uinput fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc ext3 jbd loop snd_hda_codec_hdmi hid_generic usbhid hid joydev ums_realtek usb_storage snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support arc4 brcmsmac cordic brcmutil b43 mac80211 cfg80211 ssb mmc_core rfkill rng_core pcmcia pcmcia_core nouveau mxm_wmi wmi x86_pkg_temp_thermal coretemp snd_hda_intel kvm_intel snd_hda_codec snd_hwdep snd_pcm_oss kvm snd_mixer_oss snd_seq_midi snd_seq_midi_event snd_pcm crc32c_intel snd_rawmidi snd_page_alloc snd_seq ghash_clmulni_intel snd_timer snd_seq_device lpc_ich aesni_intel mfd_core ttm battery aes_x86_64 ablk_helper drm_kms_helper cryptd lrw gf128mul drm glue_helper psmouse snd pcspkr serio_raw i2c_i801 evdev ehci_pci soundcore ehci_hcd bcma ac acpi_cpufreq video button processor ext4 crc16 jbd2 mbc
r_mod cdrom crc_t10dif crct10dif_common microcode ahci libahci xhci_hcd libata usbcore scsi_mod usb_common fan thermal thermal_sys r8169 mii
CPU: 2 PID: 15596 Comm: evince Not tainted 3.13.0-rc1-kn #1
Hardware name: Compal PBL2021/Base Board Product Name, BIOS 2.40 08/26/2011
task: ffff88010322f7c0 ti: ffff880102b48000 task.ti: ffff880102b48000
RIP: 0010:[<ffffffff8113d73f>]  [<ffffffff8113d73f>] aio_migratepage+0xb3/0xe4
RSP: 0018:ffff880102b49798  EFLAGS: 00010213
RAX: 0000000000000286 RBX: ffffea00038f1640 RCX: ffffffffffffffff
RDX: 000000000000000d RSI: ffffea00038f1640 RDI: ffffea00038f1640
RBP: ffffea0007b6a800 R08: 0000000000000000 R09: 000000000000000d
R10: 0000000000000038 R11: ffffea0007b6a800 R12: ffff880144a30d00
R13: 0000000000000000 R14: ffff88014ba5b1f8 R15: ffff880144a30ec4
FS:  00007f68ecfe8960(0000) GS:ffff88024f480000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000067 CR3: 0000000051ee8000 CR4: 00000000000407e0
Stack:
 000000000000000e 0000000000000286 ffff88024f7f6d80 ffffea00038f1640
 ffffea0007b6a800 0000000000000000 ffff88014ba5b170 0000000000000001
 0000000000000001 ffffffff810ffc68 ffff88014ba5b1a8 0000000000000000
Call Trace:
 [<ffffffff810ffc68>] ? move_to_new_page+0x84/0x1ab
 [<ffffffff810cbcbd>] ? get_page+0x9/0x25
 [<ffffffff8110019e>] ? migrate_pages+0x330/0x524
 [<ffffffff810dac77>] ? isolate_freepages_block+0x237/0x237
 [<ffffffff810db651>] ? compact_zone+0x13a/0x301
 [<ffffffff810dba3e>] ? compact_zone_order+0x94/0xa7
 [<ffffffff810dbae9>] ? try_to_compact_pages+0x98/0xec
 [<ffffffff8138ef42>] ? __alloc_pages_direct_compact+0xa9/0x19a
 [<ffffffff810c8567>] ? __alloc_pages_nodemask+0x46f/0x7f3
 [<ffffffff812cf2bc>] ? __kmalloc_reserve.isra.42+0x2a/0x6d
 [<ffffffff810f64df>] ? alloc_pages_current+0xac/0xc6
 [<ffffffff812cbd47>] ? sock_alloc_send_pskb+0x1fc/0x345
 [<ffffffff812d2625>] ? memcpy_fromiovecend+0x48/0x6f
 [<ffffffff812d2ac5>] ? skb_copy_datagram_from_iovec+0x128/0x1f2
 [<ffffffff812ca529>] ? sk_wake_async+0x19/0x3c
 [<ffffffff8134c605>] ? unix_stream_sendmsg+0x12e/0x2e9
 [<ffffffff812c8001>] ? sock_aio_write+0xc0/0xd5
 [<ffffffff81115581>] ? set_restore_sigmask+0x2d/0x2d
 [<ffffffff81106da4>] ? do_sync_readv_writev+0x48/0x6b
 [<ffffffff812c7f41>] ? sock_alloc_file+0x119/0x119
 [<ffffffff81107e9c>] ? do_readv_writev+0xb4/0x121
 [<ffffffff812c7f41>] ? sock_alloc_file+0x119/0x119
 [<ffffffff810015d7>] ? __switch_to+0x1b1/0x3de
 [<ffffffff8111c1ce>] ? fget_light+0x6b/0x7c
 [<ffffffff81106d10>] ? fdget+0xe/0x17
 [<ffffffff8110807d>] ? SyS_writev+0x51/0xaa
 [<ffffffff813997e2>] ? system_call_fastpath+0x16/0x1b
Code: 48 89 de 48 89 ef 48 89 44 24 08 e8 03 22 fc ff 48 8b 53 10 49 3b 94 24 a0 00 00 00 48 8b 44 24 08 73 0c 49 8b 8c 24 98 00 00 00 <48> 89 2c d1 48 89 c6 4c 89 ff e8 74 6e 25 00 eb 06 41 bd f0 ff 
RIP  [<ffffffff8113d73f>] aio_migratepage+0xb3/0xe4
 RSP <ffff880102b49798>
CR2: 0000000000000067
---[ end trace be5b4877a98efec5 ]---
------------[ cut here ]------------


After this I got lots of stuff like

  WARNING: CPU: 4 PID: 15642 at kernel/watchdog.c:245 watchdog_overflow_callback+0x80/0xa3()
  Watchdog detected hard LOCKUP on cpu 4
  BUG: soft lockup - CPU#3 stuck for 22s! [EvJobScheduler:15653]

But I assume that is just due to crashing with two spinlocks held.


Disassembly of aio_migratepage():

0000000000000a44 <aio_migratepage>:
     a44:       41 57                   push   %r15
     a46:       41 56                   push   %r14
     a48:       41 55                   push   %r13
     a4a:       41 54                   push   %r12
     a4c:       55                      push   %rbp
     a4d:       53                      push   %rbx
     a4e:       48 89 d3                mov    %rdx,%rbx
     a51:       48 83 ec 18             sub    $0x18,%rsp
     a55:       48 8b 02                mov    (%rdx),%rax
     a58:       f6 c4 20                test   $0x20,%ah
     a5b:       74 02                   je     a5f <aio_migratepage+0x1b>
     a5d:       0f 0b                   ud2    
     a5f:       49 89 fc                mov    %rdi,%r12
     a62:       48 89 d7                mov    %rdx,%rdi
     a65:       48 89 f5                mov    %rsi,%rbp
     a68:       89 4c 24 08             mov    %ecx,0x8(%rsp)
     a6c:       e8 00 00 00 00          callq  a71 <aio_migratepage+0x2d>
     a71:       44 8b 44 24 08          mov    0x8(%rsp),%r8d
     a76:       31 c9                   xor    %ecx,%ecx
     a78:       48 89 da                mov    %rbx,%rdx
     a7b:       48 89 ee                mov    %rbp,%rsi
     a7e:       4c 89 e7                mov    %r12,%rdi
     a81:       e8 00 00 00 00          callq  a86 <aio_migratepage+0x42>
     a86:       85 c0                   test   %eax,%eax
     a88:       41 89 c5                mov    %eax,%r13d
     a8b:       74 0a                   je     a97 <aio_migratepage+0x53>
     a8d:       48 89 df                mov    %rbx,%rdi
     a90:       e8 92 ff ff ff          callq  a27 <get_page>
     a95:       eb 7f                   jmp    b16 <aio_migratepage+0xd2>
     a97:       4d 8d b4 24 88 00 00    lea    0x88(%r12),%r14
     a9e:       00 
     a9f:       48 89 ef                mov    %rbp,%rdi
     aa2:       e8 80 ff ff ff          callq  a27 <get_page>
     aa7:       4c 89 f7                mov    %r14,%rdi
     aaa:       e8 00 00 00 00          callq  aaf <aio_migratepage+0x6b>
     aaf:       4d 8b a4 24 a0 00 00    mov    0xa0(%r12),%r12
     ab6:       00 
     ab7:       4d 85 e4                test   %r12,%r12
     aba:       74 4c                   je     b08 <aio_migratepage+0xc4>
     abc:       4d 8d bc 24 c4 01 00    lea    0x1c4(%r12),%r15
     ac3:       00 
     ac4:       4c 89 ff                mov    %r15,%rdi
     ac7:       e8 00 00 00 00          callq  acc <aio_migratepage+0x88>
     acc:       48 89 de                mov    %rbx,%rsi
     acf:       48 89 ef                mov    %rbp,%rdi
     ad2:       48 89 44 24 08          mov    %rax,0x8(%rsp)
     ad7:       e8 00 00 00 00          callq  adc <aio_migratepage+0x98>
     adc:       48 8b 53 10             mov    0x10(%rbx),%rdx
     ae0:       49 3b 94 24 a0 00 00    cmp    0xa0(%r12),%rdx
     ae7:       00 
     ae8:       48 8b 44 24 08          mov    0x8(%rsp),%rax
     aed:       73 0c                   jae    afb <aio_migratepage+0xb7>
     aef:       49 8b 8c 24 98 00 00    mov    0x98(%r12),%rcx
     af6:       00 
# We get the crash on this next instruction, %rcx is 0xffffffffffffffff
     af7:       48 89 2c d1             mov    %rbp,(%rcx,%rdx,8)
     afb:       48 89 c6                mov    %rax,%rsi
     afe:       4c 89 ff                mov    %r15,%rdi
     b01:       e8 00 00 00 00          callq  b06 <aio_migratepage+0xc2>
     b06:       eb 06                   jmp    b0e <aio_migratepage+0xca>
     b08:       41 bd f0 ff ff ff       mov    $0xfffffff0,%r13d
     b0e:       4c 89 f7                mov    %r14,%rdi
     b11:       e8 b7 fa ff ff          callq  5cd <spin_unlock>
     b16:       48 83 c4 18             add    $0x18,%rsp
     b1a:       44 89 e8                mov    %r13d,%eax
     b1d:       5b                      pop    %rbx
     b1e:       5d                      pop    %rbp
     b1f:       41 5c                   pop    %r12
     b21:       41 5d                   pop    %r13
     b23:       41 5e                   pop    %r14
     b25:       41 5f                   pop    %r15
     b27:       c3                      retq   


* Re: GPF in aio_migratepage
  2013-11-30 15:28       ` Kristian Nielsen
@ 2013-12-02 10:10         ` Gu Zheng
  2013-12-02 10:49           ` Kristian Nielsen
  2013-12-02 17:49           ` Dave Jones
  0 siblings, 2 replies; 17+ messages in thread
From: Gu Zheng @ 2013-12-02 10:10 UTC (permalink / raw)
  To: Kristian Nielsen, Dave Jones
  Cc: Benjamin LaHaise, Kent Overstreet, Linux Kernel, Sasha Levin

Hi Kristian, Dave,

Could you please check whether the following patch fixes this issue?


Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
---
 fs/aio.c |   28 ++++++++++------------------
 1 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 08159ed..fc1fd0a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -223,33 +223,25 @@ static int __init aio_setup(void)
 }
 __initcall(aio_setup);
 
-static void put_aio_ring_file(struct kioctx *ctx)
-{
-	struct file *aio_ring_file = ctx->aio_ring_file;
-	if (aio_ring_file) {
-		truncate_setsize(aio_ring_file->f_inode, 0);
-
-		/* Prevent further access to the kioctx from migratepages */
-		spin_lock(&aio_ring_file->f_inode->i_mapping->private_lock);
-		aio_ring_file->f_inode->i_mapping->private_data = NULL;
-		ctx->aio_ring_file = NULL;
-		spin_unlock(&aio_ring_file->f_inode->i_mapping->private_lock);
-
-		fput(aio_ring_file);
-	}
-}
-
 static void aio_free_ring(struct kioctx *ctx)
 {
+	struct file *aio_ring_file = ctx->aio_ring_file;
 	int i;
 
+	BUG_ON(!aio_ring_file);
+
+	spin_lock(&aio_ring_file->f_inode->i_mapping->private_lock);
 	for (i = 0; i < ctx->nr_pages; i++) {
 		pr_debug("pid(%d) [%d] page->count=%d\n", current->pid, i,
 				page_count(ctx->ring_pages[i]));
 		put_page(ctx->ring_pages[i]);
 	}
-
-	put_aio_ring_file(ctx);
+	truncate_setsize(aio_ring_file->f_inode, 0);
+	/* Prevent further access to the kioctx from migratepages */
+	aio_ring_file->f_inode->i_mapping->private_data = NULL;
+	ctx->aio_ring_file = NULL;
+	spin_unlock(&aio_ring_file->f_inode->i_mapping->private_lock);
+	fput(aio_ring_file);
 
 	if (ctx->ring_pages && ctx->ring_pages != ctx->internal_pages) {
 		kfree(ctx->ring_pages);
-- 
1.7.7



On 11/30/2013 11:28 PM, Kristian Nielsen wrote:

> Benjamin LaHaise <bcrl@kvack.org> writes:
> 
>> For Dave: what line is this bug on?  Is it the dereference of ctx when 
>> doing spin_lock_irqsave(&ctx->completion_lock, flags); or is the 
>> ctx->ring_pages[idx] = new; ?  From the 64 bit splat, I'm thinking the 
>> former, which is quite strange given that the clearing of 
>> mapping->private_data is protected by mapping->private_lock.  If it's 
>> the latter, we might well need to check if ctx->ring_pages is NULL during 
>> setup. 
> 
> I think I got the same BUG (at least it looks very similar, full details
> below).
> 
> The bug is on this line:
> 
>     ctx->ring_pages[idx] = new;
> 
> Disassembly:
> 
>     af7:   48 89 2c d1    mov    %rbp,(%rcx,%rdx,8)
> 
> ctx->ring_pages is 0xffffffffffffffff (this is x86_64). idx is 13.
> 
>   RCX: ffffffffffffffff  RDX: 000000000000000d
>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000067
> 
> So we are de-referencing a pointer that is (page **)-1, causing the crash.
> 
> If you look closer at the 32-bit dump that Dave gave, you can see that it is
> similar:
> 
>      7a2:       89 34 82                mov    %esi,(%edx,%eax,4)
> 
>   RAX: 6b6b6b6b6b6b6b6b  RDX: 0000000000000000
> 
> Though in this case ctx->ring_pages seems to be NULL and idx=old->index seems
> to be 6b6b6b6b6b6b6b6b, so not completely the same (or maybe I read his dump
> incorrectly).
> 
> This is 3.13-rc1. Unfortunately, I do not have a way to reproduce (so far I
> only saw it this once). But I can see if it turns up again, or should I
> install -rc2 and see if it goes away?
> 
> I was not doing anything special at the time, normal desktop load (I was using
> the evince pdf viewer).
> 
> Let me know if there is anything else I can do to help track this down?
> 
>  - Kristian.
> 
> Full details:
> 
> I put my .config here:
> 
>     http://knielsen-hq.org/config-3.13-rc1-gpf-in-aio-migratepage.txt
> 
> BUG output:
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000067
> IP: [<ffffffff8113d73f>] aio_migratepage+0xb3/0xe4
> PGD 0 
> Oops: 0002 [#1] SMP 
> Modules linked in: tun parport_pc ppdev lp parport bnep rfcomm bluetooth cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative binfmt_misc uinput fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc ext3 jbd loop snd_hda_codec_hdmi hid_generic usbhid hid joydev ums_realtek usb_storage snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support arc4 brcmsmac cordic brcmutil b43 mac80211 cfg80211 ssb mmc_core rfkill rng_core pcmcia pcmcia_core nouveau mxm_wmi wmi x86_pkg_temp_thermal coretemp snd_hda_intel kvm_intel snd_hda_codec snd_hwdep snd_pcm_oss kvm snd_mixer_oss snd_seq_midi snd_seq_midi_event snd_pcm crc32c_intel snd_rawmidi snd_page_alloc snd_seq ghash_clmulni_intel snd_timer snd_seq_device lpc_ich aesni_intel mfd_core ttm battery aes_x86_64 ablk_helper drm_kms_helper cryptd lrw gf128mul drm glue_helper psmouse snd pcspkr serio_raw i2c_i801 evdev ehci_pci soundcore ehci_hcd bcma ac acpi_cpufreq video button processor ext4 crc16 jbd2 mbc
> r_mod cdrom crc_t10dif crct10dif_common microcode ahci libahci xhci_hcd libata usbcore scsi_mod usb_common fan thermal thermal_sys r8169 mii
> CPU: 2 PID: 15596 Comm: evince Not tainted 3.13.0-rc1-kn #1
> Hardware name: Compal PBL2021/Base Board Product Name, BIOS 2.40 08/26/2011
> task: ffff88010322f7c0 ti: ffff880102b48000 task.ti: ffff880102b48000
> RIP: 0010:[<ffffffff8113d73f>]  [<ffffffff8113d73f>] aio_migratepage+0xb3/0xe4
> RSP: 0018:ffff880102b49798  EFLAGS: 00010213
> RAX: 0000000000000286 RBX: ffffea00038f1640 RCX: ffffffffffffffff
> RDX: 000000000000000d RSI: ffffea00038f1640 RDI: ffffea00038f1640
> RBP: ffffea0007b6a800 R08: 0000000000000000 R09: 000000000000000d
> R10: 0000000000000038 R11: ffffea0007b6a800 R12: ffff880144a30d00
> R13: 0000000000000000 R14: ffff88014ba5b1f8 R15: ffff880144a30ec4
> FS:  00007f68ecfe8960(0000) GS:ffff88024f480000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000067 CR3: 0000000051ee8000 CR4: 00000000000407e0
> Stack:
>  000000000000000e 0000000000000286 ffff88024f7f6d80 ffffea00038f1640
>  ffffea0007b6a800 0000000000000000 ffff88014ba5b170 0000000000000001
>  0000000000000001 ffffffff810ffc68 ffff88014ba5b1a8 0000000000000000
> Call Trace:
>  [<ffffffff810ffc68>] ? move_to_new_page+0x84/0x1ab
>  [<ffffffff810cbcbd>] ? get_page+0x9/0x25
>  [<ffffffff8110019e>] ? migrate_pages+0x330/0x524
>  [<ffffffff810dac77>] ? isolate_freepages_block+0x237/0x237
>  [<ffffffff810db651>] ? compact_zone+0x13a/0x301
>  [<ffffffff810dba3e>] ? compact_zone_order+0x94/0xa7
>  [<ffffffff810dbae9>] ? try_to_compact_pages+0x98/0xec
>  [<ffffffff8138ef42>] ? __alloc_pages_direct_compact+0xa9/0x19a
>  [<ffffffff810c8567>] ? __alloc_pages_nodemask+0x46f/0x7f3
>  [<ffffffff812cf2bc>] ? __kmalloc_reserve.isra.42+0x2a/0x6d
>  [<ffffffff810f64df>] ? alloc_pages_current+0xac/0xc6
>  [<ffffffff812cbd47>] ? sock_alloc_send_pskb+0x1fc/0x345
>  [<ffffffff812d2625>] ? memcpy_fromiovecend+0x48/0x6f
>  [<ffffffff812d2ac5>] ? skb_copy_datagram_from_iovec+0x128/0x1f2
>  [<ffffffff812ca529>] ? sk_wake_async+0x19/0x3c
>  [<ffffffff8134c605>] ? unix_stream_sendmsg+0x12e/0x2e9
>  [<ffffffff812c8001>] ? sock_aio_write+0xc0/0xd5
>  [<ffffffff81115581>] ? set_restore_sigmask+0x2d/0x2d
>  [<ffffffff81106da4>] ? do_sync_readv_writev+0x48/0x6b
>  [<ffffffff812c7f41>] ? sock_alloc_file+0x119/0x119
>  [<ffffffff81107e9c>] ? do_readv_writev+0xb4/0x121
>  [<ffffffff812c7f41>] ? sock_alloc_file+0x119/0x119
>  [<ffffffff810015d7>] ? __switch_to+0x1b1/0x3de
>  [<ffffffff8111c1ce>] ? fget_light+0x6b/0x7c
>  [<ffffffff81106d10>] ? fdget+0xe/0x17
>  [<ffffffff8110807d>] ? SyS_writev+0x51/0xaa
>  [<ffffffff813997e2>] ? system_call_fastpath+0x16/0x1b
> Code: 48 89 de 48 89 ef 48 89 44 24 08 e8 03 22 fc ff 48 8b 53 10 49 3b 94 24 a0 00 00 00 48 8b 44 24 08 73 0c 49 8b 8c 24 98 00 00 00 <48> 89 2c d1 48 89 c6 4c 89 ff e8 74 6e 25 00 eb 06 41 bd f0 ff 
> RIP  [<ffffffff8113d73f>] aio_migratepage+0xb3/0xe4
>  RSP <ffff880102b49798>
> CR2: 0000000000000067
> ---[ end trace be5b4877a98efec5 ]---
> ------------[ cut here ]------------
> 
> 
> After this I got lots of stuff like
> 
>   WARNING: CPU: 4 PID: 15642 at kernel/watchdog.c:245 watchdog_overflow_callback+0x80/0xa3()
>   Watchdog detected hard LOCKUP on cpu 4
>   BUG: soft lockup - CPU#3 stuck for 22s! [EvJobScheduler:15653]
> 
> But I assume that is just due to crashing with two spinlocks held.
> 
> 
> Disassembly of aio_migratepage():
> 
> 0000000000000a44 <aio_migratepage>:
>      a44:       41 57                   push   %r15
>      a46:       41 56                   push   %r14
>      a48:       41 55                   push   %r13
>      a4a:       41 54                   push   %r12
>      a4c:       55                      push   %rbp
>      a4d:       53                      push   %rbx
>      a4e:       48 89 d3                mov    %rdx,%rbx
>      a51:       48 83 ec 18             sub    $0x18,%rsp
>      a55:       48 8b 02                mov    (%rdx),%rax
>      a58:       f6 c4 20                test   $0x20,%ah
>      a5b:       74 02                   je     a5f <aio_migratepage+0x1b>
>      a5d:       0f 0b                   ud2    
>      a5f:       49 89 fc                mov    %rdi,%r12
>      a62:       48 89 d7                mov    %rdx,%rdi
>      a65:       48 89 f5                mov    %rsi,%rbp
>      a68:       89 4c 24 08             mov    %ecx,0x8(%rsp)
>      a6c:       e8 00 00 00 00          callq  a71 <aio_migratepage+0x2d>
>      a71:       44 8b 44 24 08          mov    0x8(%rsp),%r8d
>      a76:       31 c9                   xor    %ecx,%ecx
>      a78:       48 89 da                mov    %rbx,%rdx
>      a7b:       48 89 ee                mov    %rbp,%rsi
>      a7e:       4c 89 e7                mov    %r12,%rdi
>      a81:       e8 00 00 00 00          callq  a86 <aio_migratepage+0x42>
>      a86:       85 c0                   test   %eax,%eax
>      a88:       41 89 c5                mov    %eax,%r13d
>      a8b:       74 0a                   je     a97 <aio_migratepage+0x53>
>      a8d:       48 89 df                mov    %rbx,%rdi
>      a90:       e8 92 ff ff ff          callq  a27 <get_page>
>      a95:       eb 7f                   jmp    b16 <aio_migratepage+0xd2>
>      a97:       4d 8d b4 24 88 00 00    lea    0x88(%r12),%r14
>      a9e:       00 
>      a9f:       48 89 ef                mov    %rbp,%rdi
>      aa2:       e8 80 ff ff ff          callq  a27 <get_page>
>      aa7:       4c 89 f7                mov    %r14,%rdi
>      aaa:       e8 00 00 00 00          callq  aaf <aio_migratepage+0x6b>
>      aaf:       4d 8b a4 24 a0 00 00    mov    0xa0(%r12),%r12
>      ab6:       00 
>      ab7:       4d 85 e4                test   %r12,%r12
>      aba:       74 4c                   je     b08 <aio_migratepage+0xc4>
>      abc:       4d 8d bc 24 c4 01 00    lea    0x1c4(%r12),%r15
>      ac3:       00 
>      ac4:       4c 89 ff                mov    %r15,%rdi
>      ac7:       e8 00 00 00 00          callq  acc <aio_migratepage+0x88>
>      acc:       48 89 de                mov    %rbx,%rsi
>      acf:       48 89 ef                mov    %rbp,%rdi
>      ad2:       48 89 44 24 08          mov    %rax,0x8(%rsp)
>      ad7:       e8 00 00 00 00          callq  adc <aio_migratepage+0x98>
>      adc:       48 8b 53 10             mov    0x10(%rbx),%rdx
>      ae0:       49 3b 94 24 a0 00 00    cmp    0xa0(%r12),%rdx
>      ae7:       00 
>      ae8:       48 8b 44 24 08          mov    0x8(%rsp),%rax
>      aed:       73 0c                   jae    afb <aio_migratepage+0xb7>
>      aef:       49 8b 8c 24 98 00 00    mov    0x98(%r12),%rcx
>      af6:       00 
> # We get the crash on this next instruction, %rcx is 0xffffffffffffffff
>      af7:       48 89 2c d1             mov    %rbp,(%rcx,%rdx,8)
>      afb:       48 89 c6                mov    %rax,%rsi
>      afe:       4c 89 ff                mov    %r15,%rdi
>      b01:       e8 00 00 00 00          callq  b06 <aio_migratepage+0xc2>
>      b06:       eb 06                   jmp    b0e <aio_migratepage+0xca>
>      b08:       41 bd f0 ff ff ff       mov    $0xfffffff0,%r13d
>      b0e:       4c 89 f7                mov    %r14,%rdi
>      b11:       e8 b7 fa ff ff          callq  5cd <spin_unlock>
>      b16:       48 83 c4 18             add    $0x18,%rsp
>      b1a:       44 89 e8                mov    %r13d,%eax
>      b1d:       5b                      pop    %rbx
>      b1e:       5d                      pop    %rbp
>      b1f:       41 5c                   pop    %r12
>      b21:       41 5d                   pop    %r13
>      b23:       41 5e                   pop    %r14
>      b25:       41 5f                   pop    %r15
>      b27:       c3                      retq   
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 




* Re: GPF in aio_migratepage
  2013-12-02 10:10         ` Gu Zheng
@ 2013-12-02 10:49           ` Kristian Nielsen
  2013-12-02 17:49           ` Dave Jones
  1 sibling, 0 replies; 17+ messages in thread
From: Kristian Nielsen @ 2013-12-02 10:49 UTC (permalink / raw)
  To: Gu Zheng
  Cc: Dave Jones, Benjamin LaHaise, Kent Overstreet, Linux Kernel,
	Sasha Levin

Gu Zheng <guz.fnst@cn.fujitsu.com> writes:

> Hi Kristian, Dave,
>
> Could you please help to check whether the following patch can fix this issue?

> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
> ---
>  fs/aio.c |   28 ++++++++++------------------
>  1 files changed, 10 insertions(+), 18 deletions(-)
>

Ok. I've applied the patch to 3.13-rc2 and will give it a spin. I will let you
know if I encounter the failure again with the patch.

Thanks,

 - Kristian.


* Re: GPF in aio_migratepage
  2013-12-02 10:10         ` Gu Zheng
  2013-12-02 10:49           ` Kristian Nielsen
@ 2013-12-02 17:49           ` Dave Jones
  2013-12-15 21:59             ` Kristian Nielsen
  1 sibling, 1 reply; 17+ messages in thread
From: Dave Jones @ 2013-12-02 17:49 UTC (permalink / raw)
  To: Gu Zheng
  Cc: Kristian Nielsen, Benjamin LaHaise, Kent Overstreet, Linux Kernel,
	Sasha Levin

On Mon, Dec 02, 2013 at 06:10:46PM +0800, Gu Zheng wrote:
 > Hi Kristian, Dave,
 > 
 > Could you please help to check whether the following patch can fix this issue?

This introduces some locking bugs:


[  222.327950] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616
[  222.328004] in_atomic(): 1, irqs_disabled(): 0, pid: 12794, name: trinity-child1
[  222.328044] 1 lock held by trinity-child1/12794:
[  222.328072]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
[  222.328147] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
[  222.328268]  0000000000000268 ffff880229517d68 ffffffff8173bc52 0000000000000000
[  222.328320]  ffff880229517d90 ffffffff8108ad95 ffff880223b6acd0 0000000000000000
[  222.328370]  0000000000000000 ffff880229517e08 ffffffff81741cf3 ffff880229517dc0
[  222.328421] Call Trace:
[  222.328443]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
[  222.328475]  [<ffffffff8108ad95>] __might_sleep+0x175/0x200
[  222.328510]  [<ffffffff81741cf3>] mutex_lock_nested+0x33/0x400
[  222.328545]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
[  222.328582]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
[  222.328617]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
[  222.328651]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
[  222.328684]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
[  222.328717]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2

[  222.328769] ======================================================
[  222.328804] [ INFO: possible circular locking dependency detected ]
[  222.328838] 3.13.0-rc2+ #12 Not tainted
[  222.328862] -------------------------------------------------------
[  222.328896] trinity-child1/12794 is trying to acquire lock:
[  222.328928]  (&mapping->i_mmap_mutex){+.+...}, at: [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
[  222.328987] 
but task is already holding lock:
[  222.329020]  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
[  222.329081] 
which lock already depends on the new lock.

[  222.329125] 
the existing dependency chain (in reverse order) is:
[  222.329166] 
-> #2 (&(&mapping->private_lock)->rlock){+.+...}:
[  222.329211]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
[  222.329248]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
[  222.329285]        [<ffffffff811f334d>] __set_page_dirty_buffers+0x2d/0xb0
[  222.331243]        [<ffffffff8115aada>] set_page_dirty+0x3a/0x60
[  222.334437]        [<ffffffff81179a7f>] unmap_single_vma+0x62f/0x830
[  222.337633]        [<ffffffff8117ad19>] unmap_vmas+0x49/0x90
[  222.340819]        [<ffffffff811804bd>] unmap_region+0x9d/0x110
[  222.343968]        [<ffffffff811829f6>] do_munmap+0x226/0x3b0
[  222.346689]        [<ffffffff81182bc4>] vm_munmap+0x44/0x60
[  222.349741]        [<ffffffff81183b42>] SyS_munmap+0x22/0x30
[  222.352758]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
[  222.355735] 
-> #1 (&(ptlock_ptr(page))->rlock#2){+.+...}:
[  222.361611]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
[  222.364589]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
[  222.367200]        [<ffffffff81186338>] __page_check_address+0x98/0x160
[  222.370168]        [<ffffffff811864fe>] page_mkclean+0xfe/0x1c0
[  222.373120]        [<ffffffff8115ad60>] clear_page_dirty_for_io+0x60/0x100
[  222.376076]        [<ffffffff8124d207>] mpage_submit_page+0x47/0x80
[  222.379015]        [<ffffffff8124d350>] mpage_process_page_bufs+0x110/0x130
[  222.381955]        [<ffffffff8124d91b>] mpage_prepare_extent_to_map+0x22b/0x2f0
[  222.384895]        [<ffffffff8125318f>] ext4_writepages+0x4ef/0x1050
[  222.387839]        [<ffffffff8115cdf1>] do_writepages+0x21/0x50
[  222.390786]        [<ffffffff81150959>] __filemap_fdatawrite_range+0x59/0x60
[  222.393747]        [<ffffffff81150a5d>] filemap_write_and_wait_range+0x2d/0x70
[  222.396729]        [<ffffffff812498ca>] ext4_sync_file+0xba/0x4d0
[  222.399714]        [<ffffffff811f1691>] do_fsync+0x51/0x80
[  222.402317]        [<ffffffff811f1980>] SyS_fsync+0x10/0x20
[  222.405240]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
[  222.407760] 
-> #0 (&mapping->i_mmap_mutex){+.+...}:
[  222.413349]        [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
[  222.416127]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
[  222.418826]        [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
[  222.421456]        [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
[  222.424085]        [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
[  222.426696]        [<ffffffff81160a72>] truncate_setsize+0x12/0x20
[  222.428955]        [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
[  222.431509]        [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
[  222.434069]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
[  222.436308] 
other info that might help us debug this:

[  222.443857] Chain exists of:
  &mapping->i_mmap_mutex --> &(ptlock_ptr(page))->rlock#2 --> &(&mapping->private_lock)->rlock

[  222.451618]  Possible unsafe locking scenario:

[  222.456831]        CPU0                    CPU1
[  222.459413]        ----                    ----
[  222.461958]   lock(&(&mapping->private_lock)->rlock);
[  222.464505]                                lock(&(ptlock_ptr(page))->rlock#2);
[  222.467094]                                lock(&(&mapping->private_lock)->rlock);
[  222.469625]   lock(&mapping->i_mmap_mutex);
[  222.472111] 
 *** DEADLOCK ***

[  222.478392] 1 lock held by trinity-child1/12794:
[  222.480744]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
[  222.483240] 
stack backtrace:
[  222.488119] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
[  222.493016]  ffffffff824cb110 ffff880229517c30 ffffffff8173bc52 ffffffff824a3f40
[  222.495690]  ffff880229517c70 ffffffff81737fed ffff880229517cc0 ffff8800a1e49d10
[  222.498379]  ffff8800a1e495d0 0000000000000001 0000000000000001 ffff8800a1e49d10
[  222.501073] Call Trace:
[  222.503394]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
[  222.506080]  [<ffffffff81737fed>] print_circular_bug+0x200/0x20f
[  222.508781]  [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
[  222.511485]  [<ffffffff810af833>] lock_acquire+0x93/0x1c0
[  222.514197]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
[  222.516594]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
[  222.519307]  [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
[  222.522028]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
[  222.524752]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
[  222.527445]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
[  222.530113]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
[  222.532785]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
[  222.535439]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
[  222.538089]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
[  222.540725]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2



* Re: GPF in aio_migratepage
  2013-11-26 15:56       ` Dave Jones
@ 2013-12-03  9:02         ` Gu Zheng
  0 siblings, 0 replies; 17+ messages in thread
From: Gu Zheng @ 2013-12-03  9:02 UTC (permalink / raw)
  To: Dave Jones; +Cc: Benjamin LaHaise, Kent Overstreet, Linux Kernel, Sasha Levin

Hi Dave,
According to your analysis and the dumped stack, it seems the compaction path
tried to migrate aio ring pages while the aio context (mapping->private_data)
was invalid (non-NULL, but stale). Only one case can cause this condition: the
aio ring file was left behind, not cleaned up, in the failure path of ioctx_alloc.

So please try the following patch, I think it can fix this issue.


Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
---
 fs/aio.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 08159ed..5255548 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -367,8 +367,10 @@ static int aio_setup_ring(struct kioctx *ctx)
 	if (nr_pages > AIO_RING_PAGES) {
 		ctx->ring_pages = kcalloc(nr_pages, sizeof(struct page *),
 					  GFP_KERNEL);
-		if (!ctx->ring_pages)
+		if (!ctx->ring_pages) {
+			put_aio_ring_file(ctx);
 			return -ENOMEM;
+		}
 	}
 
 	ctx->mmap_size = nr_pages * PAGE_SIZE;
-- 
1.7.7



On 11/26/2013 11:56 PM, Dave Jones wrote:

> On Tue, Nov 26, 2013 at 10:23:37AM -0500, Benjamin LaHaise wrote:
>  > On Mon, Nov 25, 2013 at 11:19:53PM -0800, Kent Overstreet wrote:
>  > > On Tue, Nov 26, 2013 at 01:01:32AM -0500, Dave Jones wrote:
>  > > > On Mon, Nov 25, 2013 at 10:26:45PM -0500, Dave Jones wrote:
>  > > >  > Hi Kent,
>  > > >  > 
>  > > >  > I hit the GPF below on a tree based on 8e45099e029bb6b369b27d8d4920db8caff5ecce
>  > > >  > which has your commit e34ecee2ae791df674dfb466ce40692ca6218e43
>  > > >  > ("aio: Fix a trinity splat").  Is this another path your patch missed, or
>  > > >  > a completely different bug to what you were chasing ?
>  > > > 
>  > > > And here's another from a different path, this time on 32bit.
>  > 
>  > For Dave: what line is this bug on?  Is it the dereference of ctx when 
>  > doing spin_lock_irqsave(&ctx->completion_lock, flags); or is the 
>  > ctx->ring_pages[idx] = new; ?
> 
> From the 32bit trace:
> 
>> EIP is at aio_migratepage+0xad/0x126
> 
> disasm of aio.o shows aio_migratepage at 0x6f5, which means we oopsed at 7a2...
> 
> 
> 
>                         ctx->ring_pages[idx] = new;
>      79f:       8b 57 50                mov    0x50(%edi),%edx
>      7a2:       89 34 82                mov    %esi,(%edx,%eax,4)
>         raw_spin_unlock_irq(&lock->rlock);
> 
> which matches up with the Code: line.
> 
> So that's actually..
> 
> 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
> 
> 
> The 64bit trace looks a little funky due to gcc optimising and moving
> things around, but I think it's the same thing except this time it's
> in the lock acquire path instead of lock release.
> 
>> aio_migratepage+0xa6/0x150
> 
> aio_migratepage is at 0x540, and at 0x5e6, we see...
> 
>          */
>         spin_lock(&mapping->private_lock);
>         ctx = mapping->private_data;
>      5c3:       4d 8b ad a8 01 00 00    mov    0x1a8(%r13),%r13
>         if (ctx) {
>      5ca:       4d 85 ed                test   %r13,%r13
>      5cd:       0f 84 85 00 00 00       je     658 <aio_migratepage+0x118>
>                 pgoff_t idx;
>                 spin_lock_irqsave(&ctx->completion_lock, flags);
>      5d3:       49 8d 95 c8 02 00 00    lea    0x2c8(%r13),%rdx
>      5da:       48 89 d7                mov    %rdx,%rdi
>      5dd:       48 89 55 c8             mov    %rdx,-0x38(%rbp)
>      5e1:       e8 00 00 00 00          callq  5e6 <aio_migratepage+0xa6>
>                 migrate_page_copy(new, old);
>      5e6:       48 89 de                mov    %rbx,%rsi
>      5e9:       4c 89 e7                mov    %r12,%rdi
>          */
>         spin_lock(&mapping->private_lock);
>         ctx = mapping->private_data;
>         if (ctx) {
>                 pgoff_t idx;
>                 spin_lock_irqsave(&ctx->completion_lock, flags);
> 
> 
>  > Actually, is there easy way to reproduce this with Trinity?  I can have a 
>  > look if you point me in the right direction.
> 
> I've not found a simple reproducer recipe yet, working on it.
> So far I've just been running it for an hour and waiting. If I can narrow down
> the syscalls necessary I'll let you know.
>  
> 	Dave
>  
> 




* Re: GPF in aio_migratepage
  2013-12-02 17:49           ` Dave Jones
@ 2013-12-15 21:59             ` Kristian Nielsen
  2013-12-16  2:58               ` Gu Zheng
  0 siblings, 1 reply; 17+ messages in thread
From: Kristian Nielsen @ 2013-12-15 21:59 UTC (permalink / raw)
  To: Gu Zheng
  Cc: Dave Jones, Benjamin LaHaise, Kent Overstreet, Linux Kernel,
	Sasha Levin

What is the status of this?

If I understand correctly, the crash I saw is different from what Dave
saw.

There was one patch scheduled for inclusion that fixes Dave's crash. But
what about mine? I have been running 3.13-rc2 for a couple of weeks now with
your other patch, without seeing the crash again, which suggests it has helped.
But that patch seems to have a locking bug, as described by Dave (sleeping
under a spinlock)? So this appears unsolved as of yet...

So I just wanted to check that this was not forgotten. Is there something I
can do to help get this sorted out? Should I try to run with unpatched -rc4
for some time to check if it appears again? Anything else?

 - Kristian.

Dave Jones <davej@redhat.com> writes:

> On Mon, Dec 02, 2013 at 06:10:46PM +0800, Gu Zheng wrote:
>  > Hi Kristian, Dave,
>  > 
>  > Could you please help to check whether the following patch can fix this issue?
>
> This introduces some locking bugs..
>
>
> [  222.327950] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616
> [  222.328004] in_atomic(): 1, irqs_disabled(): 0, pid: 12794, name: trinity-child1
> [  222.328044] 1 lock held by trinity-child1/12794:
> [  222.328072]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
> [  222.328147] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
> [  222.328268]  0000000000000268 ffff880229517d68 ffffffff8173bc52 0000000000000000
> [  222.328320]  ffff880229517d90 ffffffff8108ad95 ffff880223b6acd0 0000000000000000
> [  222.328370]  0000000000000000 ffff880229517e08 ffffffff81741cf3 ffff880229517dc0
> [  222.328421] Call Trace:
> [  222.328443]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
> [  222.328475]  [<ffffffff8108ad95>] __might_sleep+0x175/0x200
> [  222.328510]  [<ffffffff81741cf3>] mutex_lock_nested+0x33/0x400
> [  222.328545]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
> [  222.328582]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
> [  222.328617]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
> [  222.328651]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
> [  222.328684]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
> [  222.328717]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>
> [  222.328769] ======================================================
> [  222.328804] [ INFO: possible circular locking dependency detected ]
> [  222.328838] 3.13.0-rc2+ #12 Not tainted
> [  222.328862] -------------------------------------------------------
> [  222.328896] trinity-child1/12794 is trying to acquire lock:
> [  222.328928]  (&mapping->i_mmap_mutex){+.+...}, at: [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
> [  222.328987] 
> but task is already holding lock:
> [  222.329020]  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
> [  222.329081] 
> which lock already depends on the new lock.
>
> [  222.329125] 
> the existing dependency chain (in reverse order) is:
> [  222.329166] 
> -> #2 (&(&mapping->private_lock)->rlock){+.+...}:
> [  222.329211]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
> [  222.329248]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
> [  222.329285]        [<ffffffff811f334d>] __set_page_dirty_buffers+0x2d/0xb0
> [  222.331243]        [<ffffffff8115aada>] set_page_dirty+0x3a/0x60
> [  222.334437]        [<ffffffff81179a7f>] unmap_single_vma+0x62f/0x830
> [  222.337633]        [<ffffffff8117ad19>] unmap_vmas+0x49/0x90
> [  222.340819]        [<ffffffff811804bd>] unmap_region+0x9d/0x110
> [  222.343968]        [<ffffffff811829f6>] do_munmap+0x226/0x3b0
> [  222.346689]        [<ffffffff81182bc4>] vm_munmap+0x44/0x60
> [  222.349741]        [<ffffffff81183b42>] SyS_munmap+0x22/0x30
> [  222.352758]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
> [  222.355735] 
> -> #1 (&(ptlock_ptr(page))->rlock#2){+.+...}:
> [  222.361611]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
> [  222.364589]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
> [  222.367200]        [<ffffffff81186338>] __page_check_address+0x98/0x160
> [  222.370168]        [<ffffffff811864fe>] page_mkclean+0xfe/0x1c0
> [  222.373120]        [<ffffffff8115ad60>] clear_page_dirty_for_io+0x60/0x100
> [  222.376076]        [<ffffffff8124d207>] mpage_submit_page+0x47/0x80
> [  222.379015]        [<ffffffff8124d350>] mpage_process_page_bufs+0x110/0x130
> [  222.381955]        [<ffffffff8124d91b>] mpage_prepare_extent_to_map+0x22b/0x2f0
> [  222.384895]        [<ffffffff8125318f>] ext4_writepages+0x4ef/0x1050
> [  222.387839]        [<ffffffff8115cdf1>] do_writepages+0x21/0x50
> [  222.390786]        [<ffffffff81150959>] __filemap_fdatawrite_range+0x59/0x60
> [  222.393747]        [<ffffffff81150a5d>] filemap_write_and_wait_range+0x2d/0x70
> [  222.396729]        [<ffffffff812498ca>] ext4_sync_file+0xba/0x4d0
> [  222.399714]        [<ffffffff811f1691>] do_fsync+0x51/0x80
> [  222.402317]        [<ffffffff811f1980>] SyS_fsync+0x10/0x20
> [  222.405240]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
> [  222.407760] 
> -> #0 (&mapping->i_mmap_mutex){+.+...}:
> [  222.413349]        [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
> [  222.416127]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
> [  222.418826]        [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
> [  222.421456]        [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
> [  222.424085]        [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
> [  222.426696]        [<ffffffff81160a72>] truncate_setsize+0x12/0x20
> [  222.428955]        [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
> [  222.431509]        [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
> [  222.434069]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
> [  222.436308] 
> other info that might help us debug this:
>
> [  222.443857] Chain exists of:
>   &mapping->i_mmap_mutex --> &(ptlock_ptr(page))->rlock#2 --> &(&mapping->private_lock)->rlock
>
> [  222.451618]  Possible unsafe locking scenario:
>
> [  222.456831]        CPU0                    CPU1
> [  222.459413]        ----                    ----
> [  222.461958]   lock(&(&mapping->private_lock)->rlock);
> [  222.464505]                                lock(&(ptlock_ptr(page))->rlock#2);
> [  222.467094]                                lock(&(&mapping->private_lock)->rlock);
> [  222.469625]   lock(&mapping->i_mmap_mutex);
> [  222.472111] 
>  *** DEADLOCK ***
>
> [  222.478392] 1 lock held by trinity-child1/12794:
> [  222.480744]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
> [  222.483240] 
> stack backtrace:
> [  222.488119] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
> [  222.493016]  ffffffff824cb110 ffff880229517c30 ffffffff8173bc52 ffffffff824a3f40
> [  222.495690]  ffff880229517c70 ffffffff81737fed ffff880229517cc0 ffff8800a1e49d10
> [  222.498379]  ffff8800a1e495d0 0000000000000001 0000000000000001 ffff8800a1e49d10
> [  222.501073] Call Trace:
> [  222.503394]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
> [  222.506080]  [<ffffffff81737fed>] print_circular_bug+0x200/0x20f
> [  222.508781]  [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
> [  222.511485]  [<ffffffff810af833>] lock_acquire+0x93/0x1c0
> [  222.514197]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
> [  222.516594]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
> [  222.519307]  [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
> [  222.522028]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
> [  222.524752]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
> [  222.527445]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
> [  222.530113]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
> [  222.532785]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
> [  222.535439]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
> [  222.538089]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
> [  222.540725]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: GPF in aio_migratepage
  2013-12-15 21:59             ` Kristian Nielsen
@ 2013-12-16  2:58               ` Gu Zheng
  2013-12-16  3:27                 ` Gu Zheng
  2013-12-22 20:44                 ` Kristian Nielsen
  0 siblings, 2 replies; 17+ messages in thread
From: Gu Zheng @ 2013-12-16  2:58 UTC (permalink / raw)
  To: Kristian Nielsen
  Cc: Dave Jones, Benjamin LaHaise, Kent Overstreet, Linux Kernel,
	Sasha Levin

Hi Kristian,
On 12/16/2013 05:59 AM, Kristian Nielsen wrote:

> What is the status of this?
> 
> If I understand correctly, the crash I saw is different from what Dave
> saw.
> 
> There was one patch scheduled for inclusion that fixes Dave's crash. But
> what about mine? I have been running 3.13-rc2 for a couple of weeks now with
> your other patch, without seeing it again, which suggests it has helped. But
> it seems that patch has a locking bug as described by Dave (sleeping under
> spinlock)? So this appears unsolved as of yet...
> 
> So I just wanted to check that this was not forgotten. Is there something I
> can do to help get this sorted out? Should I try to run with unpatched -rc4
> for some time to check if it appears again? Anything else?

Thanks for the reminder. I have not forgotten this issue.
It looks like a problem that has already been fixed:
http://article.gmane.org/gmane.linux.kernel.aio.general/3741/match=potential+use+after+free+aio%5fmigratepage
commit 5e9ae2e5da0beb93f8557fc92a8f4fbc05ea448f
aio: fix use-after-free in aio_migratepage
So I think you could run with the latest Linus tree or 3.13-rc4 to
check whether this issue still appears.
Looking forward to your reply.

Thanks,
Gu

> 
>  - Kristian.
> 
> Dave Jones <davej@redhat.com> writes:
> 
>> On Mon, Dec 02, 2013 at 06:10:46PM +0800, Gu Zheng wrote:
>>  > Hi Kristian, Dave,
>>  > 
>>  > Could you please help to check whether the following patch can fix this issue?
>>
>> This introduces some locking bugs..
>>
>>
>> [  222.327950] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616
>> [  222.328004] in_atomic(): 1, irqs_disabled(): 0, pid: 12794, name: trinity-child1
>> [  222.328044] 1 lock held by trinity-child1/12794:
>> [  222.328072]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
>> [  222.328147] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
>> [  222.328268]  0000000000000268 ffff880229517d68 ffffffff8173bc52 0000000000000000
>> [  222.328320]  ffff880229517d90 ffffffff8108ad95 ffff880223b6acd0 0000000000000000
>> [  222.328370]  0000000000000000 ffff880229517e08 ffffffff81741cf3 ffff880229517dc0
>> [  222.328421] Call Trace:
>> [  222.328443]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
>> [  222.328475]  [<ffffffff8108ad95>] __might_sleep+0x175/0x200
>> [  222.328510]  [<ffffffff81741cf3>] mutex_lock_nested+0x33/0x400
>> [  222.328545]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>> [  222.328582]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
>> [  222.328617]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
>> [  222.328651]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
>> [  222.328684]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
>> [  222.328717]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>>
>> [  222.328769] ======================================================
>> [  222.328804] [ INFO: possible circular locking dependency detected ]
>> [  222.328838] 3.13.0-rc2+ #12 Not tainted
>> [  222.328862] -------------------------------------------------------
>> [  222.328896] trinity-child1/12794 is trying to acquire lock:
>> [  222.328928]  (&mapping->i_mmap_mutex){+.+...}, at: [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>> [  222.328987] 
>> but task is already holding lock:
>> [  222.329020]  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
>> [  222.329081] 
>> which lock already depends on the new lock.
>>
>> [  222.329125] 
>> the existing dependency chain (in reverse order) is:
>> [  222.329166] 
>> -> #2 (&(&mapping->private_lock)->rlock){+.+...}:
>> [  222.329211]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>> [  222.329248]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
>> [  222.329285]        [<ffffffff811f334d>] __set_page_dirty_buffers+0x2d/0xb0
>> [  222.331243]        [<ffffffff8115aada>] set_page_dirty+0x3a/0x60
>> [  222.334437]        [<ffffffff81179a7f>] unmap_single_vma+0x62f/0x830
>> [  222.337633]        [<ffffffff8117ad19>] unmap_vmas+0x49/0x90
>> [  222.340819]        [<ffffffff811804bd>] unmap_region+0x9d/0x110
>> [  222.343968]        [<ffffffff811829f6>] do_munmap+0x226/0x3b0
>> [  222.346689]        [<ffffffff81182bc4>] vm_munmap+0x44/0x60
>> [  222.349741]        [<ffffffff81183b42>] SyS_munmap+0x22/0x30
>> [  222.352758]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>> [  222.355735] 
>> -> #1 (&(ptlock_ptr(page))->rlock#2){+.+...}:
>> [  222.361611]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>> [  222.364589]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
>> [  222.367200]        [<ffffffff81186338>] __page_check_address+0x98/0x160
>> [  222.370168]        [<ffffffff811864fe>] page_mkclean+0xfe/0x1c0
>> [  222.373120]        [<ffffffff8115ad60>] clear_page_dirty_for_io+0x60/0x100
>> [  222.376076]        [<ffffffff8124d207>] mpage_submit_page+0x47/0x80
>> [  222.379015]        [<ffffffff8124d350>] mpage_process_page_bufs+0x110/0x130
>> [  222.381955]        [<ffffffff8124d91b>] mpage_prepare_extent_to_map+0x22b/0x2f0
>> [  222.384895]        [<ffffffff8125318f>] ext4_writepages+0x4ef/0x1050
>> [  222.387839]        [<ffffffff8115cdf1>] do_writepages+0x21/0x50
>> [  222.390786]        [<ffffffff81150959>] __filemap_fdatawrite_range+0x59/0x60
>> [  222.393747]        [<ffffffff81150a5d>] filemap_write_and_wait_range+0x2d/0x70
>> [  222.396729]        [<ffffffff812498ca>] ext4_sync_file+0xba/0x4d0
>> [  222.399714]        [<ffffffff811f1691>] do_fsync+0x51/0x80
>> [  222.402317]        [<ffffffff811f1980>] SyS_fsync+0x10/0x20
>> [  222.405240]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>> [  222.407760] 
>> -> #0 (&mapping->i_mmap_mutex){+.+...}:
>> [  222.413349]        [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
>> [  222.416127]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>> [  222.418826]        [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
>> [  222.421456]        [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>> [  222.424085]        [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
>> [  222.426696]        [<ffffffff81160a72>] truncate_setsize+0x12/0x20
>> [  222.428955]        [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
>> [  222.431509]        [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
>> [  222.434069]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>> [  222.436308] 
>> other info that might help us debug this:
>>
>> [  222.443857] Chain exists of:
>>   &mapping->i_mmap_mutex --> &(ptlock_ptr(page))->rlock#2 --> &(&mapping->private_lock)->rlock
>>
>> [  222.451618]  Possible unsafe locking scenario:
>>
>> [  222.456831]        CPU0                    CPU1
>> [  222.459413]        ----                    ----
>> [  222.461958]   lock(&(&mapping->private_lock)->rlock);
>> [  222.464505]                                lock(&(ptlock_ptr(page))->rlock#2);
>> [  222.467094]                                lock(&(&mapping->private_lock)->rlock);
>> [  222.469625]   lock(&mapping->i_mmap_mutex);
>> [  222.472111] 
>>  *** DEADLOCK ***
>>
>> [  222.478392] 1 lock held by trinity-child1/12794:
>> [  222.480744]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
>> [  222.483240] 
>> stack backtrace:
>> [  222.488119] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
>> [  222.493016]  ffffffff824cb110 ffff880229517c30 ffffffff8173bc52 ffffffff824a3f40
>> [  222.495690]  ffff880229517c70 ffffffff81737fed ffff880229517cc0 ffff8800a1e49d10
>> [  222.498379]  ffff8800a1e495d0 0000000000000001 0000000000000001 ffff8800a1e49d10
>> [  222.501073] Call Trace:
>> [  222.503394]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
>> [  222.506080]  [<ffffffff81737fed>] print_circular_bug+0x200/0x20f
>> [  222.508781]  [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
>> [  222.511485]  [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>> [  222.514197]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>> [  222.516594]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>> [  222.519307]  [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
>> [  222.522028]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>> [  222.524752]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>> [  222.527445]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>> [  222.530113]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
>> [  222.532785]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
>> [  222.535439]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
>> [  222.538089]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
>> [  222.540725]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
> 




* Re: GPF in aio_migratepage
  2013-12-16  2:58               ` Gu Zheng
@ 2013-12-16  3:27                 ` Gu Zheng
  2013-12-22 20:44                 ` Kristian Nielsen
  1 sibling, 0 replies; 17+ messages in thread
From: Gu Zheng @ 2013-12-16  3:27 UTC (permalink / raw)
  To: Kristian Nielsen
  Cc: Dave Jones, Benjamin LaHaise, Kent Overstreet, Linux Kernel,
	Sasha Levin

Hi Kristian,
On 12/16/2013 10:58 AM, Gu Zheng wrote:

> Hi Kristian,
> On 12/16/2013 05:59 AM, Kristian Nielsen wrote:
> 
>> What is the status of this?
>>
>> If I understand correctly, the crash I saw is different from what Dave
>> saw.

Though the crash you saw is different from Dave's, as you know, they
both appear in the page-compaction path. So this is another reason to
run with the latest Linus tree or 3.13-rc4 and check again.

Thanks,
Gu

>>
>> There was one patch scheduled for inclusion that fixes Dave's crash. But
>> what about mine? I have been running 3.13-rc2 for a couple of weeks now with
>> your other patch, without seeing it again, which suggests it has helped. But
>> it seems that patch has a locking bug as described by Dave (sleeping under
>> spinlock)? So this appears unsolved as of yet...
>>
>> So I just wanted to check that this was not forgotten. Is there something I
>> can do to help get this sorted out? Should I try to run with unpatched -rc4
>> for some time to check if it appears again? Anything else?
> 
> Thanks for the reminder. I have not forgotten this issue.
> It looks like a problem that has already been fixed:
> http://article.gmane.org/gmane.linux.kernel.aio.general/3741/match=potential+use+after+free+aio%5fmigratepage
> commit 5e9ae2e5da0beb93f8557fc92a8f4fbc05ea448f
> aio: fix use-after-free in aio_migratepage
> So I think you could run with the latest Linus tree or 3.13-rc4 to
> check whether this issue still appears.
> Looking forward to your reply.
> 
> Thanks,
> Gu
> 
>>
>>  - Kristian.
>>
>> Dave Jones <davej@redhat.com> writes:
>>
>>> On Mon, Dec 02, 2013 at 06:10:46PM +0800, Gu Zheng wrote:
>>>  > Hi Kristian, Dave,
>>>  > 
>>>  > Could you please help to check whether the following patch can fix this issue?
>>>
>>> This introduces some locking bugs..
>>>
>>>
>>> [  222.327950] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:616
>>> [  222.328004] in_atomic(): 1, irqs_disabled(): 0, pid: 12794, name: trinity-child1
>>> [  222.328044] 1 lock held by trinity-child1/12794:
>>> [  222.328072]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
>>> [  222.328147] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
>>> [  222.328268]  0000000000000268 ffff880229517d68 ffffffff8173bc52 0000000000000000
>>> [  222.328320]  ffff880229517d90 ffffffff8108ad95 ffff880223b6acd0 0000000000000000
>>> [  222.328370]  0000000000000000 ffff880229517e08 ffffffff81741cf3 ffff880229517dc0
>>> [  222.328421] Call Trace:
>>> [  222.328443]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
>>> [  222.328475]  [<ffffffff8108ad95>] __might_sleep+0x175/0x200
>>> [  222.328510]  [<ffffffff81741cf3>] mutex_lock_nested+0x33/0x400
>>> [  222.328545]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>>> [  222.328582]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
>>> [  222.328617]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
>>> [  222.328651]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
>>> [  222.328684]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
>>> [  222.328717]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>>>
>>> [  222.328769] ======================================================
>>> [  222.328804] [ INFO: possible circular locking dependency detected ]
>>> [  222.328838] 3.13.0-rc2+ #12 Not tainted
>>> [  222.328862] -------------------------------------------------------
>>> [  222.328896] trinity-child1/12794 is trying to acquire lock:
>>> [  222.328928]  (&mapping->i_mmap_mutex){+.+...}, at: [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>>> [  222.328987] 
>>> but task is already holding lock:
>>> [  222.329020]  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
>>> [  222.329081] 
>>> which lock already depends on the new lock.
>>>
>>> [  222.329125] 
>>> the existing dependency chain (in reverse order) is:
>>> [  222.329166] 
>>> -> #2 (&(&mapping->private_lock)->rlock){+.+...}:
>>> [  222.329211]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>>> [  222.329248]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
>>> [  222.329285]        [<ffffffff811f334d>] __set_page_dirty_buffers+0x2d/0xb0
>>> [  222.331243]        [<ffffffff8115aada>] set_page_dirty+0x3a/0x60
>>> [  222.334437]        [<ffffffff81179a7f>] unmap_single_vma+0x62f/0x830
>>> [  222.337633]        [<ffffffff8117ad19>] unmap_vmas+0x49/0x90
>>> [  222.340819]        [<ffffffff811804bd>] unmap_region+0x9d/0x110
>>> [  222.343968]        [<ffffffff811829f6>] do_munmap+0x226/0x3b0
>>> [  222.346689]        [<ffffffff81182bc4>] vm_munmap+0x44/0x60
>>> [  222.349741]        [<ffffffff81183b42>] SyS_munmap+0x22/0x30
>>> [  222.352758]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>>> [  222.355735] 
>>> -> #1 (&(ptlock_ptr(page))->rlock#2){+.+...}:
>>> [  222.361611]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>>> [  222.364589]        [<ffffffff817454f0>] _raw_spin_lock+0x40/0x80
>>> [  222.367200]        [<ffffffff81186338>] __page_check_address+0x98/0x160
>>> [  222.370168]        [<ffffffff811864fe>] page_mkclean+0xfe/0x1c0
>>> [  222.373120]        [<ffffffff8115ad60>] clear_page_dirty_for_io+0x60/0x100
>>> [  222.376076]        [<ffffffff8124d207>] mpage_submit_page+0x47/0x80
>>> [  222.379015]        [<ffffffff8124d350>] mpage_process_page_bufs+0x110/0x130
>>> [  222.381955]        [<ffffffff8124d91b>] mpage_prepare_extent_to_map+0x22b/0x2f0
>>> [  222.384895]        [<ffffffff8125318f>] ext4_writepages+0x4ef/0x1050
>>> [  222.387839]        [<ffffffff8115cdf1>] do_writepages+0x21/0x50
>>> [  222.390786]        [<ffffffff81150959>] __filemap_fdatawrite_range+0x59/0x60
>>> [  222.393747]        [<ffffffff81150a5d>] filemap_write_and_wait_range+0x2d/0x70
>>> [  222.396729]        [<ffffffff812498ca>] ext4_sync_file+0xba/0x4d0
>>> [  222.399714]        [<ffffffff811f1691>] do_fsync+0x51/0x80
>>> [  222.402317]        [<ffffffff811f1980>] SyS_fsync+0x10/0x20
>>> [  222.405240]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>>> [  222.407760] 
>>> -> #0 (&mapping->i_mmap_mutex){+.+...}:
>>> [  222.413349]        [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
>>> [  222.416127]        [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>>> [  222.418826]        [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
>>> [  222.421456]        [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>>> [  222.424085]        [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
>>> [  222.426696]        [<ffffffff81160a72>] truncate_setsize+0x12/0x20
>>> [  222.428955]        [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
>>> [  222.431509]        [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
>>> [  222.434069]        [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>>> [  222.436308] 
>>> other info that might help us debug this:
>>>
>>> [  222.443857] Chain exists of:
>>>   &mapping->i_mmap_mutex --> &(ptlock_ptr(page))->rlock#2 --> &(&mapping->private_lock)->rlock
>>>
>>> [  222.451618]  Possible unsafe locking scenario:
>>>
>>> [  222.456831]        CPU0                    CPU1
>>> [  222.459413]        ----                    ----
>>> [  222.461958]   lock(&(&mapping->private_lock)->rlock);
>>> [  222.464505]                                lock(&(ptlock_ptr(page))->rlock#2);
>>> [  222.467094]                                lock(&(&mapping->private_lock)->rlock);
>>> [  222.469625]   lock(&mapping->i_mmap_mutex);
>>> [  222.472111] 
>>>  *** DEADLOCK ***
>>>
>>> [  222.478392] 1 lock held by trinity-child1/12794:
>>> [  222.480744]  #0:  (&(&mapping->private_lock)->rlock){+.+...}, at: [<ffffffff81210a64>] aio_free_ring+0x44/0x160
>>> [  222.483240] 
>>> stack backtrace:
>>> [  222.488119] CPU: 1 PID: 12794 Comm: trinity-child1 Not tainted 3.13.0-rc2+ #12 
>>> [  222.493016]  ffffffff824cb110 ffff880229517c30 ffffffff8173bc52 ffffffff824a3f40
>>> [  222.495690]  ffff880229517c70 ffffffff81737fed ffff880229517cc0 ffff8800a1e49d10
>>> [  222.498379]  ffff8800a1e495d0 0000000000000001 0000000000000001 ffff8800a1e49d10
>>> [  222.501073] Call Trace:
>>> [  222.503394]  [<ffffffff8173bc52>] dump_stack+0x4e/0x7a
>>> [  222.506080]  [<ffffffff81737fed>] print_circular_bug+0x200/0x20f
>>> [  222.508781]  [<ffffffff810aed16>] __lock_acquire+0x1786/0x1af0
>>> [  222.511485]  [<ffffffff810af833>] lock_acquire+0x93/0x1c0
>>> [  222.514197]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>>> [  222.516594]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>>> [  222.519307]  [<ffffffff81741d37>] mutex_lock_nested+0x77/0x400
>>> [  222.522028]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>>> [  222.524752]  [<ffffffff81179e68>] ? unmap_mapping_range+0x68/0x170
>>> [  222.527445]  [<ffffffff81179e68>] unmap_mapping_range+0x68/0x170
>>> [  222.530113]  [<ffffffff81160a35>] truncate_pagecache+0x35/0x60
>>> [  222.532785]  [<ffffffff81160a72>] truncate_setsize+0x12/0x20
>>> [  222.535439]  [<ffffffff81210ab9>] aio_free_ring+0x99/0x160
>>> [  222.538089]  [<ffffffff81213071>] SyS_io_setup+0xef1/0xf00
>>> [  222.540725]  [<ffffffff8174eaa4>] tracesys+0xdd/0xe2
>>
> 
> 
> 




* Re: GPF in aio_migratepage
  2013-12-16  2:58               ` Gu Zheng
  2013-12-16  3:27                 ` Gu Zheng
@ 2013-12-22 20:44                 ` Kristian Nielsen
  2013-12-22 21:34                   ` Benjamin LaHaise
  1 sibling, 1 reply; 17+ messages in thread
From: Kristian Nielsen @ 2013-12-22 20:44 UTC (permalink / raw)
  To: Gu Zheng
  Cc: Dave Jones, Benjamin LaHaise, Kent Overstreet, Linux Kernel,
	Sasha Levin

Gu Zheng <guz.fnst@cn.fujitsu.com> writes:

> It looks like a problem that has already been fixed:
> http://article.gmane.org/gmane.linux.kernel.aio.general/3741/match=potential+use+after+free+aio%5fmigratepage
> commit 5e9ae2e5da0beb93f8557fc92a8f4fbc05ea448f
> aio: fix use-after-free in aio_migratepage
> So I think you could run with the latest Linus tree or 3.13-rc4 to
> check whether this issue still appears.

Hm. I checked that thread, and as far as I can see, that patch was already
included in the tree I hit the BUG in (3.13-rc1):

    http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5e9ae2e5da0beb93f8557fc92a8f4fbc05ea448f

There are other changes in that area since 3.13-rc1 though.

Anyway, I am now running with 3.13-rc4 and will report if I see anything.
Given that I do not have any way to reproduce (I only ever saw this once),
this seems the best that can be done for now.

Thanks for following up on this!

 - Kristian.


* Re: GPF in aio_migratepage
  2013-12-22 20:44                 ` Kristian Nielsen
@ 2013-12-22 21:34                   ` Benjamin LaHaise
  2013-12-22 22:38                     ` Kristian Nielsen
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin LaHaise @ 2013-12-22 21:34 UTC (permalink / raw)
  To: Kristian Nielsen
  Cc: Gu Zheng, Dave Jones, Kent Overstreet, Linux Kernel, Sasha Levin

Hi Kristian,

On Sun, Dec 22, 2013 at 09:44:45PM +0100, Kristian Nielsen wrote:
> There are other changes in that area since 3.13-rc1 though.
> 
> Anyway, I am now running with 3.13-rc4 and will report if I see anything.
> Given that I do not have any way to reproduce (I only ever saw this once),
> this seems the best that can be done for now.

Linus just pushed out 3.13-rc5 that has changes to aio_migratepage() that 
should make it much more robust, as well as other fixes.  Can you please 
give it a spin as well and let me know if it works?  Thanks a bunch!

		-ben

> Thanks for following up on this!
> 
>  - Kristian.

-- 
"Thought is the essence of where you are now."


* Re: GPF in aio_migratepage
  2013-12-22 21:34                   ` Benjamin LaHaise
@ 2013-12-22 22:38                     ` Kristian Nielsen
  2014-01-21  8:38                       ` Kristian Nielsen
  0 siblings, 1 reply; 17+ messages in thread
From: Kristian Nielsen @ 2013-12-22 22:38 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Gu Zheng, Linux Kernel

Benjamin LaHaise <bcrl@kvack.org> writes:

> Linus just pushed out 3.13-rc5 that has changes to aio_migratepage() that 
> should make it much more robust, as well as other fixes.  Can you please 
> give it a spin as well and let me know if it works?  Thanks a bunch!

Ok, will do.

 - Kristian.


* Re: GPF in aio_migratepage
  2013-12-22 22:38                     ` Kristian Nielsen
@ 2014-01-21  8:38                       ` Kristian Nielsen
  0 siblings, 0 replies; 17+ messages in thread
From: Kristian Nielsen @ 2014-01-21  8:38 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Gu Zheng, Linux Kernel

Kristian Nielsen <knielsen@knielsen-hq.org> writes:

> Benjamin LaHaise <bcrl@kvack.org> writes:
>
>> Linus just pushed out 3.13-rc5 that has changes to aio_migratepage() that 
>> should make it much more robust, as well as other fixes.  Can you please 
>> give it a spin as well and let me know if it works?  Thanks a bunch!
>
> Ok, will do.

JFYI, I have been running -rc5 (and later -rc7) for a month now without seeing
anything like this again. So hopefully the issue is solved.

 - Kristian.


end of thread, other threads:[~2014-01-21  8:38 UTC | newest]

Thread overview: 17+ messages
2013-11-26  3:26 GPF in aio_migratepage Dave Jones
2013-11-26  6:01 ` Dave Jones
2013-11-26  7:19   ` Kent Overstreet
2013-11-26 15:23     ` Benjamin LaHaise
2013-11-26 15:56       ` Dave Jones
2013-12-03  9:02         ` Gu Zheng
2013-11-30 15:28       ` Kristian Nielsen
2013-12-02 10:10         ` Gu Zheng
2013-12-02 10:49           ` Kristian Nielsen
2013-12-02 17:49           ` Dave Jones
2013-12-15 21:59             ` Kristian Nielsen
2013-12-16  2:58               ` Gu Zheng
2013-12-16  3:27                 ` Gu Zheng
2013-12-22 20:44                 ` Kristian Nielsen
2013-12-22 21:34                   ` Benjamin LaHaise
2013-12-22 22:38                     ` Kristian Nielsen
2014-01-21  8:38                       ` Kristian Nielsen
