* generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
@ 2016-02-27 13:02 Eryu Guan
2016-02-27 20:10 ` Dan Williams
0 siblings, 1 reply; 6+ messages in thread
From: Eryu Guan @ 2016-02-27 13:02 UTC (permalink / raw)
To: xfs; +Cc: Dan Williams, Ross Zwisler
Hi,
Starting with the 4.5-rc1 kernel, I sometimes see generic/320 trigger
"list_add attempted on force-poisoned entry" warnings on XFS. The test hosts
are arm64/ppc64/ppc64le; I haven't seen it on x86_64 hosts.
[ 2441.772340] run fstests generic/320 at 2016-02-27 05:52:05
[ 2441.916302] XFS (sda5): Unmounting Filesystem
[ 2442.180551] XFS (sda5): Mounting V5 Filesystem
[ 2442.231940] XFS (sda5): Ending clean mount
[ 2460.142155] list_add attempted on force-poisoned entry
[ 2460.142278] ------------[ cut here ]------------
[ 2460.142326] WARNING: at lib/list_debug.c:34
[ 2460.142362] Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ses enclosure scsi_transport_sas sg shpchp powernv_rng rtc_opal nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod sd_mod cdrom mlx4_ib ib_sa ib_mad mlx4_en ib_core vxlan ip6_udp_tunnel ib_addr udp_tunnel mlx4_core ipr libata tg3 ptp pps_core
[ 2460.143083] CPU: 21 PID: 134288 Comm: cp Not tainted 4.5.0-rc5 #25
[ 2460.143141] task: c000000f550adb00 ti: c000000fb5fc0000 task.ti: c000000fb5fc0000
[ 2460.143209] NIP: c00000000043c390 LR: c00000000043c38c CTR: 0000000030041bec
[ 2460.143278] REGS: c000000fb5fc30a0 TRAP: 0700 Not tainted (4.5.0-rc5)
[ 2460.143334] MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 22028422 XER: 00000000
[ 2460.143575] CFAR: c0000000008259d8 SOFTE: 0
GPR00: c00000000043c38c c000000fb5fc3320 c00000000108bc00 000000000000002a
GPR04: c000000ff8d49c50 c000000ff8d5b4a0 900000010280b033 0000000000000065
GPR08: 0000000000000000 c000000000bcb284 0000000ff8180000 000000000000076b
GPR12: 0000000000008800 c00000000fb8bd00 0000000000000000 0000000000000000
GPR16: 0000000000000000 00003ffffa430978 0000000000000000 0000000000000001
GPR20: c000000fb08ab880 0000000000008180 d00000002024bae0 0000000000000000
GPR24: 0000000000000000 c000000fc73c9e40 c000000fe914a740 0000000000000002
GPR28: 0000000000000001 c000000fc812ab38 c000000fc812ab38 c000000fb5fc33c0
[ 2460.144450] NIP [c00000000043c390] __list_add+0xb0/0x150
[ 2460.144497] LR [c00000000043c38c] __list_add+0xac/0x150
[ 2460.144542] Call Trace:
[ 2460.144566] [c000000fb5fc3320] [c00000000043c38c] __list_add+0xac/0x150 (unreliable)
[ 2460.144648] [c000000fb5fc33a0] [c00000000081b454] __down+0x4c/0xf8
[ 2460.144718] [c000000fb5fc3410] [c00000000010b6f8] down+0x68/0x70
[ 2460.144809] [c000000fb5fc3450] [d0000000201ebf4c] xfs_buf_lock+0x4c/0x150 [xfs]
[ 2460.144902] [c000000fb5fc3490] [d0000000201ec2f0] _xfs_buf_find+0x2a0/0x4d0 [xfs]
[ 2460.144995] [c000000fb5fc3530] [d0000000201ec70c] xfs_buf_get_map+0x4c/0x250 [xfs]
[ 2460.145088] [c000000fb5fc35d0] [d0000000201ed740] xfs_buf_read_map+0x50/0x1f0 [xfs]
[ 2460.145244] [c000000fb5fc3630] [d0000000202280d8] xfs_trans_read_buf_map+0x1d8/0x390 [xfs]
[ 2460.145412] [c000000fb5fc36a0] [d0000000201d849c] xfs_read_agi+0x9c/0x130 [xfs]
[ 2460.145580] [c000000fb5fc3700] [d0000000201d8580] xfs_ialloc_read_agi+0x50/0x160 [xfs]
[ 2460.145748] [c000000fb5fc3750] [d0000000201d92f0] xfs_dialloc+0x130/0x2f0 [xfs]
[ 2460.145918] [c000000fb5fc37e0] [d000000020203274] xfs_ialloc+0x84/0x550 [xfs]
[ 2460.146068] [c000000fb5fc3860] [d0000000202037d8] xfs_dir_ialloc+0x98/0x270 [xfs]
[ 2460.146240] [c000000fb5fc3960] [d000000020203f24] xfs_create+0x4f4/0x750 [xfs]
[ 2460.146412] [c000000fb5fc3a60] [d0000000201ff0a8] xfs_generic_create+0x208/0x3d0 [xfs]
[ 2460.146572] [c000000fb5fc3af0] [c0000000002af0f8] vfs_create+0x158/0x1f0
[ 2460.146708] [c000000fb5fc3b40] [c0000000002b0cd8] do_last+0x698/0xf40
[ 2460.146845] [c000000fb5fc3c10] [c0000000002b1624] path_openat+0xa4/0x3c0
[ 2460.146982] [c000000fb5fc3c90] [c0000000002b2ec4] do_filp_open+0x74/0xf0
[ 2460.147120] [c000000fb5fc3dc0] [c00000000029c654] do_sys_open+0x1b4/0x2d0
[ 2460.147257] [c000000fb5fc3e30] [c000000000009204] system_call+0x38/0xb4
[ 2460.147392] Instruction dump:
[ 2460.147459] fbfe0000 38210080 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020
[ 2460.147680] 3c62ff9e 38631a78 483e95f9 60000000 <0fe00000> 4bffff98 60000000 60420000
[ 2460.147902] ---[ end trace aa6c4f990634a77c ]---
The warning itself is introduced by commit 5c2c2587b132 ("mm, dax, pmem:
introduce {get|put}_dev_pagemap() for dax-gup") in 4.5-rc1, and git
bisect points to the same commit. But I'm not sure if it's a regression
or just exposes an old issue.
If more information is needed please let me know.
Thanks,
Eryu
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
2016-02-27 13:02 generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS Eryu Guan
@ 2016-02-27 20:10 ` Dan Williams
2016-02-28 5:31 ` Eryu Guan
0 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2016-02-27 20:10 UTC (permalink / raw)
To: Eryu Guan; +Cc: Ross Zwisler, XFS Developers

On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@redhat.com> wrote:
> Hi,
>
> Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
> "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
> are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.

Hmm, this triggers when a list_head has ->next or ->prev pointing at
the address of force_poison which is only defined in lib/list_debug.c.
The only call site that uses list_force_poison() is in
devm_memremap_pages(). That currently depends on CONFIG_ZONE_DEVICE
which in turn depends on X86_64.

So, this appears to be a false positive and the address of
force_poison is somehow ending up on the stack by accident as that is
the random value being passed in from __down_common:

    struct semaphore_waiter waiter;

    list_add_tail(&waiter.list, &sem->wait_list);

So, I think we need a more unique poison value that should never
appear on the stack:

diff --git a/include/linux/poison.h b/include/linux/poison.h
index 4a27153574e2..0604806c2f52 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -21,6 +21,7 @@
  */
 #define LIST_POISON1  ((void *) 0x100 + POISON_POINTER_DELTA)
 #define LIST_POISON2  ((void *) 0x200 + POISON_POINTER_DELTA)
+#define LIST_POISON3  ((void *) 0x500 + POISON_POINTER_DELTA)

 /********** include/linux/timer.h **********/
 /*
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 3345a089ef7b..318bf1c181b2 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -12,11 +12,10 @@
 #include <linux/kernel.h>
 #include <linux/rculist.h>

-static struct list_head force_poison;
 void list_force_poison(struct list_head *entry)
 {
-        entry->next = &force_poison;
-        entry->prev = &force_poison;
+        entry->next = LIST_POISON3;
+        entry->prev = LIST_POISON3;
 }

 /*
@@ -30,7 +29,7 @@ void __list_add(struct list_head *new,
                 struct list_head *prev,
                 struct list_head *next)
 {
-        WARN(new->next == &force_poison || new->prev == &force_poison,
+        WARN(new->next == LIST_POISON3 || new->prev == LIST_POISON3,
                 "list_add attempted on force-poisoned entry\n");
         WARN(next->prev != prev,
                 "list_add corruption. next->prev should be "
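
To make the failure mode in Dan's analysis concrete, here is a minimal userspace sketch in C. It is not kernel code: struct list_head is re-declared locally, and SKETCH_POISON, sketch_force_poison(), sketch_looks_poisoned() and sketch_list_add_tail() are hypothetical names chosen for illustration, modeled on LIST_POISON3, list_force_poison() and the WARN() check in the test patch above. The value 0x500 assumes POISON_POINTER_DELTA is 0. The sketch tries to show that the check reads new->next and new->prev before the add overwrites them, so an entry that merely carries a stale value equal to the poison would trip the warning even though nothing force-poisoned it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical sentinel mirroring LIST_POISON3 (assumes a delta of 0). */
#define SKETCH_POISON ((struct list_head *)0x500)

/* Same idea as list_force_poison(): mark an entry as never to be added. */
static void sketch_force_poison(struct list_head *entry)
{
    entry->next = SKETCH_POISON;
    entry->prev = SKETCH_POISON;
}

/*
 * The condition the debug check evaluates on the NEW entry. It inspects
 * whatever the entry's pointers currently hold, initialized or not.
 */
static bool sketch_looks_poisoned(const struct list_head *new)
{
    return new->next == SKETCH_POISON || new->prev == SKETCH_POISON;
}

/* Minimal list_add_tail(): both pointers of 'new' are overwritten here. */
static void sketch_list_add_tail(struct list_head *new, struct list_head *head)
{
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}
```

In a test harness, the stale stack slot is best simulated by assigning the values explicitly (for example next = some valid pointer, prev = SKETCH_POISON, the pattern Eryu later reported) rather than by reading a genuinely uninitialized variable, which would be undefined behavior in C.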
* Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
2016-02-27 20:10 ` Dan Williams
@ 2016-02-28 5:31 ` Eryu Guan
2016-02-29 18:22 ` Dan Williams
0 siblings, 1 reply; 6+ messages in thread
From: Eryu Guan @ 2016-02-28 5:31 UTC (permalink / raw)
To: Dan Williams; +Cc: Ross Zwisler, XFS Developers

On Sat, Feb 27, 2016 at 12:10:51PM -0800, Dan Williams wrote:
> On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@redhat.com> wrote:
> > Hi,
> >
> > Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
> > "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
> > are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.
>
> Hmm, this triggers when a list_head has ->next or ->prev pointing at
> the address of force_poison which is only defined in lib/list_debug.c.
> The only call site that uses list_force_poison() is in
> devm_memremap_pages(). That currently depends on CONFIG_ZONE_DEVICE
> which in turn depends on X86_64.
>
> So, this appears to be a false positive and the address of
> force_poison is somehow ending up on the stack by accident as that is
> the random value being passed in from __down_common:
>
>     struct semaphore_waiter waiter;
>
>     list_add_tail(&waiter.list, &sem->wait_list);
>
> So, I think we need a more unique poison value that should never
> appear on the stack:

Unfortunately I can still see the warning after applying this test patch.

Then I added debug code to print the pointer value and re-ran the test.
All five failures printed the same pointer value, failed in the same
pattern:

list_add attempted on force-poisoned entry(0000000000000500), new->next = c00000000136bc00, new->prev = 0000000000000500

Thanks,
Eryu

>
> diff --git a/include/linux/poison.h b/include/linux/poison.h
> index 4a27153574e2..0604806c2f52 100644
> --- a/include/linux/poison.h
> +++ b/include/linux/poison.h
> @@ -21,6 +21,7 @@
>   */
>  #define LIST_POISON1  ((void *) 0x100 + POISON_POINTER_DELTA)
>  #define LIST_POISON2  ((void *) 0x200 + POISON_POINTER_DELTA)
> +#define LIST_POISON3  ((void *) 0x500 + POISON_POINTER_DELTA)
>
>  /********** include/linux/timer.h **********/
>  /*
> diff --git a/lib/list_debug.c b/lib/list_debug.c
> index 3345a089ef7b..318bf1c181b2 100644
> --- a/lib/list_debug.c
> +++ b/lib/list_debug.c
> @@ -12,11 +12,10 @@
>  #include <linux/kernel.h>
>  #include <linux/rculist.h>
>
> -static struct list_head force_poison;
>  void list_force_poison(struct list_head *entry)
>  {
> -        entry->next = &force_poison;
> -        entry->prev = &force_poison;
> +        entry->next = LIST_POISON3;
> +        entry->prev = LIST_POISON3;
>  }
>
>  /*
> @@ -30,7 +29,7 @@ void __list_add(struct list_head *new,
>                  struct list_head *prev,
>                  struct list_head *next)
>  {
> -        WARN(new->next == &force_poison || new->prev == &force_poison,
> +        WARN(new->next == LIST_POISON3 || new->prev == LIST_POISON3,
>                  "list_add attempted on force-poisoned entry\n");
>          WARN(next->prev != prev,
>                  "list_add corruption. next->prev should be "
* Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
2016-02-28 5:31 ` Eryu Guan
@ 2016-02-29 18:22 ` Dan Williams
2016-03-01 8:00 ` Eryu Guan
0 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2016-02-29 18:22 UTC (permalink / raw)
To: Eryu Guan; +Cc: Ross Zwisler, XFS Developers

On Sat, Feb 27, 2016 at 9:31 PM, Eryu Guan <eguan@redhat.com> wrote:
> On Sat, Feb 27, 2016 at 12:10:51PM -0800, Dan Williams wrote:
>> On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@redhat.com> wrote:
>> > Hi,
>> >
>> > Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
>> > "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
>> > are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.
>>
>> Hmm, this triggers when a list_head has ->next or ->prev pointing at
>> the address of force_poison which is only defined in lib/list_debug.c.
>> The only call site that uses list_force_poison() is in
>> devm_memremap_pages(). That currently depends on CONFIG_ZONE_DEVICE
>> which in turn depends on X86_64.
>>
>> So, this appears to be a false positive and the address of
>> force_poison is somehow ending up on the stack by accident as that is
>> the random value being passed in from __down_common:
>>
>>     struct semaphore_waiter waiter;
>>
>>     list_add_tail(&waiter.list, &sem->wait_list);
>>
>> So, I think we need a more unique poison value that should never
>> appear on the stack:
>
> Unfortunately I can still see the warning after applying this test patch.
>
> Then I added debug code to print the pointer value and re-ran the test.
> All five failures printed the same pointer value, failed in the same
> pattern:
>
> list_add attempted on force-poisoned entry(0000000000000500), new->next = c00000000136bc00, new->prev = 0000000000000500
>

I think this means that no matter what we do the stack will pick up
these poison values unless the list_head is explicitly initialized.
Something like the following:

diff --git a/kernel/locking/semaphore.c b/kernel/locking/semaphore.c
index b8120abe594b..39929b4e6fbb 100644
--- a/kernel/locking/semaphore.c
+++ b/kernel/locking/semaphore.c
@@ -205,7 +205,9 @@ static inline int __sched __down_common(struct semaphore *sem, long state,
                                                                 long timeout)
 {
         struct task_struct *task = current;
-        struct semaphore_waiter waiter;
+        struct semaphore_waiter waiter = {
+                .list = LIST_HEAD_INIT(waiter.list),
+        };

         list_add_tail(&waiter.list, &sem->wait_list);
         waiter.task = task;
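
The initializer in the patch above can be tried outside the kernel. The following is a minimal userspace sketch in C, not the kernel source: struct list_head and LIST_HEAD_INIT are re-declared locally in the same shape as the kernel definitions, while struct sketch_waiter (including its extra fields) and sketch_waiter_is_clean() are hypothetical names introduced only for illustration. It aims to show that the designated initializer makes the embedded list head point at itself and zero-fills the remaining members, so no leftover stack contents can reach the poison check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Same shape as the kernel macro: an empty list head points at itself. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Hypothetical stand-in for a waiter struct; the field layout is assumed. */
struct sketch_waiter {
    struct list_head list;
    void *task;
    bool up;
};

/*
 * Declare an on-stack waiter the way the patch does and report whether
 * every field starts from a known value instead of stale stack data.
 */
static bool sketch_waiter_is_clean(void)
{
    struct sketch_waiter waiter = {
        .list = LIST_HEAD_INIT(waiter.list),
    };

    /* Members without a designator are zero-initialized by the language. */
    return waiter.list.next == &waiter.list &&
           waiter.list.prev == &waiter.list &&
           waiter.task == NULL &&
           !waiter.up;
}
```

The trade-off is a few extra stores on every slow-path entry in exchange for a debug check that only ever sees defined values.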
* Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
2016-02-29 18:22 ` Dan Williams
@ 2016-03-01 8:00 ` Eryu Guan
2016-03-01 16:27 ` Dan Williams
0 siblings, 1 reply; 6+ messages in thread
From: Eryu Guan @ 2016-03-01 8:00 UTC (permalink / raw)
To: Dan Williams; +Cc: Ross Zwisler, XFS Developers

On Mon, Feb 29, 2016 at 10:22:06AM -0800, Dan Williams wrote:
> On Sat, Feb 27, 2016 at 9:31 PM, Eryu Guan <eguan@redhat.com> wrote:
> > On Sat, Feb 27, 2016 at 12:10:51PM -0800, Dan Williams wrote:
> >> On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@redhat.com> wrote:
> >> > Hi,
> >> >
> >> > Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
> >> > "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
> >> > are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.
> >>
> >> Hmm, this triggers when a list_head has ->next or ->prev pointing at
> >> the address of force_poison which is only defined in lib/list_debug.c.
> >> The only call site that uses list_force_poison() is in
> >> devm_memremap_pages(). That currently depends on CONFIG_ZONE_DEVICE
> >> which in turn depends on X86_64.
> >>
> >> So, this appears to be a false positive and the address of
> >> force_poison is somehow ending up on the stack by accident as that is
> >> the random value being passed in from __down_common:
> >>
> >>     struct semaphore_waiter waiter;
> >>
> >>     list_add_tail(&waiter.list, &sem->wait_list);
> >>
> >> So, I think we need a more unique poison value that should never
> >> appear on the stack:
> >
> > Unfortunately I can still see the warning after applying this test patch.
> >
> > Then I added debug code to print the pointer value and re-ran the test.
> > All five failures printed the same pointer value, failed in the same
> > pattern:
> >
> > list_add attempted on force-poisoned entry(0000000000000500), new->next = c00000000136bc00, new->prev = 0000000000000500
> >
>
> I think this means that no matter what we do the stack will pick up
> these poison values unless the list_head is explicitly initialized.
> Something like the following:

Umm, it's still reproducible... but seems harder than before, it took me
200+ iterations to hit (less than 10 iterations in previous runs)

[ 5465.401191] run fstests generic/320 at 2016-03-01 00:11:13
[ 5465.561754] XFS (sda5): Unmounting Filesystem
[ 5466.202130] XFS (sda5): Mounting V4 Filesystem
[ 5466.260396] XFS (sda5): Ending clean mount
[ 5482.629036] list_add attempted on force-poisoned entry(0000000000000500), new->next == d0000000059ecdb0, new->prev == 0000000000000500
[ 5482.629070] ------------[ cut here ]------------
[ 5482.629077] WARNING: at lib/list_debug.c:33
[ 5482.629082] Modules linked in: pseries_rng(E) sg(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) sunrpc(E) grace(E) ip_tables(E) xfs(E) libcrc32c(E) sd_mod(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E)
[ 5482.629121] CPU: 4 PID: 7203 Comm: rm Tainted: G E 4.5.0-rc5+ #4
[ 5482.629129] task: c0000005f0712d00 ti: c0000004c749c000 task.ti: c0000004c749c000
[ 5482.629136] NIP: c00000000042db78 LR: c00000000042db74 CTR: 00000000013abb8c
[ 5482.629144] REGS: c0000004c749f3a0 TRAP: 0700 Tainted: G E (4.5.0-rc5+)
[ 5482.629150] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI,TM[E]> CR: 22002822 XER: 0000000b
[ 5482.629173] CFAR: c00000000080a5e4 SOFTE: 0
GPR00: c00000000042db74 c0000004c749f620 c00000000136bc00 000000000000007a
GPR04: c0000005ffc09c58 c0000005ffc1b490 000005cf29ac0100 0000000000000000
GPR08: 0000000000000000 c000000000c3b27c 00000005fefd0000 0000000000000f97
GPR12: 0000000042002844 c00000000e822400 0000000000000002 0000000000000000
GPR16: 000000001000da78 000000001000d758 0000010018009cd0 000000001000dab8
GPR20: 0000000000000001 c0000004c749f960 c0000005f5931e00 c0000005f5931e80
GPR24: c0000000fd01c000 c0000000fbe0a400 fffffffffffff000 0000000000000000
GPR28: c0000005ea59f938 c0000005f5931e88 c0000005f1f6b890 c0000004c749f720
[ 5482.629270] NIP [c00000000042db78] .__list_add+0xa8/0x140
[ 5482.629277] LR [c00000000042db74] .__list_add+0xa4/0x140
[ 5482.629282] Call Trace:
[ 5482.629288] [c0000004c749f620] [c00000000042db74] .__list_add+0xa4/0x140 (unreliable)
[ 5482.629299] [c0000004c749f6b0] [c0000000008010ec] .rwsem_down_read_failed+0x6c/0x1a0
[ 5482.629310] [c0000004c749f760] [c000000000800828] .down_read+0x58/0x60
[ 5482.629396] [c0000004c749f7e0] [d000000005a1a6bc] .xfs_log_commit_cil+0x7c/0x600 [xfs]
[ 5482.629482] [c0000004c749f8f0] [d000000005a12848] .__xfs_trans_commit+0x178/0x300 [xfs]
[ 5482.629567] [c0000004c749f990] [d000000005a12f14] .__xfs_trans_roll+0x74/0x130 [xfs]
[ 5482.629653] [c0000004c749fa30] [d0000000059e8994] .xfs_bmap_finish+0xd4/0x1e0 [xfs]
[ 5482.629738] [c0000004c749fae0] [d000000005a06acc] .xfs_inactive_ifree+0x20c/0x2a0 [xfs]
[ 5482.629830] [c0000004c749fb90] [d000000005a06c14] .xfs_inactive+0xb4/0x190 [xfs]
[ 5482.629913] [c0000004c749fc10] [d000000005a0d8f8] .xfs_fs_evict_inode+0xd8/0x170 [xfs]
[ 5482.629923] [c0000004c749fca0] [c0000000002b60d8] .evict+0xe8/0x220
[ 5482.629932] [c0000004c749fd30] [c0000000002a9278] .do_unlinkat+0x248/0x360
[ 5482.629942] [c0000004c749fe30] [c000000000009204] system_call+0x38/0xb4
[ 5482.629948] Instruction dump:
[ 5482.629953] e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020 3c62ff77 38800500
[ 5482.629969] 38632550 7d254b78 483dca15 60000000 <0fe00000> 4bffff90 3c62ff77 7fe4fb78
[ 5482.629985] ---[ end trace 71e305f825b24cc9 ]---

Thanks,
Eryu
* Re: generic/320 triggers "list_add attempted on force-poisoned entry" warning on XFS
2016-03-01 8:00 ` Eryu Guan
@ 2016-03-01 16:27 ` Dan Williams
0 siblings, 0 replies; 6+ messages in thread
From: Dan Williams @ 2016-03-01 16:27 UTC (permalink / raw)
To: Eryu Guan; +Cc: Ross Zwisler, XFS Developers

On Tue, Mar 1, 2016 at 12:00 AM, Eryu Guan <eguan@redhat.com> wrote:
> On Mon, Feb 29, 2016 at 10:22:06AM -0800, Dan Williams wrote:
>> On Sat, Feb 27, 2016 at 9:31 PM, Eryu Guan <eguan@redhat.com> wrote:
>> > On Sat, Feb 27, 2016 at 12:10:51PM -0800, Dan Williams wrote:
>> >> On Sat, Feb 27, 2016 at 5:02 AM, Eryu Guan <eguan@redhat.com> wrote:
>> >> > Hi,
>> >> >
>> >> > Starting from 4.5-rc1 kernel, I sometimes see generic/320 triggers
>> >> > "list_add attempted on force-poisoned entry" warnings on XFS, test hosts
>> >> > are arm64/ppc64/ppc64le, haven't seen it on x86_64 hosts.
>> >>
>> >> Hmm, this triggers when a list_head has ->next or ->prev pointing at
>> >> the address of force_poison which is only defined in lib/list_debug.c.
>> >> The only call site that uses list_force_poison() is in
>> >> devm_memremap_pages(). That currently depends on CONFIG_ZONE_DEVICE
>> >> which in turn depends on X86_64.
>> >>
>> >> So, this appears to be a false positive and the address of
>> >> force_poison is somehow ending up on the stack by accident as that is
>> >> the random value being passed in from __down_common:
>> >>
>> >>     struct semaphore_waiter waiter;
>> >>
>> >>     list_add_tail(&waiter.list, &sem->wait_list);
>> >>
>> >> So, I think we need a more unique poison value that should never
>> >> appear on the stack:
>> >
>> > Unfortunately I can still see the warning after applying this test patch.
>> >
>> > Then I added debug code to print the pointer value and re-ran the test.
>> > All five failures printed the same pointer value, failed in the same
>> > pattern:
>> >
>> > list_add attempted on force-poisoned entry(0000000000000500), new->next = c00000000136bc00, new->prev = 0000000000000500
>> >
>>
>> I think this means that no matter what we do the stack will pick up
>> these poison values unless the list_head is explicitly initialized.
>> Something like the following:
>
> Umm, it's still reproducible... but seems harder than before, it took me
> 200+ iterations to hit (less than 10 iterations in previous runs)

Similar fix, just in rwsem_down_read_failed() this time:

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index a4d4de05b2d1..68678a20da52 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -214,8 +214,10 @@ __visible
 struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 {
         long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
-        struct rwsem_waiter waiter;
         struct task_struct *tsk = current;
+        struct rwsem_waiter waiter = {
+                .list = LIST_HEAD_INIT(waiter.list),
+        };

         /* set up my own style of waitqueue */
         waiter.task = tsk;