* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Li Wang @ 2017-02-10  2:53 UTC
To: ltp

Hi,

I'm trying to run LTP on the upstream kernel 4.10.0-rc7, and found that
madvise07 always fails, with no SIGBUS received when the memory is
mapped MAP_PRIVATE. I'd like to know whether there is any relevant
background on this issue.
Any discussion or documentation about it?

# uname -r
4.10.0-rc7

# ./madvise07
tst_test.c:794: INFO: Timeout per run is 0h 05m 00s
madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
madvise07.c:72: FAIL: Did not receive SIGBUS after accessing MAP_PRIVATE memory marked with MADV_HWPOISON
madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
madvise07.c:90: PASS: madvise(..., MADV_HWPOISON) on MAP_SHARED memory

--
Regards,
Li Wang
Email: liwang@redhat.com

* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Cyril Hrubis @ 2017-02-13  9:08 UTC
To: ltp

Hi!
> I'm trying to run LTP on the upstream kernel 4.10.0-rc7, and found
> that madvise07 always fails, with no SIGBUS received when the memory
> is mapped MAP_PRIVATE. I'd like to know whether there is any relevant
> background on this issue.
> Any discussion or documentation about it?

Looks like a plain old kernel bug to me.

> # uname -r
> 4.10.0-rc7
>
> # ./madvise07
> tst_test.c:794: INFO: Timeout per run is 0h 05m 00s
> madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
> madvise07.c:72: FAIL: Did not receive SIGBUS after accessing
> MAP_PRIVATE memory marked with MADV_HWPOISON

If you reach this TFAIL the child wasn't killed with a signal after it
accessed memory marked with MADV_HWPOISON.

What hardware is this?

> madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
> madvise07.c:90: PASS: madvise(..., MADV_HWPOISON) on MAP_SHARED memory

--
Cyril Hrubis
chrubis@suse.cz

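[Editor's sketch] The condition Cyril describes (was the child killed by a signal or not?) can be illustrated roughly as follows. This is a minimal sketch and not the madvise07 source: the names child_killed_by, child_raises_sigbus and child_exits_cleanly are hypothetical, and raise(SIGBUS) merely stands in for the access to poisoned memory, because MADV_HWPOISON itself needs CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child body: stand-in for touching poisoned memory. */
static void child_raises_sigbus(void)
{
    signal(SIGBUS, SIG_DFL);   /* make sure the default action (kill) applies */
    raise(SIGBUS);
}

/* Hypothetical child body: nothing goes wrong, the child exits normally. */
static void child_exits_cleanly(void)
{
}

/*
 * Fork, run body() in the child, and report how the child ended:
 * 1 if it was killed by signal sig, 0 if it ended any other way,
 * -1 if fork() or waitpid() failed.
 */
static int child_killed_by(void (*body)(void), int sig)
{
    int status;
    pid_t pid;

    assert(body != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        body();
        _exit(0);              /* reached only if body() did not kill us */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) && WTERMSIG(status) == sig;
}
```

In terms of the report above, a return value of 0 for the MAP_PRIVATE case corresponds to the "Did not receive SIGBUS" failure: the child simply ran to completion.
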
* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Richard Palethorpe @ 2017-02-13 12:43 UTC
To: ltp

Hello Li & Metan,

On Mon, 13 Feb 2017 10:08:37 +0100
"Cyril Hrubis" <chrubis@suse.cz> wrote:

> Hi!
> > I'm trying to run LTP on the upstream kernel 4.10.0-rc7, and found
> > that madvise07 always fails, with no SIGBUS received when the memory
> > is mapped MAP_PRIVATE. I'd like to know whether there is any relevant
> > background on this issue.
> > Any discussion or documentation about it?
>
> Looks like a plain old kernel bug to me.

Sorry, I have to admit that I knew this fails, but did not follow it up
before submitting the patch! I don't know whether it is a bug, or if
MADV_HWPOISON is not intended to work with private memory. I would
assume that it is a bug judging by the man pages.

>
> > # uname -r
> > 4.10.0-rc7
> >
> > # ./madvise07
> > tst_test.c:794: INFO: Timeout per run is 0h 05m 00s
> > madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
> > madvise07.c:72: FAIL: Did not receive SIGBUS after accessing
> > MAP_PRIVATE memory marked with MADV_HWPOISON
>
> If you reach this TFAIL the child wasn't killed with a signal after it
> accessed memory marked with MADV_HWPOISON.
>
> What hardware is this?
>
> > madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
> > madvise07.c:90: PASS: madvise(..., MADV_HWPOISON) on MAP_SHARED memory
>

I know that it fails on x86_64 and ppc64le.

Thank you,
Richard.

* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Jan Stancek @ 2017-02-14 14:06 UTC
To: ltp

----- Original Message -----
> From: "Cyril Hrubis" <chrubis@suse.cz>
> To: "Li Wang" <liwang@redhat.com>
> Cc: richiejp@f-m.fm, ltp@lists.linux.it
> Sent: Monday, 13 February, 2017 10:08:37 AM
> Subject: Re: [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
>
> Hi!
> > I'm trying to run LTP on the upstream kernel 4.10.0-rc7, and found
> > that madvise07 always fails, with no SIGBUS received when the memory
> > is mapped MAP_PRIVATE. I'd like to know whether there is any relevant
> > background on this issue.
> > Any discussion or documentation about it?
>
> Looks like a plain old kernel bug to me.

Or maybe MADV_HWPOISON is supposed to work only for faulted-in pages?
It works fine for me with the change below:

diff --git a/testcases/kernel/syscalls/madvise/madvise07.c b/testcases/kernel/syscalls/madvise/madvise07.c
index 2f8c42e..f5fd4b7 100644
--- a/testcases/kernel/syscalls/madvise/madvise07.c
+++ b/testcases/kernel/syscalls/madvise/madvise07.c
@@ -44,13 +44,13 @@ static int maptypes[] = {
 
 static void run_child(int maptype)
 {
-	const size_t msize = 4096;
+	const size_t msize = getpagesize();
 	void *mem = NULL;
 
 	mem = SAFE_MMAP(NULL,
 			msize,
 			PROT_READ | PROT_WRITE,
-			MAP_ANONYMOUS | maptype,
+			MAP_ANONYMOUS | maptype | MAP_POPULATE,
 			-1,
 			0);

>
> > # uname -r
> > 4.10.0-rc7
> >
> > # ./madvise07
> > tst_test.c:794: INFO: Timeout per run is 0h 05m 00s
> > madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
> > madvise07.c:72: FAIL: Did not receive SIGBUS after accessing
> > MAP_PRIVATE memory marked with MADV_HWPOISON
>
> If you reach this TFAIL the child wasn't killed with a signal after it
> accessed memory marked with MADV_HWPOISON.
>
> What hardware is this?

I'm seeing it on an x86 KVM guest, with 2.6.32 (RHEL6.0), 3.10 (RHEL7),
4.8 and 4.9 kernels.

>
> > madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
> > madvise07.c:90: PASS: madvise(..., MADV_HWPOISON) on MAP_SHARED memory
>
> --
> Cyril Hrubis
> chrubis@suse.cz
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
>

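[Editor's sketch] Jan's "faulted-in" hypothesis can be made observable without poisoning anything. The sketch below, under stated assumptions, uses mincore() to ask whether an anonymous MAP_PRIVATE page has a page installed yet: an untouched mapping is expected to report "not resident", while a page that was written to, or mapped with MAP_POPULATE, is expected to report "resident". The helper names map_private_page and page_resident are hypothetical and do not come from LTP. As far as I know, mincore() cannot tell a real page from the shared zero page that a read-only touch would install, so this only illustrates the untouched versus written distinction.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS, MAP_POPULATE and mincore() */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map a single anonymous MAP_PRIVATE page. extra_flags is 0 or
 * MAP_POPULATE. Returns MAP_FAILED on error, like mmap().
 */
static void *map_private_page(int extra_flags)
{
    long psize = sysconf(_SC_PAGESIZE);

    assert(psize > 0);
    return mmap(NULL, (size_t)psize, PROT_READ | PROT_WRITE,
                MAP_ANONYMOUS | MAP_PRIVATE | extra_flags, -1, 0);
}

/*
 * Ask the kernel whether the page at 'page' is resident.
 * Returns 1 if resident, 0 if not, -1 on error.
 */
static int page_resident(void *page)
{
    unsigned char vec = 0;
    long psize = sysconf(_SC_PAGESIZE);

    if (page == MAP_FAILED || psize <= 0)
        return -1;
    if (mincore(page, (size_t)psize, &vec) == -1)
        return -1;

    return vec & 1;   /* only the least significant bit is defined */
}
```

A caller would map a page, optionally store a byte into it, and then compare page_resident() results; the store corresponds to faulting the page in by hand, whereas MAP_POPULATE asks the kernel to do it at mmap() time.
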
* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Richard Palethorpe @ 2017-02-14 15:18 UTC
To: ltp

Hi Jan,

On Tue, 14 Feb 2017 09:06:14 -0500 (EST)
"Jan Stancek" <jstancek@redhat.com> wrote:

>
> Or maybe MADV_HWPOISON is supposed to work only for faulted-in pages?
> It works fine for me with the change below:
>
> diff --git a/testcases/kernel/syscalls/madvise/madvise07.c b/testcases/kernel/syscalls/madvise/madvise07.c
> index 2f8c42e..f5fd4b7 100644
> --- a/testcases/kernel/syscalls/madvise/madvise07.c
> +++ b/testcases/kernel/syscalls/madvise/madvise07.c
> @@ -44,13 +44,13 @@ static int maptypes[] = {
>
>  static void run_child(int maptype)
>  {
> -	const size_t msize = 4096;
> +	const size_t msize = getpagesize();
>  	void *mem = NULL;
>
>  	mem = SAFE_MMAP(NULL,
>  			msize,
>  			PROT_READ | PROT_WRITE,
> -			MAP_ANONYMOUS | maptype,
> +			MAP_ANONYMOUS | maptype | MAP_POPULATE,
>  			-1,
>  			0);
>

My only concern is that this is not documented in the man pages, but
considering we are testing a test interface, I'm not sure it matters.

Thank you,
Richard.

* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Jan Stancek @ 2017-02-14 15:25 UTC
To: ltp

----- Original Message -----
> From: "Richard Palethorpe" <rpalethorpe@suse.com>
> To: "Jan Stancek" <jstancek@redhat.com>
> Cc: "Cyril Hrubis" <chrubis@suse.cz>, ltp@lists.linux.it
> Sent: Tuesday, 14 February, 2017 4:18:45 PM
> Subject: Re: [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
>
> Hi Jan,
>
> On Tue, 14 Feb 2017 09:06:14 -0500 (EST)
> "Jan Stancek" <jstancek@redhat.com> wrote:
>
> >
> > Or maybe MADV_HWPOISON is supposed to work only for faulted-in pages?
> > It works fine for me with the change below:
> >
> > diff --git a/testcases/kernel/syscalls/madvise/madvise07.c
> > b/testcases/kernel/syscalls/madvise/madvise07.c
> > index 2f8c42e..f5fd4b7 100644
> > --- a/testcases/kernel/syscalls/madvise/madvise07.c
> > +++ b/testcases/kernel/syscalls/madvise/madvise07.c
> > @@ -44,13 +44,13 @@ static int maptypes[] = {
> >
> >  static void run_child(int maptype)
> >  {
> > -	const size_t msize = 4096;
> > +	const size_t msize = getpagesize();
> >  	void *mem = NULL;
> >
> >  	mem = SAFE_MMAP(NULL,
> >  			msize,
> >  			PROT_READ | PROT_WRITE,
> > -			MAP_ANONYMOUS | maptype,
> > +			MAP_ANONYMOUS | maptype | MAP_POPULATE,
> >  			-1,
> >  			0);
> >
>
> My only concern is that this is not documented in the man pages, but
> considering we are testing a test interface, I'm not sure it matters.

I'll ask on linux-mm.

>
> Thank you,
> Richard.
>

* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Li Wang @ 2017-02-15  9:38 UTC
To: ltp

On Tue, Feb 14, 2017 at 10:06 PM, Jan Stancek <jstancek@redhat.com> wrote:
>
>
> ----- Original Message -----
>> From: "Cyril Hrubis" <chrubis@suse.cz>
>> To: "Li Wang" <liwang@redhat.com>
>> Cc: richiejp@f-m.fm, ltp@lists.linux.it
>> Sent: Monday, 13 February, 2017 10:08:37 AM
>> Subject: Re: [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
>>
>> Hi!
>> > I'm trying to run LTP on the upstream kernel 4.10.0-rc7, and found
>> > that madvise07 always fails, with no SIGBUS received when the memory
>> > is mapped MAP_PRIVATE. I'd like to know whether there is any relevant
>> > background on this issue.
>> > Any discussion or documentation about it?
>>
>> Looks like a plain old kernel bug to me.
>
> Or maybe MADV_HWPOISON is supposed to work only for faulted-in pages?

That seems reasonable. Since the MAP_PRIVATE flag creates a private
copy-on-write mapping, the test case ends up poisoning the read-only
empty zero page, and does so many times if we map more than one page.
I ran a test that confirms this.

E.g., running only the MAP_PRIVATE part of madvise07 with 4 pages on
RHEL 7.3:

# dmesg
[   62.322637] Injecting memory failure for page 1c9d at 7f0594254000
[   62.329660] MCE 0x1c9d: reserved kernel page still referenced by 1 users
[   62.337143] MCE 0x1c9d: reserved kernel page recovery: Failed
[   91.505460] Injecting memory failure for page 1c9d at 7f09ab16e000
[   91.512363] MCE 0x1c9d: already hardware poisoned
[   91.517620] Injecting memory failure for page 1c9d at 7f09ab16f000
[   91.524516] MCE 0x1c9d: already hardware poisoned
[   91.529763] Injecting memory failure for page 1c9d at 7f09ab170000
[   91.536659] MCE 0x1c9d: already hardware poisoned

An upstream kernel patch fixed a similar problem, so I think it makes
sense to fix our LTP test case madvise07.c.

commit 29b4eedee67b449534214058e1bcb36307a7f1dc
Author: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Date:   Wed Sep 11 14:22:59 2013 -0700

    mm/hwpoison.c: fix held reference count after unpoisoning empty zero page

> It works fine for me with the change below:
>
> diff --git a/testcases/kernel/syscalls/madvise/madvise07.c b/testcases/kernel/syscalls/madvise/madvise07.c
> index 2f8c42e..f5fd4b7 100644
> --- a/testcases/kernel/syscalls/madvise/madvise07.c
> +++ b/testcases/kernel/syscalls/madvise/madvise07.c
> @@ -44,13 +44,13 @@ static int maptypes[] = {
>
>  static void run_child(int maptype)
>  {
> -	const size_t msize = 4096;
> +	const size_t msize = getpagesize();
>  	void *mem = NULL;
>
>  	mem = SAFE_MMAP(NULL,
>  			msize,
>  			PROT_READ | PROT_WRITE,
> -			MAP_ANONYMOUS | maptype,
> +			MAP_ANONYMOUS | maptype | MAP_POPULATE,
>  			-1,
>  			0);
>

Another way to fix the problem, which I propose, is simply to write to
the page before calling madvise():

$ git diff
diff --git a/testcases/kernel/syscalls/madvise/madvise07.c b/testcases/kernel/syscalls/madvise/madvise07.c
index 2f8c42e..0ed5307 100644
--- a/testcases/kernel/syscalls/madvise/madvise07.c
+++ b/testcases/kernel/syscalls/madvise/madvise07.c
@@ -54,6 +54,8 @@ static void run_child(int maptype)
 			-1,
 			0);
 
+	*((char *)mem) = 'a';
+
 	tst_res(TINFO, "madvise(%p, %zu, MADV_HWPOISON)", mem, msize);
 	if (madvise(mem, msize, MADV_HWPOISON) == -1) {
 		if (errno == EINVAL)

>
>>
>> > # uname -r
>> > 4.10.0-rc7
>> >
>> > # ./madvise07
>> > tst_test.c:794: INFO: Timeout per run is 0h 05m 00s
>> > madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
>> > madvise07.c:72: FAIL: Did not receive SIGBUS after accessing
>> > MAP_PRIVATE memory marked with MADV_HWPOISON
>>
>> If you reach this TFAIL the child wasn't killed with a signal after it
>> accessed memory marked with MADV_HWPOISON.
>>
>> What hardware is this?
>
> I'm seeing it on an x86 KVM guest, with 2.6.32 (RHEL6.0), 3.10 (RHEL7),
> 4.8 and 4.9 kernels.
>
>>
>> > madvise07.c:57: INFO: madvise(0x7f25bdd7e000, 4096, MADV_HWPOISON)
>> > madvise07.c:90: PASS: madvise(..., MADV_HWPOISON) on MAP_SHARED memory
>>
>> --
>> Cyril Hrubis
>> chrubis@suse.cz
>>
>> --
>> Mailing list info: https://lists.linux.it/listinfo/ltp
>>

--
Regards,
Li Wang
Email: liwang@redhat.com

* [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
From: Li Wang @ 2017-02-15  9:45 UTC
To: ltp

On Wed, Feb 15, 2017 at 5:38 PM, Li Wang <liwang@redhat.com> wrote:
> On Tue, Feb 14, 2017 at 10:06 PM, Jan Stancek <jstancek@redhat.com> wrote:
>>
>>
>> ----- Original Message -----
>>> From: "Cyril Hrubis" <chrubis@suse.cz>
>>> To: "Li Wang" <liwang@redhat.com>
>>> Cc: richiejp@f-m.fm, ltp@lists.linux.it
>>> Sent: Monday, 13 February, 2017 10:08:37 AM
>>> Subject: Re: [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS
>>>
>>> Hi!
>>> > I'm trying to run LTP on the upstream kernel 4.10.0-rc7, and found
>>> > that madvise07 always fails, with no SIGBUS received when the memory
>>> > is mapped MAP_PRIVATE. I'd like to know whether there is any relevant
>>> > background on this issue.
>>> > Any discussion or documentation about it?
>>>
>>> Looks like a plain old kernel bug to me.
>>
>> Or maybe MADV_HWPOISON is supposed to work only for faulted-in pages?
>
> That seems reasonable. Since the MAP_PRIVATE flag creates a private
> copy-on-write mapping, the test case ends up poisoning the read-only
> empty zero page, and does so many times if we map more than one page.
> I ran a test that confirms this.
>
> E.g., running only the MAP_PRIVATE part of madvise07 with 4 pages on
> RHEL 7.3:
>
> # dmesg
> [   62.322637] Injecting memory failure for page 1c9d at 7f0594254000
> [   62.329660] MCE 0x1c9d: reserved kernel page still referenced by 1 users
> [   62.337143] MCE 0x1c9d: reserved kernel page recovery: Failed
> [   91.505460] Injecting memory failure for page 1c9d at 7f09ab16e000
> [   91.512363] MCE 0x1c9d: already hardware poisoned
> [   91.517620] Injecting memory failure for page 1c9d at 7f09ab16f000
> [   91.524516] MCE 0x1c9d: already hardware poisoned
> [   91.529763] Injecting memory failure for page 1c9d at 7f09ab170000
> [   91.536659] MCE 0x1c9d: already hardware poisoned
>
> An upstream kernel patch fixed a similar problem, so I think it makes
> sense to fix our LTP test case madvise07.c.
>
> commit 29b4eedee67b449534214058e1bcb36307a7f1dc
> Author: Wanpeng Li <liwanp@linux.vnet.ibm.com>
> Date:   Wed Sep 11 14:22:59 2013 -0700
>
>     mm/hwpoison.c: fix held reference count after unpoisoning empty zero page
>
>> It works fine for me with the change below:
>>
>> diff --git a/testcases/kernel/syscalls/madvise/madvise07.c b/testcases/kernel/syscalls/madvise/madvise07.c
>> index 2f8c42e..f5fd4b7 100644
>> --- a/testcases/kernel/syscalls/madvise/madvise07.c
>> +++ b/testcases/kernel/syscalls/madvise/madvise07.c
>> @@ -44,13 +44,13 @@ static int maptypes[] = {
>>
>>  static void run_child(int maptype)
>>  {
>> -	const size_t msize = 4096;
>> +	const size_t msize = getpagesize();
>>  	void *mem = NULL;
>>
>>  	mem = SAFE_MMAP(NULL,
>>  			msize,
>>  			PROT_READ | PROT_WRITE,
>> -			MAP_ANONYMOUS | maptype,
>> +			MAP_ANONYMOUS | maptype | MAP_POPULATE,
>>  			-1,
>>  			0);
>>
>
> Another way to fix the problem, which I propose, is simply to write to
> the page before calling madvise():
>
> $ git diff
> diff --git a/testcases/kernel/syscalls/madvise/madvise07.c
> b/testcases/kernel/syscalls/madvise/madvise07.c
> index 2f8c42e..0ed5307 100644
> --- a/testcases/kernel/syscalls/madvise/madvise07.c
> +++ b/testcases/kernel/syscalls/madvise/madvise07.c
> @@ -54,6 +54,8 @@ static void run_child(int maptype)
>  			-1,
>  			0);
>
> +	*((char *)mem) = 'a';
> +
>  	tst_res(TINFO, "madvise(%p, %zu, MADV_HWPOISON)", mem, msize);
>  	if (madvise(mem, msize, MADV_HWPOISON) == -1) {
>  		if (errno == EINVAL)
>

The result from the patched madvise07 is attached below:

# ./madvise07
tst_test.c:792: INFO: Timeout per run is 0h 05m 00s
madvise07.c:54: INFO: madvise(0x7f864a116000, 4096, MADV_HWPOISON)
madvise07.c:88: PASS: madvise(..., MADV_HWPOISON) on MAP_PRIVATE memory
madvise07.c:54: INFO: madvise(0x7f864a116000, 4096, MADV_HWPOISON)
madvise07.c:88: PASS: madvise(..., MADV_HWPOISON) on MAP_SHARED memory

Summary:
passed   2
failed   0
skipped  0
warnings 0

# dmesg
[  636.254254] Injecting memory failure for page 223cfd at 7f864a116000
[  636.261400] MCE 0x223cfd: dirty LRU page recovery: Recovered
[  636.267722] MCE: Killing madvise07:2498 due to hardware memory corruption fault at 7f864a116000
[  636.277674] Injecting memory failure for page 223d18 at 7f864a116000
[  636.284811] MCE 0x223d18: dirty LRU page recovery: Recovered
[  636.291133] MCE: Killing madvise07:2499 due to hardware memory corruption fault at 7f864a116000

Regards,
Li Wang

Thread overview: 8+ messages
2017-02-10  2:53 [LTP] madvise07.c:72: FAIL: Did not receive SIGBUS Li Wang
2017-02-13  9:08 ` Cyril Hrubis
2017-02-13 12:43   ` Richard Palethorpe
2017-02-14 14:06   ` Jan Stancek
2017-02-14 15:18     ` Richard Palethorpe
2017-02-14 15:25       ` Jan Stancek
2017-02-15  9:38     ` Li Wang
2017-02-15  9:45       ` Li Wang