* Segmentation fault with drgn + libkdumpfile
@ 2024-01-05 18:38 Stephen Brennan
2024-01-05 19:23 ` Petr Tesařík
From: Stephen Brennan @ 2024-01-05 18:38 UTC (permalink / raw)
To: Petr Tesarik; +Cc: Omar Sandoval, linux-debuggers
Hi Petr,
I recently encountered a segmentation fault with libkdumpfile & drgn
which appears to be related to the cache implementation. I've included
the stack trace at the end of this message, since it's a bit of a longer
one. The exact issue occurred with a test vmcore that I could probably
share with you privately if you'd like. In any case, the reproducer is
fairly straightforward in drgn code:
for t in for_each_task(prog):
    prog.stack_trace(t)
for t in for_each_task(prog):
    prog.stack_trace(t)
The repetition is required: the segfault only occurs on the second
iteration of the loop, which, in hindsight, is a textbook sign that the
issue has to do with caching. I expect the issue is specific to this
vmcore; it doesn't reproduce on others.
I put that reproducer into a git bisect script and bisected to the
libkdumpfile commit that introduced the crash:
commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
Author: Petr Tesarik <petr@tesarici.cz>
Date: Wed Jan 11 22:53:01 2023 +0100
Cache: Calculate eprobe in reinit_entry()
If this function is called to reuse a ghost entry, the probe list
has not been walked yet, so eprobe is left uninitialized.
This passed the test case, because the correct old value was left
on stack. Modify the test case to poison the stack.
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
src/kdumpfile/cache.c | 6 +++++-
src/kdumpfile/test-cache.c | 13 +++++++++++++
2 files changed, 18 insertions(+), 1 deletion(-)
I haven't yet tried to debug the logic of the cache implementation and
create a patch. I'm totally willing to try that, but I figured I would
send this report to you first, to see if there's something obvious that
sticks out to your eyes.
Thanks!
Stephen
#0 0x00007fb9a1ac1ef3 in lzo1x_decompress_safe () from /lib64/liblzo2.so.2
#1 0x00007fb9af164151 in diskdump_read_page (pio=pio@entry=0x7fff1f5fa760)
at diskdump.c:584
#2 0x00007fb9af16bcd8 in _kdumpfile_priv_cache_get_page (pio=0x7fff1f5fa760,
fn=0x7fb9af163cd0 <diskdump_read_page>) at read.c:69
#3 0x00007fb9af16bc15 in get_page (pio=0x7fff1f5fa760)
at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
#4 get_page_xlat (pio=pio@entry=0x7fff1f5fa760) at read.c:126
#5 0x00007fb9af16be57 in get_page_maybe_xlat (pio=0x7fff1f5fa760) at read.c:137
#6 _kdumpfile_priv_read_locked (ctx=ctx@entry=0x564a0a9475e0,
as=as@entry=KDUMP_KVADDR, addr=addr@entry=18446612133360081960,
buffer=buffer@entry=0x7fff1f5fa917, plength=plength@entry=0x7fff1f5fa848)
at read.c:169
#7 0x00007fb9af16beee in kdump_read (ctx=ctx@entry=0x564a0a9475e0, as=KDUMP_KVADDR,
addr=addr@entry=18446612133360081960, buffer=0x7fff1f5fa917,
plength=plength@entry=0x7fff1f5fa848) at read.c:196
#8 0x00007fb9a1bc9a8f in drgn_read_kdump (buf=<optimized out>,
address=18446612133360081960, count=<optimized out>, offset=<optimized out>,
arg=0x564a0a9475e0, physical=<optimized out>) at ../../libdrgn/kdump.c:73
#9 0x00007fb9a1bb67bd in drgn_memory_reader_read (reader=reader@entry=0x564a0aa3c440,
buf=buf@entry=0x7fff1f5fa917, address=<optimized out>, count=count@entry=4,
physical=physical@entry=false) at ../../libdrgn/memory_reader.c:260
#10 0x00007fb9a1bc0160 in drgn_program_read_memory (prog=0x564a0aa3c440,
buf=buf@entry=0x7fff1f5fa917, address=<optimized out>, count=count@entry=4,
physical=physical@entry=false) at ../../libdrgn/program.c:1648
#11 0x00007fb9a1bb6b61 in drgn_object_read_reference (obj=0x7fff1f5fab10,
value=value@entry=0x7fff1f5fa990) at ../../libdrgn/object.c:739
#12 0x00007fb9a1bb82c8 in drgn_object_read_value (obj=<optimized out>,
value=0x7fff1f5fa990, ret=0x7fff1f5fa998) at ../../libdrgn/object.c:782
#13 0x00007fb9a1bb8311 in drgn_object_value_signed (obj=0x7fff1f5fab10,
ret=0x7fff1f5fa9f0) at ../../libdrgn/object.c:905
#14 0x00007fb9a1bb8632 in drgn_object_is_zero_impl (obj=obj@entry=0x7fff1f5fab10,
ret=ret@entry=0x7fff1f5faab6) at ../../libdrgn/object.c:1250
#15 0x00007fb9a1bb9338 in drgn_object_is_zero (obj=obj@entry=0x7fff1f5fab10,
ret=ret@entry=0x7fff1f5faab6) at ../../libdrgn/object.c:1322
#16 0x00007fb9a1babb13 in c_op_bool (obj=0x7fff1f5fab10, ret=0x7fff1f5faab6)
at ../../libdrgn/language_c.c:3307
#17 0x00007fb9a1bc3471 in drgn_get_initial_registers (ret=0x7fff1f5faad0,
thread_obj=<optimized out>, tid=0, prog=0x564a0aa3c440)
at ../../libdrgn/stack_trace.c:631
#18 drgn_get_stack_trace (prog=0x564a0aa3c440, tid=tid@entry=0, obj=<optimized out>,
prstatus=prstatus@entry=0x0, ret=ret@entry=0x7fff1f5fab88)
at ../../libdrgn/stack_trace.c:1091
#19 0x00007fb9a1bc4726 in drgn_get_stack_trace (ret=0x7fff1f5fab88, prstatus=0x0,
obj=<optimized out>, tid=0, prog=<optimized out>)
at ../../libdrgn/stack_trace.c:1151
#20 0x00007fb9a1b7fc20 in Program_stack_trace (self=0x564a0aa3c430,
args=<optimized out>, kwds=<optimized out>) at ../../libdrgn/python/program.c:849
...
many more python stack frames :)
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-05 18:38 Segmentation fault with drgn + libkdumpfile Stephen Brennan
@ 2024-01-05 19:23 ` Petr Tesařík
2024-01-05 21:53 ` Stephen Brennan
From: Petr Tesařík @ 2024-01-05 19:23 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Fri, 05 Jan 2024 10:38:16 -0800
Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> Hi Petr,
>
> I recently encountered a segmentation fault with libkdumpfile & drgn
> which appears to be related to the cache implementation. I've included
> the stack trace at the end of this message, since it's a bit of a longer
> one. The exact issue occurred with a test vmcore that I could probably
> share with you privately if you'd like. In any case, the reproducer is
> fairly straightforward in drgn code:
>
> for t in for_each_task(prog):
>     prog.stack_trace(t)
> for t in for_each_task(prog):
>     prog.stack_trace(t)
>
> The repetition is required, the segfault only occurs on the second
> iteration of the loop. Which, in hindsight, is a textbook sign that the
> issue has to do with caching. I'd expect that the issue is specific to
> this vmcore, it doesn't reproduce on others.
>
> I stuck that into a git bisect script and bisected the libkdumpfile
> commit that introduced it:
>
> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> Author: Petr Tesarik <petr@tesarici.cz>
> Date: Wed Jan 11 22:53:01 2023 +0100
>
> Cache: Calculate eprobe in reinit_entry()
>
> If this function is called to reuse a ghost entry, the probe list
> has not been walked yet, so eprobe is left uninitialized.
>
> This passed the test case, because the correct old value was left
> on stack. Modify the test case to poison the stack.
>
> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
>
> src/kdumpfile/cache.c | 6 +++++-
> src/kdumpfile/test-cache.c | 13 +++++++++++++
> 2 files changed, 18 insertions(+), 1 deletion(-)
This looks like a red herring to me. The cache most likely continues in
a corrupted state without this commit, which may mask the issue (until
it resurfaces later).
> I haven't yet tried to debug the logic of the cache implementation and
> create a patch. I'm totally willing to try that, but I figured I would
> send this report to you first, to see if there's something obvious that
> sticks out to your eyes.
No, but I should be able to recreate the issue if I get a log of the
cache API calls:
- cache_alloc() - to know the number of elements
- cache_get_entry()
- cache_put_entry()
- cache_insert()
- cache_discard()
- cache_flush() - not likely after initialization, but...
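For illustration only, a replay driver built on such a log might look
roughly like the sketch below. This is an untested sketch: it assumes a
log format like "%p: cache_get_entry(%lu) -> %p" and similar lines for
the other calls, assumes it is compiled inside the libkdumpfile tree so
that kdumpfile-priv.h provides struct cache, cache_key_t and the
internal cache_* API, and it does not set entry->data for
caller-managed caches.
%<-----------------------
/* Untested sketch of a cache-call replay driver (assumptions above). */
#include <stdio.h>
#include <stdlib.h>

#include "kdumpfile-priv.h"

#define MAX_MAP 65536

/* Map entry pointers as they appear in the log to entries of this run. */
static void *logged[MAX_MAP];
static struct cache_entry *live[MAX_MAP];
static size_t nmap;

static void remember(void *p, struct cache_entry *e)
{
	size_t i;

	for (i = 0; i < nmap; ++i)
		if (logged[i] == p) {
			live[i] = e;
			return;
		}
	if (nmap < MAX_MAP) {
		logged[nmap] = p;
		live[nmap++] = e;
	}
}

static struct cache_entry *lookup(void *p)
{
	size_t i;

	for (i = 0; i < nmap; ++i)
		if (logged[i] == p)
			return live[i];
	return NULL;
}

int main(void)
{
	struct cache *cache = NULL;
	struct cache_entry *ce;
	char line[256];
	void *c, *e;
	unsigned long key;
	unsigned n;
	size_t sz;

	while (fgets(line, sizeof line, stdin)) {
		if (sscanf(line, "%p: cache_alloc(%u, %zu)", &c, &n, &sz) == 3) {
			cache = cache_alloc(n, sz);
		} else if (sscanf(line, "%p: cache_get_entry(%lu) -> %p",
				  &c, &key, &e) == 3) {
			ce = cache_get_entry(cache, key);
			if (ce)
				remember(e, ce);
		} else if (sscanf(line, "%p: cache_insert(%p)", &c, &e) == 2) {
			if ((ce = lookup(e)))
				cache_insert(cache, ce);
		} else if (sscanf(line, "%p: cache_put_entry(%p)", &c, &e) == 2) {
			if ((ce = lookup(e)))
				cache_put_entry(cache, ce);
		} else if (sscanf(line, "%p: cache_discard_entry(%p)", &c, &e) == 2) {
			if ((ce = lookup(e)))
				cache_discard(cache, ce);
		} else if (sscanf(line, "%p: cache_flush()", &c) == 1) {
			cache_flush(cache);
		}
	}
	return 0;
}
%<-----------------------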
Petr T
> Thanks!
> Stephen
>
>
> #0 0x00007fb9a1ac1ef3 in lzo1x_decompress_safe () from /lib64/liblzo2.so.2
> #1 0x00007fb9af164151 in diskdump_read_page (pio=pio@entry=0x7fff1f5fa760)
> at diskdump.c:584
> #2 0x00007fb9af16bcd8 in _kdumpfile_priv_cache_get_page (pio=0x7fff1f5fa760,
> fn=0x7fb9af163cd0 <diskdump_read_page>) at read.c:69
> #3 0x00007fb9af16bc15 in get_page (pio=0x7fff1f5fa760)
> at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
> #4 get_page_xlat (pio=pio@entry=0x7fff1f5fa760) at read.c:126
> #5 0x00007fb9af16be57 in get_page_maybe_xlat (pio=0x7fff1f5fa760) at read.c:137
> #6 _kdumpfile_priv_read_locked (ctx=ctx@entry=0x564a0a9475e0,
> as=as@entry=KDUMP_KVADDR, addr=addr@entry=18446612133360081960,
> buffer=buffer@entry=0x7fff1f5fa917, plength=plength@entry=0x7fff1f5fa848)
> at read.c:169
> #7 0x00007fb9af16beee in kdump_read (ctx=ctx@entry=0x564a0a9475e0, as=KDUMP_KVADDR,
> addr=addr@entry=18446612133360081960, buffer=0x7fff1f5fa917,
> plength=plength@entry=0x7fff1f5fa848) at read.c:196
> #8 0x00007fb9a1bc9a8f in drgn_read_kdump (buf=<optimized out>,
> address=18446612133360081960, count=<optimized out>, offset=<optimized out>,
> arg=0x564a0a9475e0, physical=<optimized out>) at ../../libdrgn/kdump.c:73
> #9 0x00007fb9a1bb67bd in drgn_memory_reader_read (reader=reader@entry=0x564a0aa3c440,
> buf=buf@entry=0x7fff1f5fa917, address=<optimized out>, count=count@entry=4,
> physical=physical@entry=false) at ../../libdrgn/memory_reader.c:260
> #10 0x00007fb9a1bc0160 in drgn_program_read_memory (prog=0x564a0aa3c440,
> buf=buf@entry=0x7fff1f5fa917, address=<optimized out>, count=count@entry=4,
> physical=physical@entry=false) at ../../libdrgn/program.c:1648
> #11 0x00007fb9a1bb6b61 in drgn_object_read_reference (obj=0x7fff1f5fab10,
> value=value@entry=0x7fff1f5fa990) at ../../libdrgn/object.c:739
> #12 0x00007fb9a1bb82c8 in drgn_object_read_value (obj=<optimized out>,
> value=0x7fff1f5fa990, ret=0x7fff1f5fa998) at ../../libdrgn/object.c:782
> #13 0x00007fb9a1bb8311 in drgn_object_value_signed (obj=0x7fff1f5fab10,
> ret=0x7fff1f5fa9f0) at ../../libdrgn/object.c:905
> #14 0x00007fb9a1bb8632 in drgn_object_is_zero_impl (obj=obj@entry=0x7fff1f5fab10,
> ret=ret@entry=0x7fff1f5faab6) at ../../libdrgn/object.c:1250
> #15 0x00007fb9a1bb9338 in drgn_object_is_zero (obj=obj@entry=0x7fff1f5fab10,
> ret=ret@entry=0x7fff1f5faab6) at ../../libdrgn/object.c:1322
> #16 0x00007fb9a1babb13 in c_op_bool (obj=0x7fff1f5fab10, ret=0x7fff1f5faab6)
> at ../../libdrgn/language_c.c:3307
> #17 0x00007fb9a1bc3471 in drgn_get_initial_registers (ret=0x7fff1f5faad0,
> thread_obj=<optimized out>, tid=0, prog=0x564a0aa3c440)
> at ../../libdrgn/stack_trace.c:631
> #18 drgn_get_stack_trace (prog=0x564a0aa3c440, tid=tid@entry=0, obj=<optimized out>,
> prstatus=prstatus@entry=0x0, ret=ret@entry=0x7fff1f5fab88)
> at ../../libdrgn/stack_trace.c:1091
> #19 0x00007fb9a1bc4726 in drgn_get_stack_trace (ret=0x7fff1f5fab88, prstatus=0x0,
> obj=<optimized out>, tid=0, prog=<optimized out>)
> at ../../libdrgn/stack_trace.c:1151
> #20 0x00007fb9a1b7fc20 in Program_stack_trace (self=0x564a0aa3c430,
> args=<optimized out>, kwds=<optimized out>) at ../../libdrgn/python/program.c:849
> ...
> many more python stack frames :)
>
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-05 19:23 ` Petr Tesařík
@ 2024-01-05 21:53 ` Stephen Brennan
2024-01-08 20:40 ` Petr Tesařík
From: Stephen Brennan @ 2024-01-05 21:53 UTC (permalink / raw)
To: Petr Tesařík; +Cc: Omar Sandoval, linux-debuggers
[-- Attachment #1: Type: text/plain, Size: 4821 bytes --]
Petr Tesařík <petr@tesarici.cz> writes:
> On Fri, 05 Jan 2024 10:38:16 -0800
> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>
>> Hi Petr,
>>
>> I recently encountered a segmentation fault with libkdumpfile & drgn
>> which appears to be related to the cache implementation. I've included
>> the stack trace at the end of this message, since it's a bit of a longer
>> one. The exact issue occurred with a test vmcore that I could probably
>> share with you privately if you'd like. In any case, the reproducer is
>> fairly straightforward in drgn code:
>>
>> for t in for_each_task(prog):
>>     prog.stack_trace(t)
>> for t in for_each_task(prog):
>>     prog.stack_trace(t)
>>
>> The repetition is required, the segfault only occurs on the second
>> iteration of the loop. Which, in hindsight, is a textbook sign that the
>> issue has to do with caching. I'd expect that the issue is specific to
>> this vmcore, it doesn't reproduce on others.
>>
>> I stuck that into a git bisect script and bisected the libkdumpfile
>> commit that introduced it:
>>
>> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
>> Author: Petr Tesarik <petr@tesarici.cz>
>> Date: Wed Jan 11 22:53:01 2023 +0100
>>
>> Cache: Calculate eprobe in reinit_entry()
>>
>> If this function is called to reuse a ghost entry, the probe list
>> has not been walked yet, so eprobe is left uninitialized.
>>
>> This passed the test case, because the correct old value was left
>> on stack. Modify the test case to poison the stack.
>>
>> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
>>
>> src/kdumpfile/cache.c | 6 +++++-
>> src/kdumpfile/test-cache.c | 13 +++++++++++++
>> 2 files changed, 18 insertions(+), 1 deletion(-)
>
> This looks like a red herring to me. The cache most likely continues in
> a corrupted state without this commit, which may mask the issue (until
> it resurfaces later).
I see, that makes a lot of sense.
>> I haven't yet tried to debug the logic of the cache implementation and
>> create a patch. I'm totally willing to try that, but I figured I would
>> send this report to you first, to see if there's something obvious that
>> sticks out to your eyes.
>
> No, but I should be able to recreate the issue if I get a log of the
> cache API calls:
>
> - cache_alloc() - to know the number of elements
> - cache_get_entry()
> - cache_put_entry()
> - cache_insert()
> - cache_discard()
> - cache_flush() - not likely after initialization, but...
I went ahead and logged each of these calls as you suggested. I tried to
log them at the beginning of each function and to always include the
cache pointer, the cache_entry, and the key. I took the resulting log,
filtered it down to just the most recently logged cache prior to the
crash, compressed it, and attached it. For completeness, the patch
I used is below (it applies to tip branch 8254897 ("Merge pull request #78
from fweimer-rh/c99")).
I'll also see if I can reproduce it based on the log.
diff --git a/src/kdumpfile/cache.c b/src/kdumpfile/cache.c
index 826d632..1f5820b 100644
--- a/src/kdumpfile/cache.c
+++ b/src/kdumpfile/cache.c
@@ -33,6 +33,7 @@
 #include "kdumpfile-priv.h"
 
 #include <stdlib.h>
+#include <stdio.h>
 #include <limits.h>
 
 /** Simple cache.
@@ -531,6 +532,7 @@ cache_get_entry(struct cache *cache, cache_key_t key)
 	entry = cache_get_entry_noref(cache, key);
 	if (entry)
 		++entry->refcnt;
+	fprintf(stderr, "%p: cache_get_entry(%lu) -> %p\n", cache, key, entry);
 	return entry;
 }
 
@@ -549,6 +551,7 @@ cache_insert(struct cache *cache, struct cache_entry *entry)
 {
 	unsigned idx;
 
+	fprintf(stderr, "%p: cache_insert(%p) (key=%lu)\n", cache, entry, entry->key);
 	if (cache_entry_valid(entry))
 		return;
 
@@ -584,6 +587,7 @@ cache_insert(struct cache *cache, struct cache_entry *entry)
 void
 cache_put_entry(struct cache *cache, struct cache_entry *entry)
 {
+	fprintf(stderr, "%p: cache_put_entry(%p) (key=%lu)\n", cache, entry, entry->key);
 	--entry->refcnt;
 }
 
@@ -605,6 +609,8 @@ cache_discard(struct cache *cache, struct cache_entry *entry)
 {
 	unsigned n, idx, eprobe;
 
+	fprintf(stderr, "%p: cache_discard_entry(%p) (key=%lu)\n", cache, entry, entry->key);
+
 	if (--entry->refcnt)
 		return;
 	if (cache_entry_valid(entry))
@@ -669,6 +675,7 @@ cache_flush(struct cache *cache)
 {
 	unsigned i, n;
 
+	fprintf(stderr, "%p: cache_flush()\n", cache);
 	cleanup_entries(cache);
 
 	n = 2 * cache->cap;
@@ -709,6 +716,8 @@ cache_alloc(unsigned n, size_t size)
 	if (!cache)
 		return cache;
 
+	fprintf(stderr, "%p: cache_alloc(%u, %zu)\n", cache, n, size);
+
 	cache->elemsize = size;
 	cache->cap = n;
 	cache->hits.number = 0;
[-- Attachment #2: cache_filtered.log.gz --]
[-- Type: application/gzip, Size: 6129 bytes --]
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-05 21:53 ` Stephen Brennan
@ 2024-01-08 20:40 ` Petr Tesařík
2024-01-09 9:06 ` Petr Tesařík
From: Petr Tesařík @ 2024-01-08 20:40 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Fri, 05 Jan 2024 13:53:15 -0800
Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> Petr Tesařík <petr@tesarici.cz> writes:
> > On Fri, 05 Jan 2024 10:38:16 -0800
> > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> >
> >> Hi Petr,
> >>
> >> I recently encountered a segmentation fault with libkdumpfile & drgn
> >> which appears to be related to the cache implementation. I've included
> >> the stack trace at the end of this message, since it's a bit of a longer
> >> one. The exact issue occurred with a test vmcore that I could probably
> >> share with you privately if you'd like. In any case, the reproducer is
> >> fairly straightforward in drgn code:
> >>
> >> for t in for_each_task(prog):
> >>     prog.stack_trace(t)
> >> for t in for_each_task(prog):
> >>     prog.stack_trace(t)
> >>
> >> The repetition is required, the segfault only occurs on the second
> >> iteration of the loop. Which, in hindsight, is a textbook sign that the
> >> issue has to do with caching. I'd expect that the issue is specific to
> >> this vmcore, it doesn't reproduce on others.
> >>
> >> I stuck that into a git bisect script and bisected the libkdumpfile
> >> commit that introduced it:
> >>
> >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> >> Author: Petr Tesarik <petr@tesarici.cz>
> >> Date: Wed Jan 11 22:53:01 2023 +0100
> >>
> >> Cache: Calculate eprobe in reinit_entry()
> >>
> >> If this function is called to reuse a ghost entry, the probe list
> >> has not been walked yet, so eprobe is left uninitialized.
> >>
> >> This passed the test case, because the correct old value was left
> >> on stack. Modify the test case to poison the stack.
> >>
> >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
> >>
> >> src/kdumpfile/cache.c | 6 +++++-
> >> src/kdumpfile/test-cache.c | 13 +++++++++++++
> >> 2 files changed, 18 insertions(+), 1 deletion(-)
> >
> > This looks like a red herring to me. The cache most likely continues in
> > a corrupted state without this commit, which may mask the issue (until
> > it resurfaces later).
>
> I see, that makes a lot of sense.
>
> >> I haven't yet tried to debug the logic of the cache implementation and
> >> create a patch. I'm totally willing to try that, but I figured I would
> >> send this report to you first, to see if there's something obvious that
> >> sticks out to your eyes.
> >
> > No, but I should be able to recreate the issue if I get a log of the
> > cache API calls:
> >
> > - cache_alloc() - to know the number of elements
> > - cache_get_entry()
> > - cache_put_entry()
> > - cache_insert()
> > - cache_discard()
> > - cache_flush() - not likely after initialization, but...
>
> I went ahead and logged each of these calls as you suggested, I tried to
> log them at the beginning of the function call and always include the
> cache pointer, cache_entry, and the key. I took the resulting log and
> filtered it to just contain the most recently logged cache prior to the
> crash, compressed it, and attached it. For completeness, the patch
> I used is below (applies to tip branch 8254897 ("Merge pull request #78
> from fweimer-rh/c99")).
>
> I'll also see if I can reproduce it based on the log.
Thank you for the log. I haven't had much time to look at it, but the
first line is a good hint already:
0x56098b68c4c0: cache_alloc(1024, 0)
Zero size means the data pointers are managed by the caller, so this
must be the cache of mmap()'ed segments. That's the only cache which
installs a cleanup callback with set_cache_entry_cleanup(). There is
only one call to the cleanup callback for evicted entries in cache.c:
	/* Get an unused cached entry. */
	if (cs->nuprobe != 0 &&
	    (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
		evict = evict_probe(cache, cs);
	else
		evict = evict_prec(cache, cs);
	if (cache->entry_cleanup)
		cache->entry_cleanup(cache->cleanup_data, evict);
The entries can be evicted from the probe partition or from the precious
partition, which might be relevant. Could you please re-run and log
where the evicted entry comes from?
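Concretely, the extra logging meant here could look roughly like the
sketch below, in the same fprintf style as the instrumentation patch
earlier in the thread; the surrounding code is taken from the snippet
above, so treat it as a sketch, not a ready-made patch:
%<-----------------------
	/* Get an unused cached entry. */
	if (cs->nuprobe != 0 &&
	    (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe)) {
		evict = evict_probe(cache, cs);
		fprintf(stderr, "%p: evict from probe: %p (key=%lu)\n",
			cache, evict, evict->key);
	} else {
		evict = evict_prec(cache, cs);
		fprintf(stderr, "%p: evict from precious: %p (key=%lu)\n",
			cache, evict, evict->key);
	}
	if (cache->entry_cleanup)
		cache->entry_cleanup(cache->cleanup_data, evict);
%<-----------------------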
Cheers
Petr T
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-08 20:40 ` Petr Tesařík
@ 2024-01-09 9:06 ` Petr Tesařík
2024-01-10 1:40 ` Stephen Brennan
From: Petr Tesařík @ 2024-01-09 9:06 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Mon, 8 Jan 2024 21:40:08 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
> On Fri, 05 Jan 2024 13:53:15 -0800
> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>
> > Petr Tesařík <petr@tesarici.cz> writes:
> > > On Fri, 05 Jan 2024 10:38:16 -0800
> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> > >
> > >> Hi Petr,
> > >>
> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
> > >> which appears to be related to the cache implementation. I've included
> > >> the stack trace at the end of this message, since it's a bit of a longer
> > >> one. The exact issue occurred with a test vmcore that I could probably
> > >> share with you privately if you'd like. In any case, the reproducer is
> > >> fairly straightforward in drgn code:
> > >>
> > >> for t in for_each_task(prog):
> > >>     prog.stack_trace(t)
> > >> for t in for_each_task(prog):
> > >>     prog.stack_trace(t)
> > >>
> > >> The repetition is required, the segfault only occurs on the second
> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
> > >> issue has to do with caching. I'd expect that the issue is specific to
> > >> this vmcore, it doesn't reproduce on others.
> > >>
> > >> I stuck that into a git bisect script and bisected the libkdumpfile
> > >> commit that introduced it:
> > >>
> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> > >> Author: Petr Tesarik <petr@tesarici.cz>
> > >> Date: Wed Jan 11 22:53:01 2023 +0100
> > >>
> > >> Cache: Calculate eprobe in reinit_entry()
> > >>
> > >> If this function is called to reuse a ghost entry, the probe list
> > >> has not been walked yet, so eprobe is left uninitialized.
> > >>
> > >> This passed the test case, because the correct old value was left
> > >> on stack. Modify the test case to poison the stack.
> > >>
> > >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
> > >>
> > >> src/kdumpfile/cache.c | 6 +++++-
> > >> src/kdumpfile/test-cache.c | 13 +++++++++++++
> > >> 2 files changed, 18 insertions(+), 1 deletion(-)
> > >
> > > This looks like a red herring to me. The cache most likely continues in
> > > a corrupted state without this commit, which may mask the issue (until
> > > it resurfaces later).
> >
> > I see, that makes a lot of sense.
> >
> > >> I haven't yet tried to debug the logic of the cache implementation and
> > >> create a patch. I'm totally willing to try that, but I figured I would
> > >> send this report to you first, to see if there's something obvious that
> > >> sticks out to your eyes.
> > >
> > > No, but I should be able to recreate the issue if I get a log of the
> > > cache API calls:
> > >
> > > - cache_alloc() - to know the number of elements
> > > - cache_get_entry()
> > > - cache_put_entry()
> > > - cache_insert()
> > > - cache_discard()
> > > - cache_flush() - not likely after initialization, but...
> >
> > I went ahead and logged each of these calls as you suggested, I tried to
> > log them at the beginning of the function call and always include the
> > cache pointer, cache_entry, and the key. I took the resulting log and
> > filtered it to just contain the most recently logged cache prior to the
> > crash, compressed it, and attached it. For completeness, the patch
> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
> > from fweimer-rh/c99")).
> >
> > I'll also see if I can reproduce it based on the log.
>
> Thank you for the log. I haven't had much time to look at it, but the
> first line is a good hint already:
>
> 0x56098b68c4c0: cache_alloc(1024, 0)
>
> Zero size means the data pointers are managed by the caller, so this
> must be the cache of mmap()'ed segments. That's the only cache which
> installs a cleanup callback with set_cache_entry_cleanup(). There is
> only one call to the cleanup callback for evicted entries in cache.c:
>
> /* Get an unused cached entry. */
> if (cs->nuprobe != 0 &&
> (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
> evict = evict_probe(cache, cs);
> else
> evict = evict_prec(cache, cs);
> if (cache->entry_cleanup)
> cache->entry_cleanup(cache->cleanup_data, evict);
>
> The entries can be evicted from the probe partition or from the precious
> partition. This might be relevant. Please, can you re-run and log where
> the evict entry comes from?
I found some time this morning, and it turns out that wouldn't help. Because of a bug
in fcache_new(), the number of elements in the cache is big enough that
cache entries are never evicted in your case. It's quite weird to hit a
cache metadata bug after elements have been inserted. FWIW I am not
able to reproduce the bug by replaying the logged file read pattern.
Since you have a reliable reproducer, it cannot be a Heisenbug. But it
could be caused by the other cache - the cache of decompressed pages.
Do you know for sure that lzo1x_decompress_safe() crashes while trying
to _read_ from the input buffer, and not while trying to _write_ to the
output buffer?
Petr T
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-09 9:06 ` Petr Tesařík
@ 2024-01-10 1:40 ` Stephen Brennan
2024-01-10 8:36 ` Petr Tesařík
From: Stephen Brennan @ 2024-01-10 1:40 UTC (permalink / raw)
To: Petr Tesařík; +Cc: Omar Sandoval, linux-debuggers
Petr Tesařík <petr@tesarici.cz> writes:
> On Mon, 8 Jan 2024 21:40:08 +0100
> Petr Tesařík <petr@tesarici.cz> wrote:
>
>> On Fri, 05 Jan 2024 13:53:15 -0800
>> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>>
>> > Petr Tesařík <petr@tesarici.cz> writes:
>> > > On Fri, 05 Jan 2024 10:38:16 -0800
>> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>> > >
>> > >> Hi Petr,
>> > >>
>> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
>> > >> which appears to be related to the cache implementation. I've included
>> > >> the stack trace at the end of this message, since it's a bit of a longer
>> > >> one. The exact issue occurred with a test vmcore that I could probably
>> > >> share with you privately if you'd like. In any case, the reproducer is
>> > >> fairly straightforward in drgn code:
>> > >>
>> > >> for t in for_each_task(prog):
>> > >>     prog.stack_trace(t)
>> > >> for t in for_each_task(prog):
>> > >>     prog.stack_trace(t)
>> > >>
>> > >> The repetition is required, the segfault only occurs on the second
>> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
>> > >> issue has to do with caching. I'd expect that the issue is specific to
>> > >> this vmcore, it doesn't reproduce on others.
>> > >>
>> > >> I stuck that into a git bisect script and bisected the libkdumpfile
>> > >> commit that introduced it:
>> > >>
>> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
>> > >> Author: Petr Tesarik <petr@tesarici.cz>
>> > >> Date: Wed Jan 11 22:53:01 2023 +0100
>> > >>
>> > >> Cache: Calculate eprobe in reinit_entry()
>> > >>
>> > >> If this function is called to reuse a ghost entry, the probe list
>> > >> has not been walked yet, so eprobe is left uninitialized.
>> > >>
>> > >> This passed the test case, because the correct old value was left
>> > >> on stack. Modify the test case to poison the stack.
>> > >>
>> > >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
>> > >>
>> > >> src/kdumpfile/cache.c | 6 +++++-
>> > >> src/kdumpfile/test-cache.c | 13 +++++++++++++
>> > >> 2 files changed, 18 insertions(+), 1 deletion(-)
>> > >
>> > > This looks like a red herring to me. The cache most likely continues in
>> > > a corrupted state without this commit, which may mask the issue (until
>> > > it resurfaces later).
>> >
>> > I see, that makes a lot of sense.
>> >
>> > >> I haven't yet tried to debug the logic of the cache implementation and
>> > >> create a patch. I'm totally willing to try that, but I figured I would
>> > >> send this report to you first, to see if there's something obvious that
>> > >> sticks out to your eyes.
>> > >
>> > > No, but I should be able to recreate the issue if I get a log of the
>> > > cache API calls:
>> > >
>> > > - cache_alloc() - to know the number of elements
>> > > - cache_get_entry()
>> > > - cache_put_entry()
>> > > - cache_insert()
>> > > - cache_discard()
>> > > - cache_flush() - not likely after initialization, but...
>> >
>> > I went ahead and logged each of these calls as you suggested, I tried to
>> > log them at the beginning of the function call and always include the
>> > cache pointer, cache_entry, and the key. I took the resulting log and
>> > filtered it to just contain the most recently logged cache prior to the
>> > crash, compressed it, and attached it. For completeness, the patch
>> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
>> > from fweimer-rh/c99")).
>> >
>> > I'll also see if I can reproduce it based on the log.
>>
>> Thank you for the log. I haven't had much time to look at it, but the
>> first line is a good hint already:
>>
>> 0x56098b68c4c0: cache_alloc(1024, 0)
>>
>> Zero size means the data pointers are managed by the caller, so this
>> must be the cache of mmap()'ed segments. That's the only cache which
>> installs a cleanup callback with set_cache_entry_cleanup(). There is
>> only one call to the cleanup callback for evicted entries in cache.c:
>>
>> /* Get an unused cached entry. */
>> if (cs->nuprobe != 0 &&
>> (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
>> evict = evict_probe(cache, cs);
>> else
>> evict = evict_prec(cache, cs);
>> if (cache->entry_cleanup)
>> cache->entry_cleanup(cache->cleanup_data, evict);
>>
>> The entries can be evicted from the probe partition or from the precious
>> partition. This might be relevant. Please, can you re-run and log where
>> the evict entry comes from?
>
> I found some time this morning, and it wouldn't help. Because of a bug
> in fcache_new(), the number of elements in the cache is big enough that
> cache entries are never evicted in your case. It's quite weird to hit a
> cache metadata bug after elements have been inserted. FWIW I am not
> able to reproduce the bug by replaying the logged file read pattern.
>
> Since you have a reliable reproducer, it cannot be a Heisenbug. But it
> could be caused by the other cache - the cache of decompressed pages.
> Do you know for sure that lzo1x_decompress_safe() crashes while trying
> to _read_ from the input buffer, and not while trying to _write_ to the
> output buffer?
Hi Petr,
Sorry for the delay here; I got pulled into other issues and am trying
to attend to all my work in a round-robin fashion :)
The fault is definitely in lzo1x_decompress_safe() *writing* to address
0. I fetched debuginfo for all the necessary libraries and we see the
following stack trace:
%<-----------------------
#0 0x00007fcd9adddef3 in lzo1x_decompress_safe (in=<optimized out>,
in_len=<optimized out>, out=0x0, out_len=0x7ffdee2c1388, wrkmem=<optimized out>)
at src/lzo1x_d.ch:120
#1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
#2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
#3 0x00007fcd9ae25e44 in diskdump_get_page (pio=0x7ffdee2c1590) at diskdump.c:647
#4 0x00007fcd9ae32be0 in get_page (pio=0x7ffdee2c1590)
at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
#5 0x00007fcd9ae32ed4 in get_page_xlat (pio=0x7ffdee2c1590) at read.c:126
#6 0x00007fcd9ae32f22 in get_page_maybe_xlat (pio=0x7ffdee2c1590) at read.c:137
#7 0x00007fcd9ae32fb1 in _kdumpfile_priv_read_locked (ctx=0x55745bfca8f0,
as=KDUMP_KVADDR, addr=18446612133360081960, buffer=0x7ffdee2c17df,
plength=0x7ffdee2c1698) at read.c:169
#8 0x00007fcd9ae330dd in kdump_read (ctx=0x55745bfca8f0, as=KDUMP_KVADDR,
addr=18446612133360081960, buffer=0x7ffdee2c17df, plength=0x7ffdee2c1698)
at read.c:196
#9 0x00007fcd9afb0cc4 in drgn_read_kdump (buf=0x7ffdee2c17df,
address=18446612133360081960, count=4, offset=18446612133360081960,
arg=0x55745bfca8f0, physical=false) at ../../libdrgn/kdump.c:73
%<-----------------------
In frame 1 where we are calling the decompressor:
%<-----------------------
(gdb) frame 1
#1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
584			int ret = lzo1x_decompress_safe(fch.data, pd.size,
(gdb) list
579			if (ret != KDUMP_OK)
580				return ret;
581		} else if (pd.flags & DUMP_DH_COMPRESSED_LZO) {
582	#if USE_LZO
583			lzo_uint retlen = get_page_size(ctx);
584			int ret = lzo1x_decompress_safe(fch.data, pd.size,
585							pio->chunk.data,
586							&retlen,
587							LZO1X_MEM_DECOMPRESS);
588			fcache_put_chunk(&fch);
(gdb) p retlen
$7 = 0
(gdb) p pio->chunk.data
$8 = (void *) 0x0
(gdb) p fch.data
$9 = (void *) 0x7fcd7cc33da4
(gdb) p pd.size
$10 = 816
%<-----------------------
As far as I can tell, pio->chunk.data comes directly from the
cache_get_page() function in frame 2:
%<-----------------------
(gdb) up
#2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
69		ret = fn(pio);
(gdb) list
64		pio->chunk.data = entry->data;
65		pio->chunk.embed_fces->ce = entry;
66		if (cache_entry_valid(entry))
67			return KDUMP_OK;
68
69		ret = fn(pio);
70		mutex_lock(&ctx->shared->cache_lock);
71		if (ret == KDUMP_OK)
72			cache_insert(pio->chunk.embed_fces->cache, entry);
73		else
(gdb) p *entry
$11 = {key = 1045860353, state = cs_precious, next = 626, prev = 626, refcnt = 1,
data = 0x0}
(gdb) p *pio
$12 = {ctx = 0x55745bfca8f0, addr = {addr = 1045860352, as = ADDRXLAT_MACHPHYSADDR},
chunk = {data = 0x0, nent = 1, {embed_fces = {{data = 0xffff880ff1470788,
len = 140728599320032, ce = 0x55745c1003d8, cache = 0x55745c0fb540}, {
data = 0x55745bfd42f0, len = 140728599320112,
ce = 0x7fcd9ae330ef <kdump_read+102>, cache = 0xffff88003e569c28}},
fces = 0xffff880ff1470788}}}
%<-----------------------
And here is the cache structure:
%<-----------------------
(gdb) p *pio->chunk.embed_fces->cache
$16 = {split = 487, nprec = 1020, ngprec = 248, nprobe = 3, ngprobe = 239, dprobe = 2,
cap = 1024, inflight = 626, ninflight = 1, hits = {number = 168473, address = 168473,
string = 0x29219 <error: Cannot access memory at address 0x29219>,
bitmap = 0x29219, blob = 0x29219}, misses = {number = 1913, address = 1913,
string = 0x779 <error: Cannot access memory at address 0x779>, bitmap = 0x779,
blob = 0x779}, elemsize = 4096, data = 0x7fcd997fe010, entry_cleanup = 0x0,
cleanup_data = 0x0, ce = 0x55745c0fb598}
%<-----------------------
Thanks for looking into this! I'll continue investigating more as well.
Stephen
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-10 1:40 ` Stephen Brennan
@ 2024-01-10 8:36 ` Petr Tesařík
2024-01-10 13:49 ` Petr Tesařík
From: Petr Tesařík @ 2024-01-10 8:36 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Tue, 09 Jan 2024 17:40:15 -0800
Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> Petr Tesařík <petr@tesarici.cz> writes:
>
> > On Mon, 8 Jan 2024 21:40:08 +0100
> > Petr Tesařík <petr@tesarici.cz> wrote:
> >
> >> On Fri, 05 Jan 2024 13:53:15 -0800
> >> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> >>
> >> > Petr Tesařík <petr@tesarici.cz> writes:
> >> > > On Fri, 05 Jan 2024 10:38:16 -0800
> >> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> >> > >
> >> > >> Hi Petr,
> >> > >>
> >> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
> >> > >> which appears to be related to the cache implementation. I've included
> >> > >> the stack trace at the end of this message, since it's a bit of a longer
> >> > >> one. The exact issue occurred with a test vmcore that I could probably
> >> > >> share with you privately if you'd like. In any case, the reproducer is
> >> > >> fairly straightforward in drgn code:
> >> > >>
> >> > >> for t in for_each_task(prog):
> >> > >>     prog.stack_trace(t)
> >> > >> for t in for_each_task(prog):
> >> > >>     prog.stack_trace(t)
> >> > >>
> >> > >> The repetition is required, the segfault only occurs on the second
> >> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
> >> > >> issue has to do with caching. I'd expect that the issue is specific to
> >> > >> this vmcore, it doesn't reproduce on others.
> >> > >>
> >> > >> I stuck that into a git bisect script and bisected the libkdumpfile
> >> > >> commit that introduced it:
> >> > >>
> >> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> >> > >> Author: Petr Tesarik <petr@tesarici.cz>
> >> > >> Date: Wed Jan 11 22:53:01 2023 +0100
> >> > >>
> >> > >> Cache: Calculate eprobe in reinit_entry()
> >> > >>
> >> > >> If this function is called to reuse a ghost entry, the probe list
> >> > >> has not been walked yet, so eprobe is left uninitialized.
> >> > >>
> >> > >> This passed the test case, because the correct old value was left
> >> > >> on stack. Modify the test case to poison the stack.
> >> > >>
> >> > >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
> >> > >>
> >> > >> src/kdumpfile/cache.c | 6 +++++-
> >> > >> src/kdumpfile/test-cache.c | 13 +++++++++++++
> >> > >> 2 files changed, 18 insertions(+), 1 deletion(-)
> >> > >
> >> > > This looks like a red herring to me. The cache most likely continues in
> >> > > a corrupted state without this commit, which may mask the issue (until
> >> > > it resurfaces later).
> >> >
> >> > I see, that makes a lot of sense.
> >> >
> >> > >> I haven't yet tried to debug the logic of the cache implementation and
> >> > >> create a patch. I'm totally willing to try that, but I figured I would
> >> > >> send this report to you first, to see if there's something obvious that
> >> > >> sticks out to your eyes.
> >> > >
> >> > > No, but I should be able to recreate the issue if I get a log of the
> >> > > cache API calls:
> >> > >
> >> > > - cache_alloc() - to know the number of elements
> >> > > - cache_get_entry()
> >> > > - cache_put_entry()
> >> > > - cache_insert()
> >> > > - cache_discard()
> >> > > - cache_flush() - not likely after initialization, but...
> >> >
> >> > I went ahead and logged each of these calls as you suggested, I tried to
> >> > log them at the beginning of the function call and always include the
> >> > cache pointer, cache_entry, and the key. I took the resulting log and
> >> > filtered it to just contain the most recently logged cache prior to the
> >> > crash, compressed it, and attached it. For completeness, the patch
> >> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
> >> > from fweimer-rh/c99")).
> >> >
> >> > I'll also see if I can reproduce it based on the log.
> >>
> >> Thank you for the log. I haven't had much time to look at it, but the
> >> first line is a good hint already:
> >>
> >> 0x56098b68c4c0: cache_alloc(1024, 0)
> >>
> >> Zero size means the data pointers are managed by the caller, so this
> >> must be the cache of mmap()'ed segments. That's the only cache which
> >> installs a cleanup callback with set_cache_entry_cleanup(). There is
> >> only one call to the cleanup callback for evicted entries in cache.c:
> >>
> >> /* Get an unused cached entry. */
> >> if (cs->nuprobe != 0 &&
> >> (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
> >> evict = evict_probe(cache, cs);
> >> else
> >> evict = evict_prec(cache, cs);
> >> if (cache->entry_cleanup)
> >> cache->entry_cleanup(cache->cleanup_data, evict);
> >>
> >> The entries can be evicted from the probe partition or from the precious
> >> partition. This might be relevant. Please, can you re-run and log where
> >> the evict entry comes from?
> >
> > I found some time this morning, and it wouldn't help. Because of a bug
> > in fcache_new(), the number of elements in the cache is big enough that
> > cache entries are never evicted in your case. It's quite weird to hit a
> > cache metadata bug after elements have been inserted. FWIW I am not
> > able to reproduce the bug by replaying the logged file read pattern.
> >
> > Since you have a reliable reproducer, it cannot be a Heisenbug. But it
> > could be caused by the other cache - the cache of decompressed pages.
> > Do you know for sure that lzo1x_decompress_safe() crashes while trying
> > to _read_ from the input buffer, and not while trying to _write_ to the
> > output buffer?
>
> Hi Petr,
>
> Sorry for the delay here, I got pulled into other issues and am trying
> to attend to all my work in a round-robin fashion :)
Hi Stephen,
that's fine, I also work on this only as time permits. ;-)
> The fault is definitely in lzo1x_decompress_safe() *writing* to address
> 0. I fetched debuginfo for all the necessary libraries and we see the
> following stack trace:
>
> %<-----------------------
> #0 0x00007fcd9adddef3 in lzo1x_decompress_safe (in=<optimized out>,
> in_len=<optimized out>, out=0x0, out_len=0x7ffdee2c1388, wrkmem=<optimized out>)
> at src/lzo1x_d.ch:120
> #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
> #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
> fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
> #3 0x00007fcd9ae25e44 in diskdump_get_page (pio=0x7ffdee2c1590) at diskdump.c:647
> #4 0x00007fcd9ae32be0 in get_page (pio=0x7ffdee2c1590)
> at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
> #5 0x00007fcd9ae32ed4 in get_page_xlat (pio=0x7ffdee2c1590) at read.c:126
> #6 0x00007fcd9ae32f22 in get_page_maybe_xlat (pio=0x7ffdee2c1590) at read.c:137
> #7 0x00007fcd9ae32fb1 in _kdumpfile_priv_read_locked (ctx=0x55745bfca8f0,
> as=KDUMP_KVADDR, addr=18446612133360081960, buffer=0x7ffdee2c17df,
> plength=0x7ffdee2c1698) at read.c:169
> #8 0x00007fcd9ae330dd in kdump_read (ctx=0x55745bfca8f0, as=KDUMP_KVADDR,
> addr=18446612133360081960, buffer=0x7ffdee2c17df, plength=0x7ffdee2c1698)
> at read.c:196
> #9 0x00007fcd9afb0cc4 in drgn_read_kdump (buf=0x7ffdee2c17df,
> address=18446612133360081960, count=4, offset=18446612133360081960,
> arg=0x55745bfca8f0, physical=false) at ../../libdrgn/kdump.c:73
> %<-----------------------
>
> In frame 1 where we are calling the decompressor:
>
> %<-----------------------
> (gdb) frame 1
> #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
> 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
> (gdb) list
> 579 if (ret != KDUMP_OK)
> 580 return ret;
> 581 } else if (pd.flags & DUMP_DH_COMPRESSED_LZO) {
> 582 #if USE_LZO
> 583 lzo_uint retlen = get_page_size(ctx);
> 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
> 585 pio->chunk.data,
> 586 &retlen,
> 587 LZO1X_MEM_DECOMPRESS);
> 588 fcache_put_chunk(&fch);
> (gdb) p retlen
> $7 = 0
This is a bit weird. Looking at liblzo sources, it seems to me that
the output length is not changed until right before returning from
lzo1x_decompress_safe().
> (gdb) p pio->chunk.data
> $8 = (void *) 0x0
OK, here's our immediate root cause. ;-)
> (gdb) p fch.data
> $9 = (void *) 0x7fcd7cc33da4
This looks sane.
> (gdb) p pd.size
> $10 = 816
This also looks sane.
> %<-----------------------
>
> As far as I can tell, pio->chunk.data comes directly from the
> cache_get_page() function in frame 2:
>
> %<-----------------------
> (gdb) up
> #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
> fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
> 69 ret = fn(pio);
> (gdb) list
> 64 pio->chunk.data = entry->data;
> 65 pio->chunk.embed_fces->ce = entry;
> 66 if (cache_entry_valid(entry))
> 67 return KDUMP_OK;
> 68
> 69 ret = fn(pio);
> 70 mutex_lock(&ctx->shared->cache_lock);
> 71 if (ret == KDUMP_OK)
> 72 cache_insert(pio->chunk.embed_fces->cache, entry);
> 73 else
> (gdb) p *entry
> $11 = {key = 1045860353, state = cs_precious, next = 626, prev = 626, refcnt = 1,
> data = 0x0}
The key (0x3e569000 | ADDRXLAT_MACHPHYSADDR) corresponds to the
requested virtual address 0xffff88003e569c28.
> (gdb) p *pio
> $12 = {ctx = 0x55745bfca8f0, addr = {addr = 1045860352, as = ADDRXLAT_MACHPHYSADDR},
> chunk = {data = 0x0, nent = 1, {embed_fces = {{data = 0xffff880ff1470788,
> len = 140728599320032, ce = 0x55745c1003d8, cache = 0x55745c0fb540}, {
> data = 0x55745bfd42f0, len = 140728599320112,
> ce = 0x7fcd9ae330ef <kdump_read+102>, cache = 0xffff88003e569c28}},
> fces = 0xffff880ff1470788}}}
> %<-----------------------
Looking at pio->chunk.embed_fces->ce, the struct cache_entry is at
0x55745c1003d8. Assuming that sizeof(struct cache_entry) == 32 on your
system, this is element 626 in the cache entry array. The next and
prev indices are also 626, which looks good, because cache->ninflight
is 1, so this is the only element in the (circular) in-flight list.
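As a quick cross-check of that arithmetic, using only values visible in
the dumps in this mail (cache->ce from the cache structure further down,
taken as the base of the entry array, and the assumed 32-byte entry
size):
%<-----------------------
/* Standalone check of the "element 626" computation (addresses from the
 * gdb dumps in this thread; the 32-byte entry size is an assumption). */
#include <stdio.h>

int main(void)
{
	unsigned long ce_base = 0x55745c0fb598; /* cache->ce, assumed array base   */
	unsigned long ce_addr = 0x55745c1003d8; /* pio->chunk.embed_fces->ce       */
	unsigned long ce_size = 32;             /* assumed sizeof(struct cache_entry) */

	/* (0x55745c1003d8 - 0x55745c0fb598) / 32 = 0x4e40 / 32 = 626 */
	printf("entry index = %lu\n", (ce_addr - ce_base) / ce_size);
	return 0;
}
%<-----------------------
It prints 626, matching entry->next and entry->prev above and
cache->inflight in the cache dump below.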
Since state is cs_precious, but the data was discarded, this cache
entry has just been recovered from a ghost partition, evicting another
entry, and _that_ entry had a NULL data pointer.
It would be really helpful if I could get the log for this cache
instead of the one you posted earlier.
>
> And here is the cache structure:
>
> %<-----------------------
> (gdb) p *pio->chunk.embed_fces->cache
> $16 = {split = 487, nprec = 1020, ngprec = 248, nprobe = 3, ngprobe = 239, dprobe = 2,
> cap = 1024, inflight = 626, ninflight = 1, hits = {number = 168473, address = 168473,
> string = 0x29219 <error: Cannot access memory at address 0x29219>,
> bitmap = 0x29219, blob = 0x29219}, misses = {number = 1913, address = 1913,
> string = 0x779 <error: Cannot access memory at address 0x779>, bitmap = 0x779,
> blob = 0x779}, elemsize = 4096, data = 0x7fcd997fe010, entry_cleanup = 0x0,
> cleanup_data = 0x0, ce = 0x55745c0fb598}
> %<-----------------------
BTW with a 98.9% hit ratio (168473 hits, 1913 misses), the cache proves
to be quite effective for your workload. ;-)
Petr T
> Thanks for looking into this! I'll continue investigating more as well.
>
> Stephen
>
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-10 8:36 ` Petr Tesařík
@ 2024-01-10 13:49 ` Petr Tesařík
2024-01-10 18:03 ` Petr Tesařík
From: Petr Tesařík @ 2024-01-10 13:49 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Wed, 10 Jan 2024 09:36:00 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
> On Tue, 09 Jan 2024 17:40:15 -0800
> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>
> > Petr Tesařík <petr@tesarici.cz> writes:
> >
> > > On Mon, 8 Jan 2024 21:40:08 +0100
> > > Petr Tesařík <petr@tesarici.cz> wrote:
> > >
> > >> On Fri, 05 Jan 2024 13:53:15 -0800
> > >> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> > >>
> > >> > Petr Tesařík <petr@tesarici.cz> writes:
> > >> > > On Fri, 05 Jan 2024 10:38:16 -0800
> > >> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> > >> > >
> > >> > >> Hi Petr,
> > >> > >>
> > >> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
> > >> > >> which appears to be related to the cache implementation. I've included
> > >> > >> the stack trace at the end of this message, since it's a bit of a longer
> > >> > >> one. The exact issue occurred with a test vmcore that I could probably
> > >> > >> share with you privately if you'd like. In any case, the reproducer is
> > >> > >> fairly straightforward in drgn code:
> > >> > >>
> > >> > >> for t in for_each_task(prog):
> > >> > >>     prog.stack_trace(t)
> > >> > >> for t in for_each_task(prog):
> > >> > >>     prog.stack_trace(t)
> > >> > >>
> > >> > >> The repetition is required, the segfault only occurs on the second
> > >> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
> > >> > >> issue has to do with caching. I'd expect that the issue is specific to
> > >> > >> this vmcore, it doesn't reproduce on others.
> > >> > >>
> > >> > >> I stuck that into a git bisect script and bisected the libkdumpfile
> > >> > >> commit that introduced it:
> > >> > >>
> > >> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> > >> > >> Author: Petr Tesarik <petr@tesarici.cz>
> > >> > >> Date: Wed Jan 11 22:53:01 2023 +0100
> > >> > >>
> > >> > >> Cache: Calculate eprobe in reinit_entry()
> > >> > >>
> > >> > >> If this function is called to reuse a ghost entry, the probe list
> > >> > >> has not been walked yet, so eprobe is left uninitialized.
> > >> > >>
> > >> > >> This passed the test case, because the correct old value was left
> > >> > >> on stack. Modify the test case to poison the stack.
> > >> > >>
> > >> > >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
> > >> > >>
> > >> > >> src/kdumpfile/cache.c | 6 +++++-
> > >> > >> src/kdumpfile/test-cache.c | 13 +++++++++++++
> > >> > >> 2 files changed, 18 insertions(+), 1 deletion(-)
> > >> > >
> > >> > > This looks like a red herring to me. The cache most likely continues in
> > >> > > a corrupted state without this commit, which may mask the issue (until
> > >> > > it resurfaces later).
> > >> >
> > >> > I see, that makes a lot of sense.
> > >> >
> > >> > >> I haven't yet tried to debug the logic of the cache implementation and
> > >> > >> create a patch. I'm totally willing to try that, but I figured I would
> > >> > >> send this report to you first, to see if there's something obvious that
> > >> > >> sticks out to your eyes.
> > >> > >
> > >> > > No, but I should be able to recreate the issue if I get a log of the
> > >> > > cache API calls:
> > >> > >
> > >> > > - cache_alloc() - to know the number of elements
> > >> > > - cache_get_entry()
> > >> > > - cache_put_entry()
> > >> > > - cache_insert()
> > >> > > - cache_discard()
> > >> > > - cache_flush() - not likely after initialization, but...
> > >> >
> > >> > I went ahead and logged each of these calls as you suggested, I tried to
> > >> > log them at the beginning of the function call and always include the
> > >> > cache pointer, cache_entry, and the key. I took the resulting log and
> > >> > filtered it to just contain the most recently logged cache prior to the
> > >> > crash, compressed it, and attached it. For completeness, the patch
> > >> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
> > >> > from fweimer-rh/c99")).
> > >> >
> > >> > I'll also see if I can reproduce it based on the log.
> > >>
> > >> Thank you for the log. I haven't had much time to look at it, but the
> > >> first line is a good hint already:
> > >>
> > >> 0x56098b68c4c0: cache_alloc(1024, 0)
> > >>
> > >> Zero size means the data pointers are managed by the caller, so this
> > >> must be the cache of mmap()'ed segments. That's the only cache which
> > >> installs a cleanup callback with set_cache_entry_cleanup(). There is
> > >> only one call to the cleanup callback for evicted entries in cache.c:
> > >>
> > >> /* Get an unused cached entry. */
> > >> if (cs->nuprobe != 0 &&
> > >> (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
> > >> evict = evict_probe(cache, cs);
> > >> else
> > >> evict = evict_prec(cache, cs);
> > >> if (cache->entry_cleanup)
> > >> cache->entry_cleanup(cache->cleanup_data, evict);
> > >>
> > >> The entries can be evicted from the probe partition or from the precious
> > >> partition. This might be relevant. Please, can you re-run and log where
> > >> the evict entry comes from?
> > >
> > > I found some time this morning, and it wouldn't help. Because of a bug
> > > in fcache_new(), the number of elements in the cache is big enough that
> > > cache entries are never evicted in your case. It's quite weird to hit a
> > > cache metadata bug after elements have been inserted. FWIW I am not
> > > able to reproduce the bug by replaying the logged file read pattern.
> > >
> > > Since you have a reliable reproducer, it cannot be a Heisenbug. But it
> > > could be caused by the other cache - the cache of decompressed pages.
> > > Do you know for sure that lzo1x_decompress_safe() crashes while trying
> > > to _read_ from the input buffer, and not while trying to _write_ to the
> > > output buffer?
> >
> > Hi Petr,
> >
> > Sorry for the delay here, I got pulled into other issues and am trying
> > to attend to all my work in a round-robin fashion :)
>
> Hi Stephen,
>
> that's fine, I also work on this only as time permits. ;-)
>
> > The fault is definitely in lzo1x_decompress_safe() *writing* to address
> > 0. I fetched debuginfo for all the necessary libraries and we see the
> > following stack trace:
> >
> > %<-----------------------
> > #0 0x00007fcd9adddef3 in lzo1x_decompress_safe (in=<optimized out>,
> > in_len=<optimized out>, out=0x0, out_len=0x7ffdee2c1388, wrkmem=<optimized out>)
> > at src/lzo1x_d.ch:120
> > #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
> > #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
> > fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
> > #3 0x00007fcd9ae25e44 in diskdump_get_page (pio=0x7ffdee2c1590) at diskdump.c:647
> > #4 0x00007fcd9ae32be0 in get_page (pio=0x7ffdee2c1590)
> > at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
> > #5 0x00007fcd9ae32ed4 in get_page_xlat (pio=0x7ffdee2c1590) at read.c:126
> > #6 0x00007fcd9ae32f22 in get_page_maybe_xlat (pio=0x7ffdee2c1590) at read.c:137
> > #7 0x00007fcd9ae32fb1 in _kdumpfile_priv_read_locked (ctx=0x55745bfca8f0,
> > as=KDUMP_KVADDR, addr=18446612133360081960, buffer=0x7ffdee2c17df,
> > plength=0x7ffdee2c1698) at read.c:169
> > #8 0x00007fcd9ae330dd in kdump_read (ctx=0x55745bfca8f0, as=KDUMP_KVADDR,
> > addr=18446612133360081960, buffer=0x7ffdee2c17df, plength=0x7ffdee2c1698)
> > at read.c:196
> > #9 0x00007fcd9afb0cc4 in drgn_read_kdump (buf=0x7ffdee2c17df,
> > address=18446612133360081960, count=4, offset=18446612133360081960,
> > arg=0x55745bfca8f0, physical=false) at ../../libdrgn/kdump.c:73
> > %<-----------------------
> >
> > In frame 1 where we are calling the decompressor:
> >
> > %<-----------------------
> > (gdb) frame 1
> > #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
> > 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
> > (gdb) list
> > 579 if (ret != KDUMP_OK)
> > 580 return ret;
> > 581 } else if (pd.flags & DUMP_DH_COMPRESSED_LZO) {
> > 582 #if USE_LZO
> > 583 lzo_uint retlen = get_page_size(ctx);
> > 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
> > 585 pio->chunk.data,
> > 586 &retlen,
> > 587 LZO1X_MEM_DECOMPRESS);
> > 588 fcache_put_chunk(&fch);
> > (gdb) p retlen
> > $7 = 0
>
> This is a bit weird. Looking at liblzo sources, it seems to me that
> the output length is not changed until right before returning from
> lzo1x_decompress_safe().
>
> > (gdb) p pio->chunk.data
> > $8 = (void *) 0x0
>
> OK, here's our immediate root cause. ;-)
>
> > (gdb) p fch.data
> > $9 = (void *) 0x7fcd7cc33da4
>
> This looks sane.
>
> > (gdb) p pd.size
> > $10 = 816
>
> This also looks sane.
>
> > %<-----------------------
> >
> > As far as I can tell, pio->chunk.data comes directly from the
> > cache_get_page() function in frame 2:
> >
> > %<-----------------------
> > (gdb) up
> > #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
> > fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
> > 69 ret = fn(pio);
> > (gdb) list
> > 64 pio->chunk.data = entry->data;
> > 65 pio->chunk.embed_fces->ce = entry;
> > 66 if (cache_entry_valid(entry))
> > 67 return KDUMP_OK;
> > 68
> > 69 ret = fn(pio);
> > 70 mutex_lock(&ctx->shared->cache_lock);
> > 71 if (ret == KDUMP_OK)
> > 72 cache_insert(pio->chunk.embed_fces->cache, entry);
> > 73 else
> > (gdb) p *entry
> > $11 = {key = 1045860353, state = cs_precious, next = 626, prev = 626, refcnt = 1,
> > data = 0x0}
>
> The key (0x3e569000 | ADDRXLAT_MACHPHYSADDR) corresponds to the
> requested virtual address 0xffff88003e569c28.
>
> > (gdb) p *pio
> > $12 = {ctx = 0x55745bfca8f0, addr = {addr = 1045860352, as = ADDRXLAT_MACHPHYSADDR},
> > chunk = {data = 0x0, nent = 1, {embed_fces = {{data = 0xffff880ff1470788,
> > len = 140728599320032, ce = 0x55745c1003d8, cache = 0x55745c0fb540}, {
> > data = 0x55745bfd42f0, len = 140728599320112,
> > ce = 0x7fcd9ae330ef <kdump_read+102>, cache = 0xffff88003e569c28}},
> > fces = 0xffff880ff1470788}}}
> > %<-----------------------
>
>
> Looking at pio->chunk->embed_fces->ce, struct cache_entry is at
> 0x55745c1003d8. Assuming that sizeof(struct cache_entry) == 32 on your
> system, this is element 626 in the cache entry array. The next and
> prev indices are also 626, which looks good, because cache->ninflight
> is 1, so this is the only element in the (circular) in-flight list.
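The arithmetic behind "element 626" can be spelled out as a sketch (the
32-byte entry size and the 0x58-byte offset of the entry array inside
struct cache are assumptions chosen only to be consistent with the two
pointers in the pio dump; they are not taken from the libkdumpfile
headers):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uintptr_t cache = 0x55745c0fb540;       /* pio->chunk.embed_fces->cache */
        uintptr_t entry = 0x55745c1003d8;       /* pio->chunk.embed_fces->ce */
        size_t entry_size = 32;                 /* assumed sizeof(struct cache_entry) */
        size_t array_off  = 0x58;               /* assumed start of the entry array */

        /* (0x4e98 - 0x58) / 32 = 626, matching entry->next and entry->prev */
        printf("index = %zu\n", (size_t)((entry - cache - array_off) / entry_size));
        return 0;
}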
>
> Since state is cs_precious, but the data was discarded, this cache
> entry has just been recovered from a ghost partition, evicting another
> entry, and _that_ entry had a NULL data pointer.
>
> It would be really helpful if I could get the log for this cache
> instead of the one you posted earlier.
Let's recap:
1. Data indicates that a reused ghost entry has a NULL data pointer.
2. Reverting commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6 masks the issue.
My conclusion is that the data pointer was taken from an entry in the
unused partition. This partition is empty, except when the cache is
new, or after calling cache_discard(). Given the statistics counter
values, the latter is the case here.
I think I found the bug: reinit_entry() finds the unused partition by
skipping over the ghost probe partition, but if the target itself is a
reused ghost probe entry, cache->ngprobe was already decremented in
get_ghost_or_missed_entry().
I'm going to write a test case and fix.
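To make the off-by-one concrete, here is a deliberately simplified
stand-in model (not the libkdumpfile cache code; the flat array, the entry
counts and the find_donor() helper are invented for illustration). It only
shows how a counter that has already been decremented makes the partition
walk stop one entry short, on a ghost entry whose data pointer is NULL:

#include <stdio.h>
#include <stdlib.h>

struct entry { void *data; };

#define NGPROBE 3                       /* ghost-probe entries, no data */
#define NUNUSED 2                       /* unused entries, data still attached */

static struct entry list[NGPROBE + NUNUSED];
static unsigned ngprobe = NGPROBE;      /* plays the role of cache->ngprobe */

/* Skip the ghost-probe partition and return the entry whose data is stolen. */
static struct entry *find_donor(void)
{
        return &list[ngprobe];
}

int main(void)
{
        for (unsigned i = 0; i < NUNUSED; ++i)
                list[NGPROBE + i].data = malloc(4096);

        /* Reusing a plain missed entry: the walk lands on a real unused entry. */
        printf("missed entry reuse: donor data = %p\n", find_donor()->data);

        /* Reusing a ghost-probe entry: the caller has already removed it from
         * the ghost-probe partition and decremented the counter... */
        ngprobe--;
        /* ...so the same walk now stops on the last ghost entry, whose data
         * pointer is NULL; that NULL is what later reaches
         * lzo1x_decompress_safe() as its output buffer. */
        printf("ghost entry reuse:  donor data = %p\n", find_donor()->data);
        return 0;
}

With the original counter the donor is a genuine unused entry; after the
decrement the same walk hands out a NULL data pointer, which matches the
entry dumped above (data = 0x0).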
Stephen, once again, thank you for your report and engagement.
Petr T
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-10 13:49 ` Petr Tesařík
@ 2024-01-10 18:03 ` Petr Tesařík
2024-01-10 19:48 ` Stephen Brennan
0 siblings, 1 reply; 11+ messages in thread
From: Petr Tesařík @ 2024-01-10 18:03 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Wed, 10 Jan 2024 14:49:28 +0100
Petr Tesařík <petr@tesarici.cz> wrote:
> On Wed, 10 Jan 2024 09:36:00 +0100
> Petr Tesařík <petr@tesarici.cz> wrote:
>
> > On Tue, 09 Jan 2024 17:40:15 -0800
> > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> >
> > > Petr Tesařík <petr@tesarici.cz> writes:
> > >
> > > > On Mon, 8 Jan 2024 21:40:08 +0100
> > > > Petr Tesařík <petr@tesarici.cz> wrote:
> > > >
> > > >> On Fri, 05 Jan 2024 13:53:15 -0800
> > > >> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> > > >>
> > > >> > Petr Tesařík <petr@tesarici.cz> writes:
> > > >> > > On Fri, 05 Jan 2024 10:38:16 -0800
> > > >> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> > > >> > >
> > > >> > >> Hi Petr,
> > > >> > >>
> > > >> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
> > > >> > >> which appears to be related to the cache implementation. I've included
> > > >> > >> the stack trace at the end of this message, since it's a bit of a longer
> > > >> > >> one. The exact issue occurred with a test vmcore that I could probably
> > > >> > >> share with you privately if you'd like. In any case, the reproducer is
> > > >> > >> fairly straightforward in drgn code:
> > > >> > >>
> > > >> > >> for t in for_each_task(prog):
> > > >> > >>     prog.stack_trace(t)
> > > >> > >> for t in for_each_task(prog):
> > > >> > >>     prog.stack_trace(t)
> > > >> > >>
> > > >> > >> The repetition is required, the segfault only occurs on the second
> > > >> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
> > > >> > >> issue has to do with caching. I'd expect that the issue is specific to
> > > >> > >> this vmcore, it doesn't reproduce on others.
> > > >> > >>
> > > >> > >> I stuck that into a git bisect script and bisected the libkdumpfile
> > > >> > >> commit that introduced it:
> > > >> > >>
> > > >> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
> > > >> > >> Author: Petr Tesarik <petr@tesarici.cz>
> > > >> > >> Date: Wed Jan 11 22:53:01 2023 +0100
> > > >> > >>
> > > >> > >> Cache: Calculate eprobe in reinit_entry()
> > > >> > >>
> > > >> > >> If this function is called to reuse a ghost entry, the probe list
> > > >> > >> has not been walked yet, so eprobe is left uninitialized.
> > > >> > >>
> > > >> > >> This passed the test case, because the correct old value was left
> > > >> > >> on stack. Modify the test case to poison the stack.
> > > >> > >>
> > > >> > >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
> > > >> > >>
> > > >> > >> src/kdumpfile/cache.c | 6 +++++-
> > > >> > >> src/kdumpfile/test-cache.c | 13 +++++++++++++
> > > >> > >> 2 files changed, 18 insertions(+), 1 deletion(-)
> > > >> > >
> > > >> > > This looks like a red herring to me. The cache most likely continues in
> > > >> > > a corrupted state without this commit, which may mask the issue (until
> > > >> > > it resurfaces later).
> > > >> >
> > > >> > I see, that makes a lot of sense.
> > > >> >
> > > >> > >> I haven't yet tried to debug the logic of the cache implementation and
> > > >> > >> create a patch. I'm totally willing to try that, but I figured I would
> > > >> > >> send this report to you first, to see if there's something obvious that
> > > >> > >> sticks out to your eyes.
> > > >> > >
> > > >> > > No, but I should be able to recreate the issue if I get a log of the
> > > >> > > cache API calls:
> > > >> > >
> > > >> > > - cache_alloc() - to know the number of elements
> > > >> > > - cache_get_entry()
> > > >> > > - cache_put_entry()
> > > >> > > - cache_insert()
> > > >> > > - cache_discard()
> > > >> > > - cache_flush() - not likely after initialization, but...
> > > >> >
> > > >> > I went ahead and logged each of these calls as you suggested, I tried to
> > > >> > log them at the beginning of the function call and always include the
> > > >> > cache pointer, cache_entry, and the key. I took the resulting log and
> > > >> > filtered it to just contain the most recently logged cache prior to the
> > > >> > crash, compressed it, and attached it. For completeness, the patch
> > > >> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
> > > >> > from fweimer-rh/c99")).
> > > >> >
> > > >> > I'll also see if I can reproduce it based on the log.
> > > >>
> > > >> Thank you for the log. I haven't had much time to look at it, but the
> > > >> first line is a good hint already:
> > > >>
> > > >> 0x56098b68c4c0: cache_alloc(1024, 0)
> > > >>
> > > >> Zero size means the data pointers are managed by the caller, so this
> > > >> must be the cache of mmap()'ed segments. That's the only cache which
> > > >> installs a cleanup callback with set_cache_entry_cleanup(). There is
> > > >> only one call to the cleanup callback for evicted entries in cache.c:
> > > >>
> > > >>         /* Get an unused cached entry. */
> > > >>         if (cs->nuprobe != 0 &&
> > > >>             (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
> > > >>                 evict = evict_probe(cache, cs);
> > > >>         else
> > > >>                 evict = evict_prec(cache, cs);
> > > >>         if (cache->entry_cleanup)
> > > >>                 cache->entry_cleanup(cache->cleanup_data, evict);
> > > >>
> > > >> The entries can be evicted from the probe partition or from the precious
> > > >> partition. This might be relevant. Please, can you re-run and log where
> > > >> the evict entry comes from?
> > > >
> > > > I found some time this morning, and it wouldn't help. Because of a bug
> > > > in fcache_new(), the number of elements in the cache is big enough that
> > > > cache entries are never evicted in your case. It's quite weird to hit a
> > > > cache metadata bug after elements have been inserted. FWIW I am not
> > > > able to reproduce the bug by replaying the logged file read pattern.
> > > >
> > > > Since you have a reliable reproducer, it cannot be a Heisenbug. But it
> > > > could be caused by the other cache - the cache of decompressed pages.
> > > > Do you know for sure that lzo1x_decompress_safe() crashes while trying
> > > > to _read_ from the input buffer, and not while trying to _write_ to the
> > > > output buffer?
> > >
> > > Hi Petr,
> > >
> > > Sorry for the delay here, I got pulled into other issues and am trying
> > > to attend to all my work in a round-robin fashion :)
> >
> > Hi Stephen,
> >
> > that's fine, I also work on this only as time permits. ;-)
> >
> > > The fault is definitely in lzo1x_decompress_safe() *writing* to address
> > > 0. I fetched debuginfo for all the necessary libraries and we see the
> > > following stack trace:
> > >
> > > %<-----------------------
> > > #0 0x00007fcd9adddef3 in lzo1x_decompress_safe (in=<optimized out>,
> > > in_len=<optimized out>, out=0x0, out_len=0x7ffdee2c1388, wrkmem=<optimized out>)
> > > at src/lzo1x_d.ch:120
> > > #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
> > > #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
> > > fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
> > > #3 0x00007fcd9ae25e44 in diskdump_get_page (pio=0x7ffdee2c1590) at diskdump.c:647
> > > #4 0x00007fcd9ae32be0 in get_page (pio=0x7ffdee2c1590)
> > > at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
> > > #5 0x00007fcd9ae32ed4 in get_page_xlat (pio=0x7ffdee2c1590) at read.c:126
> > > #6 0x00007fcd9ae32f22 in get_page_maybe_xlat (pio=0x7ffdee2c1590) at read.c:137
> > > #7 0x00007fcd9ae32fb1 in _kdumpfile_priv_read_locked (ctx=0x55745bfca8f0,
> > > as=KDUMP_KVADDR, addr=18446612133360081960, buffer=0x7ffdee2c17df,
> > > plength=0x7ffdee2c1698) at read.c:169
> > > #8 0x00007fcd9ae330dd in kdump_read (ctx=0x55745bfca8f0, as=KDUMP_KVADDR,
> > > addr=18446612133360081960, buffer=0x7ffdee2c17df, plength=0x7ffdee2c1698)
> > > at read.c:196
> > > #9 0x00007fcd9afb0cc4 in drgn_read_kdump (buf=0x7ffdee2c17df,
> > > address=18446612133360081960, count=4, offset=18446612133360081960,
> > > arg=0x55745bfca8f0, physical=false) at ../../libdrgn/kdump.c:73
> > > %<-----------------------
> > >
> > > In frame 1 where we are calling the decompressor:
> > >
> > > %<-----------------------
> > > (gdb) frame 1
> > > #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
> > > 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
> > > (gdb) list
> > > 579 if (ret != KDUMP_OK)
> > > 580 return ret;
> > > 581 } else if (pd.flags & DUMP_DH_COMPRESSED_LZO) {
> > > 582 #if USE_LZO
> > > 583 lzo_uint retlen = get_page_size(ctx);
> > > 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
> > > 585 pio->chunk.data,
> > > 586 &retlen,
> > > 587 LZO1X_MEM_DECOMPRESS);
> > > 588 fcache_put_chunk(&fch);
> > > (gdb) p retlen
> > > $7 = 0
> >
> > This is a bit weird. Looking at liblzo sources, it seems to me that
> > the output length is not changed until right before returning from
> > lzo1x_decompress_safe().
> >
> > > (gdb) p pio->chunk.data
> > > $8 = (void *) 0x0
> >
> > OK, here's our immediate root cause. ;-)
> >
> > > (gdb) p fch.data
> > > $9 = (void *) 0x7fcd7cc33da4
> >
> > This looks sane.
> >
> > > (gdb) p pd.size
> > > $10 = 816
> >
> > This also looks sane.
> >
> > > %<-----------------------
> > >
> > > As far as I can tell, pio->chunk.data comes directly from the
> > > cache_get_page() function in frame 2:
> > >
> > > %<-----------------------
> > > (gdb) up
> > > #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
> > > fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
> > > 69 ret = fn(pio);
> > > (gdb) list
> > > 64 pio->chunk.data = entry->data;
> > > 65 pio->chunk.embed_fces->ce = entry;
> > > 66 if (cache_entry_valid(entry))
> > > 67 return KDUMP_OK;
> > > 68
> > > 69 ret = fn(pio);
> > > 70 mutex_lock(&ctx->shared->cache_lock);
> > > 71 if (ret == KDUMP_OK)
> > > 72 cache_insert(pio->chunk.embed_fces->cache, entry);
> > > 73 else
> > > (gdb) p *entry
> > > $11 = {key = 1045860353, state = cs_precious, next = 626, prev = 626, refcnt = 1,
> > > data = 0x0}
> >
> > The key (0x3e569000 | ADDRXLAT_MACHPHYSADDR) corresponds to the
> > requested virtual address 0xffff88003e569c28.
> >
> > > (gdb) p *pio
> > > $12 = {ctx = 0x55745bfca8f0, addr = {addr = 1045860352, as = ADDRXLAT_MACHPHYSADDR},
> > > chunk = {data = 0x0, nent = 1, {embed_fces = {{data = 0xffff880ff1470788,
> > > len = 140728599320032, ce = 0x55745c1003d8, cache = 0x55745c0fb540}, {
> > > data = 0x55745bfd42f0, len = 140728599320112,
> > > ce = 0x7fcd9ae330ef <kdump_read+102>, cache = 0xffff88003e569c28}},
> > > fces = 0xffff880ff1470788}}}
> > > %<-----------------------
> >
> >
> > Looking at pio->chunk->embed_fces->ce, struct cache_entry is at
> > 0x55745c1003d8. Assuming that sizeof(struct cache_entry) == 32 on your
> > system, this is element 626 in the cache entry array. The next and
> > prev indices are also 626, which looks good, because cache->ninflight
> > is 1, so this is the only element in the (circular) in-flight list.
> >
> > Since state is cs_precious, but the data was discarded, this cache
> > entry has just been recovered from a ghost partition, evicting another
> > entry, and _that_ entry had a NULL data pointer.
> >
> > It would be really helpful if I could get the log for this cache
> > instead of the one you posted earlier.
>
> Let's recap:
>
> 1. Data indicates that a reused ghost entry has a NULL data pointer.
> 2. Reverting commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6 masks the issue.
>
> My conclusion is that the data pointer was taken from an entry in the
> unused partition. This partition is empty, except when the cache is
> new, or after calling cache_discard(). Given the statistics counter
> values, the latter is the case here.
>
> I think I found the bug: reinit_entry() finds the unused partition by
> skipping over the ghost probe partition, but if the target itself is a
> reused ghost probe entry, cache->ngprobe was already decremented in
> get_ghost_or_missed_entry().
>
> I'm going to write a test case and fix.
I was able to write a test case for the NULL data pointer. The bug
should be fixed by commit e63dab40c8cefdfc013dc57915de9852dbb283e4.
Stephen, can you rebuild and verify with your dump file that this was
indeed the same bug?
Petr T
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-10 18:03 ` Petr Tesařík
@ 2024-01-10 19:48 ` Stephen Brennan
2024-01-10 19:58 ` Petr Tesařík
0 siblings, 1 reply; 11+ messages in thread
From: Stephen Brennan @ 2024-01-10 19:48 UTC (permalink / raw)
To: Petr Tesařík; +Cc: Omar Sandoval, linux-debuggers
Petr Tesařík <petr@tesarici.cz> writes:
> On Wed, 10 Jan 2024 14:49:28 +0100
> Petr Tesařík <petr@tesarici.cz> wrote:
>
>> On Wed, 10 Jan 2024 09:36:00 +0100
>> Petr Tesařík <petr@tesarici.cz> wrote:
>>
>> > On Tue, 09 Jan 2024 17:40:15 -0800
>> > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>> >
>> > > Petr Tesařík <petr@tesarici.cz> writes:
>> > >
>> > > > On Mon, 8 Jan 2024 21:40:08 +0100
>> > > > Petr Tesařík <petr@tesarici.cz> wrote:
>> > > >
>> > > >> On Fri, 05 Jan 2024 13:53:15 -0800
>> > > >> Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>> > > >>
>> > > >> > Petr Tesařík <petr@tesarici.cz> writes:
>> > > >> > > On Fri, 05 Jan 2024 10:38:16 -0800
>> > > >> > > Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
>> > > >> > >
>> > > >> > >> Hi Petr,
>> > > >> > >>
>> > > >> > >> I recently encountered a segmentation fault with libkdumpfile & drgn
>> > > >> > >> which appears to be related to the cache implementation. I've included
>> > > >> > >> the stack trace at the end of this message, since it's a bit of a longer
>> > > >> > >> one. The exact issue occurred with a test vmcore that I could probably
>> > > >> > >> share with you privately if you'd like. In any case, the reproducer is
>> > > >> > >> fairly straightforward in drgn code:
>> > > >> > >>
>> > > >> > >> for t in for_each_task(prog):
>> > > >> > >>     prog.stack_trace(t)
>> > > >> > >> for t in for_each_task(prog):
>> > > >> > >>     prog.stack_trace(t)
>> > > >> > >>
>> > > >> > >> The repetition is required, the segfault only occurs on the second
>> > > >> > >> iteration of the loop. Which, in hindsight, is a textbook sign that the
>> > > >> > >> issue has to do with caching. I'd expect that the issue is specific to
>> > > >> > >> this vmcore, it doesn't reproduce on others.
>> > > >> > >>
>> > > >> > >> I stuck that into a git bisect script and bisected the libkdumpfile
>> > > >> > >> commit that introduced it:
>> > > >> > >>
>> > > >> > >> commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6
>> > > >> > >> Author: Petr Tesarik <petr@tesarici.cz>
>> > > >> > >> Date: Wed Jan 11 22:53:01 2023 +0100
>> > > >> > >>
>> > > >> > >> Cache: Calculate eprobe in reinit_entry()
>> > > >> > >>
>> > > >> > >> If this function is called to reuse a ghost entry, the probe list
>> > > >> > >> has not been walked yet, so eprobe is left uninitialized.
>> > > >> > >>
>> > > >> > >> This passed the test case, because the correct old value was left
>> > > >> > >> on stack. Modify the test case to poison the stack.
>> > > >> > >>
>> > > >> > >> Signed-off-by: Petr Tesarik <petr@tesarici.cz>
>> > > >> > >>
>> > > >> > >> src/kdumpfile/cache.c | 6 +++++-
>> > > >> > >> src/kdumpfile/test-cache.c | 13 +++++++++++++
>> > > >> > >> 2 files changed, 18 insertions(+), 1 deletion(-)
>> > > >> > >
>> > > >> > > This looks like a red herring to me. The cache most likely continues in
>> > > >> > > a corrupted state without this commit, which may mask the issue (until
>> > > >> > > it resurfaces later).
>> > > >> >
>> > > >> > I see, that makes a lot of sense.
>> > > >> >
>> > > >> > >> I haven't yet tried to debug the logic of the cache implementation and
>> > > >> > >> create a patch. I'm totally willing to try that, but I figured I would
>> > > >> > >> send this report to you first, to see if there's something obvious that
>> > > >> > >> sticks out to your eyes.
>> > > >> > >
>> > > >> > > No, but I should be able to recreate the issue if I get a log of the
>> > > >> > > cache API calls:
>> > > >> > >
>> > > >> > > - cache_alloc() - to know the number of elements
>> > > >> > > - cache_get_entry()
>> > > >> > > - cache_put_entry()
>> > > >> > > - cache_insert()
>> > > >> > > - cache_discard()
>> > > >> > > - cache_flush() - not likely after initialization, but...
>> > > >> >
>> > > >> > I went ahead and logged each of these calls as you suggested, I tried to
>> > > >> > log them at the beginning of the function call and always include the
>> > > >> > cache pointer, cache_entry, and the key. I took the resulting log and
>> > > >> > filtered it to just contain the most recently logged cache prior to the
>> > > >> > crash, compressed it, and attached it. For completeness, the patch
>> > > >> > I used is below (applies to tip branch 8254897 ("Merge pull request #78
>> > > >> > from fweimer-rh/c99")).
>> > > >> >
>> > > >> > I'll also see if I can reproduce it based on the log.
>> > > >>
>> > > >> Thank you for the log. I haven't had much time to look at it, but the
>> > > >> first line is a good hint already:
>> > > >>
>> > > >> 0x56098b68c4c0: cache_alloc(1024, 0)
>> > > >>
>> > > >> Zero size means the data pointers are managed by the caller, so this
>> > > >> must be the cache of mmap()'ed segments. That's the only cache which
>> > > >> installs a cleanup callback with set_cache_entry_cleanup(). There is
>> > > >> only one call to the cleanup callback for evicted entries in cache.c:
>> > > >>
>> > > >>         /* Get an unused cached entry. */
>> > > >>         if (cs->nuprobe != 0 &&
>> > > >>             (cs->nuprec == 0 || cache->nprobe + bias > cache->dprobe))
>> > > >>                 evict = evict_probe(cache, cs);
>> > > >>         else
>> > > >>                 evict = evict_prec(cache, cs);
>> > > >>         if (cache->entry_cleanup)
>> > > >>                 cache->entry_cleanup(cache->cleanup_data, evict);
>> > > >>
>> > > >> The entries can be evicted from the probe partition or from the precious
>> > > >> partition. This might be relevant. Please, can you re-run and log where
>> > > >> the evict entry comes from?
>> > > >
>> > > > I found some time this morning, and it wouldn't help. Because of a bug
>> > > > in fcache_new(), the number of elements in the cache is big enough that
>> > > > cache entries are never evicted in your case. It's quite weird to hit a
>> > > > cache metadata bug after elements have been inserted. FWIW I am not
>> > > > able to reproduce the bug by replaying the logged file read pattern.
>> > > >
>> > > > Since you have a reliable reproducer, it cannot be a Heisenbug. But it
>> > > > could be caused by the other cache - the cache of decompressed pages.
>> > > > Do you know for sure that lzo1x_decompress_safe() crashes while trying
>> > > > to _read_ from the input buffer, and not while trying to _write_ to the
>> > > > output buffer?
>> > >
>> > > Hi Petr,
>> > >
>> > > Sorry for the delay here, I got pulled into other issues and am trying
>> > > to attend to all my work in a round-robin fashion :)
>> >
>> > Hi Stephen,
>> >
>> > that's fine, I also work on this only as time permits. ;-)
>> >
>> > > The fault is definitely in lzo1x_decompress_safe() *writing* to address
>> > > 0. I fetched debuginfo for all the necessary libraries and we see the
>> > > following stack trace:
>> > >
>> > > %<-----------------------
>> > > #0 0x00007fcd9adddef3 in lzo1x_decompress_safe (in=<optimized out>,
>> > > in_len=<optimized out>, out=0x0, out_len=0x7ffdee2c1388, wrkmem=<optimized out>)
>> > > at src/lzo1x_d.ch:120
>> > > #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
>> > > #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
>> > > fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
>> > > #3 0x00007fcd9ae25e44 in diskdump_get_page (pio=0x7ffdee2c1590) at diskdump.c:647
>> > > #4 0x00007fcd9ae32be0 in get_page (pio=0x7ffdee2c1590)
>> > > at /home/stepbren/repos/libkdumpfile/src/kdumpfile/kdumpfile-priv.h:1512
>> > > #5 0x00007fcd9ae32ed4 in get_page_xlat (pio=0x7ffdee2c1590) at read.c:126
>> > > #6 0x00007fcd9ae32f22 in get_page_maybe_xlat (pio=0x7ffdee2c1590) at read.c:137
>> > > #7 0x00007fcd9ae32fb1 in _kdumpfile_priv_read_locked (ctx=0x55745bfca8f0,
>> > > as=KDUMP_KVADDR, addr=18446612133360081960, buffer=0x7ffdee2c17df,
>> > > plength=0x7ffdee2c1698) at read.c:169
>> > > #8 0x00007fcd9ae330dd in kdump_read (ctx=0x55745bfca8f0, as=KDUMP_KVADDR,
>> > > addr=18446612133360081960, buffer=0x7ffdee2c17df, plength=0x7ffdee2c1698)
>> > > at read.c:196
>> > > #9 0x00007fcd9afb0cc4 in drgn_read_kdump (buf=0x7ffdee2c17df,
>> > > address=18446612133360081960, count=4, offset=18446612133360081960,
>> > > arg=0x55745bfca8f0, physical=false) at ../../libdrgn/kdump.c:73
>> > > %<-----------------------
>> > >
>> > > In frame 1 where we are calling the decompressor:
>> > >
>> > > %<-----------------------
>> > > (gdb) frame 1
>> > > #1 0x00007fcd9ae25be1 in diskdump_read_page (pio=0x7ffdee2c1590) at diskdump.c:584
>> > > 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
>> > > (gdb) list
>> > > 579 if (ret != KDUMP_OK)
>> > > 580 return ret;
>> > > 581 } else if (pd.flags & DUMP_DH_COMPRESSED_LZO) {
>> > > 582 #if USE_LZO
>> > > 583 lzo_uint retlen = get_page_size(ctx);
>> > > 584 int ret = lzo1x_decompress_safe(fch.data, pd.size,
>> > > 585 pio->chunk.data,
>> > > 586 &retlen,
>> > > 587 LZO1X_MEM_DECOMPRESS);
>> > > 588 fcache_put_chunk(&fch);
>> > > (gdb) p retlen
>> > > $7 = 0
>> >
>> > This is a bit weird. Looking at liblzo sources, it seems to me that
>> > the output length is not changed until right before returning from
>> > lzo1x_decompress_safe().
>> >
>> > > (gdb) p pio->chunk.data
>> > > $8 = (void *) 0x0
>> >
>> > OK, here's our immediate root cause. ;-)
>> >
>> > > (gdb) p fch.data
>> > > $9 = (void *) 0x7fcd7cc33da4
>> >
>> > This looks sane.
>> >
>> > > (gdb) p pd.size
>> > > $10 = 816
>> >
>> > This also looks sane.
>> >
>> > > %<-----------------------
>> > >
>> > > As far as I can tell, pio->chunk.data comes directly from the
>> > > cache_get_page() function in frame 2:
>> > >
>> > > %<-----------------------
>> > > (gdb) up
>> > > #2 0x00007fcd9ae32d4d in _kdumpfile_priv_cache_get_page (pio=0x7ffdee2c1590,
>> > > fn=0x7fcd9ae257ae <diskdump_read_page>) at read.c:69
>> > > 69 ret = fn(pio);
>> > > (gdb) list
>> > > 64 pio->chunk.data = entry->data;
>> > > 65 pio->chunk.embed_fces->ce = entry;
>> > > 66 if (cache_entry_valid(entry))
>> > > 67 return KDUMP_OK;
>> > > 68
>> > > 69 ret = fn(pio);
>> > > 70 mutex_lock(&ctx->shared->cache_lock);
>> > > 71 if (ret == KDUMP_OK)
>> > > 72 cache_insert(pio->chunk.embed_fces->cache, entry);
>> > > 73 else
>> > > (gdb) p *entry
>> > > $11 = {key = 1045860353, state = cs_precious, next = 626, prev = 626, refcnt = 1,
>> > > data = 0x0}
>> >
>> > The key (0x3e569000 | ADDRXLAT_MACHPHYSADDR) corresponds to the
>> > requested virtual address 0xffff88003e569c28.
>> >
>> > > (gdb) p *pio
>> > > $12 = {ctx = 0x55745bfca8f0, addr = {addr = 1045860352, as = ADDRXLAT_MACHPHYSADDR},
>> > > chunk = {data = 0x0, nent = 1, {embed_fces = {{data = 0xffff880ff1470788,
>> > > len = 140728599320032, ce = 0x55745c1003d8, cache = 0x55745c0fb540}, {
>> > > data = 0x55745bfd42f0, len = 140728599320112,
>> > > ce = 0x7fcd9ae330ef <kdump_read+102>, cache = 0xffff88003e569c28}},
>> > > fces = 0xffff880ff1470788}}}
>> > > %<-----------------------
>> >
>> >
>> > Looking at pio->chunk->embed_fces->ce, struct cache_entry is at
>> > 0x55745c1003d8. Assuming that sizeof(struct cache_entry) == 32 on your
>> > system, this is element 626 in the cache entry array. The next and
>> > prev indices are also 626, which looks good, because cache->ninflight
>> > is 1, so this is the only element in the (circular) in-flight list.
>> >
>> > Since state is cs_precious, but the data was discarded, this cache
>> > entry has just been recovered from a ghost partition, evicting another
>> > entry, and _that_ entry had a NULL data pointer.
>> >
>> > It would be really helpful if I could get the log for this cache
>> > instead of the one you posted earlier.
>>
>> Let's recap:
>>
>> 1. Data indicates that a reused ghost entry has a NULL data pointer.
>> 2. Reverting commit 487a8042ea5da580e1fdb5b8f91c8bd7cad05cd6 masks the issue.
>>
>> My conclusion is that the data pointer was taken from an entry in the
>> unused partition. This partition is empty, except when the cache is
>> new, or after calling cache_discard(). Given the statistics counter
>> values, the latter is the case here.
>>
>> I think I found the bug: reinit_entry() finds the unused partition by
>> skipping over the ghost probe partition, but if the target itself is a
>> reused ghost probe entry, cache->ngprobe was already decremented in
>> get_ghost_or_missed_entry().
>>
>> I'm going to write a test case and fix.
>
> I was able to write a test case for the NULL data pointer. The bug
> should be fixed by commit e63dab40c8cefdfc013dc57915de9852dbb283e4.
>
> Stephen, can you rebuild and verify with your dump file that this was
> indeed the same bug?
Hi Petr,
I went ahead and pulled the latest from tip and ran my reproducer. The
segmentation fault is indeed gone!
I tested c10036e ("cache: Optimize data reclaim for missed entries"), to
be exact, which contains e63dab4 as well.
Thank you so much for taking the time to go back & forth with me on this
one! I suppose there's a reason that caching is one of the two hard
problems in computer science ;)
I owe you a beverage of your choice next time we see each other!
Thanks,
Stephen
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Segmentation fault with drgn + libkdumpfile
2024-01-10 19:48 ` Stephen Brennan
@ 2024-01-10 19:58 ` Petr Tesařík
0 siblings, 0 replies; 11+ messages in thread
From: Petr Tesařík @ 2024-01-10 19:58 UTC (permalink / raw)
To: Stephen Brennan; +Cc: Omar Sandoval, linux-debuggers
On Wed, 10 Jan 2024 11:48:22 -0800
Stephen Brennan <stephen.s.brennan@oracle.com> wrote:
> Petr Tesařík <petr@tesarici.cz> writes:
>[...]
> > I was able to write a test case for the NULL data pointer. The bug
> > should be fixed by commit e63dab40c8cefdfc013dc57915de9852dbb283e4.
> >
> > Stephen, can you rebuild and verify with your dump file that this was
> > indeed the same bug?
>
> Hi Petr,
>
> I went ahead and pulled the latest from tip and ran my reproducer. The
> segmentation fault is indeed gone!
Great!
> I tested c10036e ("cache: Optimize data reclaim for missed entries"), to
> be exact, which contains e63dab4 as well.
I spent some time pondering whether the other call of reinit_entry() is
also affected. But it's not, so this other commit should be a no-op.
> Thank you so much for taking the time to go back & forth with me on this
> one! I suppose there's a reason that caching is one of the two hard
> problems in computer science ;)
Given that the relevant counter was decremented, it may also be seen as
an off-by-one bug. I think it's all a matter of naming things...
> I owe you a beverage of your choice next time we see each other!
Beer if in Europe, bourbon if in the US. Thanks. ;-)
Petr T
^ permalink raw reply [flat|nested] 11+ messages in thread
Thread overview: 11+ messages
2024-01-05 18:38 Segmentation fault with drgn + libkdumpfile Stephen Brennan
2024-01-05 19:23 ` Petr Tesařík
2024-01-05 21:53 ` Stephen Brennan
2024-01-08 20:40 ` Petr Tesařík
2024-01-09 9:06 ` Petr Tesařík
2024-01-10 1:40 ` Stephen Brennan
2024-01-10 8:36 ` Petr Tesařík
2024-01-10 13:49 ` Petr Tesařík
2024-01-10 18:03 ` Petr Tesařík
2024-01-10 19:48 ` Stephen Brennan
2024-01-10 19:58 ` Petr Tesařík