Kexec Archive on lore.kernel.org
* crash: struct command can read irrelevant pages.
@ 2014-02-19  6:01 Atsushi Kumagai
  2014-02-19 14:50 ` [Crash-utility] " Dave Anderson
  0 siblings, 1 reply; 5+ messages in thread
From: Atsushi Kumagai @ 2014-02-19  6:01 UTC (permalink / raw)
  To: crash-utility@redhat.com, kexec@lists.infradead.org

Hello,

I've finally found the cause of the issue I mentioned (quoted below)
when makedumpfile v1.5.5 was released:

> 2. At first, the supported kernel will be updated to 3.12, but I
> found an issue while testing for v1.5.5, which seems that the page
> filtering works wrongly on kernel 3.12. I couldn't investigate this
> yet and it will take some time to finish it.
> Therefore, the latest supported kernel version is 3.11 in v1.5.5.

This is neither a kernel issue nor a makedumpfile issue; it is a bug in crash.
It can happen when a slab cache is stored near the end of a page.

== Description ==

At first, I saw the error message below when I ran crash on
a dumpfile generated by makedumpfile -d2:

    please wait... (gathering kmem slab cache data)
    crash: page excluded: kernel virtual address: f4e87000  type: "kmem_cache buffer"

    crash: unable to initialize kmem slab cache subsystem

This message indicates that crash failed to read a slab cache during
kmem_cache_init(), and according to the output below, the one it failed
to read is the slab cache stored at f4e86f40:

    crash> p kmem_cache
    kmem_cache = $1 = (struct kmem_cache *) 0xc0b1cbc0 <kmem_cache_boot>
    crash>
    crash> list kmem_cache.list -s kmem_cache.name -h 0xc0b1cbc0
    ...
    f4d37840
      name = 0xf4edf540 "uid_cache"
    f4e86f40
    list: page excluded: kernel virtual address: f4e87000  type: "gdb_readmem_callback"

It looks as if the slab cache spans two pages, [f4e86000- f4e87000] and
[f4e87000- f4e88000]. Let's confirm its *real* size.

Since all slab caches except kmem_cache_boot are allocated as slab objects,
we can confirm the size as follows:

  crash> p kmem_cache
  kmem_cache = $2 = (struct kmem_cache *) 0xc0b1cbc0 <kmem_cache_boot>
  crash> struct kmem_cache.object_size 0xc0b1cbc0
    object_size = 104
  crash>

In my environment, the size was 104 bytes. Therefore, the slab cache
stored at f4e86f40 fits in a single page ([f4e86000- f4e87000]), and
the excluded page ([f4e87000- f4e88000]) is unrelated to it.

On the other hand, crash gets the size from vmlinux by using gdb,
and there it was 216 bytes:

    crash> struct kmem_cache
    struct kmem_cache {
        unsigned int batchcount;
        unsigned int limit;
        ...
        struct kmem_cache_node **node;
        struct array_cache *array[33];
    }
    SIZE: 216
    crash>

So crash wrongly assumed that the slab cache spanned both
[f4e86000- f4e87000] and [f4e87000- f4e88000], even though the latter
was an irrelevant page.
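
The arithmetic behind this can be checked with a small C sketch. It assumes
4 KiB pages (the usual size on 32-bit x86; the mail does not state it), and
the helper names page_base and read_crosses_page are made up for
illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u  /* assumption: 4 KiB pages (32-bit x86) */

/* Base address of the page that contains addr. */
static uint32_t page_base(uint32_t addr)
{
    return addr & ~(PAGE_SIZE_ASSUMED - 1u);
}

/* True if reading `size` bytes starting at `addr` touches more than one page. */
static bool read_crosses_page(uint32_t addr, uint32_t size)
{
    assert(size > 0);
    return page_base(addr) != page_base(addr + size - 1u);
}
```

With the address from the session above, a 104-byte read starting at
f4e86f40 ends at f4e86fa7 and stays inside [f4e86000- f4e87000], while a
216-byte read ends at f4e87017 and therefore touches the excluded page.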

This gap comes from the fact that the size of a slab cache is variable:

    struct kmem_cache {
    ...
            struct kmem_cache_node **node;
            struct array_cache *array[NR_CPUS + MAX_NUMNODES];
            /*
             * Do not add fields after array[]
             */
    };

The size of "array" is the variable part of kmem_cache.
When vmlinux is built, the size of kmem_cache is calculated from
NR_CPUS and MAX_NUMNODES and recorded in vmlinux as debug information.
(Sorry, I don't know gcc well, so I may have misunderstood this.)
However, the actual size is smaller than the defined size because
it is decided based on the actual number of CPUs and nodes:

    void __init kmem_cache_init(void)
    ...
            /*
             * struct kmem_cache size depends on nr_node_ids & nr_cpu_ids
             */
            create_boot_cache(kmem_cache, "kmem_cache",
                    offsetof(struct kmem_cache, array[nr_cpu_ids]) +
                    nr_node_ids * sizeof(struct kmem_cache_node *),  /* object_size */
                    SLAB_HWCACHE_ALIGN);
            list_add(&kmem_cache->list, &slab_caches);
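
To make the difference concrete, here is a rough model of that calculation
in plain C. It is only a sketch: mock_kmem_cache is not the kernel's layout,
the 84-byte header is simply 216 minus the 33 four-byte pointers shown in
the gdb output above, and the split of 33 into 32 CPUs plus 1 node is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NR_CPUS      32u  /* assumed build-time maximum */
#define MOCK_MAX_NUMNODES 1u   /* assumed build-time maximum */

/* Stand-in for struct kmem_cache on a 32-bit kernel (4-byte pointers). */
struct mock_kmem_cache {
    uint8_t  header[84];  /* placeholder for every field before array[] */
    uint32_t array[MOCK_NR_CPUS + MOCK_MAX_NUMNODES];
};

/*
 * Size actually used at runtime: everything up to array[], plus one slot
 * per possible CPU, plus one pointer per node, as in kmem_cache_init().
 */
static size_t mock_runtime_size(unsigned nr_cpu_ids, unsigned nr_node_ids)
{
    assert(nr_cpu_ids <= MOCK_NR_CPUS && nr_node_ids <= MOCK_MAX_NUMNODES);
    return offsetof(struct mock_kmem_cache, array)
         + nr_cpu_ids * sizeof(uint32_t)
         + nr_node_ids * sizeof(uint32_t);
}
```

Under these assumptions sizeof(struct mock_kmem_cache) is 216, which is what
the debug information records, while mock_runtime_size(4, 1) is 104. That
would match the object_size above if five slots were in use in total (for
example 4 CPU IDs and 1 node ID; the actual counts are not stated here).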


As for kmem_cache, we can get its actual size from kmem_cache_boot,
but I suppose that kmem_cache is not the only struct in the kernel whose
size is variable. So I think we should discuss how to address issues like
this in general.

By the way, I described the case of *SLAB* in this mail,
but SLUB seems to have the same issue.


Thanks
Atsushi Kumagai

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* Re: [Crash-utility] crash: struct command can read irrelevant pages.
  2014-02-19  6:01 crash: struct command can read irrelevant pages Atsushi Kumagai
@ 2014-02-19 14:50 ` Dave Anderson
  2014-02-20 20:45   ` Dave Anderson
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Anderson @ 2014-02-19 14:50 UTC (permalink / raw)
  To: Discussion list for crash utility usage, maintenance and development
  Cc: kexec



This is a "known" issue that has been discussed on the crash-utility list in
the past, at least with respect to the kmem_cache data structure.  But for any
random data structure that has such a construct, I'm not sure what can be done.

In the case of the CONFIG_SLAB kmem_cache data structure, there is a function
that is supposed to "downsize" the size value of the kmem_cache data structure
that is returned by gdb.  It is called here in kmem_cache_init(), just
prior to cycling through all of the kmem_cache structures, where the
page excluded error shown above occurred:

        if (!(pc->flags & RUNTIME))
                kmem_cache_downsize();

        cache_buf = GETBUF(SIZE(kmem_cache_s));
        hq_open();

        do {
                cache_count++;

                if (!readmem(cache, KVADDR, cache_buf, SIZE(kmem_cache_s),
                        "kmem_cache buffer", RETURN_ON_ERROR)) {
                        FREEBUF(cache_buf);
                        vt->flags |= KMEM_CACHE_UNAVAIL;
                        error(INFO,
                          "%sunable to initialize kmem slab cache subsystem\n\n",
                                DUMPFILE() ? "\n" : "");
                        hq_close();
                        return;
                }

The SIZE(kmem_cache_s) value should have been downsized by that function,
but presumably it did not work.  If CRASHDEBUG(1) had been turned on during
initialization, you would have seen one of these two messages from
kmem_cache_downsize():
 
                if (CRASHDEBUG(1))
                        fprintf(fp, "kmem_cache_downsize: %ld to %ld\n",
                                STRUCT_SIZE("kmem_cache"), SIZE(kmem_cache_s));

or:

                if (CRASHDEBUG(1)) {
                        fprintf(fp,
                            "\nkmem_cache_downsize: SIZE(kmem_cache_s): %ld "
                            "cache_cache.buffer_size: %d\n",
                                STRUCT_SIZE("kmem_cache"), buffer_size);
                        fprintf(fp,
                            "kmem_cache_downsize: nr_node_ids: %ld\n",
                                vt->kmem_cache_len_nodes);
                }

The function probably failed due to some kernel change.  In fact,
I just checked a 3.13 CONFIG_SLAB kernel, and I see that kmem_cache_downsize()
no longer works for that kernel.

I see that kmem_cache_boot would be a good alternative for determining
the size on CONFIG_SLAB kernels, at least on 3.7 and later kernels where
it was introduced.  And for CONFIG_SLUB, which doesn't currently have a
"downsize" function, it looks like its "kmem_cache" cache also has size
fields that could be used.
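
Roughly, the idea is to prefer the size the kernel recorded at runtime over
the compile-time maximum from the debug information. A minimal sketch of
that selection follows; effective_kmem_cache_size is a hypothetical helper
written for illustration, not a function in crash:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not crash's actual code: choose how many bytes to
 * read for one struct kmem_cache.
 *   debuginfo_size: size reported by gdb from vmlinux (compile-time maximum)
 *   object_size:    object_size read from the boot cache, 0 if unavailable
 */
static size_t effective_kmem_cache_size(size_t debuginfo_size,
                                        size_t object_size)
{
    assert(debuginfo_size > 0);
    /* Trust the runtime value only when it is present and not larger. */
    if (object_size > 0 && object_size < debuginfo_size)
        return object_size;
    return debuginfo_size;
}
```

With the values from this thread, a debuginfo size of 216 and an
object_size of 104 would give a 104-byte read; if object_size cannot be
read, the sketch falls back to 216.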

By any chance can you make the 32-bit vmlinux/vmcore pair available for
me to download?  Reply to me off-list if you can.

Thanks,
  Dave

* Re: [Crash-utility] crash: struct command can read irrelevant pages.
  2014-02-19 14:50 ` [Crash-utility] " Dave Anderson
@ 2014-02-20 20:45   ` Dave Anderson
  2014-02-24  5:00     ` Atsushi Kumagai
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Anderson @ 2014-02-20 20:45 UTC (permalink / raw)
  To: Discussion list for crash utility usage, maintenance and development
  Cc: kexec


Hello Atsushi,
 
I've committed a SLAB/SLUB kmem_cache-specific fix for this issue:

  https://github.com/crash-utility/crash/commit/c0b7a74fc13121203810d06d163550436b2d5476

which is queued for crash-7.0.6.

Thanks,
  Dave

* RE: crash: struct command can read irrelevant pages.
  2014-02-20 20:45   ` Dave Anderson
@ 2014-02-24  5:00     ` Atsushi Kumagai
  2014-02-24 14:59       ` [Crash-utility] " Dave Anderson
  0 siblings, 1 reply; 5+ messages in thread
From: Atsushi Kumagai @ 2014-02-24  5:00 UTC (permalink / raw)
  To: crash-utility@redhat.com; +Cc: kexec@lists.infradead.org

>Hello Atsushi,
>
>I've committed a SLAB/SLUB kmem_cache-specific fix for this issue:
>
>  https://github.com/crash-utility/crash/commit/c0b7a74fc13121203810d06d163550436b2d5476
>
>which is queued for crash-7.0.6.

Thanks Dave, I confirmed that this patch solves my problem.

>> This is a "known" issue that has been discussed on the crash-utility list in the
>> past, at least with respect to the kmem_cache data structure.  But for any random
>> data structure that has such a construct, I'm not sure what can be done.

I have no idea how to solve it either, but it doesn't seem to have been a
practical problem so far. So I think your patch is enough for now.

>> By any chance can you make the 32-bit vmlinux/vmcore pair available for
>> me to download?  Reply to me off-list if you can.

Sure, I'll send another mail.


Thanks
Atsushi Kumagai

* Re: [Crash-utility] crash: struct command can read irrelevant pages.
  2014-02-24  5:00     ` Atsushi Kumagai
@ 2014-02-24 14:59       ` Dave Anderson
  0 siblings, 0 replies; 5+ messages in thread
From: Dave Anderson @ 2014-02-24 14:59 UTC (permalink / raw)
  To: Discussion list for crash utility usage, maintenance and development
  Cc: kexec



----- Original Message -----
> >Hello Atsushi,
> >
> >I've committed a SLAB/SLUB kmem_cache-specific fix for this issue:
> >
> >  https://github.com/crash-utility/crash/commit/c0b7a74fc13121203810d06d163550436b2d5476
> >
> >which is queued for crash-7.0.6.
> 
> Thanks Dave, I made sure that this patch solved my problem.

OK good, thanks.
 
> >> This is a "known" issue that has been discussed on the crash-utility list in the
> >> past, at least with respect to the kmem_cache data structure.  But for any random
> >> data structure that has such a construct, I'm not sure what can be done.
> 
> I also have no ideas how to solve it, but it seems that it hasn't been a
> practical problem yet. So I think your patch is enough for now.

At least with compressed kdumps, the "crash --zero_excluded" command line option,
or "set zero_excluded on" during runtime, should handle 99% of the cases.
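
As a toy model of what that option changes (this is not crash's
implementation; mock_dump and mock_read are invented names, and the
two-page dump with 4 KiB pages is an assumption), a read that runs into an
excluded page either fails or has that part filled with zeros:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE 4096u
#define MOCK_NPAGES    2u

/* Toy dump: page contents plus a flag saying whether the page was kept. */
struct mock_dump {
    uint8_t data[MOCK_NPAGES][MOCK_PAGE_SIZE];
    bool    present[MOCK_NPAGES];
};

/*
 * Copy `len` bytes starting at byte offset `off` into `buf`.
 * An excluded page makes the read fail unless zero_excluded is set,
 * in which case the bytes from that page are returned as zeros.
 */
static bool mock_read(const struct mock_dump *d, size_t off, void *buf,
                      size_t len, bool zero_excluded)
{
    uint8_t *out = buf;

    assert(off + len <= (size_t)MOCK_NPAGES * MOCK_PAGE_SIZE);
    while (len > 0) {
        size_t page = off / MOCK_PAGE_SIZE;
        size_t in_page = off % MOCK_PAGE_SIZE;
        size_t chunk = MOCK_PAGE_SIZE - in_page;

        if (chunk > len)
            chunk = len;
        if (d->present[page])
            memcpy(out, &d->data[page][in_page], chunk);
        else if (zero_excluded)
            memset(out, 0, chunk);
        else
            return false;  /* models the "page excluded" failure */
        out += chunk;
        off += chunk;
        len -= chunk;
    }
    return true;
}
```

In this model, a 216-byte read at offset 0xf40 with the second page
excluded fails without the flag and succeeds with it, returning zeros for
the bytes that fall in the excluded page.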

Dave