From: Harry Yoo <harry.yoo@oracle.com>
To: Lilith Gkini <lilithpgkini@gmail.com>
Cc: Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Roman Gushchin <roman.gushchin@linux.dev>,
Hyeonggon Yoo <42.hyeyoo@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] slub: Fix Off-By-One in the While condition in on_freelist()
Date: Tue, 25 Feb 2025 19:08:09 +0900 [thread overview]
Message-ID: <Z72WiUlhxOGnrXFb@harry> (raw)
In-Reply-To: <Z7xiHTN0Q5yI-UmP@Arch>
On Mon, Feb 24, 2025 at 02:12:13PM +0200, Lilith Gkini wrote:
> On Mon, Feb 24, 2025 at 09:00:14AM +0900, Harry Yoo wrote:
> > On second thought (after reading your email),
> > I think it should be (fp == NULL) && (search == NULL)?
> >
> > When fp is NULL after the loop, it means (the end of the freelist
> > is NULL) AND (there were equal to or less than (slab->objects - nr) objects
> > on the freelist).
> >
> > If the loop has ended after observing fp == NULL,
> > on_freelist() should return true only when search == NULL.
> >
> > If fp != NULL, it means there were more objects than it should have on
> > the free list. In that case, we can't take risk of iterating the loop
> > forever and it's reasonable to assume that the freelist does not
> > end with NULL.
> >
> > > The reason I m saying this is cause the While loop is looking through
> > > every non-NULL freelist pointer for the "search" pattern. If someone
> > > wants to search for NULL in the freelist that return 1 will never
> > > trigger, even for normal freelists. If "fp" was ever null it wouldn't re
> > > enter the loop. Thus the result of the function would be search == NULL
> > > because the original writers assumed the freelist will always end with
> > > NULL.
> >
> > Agreed.
> >
> > > As you correctly pointed out there might be a case where the freelist
> > > is corrupted and it doesnt end in NULL. In that case two things could happen:
> > >
> > > 1) The check_valid_pointer() catches it and fixes it, restoring the
> > > corrupted freelist.
> > >
> > > 2) The check_valid_pointer() fails to catch it and there isn't any NULL.
> > >
> > >
> > > In the first case the problem fixes itself and thats probably why
> > > validate_slab() worked fine all this time.
> > >
> > > The second case is pretty rare, but thats the case that validate_slab()
> > > wants to rout out when it checks the `!on_freelist(s, slab, NULL)`,
> > > right?
> >
> > Yes.
> >
> > > Also to make my added suggestion of `return fp == NULL` work in the first
> > > case (since it does corrrect the freelist we want it to now return TRUE)
> > > we could also add a line that nulls out the fp right after the
> > > `set_freepointer(s, object, NULL);`.
> >
> > Why?
> > fp = get_freepionter() will observe NULL anyway.
> >
> > > But words are cheap, I should test it out dynamically rather than just
> > > reading the code to see how it behaves. Feel free to also test it
> > > yourself.
> >
> > Yes, testing is important, and you should test
> > to some extent before submitting a patch.
> >
> > > I know I am supposed to send a proper Patch with the multiple added
> > > lines, but for now we are mostly brainstorming solutions. It will be
> > > better to see its behavior first in a debugger I think.
> >
> > I think it's generally considered good practice to discuss before
> > writing any code.
> >
> > > I hope I am making sense with the thought process I outlined for the
> > > return thing. I probably shouldn't be writing emails ealry Saturday
> > > morning, haha.
> > >
> > > Also I really appreciate the kind feedback! The Linux Kernel is infamously
> > > known for how rude and unhinged people can be, which can make it a bit
> > > stressful and intimidating sending any emails, especially for someone
> > > starting out such as myself.
> >
> > You're welcome ;-)
> >
> > --
> > Cheers,
> > Harry
Hi, Lilith.
If you don't mind, could you please avoid bottom posting and
reply with inline comments like how I reply to you?
It makes it easier to follow the conversation.
> Alright, I managed to run some tests through a debugger.
>
> You are right, my code isn't correct, `return fp == search` should be
> more appropriate.
>
> So I think the right way would be to do these changes:
> - while (fp && nr < slab->objects) {
>
> - fp = NULL;
>
> - return fp == search;
>
> The first line removes the "=" so it doesnt iterate more times than
> slab->objects.
Yes.
> The second line sets fp to NULL for the case where the code caught a
> corrupted freelist (check_valid_pointer) and fixes it by setting
> the freepointer to NULL (set_freepointer). Now NULL will be in the
> freelist.
Yes. but do we care about the return value of on_freelist()
when the freelist is corrupted? (we don't know if it was null-terminated
or not because it's corrupted.)
> The third line is the return value:
> TRUE if the final fp we got happens to be the search value the caller
> was looking for in the freelist.
> FALSE if the final fp we got was not the same as the search value.
>
> This should make the validate_slab() work right and if anyone ever uses
> the on_freelist() again with some other search value other than NULL it
> should also work as intended.
Yes! that makes sense to me.
> For my tests I wrote a kernel module that creates an isolated cache with
> 8 objects per slab, allocated all 8 of them and freed them. Then called
> validate_slab_cache() with my cache and set a breakpoint at on_freelist().
> From there I could set any value I wanted and observe its behavior
> through GDB (I used QEMU and GDB-ed remotely from my host).
> This way I didn't have to set a bunch of EXPORT_SYMBOL() and change
> stuff to not static; It made testing a lot easier.
>
> Here are the tests I did with this new change I just mentioned.
>
> Note: By "full slab" I mean that I allocated every object in the slab
> and freed them.
FYI in slab terms (slab->inuse == slab->objects) means full slab,
You probably meant empty slab?
> By "partial" I mean that I only allocated and freed some
> objects, but not every object in the slab, ie if the total objects can
> be 8 I only alloc-ed and freed 7.
>
> search == NULL
> - full slab
> - correct tail true
> - corrupted tail with garbage false
> - corrupted tail with valid address false
Two falses because when the freepointer of the tail object
is corrupted, the loop stops when nr equals slab->objects?
> - partial slab
> - correct tail true
> - corrupted tail with garbage true
This is true because for partial slabs, the number of objects
plus 1 for the garbage, does not exceed slab->objects?
> - corrupted tail with valid address false
>
> search == some fake address
> - full slab
> - correct tail false
> - corrupted tail with garbage false
> - corrupted tail with valid address false
>
> - partial slab
> - correct tail false
> - corrupted tail with garbage false
> - corrupted tail with valid address false
>
>
> search == some address in the freelist
> - full slab
> - correct tail true
> - corrupted tail with garbage true
> - corrupted tail with valid address true
>
> - partial slab
> - correct tail true
> - corrupted tail with garbage true
> - corrupted tail with valid address true
The result seems valid to me. At least, this is the best SLUB can do
while avoiding the risk of infinite loop iteration.
> I apologize if am going into too many details with my testing, I just
> wanna make sure I didn't miss anything.
No, it's ok ;-)
> If my proposed changes look confusing I can send a proper patch.
> Should I send it in this chain as a reply, or send a new email
> and add you as well to the cc?
You can either send a new email or reply to this email with
In-Reply-To header. And please cc SLAB ALLOCATOR folks in MAINTAINERS file
including me—I recently changed my name and email (previously Hyeonggon Yoo).
--
Cheers,
Harry
next prev parent reply other threads:[~2025-02-25 10:08 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-15 16:57 [PATCH] slub: Fix Off-By-One in the While condition in on_freelist() Lilitha Persefoni Gkini
2025-02-20 8:20 ` Harry Yoo
2025-02-20 9:21 ` Harry Yoo
2025-02-21 14:57 ` Lilith Gkini
2025-02-22 3:58 ` Harry Yoo
2025-02-22 9:24 ` Lilith Gkini
2025-02-24 0:00 ` Harry Yoo
2025-02-24 12:12 ` Lilith Gkini
2025-02-25 10:08 ` Harry Yoo [this message]
2025-02-27 16:40 ` Lilith Gkini
2025-03-02 13:11 ` Harry Yoo
-- strict thread matches above, loose matches on Subject: below --
2025-03-02 18:01 Lilith Persefoni Gkini
2025-03-03 11:06 ` Vlastimil Babka
2025-03-03 16:41 ` Lilith Gkini
2025-03-03 17:39 ` Christoph Lameter (Ampere)
2025-03-03 19:06 ` Vlastimil Babka
2025-03-04 8:24 ` Lilith Gkini
2025-03-04 8:41 ` Vlastimil Babka
2025-03-04 11:06 ` Lilith Gkini
2025-03-04 11:20 ` Vlastimil Babka
2025-03-04 12:18 ` Lilith Gkini
2025-03-04 14:25 ` Vlastimil Babka
2025-03-04 17:14 ` Lilith Gkini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z72WiUlhxOGnrXFb@harry \
--to=harry.yoo@oracle.com \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=iamjoonsoo.kim@lge.com \
--cc=lilithpgkini@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.