From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58EFBC43334 for ; Tue, 14 Jun 2022 15:55:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B02486B00A5; Tue, 14 Jun 2022 11:55:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AB1916B00A6; Tue, 14 Jun 2022 11:55:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 978796B00A7; Tue, 14 Jun 2022 11:55:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 87EA36B00A5 for ; Tue, 14 Jun 2022 11:55:03 -0400 (EDT) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 5B9FBF87 for ; Tue, 14 Jun 2022 15:55:03 +0000 (UTC) X-FDA: 79577290086.04.2231BEB Received: from mail-lj1-f176.google.com (mail-lj1-f176.google.com [209.85.208.176]) by imf19.hostedemail.com (Postfix) with ESMTP id E5E8A1A0098 for ; Tue, 14 Jun 2022 15:55:02 +0000 (UTC) Received: by mail-lj1-f176.google.com with SMTP id s10so10203794ljh.12 for ; Tue, 14 Jun 2022 08:55:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=1tdwDOTcXF1J86aFeVAQ1bLCziuFw6qe8M9l3cjjghc=; b=QHO8Y5kQknEqWXCUS3+9/ppmF1lVb3XWX26U60n3metMU7uBiMJxDZEqrp4EiGsX4J LpCTPvrO11mqmBSSBPbmWU8pJqjdqYvAypYLphvvxHfbTEaYd1Y2TpLXcxyJ0LcyACIJ PrKwsu8jeaJ4R9Tl09RgsPhwjAxDTRBx7PkzFQKrZ8cV/Y+cArY5moM3XjW+N8ZvDuKV POWdK+Xc21eVKjYPkLHPTfrESauHXjXfayJBxpbVQmpJZG0mHe/I3tRZE22x7o58RT2h ufFGan7qh/7sYPgANT5+lh3//nNJL6IjGDiD7Dt3qJsScBFqCVZZHqTez+9BScDVgasu d11Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=1tdwDOTcXF1J86aFeVAQ1bLCziuFw6qe8M9l3cjjghc=; b=jczAHnMSS3P5pD6Jv1x8vVR7SiPfgPIcFJ1DiGMm5i3gYfIu5jpvs1tj4ZRptytRaS MiD8O8JpiswyuOsxNjOnSKr68INd0HQqMfvrmOh/PzdhsAwnPKMwyeBFRGLa9ljGA6+d x/N7e7HkAdrTcn89aMWjVGdjCqgCuGI8fU+hWJP2u77klmjIbmhCEumRouO9ZwoWKHgz 2Yp8ar+8JBI7qPp/YuvM7adTFAbHQXZ0byrQCJHj3Tb/R8IQyW11Z68ajx3MiwKY/muB FFYsJaBX8MlOY97ttSjr/pu/CpTKAO6btXLO6ZE4adrOaewpZOStsXwa5w02u1HyB8dp FoPw== X-Gm-Message-State: AJIora/mg6lBmdmpMVXJZd7LArAdHNAhFyNSTPNHEaSdELwHrDb+JHFB nA7pfAeBca9wOM4GWC8OKnkbQ+ibOE8AiLq8wpKzCQ== X-Google-Smtp-Source: AGRyM1tNsurC9JadtJ1iKgohdzsrA+F7tv8I3bFp/RtXaVg4DBkc1AjBH1m86u8yzocFpD4/y+mKLzzWVMRTa0ocx3M= X-Received: by 2002:a2e:a225:0:b0:255:81e3:dc48 with SMTP id i5-20020a2ea225000000b0025581e3dc48mr2898624ljm.375.1655222101114; Tue, 14 Jun 2022 08:55:01 -0700 (PDT) MIME-Version: 1.0 References: <20220608182205.2945720-1-jannh@google.com> <95a9f679-93d9-548a-fc26-985ec605e7f8@suse.cz> In-Reply-To: <95a9f679-93d9-548a-fc26-985ec605e7f8@suse.cz> From: Jann Horn Date: Tue, 14 Jun 2022 17:54:24 +0200 Message-ID: Subject: Re: [PATCH] mm/slub: add missing TID updates on slab deactivation To: Vlastimil Babka Cc: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1655222103; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1tdwDOTcXF1J86aFeVAQ1bLCziuFw6qe8M9l3cjjghc=; b=S5LYZ9e/4NKSBoseXyyIBzS2jVNeAg/YiDb5qfaTAjhUcbOjQr/O0JP4rMU9TIvV/0WUUQ ysGtUAS3xOGLlVpaozYuOVKuHKa0pDvY6imYufnVZtk/iPnSrOxLkav4PXqE5umE8fd9on B1ZQnT6ZG840ltPyxQ6KgxQL5XSoV1M= ARC-Authentication-Results: i=1; imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=QHO8Y5kQ; spf=pass (imf19.hostedemail.com: domain of jannh@google.com designates 209.85.208.176 as permitted sender) smtp.mailfrom=jannh@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1655222103; a=rsa-sha256; cv=none; b=5R9YwCVNPjts6AfJIJTtOOgajo4jvmm2fvLtmnU7M8Z3rEH018LSG02r/RaFoap4mXPhUW OV4lr/E5E5jBYLdsvTW41Eg32JgIH9EzPERdWyBJhgFpd4CjmJ2MoL23i0OXVavXEORV5c 2yWP2JER16yy9D5clHXdGLlTkVMTUrs= X-Stat-Signature: ryc333kcxw3cfmf836r6f9ib9913qn5o X-Rspamd-Queue-Id: E5E8A1A0098 X-Rspamd-Server: rspam11 X-Rspam-User: Authentication-Results: imf19.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=QHO8Y5kQ; spf=pass (imf19.hostedemail.com: domain of jannh@google.com designates 209.85.208.176 as permitted sender) smtp.mailfrom=jannh@google.com; dmarc=pass (policy=reject) header.from=google.com X-HE-Tag: 1655222102-57476 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Jun 14, 2022 at 10:23 AM Vlastimil Babka wrote: > On 6/8/22 20:22, Jann Horn wrote: > > The fastpath in slab_alloc_node() assumes that c->slab is stable as long as > > the TID stays the same. However, two places in __slab_alloc() currently > > don't update the TID when deactivating the CPU slab. > > > > If multiple operations race the right way, this could lead to an object > > getting lost; or, in an even more unlikely situation, it could even lead to > > an object being freed onto the wrong slab's freelist, messing up the > > `inuse` counter and eventually causing a page to be freed to the page > > allocator while it still contains slab objects. [...] > > Fixes: c17dda40a6a4e ("slub: Separate out kmem_cache_cpu processing from deactivate_slab") > > Fixes: 03e404af26dc2 ("slub: fast release on full slab") > > Cc: stable@vger.kernel.org > > Hmm these are old commits, and currently oldest LTS is 4.9, so this will be > fun. Worth doublechecking if it's not recent changes that actually > introduced the bug... but seems not, AFAICS. [...] > > @@ -2936,6 +2936,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, > > > > if (!freelist) { > > c->slab = NULL; > > + c->tid = next_tid(c->tid); > > local_unlock_irqrestore(&s->cpu_slab->lock, flags); > > So this immediate unlock after setting NULL is new from the 5.15 preempt-rt > changes. However even in older versions we could goto new_slab, > new_slab_objects(), new_slab(), allocate_slab(), where if > (gfpflags_allow_blocking()) local_irq_enable(); (there's no extra disabled > preemption besides the irq disable) so I'd say the bug was possible before > too, but less often? Yeah, I think so too. > > stat(s, DEACTIVATE_BYPASS); > > goto new_slab; > > @@ -2968,6 +2969,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, > > freelist = c->freelist; > > c->slab = NULL; > > c->freelist = NULL; > > Previously these were part of deactivate_slab(), which does that at the very > end, but also without bumping tid. > I just wonder if it's necessary too, because IIUC the scenario you described > relies on the missing bump above. This alone doesn't cause the c->slab vs > c->freelist mismatch? It's a different scenario, but at least in the current version, the ALLOC_NODE_MISMATCH case jumps straight to the deactivate_slab label, which takes the local_lock, grabs the old c->freelist, NULLs out ->slab and ->freelist, then drops the local_lock again. If the c->freelist was non-NULL, then this will prevent concurrent cmpxchg success; but there is no reason why c->freelist has to be non-NULL here. So if c->freelist is already NULL, we basically just take the local_lock, set c->slab to NULL, and drop the local_lock. And IIUC the local_lock is the only protection we have here against concurrency, since the slub_get_cpu_ptr() in __slab_alloc() only disables migration? So again a concurrent fastpath free should be able to set c->freelist to non-NULL after c->slab has been set to NULL. So I think this TID bump is also necessary for correctness in the current version. And looking back at older kernels, back to at least 4.9, the ALLOC_NODE_MISMATCH case looks similarly broken - except that again, as you pointed out, we don't have the fine-grained locking, so it only becomes racy if we hit new_slab_objects() -> new_slab() -> allocate_slab() and then either we do local_irq_enable() or the allocation fails. > Thanks. Applying to slab/for-5.19-rc3/fixes branch. Thanks!