Subject: Re: [PATCH] mm, slub: place the trace before freeing memory in kmem_cache_free()
From: Yunfeng Ye <yeyunfeng@huawei.com>
To: John Hubbard, Andrew Morton, linux-mm@kvack.org
Cc: Hewenliang
Date: Tue, 2 Nov 2021 17:06:42 +0800
Message-ID: <2ea4e792-816c-a734-db1f-388516c74ea9@huawei.com>
References: <867f6da4-6d38-6435-3fbb-a2a3744029f1@huawei.com>

On 2021/11/2 15:03, John Hubbard wrote:
> On 10/30/21 03:11, Yunfeng Ye wrote:
>> After the memory is freed, it may be allocated by other CPUs and has
>> been recorded by trace. So the timing sequence of the memory tracing is
>> inaccurate.
>>
>> For example, we expect the following timing sequence:
>>
>>      CPU 0                     CPU 1
>>
>>    (1) alloc xxxxxx
>>    (2) free  xxxxxx
>>                              (3) alloc xxxxxx
>>                              (4) free  xxxxxx
>>
>> However, the following timing sequence may occur:
>>
>>      CPU 0                     CPU 1
>>
>>    (1) alloc xxxxxx
>>                              (2) alloc xxxxxx
>>    (3) free  xxxxxx
>>                              (4) free  xxxxxx
>>
>> So place the trace before freeing memory in kmem_cache_free().
>
> Hi Yunfeng,
>
> Like Muchun, I had some difficulty with the above description, but
> now I think I get it. :)
>
> In order to make it easier for others, how about this wording and subject
> line, instead:
>
Ok, I will modify the description in the next version of the patch. Thanks.
>
> mm, slub: emit the "free" trace report before freeing memory in kmem_cache_free()
>
> After the memory is freed, it can be immediately allocated by other
> CPUs, before the "free" trace report has been emitted. This causes
> inaccurate traces.
>
> For example, if the following sequence of events occurs:
>
>     CPU 0                     CPU 1
>
>   (1) alloc xxxxxx
>   (2) free  xxxxxx
>                             (3) alloc xxxxxx
>                             (4) free  xxxxxx
>
> ...then they will be inaccurately reported via tracing, so that they
> appear to have happened in this order. This makes it look like CPU 1
> somehow managed to allocate memory that CPU 0 still had allocated for
> itself:
>
>     CPU 0                     CPU 1
>
>   (1) alloc xxxxxx
>                             (2) alloc xxxxxx
>   (3) free  xxxxxx
>                             (4) free  xxxxxx
>
> In order to avoid this, emit the "free xxxxxx" tracing report just
> before the actual call to free the memory, instead of just after it.
>
>
>>
>> Signed-off-by: Yunfeng Ye
>> ---
>>   mm/slub.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 432145d7b4ec..427e62034c3f 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -3526,8 +3526,8 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
>>       s = cache_from_obj(s, x);
>>       if (!s)
>>           return;
>> -    slab_free(s, virt_to_head_page(x), x, NULL, 1, _RET_IP_);
>>       trace_kmem_cache_free(_RET_IP_, x, s->name);
>> +    slab_free(s, virt_to_head_page(x), x, NULL, 1, _RET_IP_);
>>   }
>>   EXPORT_SYMBOL(kmem_cache_free);
>>
>
> ...the diffs seem correct, too, but I'm not exactly a slub reviewer, so
> take that for what it's worth.
>
> thanks,