From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <26f7a521-3ae4-44ea-90a3-2ff0e1aa9ae2@redhat.com>
Date: Mon, 11 May 2026 17:36:08 -0400
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] sched/isolation: Don't free memblock allocated cpumasks
To: Mike Rapoport
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, K Prateek Nayak, Frederic Weisbecker,
 linux-kernel@vger.kernel.org
References: <20260505051821.1107133-1-longman@redhat.com>
 <0dc53363-6a5d-4adc-bf8a-fd7bbdd8da81@redhat.com>
Content-Language: en-US
From: Waiman Long
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 5/11/26 4:34 AM, Mike Rapoport wrote:
> On Mon, May 11, 2026 at 12:55:39AM -0400, Waiman Long wrote:
>> On 5/10/26 11:02 AM, Mike Rapoport wrote:
>>> Hi Waiman,
>>>
>>> On Tue, May 05, 2026 at 01:18:21AM -0400, Waiman Long wrote:
>>>> When testing a v7.1 kernel with commit 59bd1d914bb5 ("memblock: warn when
>>>> freeing reserved memory before memory map is initialized"), the following
>>>> warning was hit when there was a "nohz_full" kernel boot parameter.
>>>>
>>>> [ 0.080911] Cannot free reserved memory because of deferred initialization of the memory map
>>>> [ 0.080911] WARNING: mm/memblock.c:904 at __free_reserved_area+0xde/0xf0, CPU#0: swapper/0/0
>>>> :
>>>> [ 0.080945] Call Trace:
>>>> [ 0.080947]
>>>> [ 0.080949] memblock_phys_free+0xcb/0x100
>>>> [ 0.080953] housekeeping_init+0x14c/0x170
>>>> [ 0.080957] start_kernel+0x207/0x450
>>>> [ 0.080961] x86_64_start_reservations+0x24/0x30
>>>> [ 0.080965] x86_64_start_kernel+0xda/0xe0
>>>> [ 0.080967] common_startup_64+0x13e/0x141
>>>> [ 0.080972]
>>>>
>>>> The commit states that freeing of reserved memory before the memory
>>>> map is fully initialized in deferred_init_memmap() would cause access
>>>> to uninitialized struct pages and may crash when accessing spurious
>>>> list pointers. However, if the memblock_free() call is deferred to
>>>> the start of initcall processing in the bootup process, for instance,
>>>> the following KASAN warning can appear.
>>>>
>>>> [ 8.514775] BUG: KASAN: use-after-free in memblock_isolate_range+0x4ac/0x650
>>>> [ 8.514775] Read of size 8 at addr ffff88a07fe6a000 by task swapper/0/1
>>>> :
>>>> [ 8.514775] Call Trace:
>>>> [ 8.514775]
>>>> [ 8.514775] kasan_report+0xb2/0x1b0
>>>> [ 8.514775] memblock_isolate_range+0x4ac/0x650
>>>> [ 8.514775] memblock_phys_free+0xc4/0x190
>>>> [ 8.514775] housekeeping_late_init+0x257/0x280
>>>> [ 8.514775] do_one_initcall+0xaa/0x470
>>>> [ 8.514775] do_initcalls+0x1b4/0x1f0
>>>> [ 8.514775] kernel_init_freeable+0x4b5/0x550
>>>> [ 8.514775] kernel_init+0x1c/0x150
>>>> [ 8.514775] ret_from_fork+0x5dc/0x8e0
>>>> [ 8.514775] ret_from_fork_asm+0x1a/0x30
>>>> [ 8.514775]
>>>>
>>>> It is likely that memblock_discard() may discard memblock data needed
>>>> for memblock_free(). One workaround for now to avoid these warning/bug
>>>> messages is to keep the memblock allocated cpumasks even if they are
>>>> no longer needed until the memblock subsystem is properly updated to
>>>> handle memblock_free().
>>>>
>>>> On most systems, memory occupied by a cpumask is pretty small. So not
>>>> much memory will be wasted if the memblock cpumasks are not freed.
>>>>
>>>> Signed-off-by: Waiman Long
>>>> ---
>>>> kernel/sched/isolation.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>>>> index ef152d401fe2..ad9b1a1104e3 100644
>>>> --- a/kernel/sched/isolation.c
>>>> +++ b/kernel/sched/isolation.c
>>>> @@ -189,7 +189,13 @@ void __init housekeeping_init(void)
>>>> WARN_ON_ONCE(cpumask_empty(omask));
>>>> cpumask_copy(nmask, omask);
>>>> RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
>>>> - memblock_free(omask, cpumask_size());
>>>> +
>>>> + /*
>>>> + * TODO: Don't free memblock allocated cpumasks until the
>>>> + * memblock subsystem is able to handle the memblock_free()
>>>> + * properly.
>>>> + */
>>>> + // memblock_free(omask, cpumask_size());
>>> Before 59bd1d914bb5 it was a silent leak. housekeeping_init() is called
>>> after memblock moves all the memory to buddy, so this would only update
>>> memblock.reserved.
>>>
>>> The comment a few lines above says that we reallocate to be able to kfree()
>>> later. Is it possible to delay reallocation until an initcall?
>> My original thought was to defer the freeing to an initcall. That change led
>> to the KASAN bug splat listed in the commit log. I think the right window to
>> free memblock memory is currently just too narrow. Do you mean that with the
>> fix patch you sent to Breno, memblock freeing in an initcall will work
>> without a bug report?
> Yes, with the fix I sent to Breno memblock_free() should work in an
> initcall and "do the right thing".

Thanks for the confirmation. I have tested your patch together with my patch
that defers the memblock_free() to an initcall. There is no longer any KASAN
splat when booting up a debug test kernel. You can add the following tag when
you send out your patch.

Tested-by: Waiman Long