From: Hannes Reinecke
To: David Hildenbrand
Cc: Oscar Salvador, linux-mm@kvack.org, Hannes Reinecke
Subject: [PATCHv3 0/3] mm/memory_hotplug: fixup crash during uevent handling
Date: Fri, 4 Jul 2025 08:34:01 +0200
Message-ID: <20250704063404.27495-1-hare@kernel.org>

Hi all,

we have some udev rules that try to read the sysfs attribute 'valid_zones'
during a memory 'add' event, causing a crash in zone_for_pfn_range().
Debugging showed that mem->nid was still set to NUMA_NO_NODE, so the
crash happened in NODE_DATA(nid).
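To make the failure mode concrete, here is a small user-space model (not
kernel code) of what a sysfs reader runs into when mem->nid is still
NUMA_NO_NODE. It assumes that NODE_DATA(nid) boils down to an array lookup
indexed by nid, which is the common case but depends on the architecture,
and it uses the kernel's value NUMA_NO_NODE == -1. All '*_model' names,
MAX_NODES and node_data_checked() are made up for this sketch; the checked
lookup only illustrates the problem and is not what the patches do.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)  /* same value the kernel uses */
#define MAX_NODES    4     /* hypothetical limit, for this model only */

/* Hypothetical stand-ins for pg_data_t and struct memory_block. */
struct pglist_data_model { int node_id; };
struct memory_block_model { int nid; };

static struct pglist_data_model node0 = { .node_id = 0 };

/* Only node 0 is populated; the remaining slots are NULL. */
static struct pglist_data_model *node_data_model[MAX_NODES] = { &node0 };

/*
 * Checked stand-in for NODE_DATA(): an unchecked array lookup with
 * nid == NUMA_NO_NODE would read node_data_model[-1], i.e. out of bounds.
 */
static struct pglist_data_model *node_data_checked(int nid)
{
	if (nid < 0 || nid >= MAX_NODES)
		return NULL;
	return node_data_model[nid];
}

/*
 * Model of a 'valid_zones'-style reader.
 * Returns 0 if the node can be looked up, -1 if mem->nid is not usable yet.
 */
static int show_valid_zones_model(const struct memory_block_model *mem)
{
	assert(mem != NULL);

	if (!node_data_checked(mem->nid))
		return -1;
	return 0;
}
```

A block whose nid is still NUMA_NO_NODE makes show_valid_zones_model()
return -1 here; in the kernel the equivalent unchecked lookup is where the
crash happens. A block with nid 0 is looked up fine.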

Further analysis revealed that we're running into a race with udev event
processing: add_memory_resource() makes these function calls:

1) __try_online_node()
2) arch_add_memory()
3) create_memory_block_devices()
   -> calls device_register() -> memory 'add' event
4) node_set_online()/__register_one_node()
   -> calls device_register() -> node 'add' event
5) register_memory_blocks_under_node()
   -> sets mem->nid

Which, to the uninitiated, is ... weird ...
Why do we try to online the node in 1), but only register the node in 4),
_after_ we have created the memory blocks in 3)?
And why do we set the 'nid' value in 5), when the uevent (which might need
to see the correct 'nid' value) is sent out in 3)?
There must be a reason, I'm sure ...

So here's a small patchset to fix up the uevent ordering.
The first patch adds a 'nid' parameter to add_memory_block() (to avoid
mem->nid being initialized with NUMA_NO_NODE), and the second patch
reshuffles the code in add_memory_resource() to fully initialize the node
prior to calling create_memory_block_devices(), so that the node is valid
at that time and uevent processing will see correct values in sysfs.

As usual, comments and reviews are welcome.

Changes to the original submission:
- Add patch to rename memory_block_add_nid()
- Add reviews

Changes to v2:
- Move changes to nid setting into the last patch
- Add reviews from David Hildenbrand

Hannes Reinecke (3):
  drivers/base/memory: add node id parameter to add_memory_block()
  mm/memory_hotplug: activate node before adding new memory blocks
  drivers/base: move memory_block_add_nid() into the caller

 drivers/base/memory.c  | 53 ++++++++++++++++++------------------------
 drivers/base/node.c    | 10 ++++----
 include/linux/memory.h |  5 ++--
 mm/memory_hotplug.c    | 32 +++++++++++++------------
 4 files changed, 46 insertions(+), 54 deletions(-)

-- 
2.43.0