Date: Sat, 14 Oct 2023 15:25:32 -0700
From: Andrew Morton
To: Charan Teja Kalla
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
Message-Id: <20231014152532.5f3dca7838c2567a1a9ca9c6@linux-foundation.org>
In-Reply-To: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com>
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com>
On Fri, 13 Oct 2023 18:34:27 +0530 Charan Teja Kalla wrote:

> The below race is observed on a PFN which falls into the device memory
> region with the system memory configuration where PFN's are such that
> [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL].
> Since the normal zone start and end PFNs span the device memory PFNs
> as well, a triggered compaction will also try the device memory PFNs,
> though they end up as a NOP (because pfn_to_online_page() returns NULL
> for ZONE_DEVICE memory sections). When, from another core, the section
> mappings are being removed for the ZONE_DEVICE region that the PFN in
> question belongs to, and compaction is currently operating on that PFN,
> the result is a kernel crash with CONFIG_SPARSEMEM_VMEMAP enabled.

Seems this bug is four years old, yes?  It must be quite hard to hit.

When people review this, please offer opinions on whether a fix should
be backported into -stable kernels, thanks.

> compact_zone()                        memunmap_page
> -------------                         ---------------
> __pageblock_pfn_to_page
>    ......
>  (a)pfn_valid():
>      valid_section()//return true
>                                       (b)__remove_pages()->
>                                           sparse_remove_section()->
>                                             section_deactivate():
>                                             [Free the array ms->usage and set
>                                              ms->usage = NULL]
>      pfn_section_valid()
>      [Access ms->usage which
>      is NULL]
>
> NOTE: From the above it can be said that the race is reduced to one
> between pfn_valid()/pfn_section_valid() and the section deactivate with
> SPARSEMEM_VMEMAP enabled.
>
> The commit b943f045a9af ("mm/sparse: fix kernel crash with
> pfn_section_valid check") tried to address the same problem by clearing
> SECTION_HAS_MEM_MAP, with the expectation that valid_section() returns
> false and thus ms->usage is not accessed.
>
> Fix this issue by the below steps:
> a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage.
> b) The RCU protected read side critical section will either return NULL
>    when SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage.
> c) Synchronize the RCU on the write side and free the ->usage. No
>    attempt will be made to access ->usage after this, as
>    SECTION_HAS_MEM_MAP is cleared and thus valid_section() returns false.
>
> Since section_deactivate() is a rare operation and comes in the
> hot remove path, the impact of synchronize_rcu() should be negligible.
>
> Fixes: f46edbd1b151 ("mm/sparsemem: add helpers track active portions of a section at boot")
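
For reviewers who want the ordering in steps a) to c) spelled out, here is
a minimal userspace sketch of it. This is not the patch and not kernel
code: every model_* name and the two *_model structs are hypothetical, the
flag value is arbitrary, and the three lock/synchronize helpers are
single-threaded stand-ins for rcu_read_lock(), rcu_read_unlock() and
synchronize_rcu() that only count readers instead of waiting for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; the bit value is arbitrary for this sketch. */
#define SECTION_HAS_MEM_MAP (1UL << 1)

struct mem_section_usage_model {
	unsigned long subsection_map;	/* one bit per subsection */
};

struct mem_section_model {
	unsigned long section_mem_map;		/* flag bits live here */
	struct mem_section_usage_model *usage;	/* freed on deactivate */
};

/*
 * Stand-ins for the RCU primitives. The real synchronize_rcu() blocks
 * until all pre-existing readers finish; this model is single-threaded,
 * so it can only assert that no reader is in flight.
 */
static int readers_in_flight;
static void model_read_lock(void)   { readers_in_flight++; }
static void model_read_unlock(void) { readers_in_flight--; }
static void model_synchronize(void) { assert(readers_in_flight == 0); }

static bool model_valid_section(const struct mem_section_model *ms)
{
	return ms && (ms->section_mem_map & SECTION_HAS_MEM_MAP);
}

/* Step b): the flag check and the ->usage access share one read section. */
static bool model_pfn_valid(const struct mem_section_model *ms,
			    unsigned int subsection)
{
	bool ret = false;

	model_read_lock();
	if (model_valid_section(ms))
		ret = (ms->usage->subsection_map >> subsection) & 1UL;
	model_read_unlock();
	return ret;
}

static void model_section_deactivate(struct mem_section_model *ms)
{
	struct mem_section_usage_model *old = ms->usage;

	/* Step a): new readers now fail model_valid_section(). */
	ms->section_mem_map &= ~SECTION_HAS_MEM_MAP;
	/* Step c): wait out readers that saw the flag, then free. */
	model_synchronize();
	ms->usage = NULL;
	free(old);
}
```

In this model, a lookup on an active section returns the subsection bit,
and a lookup after model_section_deactivate() returns false without ever
dereferencing ->usage, which is the property the patch description relies
on.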