From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 1 Aug 2023 23:57:48 +0800
From: Baoquan He
To: Lorenzo Stoakes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
	Uladzislau Rezki, linux-fsdevel@vger.kernel.org, Jiri Olsa,
	Will Deacon, Mike Galbraith, Mark Rutland,
	wangkefeng.wang@huawei.com, catalin.marinas@arm.com,
	ardb@kernel.org, David Hildenbrand, Linux regression tracking,
	regressions@lists.linux.dev, Matthew Wilcox, Liu Shixin,
	Jens Axboe, Alexander Viro, stable@vger.kernel.org
Subject: Re: [PATCH] fs/proc/kcore: reinstate bounce buffer for KCORE_TEXT regions
References: <20230731215021.70911-1-lstoakes@gmail.com>
In-Reply-To: <20230731215021.70911-1-lstoakes@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org

On 07/31/23 at 10:50pm, Lorenzo Stoakes wrote:
> Some architectures do not populate the entire range categorised by
> KCORE_TEXT, so we must ensure that the kernel address we read from is
> valid.
>
> Unfortunately there is no solution currently available to do so with a
> purely iterator solution so reinstate the bounce buffer in this instance so
> we can use copy_from_kernel_nofault() in order to avoid page faults when
> regions are unmapped.
>
> This change partly reverts commit 2e1c0170771e ("fs/proc/kcore: avoid
> bounce buffer for ktext data"), reinstating the bounce buffer, but adapts
> the code to continue to use an iterator.
>
> Fixes: 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data")
> Reported-by: Jiri Olsa
> Closes: https://lore.kernel.org/all/ZHc2fm+9daF6cgCE@krava
> Cc: stable@vger.kernel.org
> Signed-off-by: Lorenzo Stoakes
> ---
>  fs/proc/kcore.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 9cb32e1a78a0..3bc689038232 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -309,6 +309,8 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
>
>  static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
>  {
> +	struct file *file = iocb->ki_filp;
> +	char *buf = file->private_data;
>  	loff_t *fpos = &iocb->ki_pos;
>  	size_t phdrs_offset, notes_offset, data_offset;
>  	size_t page_offline_frozen = 1;
> @@ -554,11 +556,22 @@ static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
>  			fallthrough;
>  		case KCORE_VMEMMAP:
>  		case KCORE_TEXT:
> +			/*
> +			 * Sadly we must use a bounce buffer here to be able to
> +			 * make use of copy_from_kernel_nofault(), as these
> +			 * memory regions might not always be mapped on all
> +			 * architectures.
> +			 */
> +			if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
> +				if (iov_iter_zero(tsz, iter) != tsz) {
> +					ret = -EFAULT;
> +					goto out;
> +				}
>  			/*
>  			 * We use _copy_to_iter() to bypass usermode hardening
>  			 * which would otherwise prevent this operation.
>  			 */
> -			if (_copy_to_iter((char *)start, tsz, iter) != tsz) {
> +			} else if (_copy_to_iter(buf, tsz, iter) != tsz) {
>  				ret = -EFAULT;
>  				goto out;
>  			}
> @@ -595,6 +608,10 @@ static int open_kcore(struct inode *inode, struct file *filp)
>  	if (ret)
>  		return ret;
>
> +	filp->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
> +	if (!filp->private_data)
> +		return -ENOMEM;
> +
>  	if (kcore_need_update)
>  		kcore_update_ram();
>  	if (i_size_read(inode) != proc_root_kcore->size) {
> @@ -605,9 +622,16 @@ static int open_kcore(struct inode *inode, struct file *filp)
>  	return 0;
>  }
>
> +static int release_kcore(struct inode *inode, struct file *file)
> +{
> +	kfree(file->private_data);
> +	return 0;
> +}
> +
>  static const struct proc_ops kcore_proc_ops = {
>  	.proc_read_iter	= read_kcore_iter,
>  	.proc_open	= open_kcore,
> +	.proc_release	= release_kcore,
>  	.proc_lseek	= default_llseek,
>  };

On 6.5-rc4, the failures reproduce reliably on an arm64 machine. With the
patch applied, both the makedumpfile and objdump test cases pass. The code
change also looks good to me, thanks.

Tested-by: Baoquan He
Reviewed-by: Baoquan He

===============================================
[root@ ~]# makedumpfile --mem-usage /proc/kcore
The kernel version is not supported.
The makedumpfile operation may be incomplete.

TYPE            PAGES           EXCLUDABLE      DESCRIPTION
----------------------------------------------------------------------
ZERO            76234           yes             Pages filled with zero
NON_PRI_CACHE   147613          yes             Cache pages without private flag
PRI_CACHE       3847            yes             Cache pages with private flag
USER            15276           yes             User process pages
FREE            15809884        yes             Free pages
KERN_DATA       459950          no              Dumpable kernel data

page size:              4096
Total pages on system:  16512804
Total size on system:   67636445184     Byte

[root@ ~]# objdump -d --start-address=0x^C
[root@ ~]# cat /proc/kallsyms | grep ksys_read
ffffab3be77229d8 T ksys_readahead
ffffab3be782a700 T ksys_read
[root@ ~]# objdump -d --start-address=0xffffab3be782a700 --stop-address=0xffffab3be782a710 /proc/kcore

/proc/kcore:     file format elf64-littleaarch64

Disassembly of section load1:

ffffab3be782a700 :
ffffab3be782a700:	aa1e03e9	mov	x9, x30
ffffab3be782a704:	d503201f	nop
ffffab3be782a708:	d503233f	paciasp
ffffab3be782a70c:	a9bc7bfd	stp	x29, x30, [sp, #-64]!
objdump: error: /proc/kcore(load2) is too large (0x7bff70000000 bytes)
objdump: Reading section load2 failed because: memory exhausted
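
===============================================
For anyone following the thread, here is a rough userspace sketch of the
read path the patch restores: copy into a bounce buffer first, and hand
back zeroes if the source cannot be read. It is an illustration only, not
kernel code. copy_nofault_sim() and read_chunk() are hypothetical names, a
NULL source merely stands in for an unmapped KCORE_TEXT range, and
BOUNCE_SIZE stands in for PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 4096	/* assumption: stand-in for PAGE_SIZE */

/*
 * Hypothetical stand-in for copy_from_kernel_nofault(). A NULL source
 * models an unmapped region. Returns 0 on success, -1 on failure.
 */
int copy_nofault_sim(void *dst, const void *src, size_t len)
{
	if (src == NULL)
		return -1;
	memcpy(dst, src, len);
	return 0;
}

/*
 * Read one chunk via the bounce buffer. If the source is unreadable the
 * output is zero-filled (the role iov_iter_zero() plays in the patch);
 * otherwise the bounce buffer is copied out (the role of _copy_to_iter()).
 * Returns 1 when real data was copied, 0 when the chunk was zero-filled.
 */
int read_chunk(char *out, char *bounce, const void *src, size_t len)
{
	assert(len <= BOUNCE_SIZE);

	if (copy_nofault_sim(bounce, src, len)) {
		memset(out, 0, len);
		return 0;
	}
	memcpy(out, bounce, len);
	return 1;
}
```

The point of the extra copy is that the fallible step only ever touches
the bounce buffer, so a failed read never leaves partial data in the
caller's output.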