From: Baoquan He <bhe@redhat.com>
To: Lorenzo Stoakes <lstoakes@gmail.com>,
David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
Liu Shixin <liushixin2@huawei.com>, Jiri Olsa <jolsa@kernel.org>,
Jens Axboe <axboe@kernel.dk>,
Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH v7 4/4] mm: vmalloc: convert vread() to vread_iter()
Date: Thu, 23 Mar 2023 10:52:09 +0800 [thread overview]
Message-ID: <ZBu+2cPCQvvFF/FY@MiWiFi-R3L-srv> (raw)
In-Reply-To: <941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com>
On 03/22/23 at 06:57pm, Lorenzo Stoakes wrote:
> Having previously laid the foundation for converting vread() to an iterator
> function, pull the trigger and do so.
>
> This patch attempts to provide minimal refactoring and to reflect the
> existing logic as best we can, for example we continue to zero portions of
> memory not read, as before.
>
> Overall, there should be no functional difference other than a performance
> improvement in /proc/kcore access to vmalloc regions.
>
> Now we have eliminated the need for a bounce buffer in read_kcore_iter(),
> we dispense with it, and try to write to user memory optimistically but
> with faults disabled via copy_page_to_iter_nofault(). We already have
> preemption disabled by holding a spin lock. We continue faulting in until
> the operation is complete.
I don't understand the sentences here. In vread_iter(), the actual
content reading is done in aligned_vread_iter(), otherwise we zero
filling the region. In aligned_vread_iter(), we will use
vmalloc_to_page() to get the mapped page and read out, otherwise zero
fill. While in this patch, fault_in_iov_iter_writeable() fault in memory
of iter one time and will bail out if failed. I am wondering why we
continue faulting in until the operation is complete, and how that is done.
If we look into the failing point in vread_iter(), it's mainly coming
from copy_page_to_iter_nofault(), e.g page_copy_sane() checking failed,
i->data_source checking failed. If these conditional checking failed,
should we continue reading again and again? And this is not related to
memory faulting in. I saw your discussion with David, but I am still a
little lost. Hope I can learn it, thanks in advance.
......
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 08b795fd80b4..25b44b303b35 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
......
> @@ -507,13 +503,30 @@ read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
>
> switch (m->type) {
> case KCORE_VMALLOC:
> - vread(buf, (char *)start, tsz);
> - /* we have to zero-fill user buffer even if no read */
> - if (copy_to_iter(buf, tsz, iter) != tsz) {
> - ret = -EFAULT;
> - goto out;
> + {
> + const char *src = (char *)start;
> + size_t read = 0, left = tsz;
> +
> + /*
> + * vmalloc uses spinlocks, so we optimistically try to
> + * read memory. If this fails, fault pages in and try
> + * again until we are done.
> + */
> + while (true) {
> + read += vread_iter(iter, src, left);
> + if (read == tsz)
> + break;
> +
> + src += read;
> + left -= read;
> +
> + if (fault_in_iov_iter_writeable(iter, left)) {
> + ret = -EFAULT;
> + goto out;
> + }
> }
> break;
> + }
> case KCORE_USER:
> /* User page is handled prior to normal kernel page: */
> if (copy_to_iter((char *)start, tsz, iter) != tsz) {
next prev parent reply other threads:[~2023-03-23 2:53 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-22 18:57 [PATCH v7 0/4] convert read_kcore(), vread() to use iterators Lorenzo Stoakes
2023-03-22 18:57 ` [PATCH v7 1/4] fs/proc/kcore: avoid bounce buffer for ktext data Lorenzo Stoakes
2023-03-22 18:57 ` [PATCH v7 2/4] fs/proc/kcore: convert read_kcore() to read_kcore_iter() Lorenzo Stoakes
2023-03-22 18:57 ` [PATCH v7 3/4] iov_iter: add copy_page_to_iter_nofault() Lorenzo Stoakes
2023-03-22 18:57 ` [PATCH v7 4/4] mm: vmalloc: convert vread() to vread_iter() Lorenzo Stoakes
2023-03-23 2:52 ` Baoquan He [this message]
2023-03-23 6:44 ` Lorenzo Stoakes
2023-03-23 10:36 ` Baoquan He
2023-03-23 10:38 ` David Hildenbrand
2023-03-23 13:31 ` Baoquan He
2023-03-26 13:26 ` David Laight
2023-03-26 14:20 ` Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZBu+2cPCQvvFF/FY@MiWiFi-R3L-srv \
--to=bhe@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=david@redhat.com \
--cc=jolsa@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=liushixin2@huawei.com \
--cc=lstoakes@gmail.com \
--cc=urezki@gmail.com \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.