From: Mike Rapoport <rppt@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
Alexey Dobriyan <adobriyan@gmail.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@suse.com>, Roman Gushchin <guro@fb.com>,
Alex Shi <alex.shi@linux.alibaba.com>,
Steven Price <steven.price@arm.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Aili Yao <yaoaili@kingsoft.com>, Jiri Bohac <jbohac@suse.cz>,
"K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Stephen Hemminger <sthemmin@microsoft.com>,
Wei Liu <wei.liu@kernel.org>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
linux-hyperv@vger.kernel.org,
virtualization@lists.linux-foundation.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 7/7] fs/proc/kcore: use page_offline_(freeze|unfreeze)
Date: Mon, 3 May 2021 12:28:58 +0300
Message-ID: <YI/CWg6PrMxcCT2D@kernel.org>
In-Reply-To: <5a5a7552-4f0a-75bc-582f-73d24afcf57b@redhat.com>

On Mon, May 03, 2021 at 10:28:36AM +0200, David Hildenbrand wrote:
> On 02.05.21 08:34, Mike Rapoport wrote:
> > On Thu, Apr 29, 2021 at 02:25:19PM +0200, David Hildenbrand wrote:
> > > Let's properly synchronize with drivers that set PageOffline(). Unfreeze
> > > every now and then, so drivers that want to set PageOffline() can make
> > > progress.
> > >
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > > fs/proc/kcore.c | 15 +++++++++++++++
> > > 1 file changed, 15 insertions(+)
> > >
> > > diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> > > index 92ff1e4436cb..3d7531f47389 100644
> > > --- a/fs/proc/kcore.c
> > > +++ b/fs/proc/kcore.c
> > > @@ -311,6 +311,7 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
> > >  static ssize_t
> > >  read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> > >  {
> > > +	size_t page_offline_frozen = 0;
> > >  	char *buf = file->private_data;
> > >  	size_t phdrs_offset, notes_offset, data_offset;
> > >  	size_t phdrs_len, notes_len;
> > > @@ -509,6 +510,18 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> > >  			pfn = __pa(start) >> PAGE_SHIFT;
> > >  			page = pfn_to_online_page(pfn);
> >
> > Can't this race with page offlining for the first time we get here?
>
>
> To clarify, we have three types of offline pages in the kernel ...
>
> a) Pages part of an offline memory section; the memmap is stale and not
> trustworthy. pfn_to_online_page() checks that. We *can* protect against
> memory offlining using get_online_mems()/put_online_mems(), but usually
> avoid doing so as the race window is very small (and a problem all over the
> kernel we basically never hit) and locking is rather expensive. In the
> future, we might switch to RCU to handle that more efficiently and avoid
> these possible races.
>
> b) PageOffline(): logically offline pages contained in an online memory
> section with a sane memmap. virtio-mem calls these pages "fake offline";
> something like a "temporary" memory hole. The new mechanism I propose will
> be used to handle synchronization as races can be more severe, e.g., when
> reading actual page content here.
>
> c) Soft offline pages: hwpoisoned pages that are not actually harmful yet,
> but could become harmful in the future. So we better try to remove the page
> from the page allocator and try to migrate away existing users.
>
>
> So page_offline_* handle "b) PageOffline()" only. There is a tiny race
> between pfn_to_online_page(pfn) and looking at the memmap as we have in many
> cases already throughout the kernel, to be tackled in the future.

Right, but you add locking here anyway, so why exclude the first iteration?

> (A better name for PageOffline() might make sense; PageSoftOffline() would
> be catchy but interferes with c). PageLogicallyOffline() is ugly;
> PageFakeOffline() might do)
>
> > > +			/*
> > > +			 * Don't race against drivers that set PageOffline()
> > > +			 * and expect no further page access.
> > > +			 */
> > > +			if (page_offline_frozen == MAX_ORDER_NR_PAGES) {
> > > +				page_offline_unfreeze();
> > > +				page_offline_frozen = 0;
> > > +				cond_resched();
> > > +			}
> > > +			if (!page_offline_frozen++)
> > > +				page_offline_freeze();
> > > +

BTW, did you consider something like

	if (page_offline_frozen++ % MAX_ORDER_NR_PAGES == 0) {
		page_offline_unfreeze();
		cond_resched();
		page_offline_freeze();
	}

We don't seem to care about page_offline_frozen overflows here, do we?
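
To compare the two counter schemes, here is a minimal userspace sketch. It
assumes the counter starts at 0, as in the quoted hunk. Everything named
freeze_sim, sim_freeze(), sim_unfreeze(), run_patch_pattern() and
run_modulo_pattern() is a hypothetical stand-in; it only counts calls and
does not model the locking behind page_offline_freeze() or cond_resched().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping only; not the kernel implementation. */
struct freeze_sim {
	int depth;          /* freeze calls minus matched unfreeze calls */
	int unbalanced;     /* unfreeze calls issued while depth == 0 */
	size_t unprotected; /* pages visited while not frozen */
};

static void sim_freeze(struct freeze_sim *s)
{
	s->depth++;
}

static void sim_unfreeze(struct freeze_sim *s)
{
	if (s->depth == 0)
		s->unbalanced++;
	else
		s->depth--;
}

/* Pattern from the quoted patch: reset the counter every `batch` pages. */
static struct freeze_sim run_patch_pattern(size_t pages, size_t batch)
{
	struct freeze_sim s = {0};
	size_t frozen = 0;

	assert(batch > 0);
	for (size_t i = 0; i < pages; i++) {
		if (frozen == batch) {
			sim_unfreeze(&s);
			frozen = 0;
		}
		if (!frozen++)
			sim_freeze(&s);
		if (s.depth == 0)
			s.unprotected++;
	}
	return s;
}

/* Modulo pattern from the suggestion, counter starting at 0. */
static struct freeze_sim run_modulo_pattern(size_t pages, size_t batch)
{
	struct freeze_sim s = {0};
	size_t frozen = 0;

	assert(batch > 0);
	for (size_t i = 0; i < pages; i++) {
		if (frozen++ % batch == 0) {
			sim_unfreeze(&s);
			sim_freeze(&s);
		}
		if (s.depth == 0)
			s.unprotected++;
	}
	return s;
}
```

With 10 pages and a batch of 4, both variants visit every page while frozen
and end the loop still frozen once, but run_modulo_pattern(10, 4) records one
unfreeze with no matching freeze, on the very first pass. On the overflow
question: the counter is a size_t, so wraparound is well defined in C, and the
batch period stays uniform across the wrap only if the divisor is a power of
two.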
--
Sincerely yours,
Mike.