From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Leonardo Bras" <leobras@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Peng Tao" <tao.peng@linux.alibaba.com>
Subject: Re: [PATCH v1 1/4] softmmu/physmem: Warn with ram_block_discard_range() on MAP_PRIVATE file mapping
Date: Wed, 21 Jun 2023 18:17:37 +0200
Message-ID: <9f7afce0-ff7f-33f8-4f39-bba77f2b2ba4@redhat.com>
In-Reply-To: <ZJMdZRoeu9BVm0z8@x1n>

On 21.06.23 17:55, Peter Xu wrote:
> On Tue, Jun 20, 2023 at 03:03:51PM +0200, David Hildenbrand wrote:
>> ram_block_discard_range() cannot possibly do the right thing in
>> MAP_PRIVATE file mappings in the general case.
>>
>> To achieve the documented semantics, we also have to punch a hole into
>> the file, possibly messing with other MAP_PRIVATE/MAP_SHARED mappings
>> of such a file.
>>
>> For example, using VM templating -- see commit b17fbbe55cba ("migration:
>> allow private destination ram with x-ignore-shared") -- in combination with
>> any mechanism that relies on discarding of RAM is problematic. This
>> includes:
>> * Postcopy live migration
>> * virtio-balloon inflation/deflation or free-page-reporting
>> * virtio-mem
>>
>> So at least warn that something possibly dangerous is going on
>> when using ram_block_discard_range() in these cases.
> 
> The issue is probably valid.
> 
> One thing I worry is when the user (or, qemu instance) exclusively owns the
> file, just forgot to attach share=on, where it used to work perfectly then
> it'll show this warning.  But I agree maybe it's good to remind them just
> to attach the share=on.

For memory-backend-memfd "share=on" is fortunately the default. For 
memory-backend-file it isn't (and in most cases you do want share=on, 
like for hugetlbfs or tmpfs).

Missing the "share=on" for memory-backend-file can have sane use cases, 
but for the common /dev/shm/ case it even results in an undesired 
double-memory consumption (just like memory-backend-memfd,share=off).
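
To illustrate the share=on/share=off difference, here is a minimal C
sketch (not QEMU code). It assumes that share=on corresponds to a
MAP_SHARED mapping and share=off to a MAP_PRIVATE one, and it uses a
memfd as a stand-in for a /dev/shm file. The helper name
file_byte_after_store() is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: store 0x55 through a
 * mapping created with map_flags (MAP_SHARED or MAP_PRIVATE) and return
 * the byte the backing file holds afterwards, or -1 on error.
 */
int file_byte_after_store(int map_flags)
{
    const size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p;
    unsigned char in_file = 0;
    int result = -1;
    int fd;

    /* Anonymous shmem file, all zeroes after ftruncate(). */
    fd = memfd_create("share-demo", 0);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t)len) != 0) {
        goto out_close;
    }

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, map_flags, fd, 0);
    if (p == MAP_FAILED) {
        goto out_close;
    }

    /* With MAP_PRIVATE this store lands in a private copy of the page. */
    p[0] = 0x55;

    /* Read the file itself, bypassing the mapping. */
    if (pread(fd, &in_file, 1, 0) == 1) {
        result = in_file;
    }

    munmap(p, len);
out_close:
    close(fd);
    return result;
}
```

With MAP_SHARED the file sees the store (0x55). With MAP_PRIVATE the
file still reads 0 because the store went to a separate private page,
which is where the additional memory consumption on shmem comes from.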


> 
> For real private mem users, the warning can be of real help, one should
> probably leverage things like file snapshot provided by modern file
> systems, so each VM should just have its own snapshot ram file to use then
> map it share=on I suppose.

Yes, I agree. Although we recently learned that fs-backed VM RAM (SSD) 
performs poorly and will severely wear your SSD :(

> 
> For the long term, maybe we should simply support private mem here simply
> by a MADV_DONTNEED.  I assume that's the right semantics for postcopy (just
> need to support MINOR faults, though; MISSING faults definitely will stop
> working.. but for all the rest framework shouldn't need much change), and I
> hope that's also the semantics that balloon/virtio-mem wants here.  Not
> sure whether/when that's strongly needed, assuming the corner case above
> can still be worked around properly by other means.

I briefly thought about that but came to the conclusion that fixing it 
is not that easy. So I went with the warning.

As documented, ram_block_discard_range() guarantees two things:

a) Read 0 after discarding succeeded
b) Make postcopy work by triggering a fault on next access

If we simply dropped the FALLOC_FL_PUNCH_HOLE, we'd run into these problems:

1) For hugetlb, only newer kernels support MADV_DONTNEED. So there is no 
way to just discard in a private mapping here that works for kernels we 
still care about.

2) free-page-reporting wants to read 0's when re-accessing discarded 
memory. If there is still something there in the file, that won't work.

3) Regarding postcopy on MAP_PRIVATE shmem, I am not sure if it will 
actually do what you want if the pagecache holds a page. Maybe it works, 
but I am not so sure. Needs investigation.
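
Point 2) can be reproduced outside of QEMU. The following is a minimal
C sketch, not the ram_block_discard_range() implementation: it assumes
a memfd as the backing file and a single base page, and the helper name
byte_after_discard() is hypothetical. It compares MADV_DONTNEED alone
against FALLOC_FL_PUNCH_HOLE followed by MADV_DONTNEED on a MAP_PRIVATE
mapping.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: the file holds 0xAA, the
 * private mapping is dirtied with 0x55, then the page is discarded.
 * Returns the byte read back through the mapping, or -1 on error.
 */
int byte_after_discard(int punch_hole)
{
    const size_t len = (size_t)sysconf(_SC_PAGESIZE);
    const unsigned char pattern = 0xAA;
    unsigned char *p;
    int result = -1;
    int fd;

    fd = memfd_create("discard-demo", 0);
    if (fd < 0) {
        return -1;
    }
    /* Put real data into the file so it is not just a hole. */
    if (ftruncate(fd, (off_t)len) != 0 ||
        pwrite(fd, &pattern, 1, 0) != 1) {
        goto out_close;
    }

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        goto out_close;
    }

    /* Dirty the private copy of the page. */
    p[0] = 0x55;

    /* Optionally drop the file content as well. */
    if (punch_hole &&
        fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  0, (off_t)len) != 0) {
        goto out_unmap;
    }

    /* Drop the private copy; the next access faults again. */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        goto out_unmap;
    }

    result = p[0];
out_unmap:
    munmap(p, len);
out_close:
    close(fd);
    return result;
}
```

Without the hole punch, the re-access reads the file content (0xAA)
instead of 0. With the hole punch it reads 0, but the file is modified
for every other mapping of it as well.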


> 
> For now, a warning looks all sane.
> 
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Acked-by: Peter Xu <peterx@redhat.com>

Thanks!

-- 
Cheers,

David / dhildenb
