qemu-devel.nongnu.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Marcel Apfelbaum" <mapfelba@redhat.com>,
	"Cornelia Huck" <cohuck@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Stefan Weil" <sw@weilnetz.de>,
	"Murilo Opsfelder Araujo" <muriloo@linux.ibm.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	qemu-devel@nongnu.org, "Halil Pasic" <pasic@linux.ibm.com>,
	"Christian Borntraeger" <borntraeger@de.ibm.com>,
	"Greg Kurz" <groug@kaod.org>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Igor Kotrasinski" <i.kotrasinsk@partner.samsung.com>
Subject: Re: [PATCH v3 02/12] softmmu/physmem: Fix ram_block_discard_range() to handle shared anonymous memory
Date: Thu, 11 Mar 2021 18:41:29 +0100
Message-ID: <d0a57921-61ab-82a5-ed58-061961dfa6a3@redhat.com>
In-Reply-To: <20210311172236.GG194839@xz-x1>

On 11.03.21 18:22, Peter Xu wrote:
> On Thu, Mar 11, 2021 at 06:15:15PM +0100, David Hildenbrand wrote:
>> On 11.03.21 18:11, Peter Xu wrote:
>>> On Thu, Mar 11, 2021 at 05:45:46PM +0100, David Hildenbrand wrote:
>>>> On 11.03.21 17:39, Dr. David Alan Gilbert wrote:
>>>>> * David Hildenbrand (david@redhat.com) wrote:
>>>>>> We can create shared anonymous memory via
>>>>>>        "-object memory-backend-ram,share=on,..."
>>>>>> which is, for example, required by PVRDMA for mremap() to work.
>>>>>>
>>>>>> Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we
>>>>>> have to use MADV_REMOVE. MADV_DONTNEED fails silently and does nothing.
>>>>>
>>>>> OK, I wonder how stable these rules are; is it defined anywhere that
>>>>> it's required?
>>>>>
>>>>
>>>> I had a look at the Linux implementation: it's essentially shmem ... but we
>>>> don't have an fd exposed, so we cannot use fallocate() ... :)
>>>>
>>>> MADV_REMOVE documents (man):
>>>>
>>>> "In the initial implementation, only tmpfs(5) was supported MADV_REMOVE; but
>>>> since Linux 3.5, any filesystem which supports the fallocate(2)
>>>> FALLOC_FL_PUNCH_HOLE mode also supports MADV_REMOVE."
>>>
>>> Hmm, I see that MADV_DONTNEED will still tear down all mappings even for
>>> anonymous shmem.. what did I miss?
>>
>> Where did you see that?
> 
> I see madvise_dontneed_free() calls zap_page_range().
> 
>>
>>>
>>
>> MADV_DONTNEED only invalidates private copies in the pagecache. It's
>> essentially useless for any kind of shared mappings.

Let me rephrase because it was wrong: MADV_DONTNEED invalidates private 
COW pages referenced in the page tables :)

> 
> Since it's about zapping page tables, then I don't understand why it won't work
> for shmem..

It zaps the page table entries, but the shmem pages are still referenced
(in the pagecache AFAIU). On the next user space access, the page tables
get refilled with entries pointing at the same pages, so you see the
previous content again.

That's why MADV_DONTNEED works properly on private anonymous memory, but 
not on shared anonymous memory - the only valid references are in the 
page tables in case of private mappings (well, unless we have other 
references like GUP etc.).
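
To make the difference concrete, here is a minimal sketch (not code from
the patch; the helper name byte_after_madvise() is made up for
illustration, and it assumes Linux with anonymous mappings). It fills a
mapping with 0xAB, applies one madvise() advice and reads the first byte
back:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Map `len` bytes of anonymous memory with the given visibility
 * (MAP_SHARED or MAP_PRIVATE), fill them with 0xAB, apply `advice`
 * via madvise() and return the first byte as read back afterwards.
 * Returns -1 if mmap() or madvise() fails.
 */
static int byte_after_madvise(int visibility, int advice, size_t len)
{
    volatile unsigned char *p;
    int result = -1;
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     visibility | MAP_ANONYMOUS, -1, 0);

    if (mem == MAP_FAILED) {
        return -1;
    }

    /* Populate the pages so there is content that could be discarded. */
    memset(mem, 0xAB, len);

    if (madvise(mem, len, advice) == 0) {
        /* This access faults the page back in (or maps a fresh one). */
        p = mem;
        result = p[0];
    }

    munmap(mem, len);
    return result;
}
```

With MAP_PRIVATE and MADV_DONTNEED I would expect 0 (the page is gone and
a fresh zero page gets mapped); with MAP_SHARED and MADV_DONTNEED I would
expect 0xAB (the shmem page survived); with MAP_SHARED and MADV_REMOVE I
would expect 0 again, provided the kernel accepts MADV_REMOVE for the
mapping.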


I did wonder, however, whether there is a benefit in doing both:

MADV_REMOVE followed by MADV_DONTNEED, or the other way around. That is,
would the extra MADV_DONTNEED also free the page tables themselves and
not just invalidate/zap the entries? It wouldn't make a difference
functionality-wise, but it might memory-consumption-wise.

I'll still have to have a look.

-- 
Thanks,

David / dhildenb



