From: Marcel Apfelbaum
Date: Sun, 7 Mar 2021 16:11:51 +0200
Subject: Re: [PATCH v2 8/9] util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE
To: David Hildenbrand
Reply-To: marcel@redhat.com
Cc: Juan Quintela, Thomas Huth, Cornelia Huck, Eduardo Habkost,
 "Michael S. Tsirkin", Stefan Weil, Murilo Opsfelder Araujo,
 Richard Henderson, "Dr. David Alan Gilbert", Peter Xu,
 qemu-devel@nongnu.org, Halil Pasic, Christian Borntraeger, Greg Kurz,
 Stefan Hajnoczi, Igor Mammedov, Marcel Apfelbaum, Paolo Bonzini,
 Philippe Mathieu-Daudé, Igor Kotrasinski
References: <20210305101634.10745-1-david@redhat.com>
 <20210305101634.10745-9-david@redhat.com>
 <20210305154206.GH397383@xz-x1>
 <20210305155141.GI397383@xz-x1>
 <26dc6c36-5137-5d5e-36f0-2650e42e40ad@redhat.com>
In-Reply-To: <26dc6c36-5137-5d5e-36f0-2650e42e40ad@redhat.com>

Hi David,

On Sun, Mar 7, 2021 at 3:18 PM David Hildenbrand wrote:

> On 05.03.21 16:51, Peter Xu wrote:
> > On Fri, Mar 05, 2021 at 04:44:36PM +0100, David Hildenbrand wrote:
> >> On 05.03.21 16:42, Peter Xu wrote:
> >>> On Fri, Mar 05, 2021 at 11:16:33AM +0100, David Hildenbrand wrote:
> >>>> +#define OVERCOMMIT_MEMORY_PATH "/proc/sys/vm/overcommit_memory"
> >>>> +static bool map_noreserve_effective(int fd, bool readonly, bool shared)
> >>>> +{
> >>>
> >>> [...]
> >>>
> >>>> @@ -184,8 +251,7 @@ void *qemu_ram_mmap(int fd,
> >>>>      size_t offset, total;
> >>>>      void *ptr, *guardptr;
> >>>> -    if (noreserve) {
> >>>> -        error_report("Skipping reservation of swap space is not supported");
> >>>> +    if (noreserve && !map_noreserve_effective(fd, shared, readonly)) {
> >>>
> >>> Need to switch "shared" & "readonly"?
> >>
> >> Indeed, interestingly it has the same effect (as we don't have anonymous
> >> read-only memory in QEMU :) )
> >
> > But note there is still a "g_assert(!shared || fd >= 0);" inside.. :)
>
> Aaaaaand, I just figured that we actually can create shared anonymous
> memory in QEMU, simply via
>
> -object memory-backend-ram,share=on
>
> Introduced in 06329ccecfa0 ("mem: add share parameter to
> memory-backend-ram"). That's also where we introduced the "shared" flag
> for qemu_anon_ram_alloc().
>
> That commit mentions a use case for "RDMA devices in order to remap
> non-contiguous QEMU virtual addresses to a contiguous virtual address
> range.". I fail to understand why that requires sharing RAM with child
> processes.
>
> Especially:
>
> a) qemu_ram_is_shared() returned false before patch #1. RAM_SHARED is
> never set.
>
> b) qemu_ram_remap() does not work as expected?
>
> c) ram_discard_range() is broken with shared anonymous memory. Instead
> of MADV_DONTNEED we need MADV_REMOVE.
>
> This looks like a partially broken feature and I wonder if there is an
> actual user.
>
> @Marcel, can you clarify if there is an actual use case for shared
> anonymous memory in QEMU? I.e., if the original use case that required
> that change is valid? (and why it wasn't able to just use proper shmem)
>

As you correctly stated, the PVRDMA device requires remapping of
non-contiguous QEMU virtual addresses to a contiguous virtual address range.
In order to do so it calls mremap (... , MREMAP_MAYMOVE | MREMAP_FIXED, ...).
The above call succeeds only if the memory is marked as "shared".

> Shared anonymous memory is weird.
>

In this case it is not about sharing the memory between different processes,
but about being able to remap it.

Thanks,
Marcel

> --
> Thanks,
>
> David / dhildenb
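
The remapping requirement described in the reply can be tried outside QEMU.
What follows is only a minimal sketch, not the PVRDMA code. It assumes that
the elided mremap() arguments use an old_size of 0, which according to
mremap(2) asks for a second mapping of the same pages and is only honoured
for shareable (MAP_SHARED) mappings. The helper name can_alias_page() is
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for mremap() and the MREMAP_* flags */
#endif
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): try to make one anonymous page
 * appear at a second, fixed address via mremap() with old_size == 0.
 * Returns true only if the alias exists and sees data written through
 * the original mapping.
 */
bool can_alias_page(bool shared)
{
    long page = sysconf(_SC_PAGESIZE);
    int visibility = shared ? MAP_SHARED : MAP_PRIVATE;
    char *src, *dst, *alias;
    bool ok = false;

    assert(page > 0);

    /* The page that should become visible at a second address. */
    src = mmap(NULL, page, PROT_READ | PROT_WRITE,
               visibility | MAP_ANONYMOUS, -1, 0);
    if (src == MAP_FAILED) {
        return false;
    }

    /* Reserve a destination address for MREMAP_FIXED to take over. */
    dst = mmap(NULL, page, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED) {
        munmap(src, page);
        return false;
    }

    /* Assumption: old_size == 0 requests a duplicate mapping. */
    alias = mremap(src, 0, page, MREMAP_MAYMOVE | MREMAP_FIXED,
                   (void *)dst);
    if (alias != MAP_FAILED) {
        strcpy(src, "pvrdma");
        ok = strcmp(alias, "pvrdma") == 0;
    }

    /* Unmapping an already unmapped range is harmless. */
    munmap(dst, page);
    munmap(src, page);
    return ok;
}
```

On Linux, can_alias_page(true) is expected to succeed, while
can_alias_page(false) is expected to fail: mremap(2) documents that kernels
since 4.14 reject the private case with EINVAL. That matches the point that
"shared" is needed here for remapping, not for sharing between processes.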
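
Point c) in the quoted message, MADV_DONTNEED versus MADV_REMOVE on shared
anonymous memory, can be checked with a small experiment. This is a sketch
under assumptions, not ram_discard_range(): it assumes Linux, where shared
anonymous mappings are backed by shmem, and the helper name
byte_after_advice() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MADV_REMOVE and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): fill one shared anonymous page
 * with a marker byte, apply the given madvise() advice, and return the
 * first byte read back afterwards, or -1 on error.
 */
int byte_after_advice(int advice)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int result = -1;

    assert(page > 0);

    p = mmap(NULL, page, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    memset(p, 0xAB, page);

    /* Only report a byte if the kernel accepted the advice. */
    if (madvise(p, page, advice) == 0) {
        result = p[0];
    }

    munmap(p, page);
    return result;
}
```

With these assumptions, byte_after_advice(MADV_DONTNEED) should still return
0xAB, because the page stays in the backing shmem object and is simply
mapped again on the next access. byte_after_advice(MADV_REMOVE) should
return 0, because the backing page itself is freed.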