From: Lorenzo Pieralisi <lpieralisi@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: ankita@nvidia.com, maz@kernel.org, oliver.upton@linux.dev,
catalin.marinas@arm.com, will@kernel.org, aniketa@nvidia.com,
cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
jhubbard@nvidia.com, danw@nvidia.com,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
Date: Wed, 13 Sep 2023 17:26:01 +0200 [thread overview]
Message-ID: <ZQHUifAfJ+lZikAn@lpieralisi> (raw)
In-Reply-To: <ZP9MQdRYmlawNsbC@nvidia.com>
On Mon, Sep 11, 2023 at 02:20:01PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 04:57:51PM +0200, Lorenzo Pieralisi wrote:
> > On Thu, Sep 07, 2023 at 11:14:59AM -0700, ankita@nvidia.com wrote:
> > > From: Ankit Agrawal <ankita@nvidia.com>
> > >
> > > Linux allows device drivers to map IO memory on a per-page basis using
> > > "write combining" or WC. This is often done using
> > > pgprot_writecombing(). The driver knows which pages can support WC
> >
> > pgprot_writecombine() ?
> >
> > > access and the proper programming model to generate this IO. Generally
> > > the use case is to boost performance by using write combining to
> > > generate larger PCIe MemWr TLPs.
> >
> > First off, this changeset does not affect *only* Linux guests, obviously.
>
> I think everyone understands that. It can be clarified.
OK, this is not a Linux guest fix *only*, everyone understands that
but I would rather make sure that's crystal clear.
> > I understand that's the use case you are after but this change is
> > targeting all VMs, it must be clear.
> >
> > Then WC and mapping to PCI TLPs, either you describe that in details
> > (NormalNC vs device-nGnRE and resulting SystemBus<->PCI transactions) or
> > you don't describe it at all, as it stands I don't know how to use
> > this information.
>
> How about another pargraph:
>
> KVM prevents all VMs (including Linux) from accessing NORMAL_NC
> mappings, which is how Linux implements pgprot_writecombine(). This
> prevents using this performance optimization within VMs.
>
> I don't think we need to go into details how it works beyond that it
> requires NORMAL_NC.
I think it is worth summarizing the discussion that led to this change,
I can write something up.
> > > Allow VMs to select DEVICE_* or NORMAL_NC on a page by page basis for
> > > all IO memory. This puts the VM in charge of the memory attributes,
> > > and removes the KVM override to DEVICE_nGnRE.
> > >
> > > Ultimately this makes pgprot_writecombing() work correctly in VMs and
> >
> > pgprot_writecombine() ?
> >
> > > allows drivers like mlx5 to fully operate their HW.
> > >
> > > After some discussions with ARM and CPU architects we reached the
> > > conclusion there was no need for KVM to prevent the VM from selecting
> > > between DEVICE_* and NORMAL_NC for IO memory in VMs. There was a fear
> > > that NORMAL_NC could result in uncontained failures, but upon deeper
> > > analysis it turns out there are already possible cases for uncontained
> > > failures with DEVICE types too. Ultimately the platform must be
> > > implemented in a way that ensures that all DEVICE_* and NORMAL_NC
> > > accesses have no uncontained failures.
> >
> > I would reorder/rephrase this changelog as follows:
> >
> > - Describe what the problem is (ie KVM default s2 mappings)
>
> The problem is that pgprot_writecombine() doesn't work in Linux
> VMs. That is the first pagraph.
s/pagraph/paragraph
Well to me that's how the problem was spotted but OK, I can rewrite
the log and post it here, NP.
> > - Describe how you are solving it
>
> That is the middle paragraph "Allow VMs to select DEVICE_* or
> NORMAL_NC on a page by page basis"
>
> > - Add a link to the documentation that states why it is safe to do
> > that and the resulting s1/s2 mappings combination
>
> AFAIK there is no documentation beyond the combining rules. Exactly
> what should happen in various error conditions is implementation
> defined. Catalin did you ever find anything?
>
> > It must be clear why from a legacy standpoint this is a safe change
> > to apply.
>
> This is why:
>
> > > Fortunately real platforms do tend to implement this.
Is it possible please to describe what "this" is in details ?
What does "real platforms" mean ?
Let's describe what HW/SW should be implemented to make this
safe.
> It is why it is safe today, because real platforms don't throw
"real platforms", I have a problem with that, see above.
> uncontained errors from typical PCI accesses that VFIO allows. I think
> the conclusions was it turns out that is just because they don't do
> errors at all, not because DEVICE_* prevents it.
>
> > Remove this sentence, it adds no information for someone who
> > is chasing bugs or just wants to understand the change itself.
>
> So, if you hit a bug here you might evaluate if there is something
> wrong with your platform, ie it is allowing uncontained errors in
> unexpected places.
Developers looking at commit log in the future must be able to
understand why it was a sound change to make. As it stands IMO
the commit log does not explain it fully.
I can write up the commit log and post it if I manage to summarize
it any better - more important the review on the code (that was already
provided), I will try to write something up asap.
Thanks,
Lorenzo
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-09-13 15:26 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-07 18:14 [PATCH v1 0/2] KVM: arm64: support write combining and cachable IO memory in VMs ankita
2023-09-07 18:14 ` [PATCH v1 1/2] KVM: arm64: determine memory type from VMA ankita
2023-09-07 19:12 ` Jason Gunthorpe
2023-10-05 16:15 ` Catalin Marinas
2023-10-05 16:54 ` Jason Gunthorpe
2023-10-10 14:25 ` Catalin Marinas
2023-10-10 15:05 ` Jason Gunthorpe
2023-10-10 17:19 ` Catalin Marinas
2023-10-10 18:23 ` Jason Gunthorpe
2023-10-11 17:45 ` Catalin Marinas
2023-10-11 18:38 ` Jason Gunthorpe
2023-10-12 16:16 ` Catalin Marinas
2024-03-10 3:49 ` Ankit Agrawal
2024-03-19 13:38 ` Jason Gunthorpe
2023-10-23 13:20 ` Shameerali Kolothum Thodi
2023-09-07 18:14 ` [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory ankita
2023-09-08 16:40 ` Catalin Marinas
2023-09-11 14:57 ` Lorenzo Pieralisi
2023-09-11 17:20 ` Jason Gunthorpe
2023-09-13 15:26 ` Lorenzo Pieralisi [this message]
2023-09-13 18:54 ` Jason Gunthorpe
2023-09-26 8:31 ` Lorenzo Pieralisi
2023-09-26 12:25 ` Jason Gunthorpe
2023-09-26 13:52 ` Catalin Marinas
2023-09-26 16:12 ` Lorenzo Pieralisi
2023-10-05 9:56 ` Lorenzo Pieralisi
2023-10-05 11:56 ` Jason Gunthorpe
2023-10-05 14:08 ` Lorenzo Pieralisi
2023-10-12 12:35 ` Will Deacon
2023-10-12 13:20 ` Jason Gunthorpe
2023-10-12 14:29 ` Lorenzo Pieralisi
2023-10-12 13:53 ` Catalin Marinas
2023-10-12 14:48 ` Will Deacon
2023-10-12 15:44 ` Jason Gunthorpe
2023-10-12 16:39 ` Will Deacon
2023-10-12 18:36 ` Jason Gunthorpe
2023-10-13 9:29 ` Will Deacon
2023-10-12 17:26 ` Catalin Marinas
2023-10-13 9:29 ` Will Deacon
2023-10-13 13:08 ` Catalin Marinas
2023-10-13 13:45 ` Jason Gunthorpe
2023-10-19 11:07 ` Catalin Marinas
2023-10-19 11:51 ` Jason Gunthorpe
2023-10-20 11:21 ` Catalin Marinas
2023-10-20 11:47 ` Jason Gunthorpe
2023-10-20 14:03 ` Lorenzo Pieralisi
2023-10-20 14:28 ` Jason Gunthorpe
2023-10-19 13:35 ` Lorenzo Pieralisi
2023-10-13 15:28 ` Lorenzo Pieralisi
2023-10-19 11:12 ` Catalin Marinas
2023-11-09 15:34 ` Lorenzo Pieralisi
2023-11-10 14:26 ` Jason Gunthorpe
2023-11-13 0:42 ` Lorenzo Pieralisi
2023-11-13 17:41 ` Catalin Marinas
2023-10-12 12:27 ` Will Deacon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZQHUifAfJ+lZikAn@lpieralisi \
--to=lpieralisi@kernel.org \
--cc=acurrid@nvidia.com \
--cc=aniketa@nvidia.com \
--cc=ankita@nvidia.com \
--cc=apopple@nvidia.com \
--cc=catalin.marinas@arm.com \
--cc=cjia@nvidia.com \
--cc=danw@nvidia.com \
--cc=jgg@nvidia.com \
--cc=jhubbard@nvidia.com \
--cc=kvmarm@lists.linux.dev \
--cc=kwankhede@nvidia.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=targupta@nvidia.com \
--cc=vsethi@nvidia.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox