linux-pci.vger.kernel.org archive mirror
From: Alexey Kardashevskiy <aik@amd.com>
To: linux-pci@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Bjorn Helgaas <bhelgaas@google.com>,
	David Woodhouse <dwmw@amazon.co.uk>,
	Kai-Heng Feng <kai.heng.feng@canonical.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Santosh Shukla <santosh.shukla@amd.com>,
	"Nikunj A. Dadhania" <nikunj@amd.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
Subject: Re: [RFC PATCH] PCI: Add quirk to always map ivshmem as write-back
Date: Tue, 24 Jun 2025 11:42:47 +1000
Message-ID: <930fc54c-a88c-49b3-a1a7-6ad9228d84ac@amd.com>
In-Reply-To: <52f0d07a-b1a0-432c-8f6f-8c9bf59c1843@amd.com>

Ping? Thanks,


On 12/6/25 18:27, Alexey Kardashevskiy wrote:
> Wrong email for Nikunj :) And I missed the KVM ml. Sorry for the noise.
> 
> 
> On 12/6/25 18:22, Alexey Kardashevskiy wrote:
>> QEMU Inter-VM Shared Memory (ivshmem) is designed to share a memory
>> region between guest and host. The host creates a file and passes it to
>> QEMU, which presents it to the guest via PCI BAR#2. The guest userspace
>> can map /sys/bus/pci/devices/0000:01:02.3/resource2(_wc) to use the region
>> without having the guest driver for the device at all.
>>
>> The problem is that, since this is a PCI resource, the PCI sysfs
>> reasonably enforces:
>> - no caching when mapped via "resourceN" (PTE::PCD on x86) or
>> - write-through when mapped via "resourceN_wc" (PTE::PWT on x86).
>>
>> As a result, host writes are seen by the guest immediately
>> (as the region is just a mapped file) but it takes quite some time for
>> the host to see non-cached guest writes.
>>
>> Add a quirk to always map ivshmem's BAR2 as cacheable (==write-back) as
>> ivshmem is backed by RAM anyway.
>> (Re)use the already defined but unused IORESOURCE_CACHEABLE flag.
>>
>> This does not affect other ways of mapping a PCI BAR; a driver can use
>> memremap() for this functionality.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>> ---
>>
>> What is this IORESOURCE_CACHEABLE for actually?
>>
>> Anyway, the alternatives are:
>>
>> 1. add a new node in sysfs - "resourceN_wb" - for mapping as writeback
>> but this requires changing existing (and likely old) userspace tools;
>>
>> 2. fix the kernel to strictly follow /proc/mtrr (now it is rather
>> a recommendation) but Documentation/arch/x86/mtrr.rst says it is replaced
>> with PAT which does not seem to allow overriding caching for specific
>> devices (==MMIO ranges).
>>
>> ---
>>   drivers/pci/mmap.c   | 6 ++++++
>>   drivers/pci/quirks.c | 8 ++++++++
>>   2 files changed, 14 insertions(+)
>>
>> diff --git a/drivers/pci/mmap.c b/drivers/pci/mmap.c
>> index 8da3347a95c4..8495bee08fae 100644
>> --- a/drivers/pci/mmap.c
>> +++ b/drivers/pci/mmap.c
>> @@ -35,6 +35,7 @@ int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
>>       if (write_combine)
>>           vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
>>       else
>> +    else if (!(pci_resource_flags(pdev, bar) & IORESOURCE_CACHEABLE))
>>           vma->vm_page_prot = pgprot_device(vma->vm_page_prot);
>>       if (mmap_state == pci_mmap_io) {
>> @@ -46,6 +47,11 @@ int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
>>       vma->vm_ops = &pci_phys_vm_ops;
>> +    if (pci_resource_flags(pdev, bar) & IORESOURCE_CACHEABLE)
>> +        return remap_pfn_range_notrack(vma, vma->vm_start, vma->vm_pgoff,
>> +                           vma->vm_end - vma->vm_start,
>> +                           vma->vm_page_prot);
>> +
>>       return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
>>                     vma->vm_end - vma->vm_start,
>>                     vma->vm_page_prot);
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index d7f4ee634263..858869ec6612 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -6335,3 +6335,11 @@ static void pci_mask_replay_timer_timeout(struct pci_dev *pdev)
>>   DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9750, pci_mask_replay_timer_timeout);
>>   DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_GLI, 0x9755, pci_mask_replay_timer_timeout);
>>   #endif
>> +
>> +static void pci_ivshmem_writeback(struct pci_dev *dev)
>> +{
>> +    struct resource *r = &dev->resource[2];
>> +
>> +    r->flags |= IORESOURCE_CACHEABLE;
>> +}
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT_QUMRANET, 0x1110, pci_ivshmem_writeback);
> 

-- 
Alexey
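
For readers following the thread, the guest userspace mapping described in the
quoted commit message could look roughly like the sketch below. It is only an
illustration, not part of the patch: map_bar() and unmap_bar() are hypothetical
helper names, the caller is assumed to know the BAR size, and the path would be
something like /sys/bus/pci/devices/0000:01:02.3/resource2 as in the example
above. The caching attribute of the resulting mapping is decided by the kernel,
not by this code.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first `len` bytes of a PCI sysfs resource
 * file (or any other mmap-able file) read/write and shared.
 * Returns NULL on failure.
 */
static void *map_bar(const char *path, size_t len)
{
	void *p;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the descriptor is closed */

	return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: drop a mapping created by map_bar(). */
static int unmap_bar(void *p, size_t len)
{
	return munmap(p, len);
}
```

A caller would pass the resource2 path (or resource2_wc for the write-combining
variant) together with the BAR length, use the returned pointer as ordinary
memory, and call unmap_bar() when done.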


Thread overview: 5+ messages
2025-06-12  8:22 [RFC PATCH] PCI: Add quirk to always map ivshmem as write-back Alexey Kardashevskiy
2025-06-12  8:27 ` Alexey Kardashevskiy
2025-06-24  1:42   ` Alexey Kardashevskiy [this message]
2025-06-25 16:28     ` Manivannan Sadhasivam
2025-07-18  1:58       ` Alexey Kardashevskiy
