From: Bjorn Helgaas <helgaas@kernel.org>
To: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-doc@vger.kernel.org,
	bhelgaas@google.com, alex.williamson@redhat.com, aik@ozlabs.ru,
	benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
	corbet@lwn.net, warrier@linux.vnet.ibm.com,
	zhong@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com,
	gwshan@linux.vnet.ibm.com
Subject: Re: [RESEND PATCH v2 4/4] PCI: Add support for enforcing all MMIO BARs to be page aligned
Date: Mon, 20 Jun 2016 21:26:43 -0500
Message-ID: <20160621022643.GD30307@localhost>
In-Reply-To: <1464846411-16895-5-git-send-email-xyjxie@linux.vnet.ibm.com>

On Thu, Jun 02, 2016 at 01:46:51PM +0800, Yongji Xie wrote:
> When vfio passes through a PCI device whose MMIO BARs are
> smaller than PAGE_SIZE, the guest cannot access those BARs
> directly, so every MMIO access is emulated in the host.
> 
> This is because vfio will not pass through an MMIO page of one
> BAR if that page may be shared with other BARs. Otherwise,
> a guest would have a backdoor for accessing the BARs of
> another guest.
> 
> To solve this issue, this patch modifies resource_alignment
> to support syntax where multiple devices get the same
> alignment. So we can use something like
> "pci=resource_alignment=*:*:*.*:noresize" to enforce the
> alignment of all MMIO BARs to be at least PAGE_SIZE so that
> one BAR's mmio page would not be shared with other BARs.
> 
> And we also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
> automatically on PPC64 platform which can easily hit this issue
> because its PAGE_SIZE is 64KB.
> 
> Note that this is not applied to VFs, whose BARs are always
> page aligned and should never be reassigned according to the
> SR-IOV spec.
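
For readers following along, here is a rough userspace sketch of the
wildcard matching described in the commit message. It is not the patch's
code: PCI_ANY, struct pci_addr_pattern, and the helper names are all
hypothetical, and it only handles the full <domain>:<bus>:<slot>.<func>
form, without the short bus:slot.func form or the order/noresize parts
that the real parser also accepts.

```c
#include <stdbool.h>
#include <stdio.h>

#define PCI_ANY (-1)	/* hypothetical marker: field was given as "*" */

struct pci_addr_pattern {	/* hypothetical, for illustration only */
	int seg, bus, slot, func;
};

/* Parse one field, either "*" or a hex number; return chars consumed, -1 on error. */
static int parse_field(const char *p, int *val)
{
	unsigned int v;
	int count = 0;

	if (*p == '*') {
		*val = PCI_ANY;
		return 1;
	}
	if (sscanf(p, "%x%n", &v, &count) != 1)
		return -1;
	*val = (int)v;
	return count;
}

/* Parse "seg:bus:slot.func", where any field may be "*". */
static bool parse_pattern(const char *p, struct pci_addr_pattern *pat)
{
	static const char sep[] = { ':', ':', '.', '\0' };
	int *field[] = { &pat->seg, &pat->bus, &pat->slot, &pat->func };
	int i, n;

	for (i = 0; i < 4; i++) {
		n = parse_field(p, field[i]);
		if (n < 0)
			return false;
		p += n;
		if (*p != sep[i])	/* wrong separator or trailing junk */
			return false;
		if (sep[i])
			p++;
	}
	return true;
}

/* A wildcard field matches any value; otherwise the values must be equal. */
static bool pattern_matches(const struct pci_addr_pattern *pat,
			    int seg, int bus, int slot, int func)
{
	return (pat->seg == PCI_ANY || pat->seg == seg) &&
	       (pat->bus == PCI_ANY || pat->bus == bus) &&
	       (pat->slot == PCI_ANY || pat->slot == slot) &&
	       (pat->func == PCI_ANY || pat->func == func);
}
```

With this sketch, "*:*:*.*" matches every device, while "0000:01:*.0"
matches function 0 of any slot on bus 01 in domain 0000.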

I see that SR-IOV spec r1.1, sec 3.3.13 requires that all VF BAR
resources be aligned on System Page Size and sized to consume an
integral number of pages.

Where does it say VF BARs can't be reassigned?  I thought they *could*
be reassigned, as long as VFs are disabled when you do it.

> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
> ---
>  Documentation/kernel-parameters.txt |    2 ++
>  arch/powerpc/include/asm/pci.h      |    2 ++
>  drivers/pci/pci.c                   |   68 +++++++++++++++++++++++++++++------
>  3 files changed, 61 insertions(+), 11 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index c4802f5..cb09503 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3003,6 +3003,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  				aligned memory resources.
>  				If <order of align> is not specified,
>  				PAGE_SIZE is used as alignment.
> +				<domain>, <bus>, <slot> and <func> can be set to
> +				"*" which means match all values.
>  				PCI-PCI bridge can be specified, if resource
>  				windows need to be expanded.
>  				noresize: Don't change the resources' sizes when
> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
> index a6f3ac0..742fd34 100644
> --- a/arch/powerpc/include/asm/pci.h
> +++ b/arch/powerpc/include/asm/pci.h
> @@ -28,6 +28,8 @@
>  #define PCIBIOS_MIN_IO		0x1000
>  #define PCIBIOS_MIN_MEM		0x10000000
>  
> +#define PCIBIOS_MIN_ALIGNMENT  PAGE_SIZE
> +
>  struct pci_dev;
>  
>  /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3ee13e5..664f295 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4759,7 +4759,12 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  	int seg, bus, slot, func, align_order, count;
>  	resource_size_t align = 0;
>  	char *p;
> +	bool invalid = false;
>  
> +#ifdef PCIBIOS_MIN_ALIGNMENT
> +	align = PCIBIOS_MIN_ALIGNMENT;
> +	*resize = false;
> +#endif

This PCIBIOS_MIN_ALIGNMENT part should be a separate patch by itself.

If you have PCIBIOS_MIN_ALIGNMENT enabled automatically for powerpc,
do you still need the command-line argument?
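
As background for why a PAGE_SIZE minimum matters on a 64KB-page platform,
here is a small userspace sketch of the page-sharing arithmetic. The
function names and the example addresses are made up for illustration;
nothing here is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Return true if the ranges [a, a + a_size) and [b, b + b_size) touch at
 * least one common page of size page_size.  Sizes must be non-zero.
 */
static bool bars_share_page(uint64_t a, uint64_t a_size,
			    uint64_t b, uint64_t b_size,
			    uint64_t page_size)
{
	uint64_t a_first = a / page_size;
	uint64_t a_last = (a + a_size - 1) / page_size;
	uint64_t b_first = b / page_size;
	uint64_t b_last = (b + b_size - 1) / page_size;

	/* Two page-index intervals overlap iff each starts before the other ends. */
	return a_first <= b_last && b_first <= a_last;
}
```

Two adjacent 4KB BARs at the hypothetical addresses 0x10000000 and
0x10001000 sit in separate pages when pages are 4KB, but share a single
page when pages are 64KB. Moving the second BAR to
align_up(0x10001000, 0x10000) puts it in a page of its own.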

>  	spin_lock(&resource_alignment_lock);
>  	p = resource_alignment_param;
>  	while (*p) {
> @@ -4776,16 +4781,49 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  		} else {
>  			align_order = -1;
>  		}
> -		if (sscanf(p, "%x:%x:%x.%x%n",
> -			&seg, &bus, &slot, &func, &count) != 4) {
> +		if (p[0] == '*' && p[1] == ':') {
> +			seg = -1;
> +			count = 1;
> +		} else if (sscanf(p, "%x%n", &seg, &count) != 1 ||
> +				p[count] != ':') {
> +			invalid = true;
> +			break;
> +		}
> +		p += count + 1;
> +		if (*p == '*') {
> +			bus = -1;
> +			count = 1;
> +		} else if (sscanf(p, "%x%n", &bus, &count) != 1) {
> +			invalid = true;
> +			break;
> +		}
> +		p += count;
> +		if (*p == '.') {
> +			slot = bus;
> +			bus = seg;
>  			seg = 0;
> -			if (sscanf(p, "%x:%x.%x%n",
> -					&bus, &slot, &func, &count) != 3) {
> -				/* Invalid format */
> -				printk(KERN_ERR "PCI: Can't parse resource_alignment parameter: %s\n",
> -					p);
> +			p++;
> +		} else if (*p == ':') {
> +			p++;
> +			if (p[0] == '*' && p[1] == '.') {
> +				slot = -1;
> +				count = 1;
> +			} else if (sscanf(p, "%x%n", &slot, &count) != 1 ||
> +					p[count] != '.') {
> +				invalid = true;
>  				break;
>  			}
> +			p += count + 1;
> +		} else {
> +			invalid = true;
> +			break;
> +		}
> +		if (*p == '*') {
> +			func = -1;
> +			count = 1;
> +		} else if (sscanf(p, "%x%n", &func, &count) != 1) {
> +			invalid = true;
> +			break;
>  		}
>  		p += count;
>  		if (!strncmp(p, ":noresize", 9)) {
> @@ -4793,10 +4831,10 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  			p += 9;
>  		} else
>  			*resize = true;
> -		if (seg == pci_domain_nr(dev->bus) &&
> -			bus == dev->bus->number &&
> -			slot == PCI_SLOT(dev->devfn) &&
> -			func == PCI_FUNC(dev->devfn)) {
> +		if ((seg == pci_domain_nr(dev->bus) || seg == -1) &&
> +			(bus == dev->bus->number || bus == -1) &&
> +			(slot == PCI_SLOT(dev->devfn) || slot == -1) &&
> +			(func == PCI_FUNC(dev->devfn) || func == -1)) {
>  			if (align_order == -1)
>  				align = PAGE_SIZE;
>  			else
> @@ -4806,10 +4844,14 @@ static resource_size_t pci_specified_resource_alignment(struct pci_dev *dev,
>  		}
>  		if (*p != ';' && *p != ',') {
>  			/* End of param or invalid format */
> +			invalid = true;
>  			break;
>  		}
>  		p++;
>  	}
> +	if (invalid)
> +		printk(KERN_ERR "PCI: Can't parse resource_alignment parameter:%s\n",
> +				p);
>  	spin_unlock(&resource_alignment_lock);
>  	return align;
>  }
> @@ -4829,6 +4871,10 @@ void pci_reassigndev_resource_alignment(struct pci_dev *dev)
>  	resource_size_t align, size;
>  	u16 command;
>  
> +	/* We should never try to reassign VF's alignment */
> +	if (dev->is_virtfn)
> +		return;

This part looks like a bugfix that should be in a separate patch.

I assume this is because VFs have no read/write BARs themselves.  A PF
has the usual read/write BAR0-BAR5 at offsets 0x10-0x24, as well as
read/write VF BAR0-BAR5 in the SR-IOV capability.  The VF BARs in the
SR-IOV capability determine the resources assigned for VFs.

For the VFs themselves, BAR0-BAR5 at offsets 0x10-0x24 are read-only
zeroes (SR-IOV spec r1.1, sec 3.4.1.11), and there is no SR-IOV
capability.

Right?
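
To spell out the offsets above, a small sketch. The macro and function
names are hypothetical. The values 0x10 for BAR0 and 0x24 for VF BAR0
(relative to the start of the SR-IOV capability) are, as far as I know,
what the kernel calls PCI_BASE_ADDRESS_0 and PCI_SRIOV_BAR, but check
pci_regs.h rather than trusting this.

```c
#include <stdint.h>

#define CFG_BAR0_OFFSET		0x10	/* BAR0 in the config header */
#define SRIOV_VF_BAR0_OFFSET	0x24	/* VF BAR0, relative to the SR-IOV capability */
#define NUM_BARS		6	/* BAR0-BAR5, 4 bytes each */

/* Config-space offset of BARn in the PF's (or any function's) header. */
static unsigned int bar_offset(unsigned int n)
{
	return CFG_BAR0_OFFSET + 4 * n;
}

/*
 * Config-space offset of VF BARn in the PF, given the position of the PF's
 * SR-IOV extended capability (sriov_pos, found by walking the capability list).
 */
static unsigned int vf_bar_offset(unsigned int sriov_pos, unsigned int n)
{
	return sriov_pos + SRIOV_VF_BAR0_OFFSET + 4 * n;
}
```

So bar_offset(0) through bar_offset(5) cover 0x10-0x24, and the VF BARs
live only in the PF's config space, at an offset that depends on where the
SR-IOV capability happens to be.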

>  	/* check if specified PCI is target device to reassign */
>  	align = pci_specified_resource_alignment(dev, &resize);
>  	if (!align)
> -- 
> 1.7.9.5
