* Re: [PATCH] virtio-balloon: Trivial cleanups
From: Michael S. Tsirkin @ 2011-11-03 11:09 UTC (permalink / raw)
To: Sasha Levin; +Cc: linux-kernel, virtualization
In-Reply-To: <1320308404-28859-1-git-send-email-levinsasha928@gmail.com>
On Thu, Nov 03, 2011 at 10:20:04AM +0200, Sasha Levin wrote:
> Trivial changes to remove forgotten junk, format comments, and correct names.
>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
3.3 material.
Rusty, you want to take it? Want me to?
> ---
> drivers/virtio/virtio_balloon.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index e058ace..289e998 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -1,4 +1,5 @@
> -/* Virtio balloon implementation, inspired by Dor Loar and Marcelo
> +/*
> + * Virtio balloon implementation, inspired by Dor Laor and Marcelo
> * Tosatti's implementations.
> *
> * Copyright 2008 Rusty Russell IBM Corporation
> @@ -17,7 +18,7 @@
> * along with this program; if not, write to the Free Software
> * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> */
> -//#define DEBUG
> +
> #include <linux/virtio.h>
> #include <linux/virtio_balloon.h>
> #include <linux/swap.h>
> @@ -148,7 +149,6 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
> vb->num_pages--;
> }
>
> -
> /*
> * Note that if
> * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
> --
> 1.7.7.1
^ permalink raw reply
* Re: [PATCH RFC] virtio-pci: flexible configuration layout
From: Sasha Levin @ 2011-11-03 11:09 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Krishna Kumar, kvm, Pawel Moll, Wang Sheng-Hui,
Alexey Kardashevskiy, lkml - Kernel Mailing List, Jesse Barnes,
virtualization, Christian Borntraeger, Amit Shah, Linus Torvalds
In-Reply-To: <20111103103314.GA18296@redhat.com>
On Thu, 2011-11-03 at 12:33 +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 03, 2011 at 02:19:03AM +0200, Sasha Levin wrote:
> > Hi Michael,
> >
> > On Thu, 2011-11-03 at 01:31 +0200, Michael S. Tsirkin wrote:
> > > Add a flexible mechanism to specify virtio configuration layout, using
> > > pci vendor-specific capability. A separate capability is used for each
> > > of common, device specific and data-path accesses.
> > >
> > > Warning: compiled only.
> > > This patch also needs to be split up, pci_iomap changes
> > > also need arch updates for non-x86.
> > >
> > > We also will need to update the spec.
> > >
> > > See the first chunk for layout documentation.
> > >
> > > Posting here for early feedback.
> > >
> > > In particular:
> > >
> > > Do we need to require offset to be aligned?
> > > Does iowrite16 work with unaligned accesses on all architectures?
> > > Does using ioread/write as we do add overhead as compared to
> > > plain PIO accesses?
> > >
> > > Jesse - are you OK with the pci_iomap_range API proposed here
> > > (see last chunks)?
> > > I noticed lots of architectures duplicate the implementation
> > > of pci_iomap - makes sense to clean that up?
> > >
> > > Thanks,
> > >
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > >
> > > ---
> > >
> > > diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
> > > index ea66f3f..b8a9de2 100644
> > > --- a/include/linux/virtio_pci.h
> > > +++ b/include/linux/virtio_pci.h
> > > @@ -92,4 +92,38 @@
> > > /* The alignment to use between consumer and producer parts of vring.
> > > * x86 pagesize again. */
> > > #define VIRTIO_PCI_VRING_ALIGN 4096
> > > +
> > > +/*
> > > + * Layout for Virtio PCI vendor specific capability (little-endian):
> > > + * 5 bit virtio capability id.
> > > + * 3 bit BAR index register, specifying which BAR to use.
> > > + * 4 byte cfg offset within the BAR.
> >
> > Why not make the 3 bits of the BIR as the lower 3 bits of the offset
> > like its done in the PCI spec?
> >
> > This way you also get 'free' 32bit alignment and simpler masking.
>
> 3 bit gets us 64 bit alignment, no? Seems a bit more than we need:
> we need at most 16 bit alignment.
Yup, it's actually 64 bit, but whats the problem with that? It will also
look more similar to the PCI spec.
>
> > > + * 4 byte cfg size.
> > > + */
> > > +
> > > +/* A single virtio device has multiple vendor specific capabilities, we use the
> > > + * 5 bit ID field to distinguish between these. */
> > > +#define VIRTIO_PCI_CAP_ID 3
> > > +#define VIRTIO_PCI_CAP_ID_MASK 0x1f
> > > +#define VIRTIO_PCI_CAP_ID_SHIFT 0
> > > +
> > > +/* IDs for different capabilities. If a specific configuration
> > > + * is missing, legacy PIO path is used. */
> > > +/* Common configuration */
> > > +#define VIRTIO_PCI_CAP_COMMON_CFG 0
> > > +/* Device specific confiuration */
> > > +#define VIRTIO_PCI_CAP_DEVICE_CFG 1
> > > +/* Notifications and ISR access */
> > > +#define VIRTIO_PCI_CAP_NOTIFY_CFG 2
> >
> > I think that separating notification here is very kvm specific.
>
> I don't see how it's kvm specific. This just makes it possible to split data path and
> configuration. Good for security as well. I was actually debating a
> separate configuration for ISR and NOTIFICATION.
It's not really a clean split between data and config, you still have
device status stuck between notification and ISR, which isn't really in
the data path.
A split as you suggested sounds like a good idea, but we just need to
move some fields around to make it clean.
> > Maybe it
> > could be replaced by allowing virtio to signal eventfds or something
> > similar?
>
> No idea what you mean - it's a pci device, how should it signal eventfds?
I'm not really sure what I was thinking about, ignore that :)
>
> > > +
> > > +/* Index of the BAR including this configuration */
> > > +#define VIRTIO_PCI_CAP_CFG_BIR 3
> > > +#define VIRTIO_PCI_CAP_CFG_BIR_MASK (0x7)
> > > +#define VIRTIO_PCI_CAP_CFG_BIR_SHIFT 5
> > > +
> > > +/* Offset within the BAR */
> > > +#define VIRTIO_PCI_CAP_CFG_OFF 4
> > > +/* Size of the configuration space */
> > > +#define VIRTIO_PCI_CAP_CFG_SIZE 8
> > > +
> > > #endif
> > > diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> > > index 4bcc8b8..d4ed130 100644
> > > --- a/drivers/virtio/virtio_pci.c
> > > +++ b/drivers/virtio/virtio_pci.c
> > > @@ -37,8 +37,12 @@ struct virtio_pci_device
> > > struct virtio_device vdev;
> > > struct pci_dev *pci_dev;
> > >
> > > - /* the IO mapping for the PCI config space */
> > > + /* the IO address for the common PCI config space */
> > > void __iomem *ioaddr;
> > > + /* the IO address for device specific config */
> > > + void __iomem *ioaddr_device;
> > > + /* the IO address to use for notifications operations */
> > > + void __iomem *ioaddr_notify;
> > >
> > > /* a list of queues so we can dispatch IRQs */
> > > spinlock_t lock;
> > > @@ -57,8 +61,142 @@ struct virtio_pci_device
> > > unsigned msix_used_vectors;
> > > /* Whether we have vector per vq */
> > > bool per_vq_vectors;
> > > +
> > > + /* Various IO mappings: used for resource tracking only. */
> > > +
> > > + /* Legacy BAR0: typically PIO. */
> > > + void __iomem *legacy_map;
> > > +
> > > + /* Mappings specified by device capabilities: typically in MMIO */
> > > + void __iomem *notify_map;
> > > + void __iomem *common_map;
> > > + void __iomem *device_map;
> > > };
> > >
> > > +/*
> > > + * With PIO, device-specific config moves as MSI-X is enabled/disabled.
> > > + * Use this accessor to keep pointer to that config in sync.
> > > + */
> > > +static void virtio_pci_set_msix_enabled(struct virtio_pci_device *vp_dev, int enabled)
> > > +{
> > > + vp_dev->msix_enabled = enabled;
> > > + if (vp_dev->device_map)
> > > + vp_dev->ioaddr_device = vp_dev->device_map;
> > > + else
> > > + vp_dev->ioaddr_device = vp_dev->legacy_map +
> > > + VIRTIO_PCI_CONFIG(vp_dev);
> > > +}
> > > +
> > > +static void __iomem *virtio_pci_map_cfg(struct virtio_pci_device *vp_dev, u8 cap_id)
> > > +{
> > > + u32 size;
> > > + u32 offset;
> > > + u8 bir;
> > > + u8 cap_len;
> > > + int pos;
> > > + struct pci_dev *dev = vp_dev->pci_dev;
> > > + void __iomem *p;
> > > +
> > > + for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> > > + pos > 0;
> > > + pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
> > > + u8 id;
> > > + pci_read_config_byte(dev, pos + PCI_VNDR_CAP_LEN, &cap_len);
> > > + if (cap_len < VIRTIO_PCI_CAP_ID + 1)
> > > + continue;
> > > + pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_ID, &id);
> > > + id >>= VIRTIO_PCI_CAP_ID_SHIFT;
> > > + id &= VIRTIO_PCI_CAP_ID_MASK;
> > > + if (id == cap_id)
> > > + break;
> > > + }
> > > +
> > > + if (pos <= 0)
> > > + return NULL;
> > > +
> > > + if (cap_len < VIRTIO_PCI_CAP_CFG_BIR + 1)
> > > + goto err;
> > > + pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_CFG_BIR, &bir);
> > > + if (cap_len < VIRTIO_PCI_CAP_CFG_OFF + 4)
> > > + goto err;
> > > + pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_OFF, &offset);
> > > + if (cap_len < VIRTIO_PCI_CAP_CFG_SIZE + 4)
> > > + goto err;
> > > + pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_SIZE, &size);
> > > + bir >>= VIRTIO_PCI_CAP_CFG_BIR_SHIFT;
> > > + bir &= VIRTIO_PCI_CAP_CFG_BIR_MASK;
> > > +
> > > + /* It's possible for a device to supply a huge config space,
> > > + * but we'll never need to map more than a page ATM. */
> > > + p = pci_iomap_range(dev, bir, offset, size, PAGE_SIZE);
> > > + if (!p)
> > > + dev_err(&vp_dev->vdev.dev, "Unable to map virtio pci memory");
> > > + return p;
> > > +err:
> > > + dev_err(&vp_dev->vdev.dev, "Unable to parse virtio pci capability");
> > > + return NULL;
> > > +}
> > > +
> > > +static void virtio_pci_iounmap(struct virtio_pci_device *vp_dev)
> > > +{
> > > + if (vp_dev->legacy_map)
> > > + pci_iounmap(vp_dev->pci_dev, vp_dev->legacy_map);
> > > + if (vp_dev->notify_map)
> > > + pci_iounmap(vp_dev->pci_dev, vp_dev->notify_map);
> > > + if (vp_dev->common_map)
> > > + pci_iounmap(vp_dev->pci_dev, vp_dev->common_map);
> > > + if (vp_dev->device_map)
> > > + pci_iounmap(vp_dev->pci_dev, vp_dev->device_map);
> > > +}
> > > +
> > > +static int virtio_pci_iomap(struct virtio_pci_device *vp_dev)
> > > +{
> > > + vp_dev->notify_map = virtio_pci_map_cfg(vp_dev,
> > > + VIRTIO_PCI_CAP_NOTIFY_CFG);
> > > + vp_dev->common_map = virtio_pci_map_cfg(vp_dev,
> > > + VIRTIO_PCI_CAP_COMMON_CFG);
> > > + vp_dev->device_map = virtio_pci_map_cfg(vp_dev,
> > > + VIRTIO_PCI_CAP_DEVICE_CFG);
> > > +
> > > + if (!vp_dev->notify_map || !vp_dev->common_map ||
> > > + !vp_dev->device_map) {
> > > + /*
> > > + * If not all capabilities present, map legacy PIO.
> > > + * Legacy access is at BAR 0. We never need to map
> > > + * more than 256 bytes there, since legacy config space
> > > + * used PIO which has this size limit.
> > > + * */
> >
> > Whats the point of putting each of those new configs into a separate cap
> > structure if you must have all 3 of them for the new config layout to
> > work anyway?
>
> Pls look closer, we don't require all 3 of them. For example, we can have BAR0
> DEVICE and COMMON, and notifications will use BAR0.
>
> By using an identical structure we make is much easier to reuse
> and extend it. For example we might add a separate structure for ISR.
Right.
> > Also, why not use this change to simplify the config space? The attempts
> > to squeeze it into the limited PIO space made it quite messy, and this
> > can be a good chance to fix some things like the MSI-X and offsets
> > issue.
>
> For MSI-X this is exactly what this change does, by splitting
> device-specific parts out. So if DEVICE capability is present,
> this tells us where the device specific data starts and
> enabling MSI-X does not relocate it.
You still have relocations within the shared config space, once 64bit
features or other future extensions are implemented.
> Are there other fields we want to move?
We can do several things:
1. Since we're no longer limited in space - remove queue selector and
just have a full queue table.
2. Move device specific features into the device specific region.
Currently the features field is a mix between virtio-pci and device
specific features.
I did more similar things in the spec suggestion I've sent, could you
look at the second part for more ideas please?
--
Sasha.
^ permalink raw reply
* Re: virtio-pci new configuration proposal
From: Michael S. Tsirkin @ 2011-11-03 11:03 UTC (permalink / raw)
To: Rusty Russell
Cc: Anthony Liguori, virtualization, Sasha Levin, kvm, linux-kernel
In-Reply-To: <8762j2t19l.fsf@rustcorp.com.au>
On Thu, Nov 03, 2011 at 12:28:46PM +1030, Rusty Russell wrote:
> 3) If we're changing the queue layout, it's a chance to fix a
> longstanding bug: let the guest notify the host of preferred
> queue size and alignment.
With device config split from the common one, we can
just tuck new fields at the tail of the common one.
--
MST
^ permalink raw reply
* Re: [PATCH RFC] virtio-pci: flexible configuration layout
From: Avi Kivity @ 2011-11-03 10:37 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Krishna Kumar, kvm, Pawel Moll, Wang Sheng-Hui,
Alexey Kardashevskiy, lkml - Kernel Mailing List, Jesse Barnes,
virtualization, Christian Borntraeger, Sasha Levin, Amit Shah,
Linus Torvalds
In-Reply-To: <20111102233110.GA20289@redhat.com>
On 11/03/2011 01:31 AM, Michael S. Tsirkin wrote:
> Add a flexible mechanism to specify virtio configuration layout, using
> pci vendor-specific capability. A separate capability is used for each
> of common, device specific and data-path accesses.
>
>
How about posting the spec change instead of patches?
--
error compiling committee.c: too many arguments to function
^ permalink raw reply
* Re: [PATCH RFC] virtio-pci: flexible configuration layout
From: Michael S. Tsirkin @ 2011-11-03 10:33 UTC (permalink / raw)
To: Sasha Levin
Cc: Krishna Kumar, kvm, Pawel Moll, Wang Sheng-Hui,
Alexey Kardashevskiy, lkml - Kernel Mailing List, Jesse Barnes,
virtualization, Christian Borntraeger, Amit Shah, Linus Torvalds
In-Reply-To: <1320279543.31237.14.camel@lappy>
On Thu, Nov 03, 2011 at 02:19:03AM +0200, Sasha Levin wrote:
> Hi Michael,
>
> On Thu, 2011-11-03 at 01:31 +0200, Michael S. Tsirkin wrote:
> > Add a flexible mechanism to specify virtio configuration layout, using
> > pci vendor-specific capability. A separate capability is used for each
> > of common, device specific and data-path accesses.
> >
> > Warning: compiled only.
> > This patch also needs to be split up, pci_iomap changes
> > also need arch updates for non-x86.
> >
> > We also will need to update the spec.
> >
> > See the first chunk for layout documentation.
> >
> > Posting here for early feedback.
> >
> > In particular:
> >
> > Do we need to require offset to be aligned?
> > Does iowrite16 work with unaligned accesses on all architectures?
> > Does using ioread/write as we do add overhead as compared to
> > plain PIO accesses?
> >
> > Jesse - are you OK with the pci_iomap_range API proposed here
> > (see last chunks)?
> > I noticed lots of architectures duplicate the implementation
> > of pci_iomap - makes sense to clean that up?
> >
> > Thanks,
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > ---
> >
> > diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
> > index ea66f3f..b8a9de2 100644
> > --- a/include/linux/virtio_pci.h
> > +++ b/include/linux/virtio_pci.h
> > @@ -92,4 +92,38 @@
> > /* The alignment to use between consumer and producer parts of vring.
> > * x86 pagesize again. */
> > #define VIRTIO_PCI_VRING_ALIGN 4096
> > +
> > +/*
> > + * Layout for Virtio PCI vendor specific capability (little-endian):
> > + * 5 bit virtio capability id.
> > + * 3 bit BAR index register, specifying which BAR to use.
> > + * 4 byte cfg offset within the BAR.
>
> Why not make the 3 bits of the BIR as the lower 3 bits of the offset
> like its done in the PCI spec?
>
> This way you also get 'free' 32bit alignment and simpler masking.
3 bit gets us 64 bit alignment, no? Seems a bit more than we need:
we need at most 16 bit alignment.
> > + * 4 byte cfg size.
> > + */
> > +
> > +/* A single virtio device has multiple vendor specific capabilities, we use the
> > + * 5 bit ID field to distinguish between these. */
> > +#define VIRTIO_PCI_CAP_ID 3
> > +#define VIRTIO_PCI_CAP_ID_MASK 0x1f
> > +#define VIRTIO_PCI_CAP_ID_SHIFT 0
> > +
> > +/* IDs for different capabilities. If a specific configuration
> > + * is missing, legacy PIO path is used. */
> > +/* Common configuration */
> > +#define VIRTIO_PCI_CAP_COMMON_CFG 0
> > +/* Device specific confiuration */
> > +#define VIRTIO_PCI_CAP_DEVICE_CFG 1
> > +/* Notifications and ISR access */
> > +#define VIRTIO_PCI_CAP_NOTIFY_CFG 2
>
> I think that separating notification here is very kvm specific.
I don't see how it's kvm specific. This just makes it possible to split data path and
configuration. Good for security as well. I was actually debating a
separate configuration for ISR and NOTIFICATION.
> Maybe it
> could be replaced by allowing virtio to signal eventfds or something
> similar?
No idea what you mean - it's a pci device, how should it signal eventfds?
> > +
> > +/* Index of the BAR including this configuration */
> > +#define VIRTIO_PCI_CAP_CFG_BIR 3
> > +#define VIRTIO_PCI_CAP_CFG_BIR_MASK (0x7)
> > +#define VIRTIO_PCI_CAP_CFG_BIR_SHIFT 5
> > +
> > +/* Offset within the BAR */
> > +#define VIRTIO_PCI_CAP_CFG_OFF 4
> > +/* Size of the configuration space */
> > +#define VIRTIO_PCI_CAP_CFG_SIZE 8
> > +
> > #endif
> > diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> > index 4bcc8b8..d4ed130 100644
> > --- a/drivers/virtio/virtio_pci.c
> > +++ b/drivers/virtio/virtio_pci.c
> > @@ -37,8 +37,12 @@ struct virtio_pci_device
> > struct virtio_device vdev;
> > struct pci_dev *pci_dev;
> >
> > - /* the IO mapping for the PCI config space */
> > + /* the IO address for the common PCI config space */
> > void __iomem *ioaddr;
> > + /* the IO address for device specific config */
> > + void __iomem *ioaddr_device;
> > + /* the IO address to use for notifications operations */
> > + void __iomem *ioaddr_notify;
> >
> > /* a list of queues so we can dispatch IRQs */
> > spinlock_t lock;
> > @@ -57,8 +61,142 @@ struct virtio_pci_device
> > unsigned msix_used_vectors;
> > /* Whether we have vector per vq */
> > bool per_vq_vectors;
> > +
> > + /* Various IO mappings: used for resource tracking only. */
> > +
> > + /* Legacy BAR0: typically PIO. */
> > + void __iomem *legacy_map;
> > +
> > + /* Mappings specified by device capabilities: typically in MMIO */
> > + void __iomem *notify_map;
> > + void __iomem *common_map;
> > + void __iomem *device_map;
> > };
> >
> > +/*
> > + * With PIO, device-specific config moves as MSI-X is enabled/disabled.
> > + * Use this accessor to keep pointer to that config in sync.
> > + */
> > +static void virtio_pci_set_msix_enabled(struct virtio_pci_device *vp_dev, int enabled)
> > +{
> > + vp_dev->msix_enabled = enabled;
> > + if (vp_dev->device_map)
> > + vp_dev->ioaddr_device = vp_dev->device_map;
> > + else
> > + vp_dev->ioaddr_device = vp_dev->legacy_map +
> > + VIRTIO_PCI_CONFIG(vp_dev);
> > +}
> > +
> > +static void __iomem *virtio_pci_map_cfg(struct virtio_pci_device *vp_dev, u8 cap_id)
> > +{
> > + u32 size;
> > + u32 offset;
> > + u8 bir;
> > + u8 cap_len;
> > + int pos;
> > + struct pci_dev *dev = vp_dev->pci_dev;
> > + void __iomem *p;
> > +
> > + for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> > + pos > 0;
> > + pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
> > + u8 id;
> > + pci_read_config_byte(dev, pos + PCI_VNDR_CAP_LEN, &cap_len);
> > + if (cap_len < VIRTIO_PCI_CAP_ID + 1)
> > + continue;
> > + pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_ID, &id);
> > + id >>= VIRTIO_PCI_CAP_ID_SHIFT;
> > + id &= VIRTIO_PCI_CAP_ID_MASK;
> > + if (id == cap_id)
> > + break;
> > + }
> > +
> > + if (pos <= 0)
> > + return NULL;
> > +
> > + if (cap_len < VIRTIO_PCI_CAP_CFG_BIR + 1)
> > + goto err;
> > + pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_CFG_BIR, &bir);
> > + if (cap_len < VIRTIO_PCI_CAP_CFG_OFF + 4)
> > + goto err;
> > + pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_OFF, &offset);
> > + if (cap_len < VIRTIO_PCI_CAP_CFG_SIZE + 4)
> > + goto err;
> > + pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_SIZE, &size);
> > + bir >>= VIRTIO_PCI_CAP_CFG_BIR_SHIFT;
> > + bir &= VIRTIO_PCI_CAP_CFG_BIR_MASK;
> > +
> > + /* It's possible for a device to supply a huge config space,
> > + * but we'll never need to map more than a page ATM. */
> > + p = pci_iomap_range(dev, bir, offset, size, PAGE_SIZE);
> > + if (!p)
> > + dev_err(&vp_dev->vdev.dev, "Unable to map virtio pci memory");
> > + return p;
> > +err:
> > + dev_err(&vp_dev->vdev.dev, "Unable to parse virtio pci capability");
> > + return NULL;
> > +}
> > +
> > +static void virtio_pci_iounmap(struct virtio_pci_device *vp_dev)
> > +{
> > + if (vp_dev->legacy_map)
> > + pci_iounmap(vp_dev->pci_dev, vp_dev->legacy_map);
> > + if (vp_dev->notify_map)
> > + pci_iounmap(vp_dev->pci_dev, vp_dev->notify_map);
> > + if (vp_dev->common_map)
> > + pci_iounmap(vp_dev->pci_dev, vp_dev->common_map);
> > + if (vp_dev->device_map)
> > + pci_iounmap(vp_dev->pci_dev, vp_dev->device_map);
> > +}
> > +
> > +static int virtio_pci_iomap(struct virtio_pci_device *vp_dev)
> > +{
> > + vp_dev->notify_map = virtio_pci_map_cfg(vp_dev,
> > + VIRTIO_PCI_CAP_NOTIFY_CFG);
> > + vp_dev->common_map = virtio_pci_map_cfg(vp_dev,
> > + VIRTIO_PCI_CAP_COMMON_CFG);
> > + vp_dev->device_map = virtio_pci_map_cfg(vp_dev,
> > + VIRTIO_PCI_CAP_DEVICE_CFG);
> > +
> > + if (!vp_dev->notify_map || !vp_dev->common_map ||
> > + !vp_dev->device_map) {
> > + /*
> > + * If not all capabilities present, map legacy PIO.
> > + * Legacy access is at BAR 0. We never need to map
> > + * more than 256 bytes there, since legacy config space
> > + * used PIO which has this size limit.
> > + * */
>
> Whats the point of putting each of those new configs into a separate cap
> structure if you must have all 3 of them for the new config layout to
> work anyway?
Pls look closer, we don't require all 3 of them. For example, we can have BAR0
DEVICE and COMMON, and notifications will use BAR0.
By using an identical structure we make is much easier to reuse
and extend it. For example we might add a separate structure for ISR.
> Also, why not use this change to simplify the config space? The attempts
> to squeeze it into the limited PIO space made it quite messy, and this
> can be a good chance to fix some things like the MSI-X and offsets
> issue.
For MSI-X this is exactly what this change does, by splitting
device-specific parts out. So if DEVICE capability is present,
this tells us where the device specific data starts and
enabling MSI-X does not relocate it.
Are there other fields we want to move?
>
> --
>
> Sasha.
^ permalink raw reply
* Re: [PATCH 4 of 5] virtio: avoid modulus operation
From: Rusty Russell @ 2011-11-03 10:18 UTC (permalink / raw)
To: Pekka Enberg; +Cc: Christoph Hellwig, virtualization, linux-kernel, kvm, mst
In-Reply-To: <CAOJsxLEt6_y7jw0bRsaita4gfb2k+BAQMeRLs9PcHntGVSFvaQ@mail.gmail.com>
On Thu, 3 Nov 2011 09:51:15 +0200, Pekka Enberg <penberg@kernel.org> wrote:
> On Thu, Nov 3, 2011 at 9:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> > Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted
> > in vring_new_virtqueue()).
> >
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > ---
> > drivers/virtio/virtio_ring.c | 10 ++++++----
> > 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
> > --- a/arch/x86/lguest/boot.c
> > +++ b/arch/x86/lguest/boot.c
> > @@ -1420,7 +1420,7 @@ __init void lguest_init(void)
> > add_preferred_console("hvc", 0, NULL);
> >
> > /* Register our very early console. */
> > - virtio_cons_early_init(early_put_chars);
> > +// virtio_cons_early_init(early_put_chars);
>
> What's this hunk here?
Ugly workaround for console until hvc_console breakage gets reverted.
Left in by mistake :)
Thanks,
Rusty.
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: virtio-pci new configuration proposal
From: Sasha Levin @ 2011-11-03 8:33 UTC (permalink / raw)
To: Rusty Russell
Cc: Anthony Liguori, virtualization, linux-kernel, kvm,
Michael S. Tsirkin
In-Reply-To: <8762j2t19l.fsf@rustcorp.com.au>
On Thu, 2011-11-03 at 12:28 +1030, Rusty Russell wrote:
> On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> > This is a proposal for a new layout of the virtio-pci config space.
> >
> > We will separate the current configuration into two: A virtio-pci common
> > configuration and a device specific configuration. This allows more flexibility
> > with adding features and makes usage easier, specifically in cases like the
> > ones in virtio-net where device specific configurations depend on device
> > specific features.
>
> Thanks for this Sasha. Several general comments:
>
> 1) How to we distinguish the two layouts? In theory, we need to do this
> forever. In practice we can deprecate the old layout in several
> years' time.
Old layouts won't have the new virtio-pci cap structure in their PCI
config space.
> 2) I don't think we want to turn the device-specific config into a
> linked list. We haven't needed variable-length config (yet!), and
> it's (slightly) more complex. That's also the part of the spec which
> is shared with non-PCI virtio implementations.
Variable length config wasn't used yet because space in the device
specific space was reserved for a feature even if that feature wasn't
used.
For example, the MAC feature reserved 6 bytes in the config space for
the MAC even if VIRTIO_NET_F_MAC wasn't enabled. Here we can just avoid
having it pollute the config space until it's enabled.
I don't think it'll have any impact on non-PCI implementations since the
"pointers" are simply offsets from the beginning of the config space,
and are not PCI specific in any way.
> 3) If we're changing the queue layout, it's a chance to fix a
> longstanding bug: let the guest notify the host of preferred
> queue size and alignment.
Yup, we can do that.
--
Sasha.
^ permalink raw reply
* [PATCH] virtio-balloon: Trivial cleanups
From: Sasha Levin @ 2011-11-03 8:20 UTC (permalink / raw)
To: linux-kernel; +Cc: virtualization, Sasha Levin, Michael S. Tsirkin
Trivial changes to remove forgotten junk, format comments, and correct names.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
drivers/virtio/virtio_balloon.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index e058ace..289e998 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -1,4 +1,5 @@
-/* Virtio balloon implementation, inspired by Dor Loar and Marcelo
+/*
+ * Virtio balloon implementation, inspired by Dor Laor and Marcelo
* Tosatti's implementations.
*
* Copyright 2008 Rusty Russell IBM Corporation
@@ -17,7 +18,7 @@
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
-//#define DEBUG
+
#include <linux/virtio.h>
#include <linux/virtio_balloon.h>
#include <linux/swap.h>
@@ -148,7 +149,6 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
vb->num_pages--;
}
-
/*
* Note that if
* virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
--
1.7.7.1
^ permalink raw reply related
* Re: [PATCH 3 of 5] virtio: support unlocked queue kick
From: Christoph Hellwig @ 2011-11-03 7:52 UTC (permalink / raw)
To: Rusty Russell
Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm,
virtualization
In-Reply-To: <c0db93dbb414ab6dbcde.1320306171@localhost6.localdomain6>
On Thu, Nov 03, 2011 at 06:12:51PM +1030, Rusty Russell wrote:
> Based on patch by Christoph for virtio_blk speedup:
Please credit it to Stefan - he also sent a pointer to his original
version in reply to the previous thread.
Also shouldn't virtqueue_kick have kerneldoc comments?
I also notices that you documented the functions bother here and in
the first patch in the headers. At least historically the kerneldoc
tools didn't parse comments at declarations, but only at the function
defintions. Did you check these actually get picked up?
^ permalink raw reply
* Re: [PATCH 4 of 5] virtio: avoid modulus operation
From: Pekka Enberg @ 2011-11-03 7:51 UTC (permalink / raw)
To: Rusty Russell
Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm,
virtualization
In-Reply-To: <05de9ee25e222c5206f3.1320306172@localhost6.localdomain6>
On Thu, Nov 3, 2011 at 9:42 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted
> in vring_new_virtqueue()).
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> ---
> drivers/virtio/virtio_ring.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
> --- a/arch/x86/lguest/boot.c
> +++ b/arch/x86/lguest/boot.c
> @@ -1420,7 +1420,7 @@ __init void lguest_init(void)
> add_preferred_console("hvc", 0, NULL);
>
> /* Register our very early console. */
> - virtio_cons_early_init(early_put_chars);
> +// virtio_cons_early_init(early_put_chars);
What's this hunk here?
^ permalink raw reply
* Re: [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf
From: Christoph Hellwig @ 2011-11-03 7:50 UTC (permalink / raw)
To: Rusty Russell
Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm,
virtualization
In-Reply-To: <07aa6e8fcfc2bcba854e.1320306170@localhost6.localdomain6>
On Thu, Nov 03, 2011 at 06:12:50PM +1030, Rusty Russell wrote:
> Remove wrapper functions. This makes the allocation type explicit in
> all callers; I used GPF_KERNEL where it seemed obvious, left it at
> GFP_ATOMIC otherwise.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply
* Re: [PATCH 1 of 5] virtio: document functions better
From: Christoph Hellwig @ 2011-11-03 7:49 UTC (permalink / raw)
To: Rusty Russell
Cc: MichaelS.Tsirkist, Christoph Hellwig, linux-kernel, kvm,
virtualization
In-Reply-To: <7ce2cceb12e0dd2a4c99.1320306169@localhost6.localdomain6>
On Thu, Nov 03, 2011 at 06:12:49PM +1030, Rusty Russell wrote:
> The old documentation is left over from when we used a structure with
> strategy pointers.
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply
* [PATCH 5 of 5] virtio: expose added descriptors immediately
From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw)
To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization
In-Reply-To: <patchbomb.1320306168@localhost6.localdomain6>
A virtio driver does virtqueue_add_buf() multiple times before finally
calling virtqueue_kick(); previously we only exposed the added buffers
in the virtqueue_kick() call. This means we don't need a memory
barrier in virtqueue_add_buf(), but it reduces concurrency as the
device (ie. host) can't see the buffers until the kick.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
drivers/virtio/virtio_ring.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -227,9 +227,15 @@ add_head:
/* Put entry in available array (but don't update avail->idx until they
* do sync). */
- avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1));
+ avail = (vq->vring.avail->idx & (vq->vring.num-1));
vq->vring.avail->ring[avail] = head;
+ /* Descriptors and available array need to be set before we expose the
+ * new available array entries. */
+ virtio_wmb();
+ vq->vring.avail->idx++;
+ vq->num_added++;
+
pr_debug("Added buffer head %i to %p\n", head, vq);
END_USE(vq);
@@ -248,13 +254,10 @@ bool virtqueue_kick_prepare(struct virtq
* new available array entries. */
virtio_wmb();
- old = vq->vring.avail->idx;
- new = vq->vring.avail->idx = old + vq->num_added;
+ old = vq->vring.avail->idx - vq->num_added;
+ new = vq->vring.avail->idx;
vq->num_added = 0;
- /* Need to update avail index before checking if we should notify */
- virtio_mb();
-
if (vq->event) {
needs_kick = vring_need_event(vring_avail_event(&vq->vring),
new, old);
^ permalink raw reply
* [PATCH 4 of 5] virtio: avoid modulus operation
From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw)
To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization
In-Reply-To: <patchbomb.1320306168@localhost6.localdomain6>
Since we know vq->vring.num is a power of 2, modulus is lazy (it's asserted
in vring_new_virtqueue()).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
drivers/virtio/virtio_ring.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1420,7 +1420,7 @@ __init void lguest_init(void)
add_preferred_console("hvc", 0, NULL);
/* Register our very early console. */
- virtio_cons_early_init(early_put_chars);
+// virtio_cons_early_init(early_put_chars);
/*
* Last of all, we set the power management poweroff hook to point to
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -226,8 +226,8 @@ add_head:
vq->data[head] = data;
/* Put entry in available array (but don't update avail->idx until they
- * do sync). FIXME: avoid modulus here? */
- avail = (vq->vring.avail->idx + vq->num_added++) % vq->vring.num;
+ * do sync). */
+ avail = ((vq->vring.avail->idx + vq->num_added++) & (vq->vring.num-1));
vq->vring.avail->ring[avail] = head;
pr_debug("Added buffer head %i to %p\n", head, vq);
@@ -317,6 +317,7 @@ void *virtqueue_get_buf(struct virtqueue
struct vring_virtqueue *vq = to_vvq(_vq);
void *ret;
unsigned int i;
+ u16 last_used;
START_USE(vq);
@@ -334,8 +335,9 @@ void *virtqueue_get_buf(struct virtqueue
/* Only get used array entries after they have been exposed by host. */
virtio_rmb();
- i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id;
- *len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len;
+ last_used = (vq->last_used_idx & (vq->vring.num - 1));
+ i = vq->vring.used->ring[last_used].id;
+ *len = vq->vring.used->ring[last_used].len;
if (unlikely(i >= vq->vring.num)) {
BAD_RING(vq, "id %u out of range\n", i);
^ permalink raw reply
* [PATCH 3 of 5] virtio: support unlocked queue kick
From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw)
To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization
In-Reply-To: <patchbomb.1320306168@localhost6.localdomain6>
Based on patch by Christoph for virtio_blk speedup:
Split virtqueue_kick to be able to do the actual notification
outside the lock protecting the virtqueue. This patch was
originally done by Stefan Hajnoczi, but I can't find the
original one anymore and had to recreated it from memory.
Pointers to the original or corrections for the commit message
are welcome.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -237,10 +237,12 @@ add_head:
}
EXPORT_SYMBOL_GPL(virtqueue_add_buf);
-void virtqueue_kick(struct virtqueue *_vq)
+bool virtqueue_kick_prepare(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);
u16 new, old;
+ bool needs_kick;
+
START_USE(vq);
/* Descriptors and available array need to be set before we expose the
* new available array entries. */
@@ -253,13 +255,30 @@ void virtqueue_kick(struct virtqueue *_v
/* Need to update avail index before checking if we should notify */
virtio_mb();
- if (vq->event ?
- vring_need_event(vring_avail_event(&vq->vring), new, old) :
- !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY))
- /* Prod other side to tell it about changes. */
- vq->notify(&vq->vq);
+ if (vq->event) {
+ needs_kick = vring_need_event(vring_avail_event(&vq->vring),
+ new, old);
+ } else {
+ needs_kick = (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY));
+ }
+ END_USE(vq);
+ return needs_kick;
+}
+EXPORT_SYMBOL_GPL(virtqueue_kick_prepare);
- END_USE(vq);
+void virtqueue_notify(struct virtqueue *_vq)
+{
+ struct vring_virtqueue *vq = to_vvq(_vq);
+
+ /* Prod other side to tell it about changes. */
+ vq->notify(_vq);
+}
+EXPORT_SYMBOL_GPL(virtqueue_notify);
+
+void virtqueue_kick(struct virtqueue *vq)
+{
+ if (virtqueue_kick_prepare(vq))
+ virtqueue_notify(vq);
}
EXPORT_SYMBOL_GPL(virtqueue_kick);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -62,6 +62,27 @@ int virtqueue_add_buf(struct virtqueue *
void virtqueue_kick(struct virtqueue *vq);
/**
+ * virtqueue_kick_prepare - first half of split virtqueue_kick call.
+ * @vq: the struct virtqueue
+ *
+ * Instead of virtqueue_kick(), you can do:
+ * if (virtqueue_kick_prepare(vq))
+ * virtqueue_notify(vq);
+ *
+ * This is sometimes useful because the virtqueue_kick_prepare() needs
+ * to be serialized, but the actual virtqueue_notify() call does not.
+ */
+bool virtqueue_kick_prepare(struct virtqueue *vq);
+
+/**
+ * virtqueue_notify - second half of split virtqueue_kick call.
+ * @vq: the struct virtqueue
+ *
+ * This does not need to be serialized.
+ */
+void virtqueue_notify(struct virtqueue *vq);
+
+/**
* virtqueue_get_buf - get the next used buffer
* @vq: the struct virtqueue we're talking about.
* @len: the length written into the buffer
^ permalink raw reply
* [PATCH 2 of 5] virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf
From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw)
To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization
In-Reply-To: <patchbomb.1320306168@localhost6.localdomain6>
Remove wrapper functions. This makes the allocation type explicit in
all callers; I used GPF_KERNEL where it seemed obvious, left it at
GFP_ATOMIC otherwise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -171,7 +171,7 @@ static bool do_req(struct request_queue
}
}
- if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr) < 0) {
+ if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
mempool_free(vbr, vblk->pool);
return false;
}
diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -46,7 +46,7 @@ static void register_buffer(u8 *buf, siz
sg_init_one(&sg, buf, size);
/* There should always be room for one buffer. */
- if (virtqueue_add_buf(vq, &sg, 0, 1, buf) < 0)
+ if (virtqueue_add_buf(vq, &sg, 0, 1, buf, GFP_KERNEL) < 0)
BUG();
virtqueue_kick(vq);
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -391,7 +391,7 @@ static int add_inbuf(struct virtqueue *v
sg_init_one(sg, buf->buf, buf->size);
- ret = virtqueue_add_buf(vq, sg, 0, 1, buf);
+ ret = virtqueue_add_buf(vq, sg, 0, 1, buf, GFP_ATOMIC);
virtqueue_kick(vq);
return ret;
}
@@ -456,7 +456,7 @@ static ssize_t __send_control_msg(struct
vq = portdev->c_ovq;
sg_init_one(sg, &cpkt, sizeof(cpkt));
- if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt) >= 0) {
+ if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt, GFP_ATOMIC) >= 0) {
virtqueue_kick(vq);
while (!virtqueue_get_buf(vq, &len))
cpu_relax();
@@ -505,7 +505,7 @@ static ssize_t send_buf(struct port *por
reclaim_consumed_buffers(port);
sg_init_one(sg, in_buf, in_count);
- ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf);
+ ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf, GFP_ATOMIC);
/* Tell Host to go! */
virtqueue_kick(out_vq);
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -368,7 +368,7 @@ static int add_recvbuf_small(struct virt
skb_to_sgvec(skb, vi->rx_sg + 1, 0, skb->len);
- err = virtqueue_add_buf_gfp(vi->rvq, vi->rx_sg, 0, 2, skb, gfp);
+ err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 2, skb, gfp);
if (err < 0)
dev_kfree_skb(skb);
@@ -413,8 +413,8 @@ static int add_recvbuf_big(struct virtne
/* chain first in list head */
first->private = (unsigned long)list;
- err = virtqueue_add_buf_gfp(vi->rvq, vi->rx_sg, 0, MAX_SKB_FRAGS + 2,
- first, gfp);
+ err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, MAX_SKB_FRAGS + 2,
+ first, gfp);
if (err < 0)
give_pages(vi, first);
@@ -432,7 +432,7 @@ static int add_recvbuf_mergeable(struct
sg_init_one(vi->rx_sg, page_address(page), PAGE_SIZE);
- err = virtqueue_add_buf_gfp(vi->rvq, vi->rx_sg, 0, 1, page, gfp);
+ err = virtqueue_add_buf(vi->rvq, vi->rx_sg, 0, 1, page, gfp);
if (err < 0)
give_pages(vi, page);
@@ -601,7 +601,7 @@ static int xmit_skb(struct virtnet_info
hdr->num_sg = skb_to_sgvec(skb, vi->tx_sg + 1, 0, skb->len) + 1;
return virtqueue_add_buf(vi->svq, vi->tx_sg, hdr->num_sg,
- 0, skb);
+ 0, skb, GFP_ATOMIC);
}
static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -754,7 +754,7 @@ static bool virtnet_send_command(struct
sg_set_buf(&sg[i + 1], sg_virt(s), s->length);
sg_set_buf(&sg[out + in - 1], &status, sizeof(status));
- BUG_ON(virtqueue_add_buf(vi->cvq, sg, out, in, vi) < 0);
+ BUG_ON(virtqueue_add_buf(vi->cvq, sg, out, in, vi, GFP_ATOMIC) < 0);
virtqueue_kick(vi->cvq);
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -86,7 +86,7 @@ static void tell_host(struct virtio_ball
init_completion(&vb->acked);
/* We should always be able to add one buffer to an empty queue. */
- if (virtqueue_add_buf(vq, &sg, 1, 0, vb) < 0)
+ if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
BUG();
virtqueue_kick(vq);
@@ -219,7 +219,7 @@ static void stats_handle_request(struct
vq = vb->stats_vq;
sg_init_one(&sg, vb->stats, sizeof(vb->stats));
- if (virtqueue_add_buf(vq, &sg, 1, 0, vb) < 0)
+ if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
BUG();
virtqueue_kick(vq);
}
@@ -312,7 +312,8 @@ static int virtballoon_probe(struct virt
* use it to signal us later.
*/
sg_init_one(&sg, vb->stats, sizeof vb->stats);
- if (virtqueue_add_buf(vb->stats_vq, &sg, 1, 0, vb) < 0)
+ if (virtqueue_add_buf(vb->stats_vq, &sg, 1, 0, vb, GFP_KERNEL)
+ < 0)
BUG();
virtqueue_kick(vb->stats_vq);
}
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -159,12 +159,12 @@ static int vring_add_indirect(struct vri
return head;
}
-int virtqueue_add_buf_gfp(struct virtqueue *_vq,
- struct scatterlist sg[],
- unsigned int out,
- unsigned int in,
- void *data,
- gfp_t gfp)
+int virtqueue_add_buf(struct virtqueue *_vq,
+ struct scatterlist sg[],
+ unsigned int out,
+ unsigned int in,
+ void *data,
+ gfp_t gfp)
{
struct vring_virtqueue *vq = to_vvq(_vq);
unsigned int i, avail, uninitialized_var(prev);
@@ -235,7 +235,7 @@ add_head:
return vq->num_free;
}
-EXPORT_SYMBOL_GPL(virtqueue_add_buf_gfp);
+EXPORT_SYMBOL_GPL(virtqueue_add_buf);
void virtqueue_kick(struct virtqueue *_vq)
{
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -26,7 +26,7 @@ struct virtqueue {
};
/**
- * virtqueue_add_buf_gfp - expose buffer to other end
+ * virtqueue_add_buf - expose buffer to other end
* @vq: the struct virtqueue we're talking about.
* @sg: the description of the buffer(s).
* @out_num: the number of sg readable by other side
@@ -42,27 +42,18 @@ struct virtqueue {
* positive return values as "available": indirect buffers mean that
* we can put an entire sg[] array inside a single queue entry.
*/
-int virtqueue_add_buf_gfp(struct virtqueue *vq,
- struct scatterlist sg[],
- unsigned int out_num,
- unsigned int in_num,
- void *data,
- gfp_t gfp);
-
-static inline int virtqueue_add_buf(struct virtqueue *vq,
- struct scatterlist sg[],
- unsigned int out_num,
- unsigned int in_num,
- void *data)
-{
- return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC);
-}
+int virtqueue_add_buf(struct virtqueue *vq,
+ struct scatterlist sg[],
+ unsigned int out_num,
+ unsigned int in_num,
+ void *data,
+ gfp_t gfp);
/**
* virtqueue_kick - update after add_buf
* @vq: the struct virtqueue
*
- * After one or more virtqueue_add_buf_gfp calls, invoke this to kick
+ * After one or more virtqueue_add_buf calls, invoke this to kick
* the other side.
*
* Caller must ensure we don't call this with other virtqueue
@@ -84,7 +75,7 @@ void virtqueue_kick(struct virtqueue *vq
* operations at the same time (except where noted).
*
* Returns NULL if there are no used buffers, or the "data" token
- * handed to virtqueue_add_buf_gfp().
+ * handed to virtqueue_add_buf().
*/
void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len);
@@ -131,7 +122,7 @@ bool virtqueue_enable_cb_delayed(struct
* virtqueue_detach_unused_buf - detach first unused buffer
* @vq: the struct virtqueue we're talking about.
*
- * Returns NULL or the "data" token handed to virtqueue_add_buf_gfp().
+ * Returns NULL or the "data" token handed to virtqueue_add_buf().
* This is not valid on an active queue; it is useful only for device
* shutdown.
*/
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -270,7 +270,8 @@ req_retry:
in = pack_sg_list(chan->sg, out,
VIRTQUEUE_NUM, req->rc->sdata, req->rc->capacity);
- err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc);
+ err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc,
+ GFP_ATOMIC);
if (err < 0) {
if (err == -ENOSPC) {
chan->ring_bufs_avail = 0;
@@ -413,7 +414,8 @@ req_retry_pinned:
in += pack_sg_list_p(chan->sg, out + in, VIRTQUEUE_NUM,
in_pages, in_nr_pages, uidata, inlen);
- err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc);
+ err = virtqueue_add_buf(chan->vq, chan->sg, out, in, req->tc,
+ GFP_ATOMIC);
if (err < 0) {
if (err == -ENOSPC) {
chan->ring_bufs_avail = 0;
diff --git a/tools/virtio/linux/virtio.h b/tools/virtio/linux/virtio.h
--- a/tools/virtio/linux/virtio.h
+++ b/tools/virtio/linux/virtio.h
@@ -186,21 +186,12 @@ struct virtqueue {
#endif
/* Interfaces exported by virtio_ring. */
-int virtqueue_add_buf_gfp(struct virtqueue *vq,
- struct scatterlist sg[],
- unsigned int out_num,
- unsigned int in_num,
- void *data,
- gfp_t gfp);
-
-static inline int virtqueue_add_buf(struct virtqueue *vq,
- struct scatterlist sg[],
- unsigned int out_num,
- unsigned int in_num,
- void *data)
-{
- return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC);
-}
+int virtqueue_add_buf(struct virtqueue *vq,
+ struct scatterlist sg[],
+ unsigned int out_num,
+ unsigned int in_num,
+ void *data,
+ gfp_t gfp);
void virtqueue_kick(struct virtqueue *vq);
diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -160,7 +160,8 @@ static void run_test(struct vdev_info *d
if (started < bufs) {
sg_init_one(&sl, dev->buf, dev->buf_size);
r = virtqueue_add_buf(vq->vq, &sl, 1, 0,
- dev->buf + started);
+ dev->buf + started,
+ GFP_ATOMIC);
if (likely(r >= 0)) {
++started;
virtqueue_kick(vq->vq);
^ permalink raw reply
* [PATCH 1 of 5] virtio: document functions better
From: Rusty Russell @ 2011-11-03 7:42 UTC (permalink / raw)
To: MichaelS.Tsirkist, Christoph Hellwig; +Cc: linux-kernel, kvm, virtualization
In-Reply-To: <patchbomb.1320306168@localhost6.localdomain6>
The old documentation is left over from when we used a structure with
strategy pointers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
include/linux/virtio.h | 130 ++++++++++++++++++++++++++++++++-----------------
1 file changed, 87 insertions(+), 43 deletions(-)
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -26,52 +26,22 @@ struct virtqueue {
};
/**
- * operations for virtqueue
- * virtqueue_add_buf: expose buffer to other end
- * vq: the struct virtqueue we're talking about.
- * sg: the description of the buffer(s).
- * out_num: the number of sg readable by other side
- * in_num: the number of sg which are writable (after readable ones)
- * data: the token identifying the buffer.
- * gfp: how to do memory allocations (if necessary).
- * Returns remaining capacity of queue (sg segments) or a negative error.
- * virtqueue_kick: update after add_buf
- * vq: the struct virtqueue
- * After one or more add_buf calls, invoke this to kick the other side.
- * virtqueue_get_buf: get the next used buffer
- * vq: the struct virtqueue we're talking about.
- * len: the length written into the buffer
- * Returns NULL or the "data" token handed to add_buf.
- * virtqueue_disable_cb: disable callbacks
- * vq: the struct virtqueue we're talking about.
- * Note that this is not necessarily synchronous, hence unreliable and only
- * useful as an optimization.
- * virtqueue_enable_cb: restart callbacks after disable_cb.
- * vq: the struct virtqueue we're talking about.
- * This re-enables callbacks; it returns "false" if there are pending
- * buffers in the queue, to detect a possible race between the driver
- * checking for more work, and enabling callbacks.
- * virtqueue_enable_cb_delayed: restart callbacks after disable_cb.
- * vq: the struct virtqueue we're talking about.
- * This re-enables callbacks but hints to the other side to delay
- * interrupts until most of the available buffers have been processed;
- * it returns "false" if there are many pending buffers in the queue,
- * to detect a possible race between the driver checking for more work,
- * and enabling callbacks.
- * virtqueue_detach_unused_buf: detach first unused buffer
- * vq: the struct virtqueue we're talking about.
- * Returns NULL or the "data" token handed to add_buf
- * virtqueue_get_vring_size: return the size of the virtqueue's vring
- * vq: the struct virtqueue containing the vring of interest.
- * Returns the size of the vring.
+ * virtqueue_add_buf_gfp - expose buffer to other end
+ * @vq: the struct virtqueue we're talking about.
+ * @sg: the description of the buffer(s).
+ * @out_num: the number of sg readable by other side
+ * @in_num: the number of sg which are writable (after readable ones)
+ * @data: the token identifying the buffer.
+ * @gfp: how to do memory allocations (if necessary).
*
- * Locking rules are straightforward: the driver is responsible for
- * locking. No two operations may be invoked simultaneously, with the exception
- * of virtqueue_disable_cb.
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
*
- * All operations can be called in any context.
+ * Returns remaining capacity of queue or a negative error
+ * (ie. ENOSPC). Note that it only really makes sense to treat all
+ * positive return values as "available": indirect buffers mean that
+ * we can put an entire sg[] array inside a single queue entry.
*/
-
int virtqueue_add_buf_gfp(struct virtqueue *vq,
struct scatterlist sg[],
unsigned int out_num,
@@ -88,18 +58,92 @@ static inline int virtqueue_add_buf(stru
return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC);
}
+/**
+ * virtqueue_kick - update after add_buf
+ * @vq: the struct virtqueue
+ *
+ * After one or more virtqueue_add_buf_gfp calls, invoke this to kick
+ * the other side.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
void virtqueue_kick(struct virtqueue *vq);
+/**
+ * virtqueue_get_buf - get the next used buffer
+ * @vq: the struct virtqueue we're talking about.
+ * @len: the length written into the buffer
+ *
+ * If the driver wrote data into the buffer, @len will be set to the
+ * amount written. This means you don't need to clear the buffer
+ * beforehand to ensure there's no data leakage in the case of short
+ * writes.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ *
+ * Returns NULL if there are no used buffers, or the "data" token
+ * handed to virtqueue_add_buf_gfp().
+ */
void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len);
+/**
+ * virtqueue_disable_cb - disable callbacks
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * Note that this is not necessarily synchronous, hence unreliable and only
+ * useful as an optimization.
+ *
+ * Unlike other operations, this need not be serialized.
+ */
void virtqueue_disable_cb(struct virtqueue *vq);
+/**
+ * virtqueue_enable_cb - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks; it returns "false" if there are pending
+ * buffers in the queue, to detect a possible race between the driver
+ * checking for more work, and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
bool virtqueue_enable_cb(struct virtqueue *vq);
+/**
+ * virtqueue_enable_cb_delayed - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks but hints to the other side to delay
+ * interrupts until most of the available buffers have been processed;
+ * it returns "false" if there are many pending buffers in the queue,
+ * to detect a possible race between the driver checking for more work,
+ * and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
+/**
+ * virtqueue_detach_unused_buf - detach first unused buffer
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * Returns NULL or the "data" token handed to virtqueue_add_buf_gfp().
+ * This is not valid on an active queue; it is useful only for device
+ * shutdown.
+ */
void *virtqueue_detach_unused_buf(struct virtqueue *vq);
+/**
+ * virtqueue_get_vring_size - return the size of the virtqueue's vring
+ * @vq: the struct virtqueue containing the vring of interest.
+ *
+ * Returns the size of the vring. This is mainly used for boasting to
+ * userspace. Unlike other operations, this need not be serialized.
+ */
unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
/**
^ permalink raw reply
* [PATCH 1/5] virtio: document functions better.
From: Rusty Russell @ 2011-11-03 7:10 UTC (permalink / raw)
To: virtualization; +Cc: Christoph Hellwig, linux-kernel, kvm, Michael S. Tsirkin
The old documentation is left over from when we used a structure with
strategy pointers.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
include/linux/virtio.h | 130 ++++++++++++++++++++++++++++++++-----------------
1 file changed, 87 insertions(+), 43 deletions(-)
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -26,52 +26,22 @@ struct virtqueue {
};
/**
- * operations for virtqueue
- * virtqueue_add_buf: expose buffer to other end
- * vq: the struct virtqueue we're talking about.
- * sg: the description of the buffer(s).
- * out_num: the number of sg readable by other side
- * in_num: the number of sg which are writable (after readable ones)
- * data: the token identifying the buffer.
- * gfp: how to do memory allocations (if necessary).
- * Returns remaining capacity of queue (sg segments) or a negative error.
- * virtqueue_kick: update after add_buf
- * vq: the struct virtqueue
- * After one or more add_buf calls, invoke this to kick the other side.
- * virtqueue_get_buf: get the next used buffer
- * vq: the struct virtqueue we're talking about.
- * len: the length written into the buffer
- * Returns NULL or the "data" token handed to add_buf.
- * virtqueue_disable_cb: disable callbacks
- * vq: the struct virtqueue we're talking about.
- * Note that this is not necessarily synchronous, hence unreliable and only
- * useful as an optimization.
- * virtqueue_enable_cb: restart callbacks after disable_cb.
- * vq: the struct virtqueue we're talking about.
- * This re-enables callbacks; it returns "false" if there are pending
- * buffers in the queue, to detect a possible race between the driver
- * checking for more work, and enabling callbacks.
- * virtqueue_enable_cb_delayed: restart callbacks after disable_cb.
- * vq: the struct virtqueue we're talking about.
- * This re-enables callbacks but hints to the other side to delay
- * interrupts until most of the available buffers have been processed;
- * it returns "false" if there are many pending buffers in the queue,
- * to detect a possible race between the driver checking for more work,
- * and enabling callbacks.
- * virtqueue_detach_unused_buf: detach first unused buffer
- * vq: the struct virtqueue we're talking about.
- * Returns NULL or the "data" token handed to add_buf
- * virtqueue_get_vring_size: return the size of the virtqueue's vring
- * vq: the struct virtqueue containing the vring of interest.
- * Returns the size of the vring.
+ * virtqueue_add_buf_gfp - expose buffer to other end
+ * @vq: the struct virtqueue we're talking about.
+ * @sg: the description of the buffer(s).
+ * @out_num: the number of sg readable by other side
+ * @in_num: the number of sg which are writable (after readable ones)
+ * @data: the token identifying the buffer.
+ * @gfp: how to do memory allocations (if necessary).
*
- * Locking rules are straightforward: the driver is responsible for
- * locking. No two operations may be invoked simultaneously, with the exception
- * of virtqueue_disable_cb.
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
*
- * All operations can be called in any context.
+ * Returns remaining capacity of queue or a negative error
+ * (ie. ENOSPC). Note that it only really makes sense to treat all
+ * positive return values as "available": indirect buffers mean that
+ * we can put an entire sg[] array inside a single queue entry.
*/
-
int virtqueue_add_buf_gfp(struct virtqueue *vq,
struct scatterlist sg[],
unsigned int out_num,
@@ -88,18 +58,92 @@ static inline int virtqueue_add_buf(stru
return virtqueue_add_buf_gfp(vq, sg, out_num, in_num, data, GFP_ATOMIC);
}
+/**
+ * virtqueue_kick - update after add_buf
+ * @vq: the struct virtqueue
+ *
+ * After one or more virtqueue_add_buf_gfp calls, invoke this to kick
+ * the other side.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
void virtqueue_kick(struct virtqueue *vq);
+/**
+ * virtqueue_get_buf - get the next used buffer
+ * @vq: the struct virtqueue we're talking about.
+ * @len: the length written into the buffer
+ *
+ * If the driver wrote data into the buffer, @len will be set to the
+ * amount written. This means you don't need to clear the buffer
+ * beforehand to ensure there's no data leakage in the case of short
+ * writes.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ *
+ * Returns NULL if there are no used buffers, or the "data" token
+ * handed to virtqueue_add_buf_gfp().
+ */
void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len);
+/**
+ * virtqueue_disable_cb - disable callbacks
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * Note that this is not necessarily synchronous, hence unreliable and only
+ * useful as an optimization.
+ *
+ * Unlike other operations, this need not be serialized.
+ */
void virtqueue_disable_cb(struct virtqueue *vq);
+/**
+ * virtqueue_enable_cb - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks; it returns "false" if there are pending
+ * buffers in the queue, to detect a possible race between the driver
+ * checking for more work, and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
bool virtqueue_enable_cb(struct virtqueue *vq);
+/**
+ * virtqueue_enable_cb_delayed - restart callbacks after disable_cb.
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * This re-enables callbacks but hints to the other side to delay
+ * interrupts until most of the available buffers have been processed;
+ * it returns "false" if there are many pending buffers in the queue,
+ * to detect a possible race between the driver checking for more work,
+ * and enabling callbacks.
+ *
+ * Caller must ensure we don't call this with other virtqueue
+ * operations at the same time (except where noted).
+ */
bool virtqueue_enable_cb_delayed(struct virtqueue *vq);
+/**
+ * virtqueue_detach_unused_buf - detach first unused buffer
+ * @vq: the struct virtqueue we're talking about.
+ *
+ * Returns NULL or the "data" token handed to virtqueue_add_buf_gfp().
+ * This is not valid on an active queue; it is useful only for device
+ * shutdown.
+ */
void *virtqueue_detach_unused_buf(struct virtqueue *vq);
+/**
+ * virtqueue_get_vring_size - return the size of the virtqueue's vring
+ * @vq: the struct virtqueue containing the vring of interest.
+ *
+ * Returns the size of the vring. This is mainly used for boasting to
+ * userspace. Unlike other operations, this need not be serialized.
+ */
unsigned int virtqueue_get_vring_size(struct virtqueue *vq);
/**
^ permalink raw reply
* Re: virtio-pci new configuration proposal
From: Rusty Russell @ 2011-11-03 1:58 UTC (permalink / raw)
To: Sasha Levin, Michael S. Tsirkin
Cc: Anthony Liguori, linux-kernel, kvm, virtualization
In-Reply-To: <1320259767.22582.2.camel@lappy>
On Wed, 02 Nov 2011 20:49:27 +0200, Sasha Levin <levinsasha928@gmail.com> wrote:
> This is a proposal for a new layout of the virtio-pci config space.
>
> We will separate the current configuration into two: A virtio-pci common
> configuration and a device specific configuration. This allows more flexibility
> with adding features and makes usage easier, specifically in cases like the
> ones in virtio-net where device specific configurations depend on device
> specific features.
Thanks for this Sasha. Several general comments:
1) How to we distinguish the two layouts? In theory, we need to do this
forever. In practice we can deprecate the old layout in several
years' time.
2) I don't think we want to turn the device-specific config into a
linked list. We haven't needed variable-length config (yet!), and
it's (slightly) more complex. That's also the part of the spec which
is shared with non-PCI virtio implementations.
3) If we're changing the queue layout, it's a chance to fix a
longstanding bug: let the guest notify the host of preferred
queue size and alignment.
Cheers,
Rusty.
^ permalink raw reply
* Re: Regression: patch " hvc_console: display printk messages on console." causing infinite loop with 3.2-rc0 + Xen.
From: Greg KH @ 2011-11-03 1:30 UTC (permalink / raw)
To: Stephen Rothwell
Cc: xen-devel, Konrad Rzeszutek Wilk, Benjamin Herrenschmidt, miche,
linux-kernel, virtualization, Linus, Anton Blanchard, Amit Shah,
ppc-dev
In-Reply-To: <20111102121309.4dbb0d2ac90d2879130820dd@canb.auug.org.au>
On Wed, Nov 02, 2011 at 12:13:09PM +1100, Stephen Rothwell wrote:
> Hi Greg,
>
> On Thu, 27 Oct 2011 07:48:06 +0200 Greg KH <gregkh@suse.de> wrote:
> >
> > On Thu, Oct 27, 2011 at 01:30:08AM -0400, Konrad Rzeszutek Wilk wrote:
> > > Hey Miche.
> > >
> > > The git commit 361162459f62dc0826b82c9690a741a940f457f0:
> > >
> > > hvc_console: display printk messages on console.
> > >
> > > is causing an infinite loop when booting Linux under Xen, as so:
> >
> > Ick, not good, thanks for letting us know.
>
> Indeed. I am wondering why it was put in a tree and sent to Linus without
> any Acks or even being replied to by anyone. It appeared in the tty tree
> between Oct 14 and Oct 25 (while I was unfortunately on vacation). If
> anyone had tried to boot this on any PowerPC server, it would have been
> immediately obvious (as it was when I booted Linus' tree last night).
>
> And the original author expressed doubts as to his understanding of how
> it should all work anyway.
>
> Just a little more care, please.
>
> I would vote for reverting the original and having it resubmitted with
> corrections at some later date.
You are right, I will go do that, sorry for the problems.
greg k-h
^ permalink raw reply
* Re: [PATCH RFC 0/2] virtio-pci: polling mode support
From: Rusty Russell @ 2011-11-03 0:56 UTC (permalink / raw)
Cc: Anthony Liguori, Michael S. Tsirkin, qemu-devel, Jan Kiszka,
virtualization, Blue Swirl, Stefan Weil, Avi Kivity,
Richard Henderson
In-Reply-To: <cover.1320259840.git.mst@redhat.com>
On Wed, 2 Nov 2011 21:02:42 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> MSIX spec requires that device can be operated with
> all vectors masked, by polling.
>
> So the following patchset (lightly tested) adds this
> ability: when driver reads ISR, the device
> recalls a pending notification, and returns
> pending status in the ISR register.
>
> The polling driver can operate as follows:
> - map all VQs and config to the same vector
> - poll ISR to get status - this also flushes VQ updates to memory
> - handle config change or VQ event depending on ISR
>
> Comments?
This seems sane to me...
Cheers,
Rusty.
^ permalink raw reply
* Re: [PATCH RFC] virtio-pci: flexible configuration layout
From: Sasha Levin @ 2011-11-03 0:19 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Krishna Kumar, kvm, Pawel Moll, Wang Sheng-Hui,
Alexey Kardashevskiy, lkml - Kernel Mailing List, Jesse Barnes,
virtualization, Christian Borntraeger, Amit Shah, Linus Torvalds
In-Reply-To: <20111102233110.GA20289@redhat.com>
Hi Michael,
On Thu, 2011-11-03 at 01:31 +0200, Michael S. Tsirkin wrote:
> Add a flexible mechanism to specify virtio configuration layout, using
> pci vendor-specific capability. A separate capability is used for each
> of common, device specific and data-path accesses.
>
> Warning: compiled only.
> This patch also needs to be split up, pci_iomap changes
> also need arch updates for non-x86.
>
> We also will need to update the spec.
>
> See the first chunk for layout documentation.
>
> Posting here for early feedback.
>
> In particular:
>
> Do we need to require offset to be aligned?
> Does iowrite16 work with unaligned accesses on all architectures?
> Does using ioread/write as we do add overhead as compared to
> plain PIO accesses?
>
> Jesse - are you OK with the pci_iomap_range API proposed here
> (see last chunks)?
> I noticed lots of architectures duplicate the implementation
> of pci_iomap - makes sense to clean that up?
>
> Thanks,
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>
> ---
>
> diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
> index ea66f3f..b8a9de2 100644
> --- a/include/linux/virtio_pci.h
> +++ b/include/linux/virtio_pci.h
> @@ -92,4 +92,38 @@
> /* The alignment to use between consumer and producer parts of vring.
> * x86 pagesize again. */
> #define VIRTIO_PCI_VRING_ALIGN 4096
> +
> +/*
> + * Layout for Virtio PCI vendor specific capability (little-endian):
> + * 5 bit virtio capability id.
> + * 3 bit BAR index register, specifying which BAR to use.
> + * 4 byte cfg offset within the BAR.
Why not make the 3 bits of the BIR as the lower 3 bits of the offset
like its done in the PCI spec?
This way you also get 'free' 32bit alignment and simpler masking.
> + * 4 byte cfg size.
> + */
> +
> +/* A single virtio device has multiple vendor specific capabilities, we use the
> + * 5 bit ID field to distinguish between these. */
> +#define VIRTIO_PCI_CAP_ID 3
> +#define VIRTIO_PCI_CAP_ID_MASK 0x1f
> +#define VIRTIO_PCI_CAP_ID_SHIFT 0
> +
> +/* IDs for different capabilities. If a specific configuration
> + * is missing, legacy PIO path is used. */
> +/* Common configuration */
> +#define VIRTIO_PCI_CAP_COMMON_CFG 0
> +/* Device specific confiuration */
> +#define VIRTIO_PCI_CAP_DEVICE_CFG 1
> +/* Notifications and ISR access */
> +#define VIRTIO_PCI_CAP_NOTIFY_CFG 2
I think that separating notification here is very kvm specific. Maybe it
could be replaced by allowing virtio to signal eventfds or something
similar?
> +
> +/* Index of the BAR including this configuration */
> +#define VIRTIO_PCI_CAP_CFG_BIR 3
> +#define VIRTIO_PCI_CAP_CFG_BIR_MASK (0x7)
> +#define VIRTIO_PCI_CAP_CFG_BIR_SHIFT 5
> +
> +/* Offset within the BAR */
> +#define VIRTIO_PCI_CAP_CFG_OFF 4
> +/* Size of the configuration space */
> +#define VIRTIO_PCI_CAP_CFG_SIZE 8
> +
> #endif
> diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
> index 4bcc8b8..d4ed130 100644
> --- a/drivers/virtio/virtio_pci.c
> +++ b/drivers/virtio/virtio_pci.c
> @@ -37,8 +37,12 @@ struct virtio_pci_device
> struct virtio_device vdev;
> struct pci_dev *pci_dev;
>
> - /* the IO mapping for the PCI config space */
> + /* the IO address for the common PCI config space */
> void __iomem *ioaddr;
> + /* the IO address for device specific config */
> + void __iomem *ioaddr_device;
> + /* the IO address to use for notifications operations */
> + void __iomem *ioaddr_notify;
>
> /* a list of queues so we can dispatch IRQs */
> spinlock_t lock;
> @@ -57,8 +61,142 @@ struct virtio_pci_device
> unsigned msix_used_vectors;
> /* Whether we have vector per vq */
> bool per_vq_vectors;
> +
> + /* Various IO mappings: used for resource tracking only. */
> +
> + /* Legacy BAR0: typically PIO. */
> + void __iomem *legacy_map;
> +
> + /* Mappings specified by device capabilities: typically in MMIO */
> + void __iomem *notify_map;
> + void __iomem *common_map;
> + void __iomem *device_map;
> };
>
> +/*
> + * With PIO, device-specific config moves as MSI-X is enabled/disabled.
> + * Use this accessor to keep pointer to that config in sync.
> + */
> +static void virtio_pci_set_msix_enabled(struct virtio_pci_device *vp_dev, int enabled)
> +{
> + vp_dev->msix_enabled = enabled;
> + if (vp_dev->device_map)
> + vp_dev->ioaddr_device = vp_dev->device_map;
> + else
> + vp_dev->ioaddr_device = vp_dev->legacy_map +
> + VIRTIO_PCI_CONFIG(vp_dev);
> +}
> +
> +static void __iomem *virtio_pci_map_cfg(struct virtio_pci_device *vp_dev, u8 cap_id)
> +{
> + u32 size;
> + u32 offset;
> + u8 bir;
> + u8 cap_len;
> + int pos;
> + struct pci_dev *dev = vp_dev->pci_dev;
> + void __iomem *p;
> +
> + for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
> + pos > 0;
> + pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
> + u8 id;
> + pci_read_config_byte(dev, pos + PCI_VNDR_CAP_LEN, &cap_len);
> + if (cap_len < VIRTIO_PCI_CAP_ID + 1)
> + continue;
> + pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_ID, &id);
> + id >>= VIRTIO_PCI_CAP_ID_SHIFT;
> + id &= VIRTIO_PCI_CAP_ID_MASK;
> + if (id == cap_id)
> + break;
> + }
> +
> + if (pos <= 0)
> + return NULL;
> +
> + if (cap_len < VIRTIO_PCI_CAP_CFG_BIR + 1)
> + goto err;
> + pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_CFG_BIR, &bir);
> + if (cap_len < VIRTIO_PCI_CAP_CFG_OFF + 4)
> + goto err;
> + pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_OFF, &offset);
> + if (cap_len < VIRTIO_PCI_CAP_CFG_SIZE + 4)
> + goto err;
> + pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_SIZE, &size);
> + bir >>= VIRTIO_PCI_CAP_CFG_BIR_SHIFT;
> + bir &= VIRTIO_PCI_CAP_CFG_BIR_MASK;
> +
> + /* It's possible for a device to supply a huge config space,
> + * but we'll never need to map more than a page ATM. */
> + p = pci_iomap_range(dev, bir, offset, size, PAGE_SIZE);
> + if (!p)
> + dev_err(&vp_dev->vdev.dev, "Unable to map virtio pci memory");
> + return p;
> +err:
> + dev_err(&vp_dev->vdev.dev, "Unable to parse virtio pci capability");
> + return NULL;
> +}
> +
> +static void virtio_pci_iounmap(struct virtio_pci_device *vp_dev)
> +{
> + if (vp_dev->legacy_map)
> + pci_iounmap(vp_dev->pci_dev, vp_dev->legacy_map);
> + if (vp_dev->notify_map)
> + pci_iounmap(vp_dev->pci_dev, vp_dev->notify_map);
> + if (vp_dev->common_map)
> + pci_iounmap(vp_dev->pci_dev, vp_dev->common_map);
> + if (vp_dev->device_map)
> + pci_iounmap(vp_dev->pci_dev, vp_dev->device_map);
> +}
> +
> +static int virtio_pci_iomap(struct virtio_pci_device *vp_dev)
> +{
> + vp_dev->notify_map = virtio_pci_map_cfg(vp_dev,
> + VIRTIO_PCI_CAP_NOTIFY_CFG);
> + vp_dev->common_map = virtio_pci_map_cfg(vp_dev,
> + VIRTIO_PCI_CAP_COMMON_CFG);
> + vp_dev->device_map = virtio_pci_map_cfg(vp_dev,
> + VIRTIO_PCI_CAP_DEVICE_CFG);
> +
> + if (!vp_dev->notify_map || !vp_dev->common_map ||
> + !vp_dev->device_map) {
> + /*
> + * If not all capabilities present, map legacy PIO.
> + * Legacy access is at BAR 0. We never need to map
> + * more than 256 bytes there, since legacy config space
> + * used PIO which has this size limit.
> + * */
Whats the point of putting each of those new configs into a separate cap
structure if you must have all 3 of them for the new config layout to
work anyway?
Also, why not use this change to simplify the config space? The attempts
to squeeze it into the limited PIO space made it quite messy, and this
can be a good chance to fix some things like the MSI-X and offsets
issue.
--
Sasha.
^ permalink raw reply
* [PATCH RFC] virtio-pci: flexible configuration layout
From: Michael S. Tsirkin @ 2011-11-02 23:31 UTC (permalink / raw)
To: Rusty Russell
Cc: Krishna Kumar, kvm, Pawel Moll, Wang Sheng-Hui,
Alexey Kardashevskiy, lkml - Kernel Mailing List, Jesse Barnes,
virtualization, Christian Borntraeger, Sasha Levin, Amit Shah,
Linus Torvalds
In-Reply-To: <8739e7uy87.fsf@rustcorp.com.au>
Add a flexible mechanism to specify virtio configuration layout, using
pci vendor-specific capability. A separate capability is used for each
of common, device specific and data-path accesses.
Warning: compiled only.
This patch also needs to be split up, pci_iomap changes
also need arch updates for non-x86.
We also will need to update the spec.
See the first chunk for layout documentation.
Posting here for early feedback.
In particular:
Do we need to require offset to be aligned?
Does iowrite16 work with unaligned accesses on all architectures?
Does using ioread/write as we do add overhead as compared to
plain PIO accesses?
Jesse - are you OK with the pci_iomap_range API proposed here
(see last chunks)?
I noticed lots of architectures duplicate the implementation
of pci_iomap - makes sense to clean that up?
Thanks,
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
diff --git a/include/linux/virtio_pci.h b/include/linux/virtio_pci.h
index ea66f3f..b8a9de2 100644
--- a/include/linux/virtio_pci.h
+++ b/include/linux/virtio_pci.h
@@ -92,4 +92,38 @@
/* The alignment to use between consumer and producer parts of vring.
* x86 pagesize again. */
#define VIRTIO_PCI_VRING_ALIGN 4096
+
+/*
+ * Layout for Virtio PCI vendor specific capability (little-endian):
+ * 5 bit virtio capability id.
+ * 3 bit BAR index register, specifying which BAR to use.
+ * 4 byte cfg offset within the BAR.
+ * 4 byte cfg size.
+ */
+
+/* A single virtio device has multiple vendor specific capabilities, we use the
+ * 5 bit ID field to distinguish between these. */
+#define VIRTIO_PCI_CAP_ID 3
+#define VIRTIO_PCI_CAP_ID_MASK 0x1f
+#define VIRTIO_PCI_CAP_ID_SHIFT 0
+
+/* IDs for different capabilities. If a specific configuration
+ * is missing, legacy PIO path is used. */
+/* Common configuration */
+#define VIRTIO_PCI_CAP_COMMON_CFG 0
+/* Device specific confiuration */
+#define VIRTIO_PCI_CAP_DEVICE_CFG 1
+/* Notifications and ISR access */
+#define VIRTIO_PCI_CAP_NOTIFY_CFG 2
+
+/* Index of the BAR including this configuration */
+#define VIRTIO_PCI_CAP_CFG_BIR 3
+#define VIRTIO_PCI_CAP_CFG_BIR_MASK (0x7)
+#define VIRTIO_PCI_CAP_CFG_BIR_SHIFT 5
+
+/* Offset within the BAR */
+#define VIRTIO_PCI_CAP_CFG_OFF 4
+/* Size of the configuration space */
+#define VIRTIO_PCI_CAP_CFG_SIZE 8
+
#endif
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 4bcc8b8..d4ed130 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -37,8 +37,12 @@ struct virtio_pci_device
struct virtio_device vdev;
struct pci_dev *pci_dev;
- /* the IO mapping for the PCI config space */
+ /* the IO address for the common PCI config space */
void __iomem *ioaddr;
+ /* the IO address for device specific config */
+ void __iomem *ioaddr_device;
+ /* the IO address to use for notifications operations */
+ void __iomem *ioaddr_notify;
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
@@ -57,8 +61,142 @@ struct virtio_pci_device
unsigned msix_used_vectors;
/* Whether we have vector per vq */
bool per_vq_vectors;
+
+ /* Various IO mappings: used for resource tracking only. */
+
+ /* Legacy BAR0: typically PIO. */
+ void __iomem *legacy_map;
+
+ /* Mappings specified by device capabilities: typically in MMIO */
+ void __iomem *notify_map;
+ void __iomem *common_map;
+ void __iomem *device_map;
};
+/*
+ * With PIO, device-specific config moves as MSI-X is enabled/disabled.
+ * Use this accessor to keep pointer to that config in sync.
+ */
+static void virtio_pci_set_msix_enabled(struct virtio_pci_device *vp_dev, int enabled)
+{
+ vp_dev->msix_enabled = enabled;
+ if (vp_dev->device_map)
+ vp_dev->ioaddr_device = vp_dev->device_map;
+ else
+ vp_dev->ioaddr_device = vp_dev->legacy_map +
+ VIRTIO_PCI_CONFIG(vp_dev);
+}
+
+static void __iomem *virtio_pci_map_cfg(struct virtio_pci_device *vp_dev, u8 cap_id)
+{
+ u32 size;
+ u32 offset;
+ u8 bir;
+ u8 cap_len;
+ int pos;
+ struct pci_dev *dev = vp_dev->pci_dev;
+ void __iomem *p;
+
+ for (pos = pci_find_capability(dev, PCI_CAP_ID_VNDR);
+ pos > 0;
+ pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_VNDR)) {
+ u8 id;
+ pci_read_config_byte(dev, pos + PCI_VNDR_CAP_LEN, &cap_len);
+ if (cap_len < VIRTIO_PCI_CAP_ID + 1)
+ continue;
+ pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_ID, &id);
+ id >>= VIRTIO_PCI_CAP_ID_SHIFT;
+ id &= VIRTIO_PCI_CAP_ID_MASK;
+ if (id == cap_id)
+ break;
+ }
+
+ if (pos <= 0)
+ return NULL;
+
+ if (cap_len < VIRTIO_PCI_CAP_CFG_BIR + 1)
+ goto err;
+ pci_read_config_byte(dev, pos + VIRTIO_PCI_CAP_CFG_BIR, &bir);
+ if (cap_len < VIRTIO_PCI_CAP_CFG_OFF + 4)
+ goto err;
+ pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_OFF, &offset);
+ if (cap_len < VIRTIO_PCI_CAP_CFG_SIZE + 4)
+ goto err;
+ pci_read_config_dword(dev, pos + VIRTIO_PCI_CAP_CFG_SIZE, &size);
+ bir >>= VIRTIO_PCI_CAP_CFG_BIR_SHIFT;
+ bir &= VIRTIO_PCI_CAP_CFG_BIR_MASK;
+
+ /* It's possible for a device to supply a huge config space,
+ * but we'll never need to map more than a page ATM. */
+ p = pci_iomap_range(dev, bir, offset, size, PAGE_SIZE);
+ if (!p)
+ dev_err(&vp_dev->vdev.dev, "Unable to map virtio pci memory");
+ return p;
+err:
+ dev_err(&vp_dev->vdev.dev, "Unable to parse virtio pci capability");
+ return NULL;
+}
+
+static void virtio_pci_iounmap(struct virtio_pci_device *vp_dev)
+{
+ if (vp_dev->legacy_map)
+ pci_iounmap(vp_dev->pci_dev, vp_dev->legacy_map);
+ if (vp_dev->notify_map)
+ pci_iounmap(vp_dev->pci_dev, vp_dev->notify_map);
+ if (vp_dev->common_map)
+ pci_iounmap(vp_dev->pci_dev, vp_dev->common_map);
+ if (vp_dev->device_map)
+ pci_iounmap(vp_dev->pci_dev, vp_dev->device_map);
+}
+
+static int virtio_pci_iomap(struct virtio_pci_device *vp_dev)
+{
+ vp_dev->notify_map = virtio_pci_map_cfg(vp_dev,
+ VIRTIO_PCI_CAP_NOTIFY_CFG);
+ vp_dev->common_map = virtio_pci_map_cfg(vp_dev,
+ VIRTIO_PCI_CAP_COMMON_CFG);
+ vp_dev->device_map = virtio_pci_map_cfg(vp_dev,
+ VIRTIO_PCI_CAP_DEVICE_CFG);
+
+ if (!vp_dev->notify_map || !vp_dev->common_map ||
+ !vp_dev->device_map) {
+ /*
+ * If not all capabilities present, map legacy PIO.
+ * Legacy access is at BAR 0. We never need to map
+ * more than 256 bytes there, since legacy config space
+ * used PIO which has this size limit.
+ * */
+ vp_dev->legacy_map = pci_iomap(vp_dev->pci_dev, 0, 256);
+ if (!vp_dev->legacy_map) {
+ dev_err(&vp_dev->vdev.dev, "Unable to map legacy PIO");
+ goto err;
+ }
+ }
+
+ /* Prefer MMIO if available. If not, fallback to legacy PIO. */
+ if (vp_dev->common_map)
+ vp_dev->ioaddr = vp_dev->common_map;
+ else
+ vp_dev->ioaddr = vp_dev->legacy_map;
+
+ if (vp_dev->device_map)
+ vp_dev->ioaddr_device = vp_dev->device_map;
+ else
+ vp_dev->ioaddr_device = vp_dev->legacy_map +
+ VIRTIO_PCI_CONFIG(vp_dev);
+
+ if (vp_dev->notify_map)
+ vp_dev->ioaddr_notify = vp_dev->notify_map;
+ else
+ vp_dev->ioaddr_notify = vp_dev->legacy_map +
+ VIRTIO_PCI_QUEUE_NOTIFY;
+
+ return 0;
+err:
+ virtio_pci_iounmap(vp_dev);
+ return -EINVAL;
+}
+
/* Constants for MSI-X */
/* Use first vector for configuration changes, second and the rest for
* virtqueues Thus, we need at least 2 vectors for MSI. */
@@ -130,8 +268,7 @@ static void vp_get(struct virtio_device *vdev, unsigned offset,
void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
- VIRTIO_PCI_CONFIG(vp_dev) + offset;
+ void __iomem *ioaddr = vp_dev->ioaddr_device + offset;
u8 *ptr = buf;
int i;
@@ -145,8 +282,7 @@ static void vp_set(struct virtio_device *vdev, unsigned offset,
const void *buf, unsigned len)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- void __iomem *ioaddr = vp_dev->ioaddr +
- VIRTIO_PCI_CONFIG(vp_dev) + offset;
+ void __iomem *ioaddr = vp_dev->ioaddr_device + offset;
const u8 *ptr = buf;
int i;
@@ -184,7 +320,7 @@ static void vp_notify(struct virtqueue *vq)
/* we write the queue's selector into the notification register to
* signal the other end */
- iowrite16(info->queue_index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY);
+ iowrite16(info->queue_index, vp_dev->ioaddr_notify);
}
/* Handle a configuration change: Tell driver if it wants to know. */
@@ -231,7 +367,8 @@ static irqreturn_t vp_interrupt(int irq, void *opaque)
/* reading the ISR has the effect of also clearing it so it's very
* important to save off the value. */
- isr = ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
+ isr = ioread8(vp_dev->ioaddr_notify +
+ VIRTIO_PCI_ISR - VIRTIO_PCI_QUEUE_NOTIFY);
/* It's definitely not us if the ISR was not high */
if (!isr)
@@ -265,7 +402,7 @@ static void vp_free_vectors(struct virtio_device *vdev)
ioread16(vp_dev->ioaddr + VIRTIO_MSI_CONFIG_VECTOR);
pci_disable_msix(vp_dev->pci_dev);
- vp_dev->msix_enabled = 0;
+ virtio_pci_set_msix_enabled(vp_dev, 0);
vp_dev->msix_vectors = 0;
}
@@ -303,7 +440,7 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
if (err)
goto error;
vp_dev->msix_vectors = nvectors;
- vp_dev->msix_enabled = 1;
+ virtio_pci_set_msix_enabled(vp_dev, 1);
/* Set the vector used for configuration */
v = vp_dev->msix_used_vectors;
@@ -447,7 +584,10 @@ static void vp_del_vq(struct virtqueue *vq)
iowrite16(VIRTIO_MSI_NO_VECTOR,
vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
/* Flush the write out to device */
- ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
+ ioread8(vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
+ /* And clear ISR: TODO: really needed? */
+ ioread8(vp_dev->ioaddr_notify +
+ VIRTIO_PCI_ISR - VIRTIO_PCI_QUEUE_NOTIFY);
}
vring_del_virtqueue(vq);
@@ -638,8 +778,8 @@ static int __devinit virtio_pci_probe(struct pci_dev *pci_dev,
if (err)
goto out_enable_device;
- vp_dev->ioaddr = pci_iomap(pci_dev, 0, 0);
- if (vp_dev->ioaddr == NULL)
+ err = virtio_pci_iomap(vp_dev);
+ if (err)
goto out_req_regions;
pci_set_drvdata(pci_dev, vp_dev);
@@ -661,7 +801,7 @@ static int __devinit virtio_pci_probe(struct pci_dev *pci_dev,
out_set_drvdata:
pci_set_drvdata(pci_dev, NULL);
- pci_iounmap(pci_dev, vp_dev->ioaddr);
+ virtio_pci_iounmap(vp_dev);
out_req_regions:
pci_release_regions(pci_dev);
out_enable_device:
@@ -679,7 +819,7 @@ static void __devexit virtio_pci_remove(struct pci_dev *pci_dev)
vp_del_vqs(&vp_dev->vdev);
pci_set_drvdata(pci_dev, NULL);
- pci_iounmap(pci_dev, vp_dev->ioaddr);
+ virtio_pci_iounmap(vp_dev);
pci_release_regions(pci_dev);
pci_disable_device(pci_dev);
}
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 9120887..3cf1787 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -286,6 +286,10 @@ static inline void writesb(const void __iomem *addr, const void *buf, int len)
/* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
struct pci_dev;
extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
+extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
+ unsigned offset,
+ unsigned long minlen,
+ unsigned long maxlen);
static inline void pci_iounmap(struct pci_dev *dev, void __iomem *p)
{
}
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 98dcd76..6f192d4 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -70,8 +70,19 @@ extern void ioport_unmap(void __iomem *);
/* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
struct pci_dev;
extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
+extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
+ unsigned offset,
+ unsigned long minlen,
+ unsigned long maxlen);
extern void pci_iounmap(struct pci_dev *dev, void __iomem *);
#else
+static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
+ unsigned offset,
+ unsigned long minlen,
+ unsigned long maxlen)
+{
+ return NULL;
+}
struct pci_dev;
static inline void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max)
{
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index e884096..2ec9f81 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -375,6 +375,10 @@
#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */
+/* Vendor specific capability */
+#define PCI_VNDR_CAP_LEN 2 /* Capability length (8 bits), including
+ bytes: ID, NEXT and LEN itself. */
+
/* PCI Bridge Subsystem ID registers */
#define PCI_SSVID_VENDOR_ID 4 /* PCI-Bridge subsystem vendor id register */
diff --git a/lib/iomap.c b/lib/iomap.c
index 5dbcb4b..f28720e 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -243,26 +243,36 @@ EXPORT_SYMBOL(ioport_unmap);
#ifdef CONFIG_PCI
/**
- * pci_iomap - create a virtual mapping cookie for a PCI BAR
+ * pci_iomap_range - create a virtual mapping cookie for a PCI BAR
* @dev: PCI device that owns the BAR
* @bar: BAR number
- * @maxlen: length of the memory to map
+ * @offset: map memory at the given offset in BAR
+ * @minlen: min length of the memory to map
+ * @maxlen: max length of the memory to map
*
* Using this function you will get a __iomem address to your device BAR.
* You can access it using ioread*() and iowrite*(). These functions hide
* the details if this is a MMIO or PIO address space and will just do what
* you expect from them in the correct way.
*
+ * @minlen specifies the minimum length to map. We check that BAR is
+ * large enough.
* @maxlen specifies the maximum length to map. If you want to get access to
- * the complete BAR without checking for its length first, pass %0 here.
+ * the complete BAR from offset to the end, pass %0 here.
* */
-void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
+void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
+ unsigned offset,
+ unsigned long minlen,
+ unsigned long maxlen)
{
resource_size_t start = pci_resource_start(dev, bar);
resource_size_t len = pci_resource_len(dev, bar);
unsigned long flags = pci_resource_flags(dev, bar);
- if (!len || !start)
+ if (len <= offset || !start)
+ return NULL;
+ len -= offset;
+ if (len < minlen)
return NULL;
if (maxlen && len > maxlen)
len = maxlen;
@@ -277,10 +287,30 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
return NULL;
}
+/**
+ * pci_iomap - create a virtual mapping cookie for a PCI BAR
+ * @dev: PCI device that owns the BAR
+ * @bar: BAR number
+ * @maxlen: length of the memory to map
+ *
+ * Using this function you will get a __iomem address to your device BAR.
+ * You can access it using ioread*() and iowrite*(). These functions hide
+ * the details if this is a MMIO or PIO address space and will just do what
+ * you expect from them in the correct way.
+ *
+ * @maxlen specifies the maximum length to map. If you want to get access to
+ * the complete BAR without checking for its length first, pass %0 here.
+ * */
+void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
+{
+ return pci_iomap_range(dev, bar, 0, 0, maxlen);
+}
+
void pci_iounmap(struct pci_dev *dev, void __iomem * addr)
{
IO_COND(addr, /* nothing */, iounmap(addr));
}
EXPORT_SYMBOL(pci_iomap);
+EXPORT_SYMBOL(pci_iomap_range);
EXPORT_SYMBOL(pci_iounmap);
#endif /* CONFIG_PCI */
--
MST
^ permalink raw reply related
* RE: [PATCH 1/1] Staging: hv: Move the mouse driver out of staging
From: KY Srinivasan @ 2011-11-02 23:04 UTC (permalink / raw)
To: Jesper Juhl, Dan Carpenter
Cc: dmitry.torokhov@gmail.com, jkosina@suse.cz, gregkh@suse.de,
ohering@suse.com, linux-kernel@vger.kernel.org,
virtualization@lists.osdl.org, joe@perches.com,
devel@linuxdriverproject.org
In-Reply-To: <alpine.LNX.2.00.1111022353530.20903@swampdragon.chaosbits.net>
> -----Original Message-----
> From: Jesper Juhl [mailto:jj@chaosbits.net]
> Sent: Wednesday, November 02, 2011 6:59 PM
> To: Dan Carpenter
> Cc: KY Srinivasan; dmitry.torokhov@gmail.com; jkosina@suse.cz;
> gregkh@suse.de; ohering@suse.com; linux-kernel@vger.kernel.org;
> virtualization@lists.osdl.org; joe@perches.com; devel@linuxdriverproject.org
> Subject: Re: [PATCH 1/1] Staging: hv: Move the mouse driver out of staging
>
> On Sat, 29 Oct 2011, Dan Carpenter wrote:
>
> > On Sat, Oct 29, 2011 at 12:54:34AM +0200, Jesper Juhl wrote:
> > > > + default:
> > > > + pr_err("unhandled packet type %d, tid %llx len
> %d\n",
> > > > + desc->type,
> > > > + req_id,
> > > > + bytes_recvd);
> > >
> > > Why not:
> > >
> > > pr_err("unhandled packet type %d, tid %llx
> > > len %d\n", desc->type, req_id, bytes_recvd);
> >
> > Because then the printk would be messed up? Your final printed
> > string would look like:
> >
> > "unhandled packet type %d, tid %llx
> > len %d\n"
> >
> > Don't break strings up across lines because it breaks grep. If K. Y.
> > wants to put all the parameters on one line instead of three that
> > would probably be better, but in the end who cares?
> >
> Right, so I obviously "fat fingered" that and should have read my email
> once more before sending. *But* the point really was just the "put all the
> parameters on one line rather than 3" bit...
>
> As for who cares; well, I cared enough to actually read the patch and send
> a reply, and I thought we needed more reviewers... When I review something
> I comment on everything I spot from bugs to trivial stuff - then it is
> up to the recipient to pick the things they want to address from the
> reply..
I am sure I will have to re-spin this. I will take care of it then.
Regards,
K. Y
^ permalink raw reply
* Re: [PATCH 1/1] Staging: hv: Move the mouse driver out of staging
From: Jesper Juhl @ 2011-11-02 22:59 UTC (permalink / raw)
To: Dan Carpenter
Cc: K. Y. Srinivasan, dmitry.torokhov, jkosina, gregkh, ohering,
linux-kernel, virtualization, joe, devel
In-Reply-To: <20111029063240.GH14881@longonot.mountain>
On Sat, 29 Oct 2011, Dan Carpenter wrote:
> On Sat, Oct 29, 2011 at 12:54:34AM +0200, Jesper Juhl wrote:
> > > + default:
> > > + pr_err("unhandled packet type %d, tid %llx len %d\n",
> > > + desc->type,
> > > + req_id,
> > > + bytes_recvd);
> >
> > Why not:
> >
> > pr_err("unhandled packet type %d, tid %llx
> > len %d\n", desc->type, req_id, bytes_recvd);
>
> Because then the printk would be messed up? Your final printed
> string would look like:
>
> "unhandled packet type %d, tid %llx
> len %d\n"
>
> Don't break strings up across lines because it breaks grep. If K. Y.
> wants to put all the parameters on one line instead of three that
> would probably be better, but in the end who cares?
>
Right, so I obviously "fat fingered" that and should have read my email
once more before sending. *But* the point really was just the "put all the
parameters on one line rather than 3" bit...
As for who cares; well, I cared enough to actually read the patch and send
a reply, and I thought we needed more reviewers... When I review something
I comment on everything I spot from bugs to trivial stuff - then it is
up to the recipient to pick the things they want to address from the
reply..
--
Jesper Juhl <jj@chaosbits.net> http://www.chaosbits.net/
Don't top-post http://www.catb.org/jargon/html/T/top-post.html
Plain text mails only, please.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox