* Re: [RFC 7/11] virtio_pci: new, capability-aware driver.
From: Rusty Russell @ 2011-12-13 2:21 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: Sasha Levin, virtualization
In-Reply-To: <20111212182533.GB25916@redhat.com>
On Mon, 12 Dec 2011 20:25:34 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> By the way, a generic question on virtio-pci: we now have:
>
> /* virtio config->get() implementation */
> static void vp_get(struct virtio_device *vdev, unsigned offset,
> void *buf, unsigned len)
> {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> void __iomem *ioaddr = vp_dev->ioaddr +
> VIRTIO_PCI_CONFIG(vp_dev) + offset;
> u8 *ptr = buf;
> int i;
>
> for (i = 0; i < len; i++)
> ptr[i] = ioread8(ioaddr + i);
> }
>
> This means that if configuration is read while
> it is changed, we might get an inconsistent state,
> with parts of a 64 bit field coming from old
> and parts from new value.
>
> Isn't this a problem?
I don't think so; it's the caller's problem if they need to do locking.
Is there a caller which needs this?
Or am I missing something?
Rusty.
^ permalink raw reply
* Re: [PATCH RFC] virtio_net: fix refill related races
From: Rusty Russell @ 2011-12-13 2:35 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Amit Shah, netdev, Tejun Heo, linux-kernel, virtualization
In-Reply-To: <20111212115405.GB7946@redhat.com>
On Mon, 12 Dec 2011 13:54:06 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Mon, Dec 12, 2011 at 09:25:07AM +1030, Rusty Russell wrote:
> > Orthogonally, the refill-stop code is still buggy, as you noted.
>
> Sorry I don't understand how it's still buggy.
Both places where we call:
cancel_delayed_work_sync(&vi->refill);
Do not actually guarantee that vi->refill isn't running, because it
can requeue itself. A 'bool no_more_refill' field seems like the
simplest fix for this, but I don't think it's sufficient.
Tejun, is this correct? What's the correct way to synchronously stop a
delayed_work which can "schedule_delayed_work(&vi->refill, HZ/2);" on
itself?
Thanks,
Rusty.
^ permalink raw reply
* hv_storvsc driver
From: K. Y. Srinivasan @ 2011-12-13 15:58 UTC (permalink / raw)
To: gregkh, linux-kernel, devel, virtualization, ohering,
James.Bottomley, hch, linux-scsi
Some time back I sent the scsi mailing list the patch to move the
Hyper-V storage driver out of staging. This patch addresses the
review comments I have received to date from James, Greg and Christoph (I have not heard
from anybody else on the scsi mailing list). The patches addressing the review
comments have already been sent to the staging tree, and Greg has already
applied them. James, let me know what I can do to expedite the
move out of the staging tree.
Greg, do you have any issues you would want
me to address before this driver can exit staging? Let me know.
Regards,
K. Y
^ permalink raw reply
* Re: hv_storvsc driver
From: Greg KH @ 2011-12-13 16:07 UTC (permalink / raw)
To: K. Y. Srinivasan
Cc: linux-kernel, devel, virtualization, ohering, James.Bottomley,
hch, linux-scsi
In-Reply-To: <1323791916-2340-1-git-send-email-kys@microsoft.com>
On Tue, Dec 13, 2011 at 07:58:36AM -0800, K. Y. Srinivasan wrote:
>
> Sometime back I had sent the scsi mailing list, the patch to move the
> Hyper-V storage driver out of staging. This patch addresses the
> review comments I have gotten to date from James, Greg and Christoph (I have not heard
> from anybody else in the scsi mailing list). The patches addressing the review
> comments have already been sent to the staging tree and Greg has already
> applied these patches. James, Let me know what I can do to expedite the
> move from the staging tree.
>
> Greg, do you have any issues you would want
> me to address before this driver can exit staging. Let me know.
It's not up to me, it's up to the scsi maintainers, sorry.
greg k-h
^ permalink raw reply
* Re: [PATCH] xen-blkfront: Use kcalloc instead of kzalloc to allocate array
From: Konrad Rzeszutek Wilk @ 2011-12-14 18:56 UTC (permalink / raw)
To: Thomas Meyer; +Cc: xen-devel, linux-kernel, virtualization
In-Reply-To: <1322600880.1534.293.camel@localhost.localdomain>
On Tue, Nov 29, 2011 at 10:08:00PM +0100, Thomas Meyer wrote:
> The advantage of kcalloc is, that will prevent integer overflows which could
> result from the multiplication of number of elements and size and it is also
> a bit nicer to read.
>
> The semantic patch that makes this change is available
> in https://lkml.org/lkml/2011/11/25/107
>
Thomas,
I put the xen-blkfront part of the patch in my tree and dropped the cciss_scsi one.
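The overflow check that motivates the kzalloc-to-kcalloc conversion can be sketched in userspace C. This is only an illustration under stated assumptions: zalloc_array() is a hypothetical helper, not the kernel's kcalloc(), and it uses malloc()/memset() in place of the kernel allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a kcalloc-style allocator:
 * refuse the request when n * size would wrap around SIZE_MAX,
 * otherwise return zeroed memory for n elements of the given size.
 */
static void *zalloc_array(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

With an open-coded multiplication, a request such as SIZE_MAX elements of 2 bytes would silently wrap to a tiny allocation; here it is rejected before any memory is handed out.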
> Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
> ---
>
> diff -u -p a/drivers/block/cciss_scsi.c b/drivers/block/cciss_scsi.c
> --- a/drivers/block/cciss_scsi.c 2011-11-28 19:36:47.343430551 +0100
> +++ b/drivers/block/cciss_scsi.c 2011-11-28 19:49:24.922716381 +0100
> @@ -534,10 +534,10 @@ adjust_cciss_scsi_table(ctlr_info_t *h,
> int nadded, nremoved;
> struct Scsi_Host *sh = NULL;
>
> - added = kzalloc(sizeof(*added) * CCISS_MAX_SCSI_DEVS_PER_HBA,
> - GFP_KERNEL);
> - removed = kzalloc(sizeof(*removed) * CCISS_MAX_SCSI_DEVS_PER_HBA,
> + added = kcalloc(CCISS_MAX_SCSI_DEVS_PER_HBA, sizeof(*added),
> GFP_KERNEL);
> + removed = kcalloc(CCISS_MAX_SCSI_DEVS_PER_HBA, sizeof(*removed),
> + GFP_KERNEL);
>
> if (!added || !removed) {
> dev_warn(&h->pdev->dev,
> @@ -1191,8 +1191,8 @@ cciss_update_non_disk_devices(ctlr_info_
>
> ld_buff = kzalloc(reportlunsize, GFP_KERNEL);
> inq_buff = kmalloc(OBDR_TAPE_INQ_SIZE, GFP_KERNEL);
> - currentsd = kzalloc(sizeof(*currentsd) *
> - (CCISS_MAX_SCSI_DEVS_PER_HBA+1), GFP_KERNEL);
> + currentsd = kcalloc(CCISS_MAX_SCSI_DEVS_PER_HBA + 1,
> + sizeof(*currentsd), GFP_KERNEL);
> if (ld_buff == NULL || inq_buff == NULL || currentsd == NULL) {
> printk(KERN_ERR "cciss: out of memory\n");
> goto out;
> diff -u -p a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> --- a/drivers/block/xen-blkfront.c 2011-11-13 11:07:22.680095573 +0100
> +++ b/drivers/block/xen-blkfront.c 2011-11-28 19:49:29.109460410 +0100
> @@ -156,7 +156,7 @@ static int xlbd_reserve_minors(unsigned
> if (end > nr_minors) {
> unsigned long *bitmap, *old;
>
> - bitmap = kzalloc(BITS_TO_LONGS(end) * sizeof(*bitmap),
> + bitmap = kcalloc(BITS_TO_LONGS(end), sizeof(*bitmap),
> GFP_KERNEL);
> if (bitmap == NULL)
> return -ENOMEM;
^ permalink raw reply
* Re: [PATCH RFC] virtio_net: fix refill related races
From: Tejun Heo @ 2011-12-14 23:54 UTC (permalink / raw)
To: Rusty Russell
Cc: Amit Shah, netdev, virtualization, linux-kernel,
Michael S. Tsirkin
In-Reply-To: <87iplltd0g.fsf@rustcorp.com.au>
Hello, Rusty.
On Tue, Dec 13, 2011 at 01:05:11PM +1030, Rusty Russell wrote:
> Both places where we call:
>
> cancel_delayed_work_sync(&vi->refill);
>
> Do not actually guarantee that vi->refill isn't running, because it
> can requeue itself. A 'bool no_more_refill' field seems like the
> simplest fix for this, but I don't think it's sufficient.
>
> Tejun, is this correct? What's the correct way to synchronously stop a
> delayed_work which can "schedule_delayed_work(&vi->refill, HZ/2);" on
> itself?
cancel_delayed_work_sync() itself should be good enough. It first
steals the pending state and then waits for the work item to finish if
it is in-flight. Any attempt by the work item to queue itself
afterwards becomes a no-op.
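A toy, single-threaded model may help picture the "steal the pending state, then wait" behaviour. It is not kernel code: struct toy_work and the toy_*() helpers are hypothetical names, and a plain bool stands in for the work item's pending bit.

```c
#include <stdbool.h>

struct toy_work {
	bool pending;	/* models the work item's pending state */
	int runs;	/* how many times the handler ran */
	int requeues;	/* how many self-requeues succeeded */
};

/* Queueing succeeds only if nobody already owns the pending state. */
static bool toy_queue(struct toy_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Handler that always tries to requeue itself, like a refill worker. */
static void toy_handler(struct toy_work *w)
{
	w->runs++;
	if (toy_queue(w))
		w->requeues++;
}

/* Normal execution: pending is cleared before the handler runs. */
static void toy_run_once(struct toy_work *w)
{
	w->pending = false;
	toy_handler(w);
}

/*
 * Cancel: take ownership of the pending state, let an in-flight
 * handler finish (its requeue attempt fails), then release it.
 */
static void toy_cancel_sync(struct toy_work *w, bool handler_in_flight)
{
	w->pending = true;		/* steal the pending state */
	if (handler_in_flight)
		toy_handler(w);		/* requeue attempt is refused */
	w->pending = false;		/* work is now idle and not queued */
}
```

In this model a handler that runs while the canceller owns the pending state cannot requeue itself, so the work item ends up neither queued nor running once the cancel returns.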
Thanks.
--
tejun
^ permalink raw reply
* Re: [net-next RFC PATCH 0/5] Series short description
From: Ben Hutchings @ 2011-12-15 1:36 UTC (permalink / raw)
To: Rusty Russell; +Cc: krkumar2, kvm, mst, netdev, virtualization, levinsasha928
In-Reply-To: <87r50efgza.fsf@rustcorp.com.au>
On Fri, 2011-12-09 at 16:01 +1030, Rusty Russell wrote:
> On Wed, 7 Dec 2011 17:02:04 +0000, Ben Hutchings <bhutchings@solarflare.com> wrote:
> > Solarflare controllers (sfc driver) have 8192 perfect filters for
> > TCP/IPv4 and UDP/IPv4 which can be used for flow steering. (The filters
> > are organised as a hash table, but matched based on 5-tuples.) I
> > implemented the 'accelerated RFS' interface in this driver.
> >
> > I believe the Intel 82599 controllers (ixgbe driver) have both
> > hash-based and perfect filter modes and the driver can be configured to
> > use one or the other. The driver has its own independent mechanism for
> > steering RX and TX flows which predates RFS; I don't know whether it
> > uses hash-based or perfect filters.
>
> Thanks for this summary (and Jason, too). I've fallen a long way behind
> NIC state-of-the-art.
>
> > Most multi-queue controllers could support a kind of hash-based
> > filtering for TCP/IP by adjusting the RSS indirection table. However,
> > this table is usually quite small (64-256 entries). This means that
> > hash collisions will be quite common and this can result in reordering.
> > The same applies to the small table Jason has proposed for virtio-net.
>
> But this happens on real hardware today. Better that real hardware is
> nice, but is it overkill?
What do you mean, it happens on real hardware today? So far as I know,
the only cases where we have dynamic adjustment of flow steering are in
ixgbe (big table of hash filters, I think) and sfc (perfect filters).
I don't think that anyone's currently doing flow steering with the RSS
indirection table. (At least, not on Linux. I think that Microsoft was
intending to do so on Windows, but I don't know whether they ever did.)
> And can't you reorder even with perfect matching, since prior packets
> will be on the old queue and more recent ones on the new queue? Does it
> discard or requeue old ones? Or am I missing a trick?
Yes, that is possible. RFS is careful to avoid such reordering by only
changing the steering of a flow when none of its packets can be in a
software receive queue. It is not generally possible to do the same for
hardware receive queues. However, when the first condition is met it is
likely that there won't be a whole lot of packets for that flow in the
hardware receive queue either. (But if there are, then I think as a
side-effect of commit 09994d1 RFS will repeatedly ask the driver to
steer the flow. Which isn't ideal.)
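The collision concern with a small RSS indirection table can be sketched as follows. This is an illustrative model only: the table size of 128 is an assumption within the 64-256 range mentioned earlier in the thread, and rss_slot(), rss_queue() and rss_steer() are hypothetical helpers, not any driver's API.

```c
#include <stdint.h>

#define RSS_TABLE_SIZE 128u	/* assumed; the thread mentions 64-256 entries */

/* Indirection table: slot -> receive queue index. */
static uint8_t rss_table[RSS_TABLE_SIZE];

/* Slot selected by a flow hash: only the low bits of the hash matter. */
static unsigned int rss_slot(uint32_t flow_hash)
{
	return flow_hash % RSS_TABLE_SIZE;
}

/* Queue that currently receives packets of the flow with this hash. */
static unsigned int rss_queue(uint32_t flow_hash)
{
	return rss_table[rss_slot(flow_hash)];
}

/* Steer one flow by rewriting its slot; every colliding flow moves too. */
static void rss_steer(uint32_t flow_hash, uint8_t queue)
{
	rss_table[rss_slot(flow_hash)] = queue;
}
```

Two flows whose hashes differ by a multiple of the table size share a slot, so steering one of them also moves the other, which is where unintended reordering can come from.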
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH 1/2] params: <level>_initcall-like kernel parameters
From: Rusty Russell @ 2011-12-15 3:51 UTC (permalink / raw)
To: linux-kernel, virtualization; +Cc: Pawel Moll
In-Reply-To: <1323712627-17353-1-git-send-email-pawel.moll@arm.com>
On Mon, 12 Dec 2011 17:57:06 +0000, Pawel Moll <pawel.moll@arm.com> wrote:
> This patch adds a set of macros that can be used to declare
> kernel parameters to be parsed _before_ initcalls at a chosen
> level are executed. Such parameters are marked using existing
> "flags" field of the "kernel_param" structure.
>
> Linker macro collating init calls had to be modified in order
> to add additional symbols between levels that are later used
> by the init code to split the calls into blocks.
This patch wasn't quite what I was thinking, but I've realized that
we can't put the params in the .init section; your approach is probably
the best one.
Note that I've just created a series which gets rid of that silly ISBOOL
thing, so you can use the whole field for "level". Then I set the level
to -1 for the normal calls; I want to use -2 for the early calls, but
that's not done yet...
I'll rework and rebase your patch like that now.
Thanks!
Rusty.
^ permalink raw reply
* Re: [RFC 7/11] virtio_pci: new, capability-aware driver.
From: Michael S. Tsirkin @ 2011-12-15 8:27 UTC (permalink / raw)
To: Rusty Russell; +Cc: Sasha Levin, virtualization
In-Reply-To: <87liqhtdnj.fsf@rustcorp.com.au>
On Tue, Dec 13, 2011 at 12:51:20PM +1030, Rusty Russell wrote:
> On Mon, 12 Dec 2011 20:25:34 +0200, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > By the way, a generic question on virtio-pci: we now have:
> >
> > /* virtio config->get() implementation */
> > static void vp_get(struct virtio_device *vdev, unsigned offset,
> > void *buf, unsigned len)
> > {
> > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > void __iomem *ioaddr = vp_dev->ioaddr +
> > VIRTIO_PCI_CONFIG(vp_dev) + offset;
> > u8 *ptr = buf;
> > int i;
> >
> > for (i = 0; i < len; i++)
> > ptr[i] = ioread8(ioaddr + i);
> > }
> >
> > This means that if configuration is read while
> > it is changed, we might get an inconsistent state,
> > with parts of a 64 bit field coming from old
> > and parts from new value.
> >
> > Isn't this a problem?
>
> I don't think so; it's the caller's problem if they need to do locking.
> Is there a caller which needs this?
>
> Or am I missing something?
> Rusty.
I mean like this in block:
	/* Host must always specify the capacity. */
	vdev->config->get(vdev, offsetof(struct virtio_blk_config, capacity),
			  &capacity, sizeof(capacity));

	/* If capacity is too big, truncate with warning. */
	if ((sector_t)capacity != capacity) {
		dev_warn(&vdev->dev, "Capacity %llu too large: truncating\n",
			 (unsigned long long)capacity);
		capacity = (sector_t)-1;
	}
Now let's assume the capacity field is changed from 0x8000 to 0x10000
on the host. Is it possible that we read the two upper bytes
before the change, so we see 0x0000....,
and the two lower bytes after the change,
so we see 0x....0000, and the resulting capacity appears
to be 0?
If not, why not?
And what kind of locking can help?
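The torn read in question can be modelled in userspace. This is a sketch under stated assumptions, not driver code: torn_read64() is a hypothetical helper, the field is taken to be little-endian, bytes are read in ascending order as in vp_get(), and the host is assumed to replace the whole value between two single-byte reads.

```c
#include <stdint.h>

/*
 * Model of a byte-wise little-endian read of a 64-bit config field
 * whose value changes from old_val to new_val after `switch_at`
 * bytes have been read (0 = change happened before the read,
 * 8 = change happened after it).
 */
static uint64_t torn_read64(uint64_t old_val, uint64_t new_val,
			    unsigned int switch_at)
{
	uint64_t result = 0;
	unsigned int i;

	for (i = 0; i < 8; i++) {
		uint64_t src = (i < switch_at) ? old_val : new_val;
		uint64_t byte = (src >> (8 * i)) & 0xff;

		result |= byte << (8 * i);
	}
	return result;
}
```

Under these assumptions, a change from 0x8000 to 0x10000 that lands after two bytes have been read yields 0x18000, and the opposite change (0x10000 to 0x8000) at the same point yields 0. Neither result was ever a value of the field.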
--
MST
^ permalink raw reply
* Re: [RFC 3/11] pci: add pci_iomap_range
From: Michael S. Tsirkin @ 2011-12-15 8:30 UTC (permalink / raw)
To: Rusty Russell; +Cc: Sasha Levin, virtualization
In-Reply-To: <87hb1bgxpt.fsf@rustcorp.com.au>
On Thu, Dec 08, 2011 at 09:02:46PM +1030, Rusty Russell wrote:
> From: Michael S Tsirkin <mst@redhat.com>
>
> Virtio drivers should map the part of the range they need, not necessarily
> all of it.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I think that we should add a forcenocache flag.
This will let devices put the cap structure in
the prefetchable BAR. That has the advantage that
it can be located anywhere in the 2^64 space,
while non-prefetchable BARs are limited to the lower 4G
for devices behind a PCI-to-PCI bridge.
> ---
> include/asm-generic/io.h | 4 ++++
> include/asm-generic/iomap.h | 11 +++++++++++
> lib/iomap.c | 41 ++++++++++++++++++++++++++++++++++++-----
> 3 files changed, 51 insertions(+), 5 deletions(-)
>
> diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
> index 9120887..3cf1787 100644
> --- a/include/asm-generic/io.h
> +++ b/include/asm-generic/io.h
> @@ -286,6 +286,10 @@ static inline void writesb(const void __iomem *addr, const void *buf, int len)
> /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
> struct pci_dev;
> extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
> +extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
> + unsigned offset,
> + unsigned long minlen,
> + unsigned long maxlen);
> static inline void pci_iounmap(struct pci_dev *dev, void __iomem *p)
> {
> }
> diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
> index 98dcd76..6f192d4 100644
> --- a/include/asm-generic/iomap.h
> +++ b/include/asm-generic/iomap.h
> @@ -70,8 +70,19 @@ extern void ioport_unmap(void __iomem *);
> /* Create a virtual mapping cookie for a PCI BAR (memory or IO) */
> struct pci_dev;
> extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
> +extern void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
> + unsigned offset,
> + unsigned long minlen,
> + unsigned long maxlen);
> extern void pci_iounmap(struct pci_dev *dev, void __iomem *);
> #else
> +static inline void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
> + unsigned offset,
> + unsigned long minlen,
> + unsigned long maxlen)
> +{
> + return NULL;
> +}
> struct pci_dev;
> static inline void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max)
> {
> diff --git a/lib/iomap.c b/lib/iomap.c
> index 5dbcb4b..93ae915 100644
> --- a/lib/iomap.c
> +++ b/lib/iomap.c
> @@ -243,26 +243,37 @@ EXPORT_SYMBOL(ioport_unmap);
>
> #ifdef CONFIG_PCI
> /**
> - * pci_iomap - create a virtual mapping cookie for a PCI BAR
> + * pci_iomap_range - create a virtual mapping cookie for a PCI BAR
> * @dev: PCI device that owns the BAR
> * @bar: BAR number
> - * @maxlen: length of the memory to map
> + * @offset: map memory at the given offset in BAR
> + * @minlen: min length of the memory to map
> + * @maxlen: max length of the memory to map
> *
> * Using this function you will get a __iomem address to your device BAR.
> * You can access it using ioread*() and iowrite*(). These functions hide
> * the details if this is a MMIO or PIO address space and will just do what
> * you expect from them in the correct way.
> *
> + * @minlen specifies the minimum length to map. We check that BAR is
> + * large enough.
> * @maxlen specifies the maximum length to map. If you want to get access to
> - * the complete BAR without checking for its length first, pass %0 here.
> + * the complete BAR from offset to the end, pass %0 here.
> * */
> -void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
> +void __iomem *pci_iomap_range(struct pci_dev *dev, int bar,
> + unsigned offset,
> + unsigned long minlen,
> + unsigned long maxlen)
> {
> resource_size_t start = pci_resource_start(dev, bar);
> resource_size_t len = pci_resource_len(dev, bar);
> unsigned long flags = pci_resource_flags(dev, bar);
>
> - if (!len || !start)
> + if (len <= offset || !start)
> + return NULL;
> + len -= offset;
> + start += offset;
> + if (len < minlen)
> return NULL;
> if (maxlen && len > maxlen)
> len = maxlen;
> @@ -277,10 +288,30 @@ void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
> return NULL;
> }
>
> +/**
> + * pci_iomap - create a virtual mapping cookie for a PCI BAR
> + * @dev: PCI device that owns the BAR
> + * @bar: BAR number
> + * @maxlen: length of the memory to map
> + *
> + * Using this function you will get a __iomem address to your device BAR.
> + * You can access it using ioread*() and iowrite*(). These functions hide
> + * the details if this is a MMIO or PIO address space and will just do what
> + * you expect from them in the correct way.
> + *
> + * @maxlen specifies the maximum length to map. If you want to get access to
> + * the complete BAR without checking for its length first, pass %0 here.
> + * */
> +void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
> +{
> + return pci_iomap_range(dev, bar, 0, 0, maxlen);
> +}
> +
> void pci_iounmap(struct pci_dev *dev, void __iomem * addr)
> {
> IO_COND(addr, /* nothing */, iounmap(addr));
> }
> EXPORT_SYMBOL(pci_iomap);
> +EXPORT_SYMBOL(pci_iomap_range);
> EXPORT_SYMBOL(pci_iounmap);
> #endif /* CONFIG_PCI */
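The offset/minlen/maxlen handling in the quoted pci_iomap_range() can be checked in isolation with a small userspace model. This is a sketch only: iomap_range_len() is a hypothetical helper that mirrors the length arithmetic from the patch, it omits the `!start` check, and it performs no mapping.

```c
#include <stdbool.h>

/*
 * Userspace model of the length checks in the quoted pci_iomap_range().
 * bar_len stands in for pci_resource_len(); on success *map_len holds
 * the number of bytes that would be mapped starting at `offset`.
 */
static bool iomap_range_len(unsigned long bar_len, unsigned long offset,
			    unsigned long minlen, unsigned long maxlen,
			    unsigned long *map_len)
{
	unsigned long len = bar_len;

	if (len <= offset)
		return false;	/* offset lies beyond the BAR */
	len -= offset;
	if (len < minlen)
		return false;	/* not enough room after the offset */
	if (maxlen && len > maxlen)
		len = maxlen;	/* maxlen == 0 means "to the end of the BAR" */
	*map_len = len;
	return true;
}
```

For a 4096-byte BAR, an offset of 256 with maxlen 0 gives 3840 bytes, while an offset of 4090 with minlen 16 is rejected because only 6 bytes remain.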
^ permalink raw reply
* Re: [PATCH 1/2] params: <level>_initcall-like kernel parameters
From: Pawel Moll @ 2011-12-15 9:38 UTC (permalink / raw)
To: Rusty Russell
Cc: linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
In-Reply-To: <871us6sdaz.fsf@rustcorp.com.au>
Morning,
On Thu, 2011-12-15 at 03:51 +0000, Rusty Russell wrote:
> On Mon, 12 Dec 2011 17:57:06 +0000, Pawel Moll <pawel.moll@arm.com> wrote:
> > This patch adds a set of macros that can be used to declare
> > kernel parameters to be parsed _before_ initcalls at a chosen
> > level are executed. Such parameters are marked using existing
> > "flags" field of the "kernel_param" structure.
> >
> > Linker macro collating init calls had to be modified in order
> > to add additional symbols between levels that are later used
> > by the init code to split the calls into blocks.
>
> This patch wasn't quite what I was thinking, but I've realized that
> we can't put the params in the .init section, your approach is probably
> the best one.
The only way I could think of to put the parameter-parsing code in
between levels was adding new linker sections, and that sounded like
overkill...
> Note that I've just created a series which gets rid of that silly ISBOOL
> thing, so you can use the whole field for "level". Then I set the level
> to -1 for the normal calls; I want to use -2 for the early calls, but
> that's not done yet...
>
> I'll rework and rebase your patch like that now.
Cool, it's all yours, especially now that it's my last day at work this
year, so I won't be able to contribute much in the following weeks... Could
I just ask you to remember about the virtio_mmio parameters patch if you
get somewhere with this? I'll be most grateful!
Cheers!
Paweł
^ permalink raw reply
* [PATCH v5 00/11] virtio: s4 support
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
Hi,
These patches add support for S4 to virtio (pci) and all drivers.
Michael saw some race in virtio-net module removal which will need a
similar fix for the freeze code as well. I'll update the virtio-net
patch with that fix once the fix is settled upon and applied.
For each driver, all vqs are removed before hibernation, and then
re-created after restore. Some driver-specific uninit and init work
is also done in the freeze and restore functions.
All the drivers in testing work fine:
* virtio-blk is used for the only disk in the VM, IO works fine before
and after. 'dd if=/dev/zero of=/tmp/bigfile bs=1024 count=200000'
across S4 gives same sha1sum for the file in the guest as well as
one that's created without invoking S4.
* virtio-console: port IO keeps working fine before and after.
* If a port is waiting for data from the host (blocking read(2)
call), this works fine in both the cases: host-side connection is
available or unavailable after resume. In case the host-side
connection isn't available, the blocking call is terminated. If
it is available, the call continues to remain in blocked state
till further data arrives.
* virtio-net: ping remains active across S4.
* virtio-balloon: Works fine before and after. Forgets the ballooned
value across S4 (see details in commit log). Maintains ballooned
value on failed freeze.
All of these tests are run in parallel.
v5:
- Enable virtio device after the driver-specific restore/thaw
callbacks are completed.
- Balloon: Don't exit kthread on freeze. It's already frozen by the
PM API before invoking the freeze callback, so exiting is pointless.
v4:
- Disable / enable napi across S4 (Michael S. Tsirkin)
- Balloon: lots of improvements (I had neglected this driver thinking
it was a simple one, but this one needed the most thought! Check
the commit log for patch 12 for details.)
- Net, Blk: Reset device as the first operation on freeze
v3:
- Reset vqs before deleting them (Sasha Levin)
- Flush block queue before freeze (Rusty)
- Detach netdev before freeze (Michael S. Tsirkin)
v2:
- fix checkpatch errors/warnings
Amit Shah (11):
virtio: pci: switch to new PM API
virtio: pci: add PM notification handlers for restore, freeze, thaw,
poweroff
virtio: console: Move vq and vq buf removal into separate functions
virtio: console: Add freeze and restore handlers to support S4
virtio: blk: Move vq initialization to separate function
virtio: blk: Add freeze, restore handlers to support S4
virtio: net: Move vq initialization into separate function
virtio: net: Move vq and vq buf removal into separate function
virtio: net: Add freeze, restore handlers to support S4
virtio: balloon: Move vq initialization into separate function
virtio: balloon: Add freeze, restore handlers to support S4
drivers/block/virtio_blk.c | 57 ++++++++++++++++--
drivers/char/virtio_console.c | 126 ++++++++++++++++++++++++++++++---------
drivers/net/virtio_net.c | 102 +++++++++++++++++++++++--------
drivers/virtio/virtio_balloon.c | 95 ++++++++++++++++++++++++------
drivers/virtio/virtio_pci.c | 106 +++++++++++++++++++++++++++++++-
include/linux/virtio.h | 5 ++
6 files changed, 410 insertions(+), 81 deletions(-)
--
1.7.7.3
^ permalink raw reply
* [PATCH v5 01/11] virtio: pci: switch to new PM API
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
The older PM API doesn't have a way to get notifications on hibernate
events. Switch to the newer one that gives us those notifications.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/virtio/virtio_pci.c | 16 ++++++++++++----
1 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 03d1984..23e1532 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -708,19 +708,28 @@ static void __devexit virtio_pci_remove(struct pci_dev *pci_dev)
}
#ifdef CONFIG_PM
-static int virtio_pci_suspend(struct pci_dev *pci_dev, pm_message_t state)
+static int virtio_pci_suspend(struct device *dev)
{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+
pci_save_state(pci_dev);
pci_set_power_state(pci_dev, PCI_D3hot);
return 0;
}
-static int virtio_pci_resume(struct pci_dev *pci_dev)
+static int virtio_pci_resume(struct device *dev)
{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+
pci_restore_state(pci_dev);
pci_set_power_state(pci_dev, PCI_D0);
return 0;
}
+
+static const struct dev_pm_ops virtio_pci_pm_ops = {
+ .suspend = virtio_pci_suspend,
+ .resume = virtio_pci_resume,
+};
#endif
static struct pci_driver virtio_pci_driver = {
@@ -729,8 +738,7 @@ static struct pci_driver virtio_pci_driver = {
.probe = virtio_pci_probe,
.remove = __devexit_p(virtio_pci_remove),
#ifdef CONFIG_PM
- .suspend = virtio_pci_suspend,
- .resume = virtio_pci_resume,
+ .driver.pm = &virtio_pci_pm_ops,
#endif
};
--
1.7.7.3
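The new callbacks take a generic struct device and recover the PCI device with to_pci_dev(). A userspace sketch of that pattern follows; the structures are simplified stand-ins with illustrative fields, and callback_devfn() is a hypothetical function, not part of the patch.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; fields are illustrative. */
struct device {
	int id;
};

struct pci_dev {
	unsigned int devfn;
	struct device dev;	/* embedded generic device */
};

/* Same idea as the kernel's container_of(): step back from member to owner. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define to_pci_dev(d) container_of(d, struct pci_dev, dev)

/* A dev_pm_ops-style callback only receives the generic device. */
static unsigned int callback_devfn(struct device *dev)
{
	struct pci_dev *pci_dev = to_pci_dev(dev);

	return pci_dev->devfn;
}
```

Because the generic device is embedded in the PCI device, the conversion is pure pointer arithmetic and needs no lookup.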
^ permalink raw reply related
* [PATCH v5 02/11] virtio: pci: add PM notification handlers for restore, freeze, thaw, poweroff
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
Handle thaw, restore and freeze notifications from the PM core. Expose
these to individual virtio drivers that can quiesce and resume vq
operations. For drivers not implementing the thaw() method, use the
restore method instead.
These functions also save device-specific data so that the device can be
put in pre-suspend state after resume, and disable and enable the PCI
device in the freeze and resume functions, respectively.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/virtio/virtio_pci.c | 94 ++++++++++++++++++++++++++++++++++++++++++-
include/linux/virtio.h | 5 ++
2 files changed, 97 insertions(+), 2 deletions(-)
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index 23e1532..63bf242 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -55,6 +55,10 @@ struct virtio_pci_device
unsigned msix_vectors;
/* Vectors allocated, excluding per-vq vectors if any */
unsigned msix_used_vectors;
+
+ /* Status saved during hibernate/restore */
+ u8 saved_status;
+
/* Whether we have vector per vq */
bool per_vq_vectors;
};
@@ -726,9 +730,95 @@ static int virtio_pci_resume(struct device *dev)
return 0;
}
+static int virtio_pci_freeze(struct device *dev)
+{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
+ struct virtio_driver *drv;
+ int ret;
+
+ drv = container_of(vp_dev->vdev.dev.driver,
+ struct virtio_driver, driver);
+
+ ret = 0;
+ vp_dev->saved_status = vp_get_status(&vp_dev->vdev);
+ if (drv && drv->freeze)
+ ret = drv->freeze(&vp_dev->vdev);
+
+ if (!ret)
+ pci_disable_device(pci_dev);
+ return ret;
+}
+
+static int restore_common(struct device *dev)
+{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
+ int ret;
+
+ ret = pci_enable_device(pci_dev);
+ if (ret)
+ return ret;
+ pci_set_master(pci_dev);
+ vp_finalize_features(&vp_dev->vdev);
+
+ return ret;
+}
+
+static int virtio_pci_thaw(struct device *dev)
+{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
+ struct virtio_driver *drv;
+ int ret;
+
+ ret = restore_common(dev);
+ if (ret)
+ return ret;
+
+ drv = container_of(vp_dev->vdev.dev.driver,
+ struct virtio_driver, driver);
+
+ if (drv && drv->thaw)
+ ret = drv->thaw(&vp_dev->vdev);
+ else if (drv && drv->restore)
+ ret = drv->restore(&vp_dev->vdev);
+
+ /* Finally, tell the device we're all set */
+ if (!ret)
+ vp_set_status(&vp_dev->vdev, vp_dev->saved_status);
+
+ return ret;
+}
+
+static int virtio_pci_restore(struct device *dev)
+{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);
+ struct virtio_driver *drv;
+ int ret;
+
+ drv = container_of(vp_dev->vdev.dev.driver,
+ struct virtio_driver, driver);
+
+ ret = restore_common(dev);
+ if (!ret && drv && drv->restore)
+ ret = drv->restore(&vp_dev->vdev);
+
+ /* Finally, tell the device we're all set */
+ if (!ret)
+ vp_set_status(&vp_dev->vdev, vp_dev->saved_status);
+
+ return ret;
+}
+
static const struct dev_pm_ops virtio_pci_pm_ops = {
- .suspend = virtio_pci_suspend,
- .resume = virtio_pci_resume,
+ .suspend = virtio_pci_suspend,
+ .resume = virtio_pci_resume,
+ .freeze = virtio_pci_freeze,
+ .thaw = virtio_pci_thaw,
+ .restore = virtio_pci_restore,
+ .poweroff = virtio_pci_suspend,
};
#endif
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 4c069d8..92902ab 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -146,6 +146,11 @@ struct virtio_driver {
int (*probe)(struct virtio_device *dev);
void (*remove)(struct virtio_device *dev);
void (*config_changed)(struct virtio_device *dev);
+#ifdef CONFIG_PM
+ int (*freeze)(struct virtio_device *dev);
+ int (*thaw)(struct virtio_device *dev);
+ int (*restore)(struct virtio_device *dev);
+#endif
};
int register_virtio_driver(struct virtio_driver *drv);
--
1.7.7.3
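The thaw fallback described in the commit message (use restore() when a driver provides no thaw()) can be modelled in userspace. This is a sketch, not the patch itself: struct toy_vdev, struct toy_driver and the toy_*() functions are hypothetical, and the PCI re-enable and status handling are left out.

```c
#include <stddef.h>

struct toy_vdev {
	int restored;	/* times the restore hook ran */
	int thawed;	/* times the thaw hook ran */
};

/* Mirrors the optional hooks added to struct virtio_driver. */
struct toy_driver {
	int (*thaw)(struct toy_vdev *vdev);
	int (*restore)(struct toy_vdev *vdev);
};

/* Thaw path: prefer thaw(), fall back to restore(), else do nothing. */
static int toy_thaw(const struct toy_driver *drv, struct toy_vdev *vdev)
{
	if (drv && drv->thaw)
		return drv->thaw(vdev);
	if (drv && drv->restore)
		return drv->restore(vdev);
	return 0;
}

static int toy_restore_hook(struct toy_vdev *vdev)
{
	vdev->restored++;
	return 0;
}

static int toy_thaw_hook(struct toy_vdev *vdev)
{
	vdev->thawed++;
	return 0;
}
```

A driver that only fills in restore() still gets its queues rebuilt after a failed hibernation attempt, while a driver with both hooks has only thaw() called on that path.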
^ permalink raw reply related
* [PATCH v5 03/11] virtio: console: Move vq and vq buf removal into separate functions
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
This common code will be shared with the PM freeze function.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/char/virtio_console.c | 68 ++++++++++++++++++++++++-----------------
1 files changed, 40 insertions(+), 28 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 8e3c46d..e14f5aa 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1271,6 +1271,20 @@ static void remove_port(struct kref *kref)
kfree(port);
}
+static void remove_port_data(struct port *port)
+{
+ struct port_buffer *buf;
+
+ /* Remove unused data this port might have received. */
+ discard_port_data(port);
+
+ reclaim_consumed_buffers(port);
+
+ /* Remove buffers we queued up for the Host to send us data in. */
+ while ((buf = virtqueue_detach_unused_buf(port->in_vq)))
+ free_buf(buf);
+}
+
/*
* Port got unplugged. Remove port from portdev's list and drop the
* kref reference. If no userspace has this port opened, it will
@@ -1278,8 +1292,6 @@ static void remove_port(struct kref *kref)
*/
static void unplug_port(struct port *port)
{
- struct port_buffer *buf;
-
spin_lock_irq(&port->portdev->ports_lock);
list_del(&port->list);
spin_unlock_irq(&port->portdev->ports_lock);
@@ -1300,14 +1312,7 @@ static void unplug_port(struct port *port)
hvc_remove(port->cons.hvc);
}
- /* Remove unused data this port might have received. */
- discard_port_data(port);
-
- reclaim_consumed_buffers(port);
-
- /* Remove buffers we queued up for the Host to send us data in. */
- while ((buf = virtqueue_detach_unused_buf(port->in_vq)))
- free_buf(buf);
+ remove_port_data(port);
/*
* We should just assume the device itself has gone off --
@@ -1659,6 +1664,28 @@ static const struct file_operations portdev_fops = {
.owner = THIS_MODULE,
};
+static void remove_vqs(struct ports_device *portdev)
+{
+ portdev->vdev->config->del_vqs(portdev->vdev);
+ kfree(portdev->in_vqs);
+ kfree(portdev->out_vqs);
+}
+
+static void remove_controlq_data(struct ports_device *portdev)
+{
+ struct port_buffer *buf;
+ unsigned int len;
+
+ if (!use_multiport(portdev))
+ return;
+
+ while ((buf = virtqueue_get_buf(portdev->c_ivq, &len)))
+ free_buf(buf);
+
+ while ((buf = virtqueue_detach_unused_buf(portdev->c_ivq)))
+ free_buf(buf);
+}
+
/*
* Once we're further in boot, we get probed like any other virtio
* device.
@@ -1764,9 +1791,7 @@ free_vqs:
/* The host might want to notify mgmt sw about device add failure */
__send_control_msg(portdev, VIRTIO_CONSOLE_BAD_ID,
VIRTIO_CONSOLE_DEVICE_READY, 0);
- vdev->config->del_vqs(vdev);
- kfree(portdev->in_vqs);
- kfree(portdev->out_vqs);
+ remove_vqs(portdev);
free_chrdev:
unregister_chrdev(portdev->chr_major, "virtio-portsdev");
free:
@@ -1804,21 +1829,8 @@ static void virtcons_remove(struct virtio_device *vdev)
* have to just stop using the port, as the vqs are going
* away.
*/
- if (use_multiport(portdev)) {
- struct port_buffer *buf;
- unsigned int len;
-
- while ((buf = virtqueue_get_buf(portdev->c_ivq, &len)))
- free_buf(buf);
-
- while ((buf = virtqueue_detach_unused_buf(portdev->c_ivq)))
- free_buf(buf);
- }
-
- vdev->config->del_vqs(vdev);
- kfree(portdev->in_vqs);
- kfree(portdev->out_vqs);
-
+ remove_controlq_data(portdev);
+ remove_vqs(portdev);
kfree(portdev);
}
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 04/11] virtio: console: Add freeze and restore handlers to support S4
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
Remove all vqs and associated buffers in the freeze callback which
prepares us to go into hibernation state. On restore, re-create all the
vqs and populate the input vqs with buffers to get to the pre-hibernate
state.
Note: Any outstanding unconsumed buffers are discarded, which means
data can be lost if the host or the guest had not yet consumed data
already present in the vqs. This can be addressed in a later patch
series, perhaps in virtio common code.
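As a rough illustration of the freeze/restore flow described above, here
is a userspace sketch. It is not part of this patch: every fake_* name is
a hypothetical stand-in for the driver's structures, and clearing the
per-port in_vq pointer on freeze is a choice made for the sketch only
(the patch itself just re-points it on restore).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FAKE_NPORTS      2
#define FAKE_QUEUE_DEPTH 4

/* Hypothetical stand-in for a virtqueue: only tracks posted buffers. */
struct fake_vq {
	int free_bufs;
};

struct fake_port {
	struct fake_vq *in_vq;
	int host_connected;
};

struct fake_portdev {
	struct fake_vq *in_vqs;		/* NULL while frozen */
	struct fake_port ports[FAKE_NPORTS];
};

/* Mirrors the role of init_vqs(): allocate the queue array. */
static int fake_console_init_vqs(struct fake_portdev *pd)
{
	pd->in_vqs = calloc(FAKE_NPORTS, sizeof(*pd->in_vqs));
	return pd->in_vqs ? 0 : -1;
}

/* Freeze: forget host state, drop per-port data, then remove the vqs. */
static int fake_console_freeze(struct fake_portdev *pd)
{
	for (int i = 0; i < FAKE_NPORTS; i++) {
		/* The host is asked again after restore. */
		pd->ports[i].host_connected = 0;
		pd->ports[i].in_vq = NULL;	/* sketch-only choice */
	}
	free(pd->in_vqs);
	pd->in_vqs = NULL;
	return 0;
}

/* Restore: re-create the vqs, re-point each port, refill input queues. */
static int fake_console_restore(struct fake_portdev *pd)
{
	int ret = fake_console_init_vqs(pd);

	if (ret)
		return ret;

	for (int i = 0; i < FAKE_NPORTS; i++) {
		pd->ports[i].in_vq = &pd->in_vqs[i];
		pd->ports[i].in_vq->free_bufs = FAKE_QUEUE_DEPTH;
	}
	return 0;
}
```

The point of the sketch is the ordering: per-port state is dropped before
the queues go away, and on restore the queues must exist before any port
is re-pointed at them or refilled.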
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/char/virtio_console.c | 58 +++++++++++++++++++++++++++++++++++++++++
1 files changed, 58 insertions(+), 0 deletions(-)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index e14f5aa..fd2fd6f 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1844,6 +1844,60 @@ static unsigned int features[] = {
VIRTIO_CONSOLE_F_MULTIPORT,
};
+#ifdef CONFIG_PM
+static int virtcons_freeze(struct virtio_device *vdev)
+{
+ struct ports_device *portdev;
+ struct port *port;
+
+ portdev = vdev->priv;
+
+ vdev->config->reset(vdev);
+
+ cancel_work_sync(&portdev->control_work);
+ remove_controlq_data(portdev);
+
+ list_for_each_entry(port, &portdev->ports, list) {
+ /*
+ * We'll ask the host later if the new invocation has
+ * the port opened or closed.
+ */
+ port->host_connected = false;
+ remove_port_data(port);
+ }
+ remove_vqs(portdev);
+
+ return 0;
+}
+
+static int virtcons_restore(struct virtio_device *vdev)
+{
+ struct ports_device *portdev;
+ struct port *port;
+ int ret;
+
+ portdev = vdev->priv;
+
+ ret = init_vqs(portdev);
+ if (ret)
+ return ret;
+
+ if (use_multiport(portdev))
+ fill_queue(portdev->c_ivq, &portdev->cvq_lock);
+
+ list_for_each_entry(port, &portdev->ports, list) {
+ port->in_vq = portdev->in_vqs[port->id];
+ port->out_vq = portdev->out_vqs[port->id];
+
+ fill_queue(port->in_vq, &port->inbuf_lock);
+
+ /* Get port open/close status on the host */
+ send_control_msg(port, VIRTIO_CONSOLE_PORT_READY, 1);
+ }
+ return 0;
+}
+#endif
+
static struct virtio_driver virtio_console = {
.feature_table = features,
.feature_table_size = ARRAY_SIZE(features),
@@ -1853,6 +1907,10 @@ static struct virtio_driver virtio_console = {
.probe = virtcons_probe,
.remove = virtcons_remove,
.config_changed = config_intr,
+#ifdef CONFIG_PM
+ .freeze = virtcons_freeze,
+ .restore = virtcons_restore,
+#endif
};
static int __init init(void)
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 05/11] virtio: blk: Move vq initialization to separate function
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
The probe and PM restore functions will share this code.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/block/virtio_blk.c | 19 ++++++++++++++-----
1 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 4d0b70a..467f218 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -349,6 +349,18 @@ static void virtblk_config_changed(struct virtio_device *vdev)
queue_work(virtblk_wq, &vblk->config_work);
}
+static int init_vq(struct virtio_blk *vblk)
+{
+ int err = 0;
+
+ /* We expect one virtqueue, for output. */
+ vblk->vq = virtio_find_single_vq(vblk->vdev, blk_done, "requests");
+ if (IS_ERR(vblk->vq))
+ err = PTR_ERR(vblk->vq);
+
+ return err;
+}
+
static int __devinit virtblk_probe(struct virtio_device *vdev)
{
struct virtio_blk *vblk;
@@ -390,12 +402,9 @@ static int __devinit virtblk_probe(struct virtio_device *vdev)
sg_init_table(vblk->sg, vblk->sg_elems);
INIT_WORK(&vblk->config_work, virtblk_config_changed_work);
- /* We expect one virtqueue, for output. */
- vblk->vq = virtio_find_single_vq(vdev, blk_done, "requests");
- if (IS_ERR(vblk->vq)) {
- err = PTR_ERR(vblk->vq);
+ err = init_vq(vblk);
+ if (err)
goto out_free_vblk;
- }
vblk->pool = mempool_create_kmalloc_pool(1,sizeof(struct virtblk_req));
if (!vblk->pool) {
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 06/11] virtio: blk: Add freeze, restore handlers to support S4
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
Delete the vq and flush any pending requests from the block queue in the
freeze callback to prepare for hibernation.
Re-create the vq in the restore callback to resume normal function.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/block/virtio_blk.c | 38 ++++++++++++++++++++++++++++++++++++++
1 files changed, 38 insertions(+), 0 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 467f218..a9147a6 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -568,6 +568,40 @@ static void __devexit virtblk_remove(struct virtio_device *vdev)
ida_simple_remove(&vd_index_ida, index);
}
+#ifdef CONFIG_PM
+static int virtblk_freeze(struct virtio_device *vdev)
+{
+ struct virtio_blk *vblk = vdev->priv;
+
+ /* Ensure we don't receive any more interrupts */
+ vdev->config->reset(vdev);
+
+ flush_work(&vblk->config_work);
+
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ blk_stop_queue(vblk->disk->queue);
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ blk_sync_queue(vblk->disk->queue);
+
+ vdev->config->del_vqs(vdev);
+ return 0;
+}
+
+static int virtblk_restore(struct virtio_device *vdev)
+{
+ struct virtio_blk *vblk = vdev->priv;
+ int ret;
+
+ ret = init_vq(vdev->priv);
+ if (!ret) {
+ spin_lock_irq(vblk->disk->queue->queue_lock);
+ blk_start_queue(vblk->disk->queue);
+ spin_unlock_irq(vblk->disk->queue->queue_lock);
+ }
+ return ret;
+}
+#endif
+
static const struct virtio_device_id id_table[] = {
{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
{ 0 },
@@ -593,6 +627,10 @@ static struct virtio_driver __refdata virtio_blk = {
.probe = virtblk_probe,
.remove = __devexit_p(virtblk_remove),
.config_changed = virtblk_config_changed,
+#ifdef CONFIG_PM
+ .freeze = virtblk_freeze,
+ .restore = virtblk_restore,
+#endif
};
static int __init init(void)
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 07/11] virtio: net: Move vq initialization into separate function
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
The probe and PM restore functions will share this code.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/net/virtio_net.c | 47 +++++++++++++++++++++++++++------------------
1 files changed, 28 insertions(+), 19 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 6ee8410..6baa563 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -954,15 +954,38 @@ static void virtnet_config_changed(struct virtio_device *vdev)
virtnet_update_status(vi);
}
+static int init_vqs(struct virtnet_info *vi)
+{
+ struct virtqueue *vqs[3];
+ vq_callback_t *callbacks[] = { skb_recv_done, skb_xmit_done, NULL};
+ const char *names[] = { "input", "output", "control" };
+ int nvqs, err;
+
+ /* We expect two virtqueues, receive then send,
+ * and optionally control. */
+ nvqs = virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ? 3 : 2;
+
+ err = vi->vdev->config->find_vqs(vi->vdev, nvqs, vqs, callbacks, names);
+ if (err)
+ return err;
+
+ vi->rvq = vqs[0];
+ vi->svq = vqs[1];
+
+ if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)) {
+ vi->cvq = vqs[2];
+
+ if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VLAN))
+ vi->dev->features |= NETIF_F_HW_VLAN_FILTER;
+ }
+ return 0;
+}
+
static int virtnet_probe(struct virtio_device *vdev)
{
int err;
struct net_device *dev;
struct virtnet_info *vi;
- struct virtqueue *vqs[3];
- vq_callback_t *callbacks[] = { skb_recv_done, skb_xmit_done, NULL};
- const char *names[] = { "input", "output", "control" };
- int nvqs;
/* Allocate ourselves a network device with room for our info */
dev = alloc_etherdev(sizeof(struct virtnet_info));
@@ -1034,24 +1057,10 @@ static int virtnet_probe(struct virtio_device *vdev)
if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
vi->mergeable_rx_bufs = true;
- /* We expect two virtqueues, receive then send,
- * and optionally control. */
- nvqs = virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ? 3 : 2;
-
- err = vdev->config->find_vqs(vdev, nvqs, vqs, callbacks, names);
+ err = init_vqs(vi);
if (err)
goto free_stats;
- vi->rvq = vqs[0];
- vi->svq = vqs[1];
-
- if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ)) {
- vi->cvq = vqs[2];
-
- if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VLAN))
- dev->features |= NETIF_F_HW_VLAN_FILTER;
- }
-
err = register_netdev(dev);
if (err) {
pr_debug("virtio_net: registering device failed\n");
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 08/11] virtio: net: Move vq and vq buf removal into separate function
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
The remove and PM freeze functions will share this code.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/net/virtio_net.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 6baa563..697a0fc 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1123,24 +1123,29 @@ static void free_unused_bufs(struct virtnet_info *vi)
BUG_ON(vi->num != 0);
}
-static void __devexit virtnet_remove(struct virtio_device *vdev)
+static void remove_vq_common(struct virtnet_info *vi)
{
- struct virtnet_info *vi = vdev->priv;
-
/* Stop all the virtqueues. */
- vdev->config->reset(vdev);
-
+ vi->vdev->config->reset(vi->vdev);
- unregister_netdev(vi->dev);
cancel_delayed_work_sync(&vi->refill);
/* Free unused buffers in both send and recv, if any. */
free_unused_bufs(vi);
- vdev->config->del_vqs(vi->vdev);
+ vi->vdev->config->del_vqs(vi->vdev);
while (vi->pages)
__free_pages(get_a_page(vi, GFP_KERNEL), 0);
+}
+
+static void __devexit virtnet_remove(struct virtio_device *vdev)
+{
+ struct virtnet_info *vi = vdev->priv;
+
+ unregister_netdev(vi->dev);
+
+ remove_vq_common(vi);
free_percpu(vi->stats);
free_netdev(vi->dev);
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 09/11] virtio: net: Add freeze, restore handlers to support S4
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
Remove all the vqs, disable napi and detach from the netdev on
hibernation.
Re-create vqs after restoring from a hibernated image, re-enable napi
and re-attach the netdev. This keeps networking working across
hibernation.
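The ordering described above can be sketched in userspace as a small
state model. This is only an illustration, not code from the patch: the
fake_nic structure and its flags are hypothetical and merely name the
states that the real calls (netif_device_detach(), napi_disable(),
remove_vq_common(), and their counterparts) change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by freeze/restore. */
struct fake_nic {
	bool attached;		/* netdev attached to the stack */
	bool running;		/* interface is up */
	bool napi_enabled;
	bool have_vqs;
};

/* Freeze: detach first, stop napi only if the interface is up,
 * then tear down the vqs. */
static int fake_net_freeze(struct fake_nic *n)
{
	n->attached = false;
	if (n->running)
		n->napi_enabled = false;
	n->have_vqs = false;
	return 0;
}

/* Restore: the reverse order; vqs first, attach last. */
static int fake_net_restore(struct fake_nic *n)
{
	n->have_vqs = true;
	if (n->running)
		n->napi_enabled = true;
	n->attached = true;
	return 0;
}
```

Note that napi is only touched when the interface is up, so an interface
that was down before hibernation comes back with napi still disabled.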
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/net/virtio_net.c | 36 ++++++++++++++++++++++++++++++++++++
1 files changed, 36 insertions(+), 0 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 697a0fc..1378f3c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1151,6 +1151,38 @@ static void __devexit virtnet_remove(struct virtio_device *vdev)
free_netdev(vi->dev);
}
+#ifdef CONFIG_PM
+static int virtnet_freeze(struct virtio_device *vdev)
+{
+ struct virtnet_info *vi = vdev->priv;
+
+ netif_device_detach(vi->dev);
+ if (netif_running(vi->dev))
+ napi_disable(&vi->napi);
+
+ remove_vq_common(vi);
+
+ return 0;
+}
+
+static int virtnet_restore(struct virtio_device *vdev)
+{
+ struct virtnet_info *vi = vdev->priv;
+ int err;
+
+ err = init_vqs(vi);
+ if (err)
+ return err;
+
+ try_fill_recv(vi, GFP_KERNEL);
+ if (netif_running(vi->dev))
+ virtnet_napi_enable(vi);
+
+ netif_device_attach(vi->dev);
+ return 0;
+}
+#endif
+
static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
{ 0 },
@@ -1175,6 +1207,10 @@ static struct virtio_driver virtio_net_driver = {
.probe = virtnet_probe,
.remove = __devexit_p(virtnet_remove),
.config_changed = virtnet_config_changed,
+#ifdef CONFIG_PM
+ .freeze = virtnet_freeze,
+ .restore = virtnet_restore,
+#endif
};
static int __init init(void)
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 10/11] virtio: balloon: Move vq initialization into separate function
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
The probe and PM restore functions will share this code.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/virtio/virtio_balloon.c | 48 ++++++++++++++++++++++++--------------
1 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 94fd738..1ff3cf4 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -275,32 +275,21 @@ static int balloon(void *_vballoon)
return 0;
}
-static int virtballoon_probe(struct virtio_device *vdev)
+static int init_vqs(struct virtio_balloon *vb)
{
- struct virtio_balloon *vb;
struct virtqueue *vqs[3];
vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request };
const char *names[] = { "inflate", "deflate", "stats" };
int err, nvqs;
- vdev->priv = vb = kmalloc(sizeof(*vb), GFP_KERNEL);
- if (!vb) {
- err = -ENOMEM;
- goto out;
- }
-
- INIT_LIST_HEAD(&vb->pages);
- vb->num_pages = 0;
- init_waitqueue_head(&vb->config_change);
- vb->vdev = vdev;
- vb->need_stats_update = 0;
-
- /* We expect two virtqueues: inflate and deflate,
- * and optionally stat. */
+ /*
+ * We expect two virtqueues: inflate and deflate, and
+ * optionally stat.
+ */
nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
- err = vdev->config->find_vqs(vdev, nvqs, vqs, callbacks, names);
+ err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names);
if (err)
- goto out_free_vb;
+ return err;
vb->inflate_vq = vqs[0];
vb->deflate_vq = vqs[1];
@@ -317,6 +306,29 @@ static int virtballoon_probe(struct virtio_device *vdev)
BUG();
virtqueue_kick(vb->stats_vq);
}
+ return 0;
+}
+
+static int virtballoon_probe(struct virtio_device *vdev)
+{
+ struct virtio_balloon *vb;
+ int err;
+
+ vdev->priv = vb = kmalloc(sizeof(*vb), GFP_KERNEL);
+ if (!vb) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&vb->pages);
+ vb->num_pages = 0;
+ init_waitqueue_head(&vb->config_change);
+ vb->vdev = vdev;
+ vb->need_stats_update = 0;
+
+ err = init_vqs(vb);
+ if (err)
+ goto out_free_vb;
vb->thread = kthread_run(balloon, vb, "vballoon");
if (IS_ERR(vb->thread)) {
--
1.7.7.3
^ permalink raw reply related
* [PATCH v5 11/11] virtio: balloon: Add freeze, restore handlers to support S4
From: Amit Shah @ 2011-12-15 12:45 UTC (permalink / raw)
To: Virtualization List
Cc: linux-pm, linux-kernel, Michael S. Tsirkin, levinsasha928,
Amit Shah
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
Handling balloon hibernate / restore is tricky. If the balloon was
inflated before going into the hibernation state, upon resume, the host
will not have any memory of that. Any pages that were passed on to the
host earlier would most likely be invalid, and the host will have to
re-balloon to the previous value to get back to the pre-hibernate state.
So the only sane thing for the guest to do here is to discard all the
pages that were put in the balloon. When to discard the pages is the
next question.
One solution is to deflate the balloon just before writing the image to
the disk (in the freeze() PM callback). However, asking for pages from
the host just to discard them immediately after seems wasteful of
resources. Hence, it makes sense to do this by just fudging our
counters soon after wakeup. This means we don't deflate the balloon
before sleep, and also don't put unnecessary pressure on the host.
This also helps in the thaw case: if the freeze fails for whatever
reason, the balloon should continue to remain in the inflated state.
This was tested by issuing 'swapoff -a' and trying to go into the S4
state. That fails, and the balloon stays inflated, as expected. Both
the host and the guest are happy.
Finally, in the restore() callback, we empty the list of pages that were
previously given to the host, add the appropriate number of pages to
the totalram_pages counter, reset the num_pages counter to 0, and
all is fine.
As a last step, delete the vqs in the freeze callback to prepare for
hibernation, and re-create them in the restore and thaw callbacks to
resume normal operation.
The kthread doesn't race with any operations here, since it's frozen
before the freeze() call and is thawed after the thaw() and restore()
callbacks, so we're safe with that.
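The counter fix-up done in restore() can be illustrated with a small
userspace sketch. It is not part of this patch: the fake_* types and the
fake_totalram_pages variable are hypothetical stand-ins, and the sketch
assumes that inflating the balloon decrements the total-RAM counter by
one per page, which is what restore() then undoes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's struct page or list_head. */
struct fake_page {
	struct fake_page *next;
};

struct fake_balloon {
	struct fake_page *pages;	/* pages handed to the host */
	unsigned long num_pages;	/* balloon size as the guest sees it */
	int need_stats_update;
};

static unsigned long fake_totalram_pages;

/* Inflate by one page (assumed to lower the RAM counter by one). */
static int fake_inflate(struct fake_balloon *vb)
{
	struct fake_page *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->next = vb->pages;
	vb->pages = p;
	vb->num_pages++;
	fake_totalram_pages--;
	return 0;
}

/* Restore: start from a clean slate and give every page back to the
 * counter, without asking the host for anything. */
static void fake_balloon_restore(struct fake_balloon *vb)
{
	vb->num_pages = 0;
	vb->need_stats_update = 0;

	while (vb->pages) {
		struct fake_page *p = vb->pages;

		vb->pages = p->next;
		free(p);	/* userspace only; the patch just unlinks */
		fake_totalram_pages++;
	}
}
```

After fake_balloon_restore() the counter is back at its pre-inflate
value and the balloon is empty, which is the state the host expects
from a freshly started guest.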
Signed-off-by: Amit Shah <amit.shah@redhat.com>
---
drivers/virtio/virtio_balloon.c | 47 +++++++++++++++++++++++++++++++++++++++
1 files changed, 47 insertions(+), 0 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 1ff3cf4..4c327c7 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -363,6 +363,48 @@ static void __devexit virtballoon_remove(struct virtio_device *vdev)
kfree(vb);
}
+#ifdef CONFIG_PM
+static int virtballoon_freeze(struct virtio_device *vdev)
+{
+ /*
+ * The kthread is already frozen by the PM core before this
+ * function is called.
+ */
+
+ /* Ensure we don't get any more requests from the host */
+ vdev->config->reset(vdev);
+ vdev->config->del_vqs(vdev);
+ return 0;
+}
+
+static int virtballoon_thaw(struct virtio_device *vdev)
+{
+ return init_vqs(vdev->priv);
+}
+
+static int virtballoon_restore(struct virtio_device *vdev)
+{
+ struct virtio_balloon *vb = vdev->priv;
+ struct page *page, *page2;
+
+ /* We're starting from a clean slate */
+ vb->num_pages = 0;
+
+ /*
+ * If a request wasn't complete at the time of freezing, this
+ * could have been set.
+ */
+ vb->need_stats_update = 0;
+
+ /* We don't have these pages in the balloon anymore! */
+ list_for_each_entry_safe(page, page2, &vb->pages, lru) {
+ list_del(&page->lru);
+ totalram_pages++;
+ }
+ return init_vqs(vdev->priv);
+}
+#endif
+
static unsigned int features[] = {
VIRTIO_BALLOON_F_MUST_TELL_HOST,
VIRTIO_BALLOON_F_STATS_VQ,
@@ -377,6 +419,11 @@ static struct virtio_driver virtio_balloon_driver = {
.probe = virtballoon_probe,
.remove = __devexit_p(virtballoon_remove),
.config_changed = virtballoon_changed,
+#ifdef CONFIG_PM
+ .freeze = virtballoon_freeze,
+ .restore = virtballoon_restore,
+ .thaw = virtballoon_thaw,
+#endif
};
static int __init init(void)
--
1.7.7.3
^ permalink raw reply related
* Re: [PATCH v5 00/11] virtio: s4 support
From: Michael S. Tsirkin @ 2011-12-15 13:13 UTC (permalink / raw)
To: Amit Shah; +Cc: linux-kernel, linux-pm, levinsasha928, Virtualization List
In-Reply-To: <cover.1323952933.git.amit.shah@redhat.com>
On Thu, Dec 15, 2011 at 06:15:46PM +0530, Amit Shah wrote:
> Hi,
>
> These patches add support for S4 to virtio (pci) and all drivers.
>
> Michael saw some race in virtio-net module removal which will need a
> similar fix for the freeze code as well. I'll update the virtio-net
> patch with that fix once the fix is settled upon and applied.
block too?
--
MST
^ permalink raw reply
* CFP: ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'12)
From: Ioan Raicu @ 2011-12-15 13:46 UTC (permalink / raw)
To: virtualization
**** CALL FOR PAPERS ****
The 21st International ACM Symposium on
High-Performance Parallel and Distributed Computing
(HPDC'12)
Delft University of Technology, Delft, the Netherlands
June 18-22, 2012
http://www.hpdc.org/2012
The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC)
is the premier annual conference on the design, the implementation, the evaluation, and
the use of parallel and distributed systems for high-end computing. HPDC'12 will take place
in Delft, the Netherlands, a historical, picturesque city that is less than one hour away
from Amsterdam-Schiphol airport. The conference will be held on June 20-22 (Wednesday to
Friday), with affiliated workshops taking place on June 18-19 (Monday and Tuesday).
**** SUBMISSION DEADLINES ****
Abstracts: 16 January 2012
Papers: 23 January 2012 (No extensions!)
**** HPDC'12 GENERAL CHAIR ****
Dick Epema, Delft University of Technology, Delft, the Netherlands
**** HPDC'12 PROGRAM CO-CHAIRS ****
Thilo Kielmann, Vrije Universiteit, Amsterdam, the Netherlands
Matei Ripeanu, The University of British Columbia, Vancouver, Canada
**** HPDC'12 WORKSHOPS CHAIR ****
Alexandru Iosup, Delft University of Technology, Delft, the Netherlands
**** SCOPE AND TOPICS ****
Submissions are welcomed on all forms of high-performance parallel and distributed computing,
including but not limited to clusters, clouds, grids, utility computing, data-intensive
computing, and massively multicore systems. Submissions that explore solutions to estimate
and reduce the energy footprint of such systems are particularly encouraged. All papers
will be evaluated for their originality, potential impact, correctness, quality of
presentation, appropriate presentation of related work, and relevance to the conference,
with a strong preference for rigorous results obtained in operational parallel and
distributed systems.
The topics of interest of the conference include, but are not limited to, the following,
in the context of high-performance parallel and distributed computing:
- Systems, networks, and architectures for high-end computing
- Massively multicore systems
- Virtualization of machines, networks, and storage
- Programming languages and environments
- I/O, storage systems, and data management
- Resource management, energy and cost minimizations
- Performance modeling and analysis
- Fault tolerance, reliability, and availability
- Data-intensive computing
- Applications of parallel and distributed computing
**** PAPER SUBMISSION GUIDELINES ****
Authors are invited to submit technical papers of at most 12 pages in PDF format, including
figures and references. Papers should be formatted in the ACM Proceedings Style and
submitted via the conference web site. No changes to the margins, spacing, or font sizes as
specified by the style file are allowed. Accepted papers will appear in the conference
proceedings, and will be incorporated into the ACM Digital Library. A limited number of
papers will be accepted as posters.
Papers must be self-contained and provide the technical substance required for the program
committee to evaluate their contributions. Submitted papers must be original work that has
not appeared in and is not under consideration for another conference or a journal. See the
ACM Prior Publication Policy for more details.
**** IMPORTANT DATES ****
Abstracts Due: 16 January 2012
Papers Due: 23 January 2012 (No extensions!)
Reviews Released to Authors: 8 March 2012
Author Rebuttals Due: 12 March 2012
Author Notifications: 19 March 2012
Final Papers Due: 16 April 2012
Conference Dates: 18-22 June 2012
--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu@cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================
=================================================================
^ permalink raw reply