Linux virtualization list
 help / color / mirror / Atom feed
* [PATCH net-next 1/2] virtio-net: correctly update XDP_TX counters
From: Jason Wang @ 2018-07-31  9:43 UTC (permalink / raw)
  To: mst, jasowang; +Cc: netdev, linux-kernel, virtualization

Commit 5b8f3c8d30a6 ("virtio_net: Add XDP related stats") tries to
count TX XDP stats in virtnet_receive(). This will cause several
issues:

- virtnet_xdp_sq() was called without checking whether or not XDP is
  set. This may cause out of bound access when there's no enough txq
  for XDP.
- Stats were updated even if there's no XDP/XDP_TX.

Fixing this by reusing virtnet_xdp_xmit() for XDP_TX which can counts
TX XDP counter itself and remove the unnecessary tx stats embedded in
rx stats.

Reported-by: syzbot+604f8271211546f5b3c7@syzkaller.appspotmail.com
Fixes: 5b8f3c8d30a6 ("virtio_net: Add XDP related stats")
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 39 ++++-----------------------------------
 1 file changed, 4 insertions(+), 35 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1880c86..72d3f68 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -105,10 +105,6 @@ struct virtnet_rq_stats {
 
 struct virtnet_rx_stats {
 	struct virtnet_rq_stat_items rx;
-	struct {
-		unsigned int xdp_tx;
-		unsigned int xdp_tx_drops;
-	} tx;
 };
 
 #define VIRTNET_SQ_STAT(m)	offsetof(struct virtnet_sq_stats, m)
@@ -485,22 +481,6 @@ static struct send_queue *virtnet_xdp_sq(struct virtnet_info *vi)
 	return &vi->sq[qp];
 }
 
-static int __virtnet_xdp_tx_xmit(struct virtnet_info *vi,
-				   struct xdp_frame *xdpf)
-{
-	struct xdp_frame *xdpf_sent;
-	struct send_queue *sq;
-	unsigned int len;
-
-	sq = virtnet_xdp_sq(vi);
-
-	/* Free up any pending old buffers before queueing new ones. */
-	while ((xdpf_sent = virtqueue_get_buf(sq->vq, &len)) != NULL)
-		xdp_return_frame(xdpf_sent);
-
-	return __virtnet_xdp_xmit_one(vi, sq, xdpf);
-}
-
 static int virtnet_xdp_xmit(struct net_device *dev,
 			    int n, struct xdp_frame **frames, u32 flags)
 {
@@ -707,10 +687,8 @@ static struct sk_buff *receive_small(struct net_device *dev,
 			xdpf = convert_to_xdp_frame(&xdp);
 			if (unlikely(!xdpf))
 				goto err_xdp;
-			stats->tx.xdp_tx++;
-			err = __virtnet_xdp_tx_xmit(vi, xdpf);
-			if (unlikely(err)) {
-				stats->tx.xdp_tx_drops++;
+			err = virtnet_xdp_xmit(dev, 1, &xdpf, 0);
+			if (unlikely(err < 0)) {
 				trace_xdp_exception(vi->dev, xdp_prog, act);
 				goto err_xdp;
 			}
@@ -879,10 +857,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			xdpf = convert_to_xdp_frame(&xdp);
 			if (unlikely(!xdpf))
 				goto err_xdp;
-			stats->tx.xdp_tx++;
-			err = __virtnet_xdp_tx_xmit(vi, xdpf);
-			if (unlikely(err)) {
-				stats->tx.xdp_tx_drops++;
+			err = virtnet_xdp_xmit(dev, 1, &xdpf, 0);
+			if (unlikely(err < 0)) {
 				trace_xdp_exception(vi->dev, xdp_prog, act);
 				if (unlikely(xdp_page != page))
 					put_page(xdp_page);
@@ -1315,7 +1291,6 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
 {
 	struct virtnet_info *vi = rq->vq->vdev->priv;
 	struct virtnet_rx_stats stats = {};
-	struct send_queue *sq;
 	unsigned int len;
 	void *buf;
 	int i;
@@ -1351,12 +1326,6 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
 	}
 	u64_stats_update_end(&rq->stats.syncp);
 
-	sq = virtnet_xdp_sq(vi);
-	u64_stats_update_begin(&sq->stats.syncp);
-	sq->stats.xdp_tx += stats.tx.xdp_tx;
-	sq->stats.xdp_tx_drops += stats.tx.xdp_tx_drops;
-	u64_stats_update_end(&sq->stats.syncp);
-
 	return stats.rx.packets;
 }
 
-- 
2.7.4

^ permalink raw reply related

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-31  7:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: robh, srikar, mst, linuxram, linux-kernel, virtualization, paulus,
	joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180730092551.GB26245@infradead.org>

On 07/30/2018 02:55 PM, Christoph Hellwig wrote:
>> +const struct dma_map_ops virtio_direct_dma_ops;
> 
> This belongs into a header if it is non-static.  If you only
> use it in this file anyway please mark it static and avoid a forward
> declaration.

Sure, will make it static, move the definition up in the file to avoid
forward declaration.
 
> 
>> +
>>  int virtio_finalize_features(struct virtio_device *dev)
>>  {
>>  	int ret = dev->config->finalize_features(dev);
>> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>>  	if (ret)
>>  		return ret;
>>  
>> +	if (virtio_has_iommu_quirk(dev))
>> +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> 
> This needs a big fat comment explaining what is going on here.

Sure, will do. Also talk about the XEN domain exception as well once
that goes into this conditional statement.

> 
> Also not new, but I find the existance of virtio_has_iommu_quirk and its
> name horribly confusing.  It might be better to open code it here once
> only a single caller is left.

Sure will do. There is one definition in the tools directory which can
be removed and then this will be the only one left.

^ permalink raw reply

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-31  6:39 UTC (permalink / raw)
  To: Christoph Hellwig, Michael S. Tsirkin
  Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization,
	paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180730093027.GC26245@infradead.org>

On 07/30/2018 03:00 PM, Christoph Hellwig wrote:
>>> +
>>> +	if (xen_domain())
>>> +		goto skip_override;
>>> +
>>> +	if (virtio_has_iommu_quirk(dev))
>>> +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
>>> +
>>> + skip_override:
>>> +
>>
>> I prefer normal if scoping as opposed to goto spaghetti pls.
>> Better yet move vring_use_dma_api here and use it.
>> Less of a chance something will break.
> 
> I agree about avoid pointless gotos here, but we can do things
> perfectly well without either gotos or a confusing helper here
> if we structure it right. E.g.:
> 
> 	// suitably detailed comment here
> 	if (!xen_domain() &&
> 	    !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
> 		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);

I had updated this patch calling vring_use_dma_api() as a helper
as suggested by Michael but yes we can have the above condition
with a comment block. I will change this patch accordingly.

> 
> and while we're at it - modifying dma ops for the parent looks very
> dangerous.  I don't think we can do that, as it could break iommu
> setup interactions.  IFF we set a specific dma map ops it has to be
> on the virtio device itself, of which we have full control.

I understand your concern. At present virtio core calls parent's DMA
ops callbacks when device has VIRTIO_F_IOMMU_PLATFORM flag set. Most
likely those DMA OPS are architecture specific ones which can really
configure IOMMU. Most probably all devices and their parents share
the same DMA ops callback. IIUC as long as the entire system has a
single DMA ops structure, it should be okay. But I may be missing
other implications. I tried changing virtio core so that it always
calls device's DMA ops instead of it's parent DMA ops, it hit the
following WARN_ON for devices without IOMMU flag and hit both the
WARN_ON and BUG_ON for devices with the IOMMU flag.

static inline void *dma_alloc_attrs(struct device *dev, size_t size,
                                       dma_addr_t *dma_handle, gfp_t flag,
                                       unsigned long attrs)
{
        const struct dma_map_ops *ops = get_dma_ops(dev);
        void *cpu_addr;

        BUG_ON(!ops);
        WARN_ON_ONCE(dev && !dev->coherent_dma_mask);

--------

Seems like virtio device's DMA ops and coherent_dma_mask was never
set correctly assuming that virtio core always called parent's DMA
OPS all the time. We may have to change virtio device init to fix
this. Any thoughts ?

^ permalink raw reply

* IEEE Record # 44854: iCATccT 2018, Alva's Institute Of Engineering & Technology (AIET)-CFP
From: Dr. S K Niranjan Aradhya @ 2018-07-31  5:57 UTC (permalink / raw)
  To: virtualization


[-- Attachment #1.1: Type: text/plain, Size: 1653 bytes --]

<< Apologies for cross-postings >>
<<< Please circulate among your friends, peers and researchers >>>

IEEE Conference Record No.: # 44854;

4th International Conference on Applied and Theoretical Computing and
Communication Technology (iCATccT - 2018)
 Alva's Institute Of Engineering & Technology (AIET)

Conference Date : 6-8 Sept 2018
Submission Deadline: 10 August 2018

Submission Link: http://itekcmsonline.com/icatcct18/index.php/icatcct18/
icatcct18/login

Review is underway for submitted papers.

IEEE ISBN : 978-1-5386-7706-3
IEEE Part No. : CFP18D66-ART
Selected, accepted and extended paper will be published in Scopus Indexed
International Journal of Forensic Software Engineering published by
InderScience
All accepted and presented papers will be submitted to the IEEE for
possible publication in IEEE Xplore Digital Library. Previous edition
indexed in: SCOPUS, ISI Web of Science, Engineering Index, Google, etc.

If you like to join the TPC or propose a special session or symposiums
please write to: secretariat@icatcct.org

General Chair(s)
iCATccT 2018 Conference

----------------------
Disclaimer: We have clearly mentioned the subject lines and your email
address won't be misleading in any form. We have found your mail address
through our own efforts on the web search and not through any illegal way.
If you wish to remove your information from our mailing list or no longer
receive future announcements, please email with REMOVE in subject. Your
request to opt-out will be effective within a reasonable amount of time.
 icatcct-cfp.pdf
<https://drive.google.com/file/d/1OWXPZVS1IRZlNoWTjfVyxl-yIL2CsByg/view?usp=drive_web>

[-- Attachment #1.2: Type: text/html, Size: 3240 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC 1/4] virtio: Define virtio_direct_dma_ops structure
From: Anshuman Khandual @ 2018-07-31  4:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
	paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180730092419.GA26245@infradead.org>

On 07/30/2018 02:54 PM, Christoph Hellwig wrote:
>> +/*
>> + * Virtio direct mapping DMA API operations structure
>> + *
>> + * This defines DMA API structure for all virtio devices which would not
>> + * either bring in their own DMA OPS from architecture or they would not
>> + * like to use architecture specific IOMMU based DMA OPS because QEMU
>> + * expects GPA instead of an IOVA in absence of VIRTIO_F_IOMMU_PLATFORM.
>> + */
>> +dma_addr_t virtio_direct_map_page(struct device *dev, struct page *page,
>> +			    unsigned long offset, size_t size,
>> +			    enum dma_data_direction dir,
>> +			    unsigned long attrs)
> 
> All these functions should probably be marked static.

Sure.

> 
>> +void virtio_direct_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
>> +			size_t size, enum dma_data_direction dir,
>> +			unsigned long attrs)
>> +{
>> +}
> 
> No need to implement no-op callbacks in struct dma_map_ops.

Okay.

> 
>> +
>> +int virtio_direct_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
>> +{
>> +	return 0;
>> +}
> 
> Including this one.
> 
>> +void *virtio_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
>> +		gfp_t gfp, unsigned long attrs)
>> +{
>> +	void *queue = alloc_pages_exact(PAGE_ALIGN(size), gfp);
>> +
>> +	if (queue) {
>> +		phys_addr_t phys_addr = virt_to_phys(queue);
>> +		*dma_handle = (dma_addr_t)phys_addr;
>> +
>> +		if (WARN_ON_ONCE(*dma_handle != phys_addr)) {
>> +			free_pages_exact(queue, PAGE_ALIGN(size));
>> +			return NULL;
>> +		}
>> +	}
>> +	return queue;
> 
> queue is a very odd name in a generic memory allocator.

Will change it to addr.

> 
>> +void virtio_direct_free(struct device *dev, size_t size, void *vaddr,
>> +		dma_addr_t dma_addr, unsigned long attrs)
>> +{
>> +	free_pages_exact(vaddr, PAGE_ALIGN(size));
>> +}
>> +
>> +const struct dma_map_ops virtio_direct_dma_ops = {
>> +	.alloc			= virtio_direct_alloc,
>> +	.free			= virtio_direct_free,
>> +	.map_page		= virtio_direct_map_page,
>> +	.unmap_page		= virtio_direct_unmap_page,
>> +	.mapping_error		= virtio_direct_mapping_error,
>> +};
> 
> This is missing a dma_map_sg implementation.  In general this is
> mandatory for dma_ops.  So either you implement it or explain in
> a common why you think you can skip it.

Hmm. IIUC virtio core never used dma_map_sg(). Am I missing something
here ? The only reference to dma_map_sg() is inside a comment.

$git grep dma_map_sg drivers/virtio/
drivers/virtio/virtio_ring.c:    * We can't use dma_map_sg, because we don't use scatterlists in

> 
>> +EXPORT_SYMBOL(virtio_direct_dma_ops);
> 
> EXPORT_SYMBOL_GPL like all virtio symbols, please.

I am planning to drop EXPORT_SYMBOL from virtio_direct_dma_ops structure.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-07-30 13:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: robh, srikar, benh, Will Deacon, linux-kernel, linuxram,
	virtualization, paulus, marc.zyngier, mpe, joe, robin.murphy,
	david, linuxppc-dev, elfring, haren, Anshuman Khandual
In-Reply-To: <20180730111802.GA9830@infradead.org>

On Mon, Jul 30, 2018 at 04:18:02AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 30, 2018 at 01:28:03PM +0300, Michael S. Tsirkin wrote:
> > Let me reply to the "crappy" part first:
> > So virtio devices can run on another CPU or on a PCI bus. Configuration
> > can happen over mupltiple transports.  There is a discovery protocol to
> > figure out where it is. It has some warts but any real system has warts.
> > 
> > So IMHO virtio running on another CPU isn't "legacy virtual crappy
> > virtio". virtio devices that actually sit on a PCI bus aren't "sane"
> > simply because the DMA is more convoluted on some architectures.
> 
> All of what you said would be true if virtio didn't claim to be
> a PCI device.  

There's nothing virtio claims to be.  It's a PV device that uses PCI for
its configuration.  Configuration is enumerated on the virtual PCI bus.
That part of the interface is emulated PCI. Data path is through a
PV device enumerated on the virtio bus.

> Once it claims to be a PCI device and we also see
> real hardware written to the interface I stand to all what I said
> above.

Real hardware would reuse parts of the interface but by necessity it
needs to behave slightly differently on some platforms.  However for
some platforms (such as x86) a PV virtio driver will by luck work with a
PCI device backend without changes. As these platforms and drivers are
widely deployed, some people will deploy hardware like that.  Should be
a non issue as by definition it's transparent to guests.

> > With this out of my system:
> > I agree these approaches are hacky. I think it is generally better to
> > have virtio feature negotiation tell you whether device runs on a CPU or
> > not rather than rely on platform specific ways for this. To this end
> > there was a recent proposal to rename VIRTIO_F_IO_BARRIER to
> > VIRTIO_F_REAL_DEVICE.  It got stuck since "real" sounds vague to people,
> > e.g.  what if it's a VF - is that real or not? But I can see something
> > like e.g. VIRTIO_F_PLATFORM_DMA gaining support.
> > 
> > We would then rename virtio_has_iommu_quirk to virtio_has_dma_quirk
> > and test VIRTIO_F_PLATFORM_DMA in addition to the IOMMU thing.
> 
> I don't really care about the exact naming, and indeed a device that
> sets the flag doesn't have to be a 'real' device - it just has to act
> like one.  I explained all the issues that this means (at least relating
> to DMA) in one of the previous threads.

I believe you refer to this:
https://lkml.org/lkml/2018/6/7/15
that was a very helpful list outlining the problems we need to solve,
thanks a lot for that!

> The important bit is that we can specify exact behavior for both
> devices that sets the "I'm real!" flag and that ones that don't exactly
> in the spec.

I would very much like that, yes.

> And that very much excludes arch-specific (or
> Xen-specific) overrides.

We already committed to a xen specific hack but generally I prefer
devices that describe how they work instead of platforms magically
guessing, yes.

However the question people raise is that DMA API is already full of
arch-specific tricks the likes of which are outlined in your post linked
above. How is this one much worse?

-- 
MST

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-07-30 11:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: robh, srikar, benh, Will Deacon, linux-kernel, linuxram,
	virtualization, Christoph Hellwig, paulus, marc.zyngier, mpe, joe,
	robin.murphy, david, linuxppc-dev, elfring, haren,
	Anshuman Khandual
In-Reply-To: <20180730125100-mutt-send-email-mst@kernel.org>

On Mon, Jul 30, 2018 at 01:28:03PM +0300, Michael S. Tsirkin wrote:
> Let me reply to the "crappy" part first:
> So virtio devices can run on another CPU or on a PCI bus. Configuration
> can happen over mupltiple transports.  There is a discovery protocol to
> figure out where it is. It has some warts but any real system has warts.
> 
> So IMHO virtio running on another CPU isn't "legacy virtual crappy
> virtio". virtio devices that actually sit on a PCI bus aren't "sane"
> simply because the DMA is more convoluted on some architectures.

All of what you said would be true if virtio didn't claim to be
a PCI device.  Once it claims to be a PCI device and we also see
real hardware written to the interface I stand to all what I said
above.

> With this out of my system:
> I agree these approaches are hacky. I think it is generally better to
> have virtio feature negotiation tell you whether device runs on a CPU or
> not rather than rely on platform specific ways for this. To this end
> there was a recent proposal to rename VIRTIO_F_IO_BARRIER to
> VIRTIO_F_REAL_DEVICE.  It got stuck since "real" sounds vague to people,
> e.g.  what if it's a VF - is that real or not? But I can see something
> like e.g. VIRTIO_F_PLATFORM_DMA gaining support.
> 
> We would then rename virtio_has_iommu_quirk to virtio_has_dma_quirk
> and test VIRTIO_F_PLATFORM_DMA in addition to the IOMMU thing.

I don't really care about the exact naming, and indeed a device that
sets the flag doesn't have to be a 'real' device - it just has to act
like one.  I explained all the issues that this means (at least relating
to DMA) in one of the previous threads.

The important bit is that we can specify exact behavior for both
devices that sets the "I'm real!" flag and that ones that don't exactly
in the spec.  And that very much excludes arch-specific (or
Xen-specific) overrides.

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Michael S. Tsirkin @ 2018-07-30 10:28 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: robh, srikar, benh, Will Deacon, linux-kernel, linuxram,
	virtualization, paulus, marc.zyngier, mpe, joe, robin.murphy,
	david, linuxppc-dev, elfring, haren, Anshuman Khandual
In-Reply-To: <20180730093414.GD26245@infradead.org>

On Mon, Jul 30, 2018 at 02:34:14AM -0700, Christoph Hellwig wrote:
> We really need to distinguish between legacy virtual crappy
> virtio (and that includes v1) that totally ignores the bus it pretends
> to be on, and sane virtio (to be defined) that sit on a real (or
> properly emulated including iommu and details for dma mapping) bus.

Let me reply to the "crappy" part first:
So virtio devices can run on another CPU or on a PCI bus. Configuration
can happen over mupltiple transports.  There is a discovery protocol to
figure out where it is. It has some warts but any real system has warts.

So IMHO virtio running on another CPU isn't "legacy virtual crappy
virtio". virtio devices that actually sit on a PCI bus aren't "sane"
simply because the DMA is more convoluted on some architectures.

Performance impact of the optimizations possible when you know
your "device" is in fact just another CPU has been measured,
it is real, so we aren't interested in adding all that overhead back
just so we can use DMA API. The "correct then fast" mantra doesn't
apply to something that is as widely deployed as virtio.

And I can accept an argument that maybe the DMA API isn't designed to
support such virtual DMA. Whether it should I don't know.

With this out of my system:
I agree these approaches are hacky. I think it is generally better to
have virtio feature negotiation tell you whether device runs on a CPU or
not rather than rely on platform specific ways for this. To this end
there was a recent proposal to rename VIRTIO_F_IO_BARRIER to
VIRTIO_F_REAL_DEVICE.  It got stuck since "real" sounds vague to people,
e.g.  what if it's a VF - is that real or not? But I can see something
like e.g. VIRTIO_F_PLATFORM_DMA gaining support.

We would then rename virtio_has_iommu_quirk to virtio_has_dma_quirk
and test VIRTIO_F_PLATFORM_DMA in addition to the IOMMU thing.

-- 
MST

^ permalink raw reply

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-07-30  9:34 UTC (permalink / raw)
  To: Will Deacon
  Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
	hch, paulus, marc.zyngier, mpe, joe, robin.murphy, david,
	linuxppc-dev, elfring, haren, Anshuman Khandual
In-Reply-To: <20180727095804.GA25592@arm.com>

On Fri, Jul 27, 2018 at 10:58:05AM +0100, Will Deacon wrote:
> 
> I just wanted to say that this patch series provides a means for us to
> force the coherent DMA ops for legacy virtio devices on arm64, which in turn
> means that we can enable the SMMU with legacy devices in our fastmodel
> emulation platform (which is slowly being upgraded to virtio 1.0) without
> hanging during boot. Patch below.

Yikes, this is a nightmare.  That is exactly where I do not want things
to end up.  We really need to distinguish between legacy virtual crappy
virtio (and that includes v1) that totally ignores the bus it pretends
to be on, and sane virtio (to be defined) that sit on a real (or
properly emulated including iommu and details for dma mapping) bus.

Having a mumble jumble of arch specific undocumented magic as in
the powerpc patch replied to or this arm patch is a complete no-go.

Nacked-by: Christoph Hellwig <hch@lst.de>

for both.

^ permalink raw reply

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Christoph Hellwig @ 2018-07-30  9:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
	paulus, mpe, joe, david, linuxppc-dev, elfring, haren,
	Anshuman Khandual
In-Reply-To: <20180729001344-mutt-send-email-mst@kernel.org>

> > +
> > +	if (xen_domain())
> > +		goto skip_override;
> > +
> > +	if (virtio_has_iommu_quirk(dev))
> > +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> > +
> > + skip_override:
> > +
> 
> I prefer normal if scoping as opposed to goto spaghetti pls.
> Better yet move vring_use_dma_api here and use it.
> Less of a chance something will break.

I agree about avoid pointless gotos here, but we can do things
perfectly well without either gotos or a confusing helper here
if we structure it right. E.g.:

	// suitably detailed comment here
	if (!xen_domain() &&
	    !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM))
		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);

and while we're at it - modifying dma ops for the parent looks very
dangerous.  I don't think we can do that, as it could break iommu
setup interactions.  IFF we set a specific dma map ops it has to be
on the virtio device itself, of which we have full control.

^ permalink raw reply

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Christoph Hellwig @ 2018-07-30  9:25 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
	hch, paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180720035941.6844-3-khandual@linux.vnet.ibm.com>

> +const struct dma_map_ops virtio_direct_dma_ops;

This belongs into a header if it is non-static.  If you only
use it in this file anyway please mark it static and avoid a forward
declaration.

> +
>  int virtio_finalize_features(struct virtio_device *dev)
>  {
>  	int ret = dev->config->finalize_features(dev);
> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>  	if (ret)
>  		return ret;
>  
> +	if (virtio_has_iommu_quirk(dev))
> +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);

This needs a big fat comment explaining what is going on here.

Also not new, but I find the existance of virtio_has_iommu_quirk and its
name horribly confusing.  It might be better to open code it here once
only a single caller is left.

^ permalink raw reply

* Re: [RFC 1/4] virtio: Define virtio_direct_dma_ops structure
From: Christoph Hellwig @ 2018-07-30  9:24 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: robh, srikar, mst, benh, linuxram, linux-kernel, virtualization,
	hch, paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180720035941.6844-2-khandual@linux.vnet.ibm.com>

> +/*
> + * Virtio direct mapping DMA API operations structure
> + *
> + * This defines DMA API structure for all virtio devices which would not
> + * either bring in their own DMA OPS from architecture or they would not
> + * like to use architecture specific IOMMU based DMA OPS because QEMU
> + * expects GPA instead of an IOVA in absence of VIRTIO_F_IOMMU_PLATFORM.
> + */
> +dma_addr_t virtio_direct_map_page(struct device *dev, struct page *page,
> +			    unsigned long offset, size_t size,
> +			    enum dma_data_direction dir,
> +			    unsigned long attrs)

All these functions should probably be marked static.

> +void virtio_direct_unmap_page(struct device *hwdev, dma_addr_t dev_addr,
> +			size_t size, enum dma_data_direction dir,
> +			unsigned long attrs)
> +{
> +}

No need to implement no-op callbacks in struct dma_map_ops.

> +
> +int virtio_direct_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
> +{
> +	return 0;
> +}

Including this one.

> +void *virtio_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> +		gfp_t gfp, unsigned long attrs)
> +{
> +	void *queue = alloc_pages_exact(PAGE_ALIGN(size), gfp);
> +
> +	if (queue) {
> +		phys_addr_t phys_addr = virt_to_phys(queue);
> +		*dma_handle = (dma_addr_t)phys_addr;
> +
> +		if (WARN_ON_ONCE(*dma_handle != phys_addr)) {
> +			free_pages_exact(queue, PAGE_ALIGN(size));
> +			return NULL;
> +		}
> +	}
> +	return queue;

queue is a very odd name in a generic memory allocator.

> +void virtio_direct_free(struct device *dev, size_t size, void *vaddr,
> +		dma_addr_t dma_addr, unsigned long attrs)
> +{
> +	free_pages_exact(vaddr, PAGE_ALIGN(size));
> +}
> +
> +const struct dma_map_ops virtio_direct_dma_ops = {
> +	.alloc			= virtio_direct_alloc,
> +	.free			= virtio_direct_free,
> +	.map_page		= virtio_direct_map_page,
> +	.unmap_page		= virtio_direct_unmap_page,
> +	.mapping_error		= virtio_direct_mapping_error,
> +};

This is missing a dma_map_sg implementation.  In general this is
mandatory for dma_ops.  So either you implement it or explain in
a common why you think you can skip it.

> +EXPORT_SYMBOL(virtio_direct_dma_ops);

EXPORT_SYMBOL_GPL like all virtio symbols, please.

^ permalink raw reply

* Re: [PATCH v2 2/2] virtio_balloon: replace oom notifier with shrinker
From: Michal Hocko @ 2018-07-30  9:00 UTC (permalink / raw)
  To: Wei Wang; +Cc: virtio-dev, mst, linux-kernel, virtualization, linux-mm, akpm
In-Reply-To: <1532683495-31974-3-git-send-email-wei.w.wang@intel.com>

On Fri 27-07-18 17:24:55, Wei Wang wrote:
> The OOM notifier is getting deprecated to use for the reasons mentioned
> here by Michal Hocko: https://lkml.org/lkml/2018/7/12/314
> 
> This patch replaces the virtio-balloon oom notifier with a shrinker
> to release balloon pages on memory pressure.

It would be great to document the replacement. This is not a small
change...

> In addition, the bug in the replaced virtballoon_oom_notify that only
> VIRTIO_BALLOON_ARRAY_PFNS_MAX (i.e 256) balloon pages can be freed
> though the user has specified more than that number is fixed in the
> shrinker_scan function.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
>  drivers/virtio/virtio_balloon.c | 115 +++++++++++++++++++++++-----------------
>  1 file changed, 65 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 9356a1a..6b2229b 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -27,7 +27,6 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  #include <linux/balloon_compaction.h>
> -#include <linux/oom.h>
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/mount.h>
> @@ -40,12 +39,12 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define OOM_VBALLOON_DEFAULT_PAGES 256
> +#define DEFAULT_BALLOON_PAGES_TO_SHRINK 256
>  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
>  
> -static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> -module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> -MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
> +static unsigned long balloon_pages_to_shrink = DEFAULT_BALLOON_PAGES_TO_SHRINK;
> +module_param(balloon_pages_to_shrink, ulong, 0600);
> +MODULE_PARM_DESC(balloon_pages_to_shrink, "pages to free on memory presure");
>  
>  #ifdef CONFIG_BALLOON_COMPACTION
>  static struct vfsmount *balloon_mnt;
> @@ -86,8 +85,8 @@ struct virtio_balloon {
>  	/* Memory statistics */
>  	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>  
> -	/* To register callback in oom notifier call chain */
> -	struct notifier_block nb;
> +	/* To register a shrinker to shrink memory upon memory pressure */
> +	struct shrinker shrinker;
>  };
>  
>  static struct virtio_device_id id_table[] = {
> @@ -365,38 +364,6 @@ static void update_balloon_size(struct virtio_balloon *vb)
>  		      &actual);
>  }
>  
> -/*
> - * virtballoon_oom_notify - release pages when system is under severe
> - *			    memory pressure (called from out_of_memory())
> - * @self : notifier block struct
> - * @dummy: not used
> - * @parm : returned - number of freed pages
> - *
> - * The balancing of memory by use of the virtio balloon should not cause
> - * the termination of processes while there are pages in the balloon.
> - * If virtio balloon manages to release some memory, it will make the
> - * system return and retry the allocation that forced the OOM killer
> - * to run.
> - */
> -static int virtballoon_oom_notify(struct notifier_block *self,
> -				  unsigned long dummy, void *parm)
> -{
> -	struct virtio_balloon *vb;
> -	unsigned long *freed;
> -	unsigned num_freed_pages;
> -
> -	vb = container_of(self, struct virtio_balloon, nb);
> -	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> -		return NOTIFY_OK;
> -
> -	freed = parm;
> -	num_freed_pages = leak_balloon(vb, oom_pages);
> -	update_balloon_size(vb);
> -	*freed += num_freed_pages;
> -
> -	return NOTIFY_OK;
> -}
> -
>  static void update_balloon_stats_func(struct work_struct *work)
>  {
>  	struct virtio_balloon *vb;
> @@ -548,6 +515,54 @@ static struct file_system_type balloon_fs = {
>  
>  #endif /* CONFIG_BALLOON_COMPACTION */
>  
> +static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker,
> +						  struct shrink_control *sc)
> +{
> +	unsigned long pages_to_free = balloon_pages_to_shrink,
> +		      pages_freed = 0;
> +	struct virtio_balloon *vb = container_of(shrinker,
> +					struct virtio_balloon, shrinker);
> +
> +	/*
> +	 * One invocation of leak_balloon can deflate at most
> +	 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> +	 * multiple times to deflate pages till reaching
> +	 * balloon_pages_to_shrink pages.
> +	 */
> +	while (vb->num_pages && pages_to_free) {
> +		pages_to_free = balloon_pages_to_shrink - pages_freed;
> +		pages_freed += leak_balloon(vb, pages_to_free);
> +	}
> +	update_balloon_size(vb);
> +
> +	return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +}
> +
> +static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> +						   struct shrink_control *sc)
> +{
> +	struct virtio_balloon *vb = container_of(shrinker,
> +					struct virtio_balloon, shrinker);
> +
> +	return min_t(unsigned long, vb->num_pages, balloon_pages_to_shrink) /
> +	       VIRTIO_BALLOON_PAGES_PER_PAGE;
> +}
> +
> +static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb)
> +{
> +	unregister_shrinker(&vb->shrinker);
> +}
> +
> +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb)
> +{
> +	vb->shrinker.scan_objects = virtio_balloon_shrinker_scan;
> +	vb->shrinker.count_objects = virtio_balloon_shrinker_count;
> +	vb->shrinker.batch = 0;
> +	vb->shrinker.seeks = DEFAULT_SEEKS;
> +
> +	return register_shrinker(&vb->shrinker);
> +}
> +
>  static int virtballoon_probe(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb;
> @@ -580,17 +595,10 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	if (err)
>  		goto out_free_vb;
>  
> -	vb->nb.notifier_call = virtballoon_oom_notify;
> -	vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;
> -	err = register_oom_notifier(&vb->nb);
> -	if (err < 0)
> -		goto out_del_vqs;
> -
>  #ifdef CONFIG_BALLOON_COMPACTION
>  	balloon_mnt = kern_mount(&balloon_fs);
>  	if (IS_ERR(balloon_mnt)) {
>  		err = PTR_ERR(balloon_mnt);
> -		unregister_oom_notifier(&vb->nb);
>  		goto out_del_vqs;
>  	}
>  
> @@ -599,13 +607,20 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  	if (IS_ERR(vb->vb_dev_info.inode)) {
>  		err = PTR_ERR(vb->vb_dev_info.inode);
>  		kern_unmount(balloon_mnt);
> -		unregister_oom_notifier(&vb->nb);
>  		vb->vb_dev_info.inode = NULL;
>  		goto out_del_vqs;
>  	}
>  	vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
>  #endif
> -
> +	/*
> +	 * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a
> +	 * shrinker needs to be registered to relieve memory pressure.
> +	 */
> +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
> +		err = virtio_balloon_register_shrinker(vb);
> +		if (err)
> +			goto out_del_vqs;
> +	}
>  	virtio_device_ready(vdev);
>  
>  	if (towards_target(vb))
> @@ -637,8 +652,8 @@ static void virtballoon_remove(struct virtio_device *vdev)
>  {
>  	struct virtio_balloon *vb = vdev->priv;
>  
> -	unregister_oom_notifier(&vb->nb);
> -
> +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> +		virtio_balloon_unregister_shrinker(vb);
>  	spin_lock_irq(&vb->stop_update_lock);
>  	vb->stop_update = true;
>  	spin_unlock_irq(&vb->stop_update_lock);
> -- 
> 2.7.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* Re: [PATCH 2/2] tools/virtio: add kmalloc_array stub
From: Jason Wang @ 2018-07-30  7:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel; +Cc: virtualization, khandual
In-Reply-To: <20180725134057.113423-2-mst@redhat.com>



On 2018年07月25日 21:45, Michael S. Tsirkin wrote:
> Fixes: 6da2ec56059 ("treewide: kmalloc() -> kmalloc_array()")
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   tools/virtio/linux/kernel.h | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/tools/virtio/linux/kernel.h b/tools/virtio/linux/kernel.h
> index fca8381bbe04..fb22bccfbc8a 100644
> --- a/tools/virtio/linux/kernel.h
> +++ b/tools/virtio/linux/kernel.h
> @@ -52,6 +52,11 @@ static inline void *kmalloc(size_t s, gfp_t gfp)
>   		return __kmalloc_fake;
>   	return malloc(s);
>   }
> +static inline void *kmalloc_array(unsigned n, size_t s, gfp_t gfp)
> +{
> +	return kmalloc(n * s, gfp);
> +}
> +
>   static inline void *kzalloc(size_t s, gfp_t gfp)
>   {
>   	void *p = kmalloc(s, gfp);

Reviewed-by: Jason Wang <jasowang@redhat.com>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH 1/2] tools/virtio: add dma barrier stubs
From: Jason Wang @ 2018-07-30  7:28 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel; +Cc: virtualization, khandual
In-Reply-To: <20180725134057.113423-1-mst@redhat.com>



On 2018年07月25日 21:45, Michael S. Tsirkin wrote:
> Fixes: 55e49dc43a8 ("virtio_ring: switch to dma_XX barriers for rpmsg")
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   tools/virtio/asm/barrier.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/virtio/asm/barrier.h b/tools/virtio/asm/barrier.h
> index 0ac3caf90877..d0351f83aebe 100644
> --- a/tools/virtio/asm/barrier.h
> +++ b/tools/virtio/asm/barrier.h
> @@ -13,8 +13,8 @@
>   } while (0);
>   /* Weak barriers should be used. If not - it's a bug */
>   # define mb() abort()
> -# define rmb() abort()
> -# define wmb() abort()
> +# define dma_rmb() abort()
> +# define dma_wmb() abort()
>   #else
>   #error Please fill in barrier macros
>   #endif

Reviewed-by: Jason Wang <jasowang@redhat.com>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-30  4:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
	paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180729001344-mutt-send-email-mst@kernel.org>

On 07/29/2018 02:46 AM, Michael S. Tsirkin wrote:
> On Sat, Jul 28, 2018 at 02:26:24PM +0530, Anshuman Khandual wrote:
>> On 07/20/2018 09:29 AM, Anshuman Khandual wrote:
>>> Now that virtio core always needs all virtio devices to have DMA OPS, we
>>> need to make sure that the structure it points is the right one. In the
>>> absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel.
>>> In such case, virtio device must use default virtio_direct_dma_ops DMA OPS
>>> structure which transforms scatter gather buffer addresses as GPA. This
>>> DMA OPS override must happen as early as possible during virtio device
>>> initializatin sequence before virtio core starts using given device's DMA
>>> OPS callbacks for I/O transactions. This change detects device's IOMMU flag
>>> and does the override in case the flag is cleared.
>>>
>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> ---
>>>  drivers/virtio/virtio.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>> index 7907ad3..6b13987 100644
>>> --- a/drivers/virtio/virtio.c
>>> +++ b/drivers/virtio/virtio.c
>>> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
>>>  }
>>>  EXPORT_SYMBOL_GPL(virtio_add_status);
>>>
>>> +const struct dma_map_ops virtio_direct_dma_ops;
>>> +
>>>  int virtio_finalize_features(struct virtio_device *dev)
>>>  {
>>>  	int ret = dev->config->finalize_features(dev);
>>> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>>>  	if (ret)
>>>  		return ret;
>>
>>
>> The previous patch removed the code block for XEN guests which forced
>> the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM
>> flag on the device. Here is what I have removed with patch 2/4 which
>> breaks the existing semantics on XEN guests.
>>
>> -static bool vring_use_dma_api(struct virtio_device *vdev)
>> -{
>> -	if (!virtio_has_iommu_quirk(vdev))
>> -		return true;
>> -
>> -	/* Otherwise, we are left to guess. */
>> -	/*
>> -	 * In theory, it's possible to have a buggy QEMU-supposed
>> -	 * emulated Q35 IOMMU and Xen enabled at the same time.  On
>> -	 * such a configuration, virtio has never worked and will
>> -	 * not work without an even larger kludge.  Instead, enable
>> -	 * the DMA API if we're a Xen guest, which at least allows
>> -	 * all of the sensible Xen configurations to work correctly.
>> -	 */
>> -	if (xen_domain())
>> -		return true;
>> -
>> -	return false;
>> -}
>>
>> XEN guests would not like override with virtio_direct_dma_ops in any
>> case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing
>> semantics can be preserved with something like this. It just assumes
>> that dev->dma_ops is non-NULL and a valid one set by the architecture.
>> If required we can add those tests here before skipping the override.
>>
>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>> index 7907ad3..6b13987 100644
>> --- a/drivers/virtio/virtio.c
>> +++ b/drivers/virtio/virtio.c
>> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
>>  }
>>  EXPORT_SYMBOL_GPL(virtio_add_status);
>>
>> +const struct dma_map_ops virtio_direct_dma_ops;
>> +
>>  int virtio_finalize_features(struct virtio_device *dev)
>>  {
>>  	int ret = dev->config->finalize_features(dev);
>> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>>  	if (ret)
>>  		return ret;
>> +
>> +	if (xen_domain())
>> +		goto skip_override;
>> +
>> +	if (virtio_has_iommu_quirk(dev))
>> +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
>> +
>> + skip_override:
>> +
> 
> I prefer normal if scoping as opposed to goto spaghetti pls.
> Better yet move vring_use_dma_api here and use it.
> Less of a chance something will break.

Sure, will move vring_use_dma_api() function in here.

^ permalink raw reply

* Re: [PATCH net-next v6 3/4] net: vhost: factor out busy polling logic to vhost_net_busy_poll()
From: Jason Wang @ 2018-07-30  3:16 UTC (permalink / raw)
  To: Tonghao Zhang, makita.toshiaki
  Cc: Linux Kernel Network Developers, toshiaki.makita1, virtualization,
	mst
In-Reply-To: <CAMDZJNVVJs35kuvktTxn+mmDz7db+1K-kfuOoMUn9Z=WoayUVw@mail.gmail.com>



On 2018年07月24日 11:28, Tonghao Zhang wrote:
> On Tue, Jul 24, 2018 at 10:53 AM Toshiaki Makita
> <makita.toshiaki@lab.ntt.co.jp>  wrote:
>> On 2018/07/24 2:31, Tonghao Zhang wrote:
>>> On Mon, Jul 23, 2018 at 10:20 PM Toshiaki Makita
>>> <toshiaki.makita1@gmail.com>  wrote:
>>>> On 18/07/23 (月) 21:43, Tonghao Zhang wrote:
>>>>> On Mon, Jul 23, 2018 at 5:58 PM Toshiaki Makita
>>>>> <makita.toshiaki@lab.ntt.co.jp>  wrote:
>>>>>> On 2018/07/22 3:04,xiangxia.m.yue@gmail.com  wrote:
>>>>>>> From: Tonghao Zhang<xiangxia.m.yue@gmail.com>
>>>>>>>
>>>>>>> Factor out generic busy polling logic and will be
>>>>>>> used for in tx path in the next patch. And with the patch,
>>>>>>> qemu can set differently the busyloop_timeout for rx queue.
>>>>>>>
>>>>>>> Signed-off-by: Tonghao Zhang<xiangxia.m.yue@gmail.com>
>>>>>>> ---
>>>>>> ...
>>>>>>> +static void vhost_net_busy_poll_vq_check(struct vhost_net *net,
>>>>>>> +                                      struct vhost_virtqueue *rvq,
>>>>>>> +                                      struct vhost_virtqueue *tvq,
>>>>>>> +                                      bool rx)
>>>>>>> +{
>>>>>>> +     struct socket *sock = rvq->private_data;
>>>>>>> +
>>>>>>> +     if (rx) {
>>>>>>> +             if (!vhost_vq_avail_empty(&net->dev, tvq)) {
>>>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>>>> +             } else if (unlikely(vhost_enable_notify(&net->dev, tvq))) {
>>>>>>> +                     vhost_disable_notify(&net->dev, tvq);
>>>>>>> +                     vhost_poll_queue(&tvq->poll);
>>>>>>> +             }
>>>>>>> +     } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>>>> +                 !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>>> +             vhost_poll_queue(&rvq->poll);
>>>>>> Now we wait for vq_avail for rx as well, I think you cannot skip
>>>>>> vhost_enable_notify() on tx. Probably you might want to do:
>>>>> I think vhost_enable_notify is needed.
>>>>>
>>>>>> } else if (sock && sk_has_rx_data(sock->sk)) {
>>>>>>           if (!vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>>                   vhost_poll_queue(&rvq->poll);
>>>>>>           } else if (unlikely(vhost_enable_notify(&net->dev, rvq))) {
>>>>>>                   vhost_disable_notify(&net->dev, rvq);
>>>>>>                   vhost_poll_queue(&rvq->poll);
>>>>>>           }
>>>>>> }
>>>>> As Jason review as before, we only want rx kick when packet is pending at
>>>>> socket but we're out of available buffers. So we just enable notify,
>>>>> but not poll it ?
>>>>>
>>>>>           } else if ((sock && sk_has_rx_data(sock->sk)) &&
>>>>>                       !vhost_vq_avail_empty(&net->dev, rvq)) {
>>>>>                   vhost_poll_queue(&rvq->poll);
>>>>>           else {
>>>>>                   vhost_enable_notify(&net->dev, rvq);
>>>>>           }
>>>> When vhost_enable_notify() returns true the avail becomes non-empty
>>>> while we are enabling notify. We may delay the rx process if we don't
>>>> check the return value of vhost_enable_notify().
>>> I got it thanks.
>>>>>> Also it's better to care vhost_net_disable_vq()/vhost_net_enable_vq() on tx?
>>>>> I cant find why it is better, if necessary, we can do it.
>>>> The reason is pretty simple... we are busypolling the socket so we don't
>>>> need rx wakeups during it?
>>> OK, but one question, how about rx? do we use the
>>> vhost_net_disable_vq/vhost_net_ensable_vq on rx ?
>> If we are busypolling the sock tx buf? I'm not sure if polling it
>> improves the performance.
> Not the sock tx buff, when we are busypolling in handle_rx, we will
> check the tx vring via  vhost_vq_avail_empty.
> So, should we the disable tvq, e.g. vhost_net_disable_vq(net, tvq)?> --

This could be done on top since tx wakeups only happnes when we run out 
of sndbuf.

Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net-next v6 1/4] net: vhost: lock the vqs one by one
From: Jason Wang @ 2018-07-30  2:54 UTC (permalink / raw)
  To: Tonghao Zhang, mst; +Cc: Linux Kernel Network Developers, virtualization
In-Reply-To: <CAMDZJNX41vtdNNEAxHYwC+WcrJFkON70hVumVE9rbFDBC5QUOQ@mail.gmail.com>



On 2018年07月25日 20:05, Tonghao Zhang wrote:
> On Sun, Jul 22, 2018 at 11:26 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Sat, Jul 21, 2018 at 11:03:59AM -0700, xiangxia.m.yue@gmail.com wrote:
>>> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>>
>>> This patch changes the way that lock all vqs
>>> at the same, to lock them one by one. It will
>>> be used for next patch to avoid the deadlock.
>>>
>>> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>   drivers/vhost/vhost.c | 24 +++++++-----------------
>>>   1 file changed, 7 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>>> index a502f1a..a1c06e7 100644
>>> --- a/drivers/vhost/vhost.c
>>> +++ b/drivers/vhost/vhost.c
>>> @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
>>>   {
>>>        int i;
>>>
>>> -     for (i = 0; i < d->nvqs; ++i)
>>> +     for (i = 0; i < d->nvqs; ++i) {
>>> +             mutex_lock(&d->vqs[i]->mutex);
>>>                __vhost_vq_meta_reset(d->vqs[i]);
>>> +             mutex_unlock(&d->vqs[i]->mutex);
>>> +     }
>>>   }
>>>
>>>   static void vhost_vq_reset(struct vhost_dev *dev,
>>> @@ -890,20 +893,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
>>>   #define vhost_get_used(vq, x, ptr) \
>>>        vhost_get_user(vq, x, ptr, VHOST_ADDR_USED)
>>>
>>> -static void vhost_dev_lock_vqs(struct vhost_dev *d)
>>> -{
>>> -     int i = 0;
>>> -     for (i = 0; i < d->nvqs; ++i)
>>> -             mutex_lock_nested(&d->vqs[i]->mutex, i);
>>> -}
>>> -
>>> -static void vhost_dev_unlock_vqs(struct vhost_dev *d)
>>> -{
>>> -     int i = 0;
>>> -     for (i = 0; i < d->nvqs; ++i)
>>> -             mutex_unlock(&d->vqs[i]->mutex);
>>> -}
>>> -
>>>   static int vhost_new_umem_range(struct vhost_umem *umem,
>>>                                u64 start, u64 size, u64 end,
>>>                                u64 userspace_addr, int perm)
>>> @@ -953,7 +942,10 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>>>                if (msg->iova <= vq_msg->iova &&
>>>                    msg->iova + msg->size - 1 > vq_msg->iova &&
>>>                    vq_msg->type == VHOST_IOTLB_MISS) {
>>> +                     mutex_lock(&node->vq->mutex);
>>>                        vhost_poll_queue(&node->vq->poll);
>>> +                     mutex_unlock(&node->vq->mutex);
>>> +
>>>                        list_del(&node->node);
>>>                        kfree(node);
>>>                }
>>> @@ -985,7 +977,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>>>        int ret = 0;
>>>
>>>        mutex_lock(&dev->mutex);
>>> -     vhost_dev_lock_vqs(dev);
>>>        switch (msg->type) {
>>>        case VHOST_IOTLB_UPDATE:
>>>                if (!dev->iotlb) {
>>> @@ -1019,7 +1010,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>>>                break;
>>>        }
>>>
>>> -     vhost_dev_unlock_vqs(dev);
>>>        mutex_unlock(&dev->mutex);
>>>
>>>        return ret;
>> I do prefer the finer-grained locking but I remember we
>> discussed something like this in the past and Jason saw issues
>> with such a locking.
> This change is suggested by Jason. Should I send new version because
> the patch 3 is changed.
>
>> Jason?

Actually, the code was a little bit tricky here. Since it assumes 
handle_tx() and handle_rx() run on a single thread. Though the lock 
ordering is different, it was still safe.

Maybe we can add some comments to explain this.

Thanks

>>
>>> --
>>> 1.8.3.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Michael S. Tsirkin @ 2018-07-28 21:16 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: robh, srikar, benh, linuxram, linux-kernel, virtualization, hch,
	paulus, mpe, joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <c443ad8c-fb81-302d-edb2-5521831d38da@linux.vnet.ibm.com>

On Sat, Jul 28, 2018 at 02:26:24PM +0530, Anshuman Khandual wrote:
> On 07/20/2018 09:29 AM, Anshuman Khandual wrote:
> > Now that virtio core always needs all virtio devices to have DMA OPS, we
> > need to make sure that the structure it points is the right one. In the
> > absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel.
> > In such case, virtio device must use default virtio_direct_dma_ops DMA OPS
> > structure which transforms scatter gather buffer addresses as GPA. This
> > DMA OPS override must happen as early as possible during virtio device
> > initializatin sequence before virtio core starts using given device's DMA
> > OPS callbacks for I/O transactions. This change detects device's IOMMU flag
> > and does the override in case the flag is cleared.
> > 
> > Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> > ---
> >  drivers/virtio/virtio.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index 7907ad3..6b13987 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
> >  }
> >  EXPORT_SYMBOL_GPL(virtio_add_status);
> > 
> > +const struct dma_map_ops virtio_direct_dma_ops;
> > +
> >  int virtio_finalize_features(struct virtio_device *dev)
> >  {
> >  	int ret = dev->config->finalize_features(dev);
> > @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
> >  	if (ret)
> >  		return ret;
> 
> 
> The previous patch removed the code block for XEN guests which forced
> the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM
> flag on the device. Here is what I have removed with patch 2/4 which
> breaks the existing semantics on XEN guests.
> 
> -static bool vring_use_dma_api(struct virtio_device *vdev)
> -{
> -	if (!virtio_has_iommu_quirk(vdev))
> -		return true;
> -
> -	/* Otherwise, we are left to guess. */
> -	/*
> -	 * In theory, it's possible to have a buggy QEMU-supposed
> -	 * emulated Q35 IOMMU and Xen enabled at the same time.  On
> -	 * such a configuration, virtio has never worked and will
> -	 * not work without an even larger kludge.  Instead, enable
> -	 * the DMA API if we're a Xen guest, which at least allows
> -	 * all of the sensible Xen configurations to work correctly.
> -	 */
> -	if (xen_domain())
> -		return true;
> -
> -	return false;
> -}
> 
> XEN guests would not like override with virtio_direct_dma_ops in any
> case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing
> semantics can be preserved with something like this. It just assumes
> that dev->dma_ops is non-NULL and a valid one set by the architecture.
> If required we can add those tests here before skipping the override.
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 7907ad3..6b13987 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
>  }
>  EXPORT_SYMBOL_GPL(virtio_add_status);
> 
> +const struct dma_map_ops virtio_direct_dma_ops;
> +
>  int virtio_finalize_features(struct virtio_device *dev)
>  {
>  	int ret = dev->config->finalize_features(dev);
> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>  	if (ret)
>  		return ret;
> +
> +	if (xen_domain())
> +		goto skip_override;
> +
> +	if (virtio_has_iommu_quirk(dev))
> +		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
> +
> + skip_override:
> +

I prefer normal if scoping as opposed to goto spaghetti pls.
Better yet move vring_use_dma_api here and use it.
Less of a chance something will break.

>  	if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1))
>  		return 0
> 
> Will incorporate these changes in the next version.

^ permalink raw reply

* Call for Workshops Proposals - WorldCIST'19, La Toja Island, Spain
From: Maria Lemos @ 2018-07-28 13:12 UTC (permalink / raw)
  To: virtualization


[-- Attachment #1.1: Type: text/plain, Size: 4413 bytes --]

----------------- CALL FOR WORKSHOPS PROPOSALS --------------------
WorldCIST'19 - 7th World Conference on Information Systems and Technologies
                  16th-19th of April 2019, La Toja Island, Galicia, Spain
                                       http://www.worldcist.org/ <http://www.worldcist.org/>
-----------------------------------------------------------------------------------


The Information Systems and Technologies research and industrial community is invited to submit proposals for the organization of Workshops at WorldCist'19 - 7th World Conference on Information Systems and Technologies, to be held at La Toja Island, Galicia, Spain, 16 - 19 April 2019. WorldCist is a global forum for researchers and practitioners to present and discuss the most recent innovations, trends, results, experiences and concerns in the several perspectives of Information Systems and Technologies.


###############
WORKSHOP FORMAT
###############

Workshops should focus on a specific scientific subject on the scope of WorldCist'19 but not directly included on the main conference areas. Each workshop will be coordinated by an Organizing Committee composed of, at least, two researchers in the field, preferably from different institutions and different countries. The organizers should create an international Program Committee for the Workshop, with recognized researchers within the specific Workshop scientific area. Each workshop should have at least ten submissions and five accepted papers in order to be conducted at WorldCist'19.

The selection of Workshops will be performed by WorldCist'19 Conference/Workshop Chairs. Workshops full and short papers will be published in the conference main proceedings in specific Workshop chapters published by Springer in a book of the AISC series. Proceedings will be submitted for indexation by ISI Thomson, SCOPUS, DBLP, EI-Compendex among several other scientific databases. Extended versions of best selected papers will be published in journals indexed by ISI/SCI, SCOPUS and DBLP. Detailed and up-to-date information may be found at WorldCist'19 website: http://www.worldcist.org/ <http://www.worldcist.org/>


#####################
WORKSHOP ORGANIZATION
#####################

The Organizing Committee of each Workshop will be responsible for:

- Producing and distributing the Workshop Call for Papers (CFP);
- Coordinating the review and selection process for the papers submitted to the Workshop, as Workshop chairs (on the paper submission system to be installed);
- Delivering the final versions of the papers accepted for the Workshop in accordance with the guidelines and deadlines defined by WorldCist'19 organizers;
- Coordinating and chairing the Workshop sessions at the conference.

WorldCist'19 organizers reserve the right to cancel any Workshop if deadlines are missed or if the number of registered attendees is too low to support the costs associated with the Workshop.


################
PROPOSAL CONTENT
################

Workshop proposals should contain the following information:

- Workshop title;
- Brief description of the specific scientific scope of the Workshop;
- List of topics of interest (max 15 topics);
- Reasons the Workshop should be held within WorldCist’19;
- Name, postal address, phone and email of all the members of the Workshop Organizing Committee;
- Preliminary proposal for the Workshop Program Committee (Names and affiliations).

Proposals should be submitted at https://easychair.org/conferences/?conf=worldcist-workshops2019 <https://easychair.org/conferences/?conf=worldcist-workshops2019> in PDF (in English), by September 10, 2018.


###############
IMPORTANT DATES
###############

- Deadline for Workshop proposals: September 10, 2018
- Notification of Workshop acceptance: September 20, 2018
- Workshop Final Information and Program Committee: October 10, 2018
- Deadline for paper submission: November 30, 2018
- Notification of paper acceptance: January 6, 2019
- Deadline for final versions and conference registration: January 20, 2019
- Conference dates: April 16-19, 2019


#####
CHAIR
#####

Luis Paulo Reis, AISTI, IEEE & University of Porto, Portugal


WorldCIST'19 Website: http://www.worldcist.org/ <http://www.worldcist.org/>


---
This email has been checked for viruses by AVG.
https://www.avg.com

[-- Attachment #1.2: Type: text/html, Size: 6036 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
From: Anshuman Khandual @ 2018-07-28  8:56 UTC (permalink / raw)
  To: virtualization, linux-kernel
  Cc: robh, srikar, mst, benh, linuxram, hch, paulus, mpe, joe,
	linuxppc-dev, elfring, haren, david
In-Reply-To: <20180720035941.6844-3-khandual@linux.vnet.ibm.com>

On 07/20/2018 09:29 AM, Anshuman Khandual wrote:
> Now that virtio core always needs all virtio devices to have DMA OPS, we
> need to make sure that the structure it points is the right one. In the
> absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel.
> In such case, virtio device must use default virtio_direct_dma_ops DMA OPS
> structure which transforms scatter gather buffer addresses as GPA. This
> DMA OPS override must happen as early as possible during virtio device
> initializatin sequence before virtio core starts using given device's DMA
> OPS callbacks for I/O transactions. This change detects device's IOMMU flag
> and does the override in case the flag is cleared.
> 
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  drivers/virtio/virtio.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 7907ad3..6b13987 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
>  }
>  EXPORT_SYMBOL_GPL(virtio_add_status);
> 
> +const struct dma_map_ops virtio_direct_dma_ops;
> +
>  int virtio_finalize_features(struct virtio_device *dev)
>  {
>  	int ret = dev->config->finalize_features(dev);
> @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
>  	if (ret)
>  		return ret;


The previous patch removed the code block for XEN guests which forced
the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM
flag on the device. Here is what I have removed with patch 2/4 which
breaks the existing semantics on XEN guests.

-static bool vring_use_dma_api(struct virtio_device *vdev)
-{
-	if (!virtio_has_iommu_quirk(vdev))
-		return true;
-
-	/* Otherwise, we are left to guess. */
-	/*
-	 * In theory, it's possible to have a buggy QEMU-supposed
-	 * emulated Q35 IOMMU and Xen enabled at the same time.  On
-	 * such a configuration, virtio has never worked and will
-	 * not work without an even larger kludge.  Instead, enable
-	 * the DMA API if we're a Xen guest, which at least allows
-	 * all of the sensible Xen configurations to work correctly.
-	 */
-	if (xen_domain())
-		return true;
-
-	return false;
-}

XEN guests would not like override with virtio_direct_dma_ops in any
case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing
semantics can be preserved with something like this. It just assumes
that dev->dma_ops is non-NULL and a valid one set by the architecture.
If required we can add those tests here before skipping the override.

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 7907ad3..6b13987 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status)
 }
 EXPORT_SYMBOL_GPL(virtio_add_status);

+const struct dma_map_ops virtio_direct_dma_ops;
+
 int virtio_finalize_features(struct virtio_device *dev)
 {
 	int ret = dev->config->finalize_features(dev);
@@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev)
 	if (ret)
 		return ret;
+
+	if (xen_domain())
+		goto skip_override;
+
+	if (virtio_has_iommu_quirk(dev))
+		set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);
+
+ skip_override:
+
 	if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1))
 		return 0

Will incorporate these changes in the next version.

^ permalink raw reply related

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Anshuman Khandual @ 2018-07-28  8:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: robh, srikar, linuxram, linux-kernel, virtualization, hch, paulus,
	joe, linuxppc-dev, elfring, haren, david
In-Reply-To: <20180727143008-mutt-send-email-mst@kernel.org>

On 07/27/2018 05:01 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 25, 2018 at 08:56:23AM +0530, Anshuman Khandual wrote:
>> Results with and without the patches are similar.
> 
> Thanks! And another thing to try is virtio-net with
> a fast NIC backend (40G and up). Unfortunately
> at this point loopback tests stress the host
> scheduler too much.
> 

Sure. Will look around for a 40G NIC system. BTW I have been testing
virtio-net with a TAP device as back end.

ip tuntap add dev tap0 mode tap user $(whoami)
ip link set tap0 master virbr0
ip link set dev virbr0 up
ip link set dev tap0 up

which is exported into the guest as follows

-device virtio-net,netdev=network0,mac=52:55:00:d1:55:01 \
-netdev tap,id=network0,ifname=tap0,script=no,downscript=no \

But I have not run any network benchmarks on it though.

^ permalink raw reply

* RE: [PATCH v2 0/2] virtio-balloon: some improvements
From: Wang, Wei W @ 2018-07-28  2:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
	mhocko@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org,
	virtualization@lists.linux-foundation.org
In-Reply-To: <20180727170605-mutt-send-email-mst@kernel.org>

On Friday, July 27, 2018 10:06 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 27, 2018 at 05:24:53PM +0800, Wei Wang wrote:
> > This series is split from the "Virtio-balloon: support free page
> > reporting" series to make some improvements.
> >
> > v1->v2 ChangeLog:
> > - register the shrinker when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is
> negotiated.
> >
> > Wei Wang (2):
> >   virtio-balloon: remove BUG() in init_vqs
> >   virtio_balloon: replace oom notifier with shrinker
> 
> Thanks!
> Given it's very late in the release cycle, I'll merge this for the next Linux
> release.

No problem. Thanks!

Best,
Wei

^ permalink raw reply

* [PATCH] drm: qxl: Fix NULL pointer dereference at qxl_alloc_client_monitors_config
From: Anton Vasilyev @ 2018-07-27 15:30 UTC (permalink / raw)
  To: Dave Airlie
  Cc: ldv-project, David Airlie, linux-kernel, dri-devel,
	virtualization, Anton Vasilyev

If qxl_alloc_client_monitors_config() fails to allocate
client_monitors_config then NULL pointer dereference occurs
in function qxl_display_copy_rom_client_monitors_config() after
qxl_alloc_client_monitors_config() call.

The patch adds return error from qxl_alloc_client_monitors_config()
and additional status for qxl_display_copy_rom_client_monitors_config
return value.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
---
Note: Is it correct that qxl_display_read_client_monitors_config() does not
return error in case of fail?
---
 drivers/gpu/drm/qxl/qxl_display.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index 768207fbbae3..a59b2eca5f5b 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -37,7 +37,8 @@ static bool qxl_head_enabled(struct qxl_head *head)
 	return head->width && head->height;
 }
 
-static void qxl_alloc_client_monitors_config(struct qxl_device *qdev, unsigned count)
+static int qxl_alloc_client_monitors_config(struct qxl_device *qdev,
+		unsigned int count)
 {
 	if (qdev->client_monitors_config &&
 	    count > qdev->client_monitors_config->count) {
@@ -49,15 +50,17 @@ static void qxl_alloc_client_monitors_config(struct qxl_device *qdev, unsigned c
 				sizeof(struct qxl_monitors_config) +
 				sizeof(struct qxl_head) * count, GFP_KERNEL);
 		if (!qdev->client_monitors_config)
-			return;
+			return -ENOMEM;
 	}
 	qdev->client_monitors_config->count = count;
+	return 0;
 }
 
 enum {
 	MONITORS_CONFIG_MODIFIED,
 	MONITORS_CONFIG_UNCHANGED,
 	MONITORS_CONFIG_BAD_CRC,
+	MONITORS_CONFIG_ERROR,
 };
 
 static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
@@ -87,7 +90,10 @@ static int qxl_display_copy_rom_client_monitors_config(struct qxl_device *qdev)
 	      && (num_monitors != qdev->client_monitors_config->count)) {
 		status = MONITORS_CONFIG_MODIFIED;
 	}
-	qxl_alloc_client_monitors_config(qdev, num_monitors);
+	if (qxl_alloc_client_monitors_config(qdev, num_monitors)) {
+		status = MONITORS_CONFIG_ERROR;
+		return status;
+	}
 	/* we copy max from the client but it isn't used */
 	qdev->client_monitors_config->max_allowed =
 				qdev->monitors_config->max_allowed;
@@ -161,6 +167,10 @@ void qxl_display_read_client_monitors_config(struct qxl_device *qdev)
 			break;
 		udelay(5);
 	}
+	if (status == MONITORS_CONFIG_ERROR) {
+		DRM_DEBUG_KMS("ignoring client monitors config: error");
+		return;
+	}
 	if (status == MONITORS_CONFIG_BAD_CRC) {
 		DRM_DEBUG_KMS("ignoring client monitors config: bad crc");
 		return;
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH v2 0/2] virtio-balloon: some improvements
From: Michael S. Tsirkin @ 2018-07-27 14:06 UTC (permalink / raw)
  To: Wei Wang; +Cc: virtio-dev, linux-kernel, mhocko, linux-mm, akpm, virtualization
In-Reply-To: <1532683495-31974-1-git-send-email-wei.w.wang@intel.com>

On Fri, Jul 27, 2018 at 05:24:53PM +0800, Wei Wang wrote:
> This series is split from the "Virtio-balloon: support free page
> reporting" series to make some improvements.
> 
> v1->v2 ChangeLog:
> - register the shrinker when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is negotiated.
> 
> Wei Wang (2):
>   virtio-balloon: remove BUG() in init_vqs
>   virtio_balloon: replace oom notifier with shrinker

Thanks!
Given it's very late in the release cycle, I'll merge this for
the next Linux release.

>  drivers/virtio/virtio_balloon.c | 125 +++++++++++++++++++++++-----------------
>  1 file changed, 72 insertions(+), 53 deletions(-)
> 
> -- 
> 2.7.4

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox