Linux virtualization list

Linux virtualization list
 help / color / mirror / Atom feed

* Re: [PATCH v3] virtio_blk: unlock vblk->lock during kick
From: Stefan Hajnoczi @ 2012-06-06 15:25 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: khoa, Stefan Hajnoczi, kvm, virtualization
In-Reply-To: <20120604111134.GA28673@redhat.com>

On Mon, Jun 4, 2012 at 12:11 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Fri, Jun 01, 2012 at 10:13:06AM +0100, Stefan Hajnoczi wrote:
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 774c31d..d674977 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -199,8 +199,14 @@ static void do_virtblk_request(struct request_queue *q)
>>               issued++;
>>       }
>>
>> -     if (issued)
>> -             virtqueue_kick(vblk->vq);
>> +     if (!issued)
>> +             return;
>> +
>> +     if (virtqueue_kick_prepare(vblk->vq)) {
>> +             spin_unlock_irq(vblk->disk->queue->queue_lock);
>> +             virtqueue_notify(vblk->vq);
>
> If blk_done runs and completes the request at this point,
> can hot unplug then remove the queue?
> If yes will we get a use after free?

This is a difficult question, I haven't been able to decide one way or
another.  The use-after-free is the
spin_lock_irq(vblk->disk->queue->queue_lock).

It still doesn't explain why existing drivers are doing this.  In nbd,
for example, I can't see anything preventing the same situation in
drivers/block/nbd.c:do_nbd_request() between wake_up() and
spin_lock_irq(q->queue_lock).  If the request completes (like in your
example scenario) then the module remove code path has no way of
knowing there is still a thread in do_nbd_request().

Any ideas?

Stefan

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 16:17 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1338995944.26966.6.camel@edumazet-glaptop>

On Wed, Jun 06, 2012 at 05:19:04PM +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 17:49 +0300, Michael S. Tsirkin wrote:
> > On Wed, Jun 06, 2012 at 03:10:10PM +0200, Eric Dumazet wrote:
> > > On Wed, 2012-06-06 at 14:13 +0300, Michael S. Tsirkin wrote:
> > > 
> > > > We currently do all stats either on napi callback or from
> > > > start_xmit callback.
> > > > This makes them safe, yes?
> > > 
> > > Hmm, then _bh() variant is needed in virtnet_stats(), as explained in
> > > include/linux/u64_stats_sync.h section 6)
> > > 
> > >  * 6) If counter might be written by an interrupt, readers should block interrupts.
> > >  *    (On UP, there is no seqcount_t protection, a reader allowing interrupts could
> > >  *     read partial values)
> > > 
> > > Yes, its tricky...
> > 
> > Sounds good, but I have a question: this realies on counters
> > being atomic on 64 bit.
> > Would not it be better to always use a seqlock even on 64 bit?
> > This way counters would actually be correct and in sync.
> > As it is if we want e.g. average packet size,
> > we can not rely e.g. on it being bytes/packets.
> 
> When this stuff was discussed, we chose to have a nop on 64bits.
> 
> Your point has little to do with 64bit stats, it was already like that
> with 'long int' counters.

Yes, of course.

> Consider average driver doing :
> 
> dev->stats.rx_bytes += skb->len;
> dev->stats.rx_packets++;
> 
> A concurrent reader can read an updated rx_bytes and a 'previous'
> rx_packets one.
> 
> 'fixing' this requires a lot of work and memory barriers (in all
> drivers), for a very litle gain (at most one packet error)
> u64_stats_sync was really meant to be 0-cost on 64bit arches.
> 
> 

I understand, and not arguing about that.

But why do you say at most 1 packet?

Consider get_stats doing:
               u64_stats_update_begin(&stats->syncp);
               stats->tx_bytes += skb->len;

on 64 bit at this point
tx_packets might get incremented any number of times, no?

                stats->tx_packets++;
                u64_stats_update_end(&stats->syncp);

now tx_bytes and tx_packets are out of sync by more than 1.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 16:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1338995944.26966.6.camel@edumazet-glaptop>

On Wed, Jun 06, 2012 at 05:19:04PM +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 17:49 +0300, Michael S. Tsirkin wrote:
> > On Wed, Jun 06, 2012 at 03:10:10PM +0200, Eric Dumazet wrote:
> > > On Wed, 2012-06-06 at 14:13 +0300, Michael S. Tsirkin wrote:
> > > 
> > > > We currently do all stats either on napi callback or from
> > > > start_xmit callback.
> > > > This makes them safe, yes?
> > > 
> > > Hmm, then _bh() variant is needed in virtnet_stats(), as explained in
> > > include/linux/u64_stats_sync.h section 6)
> > > 
> > >  * 6) If counter might be written by an interrupt, readers should block interrupts.
> > >  *    (On UP, there is no seqcount_t protection, a reader allowing interrupts could
> > >  *     read partial values)
> > > 
> > > Yes, its tricky...
> > 
> > Sounds good, but I have a question: this realies on counters
> > being atomic on 64 bit.
> > Would not it be better to always use a seqlock even on 64 bit?
> > This way counters would actually be correct and in sync.
> > As it is if we want e.g. average packet size,
> > we can not rely e.g. on it being bytes/packets.
> 
> When this stuff was discussed, we chose to have a nop on 64bits.
> 
> Your point has little to do with 64bit stats, it was already like that
> with 'long int' counters.
> 
> Consider average driver doing :
> 
> dev->stats.rx_bytes += skb->len;
> dev->stats.rx_packets++;
> 
> A concurrent reader can read an updated rx_bytes and a 'previous'
> rx_packets one.
> 
> 'fixing' this requires a lot of work and memory barriers (in all
> drivers), for a very litle gain (at most one packet error)
> 
> u64_stats_sync was really meant to be 0-cost on 64bit arches.
> 



So for virtio since all counters get incremented from bh we can
ensure they are read atomically, simply but reading them
from the correct CPU with bh disabled.
And then we don't need u64_stats_sync at all.

-- 
MST

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 17:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <20120606161715.GA17575@redhat.com>

On Wed, 2012-06-06 at 19:17 +0300, Michael S. Tsirkin wrote:

> But why do you say at most 1 packet?
> 
> Consider get_stats doing:
>                u64_stats_update_begin(&stats->syncp);
>                stats->tx_bytes += skb->len;
> 
> on 64 bit at this point
> tx_packets might get incremented any number of times, no?
> 
>                 stats->tx_packets++;
>                 u64_stats_update_end(&stats->syncp);
> 
> now tx_bytes and tx_packets are out of sync by more than 1.

You lost me there.

No idea of what you are thinking about.

There is no atomicity guarantee in SNMP counters. (Ie fetching tx_bytes
and tx_packets in a transaction is not mandatory in any RFC)

As long as there is no cumulative error, its OK.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 18:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1339002782.26966.22.camel@edumazet-glaptop>

On Wed, Jun 06, 2012 at 07:13:02PM +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 19:17 +0300, Michael S. Tsirkin wrote:
> 
> > But why do you say at most 1 packet?
> > 
> > Consider get_stats doing:
> >                u64_stats_update_begin(&stats->syncp);
> >                stats->tx_bytes += skb->len;
> > 
> > on 64 bit at this point
> > tx_packets might get incremented any number of times, no?
> > 
> >                 stats->tx_packets++;
> >                 u64_stats_update_end(&stats->syncp);
> > 
> > now tx_bytes and tx_packets are out of sync by more than 1.
> 
> You lost me there.
> 
> No idea of what you are thinking about.

Sorry about that. This is not a bug. I am saying two things:

1. We are trying to look at counters for purposes of tuning the device.
E.g. if ethtool reports packets and bytes, we'd like to calculate
average packet size by bytes/packets.

If both counters are read atomically the metric becomes more exact.
Not a must but nice to have.

2. 32 bit systems have some overhead because of the seqlock.
virtio could instead simply keep tx counters in the queue structure, and
get the tx lock when they are read.

-- 
MST

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 18:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Eric Dumazet, netdev, linux-kernel, virtualization
In-Reply-To: <20120606081432.6b602065@nehalam.linuxnetplumber.net>

On Wed, Jun 06, 2012 at 08:14:32AM -0700, Stephen Hemminger wrote:
> On Wed, 6 Jun 2012 17:49:42 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > Sounds good, but I have a question: this realies on counters
> > being atomic on 64 bit.
> > Would not it be better to always use a seqlock even on 64 bit?
> > This way counters would actually be correct and in sync.
> > As it is if we want e.g. average packet size,
> > we can not rely e.g. on it being bytes/packets.
> 
> This has not been a requirement on real physical devices; therefore
> the added overhead is not really justified.
> 
> Many network cards use counters in hardware to count packets/bytes
> and there is no expectation of atomic access there.

BTW for cards that do implement the counters in software,
under xmit lock, is anything wrong with simply taking the xmit lock
when we get the stats instead of the per-cpu trick + seqlock?

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 19:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <20120606185107.GA20503@redhat.com>

On Wed, 2012-06-06 at 21:51 +0300, Michael S. Tsirkin wrote:

> BTW for cards that do implement the counters in software,
> under xmit lock, is anything wrong with simply taking the xmit lock
> when we get the stats instead of the per-cpu trick + seqlock?
> 

I still dont understand why you would do that.

Most modern machines are 64bits, so there is no seqlock overhead,
nothing at all.

If you focus on 32bit hardware, just stick on 32bit counters ?

Note that most u64_stats_sync users are virtual drivers, without xmit
lock (LLTX drivers)

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 19:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1339012441.26966.48.camel@edumazet-glaptop>

On Wed, Jun 06, 2012 at 09:54:01PM +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 21:51 +0300, Michael S. Tsirkin wrote:
> 
> > BTW for cards that do implement the counters in software,
> > under xmit lock, is anything wrong with simply taking the xmit lock
> > when we get the stats instead of the per-cpu trick + seqlock?
> > 
> 
> I still dont understand why you would do that.
> 
> Most modern machines are 64bits, so there is no seqlock overhead,
> nothing at all.
> 
> If you focus on 32bit hardware, just stick on 32bit counters ?

These wrap around.

> Note that most u64_stats_sync users are virtual drivers, without xmit
> lock (LLTX drivers)
> 
> 

Absolutely, I am talking about virtio here.  I'm not kicking
u64_stats_sync idea I am just saying that simple locking
would work for virtio and might be better as it
gives us a way to get counters atomically.

-- 
MST

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 20:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <20120606165754.GA19357@redhat.com>

On Wed, 2012-06-06 at 19:57 +0300, Michael S. Tsirkin wrote:

> So for virtio since all counters get incremented from bh we can
> ensure they are read atomically, simply but reading them
> from the correct CPU with bh disabled.
> And then we don't need u64_stats_sync at all.
> 

Really ? How are you going to read 64bit stats from foreign cpus on
32bit arches, without additional cost in fast path ?

You should read include/linux/u64_stats_sync.h to fully understand the
issues.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 20:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <20120606184351.GA20380@redhat.com>

On Wed, 2012-06-06 at 21:43 +0300, Michael S. Tsirkin wrote:

> 1. We are trying to look at counters for purposes of tuning the device.
> E.g. if ethtool reports packets and bytes, we'd like to calculate
> average packet size by bytes/packets.
> 
> If both counters are read atomically the metric becomes more exact.
> Not a must but nice to have.
> 

metrics are exact right now.

As soon as you read a value, it might already have changed.

Maybe you want to stop_machine() to make sure all the metrics you want
are 'exact' ;)

> 2. 32 bit systems have some overhead because of the seqlock.
> virtio could instead simply keep tx counters in the queue structure, and
> get the tx lock when they are read.
> 

But then you need atomic64 stuff, have you an idea of the cost of such
primitives on 32bit ?

3. use 32bit counters on 32bit arches, as many drivers still do ?

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 20:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <20120606195814.GA20677@redhat.com>

On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:

> Absolutely, I am talking about virtio here.  I'm not kicking
> u64_stats_sync idea I am just saying that simple locking
> would work for virtio and might be better as it
> gives us a way to get counters atomically.

Which lock do you own in the RX path ?

You'll have to add a lock in fast path. This sounds really a bad choice
to me.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 20:16 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1339013289.26966.62.camel@edumazet-glaptop>

On Wed, Jun 06, 2012 at 10:08:09PM +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:
> 
> > Absolutely, I am talking about virtio here.  I'm not kicking
> > u64_stats_sync idea I am just saying that simple locking
> > would work for virtio and might be better as it
> > gives us a way to get counters atomically.
> 
> Which lock do you own in the RX path ?

We can just disable napi, everything is updated from napi callback.

> You'll have to add a lock in fast path. This sounds really a bad choice
> to me.

.ndo_get_stats64 is not data path though, is it?

-- 
MST

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Ben Hutchings @ 2012-06-06 20:19 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Michael S. Tsirkin, netdev, linux-kernel, virtualization,
	Stephen Hemminger
In-Reply-To: <1339013289.26966.62.camel@edumazet-glaptop>

On Wed, 2012-06-06 at 22:08 +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:
> 
> > Absolutely, I am talking about virtio here.  I'm not kicking
> > u64_stats_sync idea I am just saying that simple locking
> > would work for virtio and might be better as it
> > gives us a way to get counters atomically.
> 
> Which lock do you own in the RX path ?
> 
> You'll have to add a lock in fast path. This sounds really a bad choice
> to me.

You have the NAPI 'lock', so when gathering stats you can synchronise
using napi_disable() ;-)

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 20:19 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1339013171.26966.60.camel@edumazet-glaptop>

On Wed, Jun 06, 2012 at 10:06:11PM +0200, Eric Dumazet wrote:
> On Wed, 2012-06-06 at 21:43 +0300, Michael S. Tsirkin wrote:
> 
> > 1. We are trying to look at counters for purposes of tuning the device.
> > E.g. if ethtool reports packets and bytes, we'd like to calculate
> > average packet size by bytes/packets.
> > 
> > If both counters are read atomically the metric becomes more exact.
> > Not a must but nice to have.
> > 
> 
> metrics are exact right now.

Yes, but they are not synchronised between themselves.
E.g. you can in theory have a report where #of packets > #of bytes.

I know there's no guarantee they are synchronised
on an arbitrary device but if they are, without
slowing fast path, it's nice.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 20:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <20120606201620.GA23358@redhat.com>

On Wed, 2012-06-06 at 23:16 +0300, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2012 at 10:08:09PM +0200, Eric Dumazet wrote:
> > On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:
> > 
> > > Absolutely, I am talking about virtio here.  I'm not kicking
> > > u64_stats_sync idea I am just saying that simple locking
> > > would work for virtio and might be better as it
> > > gives us a way to get counters atomically.
> > 
> > Which lock do you own in the RX path ?
> 
> We can just disable napi, everything is updated from napi callback.

This is very disruptive, and illegal from ndo_get_stats64()

(ndo_get_stats64() is not allowed to sleep, and I cant see how you are
going to disable napi without sleeping)

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 20:25 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Michael S. Tsirkin, netdev, linux-kernel, virtualization,
	Stephen Hemminger
In-Reply-To: <1339013979.2836.52.camel@bwh-desktop.uk.solarflarecom.com>

On Wed, 2012-06-06 at 21:19 +0100, Ben Hutchings wrote:
> On Wed, 2012-06-06 at 22:08 +0200, Eric Dumazet wrote:
> > On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:
> > 
> > > Absolutely, I am talking about virtio here.  I'm not kicking
> > > u64_stats_sync idea I am just saying that simple locking
> > > would work for virtio and might be better as it
> > > gives us a way to get counters atomically.
> > 
> > Which lock do you own in the RX path ?
> > 
> > You'll have to add a lock in fast path. This sounds really a bad choice
> > to me.
> 
> You have the NAPI 'lock', so when gathering stats you can synchronise
> using napi_disable() ;-)

Nice, this adds one new bug in network stack.

Really guys, can we stop this thread, please ?

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Ben Hutchings @ 2012-06-06 20:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eric Dumazet, netdev, linux-kernel, virtualization,
	Stephen Hemminger
In-Reply-To: <20120606201620.GA23358@redhat.com>

On Wed, 2012-06-06 at 23:16 +0300, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2012 at 10:08:09PM +0200, Eric Dumazet wrote:
> > On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:
> > 
> > > Absolutely, I am talking about virtio here.  I'm not kicking
> > > u64_stats_sync idea I am just saying that simple locking
> > > would work for virtio and might be better as it
> > > gives us a way to get counters atomically.
> > 
> > Which lock do you own in the RX path ?
> 
> We can just disable napi, everything is updated from napi callback.

Seriously, though: don't do that; this is going to hurt performance for
minimal benefit.

Ben.

> > You'll have to add a lock in fast path. This sounds really a bad choice
> > to me.
> 
> .ndo_get_stats64 is not data path though, is it?
> 

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Eric Dumazet @ 2012-06-06 20:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, linux-kernel, virtualization, Stephen Hemminger
In-Reply-To: <1339014259.26966.70.camel@edumazet-glaptop>

On Wed, 2012-06-06 at 22:24 +0200, Eric Dumazet wrote:

> (ndo_get_stats64() is not allowed to sleep, and I cant see how you are
> going to disable napi without sleeping)
> 
> 

In case you wonder, take a look at bond_get_stats() in
drivers/net/bonding/bond_main.c

^ permalink raw reply

* Re: [PATCH] virtio-net: fix a race on 32bit arches
From: Michael S. Tsirkin @ 2012-06-06 20:43 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Eric Dumazet, netdev, linux-kernel, virtualization,
	Stephen Hemminger
In-Reply-To: <1339014904.2836.56.camel@bwh-desktop.uk.solarflarecom.com>

On Wed, Jun 06, 2012 at 09:35:04PM +0100, Ben Hutchings wrote:
> On Wed, 2012-06-06 at 23:16 +0300, Michael S. Tsirkin wrote:
> > On Wed, Jun 06, 2012 at 10:08:09PM +0200, Eric Dumazet wrote:
> > > On Wed, 2012-06-06 at 22:58 +0300, Michael S. Tsirkin wrote:
> > > 
> > > > Absolutely, I am talking about virtio here.  I'm not kicking
> > > > u64_stats_sync idea I am just saying that simple locking
> > > > would work for virtio and might be better as it
> > > > gives us a way to get counters atomically.
> > > 
> > > Which lock do you own in the RX path ?
> > 
> > We can just disable napi, everything is updated from napi callback.
> 
> Seriously, though: don't do that; this is going to hurt performance for
> minimal benefit.
> 
> Ben.

Yea, it doesn't work anyway. Maybe take a xmit lock for tx and keep
using the per-cpu counters for rx. Or does this sound too disruptive
too?

> > > You'll have to add a lock in fast path. This sounds really a bad choice
> > > to me.
> > 
> > .ndo_get_stats64 is not data path though, is it?
> > 
> 
> -- 
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Ben Hutchings @ 2012-06-07 17:15 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, mst, linux-kernel, virtualization
In-Reply-To: <20120606075217.29081.30713.stgit@amd-6168-8-1.englab.nay.redhat.com>

On Wed, 2012-06-06 at 15:52 +0800, Jason Wang wrote:
> Satistics counters is useful for debugging and performance optimization, so this
> patch lets virtio_net driver collect following and export them to userspace
> through "ethtool -S":
> 
> - number of packets sent/received
> - number of bytes sent/received
> - number of callbacks for tx/rx
> - number of kick for tx/rx
> - number of bytes/packets queued for tx
> 
> As virtnet_stats were per-cpu, so both per-cpu and gloabl satistics were
> collected like:
[...]

I would really like to see some sort of convention for presenting
per-queue statistics through ethtool.  At the moment we have a complete
mess of different formats:

bnx2x:    "[${index}]: ${name}"
be2net:   "${qtype}q${index}: ${name}"
ehea:     "PR${index} ${name}"
mlx4_en:  "${qtype}${index}_${name}"
myri10ge: dummy stat names as headings
niu:      dummy stat names as headings
s2io:     "ring_${index}_${name}"
vmxnet3:  dummy stat names as headings
vxge:     "${name}_${index}"; also dummy stat names as headings

And you're introducing yet another format!

(Additionally some of the drivers are playing games with spaces and tabs
to make ethtool indent the stats the way they like.  Ethtool statistics
are inconsistent enough already without drivers pulling that sort of
crap.

I'm inclined to make ethtool start stripping whitespace from stat names,
and *if* people can agree on a common format for per-queue statistic
names then I'll indent them *consistently*.  Also, I would make such
stats optional, so you don't get hundreds of lines of crap by default.)

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: David Miller @ 2012-06-07 20:05 UTC (permalink / raw)
  To: bhutchings; +Cc: mst, netdev, linux-kernel, virtualization
In-Reply-To: <1339089306.2770.10.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 7 Jun 2012 18:15:06 +0100

> I would really like to see some sort of convention for presenting
> per-queue statistics through ethtool.  At the moment we have a complete
> mess of different formats:

Indeed.  Probably ${QUEUE_TYPE}-${INDEX}-${STATISTIC} is best.
With an agreed upon list of queue types such as "rx", "tx", "rxtx"
etc.

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Ben Hutchings @ 2012-06-07 20:24 UTC (permalink / raw)
  To: David Miller; +Cc: mst, netdev, linux-kernel, virtualization
In-Reply-To: <20120607.130512.219951433412203999.davem@davemloft.net>

On Thu, 2012-06-07 at 13:05 -0700, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Thu, 7 Jun 2012 18:15:06 +0100
> 
> > I would really like to see some sort of convention for presenting
> > per-queue statistics through ethtool.  At the moment we have a complete
> > mess of different formats:
> 
> Indeed.  Probably ${QUEUE_TYPE}-${INDEX}-${STATISTIC} is best.
> With an agreed upon list of queue types such as "rx", "tx", "rxtx"
> etc.

I think we should leave the type names open-ended, as there are other
useful groupings like per-virtual-port.  In that case the separator
should be chosen to allow arbitrary type names without ambiguity.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Rick Jones @ 2012-06-07 20:39 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: mst, netdev, linux-kernel, virtualization, David Miller
In-Reply-To: <1339100649.2770.20.camel@bwh-desktop.uk.solarflarecom.com>

On 06/07/2012 01:24 PM, Ben Hutchings wrote:
> On Thu, 2012-06-07 at 13:05 -0700, David Miller wrote:
>> From: Ben Hutchings<bhutchings@solarflare.com>
>> Date: Thu, 7 Jun 2012 18:15:06 +0100
>>
>>> I would really like to see some sort of convention for presenting
>>> per-queue statistics through ethtool.  At the moment we have a complete
>>> mess of different formats:
>>
>> Indeed.  Probably ${QUEUE_TYPE}-${INDEX}-${STATISTIC} is best.
>> With an agreed upon list of queue types such as "rx", "tx", "rxtx"
>> etc.
>
> I think we should leave the type names open-ended, as there are other
> useful groupings like per-virtual-port.  In that case the separator
> should be chosen to allow arbitrary type names without ambiguity.

So you mean like something along the lines of the presence of say '.' 
indicating indent a level:

rx_bytes:  1234
     myqueue1.rx_bytes: 234
     myqueue2.rx_bytes: 345
     ...

rick jones

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Ben Hutchings @ 2012-06-07 20:56 UTC (permalink / raw)
  To: Rick Jones; +Cc: mst, netdev, linux-kernel, virtualization, David Miller
In-Reply-To: <4FD11194.2040405@hp.com>

On Thu, 2012-06-07 at 13:39 -0700, Rick Jones wrote:
> On 06/07/2012 01:24 PM, Ben Hutchings wrote:
> > On Thu, 2012-06-07 at 13:05 -0700, David Miller wrote:
> >> From: Ben Hutchings<bhutchings@solarflare.com>
> >> Date: Thu, 7 Jun 2012 18:15:06 +0100
> >>
> >>> I would really like to see some sort of convention for presenting
> >>> per-queue statistics through ethtool.  At the moment we have a complete
> >>> mess of different formats:
> >>
> >> Indeed.  Probably ${QUEUE_TYPE}-${INDEX}-${STATISTIC} is best.
> >> With an agreed upon list of queue types such as "rx", "tx", "rxtx"
> >> etc.
> >
> > I think we should leave the type names open-ended, as there are other
> > useful groupings like per-virtual-port.  In that case the separator
> > should be chosen to allow arbitrary type names without ambiguity.
> 
> So you mean like something along the lines of the presence of say '.' 
> indicating indent a level:
> 
> rx_bytes:  1234
>      myqueue1.rx_bytes: 234
>      myqueue2.rx_bytes: 345
>      ...

Most drivers seem to want this sort of ordering/grouping:

group0.foo
group0.bar
...
group1.foo
group1.bar
...

but if we have a standard way of indicating groups of statistics then
the user can choose whether they want to reorder by type name.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [V2 RFC net-next PATCH 2/2] virtio_net: export more statistics through ethtool
From: Ben Hutchings @ 2012-06-07 20:58 UTC (permalink / raw)
  To: Rick Jones; +Cc: mst, netdev, linux-kernel, virtualization, David Miller
In-Reply-To: <1339102567.2770.25.camel@bwh-desktop.uk.solarflarecom.com>

On Thu, 2012-06-07 at 21:56 +0100, Ben Hutchings wrote:
> On Thu, 2012-06-07 at 13:39 -0700, Rick Jones wrote:
> > On 06/07/2012 01:24 PM, Ben Hutchings wrote:
> > > On Thu, 2012-06-07 at 13:05 -0700, David Miller wrote:
> > >> From: Ben Hutchings<bhutchings@solarflare.com>
> > >> Date: Thu, 7 Jun 2012 18:15:06 +0100
> > >>
> > >>> I would really like to see some sort of convention for presenting
> > >>> per-queue statistics through ethtool.  At the moment we have a complete
> > >>> mess of different formats:
> > >>
> > >> Indeed.  Probably ${QUEUE_TYPE}-${INDEX}-${STATISTIC} is best.
> > >> With an agreed upon list of queue types such as "rx", "tx", "rxtx"
> > >> etc.
> > >
> > > I think we should leave the type names open-ended, as there are other
> > > useful groupings like per-virtual-port.  In that case the separator
> > > should be chosen to allow arbitrary type names without ambiguity.
> > 
> > So you mean like something along the lines of the presence of say '.' 
> > indicating indent a level:
> > 
> > rx_bytes:  1234
> >      myqueue1.rx_bytes: 234
> >      myqueue2.rx_bytes: 345
> >      ...
> 
> Most drivers seem to want this sort of ordering/grouping:
> 
> group0.foo
> group0.bar
> ...
> group1.foo
> group1.bar
> ...
> 
> but if we have a standard way of indicating groups of statistics then
> the user can choose whether they want to reorder by type name.

I mean, whether they want to reorder/regroup by the final part of the
statistic name.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox