netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	netdev@vger.kernel.org, Jason Wang <jasowang@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [BUG] Inconsistent lock state in virtnet poll
Date: Wed, 6 May 2020 03:31:58 -0400	[thread overview]
Message-ID: <20200506032237-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <71b1b9dd-78e3-9694-2daa-5723355293d4@gmail.com>

On Tue, May 05, 2020 at 07:24:18PM -0700, Eric Dumazet wrote:
> 
> 
> On 5/5/20 6:25 PM, Michael S. Tsirkin wrote:
> > On Tue, May 05, 2020 at 06:19:09PM -0700, Eric Dumazet wrote:
> >>
> >>
> >> On 5/5/20 5:43 PM, Michael S. Tsirkin wrote:
> >>> On Tue, May 05, 2020 at 03:40:09PM -0700, Eric Dumazet wrote:
> >>>>
> >>>>
> >>>> On 5/5/20 3:30 PM, Thomas Gleixner wrote:
> >>>>> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >>>>>> On Tue, May 05, 2020 at 02:08:56PM +0200, Thomas Gleixner wrote:
> >>>>>>>
> >>>>>>> The following lockdep splat happens reproducibly on 5.7-rc4
> >>>>>>
> >>>>>>> ================================
> >>>>>>> WARNING: inconsistent lock state
> >>>>>>> 5.7.0-rc4+ #79 Not tainted
> >>>>>>> --------------------------------
> >>>>>>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> >>>>>>> ip/356 [HC0[0]:SC1[1]:HE1:SE0] takes:
> >>>>>>> f3ee4cd8 (&syncp->seq#2){+.?.}-{0:0}, at: net_rx_action+0xfb/0x390
> >>>>>>> {SOFTIRQ-ON-W} state was registered at:
> >>>>>>>   lock_acquire+0x82/0x300
> >>>>>>>   try_fill_recv+0x39f/0x590
> >>>>>>
> >>>>>> Weird. Where does try_fill_recv acquire any locks?
> >>>>>
> >>>>>   u64_stats_update_begin(&rq->stats.syncp);
> >>>>>
> >>>>> That's a 32bit kernel which uses a seqcount for this. sequence counts
> >>>>> are "lock" constructs where you need to make sure that writers are
> >>>>> serialized.
> >>>>>
> >>>>> Actually the problem at hand is that try_fill_recv() is called from
> >>>>> fully preemptible context initialy and then from softirq context.
> >>>>>
> >>>>> Obviously that's for the open() path a non issue, but lockdep does not
> >>>>> know about that. OTOH, there is other code which calls that from
> >>>>> non-softirq context.
> >>>>>
> >>>>> The hack below made it shut up. It's obvioulsy not ideal, but at least
> >>>>> it let me look at the actual problem I was chasing down :)
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>         tglx
> >>>>>
> >>>>> 8<-----------
> >>>>> --- a/drivers/net/virtio_net.c
> >>>>> +++ b/drivers/net/virtio_net.c
> >>>>> @@ -1243,9 +1243,11 @@ static bool try_fill_recv(struct virtnet
> >>>>>  			break;
> >>>>>  	} while (rq->vq->num_free);
> >>>>>  	if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) {
> >>>>> +		local_bh_disable();
> >>>>
> >>>> Or use u64_stats_update_begin_irqsave() whic is a NOP on 64bit kernels
> >>>
> >>> I applied this, but am still trying to think of something that
> >>> is 0 overhead for all configs.
> >>> Maybe we can select a lockdep class depending on whether napi
> >>> is enabled?
> >>
> >>
> >> Do you _really_ need 64bit counter for stats.kicks on 32bit kernels ?
> >>
> >> Adding 64bit counters just because we can might be overhead anyway.
> > 
> > Well 32 bit kernels don't fundamentally kick less than 64 bit ones,
> > and we kick more or less per packet, sometimes per batch,
> > people expect these to be in sync ..
> 
> Well, we left many counters in networking stack as 'unsigned long'
> and nobody complained yet of overflows on 32bit kernels.

Right.  For TX it is helpful that everything is maintained
atomically so we do need the seqlock machinery anyway:

        u64_stats_update_begin(&sq->stats.syncp);
        sq->stats.bytes += bytes;
        sq->stats.packets += packets;
        sq->stats.xdp_tx += n;
        sq->stats.xdp_tx_drops += drops;
        sq->stats.kicks += kicks;
        u64_stats_update_end(&sq->stats.syncp);

for RX kicks are currently updated separately.  Which I guess is more or
less a minor bug.

        if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
                if (!try_fill_recv(vi, rq, GFP_ATOMIC))
                        schedule_delayed_work(&vi->refill, 0);
        }

        u64_stats_update_begin(&rq->stats.syncp);
        for (i = 0; i < VIRTNET_RQ_STATS_LEN; i++) {
                size_t offset = virtnet_rq_stats_desc[i].offset;
                u64 *item;

                item = (u64 *)((u8 *)&rq->stats + offset);
                *item += *(u64 *)((u8 *)&stats + offset);
        }
        u64_stats_update_end(&rq->stats.syncp);

we should update kicks in virtnet_receive.

And as long as we do that there's no cost to 64 bit counters ...


> SNMP agents are used to the fact that counters do overflow.
> 
> Problems might happen if the overflows happen too fast, say every 10 seconds,
> but other than that, forcing 64bit counters for something which is not
> _required_ for the data path is adding pain.
> 
> I am mentioning this, because trying to add lockdep stuff and associated
> maintenance cost for 32bit kernels in 2020 makes little sense to me,
> considering I added include/linux/u64_stats_sync.h 10 years ago.
> 

Not sure what do you suggest here...


> 
> 
> 


  reply	other threads:[~2020-05-06  7:36 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-05 12:08 [BUG] Inconsistent lock state in virtnet poll Thomas Gleixner
2020-05-05 16:10 ` Michael S. Tsirkin
2020-05-05 22:30   ` Thomas Gleixner
2020-05-05 22:40     ` Eric Dumazet
2020-05-05 23:49       ` Michael S. Tsirkin
2020-05-06  0:43       ` Michael S. Tsirkin
2020-05-06  1:19         ` Eric Dumazet
2020-05-06  1:25           ` Michael S. Tsirkin
2020-05-06  2:24             ` Eric Dumazet
2020-05-06  7:31               ` Michael S. Tsirkin [this message]
2020-05-06  8:15                 ` Jason Wang
2020-05-05 23:54     ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200506032237-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=eric.dumazet@gmail.com \
    --cc=jasowang@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).