Re: [BUG] Inconsistent lock state in virtnet poll

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	netdev@vger.kernel.org, Jason Wang <jasowang@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [BUG] Inconsistent lock state in virtnet poll
Date: Wed, 6 May 2020 03:31:58 -0400
Message-ID: <20200506032237-mutt-send-email-mst@kernel.org>
In-Reply-To: <71b1b9dd-78e3-9694-2daa-5723355293d4@gmail.com>

On Tue, May 05, 2020 at 07:24:18PM -0700, Eric Dumazet wrote:
> 
> 
> On 5/5/20 6:25 PM, Michael S. Tsirkin wrote:
> > On Tue, May 05, 2020 at 06:19:09PM -0700, Eric Dumazet wrote:
> >>
> >>
> >> On 5/5/20 5:43 PM, Michael S. Tsirkin wrote:
> >>> On Tue, May 05, 2020 at 03:40:09PM -0700, Eric Dumazet wrote:
> >>>>
> >>>>
> >>>> On 5/5/20 3:30 PM, Thomas Gleixner wrote:
> >>>>> "Michael S. Tsirkin" <mst@redhat.com> writes:
> >>>>>> On Tue, May 05, 2020 at 02:08:56PM +0200, Thomas Gleixner wrote:
> >>>>>>>
> >>>>>>> The following lockdep splat happens reproducibly on 5.7-rc4
> >>>>>>
> >>>>>>> ================================
> >>>>>>> WARNING: inconsistent lock state
> >>>>>>> 5.7.0-rc4+ #79 Not tainted
> >>>>>>> --------------------------------
> >>>>>>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> >>>>>>> ip/356 [HC0[0]:SC1[1]:HE1:SE0] takes:
> >>>>>>> f3ee4cd8 (&syncp->seq#2){+.?.}-{0:0}, at: net_rx_action+0xfb/0x390
> >>>>>>> {SOFTIRQ-ON-W} state was registered at:
> >>>>>>>   lock_acquire+0x82/0x300
> >>>>>>>   try_fill_recv+0x39f/0x590
> >>>>>>
> >>>>>> Weird. Where does try_fill_recv acquire any locks?
> >>>>>
> >>>>>   u64_stats_update_begin(&rq->stats.syncp);
> >>>>>
> >>>>> That's a 32bit kernel which uses a seqcount for this. sequence counts
> >>>>> are "lock" constructs where you need to make sure that writers are
> >>>>> serialized.
> >>>>>
> >>>>> Actually the problem at hand is that try_fill_recv() is called from
> >>>>> fully preemptible context initially and then from softirq context.
> >>>>>
> >>>>> Obviously that's for the open() path a non issue, but lockdep does not
> >>>>> know about that. OTOH, there is other code which calls that from
> >>>>> non-softirq context.
> >>>>>
> >>>>> The hack below made it shut up. It's obviously not ideal, but at least
> >>>>> it let me look at the actual problem I was chasing down :)
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>         tglx
> >>>>>
> >>>>> 8<-----------
> >>>>> --- a/drivers/net/virtio_net.c
> >>>>> +++ b/drivers/net/virtio_net.c
> >>>>> @@ -1243,9 +1243,11 @@ static bool try_fill_recv(struct virtnet
> >>>>>  			break;
> >>>>>  	} while (rq->vq->num_free);
> >>>>>  	if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) {
> >>>>> +		local_bh_disable();
> >>>>
> >>>> Or use u64_stats_update_begin_irqsave() which is a NOP on 64bit kernels
> >>>
> >>> I applied this, but am still trying to think of something that
> >>> is 0 overhead for all configs.
> >>> Maybe we can select a lockdep class depending on whether napi
> >>> is enabled?
> >>
> >>
> >> Do you _really_ need 64bit counter for stats.kicks on 32bit kernels ?
> >>
> >> Adding 64bit counters just because we can might be overhead anyway.
> > 
> > Well 32 bit kernels don't fundamentally kick less than 64 bit ones,
> > and we kick more or less per packet, sometimes per batch,
> > people expect these to be in sync ..
> 
> Well, we left many counters in networking stack as 'unsigned long'
> and nobody complained yet of overflows on 32bit kernels.

Right.  For TX it is helpful that everything is maintained
atomically so we do need the seqlock machinery anyway:

        u64_stats_update_begin(&sq->stats.syncp);
        sq->stats.bytes += bytes;
        sq->stats.packets += packets;
        sq->stats.xdp_tx += n;
        sq->stats.xdp_tx_drops += drops;
        sq->stats.kicks += kicks;
        u64_stats_update_end(&sq->stats.syncp);
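
To make the "maintained atomically" point concrete, here is a rough userspace
sketch of the sequence-counter protocol that the update section above relies
on when the kernel is 32 bit. It is not the kernel implementation: every
demo_* name is hypothetical, the counters are relaxed C11 atomics only so that
the sketch is race-free in userspace, and the assert stands in for the rule
that writers have to be serialized by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct u64_stats_sync on a 32-bit kernel. */
struct demo_syncp {
	atomic_uint seq;	/* odd while a writer is inside its section */
};

struct demo_sq_stats {
	struct demo_syncp syncp;
	_Atomic uint64_t bytes;
	_Atomic uint64_t packets;
	_Atomic uint64_t kicks;
};

struct demo_snapshot {
	uint64_t bytes, packets, kicks;
};

static void demo_update_begin(struct demo_syncp *s)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* writers must be serialized by the caller */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

static void demo_update_end(struct demo_syncp *s)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_release);
}

static unsigned int demo_fetch_begin(struct demo_syncp *s)
{
	return atomic_load_explicit(&s->seq, memory_order_acquire);
}

/* True if the reader overlapped a writer and has to read again. */
static bool demo_fetch_retry(struct demo_syncp *s, unsigned int start)
{
	atomic_thread_fence(memory_order_acquire);
	return (start & 1) ||
	       atomic_load_explicit(&s->seq, memory_order_relaxed) != start;
}

/* Writer: all counters change inside one update section. */
static void demo_sq_stats_add(struct demo_sq_stats *st, uint64_t bytes,
			      uint64_t packets, uint64_t kicks)
{
	demo_update_begin(&st->syncp);
	atomic_fetch_add_explicit(&st->bytes, bytes, memory_order_relaxed);
	atomic_fetch_add_explicit(&st->packets, packets, memory_order_relaxed);
	atomic_fetch_add_explicit(&st->kicks, kicks, memory_order_relaxed);
	demo_update_end(&st->syncp);
}

/* Reader: retry until the snapshot was taken with no writer in between. */
static struct demo_snapshot demo_sq_stats_read(struct demo_sq_stats *st)
{
	struct demo_snapshot snap;
	unsigned int start;

	do {
		start = demo_fetch_begin(&st->syncp);
		snap.bytes = atomic_load_explicit(&st->bytes, memory_order_relaxed);
		snap.packets = atomic_load_explicit(&st->packets, memory_order_relaxed);
		snap.kicks = atomic_load_explicit(&st->kicks, memory_order_relaxed);
	} while (demo_fetch_retry(&st->syncp, start));

	return snap;
}
```

A reader that goes through demo_sq_stats_read() never sees bytes from one
update combined with packets from another, which is the property people
expect from these counters.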

For RX, kicks are currently updated separately, which I guess is more or
less a minor bug.

        if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
                if (!try_fill_recv(vi, rq, GFP_ATOMIC))
                        schedule_delayed_work(&vi->refill, 0);
        }

        u64_stats_update_begin(&rq->stats.syncp);
        for (i = 0; i < VIRTNET_RQ_STATS_LEN; i++) {
                size_t offset = virtnet_rq_stats_desc[i].offset;
                u64 *item;

                item = (u64 *)((u8 *)&rq->stats + offset);
                *item += *(u64 *)((u8 *)&stats + offset);
        }
        u64_stats_update_end(&rq->stats.syncp);

We should update kicks in virtnet_receive.

And as long as we do that there's no cost to 64 bit counters ...
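
One possible shape for that, as a simplified userspace model and not a patch:
the refill helper only bumps a kick count in the per-poll scratch stats, and
the receive path publishes kicks in the same update section as everything
else. All demo_* names are made up for illustration, and the plain seq
counter is just a stand-in for the real syncp so that the number of update
sections can be seen.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_counters {
	uint64_t packets;
	uint64_t bytes;
	uint64_t kicks;
};

struct demo_rq {
	unsigned int seq;		/* stand-in for struct u64_stats_sync */
	struct demo_counters stats;
};

/* Field offsets walked by the publish loop, like virtnet_rq_stats_desc. */
static const size_t demo_offsets[] = {
	offsetof(struct demo_counters, packets),
	offsetof(struct demo_counters, bytes),
	offsetof(struct demo_counters, kicks),
};
#define DEMO_STATS_LEN (sizeof(demo_offsets) / sizeof(demo_offsets[0]))

/*
 * Pretend refill: a kick is recorded in the caller's scratch counters,
 * so no update section is opened here.
 */
static int demo_try_fill_recv(struct demo_counters *batch, int needs_kick)
{
	if (needs_kick)
		batch->kicks++;
	return 1;	/* refill succeeded */
}

static void demo_receive(struct demo_rq *rq, unsigned int npkts,
			 unsigned int pkt_len, int needs_kick)
{
	struct demo_counters batch = {0};
	unsigned int i;

	/* Accumulate locally, no synchronization needed yet. */
	for (i = 0; i < npkts; i++) {
		batch.packets++;
		batch.bytes += pkt_len;
	}
	demo_try_fill_recv(&batch, needs_kick);

	rq->seq++;	/* where u64_stats_update_begin() would go */
	for (i = 0; i < DEMO_STATS_LEN; i++) {
		uint64_t *item;

		item = (uint64_t *)((uint8_t *)&rq->stats + demo_offsets[i]);
		*item += *(uint64_t *)((uint8_t *)&batch + demo_offsets[i]);
	}
	rq->seq++;	/* where u64_stats_update_end() would go */
}
```

With this layout each poll opens exactly one update section, and kicks can
no longer get out of sync with packets and bytes.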


> SNMP agents are used to the fact that counters do overflow.
> 
> Problems might happen if the overflows happen too fast, say every 10 seconds,
> but other than that, forcing 64bit counters for something which is not
> _required_ for the data path is adding pain.
> 
> I am mentioning this, because trying to add lockdep stuff and associated
> maintenance cost for 32bit kernels in 2020 makes little sense to me,
> considering I added include/linux/u64_stats_sync.h 10 years ago.
> 

Not sure what you are suggesting here...


