From: Florian Westphal <fw@strlen.de>
To: Frank Schreuder <fschreuder@transip.nl>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>,
Johan Schuijt <johan@transip.nl>,
Eric Dumazet <eric.dumazet@gmail.com>,
"nikolay@redhat.com" <nikolay@redhat.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"fw@strlen.de" <fw@strlen.de>,
"chutzpah@gentoo.org" <chutzpah@gentoo.org>,
Robin Geuze <robing@transip.nl>, netdev <netdev@vger.kernel.org>
Subject: Re: reproducable panic eviction work queue
Date: Mon, 20 Jul 2015 16:30:23 +0200
Message-ID: <20150720143023.GC11985@breakpoint.cc>
In-Reply-To: <55ACEDE9.3090205@transip.nl>
Frank Schreuder <fschreuder@transip.nl> wrote:
>
> On 7/18/2015 05:32 PM, Nikolay Aleksandrov wrote:
> >On 07/18/2015 05:28 PM, Johan Schuijt wrote:
> >>Thx for your looking into this!
> >>
> >>>Thank you for the report, I will try to reproduce this locally
> >>>Could you please post the full crash log ?
> >>Of course, please see attached file.
> >>
> >>>Also could you test
> >>>with a clean current kernel from Linus' tree or Dave's -net ?
> >>Will do.
> >>
> >>>These are available at:
> >>>git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >>>git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
> >>>respectively.
> >>>
> >>>One last question how many IRQs do you pin i.e. how many cores
> >>>do you actively use for receive ?
> >>This varies a bit across our systems, but we’ve managed to reproduce this with IRQs pinned on as many as 2,4,8 or 20 cores.
> >>
> >>I won’t have access to our test-setup till Monday again, so I’ll be testing 3 scenario’s then:
> >>- Your patch
> >-----
> >>- Linux tree
> >>- Dave’s -net tree
> >Just one of these two would be enough. I couldn't reproduce it here but
> >I don't have as many machines to test right now and had to improvise with VMs. :-)
> >
> >>I’ll make sure to keep you posted on all the results then. We have a kernel dump of the panic, so if you need me to extract any data from there just let me know! (Some instructions might be needed)
> >>
> >>- Johan
> >>
> >Great, thank you!
> >
> I'm able to reproduce this panic on the following kernel builds:
> - 3.18.7
> - 3.18.18
> - 3.18.18 + patch from Nikolay Aleksandrov
> - 4.1.0
>
> Would you happen to have any more suggestions we can try?
Yes, although I admit it's clutching at straws.
Problem is that I don't see how we can race with timer, but OTOH
I don't see why this needs to play refcnt tricks if we can just skip
the entry completely ...
The other issue is parallel completion on another cpu, but I don't
see how we could trip there either.
Do you always get this one crash backtrace from evictor wq?
I'll set up a bigger test machine soon and will also try to reproduce
this.
Thanks for reporting!
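To make the "just skip the entry" idea concrete, here is a rough userspace
sketch, not kernel code. All names in it (model_fq, model_del_timer,
model_evict_bucket) are hypothetical stand-ins invented for illustration, and
a boolean replaces the real timer. It only shows the control flow: if the
timer cannot be cancelled, the entry stays on the chain and is left to the
timer handler, so no refcount is taken and no lock is dropped mid-walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a fragment queue; not kernel code. */
struct model_fq {
	struct model_fq *next;
	bool timer_pending;	/* false models "timer handler already running" */
	bool evicted;
};

/* Models del_timer(): succeeds only if the timer was still pending. */
static bool model_del_timer(struct model_fq *fq)
{
	if (!fq->timer_pending)
		return false;
	fq->timer_pending = false;
	return true;
}

/* Walk one bucket and move every cancellable entry to the expired list. */
static unsigned int model_evict_bucket(struct model_fq **chain,
				       struct model_fq **expired)
{
	struct model_fq **pp = chain;
	unsigned int evicted = 0;

	assert(chain && expired);

	while (*pp) {
		struct model_fq *fq = *pp;

		if (!model_del_timer(fq)) {
			/* Timer handler owns this entry; leave it on the chain. */
			pp = &fq->next;
			continue;
		}

		*pp = fq->next;		/* unlink from the bucket */
		fq->evicted = true;
		fq->next = *expired;	/* push onto the expired list */
		*expired = fq;
		evicted++;
	}

	return evicted;
}
```

In this model a skipped entry is simply seen again on the next pass, which is
why the caller in the patch below loops until nothing is left.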
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -131,24 +131,14 @@ inet_evict_bucket(struct inet_frags *f, struct inet_frag_bucket *hb)
 	unsigned int evicted = 0;
 	HLIST_HEAD(expired);
 
-evict_again:
 	spin_lock(&hb->chain_lock);
 
 	hlist_for_each_entry_safe(fq, n, &hb->chain, list) {
 		if (!inet_fragq_should_evict(fq))
 			continue;
 
-		if (!del_timer(&fq->timer)) {
-			/* q expiring right now thus increment its refcount so
-			 * it won't be freed under us and wait until the timer
-			 * has finished executing then destroy it
-			 */
-			atomic_inc(&fq->refcnt);
-			spin_unlock(&hb->chain_lock);
-			del_timer_sync(&fq->timer);
-			inet_frag_put(fq, f);
-			goto evict_again;
-		}
+		if (!del_timer(&fq->timer))
+			continue;
 
 		fq->flags |= INET_FRAG_EVICTED;
 		hlist_del(&fq->list);
@@ -240,18 +230,20 @@ void inet_frags_exit_net(struct netns_frags *nf, struct inet_frags *f)
 	int i;
 
 	nf->low_thresh = 0;
-	local_bh_disable();
 
 evict_again:
+	local_bh_disable();
 	seq = read_seqbegin(&f->rnd_seqlock);
 
 	for (i = 0; i < INETFRAGS_HASHSZ ; i++)
 		inet_evict_bucket(f, &f->hash[i]);
 
-	if (read_seqretry(&f->rnd_seqlock, seq))
-		goto evict_again;
-
 	local_bh_enable();
+	cond_resched();
+
+	if (read_seqretry(&f->rnd_seqlock, seq) ||
+	    percpu_counter_sum(&nf->mem))
+		goto evict_again;
 
 	percpu_counter_destroy(&nf->mem);
 }
@@ -286,6 +278,8 @@ static inline void fq_unlink(struct inet_frag_queue *fq, struct inet_frags *f)
 	hb = get_frag_bucket_locked(fq, f);
 	if (!(fq->flags & INET_FRAG_EVICTED))
 		hlist_del(&fq->list);
+
+	fq->flags |= INET_FRAG_COMPLETE;
 	spin_unlock(&hb->chain_lock);
 }
 
@@ -297,7 +291,6 @@ void inet_frag_kill(struct inet_frag_queue *fq, struct inet_frags *f)
 	if (!(fq->flags & INET_FRAG_COMPLETE)) {
 		fq_unlink(fq, f);
 		atomic_dec(&fq->refcnt);
-		fq->flags |= INET_FRAG_COMPLETE;
 	}
 }
 EXPORT_SYMBOL(inet_frag_kill);