From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Joel Fernandes <joel@joelfernandes.org>,
Matthew Wilcox <willy@infradead.org>,
aarcange@redhat.com, akpm@linux-foundation.org,
christian@brauner.io, davem@davemloft.net, ebiederm@xmission.com,
elena.reshetova@intel.com, guro@fb.com, hch@infradead.org,
james.bottomley@hansenpartnership.com, jasowang@redhat.com,
jglisse@redhat.com, keescook@chromium.org, ldv@altlinux.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-parisc@vger.kernel.org, luto@amacapital.net,
mhocko@suse.com, mingo@kernel.org, namit@vmware.com,
peterz@infradead.org, syzkaller-bugs@googlegroups.com,
viro@zeniv.linux.org.uk, wad@chromium.org
Subject: Re: RFC: call_rcu_outstanding (was Re: WARNING in __mmdrop)
Date: Mon, 22 Jul 2019 11:58:38 -0700
Message-ID: <20190722185838.GN14271@linux.ibm.com>
In-Reply-To: <20190722123016-mutt-send-email-mst@kernel.org>
On Mon, Jul 22, 2019 at 12:32:17PM -0400, Michael S. Tsirkin wrote:
> On Mon, Jul 22, 2019 at 09:25:51AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 22, 2019 at 12:13:40PM -0400, Michael S. Tsirkin wrote:
> > > On Mon, Jul 22, 2019 at 08:55:34AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jul 22, 2019 at 11:47:24AM -0400, Michael S. Tsirkin wrote:
> > > > > On Mon, Jul 22, 2019 at 11:14:39AM -0400, Joel Fernandes wrote:
> > > > > > [snip]
> > > > > > > > Would it make sense to have call_rcu() check to see if there are many
> > > > > > > > outstanding requests on this CPU and if so process them before returning?
> > > > > > > > That would ensure that frequent callers usually ended up doing their
> > > > > > > > own processing.
> > > > > >
> > > > > > Other than what Paul already mentioned about deadlocks, I am not sure if this
> > > > > > would even work for all cases since call_rcu() has to wait for a grace
> > > > > > period.
> > > > > >
> > > > > > So, if the number of outstanding requests are higher than a certain amount,
> > > > > > then you *still* have to wait for some RCU configurations for the grace
> > > > > > period duration and cannot just execute the callback in-line. Did I miss
> > > > > > something?
> > > > > >
> > > > > > Can waiting in-line for a grace period duration be tolerated in the vhost case?
> > > > > >
> > > > > > thanks,
> > > > > >
> > > > > > - Joel
> > > > >
> > > > > No, but it has many other ways to recover (try again later, drop a
> > > > > packet, use a slower copy to/from user).
> > > >
> > > > True enough! And your idea of taking recovery action based on the number
> > > > of callbacks seems like a good one while we are getting RCU's callback
> > > > scheduling improved.
> > > >
> > > > By the way, was this a real problem that you could make happen on real
> > > > hardware?
> > >
> > > > If not, I would suggest just letting RCU get improved over
> > > > the next couple of releases.
> > >
> > > So basically use kfree_rcu but add a comment saying e.g. "WARNING:
> > > in the future callers of kfree_rcu might need to check that
> > > not too many callbacks get queued. In that case, we can
> > > disable the optimization, or recover in some other way.
> > > Watch this space."
> >
> > That sounds fair.
> >
> > > > If it is something that you actually made happen, please let me know
> > > > what (if anything) you need from me for your callback-counting EBUSY
> > > > scheme.
> > >
> > > If you mean kfree_rcu causing OOM then no, it's all theoretical.
> > > If you mean synchronize_rcu stalling to the point where guest will OOPs,
> > > then yes, that's not too hard to trigger.
> >
> > Is synchronize_rcu() being stalled by the userspace loop that is invoking
> > your ioctl that does kfree_rcu()? Or instead by the resulting callback
> > invocation?
>
> Sorry, let me clarify. We currently have synchronize_rcu in a userspace
> loop. I have a patch replacing that with kfree_rcu. This isn't the
> first time synchronize_rcu is stalling a VM for a long while so I didn't
> investigate further.

Ah, so a bunch of synchronize_rcu() calls within a single system call
inside the host is stalling the guest, correct?

If so, one straightforward approach is to do an rcu_barrier() every
(say) 1000 kfree_rcu() calls within that loop in the system call.
This will decrease the overhead by almost a factor of 1000 compared to
a synchronize_rcu() on each trip through that loop, and will prevent
callback overload.
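
As a rough illustration of the batching pattern described above, here is a
userspace sketch, not kernel code. The names stub_kfree_rcu(),
stub_rcu_barrier(), free_many() and BARRIER_INTERVAL are hypothetical
stand-ins invented for this example; the real kfree_rcu() takes a pointer
plus the name of its rcu_head field and defers the free until after a grace
period, and the real rcu_barrier() waits for already-queued callbacks to be
invoked. The stubs only count, so that the bound on outstanding callbacks is
visible.

```c
#include <stddef.h>
#include <stdlib.h>

/* Illustrative batch size; the message above says "(say) 1000". */
#define BARRIER_INTERVAL 1000

/* Bookkeeping for the hypothetical userspace stand-ins below. */
static size_t queued_callbacks;	/* queued since the last barrier */
static size_t max_queued;	/* high-water mark of queued_callbacks */
static size_t barrier_calls;	/* number of barrier invocations */

/*
 * Stand-in for kfree_rcu(). ASSUMPTION: this stub frees immediately and
 * only counts; the kernel primitive defers the free past a grace period.
 */
static void stub_kfree_rcu(void *p)
{
	free(p);
	if (++queued_callbacks > max_queued)
		max_queued = queued_callbacks;
}

/*
 * Stand-in for rcu_barrier(). ASSUMPTION: modeled as draining the
 * counter; the kernel primitive blocks until queued callbacks have run.
 */
static void stub_rcu_barrier(void)
{
	queued_callbacks = 0;
	barrier_calls++;
}

/*
 * Free n objects in a loop, inserting one barrier per BARRIER_INTERVAL
 * frees. Returns the number of barriers issued by this call.
 */
static size_t free_many(size_t n)
{
	size_t before = barrier_calls;

	for (size_t i = 0; i < n; i++) {
		void *obj = malloc(16);

		if (!obj)
			break;
		stub_kfree_rcu(obj);
		if ((i + 1) % BARRIER_INTERVAL == 0)
			stub_rcu_barrier();
	}
	return barrier_calls - before;
}
```

In this model, the number of callbacks outstanding at any time never exceeds
BARRIER_INTERVAL, and the loop pays for one blocking wait per
BARRIER_INTERVAL iterations instead of one per iteration.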

Or if the situation is different (for example, the guest does a long
sequence of system calls, each of which does a single kfree_rcu() or
some such), please let me know what the situation is.

Thanx, Paul