All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Tobias Huschle <huschle@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>,
	Abel Wu <wuyun.abel@bytedance.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	kvm@vger.kernel.org, virtualization@lists.linux.dev,
	netdev@vger.kernel.org
Subject: Re: Re: Re: EEVDF/vhost regression (bisected to 86bfbb7ce4f6 sched/fair: Add lag based placement)
Date: Thu, 1 Feb 2024 07:08:08 -0500	[thread overview]
Message-ID: <20240201070437-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <89460.124020106474400877@us-mta-475.us.mimecast.lan>

On Thu, Feb 01, 2024 at 12:47:39PM +0100, Tobias Huschle wrote:
> On Thu, Feb 01, 2024 at 03:08:07AM -0500, Michael S. Tsirkin wrote:
> > On Thu, Feb 01, 2024 at 08:38:43AM +0100, Tobias Huschle wrote:
> > > On Sun, Jan 21, 2024 at 01:44:32PM -0500, Michael S. Tsirkin wrote:
> > > > On Mon, Jan 08, 2024 at 02:13:25PM +0100, Tobias Huschle wrote:
> > > > > On Thu, Dec 14, 2023 at 02:14:59AM -0500, Michael S. Tsirkin wrote:
> > > 
> > > -------- Summary --------
> > > 
> > > In my (non-vhost experience) opinion the way to go would be either
> > > replacing the cond_resched with a hard schedule or setting the
> > > need_resched flag within vhost if the a data transfer was successfully
> > > initiated. It will be necessary to check if this causes problems with
> > > other workloads/benchmarks.
> > 
> > Yes but conceptually I am still in the dark on whether the fact that
> > periodically invoking cond_resched is no longer sufficient to be nice to
> > others is a bug, or intentional.  So you feel it is intentional?
> 
> I would assume that cond_resched is still a valid concept.
> But, in this particular scenario we have the following problem:
> 
> So far (with CFS) we had:
> 1. vhost initiates data transfer
> 2. kworker is woken up
> 3. CFS gives priority to woken up task and schedules it
> 4. kworker runs
> 
> Now (with EEVDF) we have:
> 0. In some cases, kworker has accumulated negative lag 
> 1. vhost initiates data transfer
> 2. kworker is woken up
> -3a. EEVDF does not schedule kworker if it has negative lag
> -4a. vhost continues running, kworker on same CPU starves
> --
> -3b. EEVDF schedules kworker if it has positive or no lag
> -4b. kworker runs
> 
> In the 3a/4a case, the kworker is given no chance to set the
> necessary flag. The flag can only be set by another CPU now.
> The schedule of the kworker was not caused by cond_resched, but
> rather by the wakeup path of the scheduler.
> 
> cond_resched works successfully once the load balancer (I suppose) 
> decides to migrate the vhost off to another CPU. In that case, the
> load balancer on another CPU sets that flag and we are good.
> That then eventually allows the scheduler to pick kworker, but very
> late.

I don't really understand what is special about vhost though.
Wouldn't it apply to any kernel code?

> > I propose a two patch series then:
> > 
> > patch 1: in this text in Documentation/kernel-hacking/hacking.rst
> > 
> > If you're doing longer computations: first think userspace. If you
> > **really** want to do it in kernel you should regularly check if you need
> > to give up the CPU (remember there is cooperative multitasking per CPU).
> > Idiom::
> > 
> >     cond_resched(); /* Will sleep */
> > 
> > 
> > replace cond_resched -> schedule
> > 
> > 
> > Since apparently cond_resched is no longer sufficient to
> > make the scheduler check whether you need to give up the CPU.
> > 
> > patch 2: make this change for vhost.
> > 
> > WDYT?
> 
> For patch 1, I would like to see some feedback from Peter (or someone else
> from the scheduler maintainers).

I am guessing once you post it you will see feedback.

> For patch 2, I would prefer to do some more testing first if this might have
> an negative effect on other benchmarks.
> 
> I also stumbled upon something in the scheduler code that I want to verify.
> Maybe a cgroup thing, will check that out again.
> 
> I'll do some more testing with the cond_resched->schedule fix, check the
> cgroup thing and wait for Peter then.
> Will get back if any of the above yields some results.
> 
> > 
> > -- 
> > MST
> > 
> > 


  parent reply	other threads:[~2024-02-01 12:08 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-16 18:58 EEVDF/vhost regression (bisected to 86bfbb7ce4f6 sched/fair: Add lag based placement) Tobias Huschle
2023-11-17  9:23 ` Peter Zijlstra
2023-11-17  9:58   ` Peter Zijlstra
2023-11-17 12:24   ` Tobias Huschle
2023-11-17 12:37     ` Peter Zijlstra
2023-11-17 13:07       ` Abel Wu
2023-11-21 13:17         ` Tobias Huschle
2023-11-22 10:00           ` Peter Zijlstra
2023-11-27 13:56             ` Tobias Huschle
     [not found]             ` <6564a012.c80a0220.adb78.f0e4SMTPIN_ADDED_BROKEN@mx.google.com>
2023-11-28  8:55               ` Abel Wu
2023-11-29  6:31                 ` Tobias Huschle
2023-12-07  6:22                 ` Tobias Huschle
     [not found]                 ` <07513.123120701265800278@us-mta-474.us.mimecast.lan>
2023-12-07  6:48                   ` Michael S. Tsirkin
2023-12-08  9:24                     ` Tobias Huschle
2023-12-08 17:28                       ` Mike Christie
     [not found]                     ` <56082.123120804242300177@us-mta-137.us.mimecast.lan>
2023-12-08 10:31                       ` Re: " Michael S. Tsirkin
2023-12-08 11:41                         ` Tobias Huschle
     [not found]                         ` <53044.123120806415900549@us-mta-342.us.mimecast.lan>
2023-12-09 10:42                           ` Michael S. Tsirkin
2023-12-11  7:26                             ` Jason Wang
2023-12-11 16:53                               ` Michael S. Tsirkin
2023-12-12  3:00                                 ` Jason Wang
2023-12-12 16:15                                   ` Michael S. Tsirkin
2023-12-13 10:37                                     ` Tobias Huschle
     [not found]                                     ` <42870.123121305373200110@us-mta-641.us.mimecast.lan>
2023-12-13 12:00                                       ` Michael S. Tsirkin
2023-12-13 12:45                                         ` Tobias Huschle
     [not found]                                         ` <25485.123121307454100283@us-mta-18.us.mimecast.lan>
2023-12-13 14:47                                           ` Michael S. Tsirkin
2023-12-13 14:55                                           ` Michael S. Tsirkin
2023-12-14  7:14                                             ` Michael S. Tsirkin
2024-01-08 13:13                                               ` Tobias Huschle
     [not found]                                               ` <92916.124010808133201076@us-mta-622.us.mimecast.lan>
2024-01-09 23:07                                                 ` Michael S. Tsirkin
2024-01-21 18:44                                                 ` Michael S. Tsirkin
2024-01-22 11:29                                                   ` Tobias Huschle
2024-02-01  7:38                                                   ` Tobias Huschle
     [not found]                                                   ` <07974.124020102385100135@us-mta-501.us.mimecast.lan>
2024-02-01  8:08                                                     ` Michael S. Tsirkin
2024-02-01 11:47                                                       ` Tobias Huschle
     [not found]                                                       ` <89460.124020106474400877@us-mta-475.us.mimecast.lan>
2024-02-01 12:08                                                         ` Michael S. Tsirkin [this message]
2024-02-22 19:23                                                         ` Michael S. Tsirkin
2024-03-11 17:05                                                         ` Michael S. Tsirkin
2024-03-12  9:45                                                           ` Luis Machado
2024-03-14 11:46                                                             ` Tobias Huschle
     [not found]                                                             ` <73123.124031407552500165@us-mta-156.us.mimecast.lan>
2024-03-14 15:09                                                               ` Michael S. Tsirkin
2024-03-15  8:33                                                                 ` Tobias Huschle
     [not found]                                                                 ` <84704.124031504335801509@us-mta-515.us.mimecast.lan>
2024-03-15 10:31                                                                   ` Michael S. Tsirkin
2024-03-19  8:21                                                                     ` Tobias Huschle
2024-03-19  8:29                                                                       ` Michael S. Tsirkin
2024-03-19  8:59                                                                         ` Tobias Huschle
2024-04-30 10:50                                                                           ` Tobias Huschle
2024-05-01 10:51                                                                             ` Peter Zijlstra
2024-05-01 15:31                                                                               ` Michael S. Tsirkin
2024-05-02  9:16                                                                                 ` Peter Zijlstra
2024-05-02 12:23                                                                                 ` Tobias Huschle
2024-05-02 12:20                                                                               ` Tobias Huschle
2023-11-18  5:14   ` Abel Wu
2023-11-20 10:56     ` Peter Zijlstra
2023-11-20 12:06       ` Abel Wu
2023-11-18  7:33 ` Abel Wu
2023-11-18 15:29   ` Honglei Wang
2023-11-19 13:29 ` Bagas Sanjaya

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240201070437-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=huschle@linux.ibm.com \
    --cc=jasowang@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=virtualization@lists.linux.dev \
    --cc=wuyun.abel@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.