From: Juri Lelli <juri.lelli@arm.com>
To: Vincent Legout <vincent@legout.info>
Cc: "peterz@infradead.org" <peterz@infradead.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"juri.lelli@gmail.com" <juri.lelli@gmail.com>,
"raistlin@linux.it" <raistlin@linux.it>,
"michael@amarulasolutions.com" <michael@amarulasolutions.com>,
"fchecconi@gmail.com" <fchecconi@gmail.com>,
"daniel.wagner@bmw-carit.de" <daniel.wagner@bmw-carit.de>,
"luca.abeni@unitn.it" <luca.abeni@unitn.it>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Li Zefan <lizefan@huawei.com>,
"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>
Subject: Re: [PATCH 2/3] sched/deadline: fix bandwidth check/update when migrating tasks between exclusive cpusets
Date: Tue, 23 Sep 2014 09:09:27 +0100
Message-ID: <54212AB7.3070406@arm.com>
In-Reply-To: <87k34vo3vb.fsf@cecht.legt.fr>
Hi Vincent,
On 22/09/14 20:24, Vincent Legout wrote:
> Hello,
>
> Juri Lelli <juri.lelli@arm.com> writes:
>
>> Exclusive cpusets are the only way users can restrict SCHED_DEADLINE tasks'
>> affinity (performing what is commonly called clustered scheduling).
>> Unfortunately, this is currently broken for two reasons:
>>
>> - No check is performed when the user tries to attach a task to
>> an exclusive cpuset (recall that exclusive cpusets have an
>> associated maximum allowed bandwidth).
>>
>> - Bandwidths of source and destination cpusets are not correctly
>> updated after a task is migrated between them.
>>
>> This patch fixes both things at once, as they are opposite faces
>> of the same coin.
>>
>> The check is performed in cpuset_can_attach(), as there aren't any
>> points of failure after that function. The update is split in two
>> halves. We first reserve bandwidth in the destination cpuset, after
>> we pass the check in cpuset_can_attach(). And we then release
>> bandwidth from the source cpuset when the task's affinity is
>> actually changed. Even if there can be time windows when sched_setattr()
>> may erroneously fail in the source cpuset, we are fine with it, as
>> we can't perform an atomic update of both cpusets at once.
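
A minimal sketch of the reserve-then-release scheme described above, assuming a simplified per-cpuset accounting structure. It is not the code from the patch: the names cpuset_bw, bw_ratio, bw_reserve and bw_release are hypothetical, and the 2^20 fixed-point scale is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cpuset bandwidth bookkeeping (illustrative names). */
struct cpuset_bw {
	uint64_t max_bw;   /* maximum bandwidth the cpuset may admit */
	uint64_t total_bw; /* bandwidth currently reserved by its tasks */
};

/* Utilization runtime/period as a fixed-point value scaled by 2^20
 * (the scale is an assumption of this sketch). */
static uint64_t bw_ratio(uint64_t period, uint64_t runtime)
{
	return period ? (runtime << 20) / period : 0;
}

/* First half: admission check plus reservation in the destination
 * cpuset. This is the only step that is allowed to fail. */
static bool bw_reserve(struct cpuset_bw *dst, uint64_t task_bw)
{
	if (dst->total_bw + task_bw > dst->max_bw)
		return false;
	dst->total_bw += task_bw;
	return true;
}

/* Second half: release from the source cpuset, done later, once the
 * task's affinity has actually changed. */
static void bw_release(struct cpuset_bw *src, uint64_t task_bw)
{
	assert(src->total_bw >= task_bw);
	src->total_bw -= task_bw;
}
```

Between the two halves the task's bandwidth is accounted in both cpusets, so a new reservation in the source cpuset can be refused during that window even though the task is about to leave it.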
>
> Thanks, this seems to fix the other problem I had. However, this bug,
> which I never had before, now happens randomly (with or without patch
> 3/3):
>
> Sep 19 09:54:37 starbuck kernel: [ 1309.728678] ------------[ cut here ]------------
> Sep 19 09:54:37 starbuck kernel: [ 1309.728699] kernel BUG at kernel/sched/deadline.c:819!
> Sep 19 09:54:37 starbuck kernel: [ 1309.728719] invalid opcode: 0000 [#1] PREEMPT SMP
> Sep 19 09:54:37 starbuck kernel: [ 1309.728744] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dm_crypt nfsd auth_rpcgss oid_registry exportfs nfs_acl nfs lockd sunrpc bridge stp llc lp coretemp kvm_intel kvm ppdev ioatdma microcode ipmi_si parport_pc lpc_ich dca mfd_core parport ipmi_msghandler joydev serio_raw hid_generic usbhid hid crc32c_intel psmouse e1000e ptp pps_core
> Sep 19 09:54:37 starbuck kernel: [ 1309.728928] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 3.16.0+ #20
> Sep 19 09:54:37 starbuck kernel: [ 1309.728950] Hardware name: empty empty/S7002, BIOS 'V1.10.B10 ' 05/03/2011
> Sep 19 09:54:37 starbuck kernel: [ 1309.728977] task: ffff88023691c920 ti: ffff88023692c000 task.ti: ffff88023692c000
> Sep 19 09:54:37 starbuck kernel: [ 1309.729003] RIP: 0010:[<ffffffff810a543e>] [<ffffffff810a543e>] enqueue_task_dl+0x44e/0x450
> Sep 19 09:54:37 starbuck kernel: [ 1309.729041] RSP: 0018:ffff88043fc23e68 EFLAGS: 00010082
> Sep 19 09:54:37 starbuck kernel: [ 1309.729060] RAX: 0000000000000000 RBX: ffff880434edb0c0 RCX: ffff880434edb2f8
> Sep 19 09:54:37 starbuck kernel: [ 1309.729086] RDX: 0000000000000008 RSI: ffff880434edb0c0 RDI: 0000000000000000
> Sep 19 09:54:37 starbuck kernel: [ 1309.729140] RBP: ffff88043fc23ea8 R08: 0000000000000001 R09: 000002cb4aebb39d
> Sep 19 09:54:37 starbuck kernel: [ 1309.729193] R10: 13955a8129438cf2 R11: 0000000000000202 R12: 0000000000000008
> Sep 19 09:54:37 starbuck kernel: [ 1309.729247] R13: ffff88043fc33f00 R14: ffff88043fc2e0e0 R15: ffff880434edb2f8
> Sep 19 09:54:37 starbuck kernel: [ 1309.729301] FS: 0000000000000000(0000) GS:ffff88043fc20000(0000) knlGS:0000000000000000
> Sep 19 09:54:37 starbuck kernel: [ 1309.729383] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Sep 19 09:54:37 starbuck kernel: [ 1309.729436] CR2: 0000000000000000 CR3: 0000000435a07000 CR4: 00000000000027e0
>
This may be related to the problems Daniel is also experiencing.
> My script launches 6 processes and schedules them on 2 cpusets where
> each cpuset contains only one cpu. It moves processes from one cpuset to
> another and also updates their runtime. I can investigate more and try
> to provide a short script to reproduce if needed.
>
I should be able to dig into this next week. But yes, in the meantime a
script would be useful to reproduce the problem.
Thanks,
- Juri