Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls

linux-next.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Michael Ellerman <mpe@ellerman.id.au>
To: linux-tip-commits@vger.kernel.org
Cc: pmladek@suse.com, riel@redhat.com, linuxppc-dev@lists.ozlabs.org,
	luca.abeni@unitn.it, sergey.senozhatsky.work@gmail.com,
	matt@codeblueprint.co.uk, fweisbec@gmail.com, efault@gmx.de,
	yuyang.du@intel.com, linux-kernel@vger.kernel.org,
	byungchul.park@lge.com, peterz@infradead.org,
	torvalds@linux-foundation.org, umgwanakikbuti@gmail.com,
	linux-next@vger.kernel.org, hpa@zytor.com, jack@suse.cz,
	tglx@linutronix.de, mgorman@techsingularity.net,
	mingo@kernel.org, wanpeng.li@hotmail.com
Subject: Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
Date: Tue, 31 Jan 2017 08:24:53 +1100	[thread overview]
Message-ID: <87tw8gutp6.fsf@concordia.ellerman.id.au> (raw)
In-Reply-To: <tip-cb42c9a3ebbbb23448c3f9a25417fae6309b1a92@git.kernel.org>

tip-bot for Matt Fleming <tipbot@zytor.com> writes:

> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 98e7eee..6eeae7e 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -768,48 +768,110 @@ static inline u64 __rq_clock_broken(struct rq *rq)
>  	return READ_ONCE(rq->clock);
>  }
>  
> +/*
> + * rq::clock_update_flags bits
> + *
> + * %RQCF_REQ_SKIP - will request skipping of clock update on the next
> + *  call to __schedule(). This is an optimisation to avoid
> + *  neighbouring rq clock updates.
> + *
> + * %RQCF_ACT_SKIP - is set from inside of __schedule() when skipping is
> + *  in effect and calls to update_rq_clock() are being ignored.
> + *
> + * %RQCF_UPDATED - is a debug flag that indicates whether a call has been
> + *  made to update_rq_clock() since the last time rq::lock was pinned.
> + *
> + * If inside of __schedule(), clock_update_flags will have been
> + * shifted left (a left shift is a cheap operation for the fast path
> + * to promote %RQCF_REQ_SKIP to %RQCF_ACT_SKIP), so you must use,
> + *
> + *	if (rq-clock_update_flags >= RQCF_UPDATED)
> + *
> + * to check if %RQCF_UPADTED is set. It'll never be shifted more than
> + * one position though, because the next rq_unpin_lock() will shift it
> + * back.
> + */
> +#define RQCF_REQ_SKIP	0x01
> +#define RQCF_ACT_SKIP	0x02
> +#define RQCF_UPDATED	0x04
> +
> +static inline void assert_clock_updated(struct rq *rq)
> +{
> +	/*
> +	 * The only reason for not seeing a clock update since the
> +	 * last rq_pin_lock() is if we're currently skipping updates.
> +	 */
> +	SCHED_WARN_ON(rq->clock_update_flags < RQCF_ACT_SKIP);
> +}

I'm hitting this on multiple powerpc systems:

[   38.339126] rq->clock_update_flags < RQCF_ACT_SKIP
[   38.339134] ------------[ cut here ]------------
[   38.339142] WARNING: CPU: 2 PID: 1 at kernel/sched/sched.h:804 detach_task_cfs_rq+0xa0c/0xd10
[   38.339142] Modules linked in: fuse pseries_rng rng_core vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi scsi_transport_iscsi autofs4 btrfs raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c multipath raid10
[   38.339162] CPU: 2 PID: 1 Comm: systemd Not tainted 4.10.0-rc6-gccN-next-20170130-g14dcd7e #1
[   38.339163] task: c000000ffc900000 task.stack: c0000007fda00000
[   38.339164] NIP: c00000000012d31c LR: c00000000012d318 CTR: c0000000005aed80
[   38.339165] REGS: c0000007fda03630 TRAP: 0700   Not tainted  (4.10.0-rc6-gccN-next-20170130-g14dcd7e)
[   38.339166] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
[   38.339170]   CR: 28044242  XER: 20000000
[   38.339171] CFAR: c000000000a732c8 SOFTE: 0 
               GPR00: c00000000012d318 c0000007fda038b0 c000000000f32c00 0000000000000026 
               GPR04: 0000000000000000 00000000000001f7 6574616470755f6b 0000000000000000 
               GPR08: 00000004baa70000 c000000000e1edb0 0000000000000000 0000000000000e8a 
               GPR12: 0000000000004400 c00000000eaa0800 00003fffc25fdd50 000000003a30b518 
               GPR16: 000001002495fe70 0000000000000000 000000003a301668 000000000000001b 
               GPR20: 0000000000000001 c000000ff1d0a000 c0000004a8191300 c00000000b12fe90 
               GPR24: c000000000e80500 c000000001026dfc c0000007fda03aa0 c0000007fda03b30 
               GPR28: c0000007fda03b10 c0000004bb864880 c00000000b12fe80 c0000004acb8a900 
[   38.339186] NIP [c00000000012d31c] detach_task_cfs_rq+0xa0c/0xd10
[   38.339188] LR [c00000000012d318] detach_task_cfs_rq+0xa08/0xd10
[   38.339188] Call Trace:
[   38.339189] [c0000007fda038b0] [c00000000012d318] detach_task_cfs_rq+0xa08/0xd10 (unreliable)
[   38.339192] [c0000007fda03970] [c000000000133cb8] task_change_group_fair+0x48/0x130
[   38.339193] [c0000007fda039b0] [c0000000001139e4] sched_change_group+0x64/0xd0
[   38.339195] [c0000007fda039f0] [c000000000121790] sched_move_task+0x60/0x210
[   38.339197] [c0000007fda03a40] [c000000000121984] cpu_cgroup_attach+0x44/0x70
[   38.339200] [c0000007fda03a80] [c0000000001b84d0] cgroup_migrate_execute+0x410/0x460
[   38.339202] [c0000007fda03af0] [c0000000001b8790] cgroup_attach_task+0x180/0x350
[   38.339203] [c0000007fda03bb0] [c0000000001bb76c] __cgroup_procs_write+0x30c/0x450
[   38.339205] [c0000007fda03c40] [c0000000001b5644] cgroup_file_write+0x74/0x160
[   38.339208] [c0000007fda03ca0] [c0000000003fec00] kernfs_fop_write+0x190/0x260
[   38.339210] [c0000007fda03cf0] [c000000000344a0c] __vfs_write+0x6c/0x1c0
[   38.339211] [c0000007fda03d90] [c000000000346988] vfs_write+0xd8/0x240
[   38.339213] [c0000007fda03de0] [c00000000034882c] SyS_write+0x6c/0x110
[   38.339215] [c0000007fda03e30] [c00000000000b760] system_call+0x38/0xfc
[   38.339215] Instruction dump:
[   38.339217] 60420000 3d22fff4 8929130e 2f890000 409ef6c4 39200001 3d42fff4 3c62ffd2 
[   38.339220] 3863b610 992a130e 48945f65 60000000 <0fe00000> 4bfff6a0 60000000 60000000 
[   38.339223] ---[ end trace d76dabbf37156e2a ]---


I assume I should be worried?

cheers

next      parent reply	other threads:[~2017-01-30 21:24 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20160921133813.31976-8-matt@codeblueprint.co.uk>
     [not found] ` <tip-cb42c9a3ebbbb23448c3f9a25417fae6309b1a92@git.kernel.org>
2017-01-30 21:24   ` Michael Ellerman [this message]
2017-01-30 21:34     ` [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls Matt Fleming
2017-01-31  8:35       ` Michael Ellerman
2017-01-31 11:00       ` Sachin Sant
2017-01-31 11:48         ` Mike Galbraith
2017-01-31 17:22           ` Ross Zwisler
2017-02-02 15:55             ` Peter Zijlstra
2017-02-02 22:01               ` Matt Fleming
2017-02-03  3:05               ` Mike Galbraith
2017-02-03  4:33               ` Sachin Sant
2017-02-03  8:53                 ` Peter Zijlstra
2017-02-03 11:04                   ` Sachin Sant
2017-02-03 12:59                   ` Mike Galbraith
2017-02-03 13:37                     ` Peter Zijlstra
2017-02-03 13:52                       ` Mike Galbraith
2017-02-03 15:44                       ` Paul E. McKenney
2017-02-03 15:54                         ` Paul E. McKenney
2017-02-06  6:23                           ` Sachin Sant
2017-02-06 15:10                             ` Paul E. McKenney
2017-02-06 15:14                               ` Paul E. McKenney
2017-02-03 13:04               ` Borislav Petkov
2017-02-22  9:03               ` Wanpeng Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87tw8gutp6.fsf@concordia.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=byungchul.park@lge.com \
    --cc=efault@gmx.de \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luca.abeni@unitn.it \
    --cc=matt@codeblueprint.co.uk \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=riel@redhat.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=umgwanakikbuti@gmail.com \
    --cc=wanpeng.li@hotmail.com \
    --cc=yuyang.du@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).