* Weird sched_clock behaviour during boot with -rc1
@ 2014-02-04 18:36 Will Deacon
  2014-02-04 20:46 ` John Stultz
  0 siblings, 1 reply; 15+ messages in thread

From: Will Deacon @ 2014-02-04 18:36 UTC (permalink / raw)
To: linux-arm-kernel

Hi guys,

Booting -rc1 on my TC2 gives the following strange entries in the dmesg:

Uncompressing Linux... done, booting the kernel.
[    0.000000] Booting Linux on physical CPU 0x0

[...]

[    0.000000] HighMem zone: 329728 pages, LIFO batch:31
[    7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns
[    0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864

[...]

[    0.868297] NR_IRQS:16 nr_irqs:16 16
[    0.886350] Architected cp15 timer(s) running at 24.00MHz (phys).
[ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns
[    0.000002] Switching to timer-based delay loop
[    0.014249] Console: colour dummy device 80x30

so it looks like something whacky goes on during sched_clock registration.
Sure enough, we're doing a pr_info in-between updating cd.* and calling
update_sched_clock(), so moving the print sorts things out (diff below).

What I can't figure out is why this has suddenly started happening with
3.14. Any ideas?
Cheers,

Will

--->8

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0abb36464281..f3de5d113afc 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -147,9 +147,6 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	/* calculate the ns resolution of this counter */
 	res = cyc_to_ns(1ULL, cd.mult, cd.shift);
 
-	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
-		bits, r, r_unit, res, wrap);
-
 	update_sched_clock();
 
 	/*
@@ -161,6 +158,9 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
 		enable_sched_clock_irqtime();
 
+	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
+		bits, r, r_unit, res, wrap);
+
 	pr_debug("Registered %pF as sched_clock source\n", read);
 }

^ permalink raw reply related	[flat|nested] 15+ messages in thread
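The bogus stamps fall straight out of the sched_clock() arithmetic. Below is a minimal user-space model of that arithmetic (the real code lives in kernel/time/sched_clock.c; the mult/shift values here are assumptions chosen to approximate the 24 MHz counter, and the helper names are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* Model of the kernel's cyc_to_ns(): ns = (cyc * mult) >> shift. */
static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
	return (cyc * mult) >> shift;
}

/* The core of sched_clock(): cycles elapsed since the epoch, masked
 * to the counter width, converted to ns. */
static uint64_t sched_clock_ns(uint64_t cyc, uint64_t epoch_cyc,
			       uint64_t epoch_ns, uint64_t mask,
			       uint32_t mult, uint32_t shift)
{
	return epoch_ns + cyc_to_ns((cyc - epoch_cyc) & mask, mult, shift);
}

/* Assumed mult/shift approximating 24 MHz -> ns (~41.67 ns/tick);
 * the kernel derives the real values via clocks_calc_mult_shift(). */
#define MULT	699050667u
#define SHIFT	24u

/* Consistent snapshot: 24 ticks past the epoch reads as 1000 ns. */
static uint64_t consistent_ns(void)
{
	return sched_clock_ns(1024, 1000, 0, UINT64_MAX >> 8, MULT, SHIFT);
}

/* Torn snapshot: the new counter's reading is paired with the *old*
 * counter's epoch, so the masked delta is huge and the clock appears
 * to leap far into the future. */
static uint64_t torn_ns(void)
{
	return sched_clock_ns(24000100000ULL, 100000, 0,
			      UINT64_MAX >> 8, MULT, SHIFT);
}
```

With a consistent snapshot the clock advances by microseconds; with a torn one it leaps roughly 1000 seconds, which is exactly the flavour of the 7.7 s and 2915 s stamps in the log above.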
* Weird sched_clock behaviour during boot with -rc1 2014-02-04 18:36 Weird sched_clock behaviour during boot with -rc1 Will Deacon @ 2014-02-04 20:46 ` John Stultz 2014-02-04 22:00 ` Stephen Boyd 0 siblings, 1 reply; 15+ messages in thread From: John Stultz @ 2014-02-04 20:46 UTC (permalink / raw) To: linux-arm-kernel On 02/04/2014 10:36 AM, Will Deacon wrote: > Hi guys, > > Booting -rc1 on my TC2 gives the following strange entries in the dmesg: > > > Uncompressing Linux... done, booting the kernel. > [ 0.000000] Booting Linux on physical CPU 0x0 > > [...] > > [ 0.000000] HighMem zone: 329728 pages, LIFO batch:31 > [ 7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns > [ 0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864 > > [...] > > [ 0.868297] NR_IRQS:16 nr_irqs:16 16 > [ 0.886350] Architected cp15 timer(s) running at 24.00MHz (phys). > [ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns > [ 0.000002] Switching to timer-based delay loop > [ 0.014249] Console: colour dummy device 80x30 > > > so it looks like something whacky goes on during sched_clock registration. > Sure enough, we're doing a pr_info in-between updating cs.* and calling > update_sched_clock(), so moving the print sorts things out (diff below). Yea... we have to be particularly careful with sched_clock to avoid locks since we don't want to deadlock, but in this case sched_clock_register is a little too relaxed here. Stephen: Would it make sense to set cd.suspended = true at the top of the registration? That should block any sched_clock calls from getting half-updated data, but still allow the sched_clock_update function to work. > What I can't figure out is why this has suddenly started happening with > 3.14. Any ideas? No clue. I'm guessing something like timing changes in the printk paths that call sched_clock? 
I suspect your patch to move the print down will also be a good idea,
along with suspending sched_clock while we register new ones.

thanks
-john

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Weird sched_clock behaviour during boot with -rc1 2014-02-04 20:46 ` John Stultz @ 2014-02-04 22:00 ` Stephen Boyd 2014-02-05 21:47 ` Josh Cartwright 2014-02-07 18:23 ` John Stultz 0 siblings, 2 replies; 15+ messages in thread From: Stephen Boyd @ 2014-02-04 22:00 UTC (permalink / raw) To: linux-arm-kernel On 02/04, John Stultz wrote: > On 02/04/2014 10:36 AM, Will Deacon wrote: > > Hi guys, > > > > Booting -rc1 on my TC2 gives the following strange entries in the dmesg: > > > > > > Uncompressing Linux... done, booting the kernel. > > [ 0.000000] Booting Linux on physical CPU 0x0 > > > > [...] > > > > [ 0.000000] HighMem zone: 329728 pages, LIFO batch:31 > > [ 7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns > > [ 0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864 > > > > [...] > > > > [ 0.868297] NR_IRQS:16 nr_irqs:16 16 > > [ 0.886350] Architected cp15 timer(s) running at 24.00MHz (phys). > > [ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns > > [ 0.000002] Switching to timer-based delay loop > > [ 0.014249] Console: colour dummy device 80x30 > > > > > > so it looks like something whacky goes on during sched_clock registration. > > Sure enough, we're doing a pr_info in-between updating cs.* and calling > > update_sched_clock(), so moving the print sorts things out (diff below). > > Yea... we have to be particularly careful with sched_clock to avoid > locks since we don't want to deadlock, but in this case > sched_clock_register is a little too relaxed here. > > Stephen: Would it make sense to set cd.suspended = true at the top of > the registration? That should block any sched_clock calls from getting > half-updated data, but still allow the sched_clock_update function to work. > That would work, but why can't we just hold the write seqlock during the registration? We would need to make a lockeless version of update_sched_clock() but that doesn't look too hard. 
It might actually turn out nicer because we call
update_sched_clock() here just to set the epoch_cyc but we have
to reset the epoch_ns back to 0 to start the count off right.

How about this? The only concern is calling read_sched_clock()
inside the seqlock, but I don't think that's a concern and if it
is we can call it outside the lock at the beginning of this
function.

----8<----
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0abb36464281..ae8f10a02c32 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -124,6 +124,7 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 		return;
 
 	WARN_ON(!irqs_disabled());
+	raw_write_seqcount_begin(&cd.seq);
 	read_sched_clock = read;
 	sched_clock_mask = CLOCKSOURCE_MASK(bits);
 	cd.rate = rate;
@@ -147,15 +148,16 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	/* calculate the ns resolution of this counter */
 	res = cyc_to_ns(1ULL, cd.mult, cd.shift);
 
-	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
-		bits, r, r_unit, res, wrap);
-
-	update_sched_clock();
 	/*
 	 * Ensure that sched_clock() starts off at 0ns
 	 */
 	cd.epoch_ns = 0;
+	cd.epoch_cyc = read_sched_clock();
+	raw_write_seqcount_end(&cd.seq);
+
+	pr_info("sched_clock: %u bits@%lu%cHz, resolution %lluns, wraps every %lluns\n",
+		bits, r, r_unit, res, wrap);
 
 	/* Enable IRQ time accounting if we have a fast enough sched_clock */

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 15+ messages in thread
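The protection Stephen is proposing rests on the seqcount protocol: a writer makes the sequence number odd for the duration of the update, and readers discard any snapshot taken while the number is odd or changed underneath them. A stripped-down, single-threaded model of the idea (memory barriers and the reader's retry loop are omitted for brevity, and the names are illustrative, not the kernel's seqlock API):

```c
#include <assert.h>
#include <stdint.h>

struct clock_data {
	unsigned int seq;	/* even: data stable; odd: write in progress */
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint32_t mult;
	uint32_t shift;
};

/* Write side: bracket every update so seq is odd while the fields
 * are in flux. (Real code also needs memory barriers.) */
static void write_begin(struct clock_data *cd) { cd->seq++; }
static void write_end(struct clock_data *cd)   { cd->seq++; }

/* Read side: refuse a snapshot taken while a writer is active.
 * sched_clock() retries in a loop; here we simply report failure. */
static int try_read_ns(const struct clock_data *cd, uint64_t cyc,
		       uint64_t *ns)
{
	unsigned int start = cd->seq;

	if (start & 1)
		return -1;	/* writer in progress */
	*ns = cd->epoch_ns +
	      (((cyc - cd->epoch_cyc) * cd->mult) >> cd->shift);
	return cd->seq == start ? 0 : -1;
}

/* Walk through a registration: without the bracket a reader could mix
 * the old epoch_ns with the new epoch_cyc; with it, the reader backs
 * off until the update is complete. Returns 0 on success. */
static int demo(void)
{
	struct clock_data cd = { .seq = 0, .mult = 1, .shift = 0 };
	uint64_t ns;

	if (try_read_ns(&cd, 100, &ns) != 0 || ns != 100)
		return 1;

	write_begin(&cd);
	cd.epoch_cyc = 100;		/* half-updated state ... */
	if (try_read_ns(&cd, 150, &ns) == 0)
		return 2;		/* reader must not see this */
	cd.epoch_ns = 100;		/* ... now consistent again */
	write_end(&cd);

	if (try_read_ns(&cd, 150, &ns) != 0 || ns != 150)
		return 3;
	return 0;
}
```

This is why wrapping the whole of sched_clock_register() in the write side closes the window Will hit: the torn intermediate state is simply unobservable to readers.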
* Weird sched_clock behaviour during boot with -rc1 2014-02-04 22:00 ` Stephen Boyd @ 2014-02-05 21:47 ` Josh Cartwright 2014-02-07 18:23 ` John Stultz 1 sibling, 0 replies; 15+ messages in thread From: Josh Cartwright @ 2014-02-05 21:47 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 04, 2014 at 02:00:45PM -0800, Stephen Boyd wrote: > On 02/04, John Stultz wrote: > > On 02/04/2014 10:36 AM, Will Deacon wrote: > > > Hi guys, > > > > > > Booting -rc1 on my TC2 gives the following strange entries in the dmesg: > > > > > > > > > Uncompressing Linux... done, booting the kernel. > > > [ 0.000000] Booting Linux on physical CPU 0x0 > > > > > > [...] > > > > > > [ 0.000000] HighMem zone: 329728 pages, LIFO batch:31 > > > [ 7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns > > > [ 0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864 > > > > > > [...] > > > > > > [ 0.868297] NR_IRQS:16 nr_irqs:16 16 > > > [ 0.886350] Architected cp15 timer(s) running at 24.00MHz (phys). > > > [ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns > > > [ 0.000002] Switching to timer-based delay loop > > > [ 0.014249] Console: colour dummy device 80x30 > > > > > > > > > so it looks like something whacky goes on during sched_clock registration. > > > Sure enough, we're doing a pr_info in-between updating cs.* and calling > > > update_sched_clock(), so moving the print sorts things out (diff below). > > > > Yea... we have to be particularly careful with sched_clock to avoid > > locks since we don't want to deadlock, but in this case > > sched_clock_register is a little too relaxed here. > > > > Stephen: Would it make sense to set cd.suspended = true at the top of > > the registration? That should block any sched_clock calls from getting > > half-updated data, but still allow the sched_clock_update function to work. 
> > 
> 
> That would work, but why can't we just hold the write seqlock
> during the registration? We would need to make a lockeless
> version of update_sched_clock() but that doesn't look too hard.
> It might actually turn out nicer because we call
> update_sched_clock() here just to set the epoch_cyc but we have
> to reset the epoch_ns back to 0 to start the count off right.
> 
> How about this? The only concern is calling read_sched_clock()
> inside the seqlock, but I don't think that's a concern and if it
> is we can call it outside the lock at the beginning of this
> function.

If we go down this route, it looks like there is an opportunity to
rearrange the logic a bit to front-load the clocksource calculations,
minimizing the amount of time the lock is held.

----8<----
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index e5387a0..9bad1f7 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -116,20 +116,40 @@ static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
 void __init sched_clock_register(u64 (*read)(void), int bits,
 				 unsigned long rate)
 {
+	u64 res, wrap, new_mask;
+	u32 new_mult, new_shift;
+	ktime_t new_wrap_kt;
 	unsigned long r;
-	u64 res, wrap;
 	char r_unit;
 
 	if (cd.rate > rate)
 		return;
 
 	WARN_ON(!irqs_disabled());
+
+	/* calculate the mult/shift to convert counter ticks to ns. */
+	clocks_calc_mult_shift(&new_mult, &new_shift, rate, NSEC_PER_SEC, 3600);
+
+	new_mask = CLOCKSOURCE_MASK(bits);
+
+	/* calculate how many ns until we wrap */
+	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
+	new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+
+	raw_write_seqcount_begin(&cd.seq);
 	read_sched_clock = read;
-	sched_clock_mask = CLOCKSOURCE_MASK(bits);
+	sched_clock_mask = new_mask;
 	cd.rate = rate;
+	cd.wrap_kt = new_wrap_kt;
+	cd.mult = new_mult;
+	cd.shift = new_shift;
 
-	/* calculate the mult/shift to convert counter ticks to ns. */
-	clocks_calc_mult_shift(&cd.mult, &cd.shift, rate, NSEC_PER_SEC, 3600);
+	/*
+	 * Ensure that sched_clock() starts off at 0ns
+	 */
+	cd.epoch_ns = 0;
+	cd.epoch_cyc = read_sched_clock();
+	raw_write_seqcount_end(&cd.seq);
 
 	r = rate;
 	/*
@@ -145,22 +165,12 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	} else
 		r_unit = ' ';
 
-	/* calculate how many ns until we wrap */
-	wrap = clocks_calc_max_nsecs(cd.mult, cd.shift, 0, sched_clock_mask);
-	cd.wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
-
 	/* calculate the ns resolution of this counter */
-	res = cyc_to_ns(1ULL, cd.mult, cd.shift);
+	res = cyc_to_ns(1ULL, new_mult, new_shift);
+
 	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
 		bits, r, r_unit, res, wrap);
 
-	update_sched_clock();
-
-	/*
-	 * Ensure that sched_clock() starts off at 0ns
-	 */
-	cd.epoch_ns = 0;
-
 	/* Enable IRQ time accounting if we have a fast enough sched_clock */
 	if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
 		enable_sched_clock_irqtime();

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 15+ messages in thread
* Weird sched_clock behaviour during boot with -rc1 2014-02-04 22:00 ` Stephen Boyd 2014-02-05 21:47 ` Josh Cartwright @ 2014-02-07 18:23 ` John Stultz 2014-02-07 19:37 ` Stephen Boyd 2014-02-07 20:48 ` [PATCH] sched_clock: Prevent callers from seeing half-updated data Stephen Boyd 1 sibling, 2 replies; 15+ messages in thread From: John Stultz @ 2014-02-07 18:23 UTC (permalink / raw) To: linux-arm-kernel On 02/04/2014 02:00 PM, Stephen Boyd wrote: > On 02/04, John Stultz wrote: >> On 02/04/2014 10:36 AM, Will Deacon wrote: >>> Hi guys, >>> >>> Booting -rc1 on my TC2 gives the following strange entries in the dmesg: >>> >>> >>> Uncompressing Linux... done, booting the kernel. >>> [ 0.000000] Booting Linux on physical CPU 0x0 >>> >>> [...] >>> >>> [ 0.000000] HighMem zone: 329728 pages, LIFO batch:31 >>> [ 7.789662] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956969942ns >>> [ 0.000129] PERCPU: Embedded 9 pages/cpu @ee7bd000 s12800 r8192 d15872 u36864 >>> >>> [...] >>> >>> [ 0.868297] NR_IRQS:16 nr_irqs:16 16 >>> [ 0.886350] Architected cp15 timer(s) running at 24.00MHz (phys). >>> [ 2915.164998] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 2863311519744ns >>> [ 0.000002] Switching to timer-based delay loop >>> [ 0.014249] Console: colour dummy device 80x30 >>> >>> >>> so it looks like something whacky goes on during sched_clock registration. >>> Sure enough, we're doing a pr_info in-between updating cs.* and calling >>> update_sched_clock(), so moving the print sorts things out (diff below). >> Yea... we have to be particularly careful with sched_clock to avoid >> locks since we don't want to deadlock, but in this case >> sched_clock_register is a little too relaxed here. >> >> Stephen: Would it make sense to set cd.suspended = true at the top of >> the registration? That should block any sched_clock calls from getting >> half-updated data, but still allow the sched_clock_update function to work. 
>> > That would work, but why can't we just hold the write seqlock > during the registration? We would need to make a lockeless > version of update_sched_clock() but that doesn't look too hard. > It might actually turn out nicer because we call > update_sched_clock() here just to set the epoch_cyc but we have > to reset the epoch_ns back to 0 to start the count off right. > > How about this? The only concern is calling read_sched_clock() > inside the seqlock, but I don't think that's a concern and if it > is we can call it outside the lock at the beginning of this > function. Hey Stephen, So whats the story here? Are we waiting on an ack from Will or would you rather go with Josh's approach? thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* Weird sched_clock behaviour during boot with -rc1 2014-02-07 18:23 ` John Stultz @ 2014-02-07 19:37 ` Stephen Boyd 2014-02-07 20:48 ` [PATCH] sched_clock: Prevent callers from seeing half-updated data Stephen Boyd 1 sibling, 0 replies; 15+ messages in thread From: Stephen Boyd @ 2014-02-07 19:37 UTC (permalink / raw) To: linux-arm-kernel On 02/07, John Stultz wrote: > On 02/04/2014 02:00 PM, Stephen Boyd wrote: > >> > > That would work, but why can't we just hold the write seqlock > > during the registration? We would need to make a lockeless > > version of update_sched_clock() but that doesn't look too hard. > > It might actually turn out nicer because we call > > update_sched_clock() here just to set the epoch_cyc but we have > > to reset the epoch_ns back to 0 to start the count off right. > > > > How about this? The only concern is calling read_sched_clock() > > inside the seqlock, but I don't think that's a concern and if it > > is we can call it outside the lock at the beginning of this > > function. > > So whats the story here? Are we waiting on an ack from Will or would you > rather go with Josh's approach? An acked-by/tested-by from Will would be good. I'll cook up a patch right now to do everything that has been mentioned in this thread. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] sched_clock: Prevent callers from seeing half-updated data
  2014-02-07 18:23 ` John Stultz
  2014-02-07 19:37   ` Stephen Boyd
@ 2014-02-07 20:48   ` Stephen Boyd
  2014-02-07 22:22     ` Stephen Boyd
  2014-02-07 22:28     ` [PATCH v2] " Stephen Boyd
  1 sibling, 2 replies; 15+ messages in thread

From: Stephen Boyd @ 2014-02-07 20:48 UTC (permalink / raw)
To: linux-arm-kernel

If two sched_clock sources are registered we may end up in a
situation where a call to sched_clock() may be accessing the
epoch cycle count for the old counter and the cycle count for the
new counter. This can lead to confusing results where
sched_clock() values jump and then are reset to 0 (due to the way
the registration function forces the epoch_ns to be 0). Fix this
by reorganizing the registration function to hold the seqlock for
as short a time as possible while we update the clock_data
structure for a new counter and stop resetting the epoch_ns count
to 0.

Reported-by: Will Deacon <will.deacon@arm.com>
Cc: Josh Cartwright <joshc@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 kernel/time/sched_clock.c | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0abb36464281..36ffce3a4b83 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -116,20 +116,38 @@ static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
 void __init sched_clock_register(u64 (*read)(void), int bits,
 				 unsigned long rate)
 {
+	u64 res, wrap, new_mask, new_epoch;
+	u32 new_mult, new_shift;
+	ktime_t new_wrap_kt;
 	unsigned long r;
-	u64 res, wrap;
 	char r_unit;
 
 	if (cd.rate > rate)
 		return;
 
 	WARN_ON(!irqs_disabled());
-	read_sched_clock = read;
-	sched_clock_mask = CLOCKSOURCE_MASK(bits);
-	cd.rate = rate;
 
 	/* calculate the mult/shift to convert counter ticks to ns. */
-	clocks_calc_mult_shift(&cd.mult, &cd.shift, rate, NSEC_PER_SEC, 3600);
+	clocks_calc_mult_shift(&new_mult, &new_shift, rate, NSEC_PER_SEC, 3600);
+
+	new_mask = CLOCKSOURCE_MASK(bits);
+
+	/* calculate how many ns until we wrap */
+	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
+	new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+
+	/* update epoch for the new counter */
+	new_epoch = read();
+
+	raw_write_seqcount_begin(&cd.seq);
+	read_sched_clock = read;
+	sched_clock_mask = new_mask;
+	cd.rate = rate;
+	cd.wrap_kt = new_wrap_kt;
+	cd.mult = new_mult;
+	cd.shift = new_shift;
+	cd.epoch_cyc = new_epoch;
+	raw_write_seqcount_end(&cd.seq);
 
 	r = rate;
 	if (r >= 4000000) {
@@ -141,22 +159,12 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	} else
 		r_unit = ' ';
 
-	/* calculate how many ns until we wrap */
-	wrap = clocks_calc_max_nsecs(cd.mult, cd.shift, 0, sched_clock_mask);
-	cd.wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
-
 	/* calculate the ns resolution of this counter */
-	res = cyc_to_ns(1ULL, cd.mult, cd.shift);
+	res = cyc_to_ns(1ULL, new_mult, new_shift);
+
 	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
 		bits, r, r_unit, res, wrap);
 
-	update_sched_clock();
-
-	/*
-	 * Ensure that sched_clock() starts off at 0ns
-	 */
-	cd.epoch_ns = 0;
-
 	/* Enable IRQ time accounting if we have a fast enough sched_clock */
 	if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
 		enable_sched_clock_irqtime();

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 15+ messages in thread
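A note on the wrap bookkeeping the patch moves around: wrap is the span of nanoseconds the counter can represent, and the poll timer is armed at wrap - (wrap >> 3), i.e. 7/8 of that span, so update_sched_clock() can refresh the epoch before the counter overflows. A rough model of the arithmetic (mult/shift assumed for 24 MHz; the kernel's clocks_calc_max_nsecs() additionally clamps wide counters to avoid 64-bit multiply overflow, which this simplified version omits):

```c
#include <assert.h>
#include <stdint.h>

/* ns = (cyc * mult) >> shift, as in the kernel's cyc_to_ns(). */
static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
	return (cyc * mult) >> shift;
}

/* Maximum ns a `bits`-wide counter can represent before it wraps.
 * Only valid for counters narrow enough that mask * mult fits in
 * 64 bits (true for the 32-bit case modelled here). */
static uint64_t wrap_ns(int bits, uint32_t mult, uint32_t shift)
{
	uint64_t mask = (bits < 64) ? ((1ULL << bits) - 1) : UINT64_MAX;

	return cyc_to_ns(mask, mult, shift);
}

/* The hrtimer is re-armed at 7/8 of the wrap interval, leaving a
 * 1/8 safety margin before the counter actually overflows. */
static uint64_t poll_interval_ns(uint64_t wrap)
{
	return wrap - (wrap >> 3);
}
```

For a 32-bit counter at 24 MHz this comes out at roughly 179 seconds, matching the first dmesg line in the report (the exact value depends on the mult/shift pair the kernel picks).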
* [PATCH] sched_clock: Prevent callers from seeing half-updated data 2014-02-07 20:48 ` [PATCH] sched_clock: Prevent callers from seeing half-updated data Stephen Boyd @ 2014-02-07 22:22 ` Stephen Boyd 2014-02-07 22:28 ` John Stultz 2014-02-07 22:28 ` [PATCH v2] " Stephen Boyd 1 sibling, 1 reply; 15+ messages in thread From: Stephen Boyd @ 2014-02-07 22:22 UTC (permalink / raw) To: linux-arm-kernel On 02/07, Stephen Boyd wrote: > If two sched_clock sources are registered we may end up in a > situation where a call to sched_clock() may be accessing the > epoch cycle count for the old counter and the cycle count for the > new counter. This can lead to confusing results where > sched_clock() values jump and then are reset to 0 (due to the way > the registration function forces the epoch_ns to be 0). Fix this > by reorganizing the registration function to hold the seqlock for > as short a time as possible while we update the clock_data > structure for a new counter and stop resetting the epoch_ns count > to 0. Hmm.. This won't properly accumulate time. We need to put whatever time has elapsed into epoch_ns when we register the new counter for this to work. I don't have a board with this configuration but I'll send a v2 that should fix this. Hopefully Will can test it. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] sched_clock: Prevent callers from seeing half-updated data 2014-02-07 22:22 ` Stephen Boyd @ 2014-02-07 22:28 ` John Stultz 2014-02-11 6:49 ` Stephen Boyd 0 siblings, 1 reply; 15+ messages in thread From: John Stultz @ 2014-02-07 22:28 UTC (permalink / raw) To: linux-arm-kernel On 02/07/2014 02:22 PM, Stephen Boyd wrote: > On 02/07, Stephen Boyd wrote: >> If two sched_clock sources are registered we may end up in a >> situation where a call to sched_clock() may be accessing the >> epoch cycle count for the old counter and the cycle count for the >> new counter. This can lead to confusing results where >> sched_clock() values jump and then are reset to 0 (due to the way >> the registration function forces the epoch_ns to be 0). Fix this >> by reorganizing the registration function to hold the seqlock for >> as short a time as possible while we update the clock_data >> structure for a new counter and stop resetting the epoch_ns count >> to 0. > Hmm.. This won't properly accumulate time. We need to put > whatever time has elapsed into epoch_ns when we register the new > counter for this to work. I don't have a board with this > configuration but I'll send a v2 that should fix this. Hopefully > Will can test it. Also maybe clarify in the commit message that this is a result of not having the necessary locking in place in the registration code (likely due to it not really being required in the single clock case), just so Ingo and others have some more context as to why this is needed now and wasn't hit before. thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] sched_clock: Prevent callers from seeing half-updated data 2014-02-07 22:28 ` John Stultz @ 2014-02-11 6:49 ` Stephen Boyd 2014-02-17 18:13 ` John Stultz 0 siblings, 1 reply; 15+ messages in thread From: Stephen Boyd @ 2014-02-11 6:49 UTC (permalink / raw) To: linux-arm-kernel On 02/07, John Stultz wrote: > On 02/07/2014 02:22 PM, Stephen Boyd wrote: > > On 02/07, Stephen Boyd wrote: > >> If two sched_clock sources are registered we may end up in a > >> situation where a call to sched_clock() may be accessing the > >> epoch cycle count for the old counter and the cycle count for the > >> new counter. This can lead to confusing results where > >> sched_clock() values jump and then are reset to 0 (due to the way > >> the registration function forces the epoch_ns to be 0). Fix this > >> by reorganizing the registration function to hold the seqlock for > >> as short a time as possible while we update the clock_data > >> structure for a new counter and stop resetting the epoch_ns count > >> to 0. > > Hmm.. This won't properly accumulate time. We need to put > > whatever time has elapsed into epoch_ns when we register the new > > counter for this to work. I don't have a board with this > > configuration but I'll send a v2 that should fix this. Hopefully > > Will can test it. > > Also maybe clarify in the commit message that this is a result of not > having the necessary locking in place in the registration code (likely > due to it not really being required in the single clock case), just so > Ingo and others have some more context as to why this is needed now and > wasn't hit before. > Hmph... I already sent v2 before you replied. Is the commit text good enough? I do mention that this is about two sched_clock sources being registered. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] sched_clock: Prevent callers from seeing half-updated data 2014-02-11 6:49 ` Stephen Boyd @ 2014-02-17 18:13 ` John Stultz 0 siblings, 0 replies; 15+ messages in thread From: John Stultz @ 2014-02-17 18:13 UTC (permalink / raw) To: linux-arm-kernel On 02/10/2014 10:49 PM, Stephen Boyd wrote: > On 02/07, John Stultz wrote: >> On 02/07/2014 02:22 PM, Stephen Boyd wrote: >>> On 02/07, Stephen Boyd wrote: >>>> If two sched_clock sources are registered we may end up in a >>>> situation where a call to sched_clock() may be accessing the >>>> epoch cycle count for the old counter and the cycle count for the >>>> new counter. This can lead to confusing results where >>>> sched_clock() values jump and then are reset to 0 (due to the way >>>> the registration function forces the epoch_ns to be 0). Fix this >>>> by reorganizing the registration function to hold the seqlock for >>>> as short a time as possible while we update the clock_data >>>> structure for a new counter and stop resetting the epoch_ns count >>>> to 0. >>> Hmm.. This won't properly accumulate time. We need to put >>> whatever time has elapsed into epoch_ns when we register the new >>> counter for this to work. I don't have a board with this >>> configuration but I'll send a v2 that should fix this. Hopefully >>> Will can test it. >> Also maybe clarify in the commit message that this is a result of not >> having the necessary locking in place in the registration code (likely >> due to it not really being required in the single clock case), just so >> Ingo and others have some more context as to why this is needed now and >> wasn't hit before. >> > Hmph... I already sent v2 before you replied. Is the commit text > good enough? I do mention that this is about two sched_clock > sources being registered. I'll tweak the commit message a bit make this point more clear. thanks -john ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2] sched_clock: Prevent callers from seeing half-updated data
  2014-02-07 20:48 ` [PATCH] sched_clock: Prevent callers from seeing half-updated data Stephen Boyd
  2014-02-07 22:22   ` Stephen Boyd
@ 2014-02-07 22:28   ` Stephen Boyd
  2014-02-10 11:14     ` Will Deacon
  1 sibling, 1 reply; 15+ messages in thread

From: Stephen Boyd @ 2014-02-07 22:28 UTC (permalink / raw)
To: linux-arm-kernel

If two sched_clock sources are registered we may end up in a
situation where a call to sched_clock() may be accessing the
epoch cycle count for the old counter and the cycle count for the
new counter. This can lead to confusing results where
sched_clock() values jump and then are reset to 0 (due to the way
the registration function forces the epoch_ns to be 0). Fix this
by reorganizing the registration function to hold the seqlock for
as short a time as possible while we update the clock_data
structure for a new counter. We also put any accumulated time
into epoch_ns instead of resetting the time to 0 so that the clock
doesn't reset after each successful registration.

Reported-by: Will Deacon <will.deacon@arm.com>
Cc: Josh Cartwright <joshc@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---

Changes since v1:
 * Put elapsed time into epoch_ns

 kernel/time/sched_clock.c | 46 +++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0abb36464281..4d23dc4d8139 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -116,20 +116,42 @@ static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
 void __init sched_clock_register(u64 (*read)(void), int bits,
 				 unsigned long rate)
 {
+	u64 res, wrap, new_mask, new_epoch, cyc, ns;
+	u32 new_mult, new_shift;
+	ktime_t new_wrap_kt;
 	unsigned long r;
-	u64 res, wrap;
 	char r_unit;
 
 	if (cd.rate > rate)
 		return;
 
 	WARN_ON(!irqs_disabled());
-	read_sched_clock = read;
-	sched_clock_mask = CLOCKSOURCE_MASK(bits);
-	cd.rate = rate;
 
 	/* calculate the mult/shift to convert counter ticks to ns. */
-	clocks_calc_mult_shift(&cd.mult, &cd.shift, rate, NSEC_PER_SEC, 3600);
+	clocks_calc_mult_shift(&new_mult, &new_shift, rate, NSEC_PER_SEC, 3600);
+
+	new_mask = CLOCKSOURCE_MASK(bits);
+
+	/* calculate how many ns until we wrap */
+	wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask);
+	new_wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
+
+	/* update epoch for new counter and update epoch_ns from old counter*/
+	new_epoch = read();
+	cyc = read_sched_clock();
+	ns = cd.epoch_ns + cyc_to_ns((cyc - cd.epoch_cyc) & sched_clock_mask,
+			  cd.mult, cd.shift);
+
+	raw_write_seqcount_begin(&cd.seq);
+	read_sched_clock = read;
+	sched_clock_mask = new_mask;
+	cd.rate = rate;
+	cd.wrap_kt = new_wrap_kt;
+	cd.mult = new_mult;
+	cd.shift = new_shift;
+	cd.epoch_cyc = new_epoch;
+	cd.epoch_ns = ns;
+	raw_write_seqcount_end(&cd.seq);
 
 	r = rate;
 	if (r >= 4000000) {
@@ -141,22 +163,12 @@ void __init sched_clock_register(u64 (*read)(void), int bits,
 	} else
 		r_unit = ' ';
 
-	/* calculate how many ns until we wrap */
-	wrap = clocks_calc_max_nsecs(cd.mult, cd.shift, 0, sched_clock_mask);
-	cd.wrap_kt = ns_to_ktime(wrap - (wrap >> 3));
-
 	/* calculate the ns resolution of this counter */
-	res = cyc_to_ns(1ULL, cd.mult, cd.shift);
+	res = cyc_to_ns(1ULL, new_mult, new_shift);
+
 	pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
 		bits, r, r_unit, res, wrap);
 
-	update_sched_clock();
-
-	/*
-	 * Ensure that sched_clock() starts off at 0ns
-	 */
-	cd.epoch_ns = 0;
-
 	/* Enable IRQ time accounting if we have a fast enough sched_clock */
 	if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
 		enable_sched_clock_irqtime();

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 15+ messages in thread
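The heart of the v2 change is the epoch hand-off: read both counters, fold the old counter's elapsed nanoseconds into epoch_ns, then switch every field over at once. A simplified model of that hand-off (illustrative struct and values, not the kernel's clock_data; the seqcount bracketing is reduced to a comment):

```c
#include <assert.h>
#include <stdint.h>

struct clock_data {
	uint64_t epoch_ns;	/* ns accumulated up to epoch_cyc */
	uint64_t epoch_cyc;	/* counter reading at the epoch */
	uint64_t mask;
	uint32_t mult;
	uint32_t shift;
};

static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
	return (cyc * mult) >> shift;
}

static uint64_t sched_clock_ns(const struct clock_data *cd, uint64_t cyc)
{
	return cd->epoch_ns +
	       cyc_to_ns((cyc - cd->epoch_cyc) & cd->mask,
			 cd->mult, cd->shift);
}

/* v2-style re-registration: fold the old counter's elapsed ns into
 * epoch_ns, then switch all fields over. (In the kernel, this block
 * sits between raw_write_seqcount_begin/end on cd.seq.) */
static void reregister(struct clock_data *cd, uint64_t old_cyc_now,
		       uint64_t new_epoch, uint64_t new_mask,
		       uint32_t new_mult, uint32_t new_shift)
{
	uint64_t ns = sched_clock_ns(cd, old_cyc_now);

	cd->mask = new_mask;
	cd->mult = new_mult;
	cd->shift = new_shift;
	cd->epoch_cyc = new_epoch;
	cd->epoch_ns = ns;	/* v1 effectively restarted from 0 here */
}

/* Switch from a 2 ns/tick counter to a 4 ns/tick one after 500 ticks
 * (1000 ns): the clock keeps counting up instead of resetting. */
static uint64_t demo(void)
{
	struct clock_data cd = {
		.mask = 0xffffffffULL, .mult = 2, .shift = 0,
	};

	reregister(&cd, 500, 12345, (1ULL << 56) - 1, 4, 0);
	return sched_clock_ns(&cd, 12345 + 10);
}
```

Ten ticks after the switch the model reads 1040 ns (1000 ns carried over plus 10 x 4 ns on the new counter), illustrating why the v2 clock no longer snaps back to zero on registration.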
* [PATCH v2] sched_clock: Prevent callers from seeing half-updated data 2014-02-07 22:28 ` [PATCH v2] " Stephen Boyd @ 2014-02-10 11:14 ` Will Deacon 2014-02-17 11:19 ` Will Deacon 0 siblings, 1 reply; 15+ messages in thread From: Will Deacon @ 2014-02-10 11:14 UTC (permalink / raw) To: linux-arm-kernel On Fri, Feb 07, 2014 at 10:28:58PM +0000, Stephen Boyd wrote: > If two sched_clock sources are registered we may end up in a > situation where a call to sched_clock() may be accessing the > epoch cycle count for the old counter and the cycle count for the > new counter. This can lead to confusing results where > sched_clock() values jump and then are reset to 0 (due to the way > the registration function forces the epoch_ns to be 0). Fix this > by reorganizing the registration function to hold the seqlock for > as short a time as possible while we update the clock_data > structure for a new counter. We also put any accumulated time > into epoch_ns instead of resetting the time to 0 so that the clock > doesn't reset after each successful registration. > > Reported-by: Will Deacon <will.deacon@arm.com> > Cc: Josh Cartwright <joshc@codeaurora.org> > Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> > --- > > Changes since v1: > * Put elapsed time into epoch_ns Well, this certainly fixes the dmesg prints and the system seems ok once it's booted: Tested-by: Will Deacon <will.deacon@arm.com> Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2] sched_clock: Prevent callers from seeing half-updated data
  2014-02-10 11:14 ` Will Deacon
@ 2014-02-17 11:19   ` Will Deacon
  2014-02-17 18:04     ` John Stultz
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2014-02-17 11:19 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, Feb 10, 2014 at 11:14:35AM +0000, Will Deacon wrote:
> On Fri, Feb 07, 2014 at 10:28:58PM +0000, Stephen Boyd wrote:
> > If two sched_clock sources are registered we may end up in a
> > situation where a call to sched_clock() may be accessing the
> > epoch cycle count for the old counter and the cycle count for the
> > new counter. This can lead to confusing results where
> > sched_clock() values jump and then are reset to 0 (due to the way
> > the registration function forces the epoch_ns to be 0). Fix this
> > by reorganizing the registration function to hold the seqlock for
> > as short a time as possible while we update the clock_data
> > structure for a new counter. We also put any accumulated time
> > into epoch_ns instead of resetting the time to 0 so that the clock
> > doesn't reset after each successful registration.
> >
> > Reported-by: Will Deacon <will.deacon@arm.com>
> > Cc: Josh Cartwright <joshc@codeaurora.org>
> > Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> > ---
> >
> > Changes since v1:
> >  * Put elapsed time into epoch_ns
>
> Well, this certainly fixes the dmesg prints and the system seems ok once
> it's booted:
>
> Tested-by: Will Deacon <will.deacon@arm.com>

Any chance of getting this merged please? I still see the weird timing
numbers when booting -rc3.

Cheers,

Will

^ permalink raw reply	[flat|nested] 15+ messages in thread
* [PATCH v2] sched_clock: Prevent callers from seeing half-updated data
  2014-02-17 11:19 ` Will Deacon
@ 2014-02-17 18:04   ` John Stultz
  0 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2014-02-17 18:04 UTC (permalink / raw)
To: linux-arm-kernel

On 02/17/2014 03:19 AM, Will Deacon wrote:
> On Mon, Feb 10, 2014 at 11:14:35AM +0000, Will Deacon wrote:
>> On Fri, Feb 07, 2014 at 10:28:58PM +0000, Stephen Boyd wrote:
>>> If two sched_clock sources are registered we may end up in a
>>> situation where a call to sched_clock() may be accessing the
>>> epoch cycle count for the old counter and the cycle count for the
>>> new counter. This can lead to confusing results where
>>> sched_clock() values jump and then are reset to 0 (due to the way
>>> the registration function forces the epoch_ns to be 0). Fix this
>>> by reorganizing the registration function to hold the seqlock for
>>> as short a time as possible while we update the clock_data
>>> structure for a new counter. We also put any accumulated time
>>> into epoch_ns instead of resetting the time to 0 so that the clock
>>> doesn't reset after each successful registration.
>>>
>>> Reported-by: Will Deacon <will.deacon@arm.com>
>>> Cc: Josh Cartwright <joshc@codeaurora.org>
>>> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
>>> ---
>>>
>>> Changes since v1:
>>>  * Put elapsed time into epoch_ns
>> Well, this certainly fixes the dmesg prints and the system seems ok once
>> it's booted:
>>
>> Tested-by: Will Deacon <will.deacon@arm.com>
> Any chance of getting this merged please? I still see the weird timing
> numbers when booting -rc3.

My apologies! I somehow missed your tested-by mail. I'll queue it up and
send it in.

thanks
-john

^ permalink raw reply	[flat|nested] 15+ messages in thread
end of thread, other threads:[~2014-02-17 18:13 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-04 18:36 Weird sched_clock behaviour during boot with -rc1 Will Deacon
2014-02-04 20:46 ` John Stultz
2014-02-04 22:00   ` Stephen Boyd
2014-02-05 21:47     ` Josh Cartwright
2014-02-07 18:23       ` John Stultz
2014-02-07 19:37         ` Stephen Boyd
2014-02-07 20:48           ` [PATCH] sched_clock: Prevent callers from seeing half-updated data Stephen Boyd
2014-02-07 22:22             ` Stephen Boyd
2014-02-07 22:28             ` John Stultz
2014-02-11  6:49               ` Stephen Boyd
2014-02-17 18:13                 ` John Stultz
2014-02-07 22:28             ` [PATCH v2] " Stephen Boyd
2014-02-10 11:14               ` Will Deacon
2014-02-17 11:19                 ` Will Deacon
2014-02-17 18:04                   ` John Stultz