All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-wired-lan] ice driver bug with PTP vclocks
@ 2022-11-09 15:20 Miroslav Lichvar
  2022-11-09 20:08 ` Keller, Jacob E
  0 siblings, 1 reply; 3+ messages in thread
From: Miroslav Lichvar @ 2022-11-09 15:20 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Richard Cochran

It seems the locking of the PTP clock in the ice driver conflicts with
spinlocks used by PTP virtual clocks protecting their timecounter and
cyclecounter. I get the following report when running ptp4l+phc2sys
pairs on multiple ports of an E810 with vclocks enabled:

BUG: scheduling while atomic: ptp7/3859/0x00000002
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 3859 Comm: ptp7 Tainted: G        W          6.0.6-300.fc37.x86_64 #1
Call Trace:
 <TASK>
 dump_stack_lvl+0x44/0x5c
 __schedule_bug.cold+0x81/0x8e
 __schedule+0xe82/0x12b0
 ? get_nohz_timer_target+0x18/0x1a0
 ? timerqueue_add+0x62/0xb0
 ? enqueue_hrtimer+0x2f/0x80
 schedule+0x5d/0xe0
 schedule_hrtimeout_range_clock+0xb5/0x100
 ? __hrtimer_init+0xe0/0xe0
 usleep_range_state+0x50/0x70
 ice_ptp_lock+0x39/0x60 [ice]
 ice_ptp_gettimex64+0x31/0x70 [ice]
 ? ptp_clock_release+0x50/0x50
 ? kthread_stop+0x170/0x170
 ptp_vclock_read+0x37/0x90
 timecounter_read+0x14/0x60
 ptp_vclock_refresh+0x2a/0x50
 ptp_aux_kworker+0x1c/0x40
 kthread_worker_fn+0xaa/0x250
 kthread+0xe9/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

I tried to fix this in the vclock code by moving the PHC read outside
of the spinlock, but it turns out that's not a complete fix and it
breaks the update of the timecounter's cycle_last value. I suspect I'd
need to modify the cyclecounter to use a cached value which would need
to be read ahead of the timecounter call.

Any chance this could be addressed in the ice driver? Any suggestions?

-- 
Miroslav Lichvar

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Intel-wired-lan] ice driver bug with PTP vclocks
  2022-11-09 15:20 [Intel-wired-lan] ice driver bug with PTP vclocks Miroslav Lichvar
@ 2022-11-09 20:08 ` Keller, Jacob E
  2023-02-02 20:43   ` Jacob Keller
  0 siblings, 1 reply; 3+ messages in thread
From: Keller, Jacob E @ 2022-11-09 20:08 UTC (permalink / raw)
  To: Miroslav Lichvar, intel-wired-lan@lists.osuosl.org,
	Kolacinski, Karol
  Cc: Richard Cochran



> -----Original Message-----
> From: Miroslav Lichvar <mlichvar@redhat.com>
> Sent: Wednesday, November 9, 2022 7:20 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Richard Cochran
> <richardcochran@gmail.com>
> Subject: ice driver bug with PTP vclocks
> 
> It seems the locking of the PTP clock in the ice driver conflicts with
> spinlocks used by PTP virtual clocks protecting their timecounter and
> cyclecounter. I get the following report when running ptp4l+phc2sys
> pairs on multiple ports of an E810 with vclocks enabled:
> 
> BUG: scheduling while atomic: ptp7/3859/0x00000002
> Preemption disabled at:
> [<0000000000000000>] 0x0
> CPU: 1 PID: 3859 Comm: ptp7 Tainted: G        W          6.0.6-300.fc37.x86_64 #1
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x44/0x5c
>  __schedule_bug.cold+0x81/0x8e
>  __schedule+0xe82/0x12b0
>  ? get_nohz_timer_target+0x18/0x1a0
>  ? timerqueue_add+0x62/0xb0
>  ? enqueue_hrtimer+0x2f/0x80
>  schedule+0x5d/0xe0
>  schedule_hrtimeout_range_clock+0xb5/0x100
>  ? __hrtimer_init+0xe0/0xe0
>  usleep_range_state+0x50/0x70
>  ice_ptp_lock+0x39/0x60 [ice]
>  ice_ptp_gettimex64+0x31/0x70 [ice]
>  ? ptp_clock_release+0x50/0x50
>  ? kthread_stop+0x170/0x170
>  ptp_vclock_read+0x37/0x90
>  timecounter_read+0x14/0x60
>  ptp_vclock_refresh+0x2a/0x50
>  ptp_aux_kworker+0x1c/0x40
>  kthread_worker_fn+0xaa/0x250
>  kthread+0xe9/0x110
>  ? kthread_complete_and_exit+0x20/0x20
>  ret_from_fork+0x22/0x30
>  </TASK>
> 
> I tried to fix this in the vclock code by moving the PHC read outside
> of the spinlock, but it turns out that's not a complete fix and it
> breaks the update of the timecounter's cycle_last value. I suspect I'd
> need to modify the cyclecounter to use a cached value which would need
> to be read ahead of the timecounter call.
> 
> Any chance this could be addressed in the ice driver? Any suggestions?
> 

I think we can look at how to fix this. I suspect that we are using udelay instead of usleep. Alternatively, I think Karol recently posted a patch that drops the PTP semaphore around the read accesses, which might fix this as well.

See https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/

> --
> Miroslav Lichvar

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Intel-wired-lan] ice driver bug with PTP vclocks
  2022-11-09 20:08 ` Keller, Jacob E
@ 2023-02-02 20:43   ` Jacob Keller
  0 siblings, 0 replies; 3+ messages in thread
From: Jacob Keller @ 2023-02-02 20:43 UTC (permalink / raw)
  To: intel-wired-lan



On 11/9/2022 12:08 PM, Keller, Jacob E wrote:
> 
> 
>> -----Original Message-----
>> From: Miroslav Lichvar <mlichvar@redhat.com>
>> Sent: Wednesday, November 9, 2022 7:20 AM
>> To: intel-wired-lan@lists.osuosl.org
>> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Richard Cochran
>> <richardcochran@gmail.com>
>> Subject: ice driver bug with PTP vclocks
>>
>> It seems the locking of the PTP clock in the ice driver conflicts with
>> spinlocks used by PTP virtual clocks protecting their timecounter and
>> cyclecounter. I get the following report when running ptp4l+phc2sys
>> pairs on multiple ports of an E810 with vclocks enabled:
>>
>> BUG: scheduling while atomic: ptp7/3859/0x00000002
>> Preemption disabled at:
>> [<0000000000000000>] 0x0
>> CPU: 1 PID: 3859 Comm: ptp7 Tainted: G        W          6.0.6-300.fc37.x86_64 #1
>> Call Trace:
>>  <TASK>
>>  dump_stack_lvl+0x44/0x5c
>>  __schedule_bug.cold+0x81/0x8e
>>  __schedule+0xe82/0x12b0
>>  ? get_nohz_timer_target+0x18/0x1a0
>>  ? timerqueue_add+0x62/0xb0
>>  ? enqueue_hrtimer+0x2f/0x80
>>  schedule+0x5d/0xe0
>>  schedule_hrtimeout_range_clock+0xb5/0x100
>>  ? __hrtimer_init+0xe0/0xe0
>>  usleep_range_state+0x50/0x70
>>  ice_ptp_lock+0x39/0x60 [ice]
>>  ice_ptp_gettimex64+0x31/0x70 [ice]
>>  ? ptp_clock_release+0x50/0x50
>>  ? kthread_stop+0x170/0x170
>>  ptp_vclock_read+0x37/0x90
>>  timecounter_read+0x14/0x60
>>  ptp_vclock_refresh+0x2a/0x50
>>  ptp_aux_kworker+0x1c/0x40
>>  kthread_worker_fn+0xaa/0x250
>>  kthread+0xe9/0x110
>>  ? kthread_complete_and_exit+0x20/0x20
>>  ret_from_fork+0x22/0x30
>>  </TASK>
>>
>> I tried to fix this in the vclock code by moving the PHC read outside
>> of the spinlock, but it turns out that's not a complete fix and it
>> breaks the update of the timecounter's cycle_last value. I suspect I'd
>> need to modify the cyclecounter to use a cached value which would need
>> to be read ahead of the timecounter call.
>>
>> Any chance this could be addressed in the ice driver? Any suggestions?
>>
> 
> I think we can look at how to fix this. I suspect that we are using udelay instead of usleep. Alternatively, I think Karol recently posted a patch that drops the PTP semaphore around the read accesses, which might fix this as well.
> 
> See https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/
> 

Richard didn't like dropping the semaphore. I don't like the thought of
having to use udelay either.

We discussed why we can't use a normal lock at
https://lore.kernel.org/netdev/Y31O6zWRjaqttANO@hoboy.vegasvil.org/ on
netdev and
https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/ on
intel-wired-lan.

I'm not sure what the right approach here is... Sorry this slipped
through the cracks.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-02-02 20:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-09 15:20 [Intel-wired-lan] ice driver bug with PTP vclocks Miroslav Lichvar
2022-11-09 20:08 ` Keller, Jacob E
2023-02-02 20:43   ` Jacob Keller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.