* [Intel-wired-lan] ice driver bug with PTP vclocks
@ 2022-11-09 15:20 Miroslav Lichvar
2022-11-09 20:08 ` Keller, Jacob E
0 siblings, 1 reply; 3+ messages in thread
From: Miroslav Lichvar @ 2022-11-09 15:20 UTC (permalink / raw)
To: intel-wired-lan; +Cc: Richard Cochran
It seems the locking of the PTP clock in the ice driver conflicts with
spinlocks used by PTP virtual clocks protecting their timecounter and
cyclecounter. I get the following report when running ptp4l+phc2sys
pairs on multiple ports of an E810 with vclocks enabled:
BUG: scheduling while atomic: ptp7/3859/0x00000002
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 3859 Comm: ptp7 Tainted: G W 6.0.6-300.fc37.x86_64 #1
Call Trace:
<TASK>
dump_stack_lvl+0x44/0x5c
__schedule_bug.cold+0x81/0x8e
__schedule+0xe82/0x12b0
? get_nohz_timer_target+0x18/0x1a0
? timerqueue_add+0x62/0xb0
? enqueue_hrtimer+0x2f/0x80
schedule+0x5d/0xe0
schedule_hrtimeout_range_clock+0xb5/0x100
? __hrtimer_init+0xe0/0xe0
usleep_range_state+0x50/0x70
ice_ptp_lock+0x39/0x60 [ice]
ice_ptp_gettimex64+0x31/0x70 [ice]
? ptp_clock_release+0x50/0x50
? kthread_stop+0x170/0x170
ptp_vclock_read+0x37/0x90
timecounter_read+0x14/0x60
ptp_vclock_refresh+0x2a/0x50
ptp_aux_kworker+0x1c/0x40
kthread_worker_fn+0xaa/0x250
kthread+0xe9/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
I tried to fix this in the vclock code by moving the PHC read outside
of the spinlock, but it turns out that's not a complete fix and it
breaks the update of the timecounter's cycle_last value. I suspect I'd
need to modify the cyclecounter to use a cached value which would need
to be read ahead of the timecounter call.
Any chance this could be addressed in the ice driver? Any suggestions?
--
Miroslav Lichvar
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Intel-wired-lan] ice driver bug with PTP vclocks
2022-11-09 15:20 [Intel-wired-lan] ice driver bug with PTP vclocks Miroslav Lichvar
@ 2022-11-09 20:08 ` Keller, Jacob E
2023-02-02 20:43 ` Jacob Keller
0 siblings, 1 reply; 3+ messages in thread
From: Keller, Jacob E @ 2022-11-09 20:08 UTC (permalink / raw)
To: Miroslav Lichvar, intel-wired-lan@lists.osuosl.org,
Kolacinski, Karol
Cc: Richard Cochran
> -----Original Message-----
> From: Miroslav Lichvar <mlichvar@redhat.com>
> Sent: Wednesday, November 9, 2022 7:20 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Richard Cochran
> <richardcochran@gmail.com>
> Subject: ice driver bug with PTP vclocks
>
> It seems the locking of the PTP clock in the ice driver conflicts with
> spinlocks used by PTP virtual clocks protecting their timecounter and
> cyclecounter. I get the following report when running ptp4l+phc2sys
> pairs on multiple ports of an E810 with vclocks enabled:
>
> BUG: scheduling while atomic: ptp7/3859/0x00000002
> Preemption disabled at:
> [<0000000000000000>] 0x0
> CPU: 1 PID: 3859 Comm: ptp7 Tainted: G W 6.0.6-300.fc37.x86_64 #1
> Call Trace:
> <TASK>
> dump_stack_lvl+0x44/0x5c
> __schedule_bug.cold+0x81/0x8e
> __schedule+0xe82/0x12b0
> ? get_nohz_timer_target+0x18/0x1a0
> ? timerqueue_add+0x62/0xb0
> ? enqueue_hrtimer+0x2f/0x80
> schedule+0x5d/0xe0
> schedule_hrtimeout_range_clock+0xb5/0x100
> ? __hrtimer_init+0xe0/0xe0
> usleep_range_state+0x50/0x70
> ice_ptp_lock+0x39/0x60 [ice]
> ice_ptp_gettimex64+0x31/0x70 [ice]
> ? ptp_clock_release+0x50/0x50
> ? kthread_stop+0x170/0x170
> ptp_vclock_read+0x37/0x90
> timecounter_read+0x14/0x60
> ptp_vclock_refresh+0x2a/0x50
> ptp_aux_kworker+0x1c/0x40
> kthread_worker_fn+0xaa/0x250
> kthread+0xe9/0x110
> ? kthread_complete_and_exit+0x20/0x20
> ret_from_fork+0x22/0x30
> </TASK>
>
> I tried to fix this in the vclock code by moving the PHC read outside
> of the spinlock, but it turns out that's not a complete fix and it
> breaks the update of the timecounter's cycle_last value. I suspect I'd
> need to modify the cyclecounter to use a cached value which would need
> to be read ahead of the timecounter call.
>
> Any chance this could be addressed in the ice driver? Any suggestions?
>
I think we can look at how to fix this. I suspect that we are using udelay instead of usleep. Alternatively, I think Karol recently posted a patch that drops the PTP semaphore around the read accesses, which might fix this as well.
See https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/
> --
> Miroslav Lichvar
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Intel-wired-lan] ice driver bug with PTP vclocks
2022-11-09 20:08 ` Keller, Jacob E
@ 2023-02-02 20:43 ` Jacob Keller
0 siblings, 0 replies; 3+ messages in thread
From: Jacob Keller @ 2023-02-02 20:43 UTC (permalink / raw)
To: intel-wired-lan
On 11/9/2022 12:08 PM, Keller, Jacob E wrote:
>
>
>> -----Original Message-----
>> From: Miroslav Lichvar <mlichvar@redhat.com>
>> Sent: Wednesday, November 9, 2022 7:20 AM
>> To: intel-wired-lan@lists.osuosl.org
>> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; Richard Cochran
>> <richardcochran@gmail.com>
>> Subject: ice driver bug with PTP vclocks
>>
>> It seems the locking of the PTP clock in the ice driver conflicts with
>> spinlocks used by PTP virtual clocks protecting their timecounter and
>> cyclecounter. I get the following report when running ptp4l+phc2sys
>> pairs on multiple ports of an E810 with vclocks enabled:
>>
>> BUG: scheduling while atomic: ptp7/3859/0x00000002
>> Preemption disabled at:
>> [<0000000000000000>] 0x0
>> CPU: 1 PID: 3859 Comm: ptp7 Tainted: G W 6.0.6-300.fc37.x86_64 #1
>> Call Trace:
>> <TASK>
>> dump_stack_lvl+0x44/0x5c
>> __schedule_bug.cold+0x81/0x8e
>> __schedule+0xe82/0x12b0
>> ? get_nohz_timer_target+0x18/0x1a0
>> ? timerqueue_add+0x62/0xb0
>> ? enqueue_hrtimer+0x2f/0x80
>> schedule+0x5d/0xe0
>> schedule_hrtimeout_range_clock+0xb5/0x100
>> ? __hrtimer_init+0xe0/0xe0
>> usleep_range_state+0x50/0x70
>> ice_ptp_lock+0x39/0x60 [ice]
>> ice_ptp_gettimex64+0x31/0x70 [ice]
>> ? ptp_clock_release+0x50/0x50
>> ? kthread_stop+0x170/0x170
>> ptp_vclock_read+0x37/0x90
>> timecounter_read+0x14/0x60
>> ptp_vclock_refresh+0x2a/0x50
>> ptp_aux_kworker+0x1c/0x40
>> kthread_worker_fn+0xaa/0x250
>> kthread+0xe9/0x110
>> ? kthread_complete_and_exit+0x20/0x20
>> ret_from_fork+0x22/0x30
>> </TASK>
>>
>> I tried to fix this in the vclock code by moving the PHC read outside
>> of the spinlock, but it turns out that's not a complete fix and it
>> breaks the update of the timecounter's cycle_last value. I suspect I'd
>> need to modify the cyclecounter to use a cached value which would need
>> to be read ahead of the timecounter call.
>>
>> Any chance this could be addressed in the ice driver? Any suggestions?
>>
>
> I think we can look at how to fix this. I suspect that we are using udelay instead of usleep. Alternatively, I think Karol recently posted a patch that drops the PTP semaphore around the read accesses, which might fix this as well.
>
> See https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/
>
Richard didn't like dropping the semaphore. I don't like the thought of
having to use udelay either.
We discussed why we can't use a normal lock at
https://lore.kernel.org/netdev/Y31O6zWRjaqttANO@hoboy.vegasvil.org/ on
netdev and
https://lore.kernel.org/intel-wired-lan/877d0yt0ns.fsf@intel.com/ on
intel-wired-lan.
I'm not sure what the right approach here is... Sorry this slipped
through the cracks.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2023-02-02 20:43 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-09 15:20 [Intel-wired-lan] ice driver bug with PTP vclocks Miroslav Lichvar
2022-11-09 20:08 ` Keller, Jacob E
2023-02-02 20:43 ` Jacob Keller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.