public inbox for netdev@vger.kernel.org
* Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
@ 2023-08-22 21:49 Brian Hutchinson
  2023-08-23  8:22 ` Christian Eggers
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Hutchinson @ 2023-08-22 21:49 UTC (permalink / raw)
  To: netdev; +Cc: Christian Eggers, Vladimir Oltean, arun.ramadoss

Getting this tx_timestamp_timeout error over and over when I try to run ptp4l:

ptp4l[1366.143]: selected best master clock 001747.fffe.70151b
ptp4l[1366.143]: updating UTC offset to 37
ptp4l[1366.143]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[1366.860]: port 1: delay timeout
ptp4l[1376.871]: timed out while polling for tx timestamp
ptp4l[1376.871]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[1376.871]: port 1: send delay request failed

I was using 5.10.69 with Christian's patches before they were mainlined
and had everything working with the help of Christian, Vladimir and
others.

Now I need to update the kernel, so I tried 6.3.12, which contains
Christian's upstream patches. I also backported v8 of the upstreamed
patches to 6.1.38, and I get the same result with that kernel too.

I'm using ptp4l 2.0.1 in a Yocto Dunfell release (3.1.24) on an i.MX8MM platform.

I tried both lan1 and bond1, the failover bond I created from the lan1
and lan2 ports, and got the same result.

I tried increasing tx_timestamp_timeout and it doesn't appear to matter.
I feel like I had this problem before, when I first started working with
5.10.69, but I can't remember whether another patch resolved it.  With
5.10.69 I've got quite a few more patches than just the 13 that were
mainlined in 6.3.  Looking through old emails, I want to say it might have
been resolved with
net-dsa-ksz9477-avoid-PTP-races-with-the-data-path-l.patch, which Vladimir
gave me, but looking at the code, mainline doesn't appear to have that one.
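
When a changed setting seems to have no effect, one quick check is whether the running ptp4l actually picked up the value from the config file. In the material posted in this message, the debug log reports tx_timestamp_timeout as 3000 while the config file sets 10000. The following is a minimal sketch of that comparison; the helper names `logged_global_value` and `configured_value` are hypothetical, and the parsing assumes the "config item (null).KEY is VALUE" debug format shown in the log output of this message.

```python
import re


def logged_global_value(log_text, key):
    """Return the first value ptp4l logged for a global ("(null)") config item."""
    pattern = re.compile(r"config item \(null\)\." + re.escape(key) + r" is (\S+)")
    match = pattern.search(log_text)
    return match.group(1) if match else None


def configured_value(config_text, key):
    """Return the value of an uncommented `key` line in a ptp4l config file.

    If the key appears more than once, the last occurrence is returned here;
    which occurrence ptp4l itself honours is not checked by this sketch.
    """
    value = None
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == key:
            value = parts[1].strip()
    return value


# Excerpts taken from the log and config quoted in this message.
sample_log = """\
ptp4l[687.807]: config item (null).check_fup_sync is 0
ptp4l[687.807]: config item (null).tx_timestamp_timeout is 3000
"""
sample_cfg = """\
[global]
#tx_timestamp_timeout   1000
tx_timestamp_timeout    10000
"""

key = "tx_timestamp_timeout"
print("logged:    ", logged_global_value(sample_log, key))
print("configured:", configured_value(sample_cfg, key))
```

If the two values differ, the file passed with -f may not be the one that was edited, or the log and the config may come from different runs.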

As best as possible, I've checked kernel .config between this version
of kernel and what I was using before with 5.10.69 and I can't find a
smoking gun to figure out why this is happening.
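
A manual comparison of two kernel .config files is easy to get wrong, so a small script can help. The sketch below is one possible approach, not the method used here: `parse_kconfig` and `diff_kconfig` are hypothetical helper names, and the two sample configs are illustrative values, not taken from the boards discussed in this thread. The kernel tree also ships scripts/diffconfig for the same purpose.

```python
def parse_kconfig(text):
    """Parse kernel .config text into {option: value}; unset options map to "n"."""
    opts = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = "n"
    return opts


def diff_kconfig(old, new, prefixes=()):
    """Return {option: (old_value, new_value)} for options that differ.

    An option missing from one side is reported with None on that side.
    `prefixes` optionally restricts the comparison, e.g. ("CONFIG_NET_DSA",).
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        if prefixes and not name.startswith(tuple(prefixes)):
            continue
        a, b = old.get(name), new.get(name)
        if a != b:
            changes[name] = (a, b)
    return changes


# Illustrative input only; not the configs of the kernels discussed here.
old_cfg = parse_kconfig(
    "CONFIG_PTP_1588_CLOCK=y\n"
    "# CONFIG_NET_DSA_MICROCHIP_KSZ_PTP is not set\n"
)
new_cfg = parse_kconfig(
    "CONFIG_PTP_1588_CLOCK=y\n"
    "CONFIG_NET_DSA_MICROCHIP_KSZ_PTP=y\n"
)

for name, (a, b) in diff_kconfig(old_cfg, new_cfg).items():
    print(f"{name}: {a} -> {b}")
```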

Here's the output I get:

root@localhost: ptp4l -f my_ptp4l..conf -s -i bond1 -m -q -l 7
ptp4l[687.807]: config item (null).assume_two_step is 0
ptp4l[687.807]: config item (null).check_fup_sync is 0
ptp4l[687.807]: config item (null).tx_timestamp_timeout is 3000
ptp4l[687.807]: config item (null).clock_servo is 0
ptp4l[687.807]: config item (null).clock_type is 32768
ptp4l[687.807]: config item (null).clock_servo is 0
ptp4l[687.807]: config item (null).clockClass is 248
ptp4l[687.807]: config item (null).clockAccuracy is 254
ptp4l[687.807]: config item (null).offsetScaledLogVariance is 65535
ptp4l[687.807]: config item (null).productDescription is ';;'
ptp4l[687.807]: config item (null).revisionData is ';;'
ptp4l[687.807]: config item (null).userDescription is ';'
ptp4l[687.807]: config item (null).manufacturerIdentity is '00:00:00'
ptp4l[687.807]: config item (null).domainNumber is 44
ptp4l[687.807]: config item (null).slaveOnly is 1
ptp4l[687.808]: config item (null).gmCapable is 1
ptp4l[687.808]: config item (null).gmCapable is 1
ptp4l[687.808]: config item (null).G.8275.defaultDS.localPriority is 128
ptp4l[687.808]: config item (null).time_stamping is 4
ptp4l[687.808]: config item (null).twoStepFlag is 0
ptp4l[687.808]: config item (null).twoStepFlag is 0
ptp4l[687.808]: config item (null).time_stamping is 4
ptp4l[687.808]: config item (null).priority1 is 128
ptp4l[687.808]: config item (null).priority2 is 128
ptp4l[687.808]: interface index 6 is up
ptp4l[687.808]: config item (null).free_running is 0
ptp4l[687.808]: selected /dev/ptp1 as PTP clock
ptp4l[687.808]: config item (null).uds_address is '/var/run/ptp4l'
ptp4l[687.808]: section item /var/run/ptp4l.announceReceiptTimeout now 0
ptp4l[687.808]: section item /var/run/ptp4l.delay_mechanism now 0
ptp4l[687.808]: section item /var/run/ptp4l.network_transport now 0
ptp4l[687.808]: section item /var/run/ptp4l.delay_filter_length now 1
ptp4l[687.808]: config item (null).free_running is 0
ptp4l[687.808]: config item (null).freq_est_interval is 1
ptp4l[687.808]: config item (null).gmCapable is 1
ptp4l[687.808]: config item (null).kernel_leap is 1
ptp4l[687.808]: config item (null).utc_offset is 37
ptp4l[687.808]: config item (null).timeSource is 160
ptp4l[687.809]: config item (null).pi_proportional_const is 0.200000
ptp4l[687.809]: config item (null).pi_integral_const is 0.037500
ptp4l[687.809]: config item (null).pi_proportional_scale is 0.000000
ptp4l[687.809]: config item (null).pi_proportional_exponent is -0.300000
ptp4l[687.809]: config item (null).pi_proportional_norm_max is 0.700000
ptp4l[687.809]: config item (null).pi_integral_scale is 0.000000
ptp4l[687.809]: config item (null).pi_integral_exponent is 0.400000
ptp4l[687.809]: config item (null).pi_integral_norm_max is 0.300000
ptp4l[687.809]: config item (null).step_threshold is 0.000000
ptp4l[687.809]: config item (null).first_step_threshold is 0.000020
ptp4l[687.809]: config item (null).max_frequency is 900000000
ptp4l[687.809]: config item (null).dataset_comparison is 1
ptp4l[687.809]: config item (null).tsproc_mode is 0
ptp4l[687.809]: config item (null).delay_filter is 1
ptp4l[687.809]: config item (null).delay_filter_length is 5
ptp4l[687.809]: config item (null).initial_delay is 0
ptp4l[687.809]: config item (null).summary_interval is 4
ptp4l[687.809]: config item (null).sanity_freq_limit is 200000000
ptp4l[687.809]: PI servo: sync interval 1.000 kp 0.200 ki 0.037500
ptp4l[687.809]: config item /var/run/ptp4l.boundary_clock_jbod is 0
ptp4l[687.809]: config item /var/run/ptp4l.network_transport is 0
ptp4l[687.809]: config item /var/run/ptp4l.delayAsymmetry is 0
ptp4l[687.809]: config item /var/run/ptp4l.follow_up_info is 0
ptp4l[687.809]: config item /var/run/ptp4l.freq_est_interval is 1
ptp4l[687.809]: config item /var/run/ptp4l.net_sync_monitor is 0
ptp4l[687.809]: config item /var/run/ptp4l.path_trace_enabled is 0
ptp4l[687.809]: config item /var/run/ptp4l.tc_spanning_tree is 0
ptp4l[687.809]: config item /var/run/ptp4l.ingressLatency is 0
ptp4l[687.809]: config item /var/run/ptp4l.egressLatency is 0
ptp4l[687.809]: config item /var/run/ptp4l.delay_mechanism is 0
ptp4l[688.184]: config item /var/run/ptp4l.hybrid_e2e is 1
ptp4l[688.184]: port 0: hybrid_e2e only works with E2E
ptp4l[688.184]: config item /var/run/ptp4l.fault_badpeernet_interval is 16
ptp4l[688.184]: config item /var/run/ptp4l.fault_reset_interval is -128
ptp4l[688.185]: config item /var/run/ptp4l.tsproc_mode is 0
ptp4l[688.185]: config item /var/run/ptp4l.delay_filter is 1
ptp4l[688.185]: config item /var/run/ptp4l.delay_filter_length is 1
ptp4l[688.185]: config item bond1.boundary_clock_jbod is 0
ptp4l[688.185]: config item bond1.network_transport is 1
ptp4l[688.185]: config item bond1.delayAsymmetry is 0
ptp4l[688.185]: config item bond1.follow_up_info is 0
ptp4l[688.185]: config item bond1.freq_est_interval is 1
ptp4l[688.185]: config item bond1.net_sync_monitor is 0
ptp4l[688.185]: config item bond1.path_trace_enabled is 0
ptp4l[688.185]: config item bond1.tc_spanning_tree is 0
ptp4l[688.185]: config item bond1.ingressLatency is 0
ptp4l[688.185]: config item bond1.egressLatency is 0
ptp4l[688.185]: config item bond1.delay_mechanism is 1
ptp4l[688.185]: config item bond1.unicast_master_table is 1
ptp4l[688.185]: config item bond1.unicast_req_duration is 300
ptp4l[688.185]: section item bond1.hybrid_e2e now 1
ptp4l[688.185]: config item bond1.unicast_listen is 1
ptp4l[688.185]: section item bond1.hybrid_e2e now 1
ptp4l[688.185]: config item bond1.inhibit_multicast_service is 1
ptp4l[688.185]: config item bond1.hybrid_e2e is 1
ptp4l[688.185]: config item bond1.fault_badpeernet_interval is 16
ptp4l[688.185]: config item bond1.fault_reset_interval is -128
ptp4l[688.185]: config item bond1.tsproc_mode is 0
ptp4l[688.185]: config item bond1.delay_filter is 1
ptp4l[688.185]: config item bond1.delay_filter_length is 5
ptp4l[688.185]: config item bond1.logMinDelayReqInterval is 0
ptp4l[688.185]: config item bond1.logAnnounceInterval is 1
ptp4l[688.185]: config item bond1.announceReceiptTimeout is 3
ptp4l[688.185]: config item bond1.syncReceiptTimeout is 0
ptp4l[688.185]: config item bond1.transportSpecific is 0
ptp4l[688.185]: config item bond1.ignore_transport_specific is 0
ptp4l[688.185]: config item bond1.masterOnly is 0
ptp4l[688.185]: config item bond1.G.8275.portDS.localPriority is 128
ptp4l[688.185]: config item bond1.logSyncInterval is 0
ptp4l[688.185]: config item bond1.logMinPdelayReqInterval is 0
ptp4l[688.185]: config item bond1.neighborPropDelayThresh is 20000000
ptp4l[688.185]: config item bond1.min_neighbor_prop_delay is -20000000
ptp4l[688.185]: config item bond1.udp_ttl is 1
ptp4l[688.186]: config item (null).dscp_event is 0
ptp4l[688.186]: config item (null).dscp_general is 0
ptp4l[688.186]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[688.186]: config item /var/run/ptp4l.logMinDelayReqInterval is 0
ptp4l[688.186]: config item /var/run/ptp4l.logAnnounceInterval is 1
ptp4l[688.186]: config item /var/run/ptp4l.announceReceiptTimeout is 0
ptp4l[688.186]: config item /var/run/ptp4l.syncReceiptTimeout is 0
ptp4l[688.186]: config item /var/run/ptp4l.transportSpecific is 0
ptp4l[688.186]: config item /var/run/ptp4l.ignore_transport_specific is 0
ptp4l[688.186]: config item /var/run/ptp4l.masterOnly is 0
ptp4l[688.186]: config item /var/run/ptp4l.G.8275.portDS.localPriority is 128
ptp4l[688.186]: config item /var/run/ptp4l.logSyncInterval is 0
ptp4l[688.186]: config item /var/run/ptp4l.logMinPdelayReqInterval is 0
ptp4l[688.186]: config item /var/run/ptp4l.neighborPropDelayThresh is 20000000
ptp4l[688.186]: config item /var/run/ptp4l.min_neighbor_prop_delay is -20000000
ptp4l[688.186]: config item (null).uds_address is '/var/run/ptp4l'
ptp4l[688.186]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[688.186]: port 1: received link status notification
ptp4l[688.186]: interface index 6 is up
ptp4l[688.247]: port 1: setting asCapable
ptp4l[688.259]: port 1: new foreign master 001747.fffe.70151b-1
ptp4l[692.186]: port 1: unicast request timeout
ptp4l[692.188]: port 1: unicast ANNOUNCE granted for 300 sec
ptp4l[692.188]: port 1: renewal timeout at 917
ptp4l[692.296]: selected best master clock 001747.fffe.70151b
ptp4l[692.296]: updating UTC offset to 37
ptp4l[692.296]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[693.710]: port 1: delay timeout
ptp4l[696.714]: timed out while polling for tx timestamp
ptp4l[696.714]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[696.714]: port 1: send delay request failed
ptp4l[696.714]: port 1: clearing fault immediately
ptp4l[696.715]: config item bond1.logMinDelayReqInterval is 0
ptp4l[696.715]: config item bond1.logAnnounceInterval is 1
ptp4l[696.715]: config item bond1.announceReceiptTimeout is 3
ptp4l[696.715]: config item bond1.syncReceiptTimeout is 0
ptp4l[696.715]: config item bond1.transportSpecific is 0
ptp4l[696.715]: config item bond1.ignore_transport_specific is 0
ptp4l[696.715]: config item bond1.masterOnly is 0
ptp4l[696.715]: config item bond1.G.8275.portDS.localPriority is 128
ptp4l[696.715]: config item bond1.logSyncInterval is 0
ptp4l[696.715]: config item bond1.logMinPdelayReqInterval is 0
ptp4l[696.715]: config item bond1.neighborPropDelayThresh is 20000000
ptp4l[696.715]: config item bond1.min_neighbor_prop_delay is -20000000
ptp4l[696.715]: config item bond1.udp_ttl is 1
ptp4l[696.716]: config item (null).dscp_event is 0
ptp4l[696.716]: config item (null).dscp_general is 0
ptp4l[696.716]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE
ptp4l[696.716]: port 1: received link status notification
ptp4l[696.716]: interface index 6 is up
ptp4l[697.330]: port 1: setting asCapable
ptp4l[698.351]: port 1: new foreign master 001747.fffe.70151b-1
ptp4l[700.716]: port 1: unicast request timeout
ptp4l[700.717]: port 1: unicast SYNC granted for 300 sec
ptp4l[700.717]: PI servo: sync interval 1.000 kp 0.200 ki 0.037500
ptp4l[700.717]: port 1: unicast DELAY_RESP granted for 300 sec
ptp4l[702.387]: selected best master clock 001747.fffe.70151b
ptp4l[702.387]: updating UTC offset to 37
ptp4l[702.387]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[703.614]: port 1: delay timeout
ptp4l[706.618]: timed out while polling for tx timestamp
ptp4l[706.618]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[706.618]: port 1: send delay request failed
ptp4l[706.618]: port 1: clearing fault immediately
ptp4l[706.619]: config item bond1.logMinDelayReqInterval is 0
ptp4l[706.619]: config item bond1.logAnnounceInterval is 1
ptp4l[706.619]: config item bond1.announceReceiptTimeout is 3
ptp4l[706.619]: config item bond1.syncReceiptTimeout is 0
ptp4l[706.619]: config item bond1.transportSpecific is 0
ptp4l[706.619]: config item bond1.ignore_transport_specific is 0
ptp4l[706.619]: config item bond1.masterOnly is 0
ptp4l[706.619]: config item bond1.G.8275.portDS.localPriority is 128
ptp4l[706.619]: config item bond1.logSyncInterval is 0
ptp4l[706.619]: config item bond1.logMinPdelayReqInterval is 0
ptp4l[706.619]: config item bond1.neighborPropDelayThresh is 20000000
ptp4l[706.619]: config item bond1.min_neighbor_prop_delay is -20000000
ptp4l[706.619]: config item bond1.udp_ttl is 1
ptp4l[706.620]: config item (null).dscp_event is 0
ptp4l[706.620]: config item (null).dscp_general is 0
ptp4l[706.620]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE
ptp4l[706.620]: port 1: received link status notification
ptp4l[706.620]: interface index 6 is up
ptp4l[707.406]: port 1: setting asCapable
ptp4l[708.425]: port 1: new foreign master 001747.fffe.70151b-1
ptp4l[710.621]: port 1: unicast request timeout
ptp4l[712.432]: selected best master clock 001747.fffe.70151b
ptp4l[712.432]: updating UTC offset to 37
ptp4l[712.433]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[714.001]: port 1: delay timeout
ptp4l[717.005]: timed out while polling for tx timestamp
ptp4l[717.005]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[717.005]: port 1: send delay request failed
ptp4l[717.005]: port 1: clearing fault immediately
ptp4l[717.005]: config item bond1.logMinDelayReqInterval is 0
ptp4l[717.005]: config item bond1.logAnnounceInterval is 1
ptp4l[717.005]: config item bond1.announceReceiptTimeout is 3
ptp4l[717.005]: config item bond1.syncReceiptTimeout is 0
ptp4l[717.005]: config item bond1.transportSpecific is 0
ptp4l[717.005]: config item bond1.ignore_transport_specific is 0
ptp4l[717.005]: config item bond1.masterOnly is 0
ptp4l[717.005]: config item bond1.G.8275.portDS.localPriority is 128
ptp4l[717.005]: config item bond1.logSyncInterval is 0
ptp4l[717.006]: config item bond1.logMinPdelayReqInterval is 0
ptp4l[717.006]: config item bond1.neighborPropDelayThresh is 20000000
ptp4l[717.006]: config item bond1.min_neighbor_prop_delay is -20000000
ptp4l[717.006]: config item bond1.udp_ttl is 1
ptp4l[717.006]: config item (null).dscp_event is 0
ptp4l[717.006]: config item (null).dscp_general is 0
ptp4l[717.007]: port 1: UNCALIBRATED to LISTENING on INIT_COMPLETE
ptp4l[717.007]: port 1: received link status notification
ptp4l[717.007]: interface index 6 is up
ptp4l[717.442]: port 1: setting asCapable

Here is my ptp4l config:

[global]
#
# Default Data Set
#
twoStepFlag             0
slaveOnly               1
priority1               128
#priority2               255
priority2               128
domainNumber            44
#utc_offset             37
clockClass              248
#clockClass              255
#step_window            48
clockAccuracy           0xFE
offsetScaledLogVariance 0xFFFF
free_running            0
freq_est_interval       1
dscp_event              0
dscp_general            0
#dataset_comparison     ieee1588
#for G.8275.1
dataset_comparison      G.8275.x
G.8275.defaultDS.localPriority  128
#
# Port Data Set
#
logAnnounceInterval     1
logSyncInterval         0
logMinDelayReqInterval  0
logMinPdelayReqInterval 0
announceReceiptTimeout  3
syncReceiptTimeout      0
delayAsymmetry          0
fault_reset_interval    -128
#fault_reset_interval    4
neighborPropDelayThresh 20000000
masterOnly              0
G.8275.portDS.localPriority     128
#
# Run time options
#
assume_two_step         0
logging_level           6
path_trace_enabled      0
follow_up_info          0
hybrid_e2e              1
inhibit_multicast_service       1
net_sync_monitor        0
tc_spanning_tree        0
#tx_timestamp_timeout    300
#tx_timestamp_timeout   1000
tx_timestamp_timeout    10000
unicast_listen          1
unicast_req_duration    300
unicast_master_table    1
use_syslog              0
verbose                 0
summary_interval        4
kernel_leap             1
check_fup_sync          0
#
# Servo Options
#
#servo_offset_threshold 100
#servo_num_offset_values        64
pi_proportional_const   0.2
pi_integral_const       0.0375
pi_proportional_scale   0.0
pi_proportional_exponent        -0.3
pi_proportional_norm_max        0.7
pi_integral_scale       0.0
pi_integral_exponent    0.4
pi_integral_norm_max    0.3
step_threshold          0.0
#step_threshold         0.00002
first_step_threshold    0.00002
max_frequency           900000000
clock_servo             pi
sanity_freq_limit       200000000
ntpshm_segment          0
#
# Transport options
#
transportSpecific       0x0
ptp_dst_mac            01:1B:19:00:00:00
p2p_dst_mac            01:80:C2:00:00:0E
udp_ttl                 1
#udp6_scope             0x0E
uds_address             /var/run/ptp4l
#
# Default interface options
#
#clock_type              OC
network_transport       UDPv4
#delay_mechanism         P2P
delay_mechanism         E2E
time_stamping           p2p1step
#time_stamping           onestep
#time_stamping           hardware
tsproc_mode             filter
#tsproc_mode            raw
#tsproc_mode            raw_weight
delay_filter            moving_median
delay_filter_length     5
egressLatency           0
ingressLatency          0
boundary_clock_jbod     0
#
# Clock description
#
productDescription      ;;
revisionData            ;;
manufacturerIdentity    00:00:00
userDescription         ;
timeSource              0xA0
#maxStepsRemoved                255
#
[unicast_master_table]
table_id                        1
logQueryInterval                2
#UDPv4                           192.168.0.250
UDPv4                           192.168.1.250
#
[lan1]
unicast_master_table            1

cat /proc/interrupts:

          CPU0       CPU1       CPU2       CPU3
11:     323676        486        484        489     GICv3  30 Level     arch_timer
14:      50097          0          0          2     GICv3  79 Level     timer@306a0000
15:          0          0          0          0     GICv3  23 Level     arm-pmu
16:          0          0          0          0     GICv3 135 Level     302c0000.dma-controller
17:          0          0          0          0     GICv3  66 Level     302b0000.dma-controller
18:          0          0          0          0     GICv3  34 Level     30bd0000.dma-controller
19:       3385          0          0          0     GICv3  59 Level     30890000.serial
20:       1670          0          0          0     GICv3 139 Level     30bb0000.spi
21:          0          0          0          0     GICv3  51 Level     rtc alarm
22:          0          0          0          0     GICv3 110 Level     30280000.watchdog
23:       7579          0          0          0     GICv3  56 Level     mmc2
25:          0          0          0          0     GICv3 127 Level     sai
26:          0          0          0          0     GICv3  82 Level     sai
32:          0          0          0          0  gpio-mxc   3 Level     bd718xx-irq
39:         16          0          0          0  gpio-mxc  10 Level     global_port_irq
44:          0          0          0          0  gpio-mxc  15 Edge      30b50000.mmc cd
197:    1183025          0          0          0     GICv3  67 Level     30a20000.i2c
198:          0          0          0          0  bd718xx-irq   5 Edge      gpio_keys
199:          0          0          0          0     GICv3  68 Level     30a30000.i2c
200:          0          0          0          0     GICv3  69 Level     30a40000.i2c
201:          0          0          0          0     GICv3  70 Level     30a50000.i2c
202:          0          0          0          0     GICv3  64 Level     30830000.spi
203:          0          0          0          0     GICv3 150 Level     30be0000.ethernet
204:          0          0          0          0     GICv3 151 Level     30be0000.ethernet
205:       6819          0          0          0     GICv3 152 Level     30be0000.ethernet
206:          0          0          0          0     GICv3 153 Level     30be0000.ethernet
207:          0          0          0          0     GICv3  55 Level     mmc1
208:         16          0          0          0   ksz-irq   0 Edge      port_irq-0
209:          0          0          0          0   ksz-irq   1 Edge      port_irq-1
217:         16          0          0          0   ksz-irq   2 Edge      ptp-irq-0
218:          0          0          0          0   ksz-irq   0 Edge      pdresp-msg
219:         16          0          0          0   ksz-irq   1 Edge      xdreq-msg
220:          0          0          0          0   ksz-irq   2 Edge      sync-msg
223:          0          0          0          0   ksz-irq   2 Edge      ptp-irq-1
224:          0          0          0          0   ksz-irq   0 Edge      pdresp-msg
225:          0          0          0          0   ksz-irq   1 Edge      xdreq-msg
226:          0          0          0          0   ksz-irq   2 Edge      sync-msg
228:          0          0          0          0     GICv3  36 Level     30370000.snvs:snvs-powerkey
229:          0          0          0          0     GICv3 130 Level     imx8_ddr_perf_pmu
230:        213          0          0          0     GICv3 137 Level     30901000.jr
231:          0          0          0          0     GICv3 138 Level     30902000.jr
232:          0          0          0          0     GICv3 146 Level     30903000.jr
IPI0:         6        371        371        370       Rescheduling interrupts
IPI1:        24         63         63         63       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0       Timer broadcast interrupts
IPI5:     28207          0          0          0       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts

Please let me know if I need to supply more data or answer questions
to track this down.

Regards,

Brian


* Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
  2023-08-22 21:49 Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch Brian Hutchinson
@ 2023-08-23  8:22 ` Christian Eggers
  2023-08-23 18:12   ` Brian Hutchinson
       [not found]   ` <CAFZh4h-6yWvpvzJyv06zy8MbtMmXG==V0h2vU=uUN8iMMcb=ig@mail.gmail.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Christian Eggers @ 2023-08-23  8:22 UTC (permalink / raw)
  To: netdev, Brian Hutchinson; +Cc: Vladimir Oltean, arun.ramadoss

[-- Attachment #1: Type: text/plain, Size: 1961 bytes --]

Hi Brian,

I just returned from my holidays...

On Tuesday, 22 August 2023, 23:49:33 CEST, you wrote:
> Getting this tx_timestamp_timeout error over and over when I try to run ptp4l:
> 
> ptp4l[1366.143]: selected best master clock 001747.fffe.70151b
> ptp4l[1366.143]: updating UTC offset to 37
> ptp4l[1366.143]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> ptp4l[1366.860]: port 1: delay timeout
> ptp4l[1376.871]: timed out while polling for tx timestamp
> ptp4l[1376.871]: increasing tx_timestamp_timeout may correct this
> issue, but it is likely caused by a driver bug
> ptp4l[1376.871]: port 1: send delay request failed
> 
> I was using 5.10.69 with Christian's patches before they were mainlined
> and had everything working with the help of Christian, Vladimir and
> others.
> 
> Now I need to update the kernel, so I tried 6.3.12, which contains
> Christian's upstream patches. I also backported v8 of the upstreamed
> patches to 6.1.38, and I get the same result with that kernel too.
> 

I am also in the process of upgrading to 6.1.38 (but have not really tested
it yet). I cherry-picked all the necessary patches from the latest master
(see attached archive). Maybe you would like to compare this with your
patch series.

> [...]
>
> I tried increasing tx_timestamp_timeout and it doesn't appear to matter.
> I feel like I had this problem before, when I first started working with
> 5.10.69, but I can't remember whether another patch resolved it.  With
> 5.10.69 I've got quite a few more patches than just the 13 that were
> mainlined in 6.3.  Looking through old emails, I want to say it might have
> been resolved with
> net-dsa-ksz9477-avoid-PTP-races-with-the-data-path-l.patch, which Vladimir
> gave me, but looking at the code, mainline doesn't appear to have that one.

How is the IRQ line of your switch attached? I remember there was a problem
with the IRQ type (edge vs. level), but I think the fix has already been
applied to 6.1.38 (via -stable).

regards,
Christian

[-- Attachment #2: 2023-08-23_ptp.tar.gz --]
[-- Type: application/x-compressed-tar, Size: 29954 bytes --]


* Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
  2023-08-23  8:22 ` Christian Eggers
@ 2023-08-23 18:12   ` Brian Hutchinson
       [not found]   ` <CAFZh4h-6yWvpvzJyv06zy8MbtMmXG==V0h2vU=uUN8iMMcb=ig@mail.gmail.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Brian Hutchinson @ 2023-08-23 18:12 UTC (permalink / raw)
  To: netdev

Hi Christian,

On Wed, Aug 23, 2023 at 4:22 AM Christian Eggers <ceggers@arri.de> wrote:
>
> Hi Brian,
>
> I just returned from my holidays...

Hope you had a good one ... I need one too!

>
> On Tuesday, 22 August 2023, 23:49:33 CEST, you wrote:
> > Getting this tx_timestamp_timeout error over and over when I try to run ptp4l:
> >
> > ptp4l[1366.143]: selected best master clock 001747.fffe.70151b
> > ptp4l[1366.143]: updating UTC offset to 37
> > ptp4l[1366.143]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> > ptp4l[1366.860]: port 1: delay timeout
> > ptp4l[1376.871]: timed out while polling for tx timestamp
> > ptp4l[1376.871]: increasing tx_timestamp_timeout may correct this
> > issue, but it is likely caused by a driver bug
> > ptp4l[1376.871]: port 1: send delay request failed
> >
> > I was using 5.10.69 with Christian's patches before they were mainlined
> > and had everything working with the help of Christian, Vladimir and
> > others.
> >
> > Now I need to update the kernel, so I tried 6.3.12, which contains
> > Christian's upstream patches. I also backported v8 of the upstreamed
> > patches to 6.1.38, and I get the same result with that kernel too.
> >
>
> I am also in the process of upgrading to 6.1.38 (but have not really tested
> it yet). I cherry-picked all the necessary patches from the latest master
> (see attached archive). Maybe you would like to compare this with your
> patch series.

Excellent, I will check it out!  Yeah, we needed to be on an LTS kernel,
so that's why I'm focusing on 6.1.38, as it's the latest in the
yocto/oe universe.

>
> > [...]
> >
> > I tried increasing tx_timestamp_timeout and it doesn't appear to matter.
> > I feel like I had this problem before, when I first started working with
> > 5.10.69, but I can't remember whether another patch resolved it. With
> > 5.10.69 I've got quite a few more patches than just the 13 that were
> > mainlined in 6.3. Looking through old emails, I want to say it might have
> > been resolved with
> > net-dsa-ksz9477-avoid-PTP-races-with-the-data-path-l.patch, which Vladimir
> > gave me, but looking at the code, mainline doesn't appear to have that one.
>
> How is the IRQ line of your switch attached? I remember there was a problem
> with the IRQ type (edge vs. level), but I think the fix has already been
> applied to 6.1.38 (via -stable).

That was one of the first things I thought of, which is why I provided
the output of cat /proc/interrupts.

I also do have a /dev/ptp1 (/dev/ptp0 belongs to the i.MX8MM).

My device tree node is the same as before:

        i2c_ksz9567: ksz9567@5f {
                compatible = "microchip,ksz9567";
                reg = <0x5f>;
                phy-mode = "rgmii-id";
                status = "okay";
                interrupt-parent = <&gpio1>;
                interrupts = <10 IRQ_TYPE_LEVEL_LOW>;

                ports {
                        #address-cells = <1>;
                        #size-cells = <0>;
                        port@0 {
                                reg = <0>;
                                label = "lan1";
                        };
                        port@1 {
                                reg = <1>;
                                label = "lan2";
                        };
                        port@6 {
                                reg = <6>;
                                label = "cpu";
                                ethernet = <&fec1>;
                                phy-mode = "rgmii-id";
                                fixed-link {
                                        speed = <100>;
                                        full-duplex;
                                };
                        };
                };
        };

And I have the same pinmux setup as before.  I double-checked all of that.

I noticed that the new kernel's /proc/interrupts now has a number of ksz
lines in addition to "gpio-mxc  10 Level", which is the IRQ from the ksz
switch.
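
To compare how often each of the new ksz interrupts fires, the dump can be summarised per interrupt chip. The following is a minimal sketch under stated assumptions: `parse_interrupts` and `rows_for_chip` are hypothetical helper names, the sample rows are copied from the 6.3.12 dump in my first mail, and the parser treats any line that does not begin with an "NN:" label as a continuation of the previous row, which is how the mail client wrapped the handler names.

```python
import re

# A row starts with an IRQ label such as "217:" or "IPI0:".
ROW_START = re.compile(r"^\s*(\w+):\s+(.*)$")


def parse_interrupts(text):
    """Parse /proc/interrupts text into a list of row dicts.

    Each row has "irq" (label), "counts" (one int per CPU) and "desc"
    (chip, hwirq, trigger type and handler names joined by single spaces).
    """
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header line: CPU0 CPU1 ...
    rows = []
    for line in lines[1:]:
        match = ROW_START.match(line)
        if match:
            fields = match.group(2).split()
            counts = []
            for field in fields:
                if len(counts) < ncpu and field.isdigit():
                    counts.append(int(field))
                else:
                    break
            desc = " ".join(fields[len(counts):])
            rows.append({"irq": match.group(1), "counts": counts, "desc": desc})
        elif rows and line.strip():
            # Wrapped handler name: append it to the previous row.
            rows[-1]["desc"] = (rows[-1]["desc"] + " " + line.strip()).strip()
    return rows


def rows_for_chip(rows, chip):
    """Return (irq, description, total count) for rows handled by `chip`."""
    return [
        (row["irq"], row["desc"], sum(row["counts"]))
        for row in rows
        if row["desc"].split()[:1] == [chip]
    ]


sample = """
          CPU0       CPU1       CPU2       CPU3
39:         16          0          0          0  gpio-mxc  10 Level
 global_port_irq
217:         16          0          0          0   ksz-irq   2 Edge
  ptp-irq-0
219:         16          0          0          0   ksz-irq   1 Edge
  xdreq-msg
220:          0          0          0          0   ksz-irq   2 Edge
  sync-msg
"""

for irq, desc, total in rows_for_chip(parse_interrupts(sample), "ksz-irq"):
    print(f"{irq:>5} {total:>8}  {desc}")
```

Running this on dumps taken before and after a ptp4l attempt shows which of the ksz-irq counters move between the two.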

Here is what the old 5.10.69 /proc/interrupts looked like:

cat /proc/interrupts
          CPU0       CPU1       CPU2       CPU3
11:      46141        127        127        124     GICv3  30 Level     arch_timer
14:       5260          0          0          0     GICv3  79 Level     timer@306a0000
15:          0          0          0          0     GICv3  23 Level     arm-pmu
20:          0          0          0          0     GICv3 127 Level     sai
21:          0          0          0          0     GICv3  82 Level     sai
32:          0          0          0          0     GICv3 110 Level     30280000.watchdog
33:          0          0          0          0     GICv3 135 Level     sdma
34:          0          0          0          0     GICv3  66 Level     sdma
35:          0          0          0          0     GICv3  52 Level     caam-snvs
36:          0          0          0          0     GICv3  51 Level     rtc alarm
37:          0          0          0          0     GICv3  36 Level     30370000.snvs:snvs-powerkey
39:          0          0          0          0     GICv3  64 Level     30830000.spi
40:       1412          0          0          0     GICv3  59 Level     30890000.serial
42:      55291          0          0          0     GICv3  67 Level     30a20000.i2c
43:          0          0          0          0     GICv3  68 Level     30a30000.i2c
44:          0          0          0          0     GICv3  69 Level     30a40000.i2c
45:          0          0          0          0     GICv3  70 Level     30a50000.i2c
47:          0          0          0          0     GICv3  55 Level     mmc1
48:       3003          0          0          0     GICv3  56 Level     mmc2
49:       2565          0          0          0     GICv3 139 Level     30bb0000.spi
50:          0          0          0          0     GICv3  34 Level     sdma
51:          0          0          0          0     GICv3 150 Level     30be0000.ethernet
52:          0          0          0          0     GICv3 151 Level     30be0000.ethernet
53:       1417          0          0          0     GICv3 152 Level     30be0000.ethernet
54:          0          0          0          0     GICv3 153 Level     30be0000.ethernet
56:          0          0          0          0     GICv3 130 Level     imx8_ddr_perf_pmu
60:          0          0          0          0  gpio-mxc   3 Level     bd718xx-irq
67:         23          0          0          0  gpio-mxc  10 Level     0-005f
72:          0          0          0          0  gpio-mxc  15 Edge      30b50000.mmc cd
217:          0          0          0          0  bd718xx-irq   5 Edge      gpio_keys
IPI0:        29         14         13         13       Rescheduling interrupts
IPI1:         0         41         41         41       Function call interrupts
IPI2:         0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0       Timer broadcast interrupts
IPI5:      7959          0          0          0       IRQ work interrupts
IPI6:         0          0          0          0       CPU wake-up interrupts
Err:          0
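
To compare the switch IRQ count between kernels (or before and after a
ptp4l run), a small parser over the /proc/interrupts text may help.  This
is only a sketch: the names irq_totals and find_irqs are made up for
illustration, and matching on "0-005f" assumes the switch line is named as
in the 5.10.69 listing; the 6.x kernels may name their ksz lines
differently.  The input must not be line-wrapped by a mail client.

```python
SAMPLE = """\
          CPU0       CPU1       CPU2       CPU3
53:       1417          0          0          0     GICv3 152 Level     30be0000.ethernet
67:         23          0          0          0  gpio-mxc  10 Level     0-005f
Err:          0
"""


def irq_totals(text):
    """Return {irq: (total_count, description)} from /proc/interrupts text."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        irq, sep, rest = line.partition(":")
        fields = rest.split()
        counts = fields[:ncpus]
        # Rows such as "Err: 0" carry fewer counters than there are CPUs.
        if not sep or len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        totals[irq.strip()] = (sum(int(c) for c in counts),
                               " ".join(fields[ncpus:]))
    return totals


def find_irqs(text, needle):
    """List (irq, total, description) for rows whose description contains needle."""
    return [(irq, total, desc)
            for irq, (total, desc) in irq_totals(text).items()
            if needle in desc]


if __name__ == "__main__":
    # "0-005f" is the I2C client name from the old listing (an assumption
    # for newer kernels).
    for row in find_irqs(SAMPLE, "0-005f"):
        print(row)
```

On the target, passing open("/proc/interrupts").read() instead of SAMPLE
and calling this twice around a ptp4l run should show whether the switch
interrupt count moves at all.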

I'll check out your 6.1.38 changes compared to what I did.

Thanks,

Brian



* Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
       [not found]   ` <CAFZh4h-6yWvpvzJyv06zy8MbtMmXG==V0h2vU=uUN8iMMcb=ig@mail.gmail.com>
@ 2023-08-24 18:26     ` Brian Hutchinson
  2023-08-24 19:03       ` Brian Hutchinson
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Hutchinson @ 2023-08-24 18:26 UTC (permalink / raw)
  To: Christian Eggers; +Cc: netdev, Vladimir Oltean, arun.ramadoss

Hi Christian,


On Wed, Aug 23, 2023 at 9:29 AM Brian Hutchinson <b.hutchman@gmail.com> wrote:
>
>
>
> On Wed, Aug 23, 2023 at 4:22 AM Christian Eggers <ceggers@arri.de> wrote:
>>
>> Hi Brian,
>>
>> I just returned from my holidays...
>
>
> Hope you had a good one ... I need one too!
>
>>
>>
>> On Tuesday, 22 August 2023, 23:49:33 CEST, you wrote:
>> > Getting this tx_timestamp_timeout error over and over when I try to run ptp4l:
>> >
>> > ptp4l[1366.143]: selected best master clock 001747.fffe.70151b
>> > ptp4l[1366.143]: updating UTC offset to 37
>> > ptp4l[1366.143]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
>> > ptp4l[1366.860]: port 1: delay timeout
>> > ptp4l[1376.871]: timed out while polling for tx timestamp
>> > ptp4l[1376.871]: increasing tx_timestamp_timeout may correct this
>> > issue, but it is likely caused by a driver bug
>> > ptp4l[1376.871]: port 1: send delay request failed
>> >
>> > I was using 5.10.69 with Christian's patches before they were mainlined
>> > and had everything working with the help of Christian, Vladimir and
>> > others.
>> >
>> > Now I need to update the kernel, so I tried 6.3.12, which contains
>> > Christian's upstream patches. I also backported v8 of the upstreamed
>> > patches to 6.1.38 and I'm getting the same results with that kernel too.
>> >
>>
>> I am also in the process of upgrading to 6.1.38 (but have not really tested it).
>> I cherry-picked all necessary patches from the latest master (see attached
>> archive). Maybe you would like to compare this with your patch series.
>
>
> Excellent, I will check it out!  Yeah, we need to be on an LTS kernel, which is why I'm focusing on 6.1.38, as it's the latest in the yocto/oe universe.

So I checked all of your patches for 6.1.38 against the ones I had.  I had
all except 0002 and 0003.  I didn't have all of 0001, but I had hit a build
error on diff_by_scaled_ppm and backported that function from 6.3.12
to make things build.

I applied the missing patches I got from you, rebuilt everything, and
still get the same tx_timestamp_timeout result.  That didn't surprise me
since, as I mentioned before, I tried 6.3.12 mainline and get the same
result there too.

Regards,

Brian


* Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
  2023-08-24 18:26     ` Brian Hutchinson
@ 2023-08-24 19:03       ` Brian Hutchinson
  2023-08-25 15:49         ` Christian Eggers
  2023-08-26 11:43         ` Vladimir Oltean
  0 siblings, 2 replies; 7+ messages in thread
From: Brian Hutchinson @ 2023-08-24 19:03 UTC (permalink / raw)
  To: Christian Eggers
  Cc: netdev, Vladimir Oltean, arun.ramadoss, rakesh.sankaranarayanan

Update: top-posting because I think I've found my issue.

I dug further into my problem.  I'm using E2E, and it looks like the
mainlined Microchip KSZ DSA PTP code only supports P2P.

The 5.10.69 kernel that I was first able to get working with
Christian's early pre-mainlined patches had:
0016-net-dsa-microchip-ksz9477-add-E2E-support.patch

... which gets into the "sticky" bits of why these patches weren't
accepted in the first place: some Microchip-specific implementation
details, if I recall correctly.
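
For readers less familiar with the E2E versus P2P distinction, the sketch
below maps the PTP messageType field (low nibble of the first header byte,
per IEEE 1588) to the delay mechanism that uses it.  The classify() helper
is hypothetical and says nothing about what the ksz driver actually
timestamps; it only illustrates that the "send delay request failed" line
from ptp4l concerns Delay_Req, an E2E event message, whereas P2P uses the
Pdelay_* messages instead.

```python
# messageType values from IEEE 1588 (low nibble of the first header byte).
MESSAGE_TYPES = {
    0x0: "Sync",
    0x1: "Delay_Req",
    0x2: "Pdelay_Req",
    0x3: "Pdelay_Resp",
    0x8: "Follow_Up",
    0x9: "Delay_Resp",
    0xA: "Pdelay_Resp_Follow_Up",
    0xB: "Announce",
    0xC: "Signaling",
    0xD: "Management",
}


def classify(first_header_byte):
    """Return (name, mechanism, is_event) for a PTP header's first byte."""
    msg_type = first_header_byte & 0x0F  # high nibble is not part of the type
    name = MESSAGE_TYPES.get(msg_type, "reserved")
    if name.startswith("Pdelay_"):
        mechanism = "P2P"   # peer delay measurement
    elif name.startswith("Delay_"):
        mechanism = "E2E"   # end-to-end delay request/response
    else:
        mechanism = "any"   # used regardless of the delay mechanism
    is_event = msg_type < 0x4  # event messages are the timestamped ones
    return name, mechanism, is_event


if __name__ == "__main__":
    for byte in (0x00, 0x01, 0x02, 0x03, 0x09):
        print(hex(byte), classify(byte))
```

An E2E slave therefore needs a TX timestamp for each Delay_Req it sends;
if the driver never delivers one, ptp4l reports the timeout seen at the
start of this thread.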

Regards,

Brian




* Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
  2023-08-24 19:03       ` Brian Hutchinson
@ 2023-08-25 15:49         ` Christian Eggers
  2023-08-26 11:43         ` Vladimir Oltean
  1 sibling, 0 replies; 7+ messages in thread
From: Christian Eggers @ 2023-08-25 15:49 UTC (permalink / raw)
  To: Brian Hutchinson
  Cc: netdev, Vladimir Oltean, arun.ramadoss, rakesh.sankaranarayanan

Hi Brian,

On Thursday, 24 August 2023, 21:03:32 CEST, Brian Hutchinson wrote:
> Update: top-posting because I think I've found my issue.
>
> I dug further into my problem.  I'm using E2E, and it looks like the
> mainlined Microchip KSZ DSA PTP code only supports P2P.
> 
> The 5.10.69 kernel that I was first able to get working with
> Christian's early pre-mainlined patches had:
> 0016-net-dsa-microchip-ksz9477-add-E2E-support.patch

Sorry about this, I forgot that you use E2E.  Unfortunately, I
have no up-to-date patches for it, so you may want to try porting
the old patch yourself.

Regards,
Christian


* Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
  2023-08-24 19:03       ` Brian Hutchinson
  2023-08-25 15:49         ` Christian Eggers
@ 2023-08-26 11:43         ` Vladimir Oltean
  1 sibling, 0 replies; 7+ messages in thread
From: Vladimir Oltean @ 2023-08-26 11:43 UTC (permalink / raw)
  To: Brian Hutchinson, Tristram.Ha, Woojung.Huh
  Cc: Christian Eggers, netdev, arun.ramadoss, rakesh.sankaranarayanan

On Thu, Aug 24, 2023 at 03:03:32PM -0400, Brian Hutchinson wrote:
> Update: top-posting because I think I've found my issue.
>
> I dug further into my problem.  I'm using E2E, and it looks like the
> mainlined Microchip KSZ DSA PTP code only supports P2P.
> 
> The 5.10.69 kernel that I was first able to get working with
> Christian's early pre-mainlined patches had:
> 0016-net-dsa-microchip-ksz9477-add-E2E-support.patch
> 
> ... which gets into the "sticky" bits of why these patches weren't
> accepted in the first place: some Microchip-specific implementation
> details, if I recall correctly.
> 
> Regards,
> 
> Brian

+Tristram Ha, Woojung Huh


Thread overview: 7+ messages
2023-08-22 21:49 Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch Brian Hutchinson
2023-08-23  8:22 ` Christian Eggers
2023-08-23 18:12   ` Brian Hutchinson
     [not found]   ` <CAFZh4h-6yWvpvzJyv06zy8MbtMmXG==V0h2vU=uUN8iMMcb=ig@mail.gmail.com>
2023-08-24 18:26     ` Brian Hutchinson
2023-08-24 19:03       ` Brian Hutchinson
2023-08-25 15:49         ` Christian Eggers
2023-08-26 11:43         ` Vladimir Oltean
