Netdev List
* RE: [E1000-devel] SFP+ EEPROM readouts fail on X722 (ethtool -m: Invalid argument)
From: Buchholz, Donald @ 2019-08-28 16:12 UTC (permalink / raw)
  To: Jakub Jankowski, Fujinaka, Todd,
	e1000-devel@lists.sourceforge.net
  Cc: netdev@vger.kernel.org, mhemsley@open-systems.com, Yang, Lihong
In-Reply-To: <1277a516-78ac-8bcd-64ac-d97a260451bc@toxcorp.com>

Hi Jakub,

That commit was for "ethtool -e" and not "ethtool -m".

Some of the firmware support required to implement "ethtool -m" was
missing from the original X722 NVM images and, to the best of my
knowledge, was still not present in early 2019.  My situation is
similar to Todd's: we know a request to add this support has been
submitted, but it is not clear whether it has been approved,
completed, or shipped yet.

- Don



> -----Original Message-----
> From: Jakub Jankowski [mailto:shasta@toxcorp.com]
> Sent: Wednesday, August 28, 2019 12:18 AM
> To: Fujinaka, Todd <todd.fujinaka@intel.com>; e1000-devel@lists.sourceforge.net
> Cc: netdev@vger.kernel.org; mhemsley@open-systems.com; Yang, Lihong
> <lihong.yang@intel.com>
> Subject: Re: [E1000-devel] SFP+ EEPROM readouts fail on X722 (ethtool -m:
> Invalid argument)
> 
> This commit suggests that it should be possible:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c271dd6c391b535226cf1a81aaad9f33cb5899d3
> (It has been in upstream kernel since v4.12, so my test kernel does have
> it, and so does the out-of-tree driver I'm testing with)
> 
> On 8/28/19 2:53 AM, Fujinaka, Todd wrote:
> > Sorry about the top posting, but if I don't do it this way I can't read
> > anything in Outlook (not my preferred MUA).
> >
> > I think I may have been wrong about things. I'm not as familiar with the
> > x722, and the NVM versions are completely different than the x710 and I
> > was confused.
> >
> > Even worse, I'm not sure if the x722 is able to read the data from the
> > SFP/SFP+ EEPROM. I remembered that was a feature we requested internally
> > but I don't remember what the progress was.
> >
> > I'm asking around to see if I can get clarification. I haven't heard
> > anything yet.
> >
> > Todd Fujinaka
> > Software Application Engineer
> > Datacenter Engineering Group
> > Intel Corporation
> > todd.fujinaka@intel.com
> >
> >
> > -----Original Message-----
> > From: Jakub Jankowski [mailto:shasta@toxcorp.com]
> > Sent: Tuesday, August 27, 2019 4:01 PM
> > To: Fujinaka, Todd <todd.fujinaka@intel.com>; e1000-devel@lists.sourceforge.net
> > Cc: netdev@vger.kernel.org; mhemsley@open-systems.com
> > Subject: Re: [E1000-devel] SFP+ EEPROM readouts fail on X722 (ethtool -m: Invalid argument)
> >
> > Hi,
> >
> > On 8/27/19 7:56 PM, Fujinaka, Todd wrote:
> >> The hints should be:
> >> # ethtool -m eth10
> >> Cannot get module EEPROM information: Invalid argument
> >> # dmesg | tail -n 1
> >> [  445.971974] i40e 0000:3d:00.3 eth10: Module EEPROM memory read not supported. Please update the NVM image.
> >>
> >> # ethtool -i eth10
> >> driver: i40e
> >> version: 2.9.21
> >> firmware-version: 3.31 0x80000d31 1.1767.0
> >>
> >> And the working case:
> >> # ethtool -i eth8
> >> driver: i40e
> >> version: 2.9.21
> >> firmware-version: 6.01 0x800035cf 1.1876.0
> >>
> >> If you don't see it, 6.01 > 3.31.
> > The reason why firmware between the two is (that much) different is
> > because the non-working case is from X722 NIC, while the working one is
> > from X710.
> >
> >> The NVM update tool should be available on downloadcenter.intel.com
> > Thanks for the pointer to NVM updater. I'd like to offer some additional
> > comments about my experience with the newest one (v4.00):
> >
> > a) running ./nvmupdate64e (from X722_NVMUpdate_Linux_x64 subdir) errors
> out without really saying what's wrong:
> >
> >     # ./nvmupdate64e
> >
> >     Intel(R) Ethernet NVM Update Tool
> >     NVMUpdate version 1.30.2.11
> >     Copyright (C) 2013 - 2017 Intel Corporation.
> >
> >
> >     WARNING: To avoid damage to your device, do not stop the update or reboot or power off the system during this update.
> >     Inventory in progress. Please wait [+.........]
> >     Tool execution completed with the following status: The configuration file could not be opened/read, or a syntax error was discovered in the file
> >     Press any key to exit.
> >
> > after enabling logging (-l out.log) a bit more is revealed:
> >
> >     # tail -n 2 out.log
> >     Error:   Config file line 2: Not supported config file version.
> >     Error:   Missing CONFIG VERSION parameter in configuration file.
> >
> > but that's not entirely true, CONFIG VERSION is set in the default
> configuration file:
> >
> >     # head -n 2 nvmupdate.cfg
> >     CURRENT FAMILY: 1.0.0
> >     CONFIG VERSION: 1.14.0
> >
> > so why isn't this understood?
> > Manually editing nvmupdate.cfg and setting CONFIG VERSION: 1.11.0 seems
> to make this particular problem go away.
> >
> > b) Re-doing this with downgraded config version exposes another problem:
> >
> >     Config file read.
> >     Error:   Can't open NVM map file [Immediate_offset_2.txt]
> >
> > and indeed, there is no Immediate_offset_2.txt in
> NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_Linux_x64/
> > There is one, however, in
> > NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_EFIx64/ subdir.
> > Copying it over to the _Linux_x64 resolves this particular problem
> >
> > c) Re-doing this with Immediate_offset_2.txt in place exposes a third
> > problem:
> >
> >     Error:   Can't open NVM image file
> >     [LBG_B2_Wolf_Pass_WFT_X557_P01_PHY_Auto_Detect_P23_NCSI_v3.31_800016DB.bin]
> >
> > and once again - same story. It exists in
> NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_EFIx64/ but not
> NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_Linux_x64/ - had to copy
> it over.
> >
> >
> > Once I managed to get all these out of the way, the tool finally ran:
> >
> >     Num Description                               Ver. DevId S:B    Status
> >     === ======================================== ===== ===== ====== ===============
> >     01) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:024 Update not available
> >     02) Intel(R) Ethernet Connection X722 for     3.49  37D2 00:061 Update available
> >         10GBASE-T
> >     03) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:175 Update not available
> >
> >
> > The initial starting point was:
> >
> > 0) firmware-version: 3.31 0x80000d31 1.1767.0
> >
> > After first update+reboot, this was bumped to:
> >
> > 1) firmware-version: 3.1d 0x800016db 1.1767.0    (but ethtool -m ethX
> still doesn't work)
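
A note on reading these version strings. If the leading "major.minor"
field that ethtool prints is hexadecimal (an assumption, although the
"3.1d" above suggests it), then 3.31 and 3.1d have minor parts 49 and 29
in decimal, which would line up with the "3.49" and "3.29" shown in the
updater's tables. The helper names below are made up for this
illustration; this is a sketch of the arithmetic, not code from ethtool or
the driver.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of ethtool or i40e): turn the leading
 * "major.minor" field of a firmware-version string into one comparable
 * number. ASSUMPTION: both parts are hexadecimal.
 * Returns -1 if the string does not start with "<hex>.<hex>".
 */
long nvm_version_key(const char *fw_version)
{
	unsigned int major, minor;

	if (sscanf(fw_version, "%x.%x", &major, &minor) != 2)
		return -1;
	return ((long)major << 8) | (minor & 0xff);
}

/* Minor part as a decimal number, e.g. hex 31 -> 49. */
int nvm_minor_decimal(const char *fw_version)
{
	long key = nvm_version_key(fw_version);

	return key < 0 ? -1 : (int)(key & 0xff);
}
```

Under that reading, the first update moved the minor part from hex 31 to
hex 1d, that is, numerically backwards, before the second run reported
4.00.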
> >
> > So I ran the tool the second time, it said 'Update available' again, but
> this time:
> >
> >     Num Description                               Ver. DevId S:B    Status
> >     === ======================================== ===== ===== ====== ===============
> >     01) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:024 Update not available
> >     02) Intel(R) Ethernet Connection X722 for     3.29  37D2 00:061 Update available
> >         10GBASE-T
> >     03) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:175 Update not available
> >
> >     Options: Adapter Index List (comma-separated), [A]ll, e[X]it
> >     Enter selection:02
> >     Would you like to back up the NVM images? [Y]es/[N]o: Y
> >     Update in progress. This operation may take several minutes.
> >     [*******+..]
> >     Tool execution completed with the following status: <---------- why is there no status printed?
> >     Press any key to exit.
> >
> >
> > Checking output log:
> >
> >     # cat out3.log
> >     Intel(R) Ethernet NVM Update Tool
> >     NVMUpdate version 1.30.2.11
> >     Copyright (C) 2013 - 2017 Intel Corporation.
> >
> >     ./nvmupdate64e -c nvmupdate.cfg -l out3.log
> >
> >     Config file read.
> >     Inventory
> >     [00:061:00:00]: Intel(R) Ethernet Connection X722 for 10GBASE-T
> >         Flash inventory started
> >         Shadow RAM inventory started
> >     Alternate MAC address is not set
> >         Shadow RAM inventory finished
> >         Flash inventory finished
> >         OROM inventory started
> >         OROM inventory finished
> >         PHY NVM inventory started
> >         PHY NVM inventory finished
> >     [00:061:00:01]: Intel(R) Ethernet Connection X722 for 10GBASE-T
> >         Device already inventoried.
> >     [00:061:00:02]: Intel(R) Ethernet Connection X722 for 10GbE SFP+
> >         Device already inventoried.
> >         PHY NVM inventory started
> >         PHY NVM inventory finished
> >     [00:061:00:03]: Intel(R) Ethernet Connection X722 for 10GbE SFP+
> >         Device already inventoried.
> >     Update
> >     [00:061:00:00]: Intel(R) Ethernet Connection X722 for 10GBASE-T
> >         Creating backup images in directory: A4BF0164884A
> >         Backup images created.
> >         Flash update started
> >         NVM image verification started
> >         Shadow RAM image verification started
> >
> >     Image differences found at offset 0x3AE [Device=0xF, Buffer=0x0] -
> > update required.
> >     Error:   Flash update failed
> >     [00:061:00:02]: Intel(R) Ethernet Connection X722 for 10GbE SFP+
> >     #
> >
> > However, ethtool -i suggests that firmware was updated to:
> >
> > 2) firmware-version: 4.00 0x80001577 1.1580.0    <------- so it did
> > _something_ after all?
> >
> > At this point, every subsequent attempt to run the NVM updater yields
> > the same results: an update is available, but attempting to apply it
> > fails with the same message in log.
> >
> > And my initial issue still persists - ethtool -m <iface> still returns
> > "invalid argument" with "Module EEPROM memory read not supported. Please
> > update the NVM image" logged in dmesg.
> >
> > How can I resolve this?
> >
> > Cheers,
> >    Jakub.
> >
> >> Todd Fujinaka
> >> Software Application Engineer
> >> Datacenter Engineering Group
> >> Intel Corporation
> >> todd.fujinaka@intel.com
> >>
> >>
> >> -----Original Message-----
> >> From: Jakub Jankowski [mailto:shasta@toxcorp.com]
> >> Sent: Tuesday, August 27, 2019 4:03 AM
> >> To: e1000-devel@lists.sourceforge.net
> >> Cc: netdev@vger.kernel.org; shasta@toxcorp.com; mhemsley@open-systems.com
> >> Subject: [E1000-devel] SFP+ EEPROM readouts fail on X722 (ethtool -m: Invalid argument)
> >>
> >> Hi,
> >>
> >> We can't get SFP+ EEPROM readouts for X722 to work at all:
> >>
> >> # ethtool -m eth10
> >> Cannot get module EEPROM information: Invalid argument
> >> # dmesg | tail -n 1
> >> [  445.971974] i40e 0000:3d:00.3 eth10: Module EEPROM memory read not supported. Please update the NVM image.
> >> # lspci | grep 3d:00.3
> >> 3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
> >>
> >>
> >> We're running 4.19.65 kernel at the moment, testing using the newest
> out-of-tree Intel module
> >>
> >> # modinfo -F version i40e
> >> 2.9.21
> >>
> >> We also tried:
> >> - 4.19.65 with in-tree i40e (2.3.2-k)
> >> - stock Arch Linux (kernel 5.2.5, driver 2.8.20-k) and the results are
> the same, as shown above.
> >>
> >> # ethtool -i eth10
> >> driver: i40e
> >> version: 2.9.21
> >> firmware-version: 3.31 0x80000d31 1.1767.0
> >> expansion-rom-version:
> >> bus-info: 0000:3d:00.3
> >> supports-statistics: yes
> >> supports-test: yes
> >> supports-eeprom-access: yes
> >> supports-register-dump: yes
> >> supports-priv-flags: yes
> >> # dmidecode -s baseboard-manufacturer
> >> Intel Corporation
> >> # dmidecode -s baseboard-product-name
> >> S2600WFT
> >> # dmidecode -s baseboard-version
> >> H48104-853
> >>
> >> # lspci -vvv
> >> (...)
> >> 3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722
> for 10GbE SFP+ (rev 09)
> >> 	DeviceName: Intel PCH Integrated 10 Gigabit Ethernet Controller
> >> 	Subsystem: Intel Corporation Ethernet Connection X722 for 10GbE SFP+
> >> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
> Stepping- SERR+ FastB2B- DisINTx+
> >> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
> >> 	Latency: 0, Cache Line Size: 32 bytes
> >> 	Interrupt: pin A routed to IRQ 112
> >> 	NUMA node: 0
> >> 	Region 0: Memory at ab000000 (64-bit, prefetchable) [size=16M]
> >> 	Region 3: Memory at b0000000 (64-bit, prefetchable) [size=32K]
> >> 	Expansion ROM at <ignored> [disabled]
> >> 	Capabilities: [40] Power Management version 3
> >> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-
> ,D3hot+,D3cold+)
> >> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> >> 	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
> >> 		Address: 0000000000000000  Data: 0000
> >> 		Masking: 00000000  Pending: 00000000
> >> 	Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
> >> 		Vector table: BAR=3 offset=00000000
> >> 		PBA: BAR=3 offset=00001000
> >> 	Capabilities: [a0] Express (v2) Endpoint, MSI 00
> >> 		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s
> <512ns, L1 <64us
> >> 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
> SlotPowerLimit 0.000W
> >> 		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
> >> 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
> >> 			MaxPayload 256 bytes, MaxReadReq 512 bytes
> >> 		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+
> TransPend-
> >> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1,
> Exit Latency L0s <64ns, L1 <1us
> >> 			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
> >> 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
> >> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> >> 		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
> >> 			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> >> 		DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF
> Not Supported
> >> 			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> >> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-,
> OBFF Disabled
> >> 			 AtomicOpsCtl: ReqEn-
> >> 		LnkSta2: Current De-emphasis Level: -6dB,
> EqualizationComplete-, EqualizationPhase1-
> >> 			 EqualizationPhase2-, EqualizationPhase3-,
> LinkEqualizationRequest-
> >> 	Capabilities: [e0] Vital Product Data
> >> 		Product Name: Example VPD
> >> 		Read-only fields:
> >> 			[V0] Vendor specific:
> >> 			[RV] Reserved: checksum good, 0 byte(s) reserved
> >> 		End
> >> 	Capabilities: [100 v2] Advanced Error Reporting
> >> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
> >> 		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> >> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> AdvNonFatalErr+
> >> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> AdvNonFatalErr+
> >> 		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn-
> ECRCChkCap+ ECRCChkEn-
> >> 			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
> >> 		HeaderLog: 00000000 00000000 00000000 00000000
> >> 	Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
> >> 		ARICap:	MFVC- ACS-, Next Function: 0
> >> 		ARICtl:	MFVC- ACS-, Function Group: 0
> >> 	Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
> >> 		IOVCap:	Migration-, Interrupt Message Number: 000
> >> 		IOVCtl:	Enable- Migration- Interrupt- MSE- ARIHierarchy-
> >> 		IOVSta:	Migration-
> >> 		Initial VFs: 32, Total VFs: 32, Number of VFs: 0, Function
> Dependency Link: 03
> >> 		VF offset: 109, stride: 1, Device ID: 37cd
> >> 		Supported Page Size: 00000553, System Page Size: 00000001
> >> 		Region 0: Memory at 00000000af000000 (64-bit, prefetchable)
> >> 		Region 3: Memory at 00000000b0020000 (64-bit, prefetchable)
> >> 		VF Migration: offset: 00000000, BIR: 0
> >> 	Capabilities: [1a0 v1] Transaction Processing Hints
> >> 		Device specific mode supported
> >> 		No steering table available
> >> 	Capabilities: [1b0 v1] Access Control Services
> >> 		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir-
> UpstreamFwd- EgressCtrl- DirectTrans-
> >> 		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir-
> UpstreamFwd- EgressCtrl- DirectTrans-
> >> 	Kernel driver in use: i40e
> >> 	Kernel modules: i40e
> >>
> >>
> >> Same kernel+i40e, same SFP+ module - but on Intel X710, works like a
> >> treat:
> >>
> >> # lspci | grep X7
> >> 81:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
> >> 81:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
> >> # ethtool -m eth8
> >> 	Identifier                                : 0x03 (SFP)
> >> 	Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
> >> 	Connector                                 : 0x07 (LC)
> >> 	Transceiver codes                         : 0x10 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
> >> 	Transceiver type                          : 10G Ethernet: 10G Base-SR
> >> 	Transceiver type                          : Ethernet: 1000BASE-SX
> >> 	Encoding                                  : 0x06 (64B/66B)
> >> 	BR, Nominal                               : 10300MBd
> >>            (...)
> >> # ethtool -i eth8
> >> driver: i40e
> >> version: 2.9.21
> >> firmware-version: 6.01 0x800035cf 1.1876.0
> >> expansion-rom-version:
> >> bus-info: 0000:81:00.0
> >> supports-statistics: yes
> >> supports-test: yes
> >> supports-eeprom-access: yes
> >> supports-register-dump: yes
> >> supports-priv-flags: yes
> >> #
> >>
> >>
> >> Is this a known problem?
> >>
> >>
> >> Best regards,
> >>     Jakub
> >>
> >>
> >>
> >> _______________________________________________
> >> E1000-devel mailing list
> >> E1000-devel@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> >> To learn more about Intel® Ethernet, visit
> >> http://communities.intel.com/community/wired
> 
> 
> --
> Jakub Jankowski|shasta@toxcorp.com|http://toxcorp.com/
> GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D
> 
> 
> 

^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH] net: ixgbe: fix memory leaks
From: Bowers, AndrewX @ 2019-08-28 16:22 UTC (permalink / raw)
  To: open list:NETWORKING DRIVERS,
	moderated list:INTEL ETHERNET DRIVERS, open list
In-Reply-To: <1565554067-4994-1-git-send-email-wenwen@cs.uga.edu>

> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Wenwen Wang
> Sent: Sunday, August 11, 2019 1:08 PM
> To: Wenwen Wang <wenwen@cs.uga.edu>
> Cc: open list:NETWORKING DRIVERS <netdev@vger.kernel.org>; moderated
> list:INTEL ETHERNET DRIVERS <intel-wired-lan@lists.osuosl.org>; open list
> <linux-kernel@vger.kernel.org>; David S. Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH] net: ixgbe: fix memory leaks
> 
> In ixgbe_configure_clsu32(), 'jump', 'input', and 'mask' are allocated through
> kzalloc() respectively in a for loop body. Then,
> ixgbe_clsu32_build_input() is invoked to build the input. If this process fails,
> next iteration of the for loop will be executed. However, the allocated
> 'jump', 'input', and 'mask' are not deallocated on this execution path, leading
> to memory leaks.
> 
> Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++++
>  1 file changed, 4 insertions(+)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
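
The leak pattern described in the commit message can be reproduced in a
few lines of user-space C. This is only a sketch: struct rule,
build_input() and the allocation counter are stand-ins invented for the
illustration, with calloc()/free() in place of kzalloc()/kfree(); it is
not the ixgbe code.

```c
#include <stdlib.h>

struct rule {
	int value;
};

/* Outstanding allocations, tracked only to make the leak observable. */
static int live_allocs;

struct rule *rule_alloc(void)
{
	struct rule *r = calloc(1, sizeof(*r));

	if (r)
		live_allocs++;
	return r;
}

void rule_free(struct rule *r)
{
	if (r) {
		free(r);
		live_allocs--;
	}
}

int outstanding_allocs(void)
{
	return live_allocs;
}

/* Hypothetical stand-in for the build step: fails for odd keys. */
int build_input(struct rule *r, int key)
{
	if (key % 2)
		return -1;
	r->value = key;
	return 0;
}

/* Returns the first rule that builds successfully, or NULL. */
struct rule *find_first_valid(const int *keys, int n)
{
	for (int i = 0; i < n; i++) {
		struct rule *r = rule_alloc();

		if (!r)
			return NULL;
		if (build_input(r, keys[i])) {
			/* Without this free, each failed iteration leaks r. */
			rule_free(r);
			continue;
		}
		return r;
	}
	return NULL;
}
```

With keys {1, 3, 4}, two iterations fail and one succeeds; only the
returned rule should remain allocated afterwards.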



^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH] i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask
From: Bowers, AndrewX @ 2019-08-28 16:23 UTC (permalink / raw)
  To: intel-wired-lan@lists.osuosl.org; +Cc: netdev@vger.kernel.org
In-Reply-To: <20190821140929.26985-1-sassmann@kpanic.de>

> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Stefan Assmann
> Sent: Wednesday, August 21, 2019 7:09 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; davem@davemloft.net; sassmann@kpanic.de
> Subject: [Intel-wired-lan] [PATCH] i40e: check __I40E_VF_DISABLE bit in
> i40e_sync_filters_subtask
> 
> While testing VF spawn/destroy the following panic occurred.
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
> [...]
> Workqueue: i40e i40e_service_task [i40e]
> RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e]
> [...]
> Call Trace:
>  ? __switch_to_asm+0x35/0x70
>  ? __switch_to_asm+0x41/0x70
>  ? __switch_to_asm+0x35/0x70
>  ? _cond_resched+0x15/0x30
>  i40e_sync_filters_subtask+0x56/0x70 [i40e]
>  i40e_service_task+0x382/0x11b0 [i40e]
>  ? __switch_to_asm+0x41/0x70
>  ? __switch_to_asm+0x41/0x70
>  process_one_work+0x1a7/0x3b0
>  worker_thread+0x30/0x390
>  ? create_worker+0x1a0/0x1a0
>  kthread+0x112/0x130
>  ? kthread_bind+0x30/0x30
>  ret_from_fork+0x35/0x40
> 
> Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get
> accessed by the watchdog via i40e_sync_filters_subtask() although
> i40e_free_vfs() already free'd pf->vf.
> To avoid this the call to i40e_sync_vsi_filters() in
> i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE,
> which is also used by i40e_free_vfs().
> 
> Note: put the __I40E_VF_DISABLE check after the
> __I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to
> trigger.
> 
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 5 +++++
>  1 file changed, 5 insertions(+)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
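
The guard described above amounts to a test-and-set on a shared flag: the
periodic task skips its work while teardown holds the flag, and teardown
waits its turn the same way. The user-space sketch below uses C11 atomics
in place of the kernel's bit operations. All names are invented for the
illustration, and since the diff body is not quoted in this message, how
the real patch places the check is an assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct pf_state {
	atomic_bool vf_disable;	/* stand-in for the __I40E_VF_DISABLE bit */
	int *vf;		/* stand-in for pf->vf; NULL once freed */
	int syncs_done;
};

/* Teardown side: take the guard before freeing; false if already held. */
bool begin_free_vfs(struct pf_state *pf)
{
	return !atomic_exchange(&pf->vf_disable, true);
}

void end_free_vfs(struct pf_state *pf)
{
	atomic_store(&pf->vf_disable, false);
}

/*
 * Periodic side: run the sync only if nobody else holds the guard.
 * Returns true if the sync ran, false if it was skipped this round.
 */
bool sync_filters_subtask(struct pf_state *pf)
{
	if (atomic_exchange(&pf->vf_disable, true))
		return false;	/* teardown in progress: do not touch pf->vf */

	if (pf->vf)
		pf->syncs_done += pf->vf[0];

	atomic_store(&pf->vf_disable, false);
	return true;
}
```

A skipped round is harmless here because the task runs again later; the
important property is that pf->vf is never dereferenced while the other
side may be freeing it.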



^ permalink raw reply

* [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: Vivien Didelot @ 2019-08-28 16:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, Marek Behún, f.fainelli, andrew, Vivien Didelot

This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
mv88e6xxx: create serdes_get_lane chip operation").

The .serdes_get_lane implementations access the CMODE of a port.
Even though it is cached at the moment, it is safer to call them
after the mutex is locked, not before.

At the same time, check for a possible error and return IRQ_NONE,
instead of blindly ignoring it.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/serdes.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
index 9424e401dbc7..38c0da2492c0 100644
--- a/drivers/net/dsa/mv88e6xxx/serdes.c
+++ b/drivers/net/dsa/mv88e6xxx/serdes.c
@@ -646,10 +646,12 @@ static irqreturn_t mv88e6390_serdes_thread_fn(int irq, void *dev_id)
 	int err;
 	u8 lane;
 
-	mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
-
 	mv88e6xxx_reg_lock(chip);
 
+	err = mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
+	if (err)
+		goto out;
+
 	switch (cmode) {
 	case MV88E6XXX_PORT_STS_CMODE_SGMII:
 	case MV88E6XXX_PORT_STS_CMODE_1000BASEX:
-- 
2.23.0
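
The ordering change in this patch (take the lock first, then read, and
leave through a single unlock path on error) can be shown in isolation.
The sketch below is a user-space stand-in: the lock is a plain flag so the
example stays single-threaded, and every name in it is invented for the
illustration rather than taken from the mv88e6xxx driver.

```c
#include <errno.h>

enum irq_result { IRQ_RESULT_NONE, IRQ_RESULT_HANDLED };	/* hypothetical */

struct chip {
	int reg_locked;		/* stand-in for the register mutex */
	int cmode[8];		/* 0 means "no SERDES lane for this port" */
};

void chip_reg_lock(struct chip *chip)   { chip->reg_locked = 1; }
void chip_reg_unlock(struct chip *chip) { chip->reg_locked = 0; }

/* Reads per-port state, so it insists on the lock being held. */
int chip_get_lane(struct chip *chip, int port, int *lane)
{
	if (!chip->reg_locked)
		return -EPERM;
	if (port < 0 || port >= 8 || chip->cmode[port] == 0)
		return -ENODEV;
	*lane = 0x10 + port;
	return 0;
}

enum irq_result serdes_thread_fn(struct chip *chip, int port)
{
	enum irq_result ret = IRQ_RESULT_NONE;
	int lane;
	int err;

	chip_reg_lock(chip);

	/* Lane lookup now happens under the lock, and errors are checked. */
	err = chip_get_lane(chip, port, &lane);
	if (err)
		goto out;

	(void)lane;	/* a real handler would read the lane's status here */
	ret = IRQ_RESULT_HANDLED;
out:
	chip_reg_unlock(chip);
	return ret;
}
```

Because every exit goes through the out label, the lock is released on
the error path as well as on success.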


^ permalink raw reply related

* [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: Vivien Didelot @ 2019-08-28 16:26 UTC (permalink / raw)
  To: netdev; +Cc: davem, Marek Behún, f.fainelli, andrew, Vivien Didelot

This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
mv88e6xxx: fully support SERDES on Topaz family").

Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
code private to the mv88e6341_port_set_cmode implementation, instead
of adding yet another operation to the switch info structure.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 8 --------
 drivers/net/dsa/mv88e6xxx/chip.h | 1 -
 drivers/net/dsa/mv88e6xxx/port.c | 9 ++++++++-
 drivers/net/dsa/mv88e6xxx/port.h | 1 -
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 54e88aafba2f..6525075f6bd3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -454,12 +454,6 @@ int mv88e6xxx_port_setup_mac(struct mv88e6xxx_chip *chip, int port, int link,
 			goto restore_link;
 	}
 
-	if (chip->info->ops->port_set_cmode_writable) {
-		err = chip->info->ops->port_set_cmode_writable(chip, port);
-		if (err && err != -EOPNOTSUPP)
-			goto restore_link;
-	}
-
 	if (chip->info->ops->port_set_cmode) {
 		err = chip->info->ops->port_set_cmode(chip, port, mode);
 		if (err && err != -EOPNOTSUPP)
@@ -2919,7 +2913,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = {
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
 	.port_link_state = mv88e6352_port_link_state,
 	.port_get_cmode = mv88e6352_port_get_cmode,
-	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
 	.port_set_cmode = mv88e6341_port_set_cmode,
 	.port_setup_message_port = mv88e6xxx_setup_message_port,
 	.stats_snapshot = mv88e6390_g1_stats_snapshot,
@@ -3618,7 +3611,6 @@ static const struct mv88e6xxx_ops mv88e6341_ops = {
 	.port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
 	.port_link_state = mv88e6352_port_link_state,
 	.port_get_cmode = mv88e6352_port_get_cmode,
-	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
 	.port_set_cmode = mv88e6341_port_set_cmode,
 	.port_setup_message_port = mv88e6xxx_setup_message_port,
 	.stats_snapshot = mv88e6390_g1_stats_snapshot,
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index d6b1aa35aa1a..421e8b84bec3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -400,7 +400,6 @@ struct mv88e6xxx_ops {
 	/* CMODE control what PHY mode the MAC will use, eg. SGMII, RGMII, etc.
 	 * Some chips allow this to be configured on specific ports.
 	 */
-	int (*port_set_cmode_writable)(struct mv88e6xxx_chip *chip, int port);
 	int (*port_set_cmode)(struct mv88e6xxx_chip *chip, int port,
 			      phy_interface_t mode);
 	int (*port_get_cmode)(struct mv88e6xxx_chip *chip, int port, u8 *cmode);
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 542201214c36..4f841335ea32 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -510,7 +510,8 @@ int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 	return mv88e6xxx_port_set_cmode(chip, port, mode);
 }
 
-int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port)
+static int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip,
+					     int port)
 {
 	int err, addr;
 	u16 reg, bits;
@@ -537,6 +538,8 @@ int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port)
 int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 			     phy_interface_t mode)
 {
+	int err;
+
 	if (port != 5)
 		return -EOPNOTSUPP;
 
@@ -551,6 +554,10 @@ int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 		break;
 	}
 
+	err = mv88e6341_port_set_cmode_writable(chip, port);
+	if (err)
+		return err;
+
 	return mv88e6xxx_port_set_cmode(chip, port, mode);
 }
 
diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h
index e78d68c3e671..d4e9bea6e82f 100644
--- a/drivers/net/dsa/mv88e6xxx/port.h
+++ b/drivers/net/dsa/mv88e6xxx/port.h
@@ -336,7 +336,6 @@ int mv88e6097_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
 			       u8 out);
 int mv88e6390_port_pause_limit(struct mv88e6xxx_chip *chip, int port, u8 in,
 			       u8 out);
-int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, int port);
 int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 			     phy_interface_t mode);
 int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
-- 
2.23.0
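
The shape of this refactoring, reduced to its essentials: the "make
writable" step becomes a file-local helper that the public setter calls
itself, so callers no longer need a separate operation. The sketch below
is a self-contained illustration with invented names and a fake register
model; it is not the driver code.

```c
#include <errno.h>

struct chip {
	int writable[8];
	int cmode[8];
};

/* File-local: no longer exposed through an ops table or a header. */
static int port_set_cmode_writable(struct chip *chip, int port)
{
	chip->writable[port] = 1;
	return 0;
}

/* Stand-in for the generic register write; refuses if not writable. */
static int port_write_cmode(struct chip *chip, int port, int mode)
{
	if (!chip->writable[port])
		return -EACCES;
	chip->cmode[port] = mode;
	return 0;
}

/* Public entry point: the only caller of the private helper. */
int port_set_cmode(struct chip *chip, int port, int mode)
{
	int err;

	if (port != 5)
		return -EOPNOTSUPP;

	err = port_set_cmode_writable(chip, port);
	if (err)
		return err;

	return port_write_cmode(chip, port, mode);
}
```

Keeping the helper static means the compiler can flag it if it ever
becomes unused, and the two steps cannot be called out of order by
mistake.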


^ permalink raw reply related

* Re: [PATCH net 1/3] taprio: Fix kernel panic in taprio_destroy
From: Vinicius Costa Gomes @ 2019-08-28 16:31 UTC (permalink / raw)
  To: Vladimir Oltean, jhs, xiyou.wangcong, jiri, davem, vedang.patel,
	leandro.maciel.dorileo
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190828144829.32570-2-olteanv@gmail.com>

Hi,

Vladimir Oltean <olteanv@gmail.com> writes:

> taprio_init may fail earlier than this line:
>
> 	list_add(&q->taprio_list, &taprio_list);
>
> i.e. due to the net device not being multi queue.

Good catch.

>
> Attempting to remove q from the global taprio_list when it is not part
> of it will result in a kernel panic.
>
> Fix it by iterating through the list and removing it only if found.
>
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
>  net/sched/sch_taprio.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> index 540bde009ea5..f1eea8c68011 100644
> --- a/net/sched/sch_taprio.c
> +++ b/net/sched/sch_taprio.c
> @@ -1199,12 +1199,17 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
>  
>  static void taprio_destroy(struct Qdisc *sch)
>  {
> -	struct taprio_sched *q = qdisc_priv(sch);
> +	struct taprio_sched *p, *q = qdisc_priv(sch);
>  	struct net_device *dev = qdisc_dev(sch);
> +	struct list_head *pos, *tmp;
>  	unsigned int i;
>  
>  	spin_lock(&taprio_list_lock);
> -	list_del(&q->taprio_list);
> +	list_for_each_safe(pos, tmp, &taprio_list) {
> +		p = list_entry(pos, struct taprio_sched, taprio_list);
> +		if (p == q)
> +			list_del(&q->taprio_list);
> +	}

Personally, I would do things differently, I am thinking: adding the
taprio instance earlier to the list in taprio_init(), and keeping
taprio_destroy() the way it is now. But take this more as a suggestion
:-)
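
For reference, the "remove only if found" approach from the patch looks
like this when written against a minimal user-space circular list. The
list type and function names are invented for the illustration and only
mimic the kernel's list_head API; locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_node {
	struct list_node *next, *prev;
};

void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

void list_add_node(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlinks n; only valid if n is currently linked on a list. */
void list_del_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

/* Walks the list and unlinks n only if it is found; true if removed. */
bool list_del_if_present(struct list_node *n, struct list_node *head)
{
	for (struct list_node *pos = head->next; pos != head; pos = pos->next) {
		if (pos == n) {
			list_del_node(n);
			return true;
		}
	}
	return false;
}
```

The walk never touches the pointers of a node that was not linked, which
is what avoids the crash. Linking the instance earlier in the init path,
as suggested above, would make the walk unnecessary.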


Cheers,
--
Vinicius


^ permalink raw reply

* [PATCH net-next 0/2] Fixes for unlocked cls hardware offload API refactoring
From: Vlad Buslov @ 2019-08-28 16:41 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, saeedm, idosch, Vlad Buslov

Two fixes for my "Refactor cls hardware offload API to support
rtnl-independent drivers" series.

Vlad Buslov (2):
  net: sched: cls_matchall: cleanup flow_action before deallocating
  net/mlx5e: Move local var definition into ifdef block

 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 ++++--
 net/sched/cls_matchall.c                          | 2 ++
 2 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.21.0


^ permalink raw reply

* [PATCH net-next 1/2] net: sched: cls_matchall: cleanup flow_action before deallocating
From: Vlad Buslov @ 2019-08-28 16:41 UTC (permalink / raw)
  To: netdev; +Cc: jhs, xiyou.wangcong, jiri, davem, saeedm, idosch, Vlad Buslov
In-Reply-To: <20190828164104.6020-1-vladbu@mellanox.com>

Recent rtnl lock removal patch changed flow_action infra to require proper
cleanup besides simple memory deallocation. However, matchall classifier
was not updated to call tc_cleanup_flow_action(). Add proper cleanup to
mall_replace_hw_filter() and mall_reoffload().

Fixes: 5a6ff4b13d59 ("net: sched: take reference to action dev before calling offloads")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
---
 net/sched/cls_matchall.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index 3266f25011cc..7fc2eb62aa98 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -111,6 +111,7 @@ static int mall_replace_hw_filter(struct tcf_proto *tp,
 
 	err = tc_setup_cb_add(block, tp, TC_SETUP_CLSMATCHALL, &cls_mall,
 			      skip_sw, &head->flags, &head->in_hw_count, true);
+	tc_cleanup_flow_action(&cls_mall.rule->action);
 	kfree(cls_mall.rule);
 
 	if (err) {
@@ -313,6 +314,7 @@ static int mall_reoffload(struct tcf_proto *tp, bool add, flow_setup_cb_t *cb,
 	err = tc_setup_cb_reoffload(block, tp, add, cb, TC_SETUP_CLSMATCHALL,
 				    &cls_mall, cb_priv, &head->flags,
 				    &head->in_hw_count);
+	tc_cleanup_flow_action(&cls_mall.rule->action);
 	kfree(cls_mall.rule);
 
 	if (err)
-- 
2.21.0
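
For readers skimming the archive, the ordering this patch restores can be sketched in userspace C: drop whatever the entries hold, then free the container. This is only an illustration under stated assumptions; every name below (fake_dev, fake_rule, ...) is hypothetical and is not the kernel's flow_action API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device. */
struct fake_dev {
	int refcnt;
};

/* Hypothetical stand-in for one offload action entry. */
struct fake_entry {
	struct fake_dev *dev;	/* reference taken at setup time, may be NULL */
};

struct fake_rule {
	unsigned int num_entries;
	struct fake_entry entries[4];
};

/* Take one reference on dev for every entry in the new rule. */
static struct fake_rule *fake_rule_setup(struct fake_dev *dev, unsigned int n)
{
	struct fake_rule *rule = calloc(1, sizeof(*rule));
	unsigned int i;

	if (!rule || n > 4) {
		free(rule);
		return NULL;
	}
	rule->num_entries = n;
	for (i = 0; i < n; i++) {
		rule->entries[i].dev = dev;
		dev->refcnt++;
	}
	return rule;
}

/* Drop the references; must run before the rule memory is freed. */
static void fake_rule_cleanup(struct fake_rule *rule)
{
	unsigned int i;

	for (i = 0; i < rule->num_entries; i++) {
		if (rule->entries[i].dev) {
			rule->entries[i].dev->refcnt--;
			rule->entries[i].dev = NULL;
		}
	}
}

/* Teardown in the order the patch uses: cleanup first, then free. */
static void fake_rule_destroy(struct fake_rule *rule)
{
	fake_rule_cleanup(rule);
	free(rule);
}
```

Calling free() alone in this model would leave the device refcount permanently elevated, which is the kind of leak a plain kfree() cannot prevent.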


^ permalink raw reply related

* [PATCH net-next 2/2] net/mlx5e: Move local var definition into ifdef block
From: Vlad Buslov @ 2019-08-28 16:41 UTC (permalink / raw)
  To: netdev
  Cc: jhs, xiyou.wangcong, jiri, davem, saeedm, idosch, Vlad Buslov,
	tanhuazhong
In-Reply-To: <20190828164104.6020-1-vladbu@mellanox.com>

New local variable "struct flow_block_offload *f" was added to
mlx5e_setup_tc() in recent rtnl lock removal patches. The variable is used
in code that is only compiled when CONFIG_MLX5_ESWITCH is enabled. This
results in a compilation warning about an unused variable when
CONFIG_MLX5_ESWITCH is not set. Move the variable definition from the
beginning of mlx5e_setup_tc() into the eswitch-specific code block.

Fixes: c9f14470d048 ("net: sched: add API for registering unlocked offload block callbacks")
Reported-by: tanhuazhong <tanhuazhong@huawei.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 8592b98d0e70..c10a1fc8e469 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3470,16 +3470,18 @@ static int mlx5e_setup_tc(struct net_device *dev, enum tc_setup_type type,
 			  void *type_data)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
-	struct flow_block_offload *f = type_data;
 
 	switch (type) {
 #ifdef CONFIG_MLX5_ESWITCH
-	case TC_SETUP_BLOCK:
+	case TC_SETUP_BLOCK: {
+		struct flow_block_offload *f = type_data;
+
 		f->unlocked_driver_cb = true;
 		return flow_block_cb_setup_simple(type_data,
 						  &mlx5e_block_cb_list,
 						  mlx5e_setup_tc_block_cb,
 						  priv, priv, true);
+	}
 #endif
 	case TC_SETUP_QDISC_MQPRIO:
 		return mlx5e_setup_tc_mqprio(priv, type_data);
-- 
2.21.0
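
The scoping trick in this patch can be shown in isolation. The sketch below is a userspace illustration, not the mlx5 code: DEMO_HAVE_BLOCK_OFFLOAD is a hypothetical switch standing in for CONFIG_MLX5_ESWITCH, and the types are made up for the example.

```c
/* Hypothetical feature switch standing in for CONFIG_MLX5_ESWITCH. */
#define DEMO_HAVE_BLOCK_OFFLOAD 1

enum demo_setup_type { DEMO_SETUP_BLOCK, DEMO_SETUP_MQPRIO };

struct demo_block_offload {
	int unlocked_driver_cb;
};

static int demo_setup_tc(enum demo_setup_type type, void *type_data)
{
	switch (type) {
#if DEMO_HAVE_BLOCK_OFFLOAD
	case DEMO_SETUP_BLOCK: {
		/* Declared inside the braces, so the variable exists only
		 * when this case is compiled in. With the switch set to 0
		 * there is nothing left to trigger an unused-variable
		 * warning.
		 */
		struct demo_block_offload *f = type_data;

		f->unlocked_driver_cb = 1;
		return 0;
	}
#endif
	case DEMO_SETUP_MQPRIO:
		return 1;
	default:
		return -1;
	}
}
```

The braces after the case label are what make the declaration legal at that point and keep its scope limited to the conditional block.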


^ permalink raw reply related

* Re: [PATCH net 2/3] taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte
From: Vinicius Costa Gomes @ 2019-08-28 16:42 UTC (permalink / raw)
  To: Vladimir Oltean, jhs, xiyou.wangcong, jiri, davem, vedang.patel,
	leandro.maciel.dorileo
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190828144829.32570-3-olteanv@gmail.com>

Vladimir Oltean <olteanv@gmail.com> writes:

> The taprio budget needs to be adapted at runtime according to interface
> link speed. But that handling is problematic.
>
> For one thing, installing a qdisc on an interface that doesn't have
> carrier is not illegal. But taprio prints the following stack trace:
>
> [   31.851373] ------------[ cut here ]------------
> [   31.856024] WARNING: CPU: 1 PID: 207 at net/sched/sch_taprio.c:481 taprio_dequeue+0x1a8/0x2d4
> [   31.864566] taprio: dequeue() called with unknown picos per byte.
> [   31.864570] Modules linked in:
> [   31.873701] CPU: 1 PID: 207 Comm: tc Not tainted 5.3.0-rc5-01199-g8838fe023cd6 #1689
> [   31.881398] Hardware name: Freescale LS1021A
> [   31.885661] [<c03133a4>] (unwind_backtrace) from [<c030d8cc>] (show_stack+0x10/0x14)
> [   31.893368] [<c030d8cc>] (show_stack) from [<c10ac958>] (dump_stack+0xb4/0xc8)
> [   31.900555] [<c10ac958>] (dump_stack) from [<c0349d04>] (__warn+0xe0/0xf8)
> [   31.907395] [<c0349d04>] (__warn) from [<c0349d64>] (warn_slowpath_fmt+0x48/0x6c)
> [   31.914841] [<c0349d64>] (warn_slowpath_fmt) from [<c0f38db4>] (taprio_dequeue+0x1a8/0x2d4)
> [   31.923150] [<c0f38db4>] (taprio_dequeue) from [<c0f227b0>] (__qdisc_run+0x90/0x61c)
> [   31.930856] [<c0f227b0>] (__qdisc_run) from [<c0ec82ac>] (net_tx_action+0x12c/0x2bc)
> [   31.938560] [<c0ec82ac>] (net_tx_action) from [<c0302298>] (__do_softirq+0x130/0x3c8)
> [   31.946350] [<c0302298>] (__do_softirq) from [<c03502a0>] (irq_exit+0xbc/0xd8)
> [   31.953536] [<c03502a0>] (irq_exit) from [<c03a4808>] (__handle_domain_irq+0x60/0xb4)
> [   31.961328] [<c03a4808>] (__handle_domain_irq) from [<c0754478>] (gic_handle_irq+0x58/0x9c)
> [   31.969638] [<c0754478>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
> [   31.977076] Exception stack(0xe8167b20 to 0xe8167b68)
> [   31.982100] 7b20: e9d4bd80 00000cc0 000000cf 00000000 e9d4bd80 c1f38958 00000cc0 c1f38960
> [   31.990234] 7b40: 00000001 000000cf 00000004 e9dc0800 00000000 e8167b70 c0f478ec c0f46d94
> [   31.998363] 7b60: 60070013 ffffffff
> [   32.001833] [<c0301a8c>] (__irq_svc) from [<c0f46d94>] (netlink_trim+0x18/0xd8)
> [   32.009104] [<c0f46d94>] (netlink_trim) from [<c0f478ec>] (netlink_broadcast_filtered+0x34/0x414)
> [   32.017930] [<c0f478ec>] (netlink_broadcast_filtered) from [<c0f47cec>] (netlink_broadcast+0x20/0x28)
> [   32.027102] [<c0f47cec>] (netlink_broadcast) from [<c0eea378>] (rtnetlink_send+0x34/0x88)
> [   32.035238] [<c0eea378>] (rtnetlink_send) from [<c0f25890>] (notify_and_destroy+0x2c/0x44)
> [   32.043461] [<c0f25890>] (notify_and_destroy) from [<c0f25e08>] (qdisc_graft+0x398/0x470)
> [   32.051595] [<c0f25e08>] (qdisc_graft) from [<c0f27a00>] (tc_modify_qdisc+0x3a4/0x724)
> [   32.059470] [<c0f27a00>] (tc_modify_qdisc) from [<c0ee4c84>] (rtnetlink_rcv_msg+0x260/0x2ec)
> [   32.067864] [<c0ee4c84>] (rtnetlink_rcv_msg) from [<c0f4a988>] (netlink_rcv_skb+0xb8/0x110)
> [   32.076172] [<c0f4a988>] (netlink_rcv_skb) from [<c0f4a170>] (netlink_unicast+0x1b4/0x22c)
> [   32.084392] [<c0f4a170>] (netlink_unicast) from [<c0f4a5e4>] (netlink_sendmsg+0x33c/0x380)
> [   32.092614] [<c0f4a5e4>] (netlink_sendmsg) from [<c0ea9f40>] (sock_sendmsg+0x14/0x24)
> [   32.100403] [<c0ea9f40>] (sock_sendmsg) from [<c0eaa780>] (___sys_sendmsg+0x214/0x228)
> [   32.108279] [<c0eaa780>] (___sys_sendmsg) from [<c0eabad0>] (__sys_sendmsg+0x50/0x8c)
> [   32.116068] [<c0eabad0>] (__sys_sendmsg) from [<c0301000>] (ret_fast_syscall+0x0/0x54)
> [   32.123938] Exception stack(0xe8167fa8 to 0xe8167ff0)
> [   32.128960] 7fa0:                   b6fa68c8 000000f8 00000003 bea142d0 00000000 00000000
> [   32.137093] 7fc0: b6fa68c8 000000f8 0052154c 00000128 5d6468a2 00000000 00000028 00558c9c
> [   32.145224] 7fe0: 00000070 bea14278 00530d64 b6e17e64
> [   32.150659] ---[ end trace 2139c9827c3e5177 ]---
>
> This happens because the qdisc ->dequeue callback gets called. Which
> again is not illegal, the qdisc will dequeue even when the interface is
> up but doesn't have carrier (and hence SPEED_UNKNOWN), and the frames
> will be dropped further down the stack in dev_direct_xmit().
>
> And, at the end of the day, for what? For calculating the initial budget
> of an interface which is non-operational at the moment and where frames
> will get dropped anyway.
>
> So if we can't figure out the link speed, default to SPEED_10 and move
> along. We can also remove the runtime check now.
>
> Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
> Fixes: 7b9eba7ba0c1 ("net/sched: taprio: fix picos_per_byte miscalculation")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---

Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
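
As a rough illustration of the fallback discussed above, here is a userspace sketch of a picoseconds-per-byte calculation that defaults to 10 Mbps when the speed is unknown. The function and macro names are hypothetical, and treating a speed of 0 as unknown is an assumption of this sketch rather than something the patch states.

```c
#include <stdint.h>

#define DEMO_SPEED_UNKNOWN (-1)	/* ethtool reports unknown speed as -1 */
#define DEMO_SPEED_10      10	/* Mbps */

/* Picoseconds needed to transmit one byte at speed_mbps.
 * Unknown or non-positive speeds fall back to 10 Mbps.
 */
static int64_t demo_picos_per_byte(int speed_mbps)
{
	const int64_t picos_per_sec = 1000000000000LL;

	if (speed_mbps == DEMO_SPEED_UNKNOWN || speed_mbps <= 0)
		speed_mbps = DEMO_SPEED_10;

	/* 8 bits per byte; speed is in Mbit/s, i.e. 10^6 bit/s. */
	return (picos_per_sec * 8) / ((int64_t)speed_mbps * 1000000LL);
}
```

With this fallback the result is always a positive, finite budget, so a dequeue path has no "unknown" state left to warn about.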


^ permalink raw reply

* Re: [PATCH net 3/3] net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate
From: Vinicius Costa Gomes @ 2019-08-28 16:45 UTC (permalink / raw)
  To: Vladimir Oltean, jhs, xiyou.wangcong, jiri, davem, vedang.patel,
	leandro.maciel.dorileo
  Cc: netdev, Vladimir Oltean
In-Reply-To: <20190828144829.32570-4-olteanv@gmail.com>

Vladimir Oltean <olteanv@gmail.com> writes:

> The discussion to be made is absolutely the same as in the case of
> previous patch ("taprio: Set default link speed to 10 Mbps in
> taprio_set_picos_per_byte"). Nothing is lost when setting a default.
>
> Cc: Leandro Dorileo <leandro.maciel.dorileo@intel.com>
> Fixes: e0a7683d30e9 ("net/sched: cbs: fix port_rate miscalculation")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---

Hm, taking another look at cbs, it has a problem similar to the one
your patch 1/3 solves for taprio. I will propose a patch in a few
moments.

For this one:

Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>


Cheers,
--
Vinicius

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: Marek Behún @ 2019-08-28 16:48 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, f.fainelli, andrew
In-Reply-To: <20190828162611.10064-1-vivien.didelot@gmail.com>

On Wed, 28 Aug 2019 12:26:11 -0400
Vivien Didelot <vivien.didelot@gmail.com> wrote:

> This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
> mv88e6xxx: create serdes_get_lane chip operation").
> 
> The .serdes_get_lane implementations access the CMODE of a port,
> even though it is cached at the moment, it is safer to call them
> after the mutex is locked, not before.
> 
> At the same time, check for an eventual error and return IRQ_DONE,
> instead of blindly ignoring it.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/serdes.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/serdes.c b/drivers/net/dsa/mv88e6xxx/serdes.c
> index 9424e401dbc7..38c0da2492c0 100644
> --- a/drivers/net/dsa/mv88e6xxx/serdes.c
> +++ b/drivers/net/dsa/mv88e6xxx/serdes.c
> @@ -646,10 +646,12 @@ static irqreturn_t mv88e6390_serdes_thread_fn(int irq, void *dev_id)
>  	int err;
>  	u8 lane;
>  
> -	mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
> -
>  	mv88e6xxx_reg_lock(chip);
>  
> +	err = mv88e6xxx_serdes_get_lane(chip, port->port, &lane);
> +	if (err)
> +		goto out;
> +
>  	switch (cmode) {
>  	case MV88E6XXX_PORT_STS_CMODE_SGMII:
>  	case MV88E6XXX_PORT_STS_CMODE_1000BASEX:

Reviewed-by: Marek Behún <marek.behun@nic.cz>
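
The shape of the change (take the lock, then read the protected state, and leave through a single exit that drops the lock) can be sketched outside the driver. Everything below is a hypothetical userspace model; the demo_* names and the boolean "lock" are stand-ins, not the mv88e6xxx API.

```c
#include <stdbool.h>

struct demo_chip {
	bool locked;
	int cmode;		/* cached port mode, protected by the lock */
};

static void demo_lock(struct demo_chip *chip)   { chip->locked = true; }
static void demo_unlock(struct demo_chip *chip) { chip->locked = false; }

/* Reads state protected by the lock; refuses to run without it. */
static int demo_get_lane(struct demo_chip *chip, unsigned char *lane)
{
	if (!chip->locked)
		return -1;	/* caller forgot to take the lock */
	if (chip->cmode < 0)
		return -2;	/* no lane for this mode */
	*lane = (unsigned char)chip->cmode;
	return 0;
}

/* Returns 1 if the event was handled, 0 otherwise. */
static int demo_irq_thread(struct demo_chip *chip)
{
	unsigned char lane;
	int handled = 0;
	int err;

	demo_lock(chip);

	err = demo_get_lane(chip, &lane);
	if (err)
		goto out;	/* single exit path still drops the lock */

	handled = 1;
out:
	demo_unlock(chip);
	return handled;
}
```

The error check matters as much as the ordering: without it, a failed lookup would leave lane uninitialized for the code that follows.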

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: Marek Behún @ 2019-08-28 16:49 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, f.fainelli, andrew
In-Reply-To: <20190828162659.10306-1-vivien.didelot@gmail.com>

On Wed, 28 Aug 2019 12:26:59 -0400
Vivien Didelot <vivien.didelot@gmail.com> wrote:

> This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
> mv88e6xxx: fully support SERDES on Topaz family").
> 
> Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
> mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
> code private to the mv88e6341_port_set_cmode implementation, instead
> of adding yet another operation to the switch info structure.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c | 8 --------
>  drivers/net/dsa/mv88e6xxx/chip.h | 1 -
>  drivers/net/dsa/mv88e6xxx/port.c | 9 ++++++++-
>  drivers/net/dsa/mv88e6xxx/port.h | 1 -
>  4 files changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c
> b/drivers/net/dsa/mv88e6xxx/chip.c index 54e88aafba2f..6525075f6bd3
> 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -454,12 +454,6 @@ int mv88e6xxx_port_setup_mac(struct
> mv88e6xxx_chip *chip, int port, int link, goto restore_link;
>  	}
>  
> -	if (chip->info->ops->port_set_cmode_writable) {
> -		err = chip->info->ops->port_set_cmode_writable(chip,
> port);
> -		if (err && err != -EOPNOTSUPP)
> -			goto restore_link;
> -	}
> -
>  	if (chip->info->ops->port_set_cmode) {
>  		err = chip->info->ops->port_set_cmode(chip, port,
> mode); if (err && err != -EOPNOTSUPP)
> @@ -2919,7 +2913,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops
> = { .port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
>  	.port_link_state = mv88e6352_port_link_state,
>  	.port_get_cmode = mv88e6352_port_get_cmode,
> -	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
>  	.port_set_cmode = mv88e6341_port_set_cmode,
>  	.port_setup_message_port = mv88e6xxx_setup_message_port,
>  	.stats_snapshot = mv88e6390_g1_stats_snapshot,
> @@ -3618,7 +3611,6 @@ static const struct mv88e6xxx_ops mv88e6341_ops
> = { .port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
>  	.port_link_state = mv88e6352_port_link_state,
>  	.port_get_cmode = mv88e6352_port_get_cmode,
> -	.port_set_cmode_writable = mv88e6341_port_set_cmode_writable,
>  	.port_set_cmode = mv88e6341_port_set_cmode,
>  	.port_setup_message_port = mv88e6xxx_setup_message_port,
>  	.stats_snapshot = mv88e6390_g1_stats_snapshot,
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.h
> b/drivers/net/dsa/mv88e6xxx/chip.h index d6b1aa35aa1a..421e8b84bec3
> 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h
> +++ b/drivers/net/dsa/mv88e6xxx/chip.h
> @@ -400,7 +400,6 @@ struct mv88e6xxx_ops {
>  	/* CMODE control what PHY mode the MAC will use, eg. SGMII,
> RGMII, etc.
>  	 * Some chips allow this to be configured on specific ports.
>  	 */
> -	int (*port_set_cmode_writable)(struct mv88e6xxx_chip *chip,
> int port); int (*port_set_cmode)(struct mv88e6xxx_chip *chip, int
> port, phy_interface_t mode);
>  	int (*port_get_cmode)(struct mv88e6xxx_chip *chip, int port,
> u8 *cmode); diff --git a/drivers/net/dsa/mv88e6xxx/port.c
> b/drivers/net/dsa/mv88e6xxx/port.c index 542201214c36..4f841335ea32
> 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c
> +++ b/drivers/net/dsa/mv88e6xxx/port.c
> @@ -510,7 +510,8 @@ int mv88e6390_port_set_cmode(struct
> mv88e6xxx_chip *chip, int port, return mv88e6xxx_port_set_cmode(chip,
> port, mode); }
>  
> -int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip,
> int port) +static int mv88e6341_port_set_cmode_writable(struct
> mv88e6xxx_chip *chip,
> +					     int port)
>  {
>  	int err, addr;
>  	u16 reg, bits;
> @@ -537,6 +538,8 @@ int mv88e6341_port_set_cmode_writable(struct
> mv88e6xxx_chip *chip, int port) int mv88e6341_port_set_cmode(struct
> mv88e6xxx_chip *chip, int port, phy_interface_t mode)
>  {
> +	int err;
> +
>  	if (port != 5)
>  		return -EOPNOTSUPP;
>  
> @@ -551,6 +554,10 @@ int mv88e6341_port_set_cmode(struct
> mv88e6xxx_chip *chip, int port, break;
>  	}
>  
> +	err = mv88e6341_port_set_cmode_writable(chip, port);
> +	if (err)
> +		return err;
> +
>  	return mv88e6xxx_port_set_cmode(chip, port, mode);
>  }
>  
> diff --git a/drivers/net/dsa/mv88e6xxx/port.h
> b/drivers/net/dsa/mv88e6xxx/port.h index e78d68c3e671..d4e9bea6e82f
> 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h
> +++ b/drivers/net/dsa/mv88e6xxx/port.h
> @@ -336,7 +336,6 @@ int mv88e6097_port_pause_limit(struct
> mv88e6xxx_chip *chip, int port, u8 in, u8 out);
>  int mv88e6390_port_pause_limit(struct mv88e6xxx_chip *chip, int
> port, u8 in, u8 out);
> -int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip,
> int port); int mv88e6341_port_set_cmode(struct mv88e6xxx_chip *chip,
> int port, phy_interface_t mode);
>  int mv88e6390_port_set_cmode(struct mv88e6xxx_chip *chip, int port,

Reviewed-by: Marek Behún <marek.behun@nic.cz>

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: mv88e6xxx: get serdes lane after lock
From: Andrew Lunn @ 2019-08-28 16:49 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, Marek Behún, f.fainelli
In-Reply-To: <20190828162611.10064-1-vivien.didelot@gmail.com>

On Wed, Aug 28, 2019 at 12:26:11PM -0400, Vivien Didelot wrote:
> This is a follow-up patch for commit 17deaf5cb37a ("net: dsa:
> mv88e6xxx: create serdes_get_lane chip operation").
> 
> The .serdes_get_lane implementations access the CMODE of a port,
> even though it is cached at the moment, it is safer to call them
> after the mutex is locked, not before.
> 
> At the same time, check for an eventual error and return IRQ_DONE,
> instead of blindly ignoring it.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next] net: dsa: mv88e6xxx: keep CMODE writable code private
From: Andrew Lunn @ 2019-08-28 16:51 UTC (permalink / raw)
  To: Vivien Didelot; +Cc: netdev, davem, Marek Behún, f.fainelli
In-Reply-To: <20190828162659.10306-1-vivien.didelot@gmail.com>

On Wed, Aug 28, 2019 at 12:26:59PM -0400, Vivien Didelot wrote:
> This is a follow-up patch for commit 7a3007d22e8d ("net: dsa:
> mv88e6xxx: fully support SERDES on Topaz family").
> 
> Since .port_set_cmode is only called from mv88e6xxx_port_setup_mac and
> mv88e6xxx_phylink_mac_config, it is fine to keep this "make writable"
> code private to the mv88e6341_port_set_cmode implementation, instead
> of adding yet another operation to the switch info structure.
> 
> Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net 1/3] taprio: Fix kernel panic in taprio_destroy
From: Vladimir Oltean @ 2019-08-28 16:51 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: jhs, xiyou.wangcong, Jiri Pirko, David S. Miller, vedang.patel,
	leandro.maciel.dorileo, netdev
In-Reply-To: <87a7btqmk7.fsf@intel.com>

Hi Vinicius,

On Wed, 28 Aug 2019 at 19:31, Vinicius Costa Gomes
<vinicius.gomes@intel.com> wrote:
>
> Hi,
>
> Vladimir Oltean <olteanv@gmail.com> writes:
>
> > taprio_init may fail earlier than this line:
> >
> >       list_add(&q->taprio_list, &taprio_list);
> >
> > i.e. due to the net device not being multi queue.
>
> Good catch.
>
> >
> > Attempting to remove q from the global taprio_list when it is not part
> > of it will result in a kernel panic.
> >
> > Fix it by iterating through the list and removing it only if found.
> >
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> >  net/sched/sch_taprio.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
> > index 540bde009ea5..f1eea8c68011 100644
> > --- a/net/sched/sch_taprio.c
> > +++ b/net/sched/sch_taprio.c
> > @@ -1199,12 +1199,17 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
> >
> >  static void taprio_destroy(struct Qdisc *sch)
> >  {
> > -     struct taprio_sched *q = qdisc_priv(sch);
> > +     struct taprio_sched *p, *q = qdisc_priv(sch);
> >       struct net_device *dev = qdisc_dev(sch);
> > +     struct list_head *pos, *tmp;
> >       unsigned int i;
> >
> >       spin_lock(&taprio_list_lock);
> > -     list_del(&q->taprio_list);
> > +     list_for_each_safe(pos, tmp, &taprio_list) {
> > +             p = list_entry(pos, struct taprio_sched, taprio_list);
> > +             if (p == q)
> > +                     list_del(&q->taprio_list);
> > +     }
>
> Personally, I would do things differently, I am thinking: adding the
> taprio instance earlier to the list in taprio_init(), and keeping
> taprio_destroy() the way it is now. But take this more as a suggestion
> :-)
>

While I don't strongly oppose your proposal (keep the list removal
unconditional, but match it better in placement to the list addition),
I think it's rather fragile and I do see this bug recurring in the
future. Anyway if you want to keep it "simpler" I can respin it like
that.

>
> Cheers,
> --
> Vinicius
>

Regards,
-Vladimir
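
To make the two positions in this exchange concrete, here is a small userspace model of the "remove only if found" approach from the patch. It uses a hand-rolled circular list, not the kernel's list.h, and all demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdbool.h>

struct demo_list { struct demo_list *next, *prev; };

static void demo_list_init(struct demo_list *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add(struct demo_list *node, struct demo_list *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unconditional unlink: dereferences node->prev and node->next, so it
 * is only safe for a node that was actually added.
 */
static void demo_list_del(struct demo_list *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = NULL;
	node->prev = NULL;
}

/* Unlink node only if it is really on the list headed by head.
 * Returns true if it was found and removed.
 */
static bool demo_list_del_if_present(struct demo_list *node,
				     struct demo_list *head)
{
	struct demo_list *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (pos == node) {
			demo_list_del(node);
			return true;
		}
	}
	return false;
}
```

The alternative suggested in the review avoids the walk entirely by making sure the add always happens before any failure path that can reach destroy; the sketch above does not model that variant.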

^ permalink raw reply

* Re: [PATCH net-next v5] sched: Add dualpi2 qdisc
From: Dave Taht @ 2019-08-28 16:55 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: Tilmans, Olivier (Nokia - BE/Antwerp), Eric Dumazet,
	Stephen Hemminger, Olga Albisser,
	De Schepper, Koen (Nokia - BE/Antwerp), Henrik Steen,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <bded966b-5176-69c8-4ac3-70d81d344c22@bobbriscoe.net>

On Wed, Aug 28, 2019 at 7:00 AM Bob Briscoe <research@bobbriscoe.net> wrote:
>
> Olivier, Dave,
>
> On 23/08/2019 13:59, Tilmans, Olivier (Nokia - BE/Antwerp) wrote:
>
> as best as I can
> tell (but could be wrong) the NQB idea wants to put something into the
> l4s fast queue? Or is NQB supposed to
> be a third queue?
>
> NQB is not supported in this release of the code. But FYI, it's not for a third queue.

At the time of my code review of dualpi I had not gone back to review
the NQB draft fully.

> We can add support for NQB in the future, by expanding the
> dualpi2_skb_classify() function. This is however out of scope at the
> moment as NQB is not yet adopted by the TSV WG. I'd guess we may want more

> than just the NQB DSCP codepoint in the L queue, which then warrant
> another way to classify traffic, e.g., using tc filter hints.

Yes, you'll find folk are fans of being able to put tc (and ebpf)
filters in front of various qdiscs for classification, logging, and/or
dropping behavior.

A fairly typical stanza is here:
https://github.com/torvalds/linux/blob/master/net/sched/sch_sfq.c#L171
to line 193.

> The IETF adopted the NQB draft at the meeting just passed in July, but the draft has not yet been updated to reflect that: https://tools.ietf.org/html/draft-white-tsvwg-nqb-02

Hmmm... no. I think Olivier's statement was correct.

NQB was put into the "call for adoption into tsvwg" state (
https://mailarchive.ietf.org/arch/msg/tsvwg/fjyYQgU9xQCNalwPO7v9-al6mGk
) in the tsvwg on Aug 21st, which doesn't mean "adopted by the ietf",
either. In response to that call several folk did put in (rather
pithy) comments on the current state of the NQB idea and internet
draft, starting here:

https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk

For those here that are not familiar with IETF processes (and there
are many!) there are "internet drafts" that may or may not become
working group items, that if they become accepted by the working group
may or may not evolve to become actual RFCs.  Unlike lkml usage where
we use RFC in its original meaning as a mere request for comments,
there are several classes of IETF RFC - standards track, experimental,
and informational - whenever they are adopted and published by the
ietf.

There are RFCs for how they do RFCs, and BCPs and other TLAs, and if
you really want to know more about how the ietf processes actually
work, please contact me off list. Anyway...

Much of the experimental L4S architecture itself (of which NQB MAY
become part, and dualpi/tcpprague/etc are) is presently an accepted
tsvwg wg item with a list of 11 problems on the bug database here (
https://trac.ietf.org/trac/tsvwg/report/1?sort=ticket&asc=1&page=1 ).
IMHO it's not currently near last call for standardization as a set of
experimental RFCs.

L4S takes advantage of several RFCs that have
indeed been published as experimental, notably, RFC8311, which too few
have read as yet.

While using up ECT1 in the L4S code as an identifier and not as a
congestion indicator is very controversial for me (
https://lwn.net/Articles/783673/ ), AND I'd rather it not be baked
into the linux api for dualpi should this identifier not be chosen by
the wg (thus my suggestion of a mask or lookup table)...

... I also dearly would like both sides of this code - dualpi and tcp
prague - in a simultaneously testable and high quality state. Without
that, many core ideas in dualpi cannot be tested, nor objectively
evaluated against other tcps and qdiscs using rfc3168 behavior along
the path. Multiple experimental ideas in RFC8311 (such as those in
section 4.3) have also not been re-evaluated in any context.

Is the known-to-work reference codebase for "tcp prague" still 3.19 based?

> The draft requests 0x2A (decimal 42) as the DSCP but, until the IETF converges on a specific DSCP for NQB, I believe we should not code in a default classifier anyway.
>
>
>
> Bob
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/



--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740
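
For readers less familiar with the fields being argued over: the DSCP is the upper six bits of the IPv4 TOS / IPv6 traffic class byte and the ECN field is the lower two. The sketch below shows the extraction and a mask-based queue selection of the kind suggested above. It is a hypothetical userspace illustration; demo_classify() and its l_mask parameter are assumptions of this sketch and not the dualpi2 code.

```c
#include <stdint.h>

#define DEMO_ECN_MASK    0x03u
#define DEMO_ECN_NOT_ECT 0x00u
#define DEMO_ECN_ECT1    0x01u
#define DEMO_ECN_ECT0    0x02u
#define DEMO_ECN_CE      0x03u

/* Upper six bits of the TOS / traffic class byte. */
static unsigned int demo_dscp(uint8_t tos) { return tos >> 2; }

/* Lower two bits of the TOS / traffic class byte. */
static unsigned int demo_ecn(uint8_t tos)  { return tos & DEMO_ECN_MASK; }

/* Returns 1 for the low-latency queue, 0 for the classic queue.
 * A packet goes to the low-latency queue when (ecn & l_mask) != 0,
 * so the identifier is a runtime parameter rather than a constant.
 */
static int demo_classify(uint8_t tos, unsigned int l_mask)
{
	return (demo_ecn(tos) & l_mask) != 0;
}
```

With l_mask set to 0x1 only ECT(1) and CE select the low-latency queue; with 0x3 any ECN-capable packet does. A DSCP of 0x2A corresponds to a TOS byte of 0xA8 when the ECN bits are zero.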

^ permalink raw reply

* Re: [PATCH v1 net-next] net: phy: mdio_bus: make mdiobus_scan also cover PHY that only talks C45
From: Florian Fainelli @ 2019-08-28 17:00 UTC (permalink / raw)
  To: Ong, Boon Leong, Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Voon, Weifeng,
	Heiner Kallweit
In-Reply-To: <AF233D1473C1364ABD51D28909A1B1B75C22CD3C@pgsmsx114.gar.corp.intel.com>

On 8/28/19 8:41 AM, Ong, Boon Leong wrote:
>> On Tue, Aug 27, 2019 at 03:23:34PM +0000, Voon, Weifeng wrote:
>>>>>> Make mdiobus_scan() to try harder to look for any PHY that only
>>>> talks C45.
>>>>> If you are not using Device Tree or ACPI, and you are letting the MDIO
>>>>> bus be scanned, it sounds like there should be a way for you to
>>>>> provide a hint as to which addresses should be scanned (that's
>>>>> mii_bus::phy_mask) and possibly enhance that with a mask of possible
>>>>> C45 devices?
>>>>
>>>> Yes, i don't like this unconditional c45 scanning. A lot of MDIO bus
>>>> drivers don't look for the MII_ADDR_C45. They are going to do a C22
>>>> transfer, and maybe not mask out the MII_ADDR_C45 from reg, causing an
>>>> invalid register write. Bad things can then happen.
>>>>
>>>> With DT and ACPI, we have an explicit indication that C45 should be used,
>>>> so we know on this platform C45 is safe to use. We need something
>>>> similar when not using DT or ACPI.
>>>>
>>>> 	  Andrew
>>>
>>> Florian and Andrew,
>>> The mdio c22 is using the start-of-frame ST=01 while mdio c45 is using ST=00
>>> as identifier. So mdio c22 device will not respond to mdio c45 protocol.
>>> As IEEE 802.3ae-2002 Annex 45A.3 mentions:
>>> " Even though the Clause 45 MDIO frames using the ST=00 frame code
>>> will also be driven on to the Clause 22 MII Management interface,
>>> the Clause 22 PHYs will ignore the frames. "
>>>
>>> Hence, I am not seeing any concern that the c45 scanning will mess up with
>>> c22 devices.
>>
>> Hi Voon
>>
>> Take for example mdio-hisi-femac.c
>>
>> static int hisi_femac_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
>> {
>>        struct hisi_femac_mdio_data *data = bus->priv;
>>        int ret;
>>
>>        ret = hisi_femac_mdio_wait_ready(data);
>>        if (ret)
>>                return ret;
>>
>>        writel((mii_id << BIT_PHY_ADDR_OFFSET) | regnum,
>>               data->membase + MDIO_RWCTRL);
>>
>>
>> There is no check here for MII_ADDR_C45. So it will perform a C22
>> transfer. And regnum will still have MII_ADDR_C45 in it, so the
>> writel() is going to set bit 30, since #define MII_ADDR_C45
>> (1<<30). What happens on this hardware under these conditions?
>>
>> You cannot unconditionally ask an MDIO driver to do a C45
>> transfer. Some drivers are going to do bad things.
> 
> Andrew & Florian, thanks for your review of this patch and your insights on it.
> We will look into the implementation as suggested, as follows:
> 
> - for each bit clear in mii_bus::phy_mask, scan it as C22
> - for each bit clear in mii_bus::phy_c45_mask, scan it as C45
> 
> We will work on this and resubmit soonest. 

Sounds good. If you do not need to scan the MDIO bus, another approach
is to call get_phy_device() by passing the is_c45 boolean to true in
order to connect directly to a C45 device for which you already know the
address.

Assuming this is done for the stmmac PCI changes that you have
submitted, and that those cards have a fixed set of addresses for their
PHYs, maybe scanning the bus is overkill?
-- 
Florian
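
The two points made in this thread (a C22-only accessor should reject C45 requests, and C45 scanning should be opt-in per address) can be sketched in userspace C. This is an illustration under assumptions: phy_c45_mask is only a proposal in the message above, the 8-bit address shift is invented for the example, and none of the demo_* names are real kernel symbols.

```c
#include <stdint.h>

#define DEMO_MII_ADDR_C45   (1u << 30)	/* flag bit quoted in the thread */
#define DEMO_PHY_MAX_ADDR   32

/* A C22-only accessor that rejects C45 requests instead of letting the
 * flag bit leak into the value written to hardware.
 */
static int demo_c22_only_read(int mii_id, uint32_t regnum, uint32_t *hw_word)
{
	if (regnum & DEMO_MII_ADDR_C45)
		return -1;	/* stand-in for -EOPNOTSUPP */
	*hw_word = ((uint32_t)mii_id << 8) | regnum;	/* shift is made up */
	return 0;
}

/* Count the probes a scan would issue: one C22 probe per clear bit in
 * phy_mask, one C45 probe per clear bit in phy_c45_mask.
 */
static void demo_scan(uint32_t phy_mask, uint32_t phy_c45_mask,
		      int *c22_probes, int *c45_probes)
{
	int addr;

	*c22_probes = 0;
	*c45_probes = 0;
	for (addr = 0; addr < DEMO_PHY_MAX_ADDR; addr++) {
		if (!(phy_mask & (1u << addr)))
			(*c22_probes)++;
		if (!(phy_c45_mask & (1u << addr)))
			(*c45_probes)++;
	}
}
```

In this model a C45 mask of all ones means no C45 probing at all, so existing buses would see no change unless a driver clears bits on purpose.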

^ permalink raw reply

* RE: [PATCH v1 net-next] net: stmmac: Add support for MDIO interrupts
From: Voon, Weifeng @ 2019-08-28 17:07 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, Ong, Boon Leong
In-Reply-To: <cac5aba0-b47b-00c6-f99b-64c6b385308a@gmail.com>

> >> DW EQoS v5.xx controllers added capability for interrupt generation
> >> when MDIO interface is done (GMII Busy bit is cleared).
> >> This patch adds support for this interrupt on supported HW to avoid
> >> polling on GMII Busy bit.
> >>
> >> stmmac_mdio_read() & stmmac_mdio_write() will sleep until wake_up()
> >> is called by the interrupt handler.
> >
> > Hi Voon
> >
> > I _think_ there are some order of operation issues here. The mdiobus
> > is registered in the probe function. As soon as of_mdiobus_register()
> > is called, the MDIO bus must work. At that point MDIO read/writes can
> > start to happen.
> >
> > As far as i can see, the interrupt handler is only requested in
> > stmmac_open(). So it seems like any MDIO operations after probe, but
> > before open are going to fail?
> 
> AFAIR, wait_event_timeout() will continue to busy loop and wait until
> the timeout, but not return an error because the polled condition was
> true, at least that is my recollection from having the same issue with
> the bcmgenet driver before it was moved to connecting to the PHY in the
> ndo_open() function.
> --
> Florian

Florian is right: the poll condition is still true after the timeout.
Hence, any mdio operation after probe and before ndo_open will still work.
The only drawback is that attaching the PHY will take the full timeout
for each mdio_read and mdio_write.
So should we attach the PHY only after the interrupt handler is requested?
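
The behaviour described here can be modelled without any kernel code. The sketch below simulates a timed wait whose condition is re-checked once the timeout expires: with no interrupt handler registered the operation still succeeds, but only after sleeping for the whole timeout. The struct and function names are hypothetical, and ticks are abstract units.

```c
#include <stdbool.h>

/* Simulated device: the busy bit clears after busy_ticks, and a
 * wake-up is delivered only if irq_requested is set.
 */
struct demo_mdio {
	long busy_ticks;
	bool irq_requested;
};

/* Models the return convention of a timed wait:
 *   > 0   condition true (remaining ticks, at least 1)
 *   == 0  condition still false when the timeout expired
 * *elapsed reports how long the caller actually slept.
 */
static long demo_wait_done(const struct demo_mdio *m, long timeout,
			   long *elapsed)
{
	if (m->irq_requested && m->busy_ticks < timeout) {
		/* Woken by the interrupt as soon as the bit clears. */
		*elapsed = m->busy_ticks;
		return timeout - m->busy_ticks;
	}
	/* No wake-up: sleep for the whole timeout, then re-check. */
	*elapsed = timeout;
	return m->busy_ticks <= timeout ? 1 : 0;
}
```

The middle case (no interrupt, bit already clear) is the one in question: the call reports success, yet every access costs a full timeout.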
 

^ permalink raw reply

* Re: [PATCH net-next 03/15] net: sgi: ioc3-eth: remove checkpatch errors/warning
From: Joe Perches @ 2019-08-28 17:10 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Ralf Baechle, Paul Burton, James Hogan,
	David S. Miller, linux-mips, linux-kernel, netdev
In-Reply-To: <20190828140315.17048-4-tbogendoerfer@suse.de>

On Wed, 2019-08-28 at 16:03 +0200, Thomas Bogendoerfer wrote:
> Before massaging the driver further fix oddities found by checkpatch like
> - wrong indentation
> - comment formatting
> - use of printk instead of netdev_xxx/pr_xxx

trivial notes:

Please try to make the code better rather than merely
shutting up checkpatch.

> diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
[]
> @@ -209,8 +201,7 @@ static inline void nic_write_bit(u32 __iomem *mcr, int bit)
>  	nic_wait(mcr);
>  }
>  
> -/*
> - * Read a byte from an iButton device
> +/* Read a byte from an iButton device
>   */

These comment styles would be simpler on a single line

/* Read a byte from an iButton device */

>  static u32 nic_read_byte(u32 __iomem *mcr)
>  {
> @@ -223,8 +214,7 @@ static u32 nic_read_byte(u32 __iomem *mcr)
>  	return result;
>  }
>  
> -/*
> - * Write a byte to an iButton device
> +/* Write a byte to an iButton device
>   */

/* Write a byte to an iButton device */

etc...

[]
> @@ -323,16 +315,15 @@ static int nic_init(u32 __iomem *mcr)
>  		break;
>  	}
>  
> -	printk("Found %s NIC", type);
> +	pr_info("Found %s NIC", type);
>  	if (type != unknown)
> -		printk (" registration number %pM, CRC %02x", serial, crc);
> -	printk(".\n");
> +		pr_cont(" registration number %pM, CRC %02x", serial, crc);
> +	pr_cont(".\n");

This code would be more sensible as

	if (type != unknown)
		pr_info("Found %s NIC registration number %pM, CRC %02x\n",
			type, serial, crc);
	else
		pr_info("Found %s NIC\n", type); 

Though I don't know if registration number is actually a MAC
address or something else.  If it's just a 6 byte identifier
that uses colon separation it should probably use "%6phC"
instead of "%pM"

[] 

> @@ -645,22 +636,21 @@ static inline void ioc3_tx(struct net_device *dev)
>  static void ioc3_error(struct net_device *dev, u32 eisr)
>  {
>  	struct ioc3_private *ip = netdev_priv(dev);
> -	unsigned char *iface = dev->name;
>  
>  	spin_lock(&ip->ioc3_lock);
>  
>  	if (eisr & EISR_RXOFLO)
> -		printk(KERN_ERR "%s: RX overflow.\n", iface);
> +		netdev_err(dev, "RX overflow.\n");
>  	if (eisr & EISR_RXBUFOFLO)
> -		printk(KERN_ERR "%s: RX buffer overflow.\n", iface);
> +		netdev_err(dev, "RX buffer overflow.\n");
>  	if (eisr & EISR_RXMEMERR)
> -		printk(KERN_ERR "%s: RX PCI error.\n", iface);
> +		netdev_err(dev, "RX PCI error.\n");
>  	if (eisr & EISR_RXPARERR)
> -		printk(KERN_ERR "%s: RX SSRAM parity error.\n", iface);
> +		netdev_err(dev, "RX SSRAM parity error.\n");
>  	if (eisr & EISR_TXBUFUFLO)
> -		printk(KERN_ERR "%s: TX buffer underflow.\n", iface);
> +		netdev_err(dev, "TX buffer underflow.\n");
>  	if (eisr & EISR_TXMEMERR)
> -		printk(KERN_ERR "%s: TX PCI error.\n", iface);
> +		netdev_err(dev, "TX PCI error.\n");

All of these should probably be ratelimited() output.
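
As a rough illustration of what rate limiting buys here, the following is a small userspace model of a windowed limiter: at most "burst" messages are allowed per "interval", and the rest are counted as missed. The struct and function names (rl_state, rl_allow) are made up for this sketch and are not the kernel's implementation; in the driver itself the usual approach would be to gate each message with something like net_ratelimit().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of a windowed rate limiter. */
struct rl_state {
	uint64_t interval;	/* window length, in arbitrary ticks */
	unsigned int burst;	/* messages allowed per window */
	uint64_t begin;		/* start of the current window */
	unsigned int printed;	/* messages allowed in the current window */
	unsigned int missed;	/* messages suppressed so far */
};

/* Return true if a message may be emitted at time 'now'. */
static bool rl_allow(struct rl_state *rs, uint64_t now)
{
	/* Start a new window once the current one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}
```

With interval = 100 and burst = 2, a storm of error interrupts inside one window produces two log lines instead of one per interrupt.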



^ permalink raw reply

* Re: [RFC PATCH 1/1] phylink: Set speed to SPEED_UNKNOWN when there is no PHY connected
From: Russell King - ARM Linux admin @ 2019-08-28 17:14 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: andrew, f.fainelli, asolokha, netdev
In-Reply-To: <20190828145802.3609-2-olteanv@gmail.com>

On Wed, Aug 28, 2019 at 05:58:02PM +0300, Vladimir Oltean wrote:
> phylink_ethtool_ksettings_get can be called while the interface may not
> even be up, which should not be a problem. But there are drivers (e.g.
> gianfar) which connect to the PHY in .ndo_open and disconnect in
> .ndo_close. While odd, to my knowledge this is again not illegal and
> there may be more that do the same. But PHYLINK for example has this
> check in phylink_ethtool_ksettings_get:
> 
> 	if (pl->phydev) {
> 		phy_ethtool_ksettings_get(pl->phydev, kset);
> 	} else {
> 		kset->base.port = pl->link_port;
> 	}
> 
> So it will not populate kset->base.speed if there is no PHY connected.
> The speed will be 0, by way of a previous memset. Not SPEED_UNKNOWN.
> It is arguable whether that is legal or not. include/uapi/linux/ethtool.h
> says:
> 
> 	All values 0 to INT_MAX are legal.
> 
> By that measure it may be. But it sure would make users of the
> __ethtool_get_link_ksettings API need make more complicated checks
> (against -1, against 0, 1, etc). So far the kernel community has been ok
> with just checking for SPEED_UNKNOWN.
> 
> Take net/sched/sch_taprio.c for example. The check in
> taprio_set_picos_per_byte is currently not robust enough and will
> trigger this division by zero, due to PHYLINK not setting SPEED_UNKNOWN:
> 
> 	if (!__ethtool_get_link_ksettings(dev, &ecmd) &&
> 	    ecmd.base.speed != SPEED_UNKNOWN)
> 		picos_per_byte = div64_s64(NSEC_PER_SEC * 1000LL * 8,
> 					   ecmd.base.speed * 1000 * 1000);

The ethtool API says:

 * If it is enabled then they are read-only; if the link
 * is up they represent the negotiated link mode; if the link is down,
 * the speed is 0, %SPEED_UNKNOWN or the highest enabled speed and
 * @duplex is %DUPLEX_UNKNOWN or the best enabled duplex mode.

So, it seems that taprio is not following the API... I'd suggest either
fixing taprio, or getting agreement to change the ethtool API.
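
For illustration only, a caller-side check that accepts both documented "no speed" values could look roughly like the userspace sketch below. The helper names (link_speed_is_usable, picos_per_byte) and the fallback parameter are assumptions made for this example; this is not the taprio code, and SPEED_UNKNOWN is redefined locally to mirror the uapi value.

```c
#include <stdint.h>

/* Mirrors the uapi value: SPEED_UNKNOWN is -1 stored in an unsigned field. */
#define SPEED_UNKNOWN ((uint32_t)-1)

/* Picoseconds in one second. */
#define PSEC_PER_SEC 1000000000000LL

/* Hypothetical helper: true only if the speed is safe to use as a divisor. */
static int link_speed_is_usable(uint32_t speed_mbps)
{
	return speed_mbps != 0 && speed_mbps != SPEED_UNKNOWN;
}

/*
 * Picoseconds needed to transmit one byte at 'speed_mbps' (Mb/s).
 * Returns 'fallback' when the link reports 0 or SPEED_UNKNOWN.
 */
static int64_t picos_per_byte(uint32_t speed_mbps, int64_t fallback)
{
	if (!link_speed_is_usable(speed_mbps))
		return fallback;

	/* 8 bits per byte; speed in bits per second is speed_mbps * 10^6. */
	return (8 * PSEC_PER_SEC) / ((int64_t)speed_mbps * 1000 * 1000);
}
```

At 1000 Mb/s this yields 8000 ps per byte, and a reported speed of 0 returns the fallback instead of dividing by zero.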

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* Re: [PATCH v1 net-next] net: stmmac: Add support for MDIO interrupts
From: Florian Fainelli @ 2019-08-28 17:14 UTC (permalink / raw)
  To: Voon, Weifeng, Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, Ong, Boon Leong
In-Reply-To: <D6759987A7968C4889FDA6FA91D5CBC814759747@PGSMSX103.gar.corp.intel.com>

On 8/28/19 10:07 AM, Voon, Weifeng wrote:
>>>> DW EQoS v5.xx controllers added capability for interrupt generation
>>>> when MDIO interface is done (GMII Busy bit is cleared).
>>>> This patch adds support for this interrupt on supported HW to avoid
>>>> polling on GMII Busy bit.
>>>>
>>>> stmmac_mdio_read() & stmmac_mdio_write() will sleep until wake_up()
>>>> is called by the interrupt handler.
>>>
>>> Hi Voon
>>>
>>> I _think_ there are some order of operation issues here. The mdiobus
>>> is registered in the probe function. As soon as of_mdiobus_register()
>>> is called, the MDIO bus must work. At that point MDIO read/writes can
>>> start to happen.
>>>
>>> As far as i can see, the interrupt handler is only requested in
>>> stmmac_open(). So it seems like any MDIO operations after probe, but
>>> before open are going to fail?
>>
>> AFAIR, wait_event_timeout() will continue to busy loop and wait until
>> the timeout, but not return an error because the polled condition was
>> true, at least that is my recollection from having the same issue with
>> the bcmgenet driver before it was moved to connecting to the PHY in the
>> ndo_open() function.
>> --
>> Florian
> 
> Florian is right as the poll condition is still true after the timeout. 
> Hence, any mdio operation after probe and before ndo_open will still work.
> The only cons here is that attaching the PHY will takes a full length of 
> timeout time for each mdio_read and mdio_write. 
> So we should attach the phy only after the interrupt handler is requested?

From a power management/resource utilization perspective, it is better
to initialize as close as possible to the time when you are actually
going to use the hardware, therefore ndo_open().

This may not be convenient or possible given how widely used stmmac is,
and I do not know if parts of the Ethernet MAC require the PHY to supply
the clock, in which case you may have some chicken-and-egg conditions if
the design does not allow for MDIO to work independently from the data
plane. Also, I would be worried about introducing bugs.

You could do a couple of things:

- continue to probe the device with interrupts disabled, and add a
condition around the call to wait_event_timeout(): if the interrupt line
has not been requested, do a busy-loop without going to the maximum
defined timeout; if it has been requested, use wait_event_timeout()

- request the interrupt during the probe function, but only
unmask/enable the MDIO interrupts so that the probe can succeed, and
leave the data path interrupts to be enabled later during ndo_open()
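
The first option can be modelled roughly as follows. This is a userspace sketch with made-up names (struct mdio_model, mdio_busy, mdio_wait_done), not stmmac code: the interrupt path only records that it was taken, where a driver would sleep in wait_event_timeout() until the ISR calls wake_up().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the MAC's MDIO state. */
struct mdio_model {
	bool irq_requested;	/* true once the IRQ handler is installed */
	unsigned int busy_reads;	/* reads for which the Busy bit stays set */
	unsigned int polls;	/* number of times the Busy bit was polled */
	unsigned int irq_waits;	/* number of times the IRQ path was taken */
};

/* Model of reading the GMII Busy bit. */
static bool mdio_busy(struct mdio_model *m)
{
	m->polls++;
	if (m->busy_reads == 0)
		return false;
	m->busy_reads--;
	return true;
}

/* Return 0 when the MDIO transaction completed, -1 on timeout. */
static int mdio_wait_done(struct mdio_model *m, unsigned int max_polls)
{
	if (m->irq_requested) {
		/* A driver would sleep here until the ISR wakes it up. */
		m->irq_waits++;
		m->busy_reads = 0;
		return 0;
	}

	/* No IRQ handler yet: poll, returning as soon as Busy clears. */
	while (max_polls--) {
		if (!mdio_busy(m))
			return 0;
	}

	return -1;
}
```

The point of the split is that MDIO accesses made between probe and ndo_open() return as soon as the Busy bit clears, rather than always sitting out the full timeout.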
-- 
Florian

^ permalink raw reply

* [RFC net-next v1 1/5] net: phy: make mdiobus_create_device() function callable from Eth driver
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

PHY converter and external PHY drivers depend on the MDIO functions of the
Eth driver, and completion of such an MDIO read/write may fire an IRQ. The
ISR for the MDIO completion IRQ is requested in the open() function of the
driver.

For a PHY converter mdio driver that registers an ISR event which uses the
MDIO read/write functions during its probe() function, the MDIO ISR should
already have been requested before the mdio driver probe() is called. For
this reason, the mdio device creation and registration need to be callable
from the Eth driver open() function.

The existing way to register an mdio_device for a PHY converter, via
mdiobus_register_board_info(), is not feasible because the mdio device
creation and registration happen inside the Eth driver probe() function,
specifically in mdiobus_setup_mdiodev_from_board_info(), which is called
by mdiobus_register().

Therefore, to fulfill the need mentioned above, we make
mdiobus_create_device() callable from the Eth driver open().

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 drivers/net/phy/mdio_bus.c | 5 +++--
 include/linux/phy.h        | 7 +++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index bd04fe762056..06658d9197a1 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -338,8 +338,8 @@ static inline void of_mdiobus_link_mdiodev(struct mii_bus *mdio,
  *
  * Returns 0 on success or < 0 on error.
  */
-static int mdiobus_create_device(struct mii_bus *bus,
-				 struct mdio_board_info *bi)
+int mdiobus_create_device(struct mii_bus *bus,
+			  struct mdio_board_info *bi)
 {
 	struct mdio_device *mdiodev;
 	int ret = 0;
@@ -359,6 +359,7 @@ static int mdiobus_create_device(struct mii_bus *bus,
 
 	return ret;
 }
+EXPORT_SYMBOL(mdiobus_create_device);
 
 /**
  * __mdiobus_register - bring up all the PHYs on a given bus and attach them to bus
diff --git a/include/linux/phy.h b/include/linux/phy.h
index d26779f1fb6b..4524db57fe0b 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1249,12 +1249,19 @@ struct mdio_board_info {
 #if IS_ENABLED(CONFIG_MDIO_DEVICE)
 int mdiobus_register_board_info(const struct mdio_board_info *info,
 				unsigned int n);
+int mdiobus_create_device(struct mii_bus *bus, struct mdio_board_info *bi);
 #else
 static inline int mdiobus_register_board_info(const struct mdio_board_info *i,
 					      unsigned int n)
 {
 	return 0;
 }
+
+static inline int mdiobus_create_device(struct mii_bus *bus,
+					struct mdio_board_info *bi)
+{
+	return 0;
+}
 #endif
 
 
-- 
2.17.0


^ permalink raw reply related

* [RFC net-next v1 2/5] net: phy: introduce mdiobus_get_mdio_device
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

Add a function to get the mdio_device at a given MDIO address on a bus.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 drivers/net/phy/mdio_bus.c | 6 ++++++
 include/linux/mdio.h       | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 06658d9197a1..96ef94f87ff1 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -130,6 +130,12 @@ struct phy_device *mdiobus_get_phy(struct mii_bus *bus, int addr)
 }
 EXPORT_SYMBOL(mdiobus_get_phy);
 
+struct mdio_device *mdiobus_get_mdio_device(struct mii_bus *bus, int addr)
+{
+	return bus->mdio_map[addr];
+}
+EXPORT_SYMBOL(mdiobus_get_mdio_device);
+
 bool mdiobus_is_registered_device(struct mii_bus *bus, int addr)
 {
 	return bus->mdio_map[addr];
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index e8242ad88c81..e0ccd56a7ac0 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -315,6 +315,7 @@ int mdiobus_register_device(struct mdio_device *mdiodev);
 int mdiobus_unregister_device(struct mdio_device *mdiodev);
 bool mdiobus_is_registered_device(struct mii_bus *bus, int addr);
 struct phy_device *mdiobus_get_phy(struct mii_bus *bus, int addr);
+struct mdio_device *mdiobus_get_mdio_device(struct mii_bus *bus, int addr);
 
 /**
  * mdio_module_driver() - Helper macro for registering mdio drivers
-- 
2.17.0


^ permalink raw reply related

* [RFC net-next v1 3/5] net: phy: add private data to mdio_device
From: Ong Boon Leong @ 2019-08-28 17:33 UTC (permalink / raw)
  To: davem, linux, mcoquelin.stm32, joabreu, f.fainelli, andrew
  Cc: netdev, linux-kernel, peppe.cavallaro, alexandre.torgue,
	weifeng.voon
In-Reply-To: <20190828173321.25334-1-boon.leong.ong@intel.com>

A PHY converter device is represented as an mdio_device and requires
private data, so add a pointer for private data to struct mdio_device.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 include/linux/mdio.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index e0ccd56a7ac0..fc7dfbe75006 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -40,6 +40,8 @@ struct mdio_device {
 	struct reset_control *reset_ctrl;
 	unsigned int reset_assert_delay;
 	unsigned int reset_deassert_delay;
+	/* Private data */
+	void *priv;
 };
 #define to_mdio_device(d) container_of(d, struct mdio_device, dev)
 
-- 
2.17.0


^ permalink raw reply related
