From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from ozlabs.org (ozlabs.org [203.10.76.45])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "mx.ozlabs.org", Issuer "CA Cert Signing Authority" (verified OK))
	by bilbo.ozlabs.org (Postfix) with ESMTPS id 54FEEB70A7
	for ; Thu, 25 Jun 2009 17:03:48 +1000 (EST)
Received: from buildserver.ru.mvista.com (unknown [213.79.90.228])
	by ozlabs.org (Postfix) with ESMTP id 5FEE6DDD01
	for ; Thu, 25 Jun 2009 17:03:46 +1000 (EST)
Date: Thu, 25 Jun 2009 11:02:36 +0400
From: Anton Vorontsov
To: Mark Huth
Subject: Re: [PATCH] ucc_geth: Fix half-duplex operation for non-MII/RMII interfaces
Message-ID: <20090625070236.GA27711@oksana.dev.rtsoft.ru>
References: <20090624174557.GA31479@oksana.dev.rtsoft.ru> <4A4306F2.3070909@mvista.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
In-Reply-To: <4A4306F2.3070909@mvista.com>
Cc: linuxppc-dev@ozlabs.org, netdev@vger.kernel.org, Li Yang , David Miller
Reply-To: avorontsov@ru.mvista.com
List-Id: Linux on PowerPC Developers Mail List

On Wed, Jun 24, 2009 at 10:11:14PM -0700, Mark Huth wrote:
> Anton Vorontsov wrote:
>> Currently the half-duplex operation seems to not work reliably for
>> RGMII/GMII PHY interfaces. It takes about 10 minutes to boot NFS
>> rootfs using 10/half link, following symptoms were observed:
>>
>> ucc_geth: QE UCC Gigabit Ethernet Controller
>> ucc_geth: UCC1 at 0xe0082000 (irq = 32)
>> [...]
>> Sending DHCP and RARP requests .
>> PHY: mdio@e0082120:07 - Link is Up - 10/Half
>> ., OK
> So why does the phy think this is a half-duplex network?

Because the physical medium is now in half-duplex. At least that's
what the PHY detects.

[...]
>> tx-late-collsion: 604
>> tx-aborted-frames: 604
> The above two counters are the actual errors from a half-duplex ethernet
> configuration.
> The size of the collision domain is limited so that the
> collisions from one end will reach the other end within the minimum
> frame length wire time. Thus the collision will be detected within the
> first 64 bytes of the frame. A late collision indicates a
> mis-configured network. The fact that everything seems to work when the
> MAC is placed into full-duplex mode hints that the network is really a
> full-duplex network.

No, it's half. It can be configured so on both sides, with or without
auto-negotiation. The "10/half" message comes from the PHY layer, which
reports human-readable values of the PHY's LPA/BMSR registers, not the
MAC's configuration.

Of course, it could be that the root cause of the problems I observe
is a weird NIC on my host. Well, then the QA team should have used the
same broken NIC on their hosts. :-) I can easily test it by
interconnecting two targets though.

> Otherwise, if the network is really half-duplex, then presence of a
> full-duplex node will result in the other nodes seeing CRC/framing
> errors on receive, and possibly also late collisions, as the full-duplex
> node does not observe the CS or the CD( carrier sense and collision
> detect) part of CSMA/CD, because it doesn't care.
>
> Putting a node in full-duplex will always make the nasty collision
> related errors go away, but it may not be a proper diagnosis of the
> problem.
>> tx-frames-ok: 4967
>> tx-256-511-frames: 3
>> tx-512-1023-frames: 79
>> tx-1024-1518-frames: 71
>> rx-256-511-frames: 37
>> rx-512-1023-frames: 73
>> rx-1024-1518-frames: 5243
>>
>> According to current QEIWRM (Rev. 2 5/2009), FDX bit can be 0 for
>> RGMII(10/100) modes, while MPC8568ERM (Rev. C 02/2007) spec says
>> that cleared FDX bit is permitted for MII/RMII modes only.
>>
>> The symptoms above were seen on MPC8569E-MDS boards, so QEIWRM is
>> clearly wrong, and this patch completely cures the problems above.
>
> Not so fast - RGMII and GMII refer to the interface between the MAC and
> the PHY.

Correct.
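
To put numbers on the late-collision point quoted above, here is a minimal
sketch. It assumes the standard 64-byte (512-bit) minimum frame for 10 and
100 Mbit/s half-duplex Ethernet and, like the quoted text, counts bytes of
the frame only and ignores the preamble. The function names are
illustrative and do not come from the ucc_geth driver.

```python
# Sketch: when does a collision count as "late"?
# Assumption: 64-byte minimum frame (512 bit times) on a half-duplex
# 10 or 100 Mbit/s link. All names here are hypothetical helpers.

MIN_FRAME_BYTES = 64
SLOT_BITS = MIN_FRAME_BYTES * 8  # 512 bit times


def slot_time_us(link_mbps):
    """Wire time of the minimum frame in microseconds.

    1 Mbit/s equals 1 bit per microsecond, so bits / Mbit/s gives us.
    """
    return SLOT_BITS / link_mbps


def is_late_collision(bytes_sent_before_collision):
    """True if the collision was seen after the first 64 bytes went out."""
    return bytes_sent_before_collision >= MIN_FRAME_BYTES


if __name__ == "__main__":
    for mbps in (10, 100):
        print(f"{mbps} Mbit/s: slot time {slot_time_us(mbps):.2f} us")
```

On the 10/half link discussed here the window is 51.2 us, so in a
correctly sized collision domain any collision should be seen well before
a 1024-1518 byte NFS frame has finished transmitting.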
> While Gigabit physical links will always be full-duplex, phys
> that detect lower operational modes will indicate half-duplex where
> needed, and putting the MAC into full-duplex will make other nodes see
> errors.

D'oh!

[1358634.636147] eth1: Transmit error, Tx status register 82.
[1358634.636150] Probably a duplex mismatch. See Documentation/networking/vortex.txt

That's on the host side.

> As Andy indicated later, it may be necessary to alter the interface
> definition in those cases, depending on the particular hardware. Forcing
> full-duplex does not seem to be a general solution.

Definitely. Though I'm out of ideas if it's NOT a host-side issue.

Thanks!

--
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2