Date: Tue, 12 Feb 2019 04:58:06 +0100
From: Andrew Lunn
To: John David Anglin, Heiner Kallweit
Cc: Russell King, Vivien Didelot, Florian Fainelli, netdev@vger.kernel.org
Subject: Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
Message-ID: <20190212035806.GE19023@lunn.ch>
In-Reply-To: <2b6bbb4c-1346-461b-ff7a-cb96b4142f7a@bell.net>

> > Hi David
> >
> > I just tested this on one of my boards. It loops endlessly:
> >
> > [ 47.173396] mv88e6xxx_g1_irq_thread_work: c881 a8 80
> > [ 47.182108] mv88e6xxx_g1_irq_thread_work: c881 a8 80
> > [ 47.190820] mv88e6xxx_g1_irq_thread_work: c881 a8 80
> > [ 47.199535] mv88e6xxx_g1_irq_thread_work: c881 a8 80
> > [ 47.208254] mv88e6xxx_g1_irq_thread_work: c881 a8 80
> >
> > These are reg, ctl1, reg & ctl1.
> >
> > So there is an unhandled device interrupt.

Hi Heiner

Your patch

Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")

is causing me problems with interrupts for the Marvell switches.

That change means we don't ask the PHY device whether it caused an
interrupt when its state is less than UP.

What I'm seeing is that the PHY is interrupting pretty early on after
a reboot when the previous boot had the interface up.

[ 10.125702] Marvell 88E6390 mv88e6xxx-0:02: phy_start_interrupts
[ 10.162798] Marvell 88E6390 mv88e6xxx-0:02: phy_enable_interrupts
[ 10.168931] Marvell 88E6390 mv88e6xxx-0:02: marvell_ack_interrupt
[ 10.180164] Marvell 88E6390 mv88e6xxx-0:02: marvell_config_intr 1

A little later it interrupts:

[ 12.999717] mv88e6xxx_g1_irq_thread_fn
[ 13.007253] mv88e6xxx_g2_irq_thread_fn: 4 811c 4
[ 13.012015] libphy: __phy_is_started: phydev->state 1 PHY_UP 3
[ 13.017941] Marvell 88E6390 mv88e6xxx-0:02: phy_interrupt: phy_is_started(phydev) 0

The current code just causes it to be ignored. So the interrupt fires
again, and again...

If I change the code to call into the PHY driver and let it handle the
interrupt, things keep running.

A little bit later the interface is configured up:

[ 15.921326] mv88e6085 gpio-0:00 red: configuring for phy/gmii link mode
[ 15.928693] libphy: __phy_is_started: phydev->state 3 PHY_UP 3
[ 15.929442] IPv6: ADDRCONF(NETDEV_UP): red: link is not ready
[ 15.935596] Marvell 88E6390 mv88e6xxx-0:02: m88e6390_config_aneg
[ 15.935608] Marvell 88E6390 mv88e6xxx-0:02: m88e6390_errata
[ 16.071364] Marvell 88E6390 mv88e6xxx-0:02: m88e1510_config_aneg
[ 16.112362] Marvell 88E6390 mv88e6xxx-0:02: m88e1318_config_aneg
[ 16.151245] Marvell 88E6390 mv88e6xxx-0:02: m88e1121_config_aneg
[ 16.368206] Marvell 88E6390 mv88e6xxx-0:02: PHY state change UP -> NOLINK

and after another interrupt the link goes up.

[ 19.519840] mv88e6xxx_g1_irq_thread_fn
[ 19.528546] mv88e6xxx_g2_irq_thread_fn: 4 811c 4
[ 19.534152] libphy: __phy_is_started: phydev->state 5 PHY_UP 3
[ 19.540030] Marvell 88E6390 mv88e6xxx-0:02: phy_interrupt: phy_is_started(phydev) 1
[ 19.547721] Marvell 88E6390 mv88e6xxx-0:02: m88e1121_did_interrupt
[ 19.559829] Marvell 88E6390 mv88e6xxx-0:02: marvell_ack_interrupt
[ 19.590753] Marvell 88E6390 mv88e6xxx-0:02: marvell_read_status
[ 19.596712] Marvell 88E6390 mv88e6xxx-0:02: marvell_update_link
[ 19.628387] Marvell 88E6390 mv88e6xxx-0:02: PHY state change NOLINK -> RUNNING
[ 19.628453] mv88e6085 gpio-0:00 red: Link is Up - 1Gbps/Full - flow control off
[ 19.635920] IPv6: ADDRCONF(NETDEV_CHANGE): red: link becomes ready

I don't yet know why the first interrupt happens, before we configure
auto-neg, etc. But it is not too unreasonable. We have configured
interrupts, so it could be reporting link down etc.

So I think we might need to revert part of this change and call into
the driver so long as the PHY is not in state PHY_HALTED.

What do you think?

	Andrew
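
To make the difference between the two checks concrete, here is a small user-space model of the situation described above. It is only a sketch, not kernel code: `struct model_phy`, `check_current`, `check_proposed` and `irq_storm_length` are hypothetical names invented for the illustration, the interrupt is assumed to be level-triggered, and the enum order is an assumption based on the kernel header of that era (only PHY_UP == 3 is confirmed by the logs).

```c
#include <stdbool.h>

/* Simplified model, NOT the kernel source. The enum order is assumed;
 * the logs only confirm that PHY_UP is 3. */
enum phy_state {
	PHY_DOWN = 0,
	PHY_READY,
	PHY_HALTED,
	PHY_UP,
	PHY_RUNNING,
	PHY_NOLINK,
};

/* Hypothetical stand-in for a PHY with a level-triggered interrupt:
 * the line stays asserted until the driver acknowledges it. */
struct model_phy {
	enum phy_state state;
	bool irq_pending;
};

/* The check the mail describes as current behaviour: only call into
 * the PHY driver when the state is UP or later. */
static bool check_current(const struct model_phy *phy)
{
	return phy->state >= PHY_UP;
}

/* The check the mail proposes: call into the driver unless the PHY
 * is halted. */
static bool check_proposed(const struct model_phy *phy)
{
	return phy->state != PHY_HALTED;
}

/* Count how often the handler runs before the line clears, giving up
 * after 'limit' rounds. If the check refuses to call the driver,
 * nothing acks the PHY and the interrupt fires again. */
static int irq_storm_length(struct model_phy *phy,
			    bool (*should_handle)(const struct model_phy *),
			    int limit)
{
	int fires = 0;

	while (phy->irq_pending && fires < limit) {
		fires++;
		if (should_handle(phy))
			phy->irq_pending = false; /* driver acks the PHY */
	}
	return fires;
}
```

With a PHY in state 1 and a pending interrupt, as in the log at 13.012015, `irq_storm_length()` with `check_current` runs until the limit and leaves the interrupt pending, while `check_proposed` clears it on the first pass. Once the state is NOLINK (5), both checks behave the same.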