Date: Tue, 10 Aug 2021 10:26:49 +0300
From: Ido Schimmel
To: Andrew Lunn
Cc: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org, mkubecek@suse.cz, pali@kernel.org, vadimp@nvidia.com, mlxsw@nvidia.com, Ido Schimmel
Subject: Re: [RFC PATCH net-next 1/8] ethtool: Add ability to control transceiver modules' low power mode
References: <20210809102152.719961-1-idosch@idosch.org> <20210809102152.719961-2-idosch@idosch.org>
X-Mailing-List: netdev@vger.kernel.org

On Mon, Aug 09, 2021 at 04:28:32PM +0200, Andrew Lunn wrote:
> On Mon, Aug 09, 2021 at 01:21:45PM +0300, Ido Schimmel wrote:
> > From: Ido Schimmel
> >
> > Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
> > 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
> > modules parameters and retrieve their status.
>
> Hi Ido
>
> I've not read all the patchset yet, but i like the general direction.
>
> > The first parameter to control is the low power mode of the module. It
> > is only relevant for paged memory modules, as flat memory modules always
> > operate in low power mode.
> >
> > When a paged memory module is in low power mode, its power consumption
> > is reduced to the minimum, the management interface towards the host is
> > available and the data path is deactivated.
> >
> > User space can choose to put modules that are not currently in use in
> > low power mode and transition them to high power mode before putting the
> > associated ports administratively up.
> >
> > Transitioning into low power mode means loss of carrier, so error is
> > returned when the netdev is administratively up.
>
> However, i don't get this use case. With copper PHYs, putting the link
> administratively down results in a call into phylib and into the
> driver to down the link. This effectively puts the PHY into a low
> power mode. The management interface, as defined by C22 and C45 remain
> available, but the data path is disabled. For a 1G PHY, this can save
> a few watts.
>
> For SFPs managed by phylink and the kernel SFP driver, the exact same
> happens. The TX_ENABLE pin of the SFP is set to false. The I2C bus
> still works, but the data path is disabled.
>
> So i would expect a driver using firmware, not Linux code to manage
> SFPs, to just do this on link down. Why do we need user space
> involved?

The transition from low power to high power can take a few seconds with
QSFP/QSFP-DD, and it is likely to only get longer with future, more
complex modules. Therefore, to reduce link-up time, the firmware
automatically transitions modules to high power mode. There is
obviously a trade-off here between power consumption and link-up time.
My understanding is that Mellanox is not the only vendor favoring
shorter link-up times, as users have the ability to control the low
power mode of the modules in other implementations.

Regarding "why do we need user space involved?": by default, it does
not need to be involved (the system works without this API). But if
user space wants to reduce power consumption by setting unused modules
to low power mode, then it will need to use this API.
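
To make the intended semantics concrete, here is a minimal sketch of the policy as a plain Python model. It is not the ethtool netlink API or any driver code; the names `Port`, `PowerMode`, `NetdevBusyError` and `set_low_power` are hypothetical and exist only for illustration. It assumes only what is stated in this thread: modules default to high power mode, flat memory modules always operate in low power mode, and a request to enter low power mode is rejected while the netdev is administratively up.

```python
from dataclasses import dataclass
from enum import Enum


class PowerMode(Enum):
    LOW = "low"
    HIGH = "high"


class NetdevBusyError(Exception):
    """Low power mode was requested while the port is administratively up."""


@dataclass
class Port:
    """Hypothetical model of a port and its transceiver module."""

    name: str
    admin_up: bool = False
    paged_memory: bool = True
    # Default follows the firmware behaviour described in the thread:
    # modules are moved to high power mode to keep link-up time short.
    power_mode: PowerMode = PowerMode.HIGH

    def __post_init__(self) -> None:
        # Flat memory modules always operate in low power mode.
        if not self.paged_memory:
            self.power_mode = PowerMode.LOW

    def set_low_power(self, enable: bool) -> None:
        if not self.paged_memory:
            raise ValueError("low power mode is not controllable on flat memory modules")
        if enable and self.admin_up:
            # Entering low power mode means loss of carrier, so refuse.
            raise NetdevBusyError(f"{self.name} is administratively up")
        self.power_mode = PowerMode.LOW if enable else PowerMode.HIGH
        # What happens if a port is brought up while still in low power
        # mode is not covered in this message, so it is not modeled here.


if __name__ == "__main__":
    port = Port("swp1")
    port.set_low_power(True)    # unused port: save power
    port.set_low_power(False)   # back to high power before bringing it up
    port.admin_up = True
    try:
        port.set_low_power(True)
    except NetdevBusyError as err:
        print("rejected:", err)
```

User space that never calls `set_low_power()` sees the default behaviour unchanged, which matches the point that the system works without this API.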