From: Ido Schimmel <idosch@nvidia.com>
To: "Krzysztof Olędzki" <ole@ans.pl>
Cc: Andrew Lunn <andrew@lunn.ch>, Michal Kubecek <mkubecek@suse.cz>,
	Moshe Shemesh <moshe@nvidia.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	tariqt@nvidia.com
Subject: Re: "netlink error: Invalid argument" with ethtool-5.13+ on recent kernels due to "ethtool: Add netlink handler for getmodule (-m)" - 25b64c66f58d3df0ad7272dda91c3ab06fe7a303, also no SFP-DOM support via netlink?
Date: Wed, 22 May 2024 11:40:30 +0300
Message-ID: <Zk2vfmI7qnBMxABo@shredder>
In-Reply-To: <c3726cb7-6eff-43c6-a7d4-1e931d48151f@ans.pl>

On Tue, May 21, 2024 at 09:54:48PM -0700, Krzysztof Olędzki wrote:
> Hi,
> 
> On 21.05.2024 at 13:21, Andrew Lunn wrote:
> >> sending genetlink packet (76 bytes):
> >>     msg length 76 ethool ETHTOOL_MSG_MODULE_EEPROM_GET
> >>     ETHTOOL_MSG_MODULE_EEPROM_GET
> >>         ETHTOOL_A_MODULE_EEPROM_HEADER
> >>             ETHTOOL_A_HEADER_DEV_NAME = "eth3"
> >>         ETHTOOL_A_MODULE_EEPROM_LENGTH = 128
> >>         ETHTOOL_A_MODULE_EEPROM_OFFSET = 128
> >>         ETHTOOL_A_MODULE_EEPROM_PAGE = 3
> >>         ETHTOOL_A_MODULE_EEPROM_BANK = 0
> >>         ETHTOOL_A_MODULE_EEPROM_I2C_ADDRESS = 80
> >> received genetlink packet (96 bytes):
> >>     msg length 96 error errno=-22
> > 
> > This is a mellanox card right?
> 
> Yes, sorry. This is indeed Mellanox (now Nvidia) CX3 / CX3Pro, using the drivers/net/ethernet/mellanox/mlx4 driver.
> 
> > mlx4_en_get_module_info() and mlx4_en_get_module_eeprom() implement
> > the old API for reading data from an SFP module. So the ethtool core
> > will be mapping the new API to the old API. The interesting function
> > is probably fallback_set_params():
> > 
> > https://elixir.bootlin.com/linux/latest/source/net/ethtool/eeprom.c#L29
> > 
> > and my guess is, you are hitting:
> > 
> > 	if (offset >= modinfo->eeprom_len)
> > 		return -EINVAL;
> > 
> > offset is 3 * 128 + 128 = 512.
> > 
> > mlx4_en_get_module_info() is probably returning eeprom_len of 256?
> > 
> > Could you verify this?
> 
> Ah, excellent catch Andrew!

Yes, I believe Andrew's analysis is correct.

> 
> # egrep -R 'ETH_MODULE_SFF_[0-9]+_LEN' include/uapi/linux/ethtool.h
> #define ETH_MODULE_SFF_8079_LEN         256
> #define ETH_MODULE_SFF_8472_LEN         512
> #define ETH_MODULE_SFF_8636_LEN         256
> #define ETH_MODULE_SFF_8436_LEN         256
> 
> The code in mlx4_en_get_module_info (with length annotation):
> 
>         switch (data[0] /* identifier */) {
>         case MLX4_MODULE_ID_QSFP:
>                 modinfo->type = ETH_MODULE_SFF_8436;
>                 modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN;		// 256
>                 break;
>         case MLX4_MODULE_ID_QSFP_PLUS:
>                 if (data[1] >= 0x3) { /* revision id */
>                         modinfo->type = ETH_MODULE_SFF_8636;
>                         modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN;	// 256
>                 } else {
>                         modinfo->type = ETH_MODULE_SFF_8436;
>                         modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN;	// 256
>                 }
>                 break;
>         case MLX4_MODULE_ID_QSFP28:
>                 modinfo->type = ETH_MODULE_SFF_8636;
>                 modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN;		// 256
>                 break;
>         case MLX4_MODULE_ID_SFP:
>                 modinfo->type = ETH_MODULE_SFF_8472;
>                 modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;		// 512
>                 break;
>         default:
>                 return -EINVAL;
>         }
> 
> So right, the function returns 512 for SFP and 256 for everything else, which explains why SFP works but QSFP does not.

Since you already did all the work and you are able to test patches, do
you want to fix it yourself and submit a patch, or report it to the mlx4
maintainer (copied)? The fix should be similar to mlx5 commit
a708fb7b1f8d ("net/mlx5e: ethtool, Add support for EEPROM high pages
query").

> 
> Following your advice, I added some debug printks to net/ethtool/eeprom.c:
> 
> @@ -33,16 +33,24 @@
>         u32 offset = request->offset;
>         u32 length = request->length;
> 
> +       printk("A: offset=%u, modinfo->eeprom_len=%u\n", offset, modinfo->eeprom_len);
> +
>         if (request->page)
>                 offset = request->page * ETH_MODULE_EEPROM_PAGE_LEN + offset;
> 
> +       printk("B: offset=%u, modinfo->eeprom_len=%u\n", offset, modinfo->eeprom_len);
> +
>         if (modinfo->type == ETH_MODULE_SFF_8472 &&
>             request->i2c_address == 0x51)
>                 offset += ETH_MODULE_EEPROM_PAGE_LEN * 2;
> 
> +       printk("C: offset=%u, modinfo->eeprom_len=%u\n", offset, modinfo->eeprom_len);
> +
>         if (offset >= modinfo->eeprom_len)
>                 return -EINVAL;
> 
> +       printk("D: offset=%u, modinfo->eeprom_len=%u\n", offset, modinfo->eeprom_len);
> +
>         eeprom->cmd = ETHTOOL_GMODULEEEPROM;
>         eeprom->len = length;
>         eeprom->offset = offset;
> 
> Here is the result:
> 
> SFP:
> A: offset=0, modinfo->eeprom_len=512
> B: offset=0, modinfo->eeprom_len=512
> C: offset=0, modinfo->eeprom_len=512
> D: offset=0, modinfo->eeprom_len=512
> A: offset=0, modinfo->eeprom_len=512
> B: offset=0, modinfo->eeprom_len=512
> C: offset=0, modinfo->eeprom_len=512
> D: offset=0, modinfo->eeprom_len=512
> 
> QSFP:
> A: offset=0, modinfo->eeprom_len=256
> B: offset=0, modinfo->eeprom_len=256
> C: offset=0, modinfo->eeprom_len=256
> D: offset=0, modinfo->eeprom_len=256
> 
> A: offset=0, modinfo->eeprom_len=256
> B: offset=0, modinfo->eeprom_len=256
> C: offset=0, modinfo->eeprom_len=256
> D: offset=0, modinfo->eeprom_len=256
> 
> A: offset=128, modinfo->eeprom_len=256
> B: offset=128, modinfo->eeprom_len=256
> C: offset=128, modinfo->eeprom_len=256
> D: offset=128, modinfo->eeprom_len=256
> 
> A: offset=128, modinfo->eeprom_len=256
> B: offset=512, modinfo->eeprom_len=256
> C: offset=512, modinfo->eeprom_len=256
> Note - no "D" as -EINVAL is returned exactly as you predicted.
> 
> BTW: there is another suspicious looking thing in this code:
>  - "u32 length = request->length;" is set early in the function
>  - length is never updated
>  - at the end, we have "eeprom->len = length"
> 
> In this case, the existence of length seems at least redundant, unless I missed something?

Looks like it

> 
> For reference, the function was added in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=96d971e307cc0e434f96329b42bbd98cfbca07d2
> Later https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a3bb7b63813f674fb62bac321cdd897cc62de094 changed ETH_MODULE_SFF_8079 to ETH_MODULE_SFF_8472.
> 
> Thanks,
>  Krzysztof
