From: Leon Romanovsky <leonro@nvidia.com>
To: Dennis Afanasev <dennis.afanasev@stateless.net>,
	Vlad Buslov <vladbu@nvidia.com>,
	Dmytro Linkin <dlinkin@nvidia.com>, Roi Dayan <roid@nvidia.com>
Cc: <saeedm@nvidia.com>, <netdev@vger.kernel.org>,
	<linux-rdma@vger.kernel.org>
Subject: Re: PROBLEM: mlx5_core driver crashes when a VRF device with a route is added with mlx5 devices in switchdev mode
Date: Sun, 2 May 2021 09:21:42 +0300
Message-ID: <YI5E9mgNDzPMXTRh@unreal>
In-Reply-To: <CACJMemXjp6F0KzzAfR8yR4s5BU8zJBpsXmF0LWu3ubmF8Kke3Q@mail.gmail.com>

Thanks for the report.

+ more people.

On Fri, Apr 30, 2021 at 04:56:17PM -0400, Dennis Afanasev wrote:
> Dear Saeed and Leo,
> I am reporting a bug in the mlx5_core driver discovered by our team at
> Stateless while setting up SRIOV devices in eswitch mode. Below are the
> details and relevant files that relate to the bug. Please reach out to me
> if I can provide any further information.
> 
>    1. Description of problem: When creating SRIOV devices off physical mlx5
>       PCIe devices and then putting the physical devices into switchdev mode,
>       adding a new VRF device with a default route will cause the mlx5_core
>       driver to segfault (replicate_bug1.sh). In addition, attempting to set the
>       physical devices to switchdev mode after adding a VRF with a default route
>       will cause the mlx5_core driver to segfault (replicate_bug2.sh). The
>       segfault occurs in the function mlx5e_tc_tun_fib_event in both cases.
>
>    2. Keywords: mlx5, mlx5_core, mlx5e_tc_tun_fib_event, tc, netdev, 5.12-rc7
>
>    3. Kernel information: Linux version 5.12.0-rc7 (root@data) (gcc (Debian
>       10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP
>
>    4. Kernel config file: File attached - config-5.12.0-rc7
>
>    5. Oops message: Files attached - dmesg_output_bug1 and dmesg_output_bug2
>
>    6. Shell script to replicate: Files attached - replicate_bug1.sh and
>       replicate_bug2.sh
>
>    7. ver_linux output: File attached - ver_linux_output
>
>    8. Processor information: File attached - cpuinfo
>
>    9. Module information: File attached - modules
>
>    10. Loaded driver and hardware: Files attached - ioport and iomem
>
>    11. PCI information: File attached - pci_info
>
>    12. Other information: I hardcoded the PCIe addresses of the physical
>        devices and of the created SRIOV devices in the scripts. These will
>        have to be adjusted depending on your machine.
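
Editorial note on item 12, not part of the original report: the hardcoded VF
addresses could probably be read from sysfs instead. The following is a minimal
sketch, assuming the kernel exposes one virtfnN symlink per VF under the PF's
PCI device directory. The function name list_vf_addresses and its optional
second argument are hypothetical and do not appear in the attached scripts.

```shell
#!/bin/bash

# Print the PCI address of every VF that belongs to one PF, one per line,
# by resolving the virtfnN symlinks in the PF's sysfs device directory.
#   $1 = PF PCI address, e.g. 0000:5e:00.0
#   $2 = sysfs PCI devices root (hypothetical override, mainly for testing)
list_vf_addresses() {
  local pf_address="$1"
  local sysfs_root="${2:-/sys/bus/pci/devices}"
  local link

  for link in "${sysfs_root}/${pf_address}"/virtfn*; do
    # When the PF has no VFs the glob stays unexpanded and is not a symlink
    [[ -h "${link}" ]] || continue
    # The link target looks like ../0000:5e:00.2, so keep only the last component
    basename "$(readlink "${link}")"
  done
}
```

Calling list_vf_addresses with a PF address after sriov_numvfs has been written
should print that port's VF addresses, which could then fill the
PCIE_SRIOV_ADDRESSES array in place of the hardcoded values.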




> #!/bin/bash
> 
> set -euxETo pipefail
> 
> mst start
> 
> # (Hardcoded) These need to be modified based on the host machine
> nic1_port0="0000:5e:00.0"
> nic1_port1="0000:5e:00.1"
> 
> # Create 1 SRIOV device per NIC port
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port0/sriov_numvfs
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port1/sriov_numvfs
> 
> # The SRIOV devices are given these addresses
> nic1_port0_vf="0000:5e:00.2"
> nic1_port1_vf="0000:5e:00.4"
> 
> declare -ar PCIE_PHYSICAL_ADDRESSES=("$nic1_port0" "$nic1_port1")
> declare -ar PCIE_SRIOV_ADDRESSES=("$nic1_port0_vf" "$nic1_port1_vf")
> 
> # Unbind the driver from the SRIOV, required to activate the eswitch
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>   echo "${pcie_address}" > /sys/bus/pci/drivers/mlx5_core/unbind
> done
> 
> # Wait for the binds to disappear
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>   declare sys_symlink_file="/sys/bus/pci/drivers/mlx5_core/${pcie_address}"
>   until [[ ! -h "${sys_symlink_file}" ]]; do
>     inotifywait --event delete_self --timeout 1 "${sys_symlink_file}" || true
>   done
> done
> sync --file-system /sys
> udevadm settle --timeout=30
> sleep 5
> 
> # Set the cards to 'switchdev'
> for pcie_address in "${PCIE_PHYSICAL_ADDRESSES[@]}"; do
>   devlink dev eswitch set "pci/${pcie_address}" mode switchdev encap-mode basic
> done
> 
> # Wait for the cards to be in switchdev mode
> for pcie_address in "${PCIE_PHYSICAL_ADDRESSES[@]}"; do
>   until [[ "$(devlink -j dev eswitch show "pci/${pcie_address}" |
>     jq --arg dev "pci/${pcie_address}" -r '.dev[$dev].mode' 2> /dev/null)" == "switchdev" ]]; do
>     sleep 1
>   done
> done
> sync --file-system /sys
> udevadm settle --timeout=30
> sleep 5
> 
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>   echo "${pcie_address}" > /sys/bus/pci/drivers/mlx5_core/bind
> done
> 
> ip link set group default up
> ip link add vrf0 type vrf table 100
> 
> # This will crash the kernel
> ip route add table 100 unreachable default

> #!/bin/bash
> 
> set -euxETo pipefail
> 
> mst start
> 
> # Add the VRF device and a route
> ip link add vrf0 type vrf table 100
> ip route add table 100 unreachable default
> 
> # (Hardcoded) These need to be modified based on the host machine
> nic1_port0="0000:5e:00.0"
> nic1_port1="0000:5e:00.1"
> 
> # Create 1 SRIOV device per NIC port
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port0/sriov_numvfs
> echo 1 > /sys/bus/pci/drivers/mlx5_core/$nic1_port1/sriov_numvfs
> 
> # The SRIOV devices are given these addresses
> nic1_port0_vf="0000:5e:00.2"
> nic1_port1_vf="0000:5e:00.4"
> 
> declare -ar PCIE_PHYSICAL_ADDRESSES=("$nic1_port0" "$nic1_port1")
> declare -ar PCIE_SRIOV_ADDRESSES=("$nic1_port0_vf" "$nic1_port1_vf")
> 
> # Unbind the driver from the SRIOV, required to activate the eswitch
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>   echo "${pcie_address}" > /sys/bus/pci/drivers/mlx5_core/unbind
> done
> 
> # Wait for the binds to disappear
> for pcie_address in "${PCIE_SRIOV_ADDRESSES[@]}"; do
>   declare sys_symlink_file="/sys/bus/pci/drivers/mlx5_core/${pcie_address}"
>   until [[ ! -h "${sys_symlink_file}" ]]; do
>     inotifywait --event delete_self --timeout 1 "${sys_symlink_file}" || true
>   done
> done
> sync --file-system /sys
> udevadm settle --timeout=30
> 
> # set the cards to 'switchdev'
> for pcie_address in "${PCIE_PHYSICAL_ADDRESSES[@]}"; do
>   # This will crash the kernel
>   devlink dev eswitch set "pci/${pcie_address}" mode switchdev encap-mode basic
> done







