* 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-24 15:49 UTC
To: netdev; +Cc: Or Gerlitz, Hadar Hen Zion, Rony Efraim, Amir Vadai

I just started testing the 3.10 kernel, previously we were on 3.4 so
there is a fairly large jump.  I've additionally applied the following
four patches to the 3.10.0-rc2 kernel that I'm testing:

https://patchwork.kernel.org/patch/2484651/
https://patchwork.kernel.org/patch/2484671/
https://patchwork.kernel.org/patch/2484681/
https://patchwork.kernel.org/patch/2484641/

I don't know if those patches are related to my issues or not but I
plan on trying to reproduce without them soon.

The issue I'm seeing is that our applications listen on a number of
multicast addresses.  In this case I'm listening to about 350 different
addresses per machine, across many different processes, with usually
one socket per address.  The problem is that some of the sockets are
not receiving any data and some are, even though they all should be.
If I put the device in promiscuous mode then I start receiving data on
all of my sockets.  Running netstat -g shows all of my memberships so
it appears to me that the kernel and the switch think I've joined the
groups, but the card may be filtering the data.

This is with:

05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

# ethtool -i eth4
driver: mlx4_en
version: 2.0 (Dec 2011)
firmware-version: 2.11.500
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no

The other strange part is that I've got multiple machines all running
the same kernel and not all of them are experiencing the issue.  At one
point they were all working fine, but the issue appeared after I
rebooted one of the machines and multiple reboots later it is still in
this bad state.  Rebooting that machine back to 3.4 causes it to work
as expected but no luck under 3.10.  I've now got two machines in this
bad state and they both started immediately after a reboot.

Does anyone have any ideas?

Thanks,
Shawn
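
The receivers described above are ordinary UDP multicast subscribers.
The thread never shows the application code, so the following is only a
minimal sketch of the standard IP_ADD_MEMBERSHIP pattern such an
application would follow; the group address, port, and interface index
in it are placeholders, not values from Shawn's setup.

/* Minimal UDP multicast subscriber sketch (illustrative only).  The
 * group 239.1.1.1 and port 12345 are placeholders; the real setup
 * repeats this roughly 350 times across many processes, usually one
 * socket per group. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in addr;
	struct ip_mreqn mreq;
	char buf[2048];
	int i;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(12345);			/* placeholder port */
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("socket/bind");
		return 1;
	}

	/* Join the group.  After this the membership shows up in
	 * "netstat -g"; whether datagrams actually arrive is what the
	 * rest of this thread is about. */
	memset(&mreq, 0, sizeof(mreq));
	inet_pton(AF_INET, "239.1.1.1", &mreq.imr_multiaddr);	/* placeholder group */
	mreq.imr_ifindex = 0;	/* 0 = any; or if_nametoindex("eth4") to pin it */
	if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) < 0) {
		perror("IP_ADD_MEMBERSHIP");
		return 1;
	}

	for (i = 0; i < 10; i++) {
		ssize_t n = recv(fd, buf, sizeof(buf), 0);
		if (n < 0) {
			perror("recv");
			return 1;
		}
		printf("got %zd bytes\n", n);
	}
	return 0;
}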

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-24 16:34 UTC
To: netdev; +Cc: Or Gerlitz, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> I just started testing the 3.10 kernel, previously we were on 3.4 so
> there is a fairly large jump.  I've additionally applied the following
> four patches to the 3.10.0-rc2 kernel that I'm testing:
>
> https://patchwork.kernel.org/patch/2484651/
> https://patchwork.kernel.org/patch/2484671/
> https://patchwork.kernel.org/patch/2484681/
> https://patchwork.kernel.org/patch/2484641/
>
> I don't know if those patches are related to my issues or not but I
> plan on trying to reproduce without them soon.

I've reverted the four patches above from my test kernel and still see
the issue so they don't appear to be the cause.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Eric Dumazet @ 2013-05-24 16:58 UTC
To: Shawn Bohrer; +Cc: netdev, Or Gerlitz, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Fri, 2013-05-24 at 11:34 -0500, Shawn Bohrer wrote:
> On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > there is a fairly large jump.  I've additionally applied the following
> > four patches to the 3.10.0-rc2 kernel that I'm testing:
> >
> > https://patchwork.kernel.org/patch/2484651/
> > https://patchwork.kernel.org/patch/2484671/
> > https://patchwork.kernel.org/patch/2484681/
> > https://patchwork.kernel.org/patch/2484641/
> >
> > I don't know if those patches are related to my issues or not but I
> > plan on trying to reproduce without them soon.
>
> I've reverted the four patches above from my test kernel and still see
> the issue so they don't appear to be the cause.

I suggest adding multicast tests to tools/testing/selftests/net.

It seems many NICs suffer from bugs in this area, especially when
dealing with a lot of groups.
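
No such selftest is included in the thread; the following is only a
rough sketch of the kind of check Eric is suggesting: join a batch of
groups on the loopback interface, send one datagram to each, and fail
if any group stays silent.  The group range, port, and group count are
invented for illustration (the count is kept below the default
net.ipv4.igmp_max_memberships limit of 20 per socket; a real test
covering hundreds of groups would need to raise that sysctl or use one
socket per group).

/* Rough multicast selftest sketch (illustrative only): join NGROUPS
 * groups on loopback, send one datagram to each, and report any group
 * that delivers nothing within a second. */
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

#define NGROUPS 16		/* < net.ipv4.igmp_max_memberships (20) */
#define PORT    14000

int main(void)
{
	unsigned int lo = if_nametoindex("lo");
	int rx = socket(AF_INET, SOCK_DGRAM, 0);
	int tx = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in any = { .sin_family = AF_INET,
				   .sin_port = htons(PORT) };
	struct ip_mreqn txif = { .imr_ifindex = lo };
	struct timeval tv = { .tv_sec = 1 };
	int i, failed = 0;

	if (bind(rx, (struct sockaddr *)&any, sizeof(any)) < 0) {
		perror("bind");
		return 1;
	}
	setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
	setsockopt(tx, IPPROTO_IP, IP_MULTICAST_IF, &txif, sizeof(txif));

	for (i = 0; i < NGROUPS; i++) {
		struct ip_mreqn mreq = { .imr_ifindex = lo };
		struct sockaddr_in dst = any;
		char buf[64];

		/* Groups 239.10.0.1, 239.10.0.2, ... (placeholder range). */
		mreq.imr_multiaddr.s_addr = htonl(0xef0a0001u + i);
		if (setsockopt(rx, IPPROTO_IP, IP_ADD_MEMBERSHIP,
			       &mreq, sizeof(mreq)) < 0) {
			perror("IP_ADD_MEMBERSHIP");
			return 1;
		}

		dst.sin_addr = mreq.imr_multiaddr;
		sendto(tx, "ping", 4, 0, (struct sockaddr *)&dst, sizeof(dst));

		if (recv(rx, buf, sizeof(buf), 0) < 0) {
			printf("group %d: no data received\n", i);
			failed = 1;
		}
	}
	return failed;
}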

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Or Gerlitz @ 2013-05-25 3:41 UTC
To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Fri, May 24, 2013 at 7:34 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > there is a fairly large jump.  I've additionally applied the following
> > four patches to the 3.10.0-rc2 kernel that I'm testing:
> >
> > https://patchwork.kernel.org/patch/2484651/
> > https://patchwork.kernel.org/patch/2484671/
> > https://patchwork.kernel.org/patch/2484681/
> > https://patchwork.kernel.org/patch/2484641/
> >
> > I don't know if those patches are related to my issues or not but I
> > plan on trying to reproduce without them soon.
>
> I've reverted the four patches above from my test kernel and still see
> the issue so they don't appear to be the cause.

Hi Shawn,

So 3.4 works and 3.10-rc2 breaks?  It's indeed a fairly large gap;
maybe try to bisect that?  Just to make sure, did you touch any
non-default mlx4 config?  Specifically, did you turn DMFS (Device
Managed Flow Steering) on by setting the mlx4_core module parameter
log_num_mgm_entry_size, or were you using B0 steering (the default)?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-25 15:13 UTC
To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 06:41:05AM +0300, Or Gerlitz wrote:
> On Fri, May 24, 2013 at 7:34 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > > there is a fairly large jump.  I've additionally applied the following
> > > four patches to the 3.10.0-rc2 kernel that I'm testing:
> > >
> > > https://patchwork.kernel.org/patch/2484651/
> > > https://patchwork.kernel.org/patch/2484671/
> > > https://patchwork.kernel.org/patch/2484681/
> > > https://patchwork.kernel.org/patch/2484641/
> > >
> > > I don't know if those patches are related to my issues or not but I
> > > plan on trying to reproduce without them soon.
> >
> > I've reverted the four patches above from my test kernel and still see
> > the issue so they don't appear to be the cause.
>
> Hi Shawn,
>
> So 3.4 works and 3.10-rc2 breaks?  It's indeed a fairly large gap;
> maybe try to bisect that?  Just to make sure, did you touch any
> non-default mlx4 config?  Specifically, did you turn DMFS (Device
> Managed Flow Steering) on by setting the mlx4_core module parameter
> log_num_mgm_entry_size, or were you using B0 steering (the default)?

Initially my goal is to sanity check 3.10 before I start playing with
the knobs, so I haven't explicitly changed any new mlx4 settings yet.
We do however set some non-default values but I'm doing that on both
kernels:

mlx4_core log_num_vlan=7
mlx4_en pfctx=0xff pfcrx=0xff

I may indeed try to bisect this, but first I need to see how easily I
can reproduce it.  I did some more testing last night that left me
feeling certifiably insane.  I'll explain what I saw with hopes that
either it will confirm I'm insane or maybe actually make sense to
someone...

My testing of 3.10 has basically gone like this:

1. I have 40 test machines.  I installed 3.10.0-rc2 on machine 1,
   rebooted, and it came back without any fireworks, so I installed
   3.10.0-rc2 on the remaining 39 machines and rebooted them all in one
   shot.

2. I then started my test applications, and it appeared everything was
   functioning correctly on all machines.  There were some pretty
   significant end-to-end latency regressions in our system, so I
   started to narrow down where the added latency might be coming from
   (interrupts, memory, disk, scheduler, send/receive...).

3. 6 of my 40 machines are configured to receive the same data on
   approximately 350 multicast groups.  I picked machine #1, built a
   new kernel disabling the new adaptive NO_HZ and RCU no-CB settings,
   and rebooted that machine.  When I re-ran my application, machine #1
   was now only receiving data on a small fraction of the multicast
   groups.

4. After puzzling over machine #1 I decided to reboot machine #2 to see
   if it was the reboot or the new kernel or maybe something else.
   When machine #2 came back it was in the same state as machine #1 and
   only received multicast data on a small number of the 350 groups.
   This meant it wasn't my config change but the reboot that triggered
   the issue.

5. While debugging I noticed that tcpdump on machine #1 or #2 caused
   them to suddenly receive data, and simply putting the interface in
   promiscuous mode had the same result.  I rebooted both machine #1
   and #2 several times and each time they had the same issue.  I then
   rebooted them back into 3.4 and they both functioned as expected and
   received data on all 350 groups.  Rebooted them both back into 3.10
   and they were both still broken.  This is when I sent my initial
   email to netdev.

*Here is where I went insane*

6. I still had 6 machines all configured the same and receiving the
   same data.  #1 and #2 were still broken, so I decided to see what
   would happen if I simply rebooted #3.  I rebooted #3, started my
   application, and as I sort of expected #3 no longer received data on
   most of the multicast groups.  The crazy part was that machine #1
   was now working!  I didn't touch that machine at all, just stopped
   and restarted my application.

7. Confused, I rebooted #4.  Again machine #4 was now broken, and
   magically machine #2 started working.

8. When I rebooted machine #5 it came back and received all of the
   data, but it also magically fixed #3.

9. At this point my brain was fried and it was time to go home, so I
   rebooted all machines back to 3.4 and gave up.  I'll revisit this
   again next week.

Thanks,
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Or Gerlitz @ 2013-05-25 19:41 UTC
To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 6:13 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
[...]
> 5. While debugging I noticed that tcpdump on machine #1 or #2 caused
>    them to suddenly receive data, and simply putting the interface in
>    promiscuous mode had the same result.  I rebooted both machine #1
>    and #2 several times and each time they had the same issue.  I then
>    rebooted them back into 3.4 and they both functioned as expected and
>    received data on all 350 groups.  Rebooted them both back into 3.10
>    and they were both still broken.  This is when I sent my initial
>    email to netdev.
[..]

Shawn, thanks for all the details.  Just one small confirmation: when
you moved from 3.4 to 3.10, did you make ANY change to your app?  E.g.
do you still use the same QP type (UD) or did you move to RAW PACKET?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-25 21:37 UTC
To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 10:41:39PM +0300, Or Gerlitz wrote:
> On Sat, May 25, 2013 at 6:13 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> [...]
> > 5. While debugging I noticed that tcpdump on machine #1 or #2 caused
> >    them to suddenly receive data, and simply putting the interface in
> >    promiscuous mode had the same result.  I rebooted both machine #1
> >    and #2 several times and each time they had the same issue.  I then
> >    rebooted them back into 3.4 and they both functioned as expected and
> >    received data on all 350 groups.  Rebooted them both back into 3.10
> >    and they were both still broken.  This is when I sent my initial
> >    email to netdev.
> [..]
>
> Shawn, thanks for all the details.  Just one small confirmation: when
> you moved from 3.4 to 3.10, did you make ANY change to your app?  E.g.
> do you still use the same QP type (UD) or did you move to RAW PACKET?

No modifications have been made to the application.  In this case I'm
using plain old UDP multicast sockets over 10g Ethernet.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-28 20:15 UTC
To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 10:13:47AM -0500, Shawn Bohrer wrote:
> On Sat, May 25, 2013 at 06:41:05AM +0300, Or Gerlitz wrote:
> > On Fri, May 24, 2013 at 7:34 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > > On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > > > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > > > there is a fairly large jump.  I've additionally applied the following
> > > > four patches to the 3.10.0-rc2 kernel that I'm testing:
> > > >
> > > > https://patchwork.kernel.org/patch/2484651/
> > > > https://patchwork.kernel.org/patch/2484671/
> > > > https://patchwork.kernel.org/patch/2484681/
> > > > https://patchwork.kernel.org/patch/2484641/
> > > >
> > > > I don't know if those patches are related to my issues or not but I
> > > > plan on trying to reproduce without them soon.
> > >
> > > I've reverted the four patches above from my test kernel and still see
> > > the issue so they don't appear to be the cause.
> >
> > Hi Shawn,
> >
> > So 3.4 works and 3.10-rc2 breaks?  It's indeed a fairly large gap;
> > maybe try to bisect that?  Just to make sure, did you touch any
> > non-default mlx4 config?  Specifically, did you turn DMFS (Device
> > Managed Flow Steering) on by setting the mlx4_core module parameter
> > log_num_mgm_entry_size, or were you using B0 steering (the default)?
>
> Initially my goal is to sanity check 3.10 before I start playing with
> the knobs, so I haven't explicitly changed any new mlx4 settings yet.
> We do however set some non-default values but I'm doing that on both
> kernels:
>
> mlx4_core log_num_vlan=7
> mlx4_en pfctx=0xff pfcrx=0xff

Naturally I was wrong and we set more than the above non-default
values.  We additionally set high_rate_steer=1 on mlx4_core.  As you
may know this parameter isn't currently available in the upstream
driver, so I've been carrying the following patch in my 3.4 and 3.10
trees:

---
 drivers/net/ethernet/mellanox/mlx4/main.c | 10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 0d32a82..7808e4a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -71,6 +71,11 @@ static int msi_x = 1;
 module_param(msi_x, int, 0444);
 MODULE_PARM_DESC(msi_x, "attempt to use MSI-X if nonzero");
 
+static int high_rate_steer;
+module_param(high_rate_steer, int, 0444);
+MODULE_PARM_DESC(high_rate_steer, "Enable steering mode for higher packet rate"
+		 " (default off)");
+
 #else /* CONFIG_PCI_MSI */
 
 #define msi_x (0)
@@ -288,6 +293,11 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 	if (mlx4_is_mfunc(dev))
 		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
 
+	if (high_rate_steer && !mlx4_is_mfunc(dev)) {
+		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_VEP_UC_STEER;
+		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_VEP_MC_STEER;
+	}
+
 	dev->caps.log_num_macs  = log_num_mac;
 	dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS;
 	dev->caps.log_num_prios = use_prio ? 3 : 0;
--

What I've found really happened is:

1. Installed 3.10, rebooted, and everything worked.  high_rate_steer=1
   was set at this point.

2. Our configuration management software saw the new kernel and
   disabled high_rate_steer.

3. As I rebooted machines, high_rate_steer was cleared and they no
   longer received multicast data on most of their addresses.

I've confirmed that with the above high_rate_steer patch and
high_rate_steer=1 I receive data on 3.10.0-rc3 and with
high_rate_steer=0 I only receive data on a small number of multicast
addresses.  With 3.4 and the same patch I receive data in both cases.

I also previously claimed that rebooting one machine appeared to make a
different machine receive data.  I doubt this was true.  Instead what I
think happened was that each time I start my application a different
set of multicast groups will receive data and the rest will not.  I did
not verify that all groups were actually receiving data and thus am
guessing I just happened to get lucky and see a few new ones working
that previously were not.

So now that we know that high_rate_steer=1 fixes my multicast issue,
does that provide any clues as to why I do not receive data on all
multicast groups without it?  Additionally, as I'm sure I should have
asked earlier, is there a reason the high_rate_steer option has not
been upstreamed?  I can see that the out-of-tree Mellanox driver now
additionally clears MLX4_DEV_CAP_FLAG2_FS_EN when high_rate_steer=1 and
has moved that code into choose_steering_mode(), so my local patch
probably needs an update if this isn't going upstream.

For a little bit of background, the reason we are using the
high_rate_steer=1 option is that it enabled us to handle larger/faster
bursts of packets without dropping packets.  Historically we got very
similar results by using log_num_mgm_entry_size=7, but we stuck with
high_rate_steer=1 simply because we had tried/verified it first.  For
those wondering, using log_num_mgm_entry_size=7 and high_rate_steer=0
on 3.10 does not work since I do not receive data on all multicast
groups.

--
Shawn
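
For reference (this is not spelled out in the thread): the options
being discussed are mlx4 module parameters, normally set in a modprobe
configuration file.  The snippet below is only an illustrative sketch
assembled from the values mentioned above, plus the upstream driver's
documented way of requesting DMFS; it is not a recommended
configuration.

# /etc/modprobe.d/mlx4.conf (illustrative sketch; values taken from this thread)
# high_rate_steer exists only with the local patch carried above, not upstream.
options mlx4_core log_num_vlan=7 high_rate_steer=1
options mlx4_en pfctx=0xff pfcrx=0xff

# Alternative that historically gave similar burst handling here:
#options mlx4_core log_num_mgm_entry_size=7
# Upstream mlx4_core documents log_num_mgm_entry_size=-1 as the way to
# request device managed flow steering (DMFS) when the firmware supports it:
#options mlx4_core log_num_mgm_entry_size=-1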

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Or Gerlitz @ 2013-05-29 13:55 UTC
To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Tue, May 28, 2013 at 11:15 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> Naturally I was wrong and we set more than the above non-default
> values.  We additionally set high_rate_steer=1 on mlx4_core.  As you
> may know this parameter isn't currently available in the upstream
> driver, so I've been carrying the following patch in my 3.4 and 3.10
> trees:
[...]
> I've confirmed that with the above high_rate_steer patch and
> high_rate_steer=1 I receive data on 3.10.0-rc3 and with
> high_rate_steer=0 I only receive data on a small number of multicast
> addresses.  With 3.4 and the same patch I receive data in both cases.
[...]

Shawn, so with the end goal in mind you want the NIC steering mode to
be DMFS (Device Managed Flow Steering), e.g. for the processes
bypassing the kernel, correct?  Since the NIC steering mode is global,
you will not be able to use that non-upstream patch moving forward.  So
we need to debug/bisect why without the patch (what you call
high_rate_steer=0) you don't get data on all groups.  Can you bisect
that on a single node, e.g. set the rest of the environment to 3.4,
which works, and on a given node see what is the commit that breaks
things?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-30 20:31 UTC
To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai, Vlad Yasevich

On Wed, May 29, 2013 at 04:55:32PM +0300, Or Gerlitz wrote:
> On Tue, May 28, 2013 at 11:15 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > Naturally I was wrong and we set more than the above non-default
> > values.  We additionally set high_rate_steer=1 on mlx4_core.  As you
> > may know this parameter isn't currently available in the upstream
> > driver, so I've been carrying the following patch in my 3.4 and 3.10
> > trees:
> [...]
> > I've confirmed that with the above high_rate_steer patch and
> > high_rate_steer=1 I receive data on 3.10.0-rc3 and with
> > high_rate_steer=0 I only receive data on a small number of multicast
> > addresses.  With 3.4 and the same patch I receive data in both cases.
> [...]
>
> Shawn, so with the end goal in mind you want the NIC steering mode to
> be DMFS (Device Managed Flow Steering), e.g. for the processes
> bypassing the kernel, correct?  Since the NIC steering mode is global,
> you will not be able to use that non-upstream patch moving forward.

Yes, the end goal is to use DMFS.  However, we have some ConnectX-2
cards which I guess do not support DMFS, and naturally I'd like plain
old UDP multicast to continue to work at the same level as 3.4.  So I
may still want that high_rate_steer option upstreamed, but we'll see
once I get 3.10 into better shape.

> So we need to debug/bisect why without the patch (what you call
> high_rate_steer=0) you don't get data on all groups.  Can you bisect
> that on a single node, e.g. set the rest of the environment to 3.4,
> which works, and on a given node see what is the commit that breaks
> things?

Done.  It appears that the patch that breaks receiving packets on many
different multicast groups/sockets is:

commit 4cd729b04285b7330edaf5a7080aa795d6d15ff3
Author: Vlad Yasevich <vyasevic@redhat.com>
Date:   Mon Apr 15 09:54:25 2013 +0000

    net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api

    The current implementation of dev_uc_sync/unsync() assumes that there is
    a strict 1-to-1 relationship between the source and destination of the sync.
    In other words, once an address has been synced to a destination device, it
    will not be synced to any other device through the sync API.
    However, there are some virtual devices that aggreate a number of lower
    devices and need to sync addresses to all of them.  The current
    API falls short there.

    This patch introduces a new dev_uc_sync_multiple() api that can be called
    in the above circumstances and allows sync to work for every invocation.

    CC: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

I've confirmed that reverting this patch on top of 3.10-rc3 allows me
to receive packets on all of my multicast groups without the Mellanox
high_rate_steer option set.

--
Shawn
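
A quick note on what the API named in that commit does (this
explanation is not from the thread itself): dev_mc_sync(to, from)
pushes a master device's multicast list down to one lower device and
assumes each address is synced to a single destination, while
dev_mc_sync_multiple() lets a device that aggregates several lower
devices push the same list to all of them.  The sketch below shows how
an aggregating driver's ndo_set_rx_mode might use it; the agg_priv and
agg_slave structures and all naming are invented for illustration and
are not taken from bonding or any other real driver.

/* Illustrative sketch only: an aggregating netdev propagating its
 * address lists to each lower device with the _multiple variants added
 * by commit 4cd729b0.  The structures and names here are hypothetical. */
#include <linux/list.h>
#include <linux/netdevice.h>

struct agg_slave {
	struct net_device *dev;
	struct list_head list;
};

struct agg_priv {
	struct list_head slaves;	/* list of struct agg_slave */
};

static void agg_set_rx_mode(struct net_device *agg_dev)
{
	struct agg_priv *priv = netdev_priv(agg_dev);
	struct agg_slave *slave;

	list_for_each_entry(slave, &priv->slaves, list) {
		/* With plain dev_uc_sync()/dev_mc_sync(), an address that
		 * has already been synced to the first slave is considered
		 * done and is never pushed to the others.  The _multiple
		 * variants track each destination separately, so every
		 * slave ends up with the full unicast/multicast list. */
		dev_uc_sync_multiple(slave->dev, agg_dev);
		dev_mc_sync_multiple(slave->dev, agg_dev);
	}
}

The regression Shawn bisected to, and the fixes Jay mentions later in
the thread, sit in the internals behind these calls rather than in
callers like the sketch above.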

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Or Gerlitz @ 2013-05-30 20:42 UTC
To: Shawn Bohrer, Vlad Yasevich
Cc: netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

On Thu, May 30, 2013 at 11:31 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > So we need to debug/bisect why without the patch (what you call
> > high_rate_steer=0) you don't get data on all groups.  Can you bisect
> > that on a single node, e.g. set the rest of the environment to 3.4,
> > which works, and on a given node see what is the commit that breaks
> > things?
>
> Done.  It appears that the patch that breaks receiving packets on many
> different multicast groups/sockets is:
>
> commit 4cd729b04285b7330edaf5a7080aa795d6d15ff3
> Author: Vlad Yasevich <vyasevic@redhat.com>
> Date:   Mon Apr 15 09:54:25 2013 +0000
>
>     net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api
>
>     The current implementation of dev_uc_sync/unsync() assumes that there is
>     a strict 1-to-1 relationship between the source and destination of the sync.
>     In other words, once an address has been synced to a destination device, it
>     will not be synced to any other device through the sync API.
>     However, there are some virtual devices that aggreate a number of lower
>     devices and need to sync addresses to all of them.  The current
>     API falls short there.
>
>     This patch introduces a new dev_uc_sync_multiple() api that can be called
>     in the above circumstances and allows sync to work for every invocation.
>
>     CC: Jiri Pirko <jiri@resnulli.us>
>     Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> I've confirmed that reverting this patch on top of 3.10-rc3 allows me
> to receive packets on all of my multicast groups without the Mellanox
> high_rate_steer option set.

OK, impressive debugging... so what do we do from here?  Vlad, Shawn
observes a regression once this patch is used on a large-scale setup
that uses many multicast groups (you can read the posts earlier in this
thread).  Does this ring any bell w.r.t. the actual problem in the
patch?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Vlad Yasevich @ 2013-05-30 20:57 UTC
To: Or Gerlitz; +Cc: Shawn Bohrer, netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

On 05/30/2013 04:42 PM, Or Gerlitz wrote:
> On Thu, May 30, 2013 at 11:31 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > Done.  It appears that the patch that breaks receiving packets on many
> > different multicast groups/sockets is:
> >
> > commit 4cd729b04285b7330edaf5a7080aa795d6d15ff3
> > Author: Vlad Yasevich <vyasevic@redhat.com>
> > Date:   Mon Apr 15 09:54:25 2013 +0000
> >
> >     net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api
[...]
> > I've confirmed that reverting this patch on top of 3.10-rc3 allows me
> > to receive packets on all of my multicast groups without the Mellanox
> > high_rate_steer option set.
>
> OK, impressive debugging... so what do we do from here?  Vlad, Shawn
> observes a regression once this patch is used on a large-scale setup
> that uses many multicast groups (you can read the posts earlier in this
> thread).  Does this ring any bell w.r.t. the actual problem in the
> patch?

I haven't seen that, but I didn't test with that many multicast groups.
I had 20 groups working.

I'll take a look and see what might be going on.

Thanks
-vlad

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Jay Vosburgh @ 2013-05-31 0:23 UTC
To: vyasevic
Cc: Or Gerlitz, Shawn Bohrer, netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

Vlad Yasevich <vyasevic@redhat.com> wrote:
> > >     CC: Jiri Pirko <jiri@resnulli.us>
> > >     Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
> > >     Signed-off-by: David S. Miller <davem@davemloft.net>
> > >
> > > I've confirmed that reverting this patch on top of 3.10-rc3 allows me
> > > to receive packets on all of my multicast groups without the Mellanox
> > > high_rate_steer option set.
> >
> > OK, impressive debugging... so what do we do from here?  Vlad, Shawn
> > observes a regression once this patch is used on a large-scale setup
> > that uses many multicast groups (you can read the posts earlier in this
> > thread).  Does this ring any bell w.r.t. the actual problem in the
> > patch?
>
> I haven't seen that, but I didn't test with that many multicast groups.
> I had 20 groups working.
>
> I'll take a look and see what might be going on.

I've actually been porting bonding to the dev_sync/unsync system, and
have a patch series of 4 fixes to various internals of dev_sync/unsync;
I'll post those under separate cover.  It may be that one or more of
those things are the source of this problem (or I might have it all
wrong).

-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-31 15:17 UTC
To: Jay Vosburgh
Cc: vyasevic, Or Gerlitz, netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

On Thu, May 30, 2013 at 05:23:20PM -0700, Jay Vosburgh wrote:
> Vlad Yasevich <vyasevic@redhat.com> wrote:
> > I haven't seen that, but I didn't test with that many multicast groups.
> > I had 20 groups working.
> >
> > I'll take a look and see what might be going on.
>
> I've actually been porting bonding to the dev_sync/unsync system, and
> have a patch series of 4 fixes to various internals of dev_sync/unsync;
> I'll post those under separate cover.  It may be that one or more of
> those things are the source of this problem (or I might have it all
> wrong).

Thanks Jay, I've tested your 4 patches on top of Linus' tree and they
do solve the multicast issue I was seeing in this thread.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Or Gerlitz @ 2013-05-25 3:49 UTC
To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Fri, May 24, 2013 at 6:49 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> I just started testing the 3.10 kernel, previously we were on 3.4 so
> there is a fairly large jump.
[...]
> 05:00.0 Network controller: Mellanox Technologies MT27500 Family
> [ConnectX-3]
>
> # ethtool -i eth4
> driver: mlx4_en
> version: 2.0 (Dec 2011)
> firmware-version: 2.11.500

Did you change firmware between the point where things were working and
where you are now?

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
From: Shawn Bohrer @ 2013-05-25 14:02 UTC
To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Sat, May 25, 2013 at 06:49:22AM +0300, Or Gerlitz wrote:
> On Fri, May 24, 2013 at 6:49 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > there is a fairly large jump.
> [...]
>
> > 05:00.0 Network controller: Mellanox Technologies MT27500 Family
> > [ConnectX-3]
> >
> > # ethtool -i eth4
> > driver: mlx4_en
> > version: 2.0 (Dec 2011)
> > firmware-version: 2.11.500
>
> Did you change firmware between the point where things were working and
> where you are now?

Nope, we've been using 2.11.500 for a while now.

--
Shawn