From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [dpdk-dev] [Bug 440] net/mlx5: Read of "out_of_buffer" using fopen/fread/fclose causing TLB shootdowns due to mmap/munmap
Date: Mon, 06 Apr 2020 04:10:11 +0000 [thread overview]
Message-ID: <bug-440-3@http.bugs.dpdk.org/> (raw)
https://bugs.dpdk.org/show_bug.cgi?id=440
Bug ID: 440
Summary: net/mlx5: Read of "out_of_buffer" using
fopen/fread/fclose causing TLB shootdowns due to
mmap/munmap
Product: DPDK
Version: 19.11
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: mohsinshaikh@niometrics.com
Target Milestone: ---
Created attachment 86
--> https://bugs.dpdk.org/attachment.cgi?id=86&action=edit
Potential fix patch
Setup: HPE ProLiant DL560 Gen10 Server with 4 x "Intel(R) Xeon(R) Platinum
8180M CPU @ 2.50GHz" and 6TB of memory (1 TB reserved huge pages) with 2 X
"Mellanox Technologies ConnectX®-5 EN network interface card, 100GbE
single-port QSFP28, PCIe3.0 x16, tall bracket; MCX515A-CCAT". OS is Centos 7
with kernel 3.10.0-957.27.2.el7.x86_64.
While testing I noticed that the rx cores of the DPDK app were getting TLB
shootdowns periodically. After debugging I found that the stats thread of the
DPDK app was responsible for those. Specifically the read of "out_of_buffer"
stat:
static inline void
mlx5_read_ib_stat(struct mlx5_priv *priv, const char *ctr_name, uint64_t *stat)
{
FILE *file;
if (priv->sh) {
MKSTR(path, "%s/ports/%d/hw_counters/%s",
priv->sh->ibdev_path,
priv->ibv_port,
ctr_name);
file = fopen(path, "rb");
if (file) {
int n = fscanf(file, "%" SCNu64, stat); <----- Calls
mmap().
fclose(file); <---- Calls munmap which leads to TLB
shootdown
if (n == 1)
return;
}
}
*stat = 0;
}
NOTE: We almost exclusively use huge pages in the DPDK app for allocations.
Stack traces:
(gdb) where
#0 __mmap (addr=addr@entry=0x0, len=4096, prot=prot@entry=3,
flags=flags@entry=34, fd=fd@entry=-1, offset=offset@entry=0) at
../sysdeps/unix/sysv/linux/wordsize-64/mmap.c:32
#1 0x00007fffdeaf8ed1 in __GI__IO_file_doallocate (fp=0x2b52263ae000) at
filedoalloc.c:127
#2 0x00007fffdeb07d07 in __GI__IO_doallocbuf (fp=fp@entry=0x2b52263ae000) at
genops.c:399
#3 0x00007fffdeb06c8c in _IO_new_file_underflow (fp=0x2b52263ae000) at
fileops.c:557
#4 0x00007fffdeb07dd2 in __GI__IO_default_uflow (fp=0x2b52263ae000) at
genops.c:414
#5 0x00007fffdeae7efa in _IO_vfscanf_internal (s=<optimized out>,
format=<optimized out>, argptr=argptr@entry=0x7fff51f07498,
errp=errp@entry=0x0) at vfscanf.c:600
#6 0x00007fffdeaed617 in ___vfscanf (s=<optimized out>, format=<optimized
out>, argptr=argptr@entry=0x7fff51f07498) at vfscanf.c:2942
#7 0x00007fffdeaf6247 in __fscanf (stream=<optimized out>, format=<optimized
out>) at fscanf.c:31
#8 0x00007fffe632ee9f in mlx5_stats_get () from
/opt/nio/dpdk/lib/librte_pmd_mlx5.so.20.0
#9 0x00007fffeee6b468 in rte_eth_stats_get () from
/opt/nio/dpdk/lib/librte_ethdev.so.20.0
Breakpoint 2, munmap () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) where
#0 munmap () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007fffdeb07cb2 in __GI__IO_setb (f=f@entry=0x2b52263ae000,
b=b@entry=0x0, eb=eb@entry=0x0, a=a@entry=0) at genops.c:383
#2 0x00007fffdeb06090 in _IO_new_file_close_it (fp=fp@entry=0x2b52263ae000) at
fileops.c:192
#3 0x00007fffdeaf9138 in _IO_new_fclose (fp=0x2b52263ae000) at iofclose.c:58
#4 0x00007fffe632eeaa in mlx5_stats_get () from
/opt/nio/dpdk/lib/librte_pmd_mlx5.so.20.0
#5 0x00007fffeee6b468 in rte_eth_stats_get () from
/opt/nio/dpdk/lib/librte_ethdev.so.20.0
Output from kernel tracing (using trace-cmd):
<...>-120694 [000] 502442.376421: function:
native_flush_tlb_others
<...>-120694 [000] 502442.376423: kernel_stack: <stack
trace>
=> tlb_flush_mmu.part.76 (ffffffffa89e57d7)
=> tlb_finish_mmu (ffffffffa89e70d5)
=> unmap_region (ffffffffa89f0754)
=> do_munmap (ffffffffa89f2d45)
=> vm_munmap (ffffffffa89f2f85)
=> SyS_munmap (ffffffffa89f4212)
=> tracesys (ffffffffa8f7706b)
<...>-120324 [002] 502442.382250: function:
flush_tlb_func
<...>-120323 [001] 502442.382250: function:
flush_tlb_func
<...>-120325 [003] 502442.382251: function:
flush_tlb_func
<...>-120327 [005] 502442.382251: function:
flush_tlb_func
<...>-120326 [004] 502442.382251: function:
flush_tlb_func
<...>-120332 [010] 502442.382251: function:
flush_tlb_func
<...>-120324 [002] 502442.382252: kernel_stack: <stack
trace>
=> generic_smp_call_function_single_interrupt (ffffffffa8912ea3)
=> smp_call_function_interrupt (ffffffffa885737d)
=> call_function_interrupt (ffffffffa8f79382)
The following patch seems to fix the issue:
=========================================================================
commit e4e93bd3ffb4e75b2178b0fdfe12341bfddf171d
Author: Mohsin Shaikh <mohsinshaikh@niometrics.com>
Date: Tue Mar 31 14:28:41 2020 +0800
net/mlx5: Use open/read/close for reading ib stat
diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
index 205e4fe..769f080 100644
--- a/drivers/net/mlx5/mlx5_stats.c
+++ b/drivers/net/mlx5/mlx5_stats.c
@@ -8,6 +8,8 @@
#include <linux/ethtool.h>
#include <stdint.h>
#include <stdio.h>
+#include <unistd.h>
+#include <fcntl.h>
#include <rte_ethdev_driver.h>
#include <rte_common.h>
@@ -139,20 +141,22 @@
static inline void
mlx5_read_ib_stat(struct mlx5_priv *priv, const char *ctr_name, uint64_t
*stat)
{
- FILE *file;
+ int fd;
if (priv->sh) {
MKSTR(path, "%s/ports/%d/hw_counters/%s",
priv->sh->ibdev_path,
priv->ibv_port,
ctr_name);
- file = fopen(path, "rb");
- if (file) {
- int n = fscanf(file, "%" SCNu64, stat);
-
- fclose(file);
- if (n == 1)
+ fd = open(path, O_RDONLY);
+ if (fd != -1) {
+ char buf[32];
+ ssize_t n = read(fd, buf, sizeof(buf));
+ close(fd);
+ if (n != -1) {
+ sscanf(buf, "%lu", stat);
return;
+ }
}
}
*stat = 0;
=========================================================================
Patch attached to bug.
These TLB shootdowns lead to packets being dropped sporadically by the mlx5
NICs with the "rx_discards_phy" counter being incremented. The above patch
significantly reduced the drops in our app.
--
You are receiving this mail because:
You are the assignee for the bug.
reply other threads:[~2020-04-06 4:10 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bug-440-3@http.bugs.dpdk.org/ \
--to=bugzilla@dpdk.org \
--cc=dev@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.