All of lore.kernel.org
 help / color / mirror / Atom feed
From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [Bug 1298] memif disconnect thread vs. rx_burst worker thread(s) crash
Date: Thu, 12 Oct 2023 15:37:37 +0000	[thread overview]
Message-ID: <bug-1298-3@http.bugs.dpdk.org/> (raw)

[-- Attachment #1: Type: text/plain, Size: 4395 bytes --]

https://bugs.dpdk.org/show_bug.cgi?id=1298

            Bug ID: 1298
           Summary: memif disconnect thread vs. rx_burst worker thread(s)
                    crash
           Product: DPDK
           Version: 21.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: bly454@gmail.com
  Target Milestone: ---

Hello,

We have run into a timing issue between threads when using the memif interface
type and need some guidance.

Our application has a DPDK based process operating (among other things) a memif
server interface. The problem is exposed when this memif interface receives a
memif.disconnect message from the remote client, while in the middle of an
rte_eth_rx_burst() on this same memif interface. As the IRQ message handling is
on its own thread as compared to the DPDK worker thread doing the rx_burst,
this resulted in a crash. The backtraces for which have been shared below.

How does one ensure there are guard rails in place to gracefully exit the
rx-burst when a disconnect occurs? Or, how do we properly modify the code such
that  we defer responding to the disconnect CB after the rx-burst operation has
completed?

We are utilizing DPDK 21.11.2. I have diff’d dpdks-stable:22.11.3 in
./drivers/net/memif, but I do not see anything obvious that would address this.
I did a similar diff for dpdk:23.07, but do not see anything obvious there
either.

-Mike

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f17e2813600 (LWP 470))]
#0  0x00007f17e374d225 in eth_memif_rx (queue=0x1189023b00,
bufs=0x7f17e28100e8, nb_pkts=32) at
../git/drivers/net/memif/rte_eth_memif.c:338
338                     last_slot = __atomic_load_n(&ring->head,
__ATOMIC_ACQUIRE);
(gdb) bt
#0  0x00007f17e374d225 in eth_memif_rx (queue=0x1189023b00,
bufs=0x7f17e28100e8, nb_pkts=32) at
../git/drivers/net/memif/rte_eth_memif.c:338
#1  0x000000000047e6fb in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7f17e28100e8,
queue_id=0, port_id=<optimized out>) at /usr/include/rte_ethdev.h:5368
#2  pmd_main_loop () at ../git/swfw/api/src/swfwPmd.c:1086
#3  0x000000000047f309 in pmd_launch_one_lcore (dummy=<optimized out>) at
../git/my_process.c:1157
#4  0x00007f17f7070e7c in eal_thread_loop (arg=<optimized out>) at
../git/lib/eal/linux/eal_thread.c:146
#5  0x00007f17f4c3da72 in start_thread (arg=<optimized out>) at
pthread_create.c:442
#6  0x00007f17f4cbf930 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) l
333             ring_size = 1 << mq->log2_ring_size;
334             mask = ring_size - 1;
335
336             if (type == MEMIF_RING_C2S) {
337                     cur_slot = mq->last_head;
338                     last_slot = __atomic_load_n(&ring->head,
__ATOMIC_ACQUIRE);
339             } else {
340                     cur_slot = mq->last_tail;
341                     last_slot = __atomic_load_n(&ring->tail,
__ATOMIC_ACQUIRE);
342             }
(gdb) p ring->head
Cannot access memory at address 0x7f17d8e58006

(gdb) thread 19
[Switching to thread 19 (Thread 0x7f17f0804600 (LWP 468))]
#0  0x00007f17f4caf97b in __GI___close (fd=494) at
../sysdeps/unix/sysv/linux/close.c:27
27        return SYSCALL_CANCEL (close, fd);
(gdb) bt
#0  0x00007f17f4caf97b in __GI___close (fd=494) at
../sysdeps/unix/sysv/linux/close.c:27
#1  0x00007f17e374f01f in memif_free_regions (dev=dev@entry=0x7f17f727f000
<rte_eth_devices+99072>) at ../git/drivers/net/memif/rte_eth_memif.c:882
#2  0x00007f17e37475d0 in memif_disconnect (dev=0x7f17f727f000
<rte_eth_devices+99072>) at ../git/drivers/net/memif/memif_socket.c:623
#3  0x00007f17f7091bd2 in eal_intr_process_interrupts (nfds=<optimized out>,
events=<optimized out>) at ../git/lib/eal/linux/eal_interrupts.c:1026
#4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=20) at
../git/lib/eal/linux/eal_interrupts.c:1100
#5  eal_intr_thread_main (arg=<optimized out>) at
../git/lib/eal/linux/eal_interrupts.c:1172
#6  0x00007f17f4c3da72 in start_thread (arg=<optimized out>) at
pthread_create.c:442
#7  0x00007f17f4cbf930 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

-- 
You are receiving this mail because:
You are the assignee for the bug.

[-- Attachment #2: Type: text/html, Size: 6370 bytes --]

                 reply	other threads:[~2023-10-12 15:37 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-1298-3@http.bugs.dpdk.org/ \
    --to=bugzilla@dpdk.org \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.