From: bugzilla-daemon@bugzilla.kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 209253] Loss of connectivity on guest after important host <-> guest traffic
Date: Fri, 23 Oct 2020 18:23:51 +0000

https://bugzilla.kernel.org/show_bug.cgi?id=209253

--- Comment #7 from Alex Williamson (alex.williamson@redhat.com) ---
Color me suspicious, but there are backtraces from two configurations in the comments here that have no vfio devices: the original post and Justin's second trace. The identified commit can only affect vfio configurations. All of the backtraces seem to be from triggering this warning:

__u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
{
	unsigned long flags;

	/*
	 * Deadlock or stack overflow issues can happen if we recurse here
	 * through waitqueue wakeup handlers. If the caller users potentially
	 * nested waitqueues with custom wakeup handlers, then it should
	 * check eventfd_signal_count() before calling this function. If
	 * it returns true, the eventfd_signal() call should be deferred to a
	 * safe context.
	 */
	if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count)))
		return 0;

This cpu-local counter is only incremented while holding a spinlock with IRQs disabled while handling the wait queue. It's not obvious to me how the backtraces shown can lead to recursive eventfd signals.

I've set up a configuration for stress testing, but any detailed description of a reliable reproducer would be appreciated.
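
For anyone following along, here is a rough user-space sketch of the kind of nesting that warning is meant to catch. It is not the kernel code: `sketch_ctx`, `sketch_signal()`, `recursive_handler()` and the `dropped` field are hypothetical names, a thread-local variable only stands in for the per-CPU `eventfd_wake_count`, and there is no spinlock or IRQ handling at all. It only illustrates that the counter is nonzero for exactly as long as wakeup handlers are running, so a handler that signals again hits the guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_ctx;
typedef void (*wake_handler_t)(struct sketch_ctx *ctx);

/* Hypothetical stand-in for an eventfd context with one custom wakeup handler. */
struct sketch_ctx {
	uint64_t count;          /* accumulated signal count */
	wake_handler_t handler;  /* invoked while "handling the wait queue" */
	int dropped;             /* nested calls rejected by the guard */
};

/* Analogy only: the kernel uses a per-CPU counter, not a thread-local. */
static _Thread_local int wake_count;

static uint64_t sketch_signal(struct sketch_ctx *ctx, uint64_t n)
{
	/* Mirrors the quoted check: refuse to recurse through wakeup handlers. */
	if (wake_count) {
		ctx->dropped++;
		return 0;
	}

	wake_count++;            /* nonzero only while handlers run */
	if (UINT64_MAX - ctx->count < n)
		n = UINT64_MAX - ctx->count;
	ctx->count += n;
	if (ctx->handler)
		ctx->handler(ctx);
	wake_count--;

	assert(wake_count == 0); /* the guard must be balanced on exit */
	return n;
}

/* A handler that signals again, i.e. the nesting the warning guards against. */
static void recursive_handler(struct sketch_ctx *ctx)
{
	sketch_signal(ctx, 1);
}
```

With `recursive_handler` installed, the outer `sketch_signal()` call succeeds and the nested one is rejected and counted in `dropped`. In this sketch the only way to trip the guard is a signal issued from inside a wakeup handler, which is why it would help to know which handler in the reported backtraces could be doing that.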