From: bugzilla-daemon@bugzilla.kernel.org
To: linuxppc-dev@lists.ozlabs.org
Subject: [Bug 206669] Little-endian kernel crashing on POWER8 on heavy big-endian PowerKVM load
Date: Wed, 26 Feb 2020 07:26:31 +0000
X-Bugzilla-Product: Platform Specific/Hardware
X-Bugzilla-Component: PPC-64
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P1

https://bugzilla.kernel.org/show_bug.cgi?id=206669

--- Comment #2 from John Paul Adrian Glaubitz (glaubitz@physik.fu-berlin.de) ---
(In reply to npiggin from comment #1)
> Thanks for the report, we need to get more data about the first BUG if
> we can. What function in your vmlinux contains address
> 0xc00000000017a778?
> (use nm or objdump etc)

Seems to be t select_task_rq_fair:

root@watson:/boot# nm vmlinux-5.4.0-0.bpo.3-powerpc64le |grep -C5 c00000000017a
c000000000448550 T select_estimate_accuracy
c000000000170d20 t select_fallback_rq
c000000000e4c940 D select_idle_mask
c000000000179f10 t select_idle_sibling
c00000000018fd80 t select_task_rq_dl
c00000000017a640 t select_task_rq_fair
c000000000177f50 t select_task_rq_idle
c00000000018c9e0 t select_task_rq_rt
c00000000019c800 t select_task_rq_stop
c000000000927710 t selem_alloc.isra.6
c000000000926e50 t selem_link_map
root@watson:/boot#

> Is that the first message you get,
> No warnings or anything else earlier in the dmesg?

Correct. You can see the login prompt of the host VM watson directly after
booting up.

> Also 0xc0000000002659a0 would be interesting.

Looks like that's ring_buffer_record_off:

root@watson:/boot# nm vmlinux-5.4.0-0.bpo.3-powerpc64le |grep -C5 c0000000002659
c0000000002667e0 T ring_buffer_read_finish
c00000000026b4b0 T ring_buffer_read_page
c000000000265e10 T ring_buffer_read_prepare
c000000000265ef0 T ring_buffer_read_prepare_sync
c000000000269ae0 T ring_buffer_read_start
c000000000265950 T ring_buffer_record_disable
c000000000266070 T ring_buffer_record_disable_cpu
c000000000265970 T ring_buffer_record_enable
c0000000002660c0 T ring_buffer_record_enable_cpu
c00000000026d470 T ring_buffer_record_is_on
c00000000026d480 T ring_buffer_record_is_set_on
c000000000265990 T ring_buffer_record_off
c000000000265a10 T ring_buffer_record_on
c000000000266da0 T ring_buffer_reset
c000000000266a90 T ring_buffer_reset_cpu
c000000000267cd0 T ring_buffer_resize
c00000000026d400 T ring_buffer_set_clock
root@watson:/boot#

FWIW, the kernel image comes from this Debian package:

>
> http://snapshot.debian.org/archive/debian/20200211T210433Z/pool/main/l/linux/linux-image-5.4.0-0.bpo.3-powerpc64le_5.4.13-1%7Ebpo10%2B1_ppc64el.deb

> When reproducing, do you ever get a clean trace of the first bug?

I have logged everything that showed up on the console during and after the
crash. After that, the machine no longer responds and has to be hard-reset.

> Could you try setting /proc/sys/kernel/panic_on_oops and reproducing?

I will try that. Anything to be considered for the kernel running inside the
big-endian VM?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
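
For reference, the address-to-function lookups done by eye in the nm listings above can be sketched in Python. This is only an illustration: the helper names parse_nm and containing_symbol are hypothetical, and SAMPLE is a small excerpt of the nm lines quoted in this comment. Because nm sorts by name by default, the sketch sorts by address first (running nm -n would give the same ordering). With an incomplete symbol list such as a grep excerpt, a closer symbol may be missing, so the result is a best guess.

```python
import bisect

# Excerpt of the nm output quoted above (not the full symbol table).
SAMPLE = """\
c000000000179f10 t select_idle_sibling
c00000000017a640 t select_task_rq_fair
c000000000177f50 t select_task_rq_idle
c000000000265990 T ring_buffer_record_off
c000000000265a10 T ring_buffer_record_on
"""


def parse_nm(text):
    """Parse nm lines of the form '<hex address> <type> <name>'.

    Returns a list of (address, type, name) tuples sorted by address.
    Lines that do not have exactly three fields (for example undefined
    symbols, which carry no address) are skipped.
    """
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, kind, name = parts
        try:
            symbols.append((int(addr, 16), kind, name))
        except ValueError:
            continue  # first field was not a hex address
    symbols.sort()
    return symbols


def containing_symbol(symbols, address):
    """Return (name, offset) for the closest symbol at or below address.

    Returns None if every known symbol starts above the address.
    """
    addrs = [s[0] for s in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None
    base, _kind, name = symbols[i]
    return name, address - base


if __name__ == "__main__":
    syms = parse_nm(SAMPLE)
    for addr in (0xC00000000017A778, 0xC0000000002659A0):
        name, off = containing_symbol(syms, addr)
        print(f"{addr:#x} -> {name}+{off:#x}")
```

On this excerpt the two addresses asked about resolve to select_task_rq_fair+0x138 and ring_buffer_record_off+0x10, which matches the symbols named in the replies above.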