From: Marco Elver
Date: Mon, 11 Oct 2021 08:48:29 +0200
Subject: Re: BUG: soft lockup in __kmalloc_node() with KFENCE enabled
To: Andrea Righi
Cc: Alexander Potapenko, Dmitry Vyukov, kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Mon, 11 Oct 2021 at 08:32, Andrea Righi wrote:
> On Mon, Oct 11, 2021 at 08:00:00AM +0200, Marco Elver wrote:
> > On Sun, 10 Oct 2021 at 15:53, Andrea Righi wrote:
> > > I can systematically reproduce the following soft lockup w/ the latest
> > > 5.15-rc4 kernel (and all the 5.14, 5.13 and 5.12 kernels that I've
> > > tested so far).
> > >
> > > I've found this issue by running systemd autopkgtest (I'm using the
> > > latest systemd in Ubuntu - 248.3-1ubuntu7 - but it should happen with
> > > any recent version of systemd).
> > >
> > > I'm running this test inside a local KVM instance and apparently systemd
> > > is starting up its own KVM instances to run its tests, so the context is
> > > a nested KVM scenario (even if I don't think the nested KVM part really
> > > matters).
> > >
> > > Here's the oops:
> > >
> > > [ 36.466565] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [udevadm:333]
> > > [ 36.466565] Modules linked in: btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear psmouse floppy
> > > [ 36.466565] CPU: 0 PID: 333 Comm: udevadm Not tainted 5.15-rc4
> > > [ 36.466565] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > [...]
> > >
> > > If I disable CONFIG_KFENCE the soft lockup doesn't happen and systemd
> > > autotest completes just fine.
> > >
> > > We've decided to disable KFENCE in the latest Ubuntu Impish kernel
> > > (5.13) for now, because of this issue, but I'm still investigating
> > > trying to better understand the problem.
> > >
> > > Any hint / suggestion?
> >
> > Can you confirm this is not a QEMU TCG instance? There's been a known
> > issue with it: https://bugs.launchpad.net/qemu/+bug/1920934
>
> It looks like systemd is running qemu-system-x86 without any "accel"
> options, so IIUC the instance shouldn't use TCG. Is this a correct
> assumption or is there a better way to check?

AFAIK, the default is TCG if nothing else is requested. What was the
command line?

> > One thing that I've been wondering is, if we can make
> > CONFIG_KFENCE_STATIC_KEYS=n the default, because the static keys
> > approach is becoming more trouble than it's worth. It requires us to
> > re-benchmark the defaults. If you're thinking of turning KFENCE on by
> > default (i.e. CONFIG_KFENCE_SAMPLE_INTERVAL non-zero), you could make
> > this decision for Ubuntu with whatever sample interval you choose.
> > We've found that for large deployments 500ms or above is more than
> > adequate.
>
> Another thing that I forgot to mention is that with
> CONFIG_KFENCE_STATIC_KEYS=n the soft lockup doesn't seem to happen.

Thanks for confirming.

Thanks,
-- Marco
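
One rough way to answer the "what was the command line?" question is to scan the QEMU invocation for an explicit accelerator request. The sketch below is only an illustration: the function name requested_accelerators is hypothetical, and it only recognizes the -enable-kvm, -accel and -machine/-M accel= spellings. An empty result means no accelerator was requested explicitly, in which case (per the reply above, AFAIK) QEMU would fall back to TCG.

```python
import shlex
from typing import List


def requested_accelerators(cmdline: str) -> List[str]:
    """Return the accelerators explicitly requested on a QEMU command line.

    Hypothetical helper: only inspects -enable-kvm, -accel NAME[,props]
    and -machine/-M ...,accel=A[:B]. An empty list means nothing was
    requested explicitly.
    """
    args = shlex.split(cmdline)
    found: List[str] = []
    for i, arg in enumerate(args):
        nxt = args[i + 1] if i + 1 < len(args) else ""
        if arg in ("-enable-kvm", "--enable-kvm"):
            found.append("kvm")
        elif arg in ("-accel", "--accel") and nxt:
            # "-accel kvm,thread=multi" -> "kvm"
            found.append(nxt.split(",")[0])
        elif arg in ("-machine", "--machine", "-M") and nxt:
            # "-machine pc,accel=kvm:tcg" -> "kvm", "tcg"
            for prop in nxt.split(","):
                if prop.startswith("accel="):
                    found.extend(prop[len("accel="):].split(":"))
    return found
```

The command line itself can usually be read from /proc/<pid>/cmdline of the running qemu-system-x86 process (arguments there are NUL-separated, so they need joining before being passed to a helper like this one).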