From: Thomas Gleixner
To: reveliofuzzing, Dave Hansen
Cc: mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kirill.shutemov@linux.intel.com, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: reproducible GPF error in native_tss_update_io_bitmap
References: <57af7a02-c765-409d-a04c-bf74e747f8b6@intel.com>
Date: Wed, 15 Jan 2025 15:22:14 +0100
Message-ID: <871px4cg2x.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 07 2025 at 19:24, reveliofuzzing@gmail.com wrote:
> On Tue, Jan 7, 2025 at 4:28 PM Dave Hansen wrote:
>>
>> On 1/6/25 18:32, reveliofuzzing wrote:
>> > Hello,
>> >
>> > We found the following general protection fault bug in Linux kernel 6.12, and
>> > it can be reproduced stably in a QEMU VM. To our knowledge, this problem has not
>> > been observed by SyzBot so we would like to report it for your reference.
>> >
>> > - dmesg
>> > syzkaller login: [ 90.849309] Oops: general protection fault,
>> > probably for non-canonical address 0xdffffc0000000000: 0000 [#1]
>> > PREEMPTI
>> > [ 90.853735] KASAN: null-ptr-deref in range
>> > [0x0000000000000000-0x0000000000000007]
>> > [ 90.856772] CPU: 0 PID: 3265 Comm: iou-sqp-3264 Not tainted 6.10.0 #2
>> > [ 90.859386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> > BIOS 1.13.0-1ubuntu1.1 04/01/2014
>> > [ 90.862774] RIP: 0010:native_tss_update_io_bitmap+0x143/0x510
>>
>> The whole thing looks like an issue in the failure path when trying to
>> create an io_uring io worker thread. It's probably some confusion in
>> treating the worker thread like a userspace thread with an io bitmap
>> when the worker thread doesn't have one.
>>
>> It's _probably_ only reproducible with io_uring. It's arguable whether
>> it's likely an x86 issue or an io_uring issue.
>>
>> In any case, running:
>>
>> scripts/decode_stacktrace.sh
>>
> Here is the output of running this script:

So looking at it from the call chain:

> [ 90.935319] copy_process (linux-6.12/kernel/fork.c:1764

This means copy_process() failed at some point and then invokes:

> [ 90.933995] exit_thread (linux-6.12/arch/x86/kernel/process.c:122)

which in turn invokes:

> [ 90.932599] io_bitmap_exit (linux-6.12/arch/x86/kernel/ioport.c:58)
> linux-6.12/arch/x86/kernel/ioport.c:36 - task_update_io_bitmap()
> linux-6.12/arch/x86/kernel/ioport.c:48 - tss_update_io_bitmap()

which ends up in native_tss_update_io_bitmap()

> [ 90.853735] KASAN: null-ptr-deref in range
> [0x0000000000000000-0x0000000000000007]
> [ 90.856772] CPU: 0 PID: 3265 Comm: iou-sqp-3264 Not tainted 6.10.0 #2
> [ 90.859386] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.13.0-1ubuntu1.1 04/01/2014
> [ 90.862774] RIP: 0010:native_tss_update_io_bitmap
> (linux-6.12/arch/x86/kernel/process.c:470)

Which is this code:

	if (tss->io_bitmap.prev_sequence != iobm->sequence)

where @iobm is NULL.

But to reach that code it's required that the current task has
TIF_IO_BITMAP set, which is wrong to begin with.

The context of this is:

[ 90.963627] io_sq_thread+0xaf9/0x1620

which is an IO worker thread created via io_uring_setup(). So that thread
inherits the user space thread TIF flags and indeed the syzkaller
reproducer does:

	syscall(__NR_ioperm, /*from=*/0ul, /*num=*/0xd8ul, /*on=*/0x80000000ul);

before invoking the io_uring setup.

That inheritance is wrong and easy to fix, but that does not explain the
actual failure. That's a bit more subtle.

copy_thread() sets p->thread.io_bitmap to NULL, leaves TIF_IO_BITMAP set
and then returns before sharing the bitmap of the parent thread.

Now copy_process() fails after copy_thread() and invokes exit_thread()
and due to TIF_IO_BITMAP being set it calls io_bitmap_exit().
The latter invokes task_update_io_bitmap(), which clears the newly
created thread's TIF_IO_BITMAP flag because the new thread has neither
iopl_emul == 3 nor a valid IO bitmap.

task_update_io_bitmap() then invokes tss_update_io_bitmap(), which
checks the current thread's TIF_IO_BITMAP bit, which is set, but the
io_sq_thread (current) has neither iopl_emul == 3 nor a bitmap pointer.
Game over.

Invoking task_update_io_bitmap() in the failure path of copy_process()
is completely wrong as the newly created task never got active and
therefore has never changed the TSS side. So invalidating or updating
anything here is just bogus. The only important part is to drop the
reference count on the bitmap if it got shared in copy_thread().

Tentative uncompiled fix below.

Thanks,

        tglx
---
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index e2fab3ceb09f..fa7113babc8e 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -21,6 +21,9 @@ static atomic64_t io_bitmap_sequence;
 
 void io_bitmap_share(struct task_struct *tsk)
 {
+	/* Inherit the IOPL level of the parent */
+	tsk->thread.iopl_emul = current->thread.iopl_emul;
+
 	/* Can be NULL when current->thread.iopl_emul == 3 */
 	if (current->thread.io_bitmap) {
 		/*
@@ -54,7 +57,12 @@ void io_bitmap_exit(struct task_struct *tsk)
 	struct io_bitmap *iobm = tsk->thread.io_bitmap;
 
 	tsk->thread.io_bitmap = NULL;
-	task_update_io_bitmap(tsk);
+	/*
+	 * Only update when a task is exiting, not when a newly created
+	 * task is mopped up in the failure path of copy_process().
+	 */
+	if (tsk == current)
+		task_update_io_bitmap(tsk);
 	if (iobm && refcount_dec_and_test(&iobm->refcnt))
 		kfree(iobm);
 }
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index f63f8fd00a91..89c16dc135cf 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -176,6 +176,8 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 	p->thread.sp = (unsigned long) fork_frame;
 	p->thread.io_bitmap = NULL;
 	p->thread.iopl_warn = 0;
+	p->thread.iopl_emul = 0;
+	clear_tsk_thread_flag(p, TIF_IO_BITMAP);
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
 
 #ifdef CONFIG_X86_64
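
For readers who want to step through the failure sequence described above
without booting a kernel, here is a rough user-space model of it. This is
only a sketch under simplifying assumptions and is not kernel code: struct
task, model_copy_thread(), model_io_bitmap_exit(), tss_update_would_fault()
and simulate_failed_fork() are made-up names, the TIF flag is reduced to a
bool, the bitmap reference counting is left out, and "fixed" stands for the
copy_thread() and io_bitmap_exit() changes of the tentative patch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, reduced to what matters. */
struct io_bitmap { unsigned long sequence; };

struct task {
	bool tif_io_bitmap;          /* models TIF_IO_BITMAP */
	int iopl_emul;               /* models thread.iopl_emul */
	struct io_bitmap *io_bitmap; /* models thread.io_bitmap */
};

/*
 * Models the check in native_tss_update_io_bitmap(), which always looks at
 * the *current* task. Returns true where the kernel would read
 * iobm->sequence through a NULL pointer.
 */
static bool tss_update_would_fault(const struct task *cur)
{
	if (!cur->tif_io_bitmap)
		return false;            /* TSS bitmap is simply invalidated */
	if (cur->iopl_emul == 3)
		return false;            /* all ports valid, no bitmap needed */
	return cur->io_bitmap == NULL;
}

/* Models copy_thread() for an IO worker: it returns before sharing the bitmap. */
static void model_copy_thread(struct task *p, const struct task *parent, bool fixed)
{
	assert(p && parent);
	*p = *parent;                /* the new task starts as a copy of the parent */
	p->io_bitmap = NULL;
	if (fixed) {                 /* the two lines added by the tentative patch */
		p->iopl_emul = 0;
		p->tif_io_bitmap = false;
	}
}

/* Models io_bitmap_exit(tsk) running in the context of cur. */
static bool model_io_bitmap_exit(struct task *tsk, const struct task *cur, bool fixed)
{
	tsk->io_bitmap = NULL;
	if (fixed && tsk != cur)
		return false;            /* patched: skip the update for a never-run task */
	if (tsk->iopl_emul == 3) {   /* io_bitmap is already NULL at this point */
		tsk->tif_io_bitmap = true;
		return false;
	}
	tsk->tif_io_bitmap = false;
	return tss_update_would_fault(cur);
}

/* User thread with ioperm() bitmap -> io_sq_thread -> worker whose creation fails. */
bool simulate_failed_fork(bool fixed)
{
	struct io_bitmap bm = { .sequence = 1 };
	struct task user = { .tif_io_bitmap = true, .iopl_emul = 0, .io_bitmap = &bm };
	struct task sq_thread, child;

	model_copy_thread(&sq_thread, &user, fixed);
	model_copy_thread(&child, &sq_thread, fixed);

	/* copy_process() fails here; exit_thread() only acts when the flag is set. */
	if (!child.tif_io_bitmap)
		return false;
	return model_io_bitmap_exit(&child, &sq_thread, fixed);
}
```

Calling simulate_failed_fork(false) from a small main() walks the unpatched
path and reports the NULL access, while simulate_failed_fork(true) does not
reach it, because the modelled io_sq_thread no longer carries the inherited
flag.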