From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 29 Apr 2026 00:06:23 +0000
Precedence: bulk
X-Mailing-List: criu@lists.linux.dev
Mime-Version: 1.0
X-Mailer: git-send-email 2.54.0.545.g6539524ca2-goog
Message-ID: <20260429000623.3356606-1-avagin@google.com>
Subject: [PATCH] Revert "x86/fpu: Refine and
 simplify the magic number check during signal return"
From: Andrei Vagin
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen
Cc: linux-kernel@vger.kernel.org, criu@lists.linux.dev, x86@kernel.org,
 Andrei Vagin, "Chang S. Bae", stable@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"

This reverts commit dc8aa31a7ac2 ("x86/fpu: Refine and simplify the magic
number check during signal return").

The reverted commit broke applications that construct signal frames in
userspace (such as CRIU and gVisor) if the frame's xstate size is smaller
than the kernel's fpstate->user_size.

Furthermore, this introduces a critical issue for checkpoint/restore tools
like CRIU. If a process is checkpointed while inside a signal handler, its
stack contains a signal frame formatted according to the source host's
xstate capabilities. If that process is later restored on a destination
host with larger xstate capabilities (e.g., a newer CPU with more features
enabled, resulting in a larger fpstate->user_size), the kernel will look
for FP_XSTATE_MAGIC2 at the destination host's larger user_size offset
instead of the offset encoded in the frame's fx_sw->xstate_size. This
causes the magic2 check to fail, forcing sigreturn to silently fall back to
"FX-only" mode. Upon return from the signal handler, the process's extended
state is reset to initial values instead of being restored, leading to
silent data corruption.

The original commit cited commit d877550eaf2d ("x86/fpu: Stop relying on
userspace for info to fault in xsave buffer") as justification to stop
relying on userspace for the magic number check. However, these two changes
are fundamentally different. The last one only changed how much memory the
kernel ensures is paged-in before running XRSTOR to prevent an infinite
loop. It did not change the signal frame format or how the layout is
validated.

Reverting this change restores the use of fx_sw->xstate_size for locating
magic2 and restores the necessary sanity checks, ensuring that the signal
frame remains self-describing and portable.

Cc: Chang S. Bae
Cc: stable@vger.kernel.org
Fixes: dc8aa31a7ac2 ("x86/fpu: Refine and simplify the magic number check during signal return")
Signed-off-by: Andrei Vagin
---
 arch/x86/kernel/fpu/signal.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index c3ec2512f2bb..20b638c507ca 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -27,14 +27,19 @@
 static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
 					    struct _fpx_sw_bytes *fx_sw)
 {
+	int min_xstate_size = sizeof(struct fxregs_state) +
+			      sizeof(struct xstate_header);
 	void __user *fpstate = fxbuf;
 	unsigned int magic2;
 
 	if (__copy_from_user(fx_sw, &fxbuf->sw_reserved[0], sizeof(*fx_sw)))
 		return false;
 
-	/* Check for the first magic field */
-	if (fx_sw->magic1 != FP_XSTATE_MAGIC1)
+	/* Check for the first magic field and other error scenarios. */
+	if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
+	    fx_sw->xstate_size < min_xstate_size ||
+	    fx_sw->xstate_size > x86_task_fpu(current)->fpstate->user_size ||
+	    fx_sw->xstate_size > fx_sw->extended_size)
 		goto setfx;
 
 	/*
@@ -43,7 +48,7 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
 	 * fpstate layout with out copying the extended state information
 	 * in the memory layout.
 	 */
-	if (__get_user(magic2, (__u32 __user *)(fpstate + x86_task_fpu(current)->fpstate->user_size)))
+	if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size)))
 		return false;
 
 	if (likely(magic2 == FP_XSTATE_MAGIC2))
-- 
2.54.0.545.g6539524ca2-goog