Date: Wed, 1 Oct 2025 09:55:36 -0700
Subject: Re: [RFC PATCH v2 02/51] KVM: guest_memfd: Introduce and use shareability to guard faulting
From: Sean Christopherson
To: Vishal Annapurve
Cc: Ackerley Tng, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yan Zhao, Fuad Tabba, Binbin Wu, Michael Roth, Ira Weiny, Rick P Edgecombe, David Hildenbrand, Paolo Bonzini

On Wed, Oct 01, 2025, Vishal Annapurve wrote:
> On Tue, Sep 30, 2025 at 4:40 PM Sean Christopherson wrote:
> >
> > > +};
> > > +
> > > +enum shareability {
> > > +	SHAREABILITY_GUEST = 1,	/* Only the guest can map (fault) folios in this range. */
> > > +	SHAREABILITY_ALL = 2,	/* Both guest and host can fault folios in this range. */
> > > +};
> >
> > Rather than define new values and new KVM uAPI, I think we should instead simply
> > support KVM_SET_MEMORY_ATTRIBUTES.  We'll probably need a new CAP, as I'm not sure
> > supporting KVM_CHECK_EXTENSION+KVM_CAP_MEMORY_ATTRIBUTES on a gmem fd would be a
> > good idea (e.g. trying to do KVM_CAP_GUEST_MEMFD_FLAGS on a gmem fd doesn't work
> > because the whole point is to get flags _before_ creating the gmem instance).  But
> > adding e.g. KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES is easy enough.
> >
> > But for specifying PRIVATE vs. SHARED, I don't see any reason to define new uAPI.
> > I also don't want an entirely new set of terms in KVM to describe the same things.
> > PRIVATE and SHARED are far from perfect, but they're better than https://xkcd.com/927.
> > And if we ever want to let userspace restrict RWX protections in gmem, we'll have
> > a ready-made way to do so.
> >
>
> I don't understand why we need to reuse KVM_SET_MEMORY_ATTRIBUTES. It
> anyways is a new ABI as it's on a guest_memfd FD instead of KVM FD.

Yes, it's new functionality, but the semantics are the same (modulo s/address/offset),
which makes life easier for KVM and its developers.  Specifically, I want to avoid
ending up with two entirely different ways of describing private vs. shared memory.
E.g. I don't want to have to translate between SHAREABILITY_GUEST and PRIVATE, in
code or in conversation.

> RWX protections seem to be pagetable configuration rather than
> guest_memfd properties. Can mmap flags + kvm userfaultfd help enforce
> RWX protections?

No, because mmap() is optional.  Potential use cases are for (selectively)
restricting _guest_ access as well as host access.  mmap() isn't a good fit
regardless, as that's much more about describing what the process wants, not the
properties of the underlying memory.  E.g. read-only and noexec file systems exist
for a reason.
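
To make the "same semantics, modulo s/address/offset" point concrete, here is a
rough userspace toy model, not actual KVM code.  It reuses the existing
KVM_MEMORY_ATTRIBUTE_PRIVATE bit (redefined locally so the sketch builds without
<linux/kvm.h>) and tracks it per file offset.  Everything else is an assumption
made up for illustration: struct gmem_attrs, gmem_set_attributes(),
gmem_host_can_fault(), the 4KiB page size, the 16-page file, and the choice that
zero-initialized state means SHARED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the existing KVM uAPI bit, redefined so this builds standalone. */
#define KVM_MEMORY_ATTRIBUTE_PRIVATE	(1ULL << 3)

#define GMEM_PAGE_SHIFT	12	/* assumption: 4KiB pages */
#define GMEM_NR_PAGES	16	/* assumption: tiny toy file */

/* Hypothetical per-file state: one attribute word per page-sized offset. */
struct gmem_attrs {
	uint64_t attrs[GMEM_NR_PAGES];
};

/*
 * Hypothetical helper: apply @attributes to the file range
 * [offset, offset + size).  Returns 0 on success, -1 on bad input.
 */
int gmem_set_attributes(struct gmem_attrs *g, uint64_t offset,
			uint64_t size, uint64_t attributes)
{
	uint64_t first = offset >> GMEM_PAGE_SHIFT;
	uint64_t nr = size >> GMEM_PAGE_SHIFT;
	uint64_t i;

	assert(g);

	/* Reject unaligned ranges. */
	if ((offset | size) & ((1ULL << GMEM_PAGE_SHIFT) - 1))
		return -1;
	/* Reject empty and out-of-bounds ranges. */
	if (!nr || first >= GMEM_NR_PAGES || nr > GMEM_NR_PAGES - first)
		return -1;
	/* PRIVATE is the only attribute this sketch understands. */
	if (attributes & ~KVM_MEMORY_ATTRIBUTE_PRIVATE)
		return -1;

	for (i = first; i < first + nr; i++)
		g->attrs[i] = attributes;
	return 0;
}

/* The host may fault a page only if it is SHARED, i.e. PRIVATE is clear. */
bool gmem_host_can_fault(const struct gmem_attrs *g, uint64_t offset)
{
	uint64_t idx = offset >> GMEM_PAGE_SHIFT;

	assert(g && idx < GMEM_NR_PAGES);
	return !(g->attrs[idx] & KVM_MEMORY_ATTRIBUTE_PRIVATE);
}
```

In this model, marking offsets 0x1000-0x2fff PRIVATE makes gmem_host_can_fault()
return false for 0x1000 and 0x2000 while 0x3000 stays faultable, and there is no
second enum to translate to or from PRIVATE/SHARED.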