From: Paolo Bonzini
Date: Tue, 31 Oct 2023 00:22:13 +0100
Subject: Re: [PATCH v13 08/35] KVM: Introduce KVM_SET_USER_MEMORY_REGION2
To: Sean Christopherson
Cc: Marc Zyngier, Oliver Upton, Huacai Chen, Michael Ellerman,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexander Viro,
 Christian Brauner, "Matthew Wilcox (Oracle)", Andrew Morton,
 kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li, Xu Yilun,
 Chao Peng, Fuad Tabba, Jarkko Sakkinen, Anish Moorthy, David Matlack,
 Yu Zhang, Isaku Yamahata, Mickaël Salaün, Vlastimil Babka,
 Vishal Annapurve, Ackerley Tng, Maciej Szmigiero, David Hildenbrand,
 Quentin Perret, Michael Roth, Wang, Liam Merwick, Isaku Yamahata,
 "Kirill A . Shutemov"
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-9-seanjc@google.com>
 <211d093f-4023-4a39-a23f-6d8543512675@redhat.com>

On 10/30/23 21:25, Sean Christopherson wrote:
> On Mon, Oct 30, 2023, Paolo Bonzini wrote:
>> On 10/27/23 20:21, Sean Christopherson wrote:
>>>
>>> +		if (ioctl == KVM_SET_USER_MEMORY_REGION)
>>> +			size = sizeof(struct kvm_userspace_memory_region);
>>
>> This also needs a memset(&mem, 0, sizeof(mem)), otherwise the out-of-bounds
>> access of the commit message becomes a kernel stack read.
>
> Ouch.  There's some irony.  Might be worth doing memset(&mem, -1, sizeof(mem))
> though as '0' is a valid file descriptor and a valid file offset.

Either is okay, because unless the flags check is screwed up it should
not matter.  The memset is actually unnecessary, though it may be a good
idea anyway to keep it, aka belt-and-suspenders.

>> Probably worth adding a check on valid flags here.
>
> Definitely needed.  There's a very real bug here.  But rather than duplicate flags
> checking or plumb @ioctl all the way to __kvm_set_memory_region(), now that we
> have the fancy guard(mutex) and there are no internal calls to kvm_set_memory_region(),
> what if we:
>
>   1. Acquire/release slots_lock in __kvm_set_memory_region()
>   2. Call kvm_set_memory_region() from x86 code for the internal memslots
>   3. Disallow *any* flags for internal memslots
>   4. Open code check_memory_region_flags in kvm_vm_ioctl_set_memory_region()

I dislike this step: there is a clear point where all paths meet
(ioctl/internal, locked/unlocked) and that's __kvm_set_memory_region().
I think that's the place where flags should be checked.  (I don't mind
the restriction on internal memslots; it's just that to me it's not a
particularly natural way to structure the checks).

On the other hand, the place to protect against out-of-bounds accesses
is the place where you stop caring about struct
kvm_userspace_memory_region vs. kvm_userspace_memory_region2 (and your
code gets it right, by dropping "ioctl" as soon as possible).

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 87f45aa91ced..fe5a2af14fff 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1635,6 +1635,14 @@ bool __weak kvm_arch_dirty_log_supported(struct kvm *kvm)
 	return true;
 }
 
+/*
+ * Flags that do not access any of the extra space of struct
+ * kvm_userspace_memory_region2.  KVM_SET_USER_MEMORY_REGION_FLAGS
+ * only allows these.
+ */
+#define KVM_SET_USER_MEMORY_REGION_FLAGS \
+	(KVM_MEM_LOG_DIRTY_PAGES | KVM_MEM_READONLY)
+
 static int check_memory_region_flags(struct kvm *kvm,
 				     const struct kvm_userspace_memory_region2 *mem)
 {
@@ -5149,10 +5149,16 @@ static long kvm_vm_ioctl(struct file *filp,
 		struct kvm_userspace_memory_region2 mem;
 		unsigned long size;
 
-		if (ioctl == KVM_SET_USER_MEMORY_REGION)
+		if (ioctl == KVM_SET_USER_MEMORY_REGION) {
+			/*
+			 * Fields beyond struct kvm_userspace_memory_region shouldn't be
+			 * accessed, but avoid leaking kernel memory in case of a bug.
+			 */
+			memset(&mem, 0, sizeof(mem));
 			size = sizeof(struct kvm_userspace_memory_region);
-		else
+		} else {
 			size = sizeof(struct kvm_userspace_memory_region2);
+		}
 
 		/* Ensure the common parts of the two structs are identical. */
 		SANITY_CHECK_MEM_REGION_FIELD(slot);
@@ -5165,6 +5167,11 @@ static long kvm_vm_ioctl(struct file *filp,
 		if (copy_from_user(&mem, argp, size))
 			goto out;
 
+		r = -EINVAL;
+		if (ioctl == KVM_SET_USER_MEMORY_REGION &&
+		    (mem.flags & ~KVM_SET_USER_MEMORY_REGION_FLAGS))
+			goto out;
+
 		r = kvm_vm_ioctl_set_memory_region(kvm, &mem);
 		break;
 	}

That's the kind of patch that you can't really get wrong (though I have
the brown paper bag ready).

Maintenance-wise it's fine, since flags are being added at a pace of
roughly one every five years, and anyway it's also future proof: I placed
the #define near check_memory_region_flags so that in five years we
remember to keep it up to date.  But worst case, the new flags will only
be allowed by KVM_SET_USER_MEMORY_REGION2 unnecessarily; there are no
security issues waiting to bite us.

In sum, this is exactly the kind of fix, and the only kind, that should
be in the v13->v14 delta.

Paolo
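
For readers who want to poke at the idea outside the kernel, here is a
minimal userspace sketch of the same pattern: zero the larger struct,
copy only as many bytes as the command implies, then reject any flag
outside the v1 mask.  Everything in it is an assumption made for
illustration: the struct layouts, the command and flag names, and
fetch_region() are hypothetical stand-ins rather than KVM's definitions,
and memcpy() stands in for copy_from_user().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two uAPI structs; not the real KVM layouts. */
struct region_v1 {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

struct region_v2 {
	uint32_t slot;
	uint32_t flags;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
	uint64_t extra_offset;	/* v2-only field */
	uint32_t extra_fd;	/* v2-only field */
	uint32_t pad;
};

/* Hypothetical command numbers and flag bits. */
enum { CMD_SET_REGION_V1, CMD_SET_REGION_V2 };

#define MEM_LOG_DIRTY_PAGES	(1u << 0)
#define MEM_READONLY		(1u << 1)
#define MEM_USES_EXTRA		(1u << 2)	/* would read the v2-only fields */

/* Flags that never touch the v2-only fields; the v1 command allows only these. */
#define V1_FLAGS	(MEM_LOG_DIRTY_PAGES | MEM_READONLY)

/* The common prefix of the two structs must line up. */
_Static_assert(offsetof(struct region_v1, flags) ==
	       offsetof(struct region_v2, flags), "flags offset mismatch");
_Static_assert(sizeof(struct region_v1) <= sizeof(struct region_v2),
	       "v1 must fit inside v2");

/* Copy the caller's struct into *mem and validate it for the given command. */
static int fetch_region(int cmd, const void *arg, struct region_v2 *mem)
{
	size_t size;

	if (cmd == CMD_SET_REGION_V1) {
		/* Belt and suspenders: never expose stale bytes in the tail. */
		memset(mem, 0, sizeof(*mem));
		size = sizeof(struct region_v1);
	} else {
		size = sizeof(struct region_v2);
	}

	assert(size <= sizeof(*mem));
	memcpy(mem, arg, size);		/* stands in for copy_from_user() */

	/* The v1 command must not be able to request v2-only behaviour. */
	if (cmd == CMD_SET_REGION_V1 && (mem->flags & ~V1_FLAGS))
		return -EINVAL;

	return 0;
}
```

Note that the flags check alone is what closes the hole; the memset only
limits the damage if some later code reads the v2-only fields by mistake.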