Message-ID: <075209ac-c659-485e-a220-83d4afed8a94@redhat.com>
Date: Thu, 3 Apr 2025 22:44:33 +0200
Subject: Re: [PATCH v6 4/5] mm/migrate: skip migrating folios under writeback with AS_WRITEBACK_INDETERMINATE mappings
From: David Hildenbrand
Organization: Red Hat
To: Joanne Koong
Cc: Jingbo Xu, miklos@szeredi.hu, linux-fsdevel@vger.kernel.org,
 shakeel.butt@linux.dev, josef@toxicpanda.com, bernd.schubert@fastmail.fm,
 linux-mm@kvack.org, kernel-team@meta.com, Matthew Wilcox, Zi Yan,
 Oscar Salvador, Michal Hocko
References: <20241122232359.429647-1-joannelkoong@gmail.com>
 <20241122232359.429647-5-joannelkoong@gmail.com>
 <1036199a-3145-464b-8bbb-13726be86f46@linux.alibaba.com>
 <1577c4be-c6ee-4bc6-bb9c-d0a6d553b156@redhat.com>

On 03.04.25 21:09, Joanne Koong wrote:
> On Thu, Apr 3, 2025 at 2:18 AM David
> Hildenbrand wrote:
>>
>> On 03.04.25 05:31, Jingbo Xu wrote:
>>>
>>>
>>> On 4/3/25 5:34 AM, Joanne Koong wrote:
>>>> On Thu, Dec 19, 2024 at 5:05 AM David Hildenbrand wrote:
>>>>>
>>>>> On 23.11.24 00:23, Joanne Koong wrote:
>>>>>> For migrations called in MIGRATE_SYNC mode, skip migrating the folio if
>>>>>> it is under writeback and has the AS_WRITEBACK_INDETERMINATE flag set on its
>>>>>> mapping. If the AS_WRITEBACK_INDETERMINATE flag is set on the mapping, the
>>>>>> writeback may take an indeterminate amount of time to complete, and
>>>>>> waits may get stuck.
>>>>>>
>>>>>> Signed-off-by: Joanne Koong
>>>>>> Reviewed-by: Shakeel Butt
>>>>>> ---
>>>>>>  mm/migrate.c | 5 ++++-
>>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>>> index df91248755e4..fe73284e5246 100644
>>>>>> --- a/mm/migrate.c
>>>>>> +++ b/mm/migrate.c
>>>>>> @@ -1260,7 +1260,10 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>>>>>>           */
>>>>>>          switch (mode) {
>>>>>>          case MIGRATE_SYNC:
>>>>>> -                break;
>>>>>> +                if (!src->mapping ||
>>>>>> +                    !mapping_writeback_indeterminate(src->mapping))
>>>>>> +                        break;
>>>>>> +                fallthrough;
>>>>>>          default:
>>>>>>                  rc = -EBUSY;
>>>>>>                  goto out;
>>>>>
>>>>> Ehm, doesn't this mean that any fuse user can essentially completely
>>>>> block CMA allocations, memory compaction, memory hotunplug, memory
>>>>> poisoning... ?!
>>>>>
>>>>> That sounds very bad.
>>>>
>>>> I took a closer look at the migration code and the FUSE code. In the
>>>> migration code in migrate_folio_unmap(), I see that any MIGRATE_SYNC
>>>> mode folio lock holds will block migration until that folio is
>>>> unlocked. This is the snippet in migrate_folio_unmap() I'm looking at:
>>>>
>>>>         if (!folio_trylock(src)) {
>>>>                 if (mode == MIGRATE_ASYNC)
>>>>                         goto out;
>>>>
>>>>                 if (current->flags & PF_MEMALLOC)
>>>>                         goto out;
>>>>
>>>>                 if (mode == MIGRATE_SYNC_LIGHT && !folio_test_uptodate(src))
>>>>                         goto out;
>>>>
>>>>                 folio_lock(src);
>>>>         }
>>>>
>>
>> Right, I raised that also in my LSF/MM talk: waiting for readahead
>> currently implies waiting for the folio lock (there is no separate
>> readahead flag like there would be for writeback).
>>
>> The more I look into this and fuse, the more I realize that what fuse
>> does is just completely broken right now.
>>
>>>> If this is all that is needed for a malicious FUSE server to block
>>>> migration, then it makes no difference if AS_WRITEBACK_INDETERMINATE
>>>> mappings are skipped in migration. A malicious server has easier and
>>>> more powerful ways of blocking migration in FUSE than trying to do it
>>>> through writeback. For a malicious fuse server, we in fact wouldn't
>>>> even get far enough to hit writeback - a write triggers
>>>> aops->write_begin() and a malicious server would deliberately hang
>>>> forever while the folio is locked in write_begin().
>>>
>>> Indeed it seems possible. A malicious FUSE server may already be
>>> capable of blocking the synchronous migration in this way.
>>
>> Yes, I think the conclusion is that we should advise people against
>> using unprivileged FUSE if they care about any features that rely on
>> page migration or page reclaim.
>>
>>>
>>>
>>>>
>>>> I looked into whether we could eradicate all the places in FUSE where
>>>> we may hold the folio lock for an indeterminate amount of time,
>>>> because if that is possible, then we should not add this writeback way
>>>> for a malicious fuse server to affect migration. But I don't think we
>>>> can, for example taking one case, the folio lock needs to be held as
>>>> we read in the folio from the server when servicing page faults, else
>>>> the page cache would contain stale data if there was a concurrent
>>>> write that happened just before, which would lead to data corruption
>>>> in the filesystem. Imo, we need a more encompassing solution for all
>>>> these cases if we're serious about preventing FUSE from blocking
>>>> migration, which probably looks like a globally enforced default
>>>> timeout of some sort or an mm solution for mitigating the blast radius
>>>> of how much memory can be blocked from migration, but that is outside
>>>> the scope of this patchset and is its own standalone topic.
>>
>> I'm still skeptical about timeouts: we can only get it wrong.
>>
>> I think a proper solution is making these pages movable, which does seem
>> feasible if (a) splice is not involved and (b) we can find a way to not
>> hold the folio lock forever e.g., in the readahead case.
>>
>> Maybe readahead would have to be handled more similar to writeback
>> (e.g., having a separate flag, or using a combination of e.g.,
>> writeback+uptodate flag, not sure)
>>
>> In both cases (readahead+writeback), we'd want to call into the FS to
>> migrate a folio that is under readahead/writeback. In case of fuse
>> without splice, a migration might be doable, and as discussed, splice
>> might just be avoided.
>>
>>>>
>>>> I don't see how this patch has any additional negative impact on
>>>> memory migration for the case of malicious servers that the server
>>>> can't already (and more easily) do. In fact, this patchset if anything
>>>> helps memory given that malicious servers now can't also trigger page
>>>> allocations for temp pages that would never get freed.
>>>>
>>>
>>> If that's true, maybe we could drop this patch out of this patchset? So
>>> that both before and after this patchset, synchronous migration could be
>>> blocked by a malicious FUSE server, while the usability of contiguous
>>> memory (CMA) won't be affected.
>>
>> I had exactly the same thought: if we can block forever on the folio
>> lock, there is no need for AS_WRITEBACK_INDETERMINATE. It's already all
>> completely broken.
>
> I will resubmit this patchset and drop this patch.
>
> I think we still need AS_WRITEBACK_INDETERMINATE for sync and legacy
> cgroupv1 reclaim scenarios:
> a) sync: sync waits on writeback so if we don't skip waiting on
> writeback for AS_WRITEBACK_INDETERMINATE mappings, then malicious fuse
> servers could make syncs hang. (There's no actual effect on sync
> behavior though with temp pages because even without temp pages, we
> return even though the data hasn't actually been synced to disk by the
> server yet)

Just curious: Are we sure there are no other cases where a malicious
userspace could make some other folio_lock() hang forever either way?

IOW, just like for migration, isn't this just solving one part of the
whole problem we are facing?

>
> b) cgroupv1 reclaim: a correctly written fuse server can fall into
> this deadlock in one very specific scenario (eg if it's using legacy
> cgroupv1 and reclaim encounters a folio that already has the reclaim
> flag set and the caller didn't have __GFP_FS (or __GFP_IO if swap)
> set), where the deadlock is triggered by:
> * single-threaded FUSE server is in the middle of handling a request
> that needs a memory allocation
> * memory allocation triggers direct reclaim
> * direct reclaim waits on a folio under writeback
> * the FUSE server can't write back the folio since it's stuck in direct reclaim

Yes, that sounds reasonable.

--
Cheers,

David / dhildenb