Date: Wed, 08 May 2024 00:08:56 +0530
Message-Id: <87edado4an.fsf@gmail.com>
From: Ritesh Harjani (IBM)
To: Christoph Hellwig, "Pankaj Raghav (Samsung)"
Cc: hch@lst.de, willy@infradead.org, mcgrof@kernel.org,
    akpm@linux-foundation.org, brauner@kernel.org, chandan.babu@oracle.com,
    david@fromorbit.com, djwong@kernel.org, gost.dev@samsung.com,
    hare@suse.de, john.g.garry@oracle.com, linux-block@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-xfs@vger.kernel.org, p.raghav@samsung.com, ziy@nvidia.com
Subject: Re: [RFC] iomap: use huge zero folio in iomap_dio_zero

Christoph Hellwig writes:

> On Tue, May 07, 2024 at 04:58:12PM +0200, Pankaj Raghav (Samsung) wrote:
>> +	if (len > PAGE_SIZE) {
>> +		folio = mm_get_huge_zero_folio(current->mm);
>
> I don't think the mm_struct based interfaces work well here, as I/O
> completions don't come in through the same mm.  You'll want to use

But right now iomap_dio_zero() is only called from the submission
context, i.e. iomap_dio_bio_iter(), right? Could you please explain why
the completion context needs to have the same mm_struct here?

> lower level interfaces like get_huge_zero_page and use them at
> mount time.
>

Even so, should we not check whether allocating a hugepage is of any
value, depending on how large the length (or the blocksize, in the
mount-time case) really is?
I.e. if the len for zeroing is just 2 times PAGE_SIZE, then it doesn't
really make sense to allocate a 2MB hugepage, or sometimes a 16MB
hugepage on some archs (like Power with hash mmu).

Maybe something like: if len > 16 * pagesize?

>> +		if (!folio)
>> +			folio = zero_page_folio;
>
> And then don't bother with a fallback.

The hugepage allocation can still fail at mount time (if we mount late,
when system memory is already fragmented). So we might still need a
fallback to ZERO_PAGE(0), right?

-ritesh
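
The two points raised in the reply (only use the huge zero folio above some length threshold, and keep a single-page fallback in case the huge allocation failed) can be illustrated with a small userspace sketch. This is not kernel code: the names `SKETCH_PAGE_SIZE`, `SKETCH_HUGE_THRESHOLD`, `enum zero_source` and `pick_zero_source()` are hypothetical, the 4096-byte page size is an assumption, and the factor of 16 is simply the value floated in the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not kernel definitions. */
#define SKETCH_PAGE_SIZE      4096UL                    /* assumed base page size */
#define SKETCH_HUGE_THRESHOLD (16 * SKETCH_PAGE_SIZE)   /* "len > 16 * pagesize" */

enum zero_source {
	ZERO_SRC_SINGLE_PAGE,   /* stands in for ZERO_PAGE(0) */
	ZERO_SRC_HUGE_FOLIO,    /* stands in for a huge zero folio */
};

/*
 * Choose the zero source for a zeroing request of len bytes.
 * huge_available models whether the huge zero folio was successfully
 * obtained earlier (for example at mount time); if it was not, or if
 * the request is small, fall back to the single zero page.
 */
enum zero_source pick_zero_source(size_t len, bool huge_available)
{
	if (huge_available && len > SKETCH_HUGE_THRESHOLD)
		return ZERO_SRC_HUGE_FOLIO;
	return ZERO_SRC_SINGLE_PAGE;
}

/* Usage example: small requests and failed allocations both fall back. */
void pick_zero_source_example(void)
{
	assert(pick_zero_source(2 * SKETCH_PAGE_SIZE, true) == ZERO_SRC_SINGLE_PAGE);
	assert(pick_zero_source(64 * SKETCH_PAGE_SIZE, true) == ZERO_SRC_HUGE_FOLIO);
	assert(pick_zero_source(64 * SKETCH_PAGE_SIZE, false) == ZERO_SRC_SINGLE_PAGE);
}
```

Keeping the availability check and the length check in one place means the fallback path is exercised both for short lengths and for a failed huge allocation, which is the case the last paragraph of the reply is concerned about.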