From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62CB5C433F5 for ; Sat, 21 May 2022 17:52:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344369AbiEURwD (ORCPT ); Sat, 21 May 2022 13:52:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229897AbiEURwA (ORCPT ); Sat, 21 May 2022 13:52:00 -0400 Received: from zeniv-ca.linux.org.uk (zeniv-ca.linux.org.uk [IPv6:2607:5300:60:148a::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C682286ED for ; Sat, 21 May 2022 10:51:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=JFnShOrbXPheK2o1UXw4WQjmlOVuTNamJGsABFX0j4A=; b=gPXYckZgKo33zVw1AIKOTeVdaC Op3DaxtoBc+lGvYKBeyh71qX+sdpNyZoda3usBdSUd2CCpbRMZa5djaCwKB6RicdqEtrcuJj7St0m RvwqeKy8K4D+oNazcqIel573pWLHIQZmJDV+p4s3Qr7kl938eTna7YxwhAgOoOLmEnFAngWvBV0JO MKD2lAtWZjGNBWEUC73c26QGdgaot304nvTJwFUccflMtTjJrHwaZCnhEAeJJfFEesdkqUSObVUob TQgRElFOLb1MpmQSMbBqSCaw4xn6sf6FvkMDt+adIJdxQBYeTfzHkFnc1hKn2CXXkzQgpcLQ0UQhW MWQ34/tg==; Received: from viro by zeniv-ca.linux.org.uk with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1nsTGd-00Gukc-IW; Sat, 21 May 2022 17:51:55 +0000 Date: Sat, 21 May 2022 17:51:55 +0000 From: Al Viro To: Jens Axboe Cc: "Jason A. Donenfeld" , gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] char/mem: only use {read,write}_iter, not the old {read,write} functions Message-ID: References: <20220520135030.166831-1-Jason@zx2c4.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: Al Viro Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 20, 2022 at 03:44:04PM +0000, Al Viro wrote: > On Fri, May 20, 2022 at 09:32:34AM -0600, Jens Axboe wrote: > > > Didn't look closer, but I'm assuming this is _mostly_ tied to needing to > > init 48 bytes of kiocb for each one. There might be ways to embed a > > sync_kiocb inside the kiocb for the bits we need there, at least that > > could get us down to 32 bytes. > > My bet would be on iocb_flags() (and kiocb_set_rw_flags()) tests and > pointer-chasing, actually. I'd been sick on and off since early November, > trying to dig myself from under the piles right now. Christoph's > patches in that area are somewhere in the pile ;-/ FWIW, I can't find them, assuming I'm not misremembering in the first place. iocb_flags() is an atrocity, indeed. Look what happens in new_sync_write(): [edx holds file->f_flags] movl %edx, %eax shrl $6, %eax andl $16, %eax // eax = (edx >> 6) & 16; maps O_APPEND to IOCB_APPEND movl %eax, %ecx orl $131072, %ecx testb $64, %dh cmovne %ecx, %eax // eax = edh & 0x4000 ? eax | 0x20000; testb $16, %dh // if (edx & O_DSYNC) jne .L175 // goto L175; branch not taken movq 216(%rdi), %rcx // rcx = file->f_mapping; // fetch movq (%rcx), %rcx // rcx = rcx->host; // fetch movq 40(%rcx), %rsi // rsi = rcx->i_sb; // fetch testb $16, 80(%rsi) // if (!(rsi->s_flags & 0x10)) // fetch je .L198 // goto L198; // branch likely taken L175: orl $2, %eax // eax |= IOCB_DSYNC; L176: movl %eax, %ecx orl $4, %ecx andl $1048576, %edx cmovne %ecx, %eax // eax = edx & __O_SYNC ? eax | IOCB_SYNC : eax; movq %gs:current_task, %rdx # rdx = current; movq 2192(%rdx), %rcx # rcx = rdx->io_contenxt; movl $16388, %edx # edx = IOPRIO_DEFAULT; testq %rcx, %rcx # if (!rcx) je .L178 # goto L178; movzwl 12(%rcx), %edx # edx = rcx->ioprio; L178: ... fill iov_iter call ->write_iter() and bugger off L198: testb $1, 12(%rcx) // if (rcx->i_flags & S_SYNC) je .L176 jmp .L175 IOCB_DSYNC part is bloody awful. To add insult to injury, we end up doing it in new_sync_read() as well ;-/ Its validity is dubious, BTW - we only get away with that since O_DSYNC is ignored for all in-tree character devices and since for block ones ->f_mapping->host->i_sb ends up pointing to blockdev_superblock, which won't be mounted sync. I've sent a patch into old thread ([RFC] what to do with IOCB_DSYNC?); it's completely untested, though.