From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76FE1C04AA5 for ; Thu, 25 Aug 2022 09:40:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231283AbiHYJkO (ORCPT ); Thu, 25 Aug 2022 05:40:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43696 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236497AbiHYJkN (ORCPT ); Thu, 25 Aug 2022 05:40:13 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C4FD43E70 for ; Thu, 25 Aug 2022 02:40:12 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id B00A91FD38; Thu, 25 Aug 2022 09:40:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1661420410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vbDYlX28bRFpScz3ZBpYLsEIe6WNmtCQj7t+URs8spo=; b=qY2WdgnStkJAR+LPrCKkgqA+vZ6sVU+fE/H8D3EaiS6H/lLYLX/Mb36kcq01yK7zLVIW57 KSIaPyGPAs7D+szkpwQEC+rgBs/br795u/AwfqGACfNz8ptTnY6QC+Cf3GEus81VOPUHmh j/+cU13Gcxg9stUp8be3qxAUtOJpQQ4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1661420410; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vbDYlX28bRFpScz3ZBpYLsEIe6WNmtCQj7t+URs8spo=; b=HZUDRus3SCFNWoFVgeoiqRyreY0pdSFhng5vbYoqH7OzJwjzpA6PfL4VvsJlJUUKAUm/mv ofz8Es9FkZBACDBA== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 5A4E113A8E; Thu, 25 Aug 2022 09:40:10 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id JRw6E3pDB2MDTAAAMHmgww (envelope-from ); Thu, 25 Aug 2022 09:40:10 +0000 Received: from localhost (brahms.olymp [local]) by brahms.olymp (OpenSMTPD) with ESMTPA id b5078ca5; Thu, 25 Aug 2022 09:41:02 +0000 (UTC) Date: Thu, 25 Aug 2022 10:41:02 +0100 From: =?iso-8859-1?Q?Lu=EDs?= Henriques To: Ilya Dryomov Cc: Jeff Layton , xiubli@redhat.com, ceph-devel@vger.kernel.org Subject: Re: [PATCH] ceph: fix error handling in ceph_sync_write Message-ID: References: <20220824205331.473248-1-jlayton@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org On Thu, Aug 25, 2022 at 10:32:56AM +0200, Ilya Dryomov wrote: > On Wed, Aug 24, 2022 at 10:53 PM Jeff Layton wrote: > > > > ceph_sync_write has assumed that a zero result in req->r_result means > > success. Testing with a recent cluster however shows the OSD returning > > a non-zero length written here. I'm not sure whether and when this > > changed, but fix the code to accept either result. > > > > Assume a negative result means error, and anything else is a success. If > > we're given a short length, then return a short write. > > > > Signed-off-by: Jeff Layton > > --- > > fs/ceph/file.c | 10 +++++++++- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c > > index 86265713a743..c0b2c8968be9 100644 > > --- a/fs/ceph/file.c > > +++ b/fs/ceph/file.c > > @@ -1632,11 +1632,19 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, > > req->r_end_latency, len, ret); > > out: > > ceph_osdc_put_request(req); > > - if (ret != 0) { > > + if (ret < 0) { > > ceph_set_error_write(ci); > > break; > > } > > > > + /* > > + * FIXME: it's unclear whether all OSD versions return the > > + * length written on a write. For now, assume that a 0 return > > + * means that everything got written. > > + */ > > + if (ret && ret < len) > > + len = ret; > > + > > ceph_clear_error_write(ci); > > pos += len; > > written += len; > > -- > > 2.37.2 > > > > Hi Jeff, > > AFAIK OSDs aren't allowed to return any kind of length on a write > and there is no such thing as a short write. This definitely needs > deeper investigation. > > What is the cluster version you are testing against? OK, I'm only seeing 'ret' being set to the write length only when enabling encryption (i.e. with test_dummy_encryption mount option). So, maybe the right fix is something like: diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 16dcade66923..5119d87d61fb 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1889,6 +1889,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, ceph_release_page_vector(pages, num_pages); break; } + ret = 0; } req = ceph_osdc_new_request(osdc, &ci->i_layout, Cheers, -- Luís