From mboxrd@z Thu Jan  1 00:00:00 1970
From: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
To: John Garry <john.g.garry@oracle.com>, song@kernel.org, yukuai@fygo.io,
 magiclinan@didiglobal.com, xiao@kernel.org, axboe@kernel.dk,
 martin.petersen@oracle.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/7] md/raid1: handle atomic writes that require splitting
In-Reply-To: <007416b5-d099-406d-b1d1-ff0db2edc48a@oracle.com>
References: <20260623072456.333437-1-abd.masalkhi@gmail.com>
 <20260623072456.333437-3-abd.masalkhi@gmail.com>
 <ba67f3ef-45cb-41c0-b4ea-fa0a22508cdc@oracle.com>
 <m2se6d1vls.fsf@gmail.com>
 <6130d0cb-4cf8-4042-843e-98a9d8aa00c5@oracle.com>
 <m2mrwl1sha.fsf@gmail.com>
 <007416b5-d099-406d-b1d1-ff0db2edc48a@oracle.com>
Date: Tue, 23 Jun 2026 18:11:33 +0200
Message-ID: <m2jyrp1bka.fsf@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain

On Tue, Jun 23, 2026 at 12:38 +0100, John Garry wrote:
> On 23/06/2026 11:06, Abd-Alrhman Masalkhi wrote:
>> On Tue, Jun 23, 2026 at 10:20 +0100, John Garry wrote:
>>> On 23/06/2026 09:58, Abd-Alrhman Masalkhi wrote:
>>>> On Tue, Jun 23, 2026 at 09:11 +0100, John Garry wrote:
>>>>> On 23/06/2026 08:24, Abd-Alrhman Masalkhi wrote:
>>>>>> If a request already requires splitting when entering
>>>>>> raid1_write_request(), the current code allows it to proceed until it
>>>>>> eventually reaches the split path.
>>>>>
>>>>> The block layer should catch invalid atomic writes in
>>>>> submit_bio_noacct() -> blk_validate_atomic_write_op_size() before we
>>>>> even get as far as the md atomic write handling. Having the check in
>>>>> bio_submit_split_bioset() is really just a fail-safe for the block layer
>>>>> not catching invalid atomic writes or the atomic writes queue limits not
>>>>> being properly calculated.
>>>> The request size itself satisfies the currently advertised atomic write
>>>> limits, so blk_validate_atomic_write_op_size() allows it. The problem
>>>> is that RAID1 may further restrict atomic writes to a single barrier
>>>> unit via align_to_barrier_unit_end(). Therefore a request that crosses
>>>> a barrier-unit boundary can still reach raid1_write_request() with
>>>> max_sectors < bio_sectors(bio).
>>>>
>>>> If the barrier-unit restriction should instead be advertised through the
>>>> atomic write queue limits,
>>>
>>> It should. Any restrictions should be advertised up front. For the user
>>> to issue an atomic write which is valid according to limits, then it
>>> should succeed.
>>>
>> 
>> I'll take a look at how best to expose that through the queue limits and
>> rework this part accordingly. If there is already an existing mechanism
>> you had in mind, I'd appreciate any pointers.
>
> Any write must fit within BARRIER_UNIT_SECTOR_SIZE, right?
>
> Since an atomic write must be naturally aligned, then I would expect 
> that the atomic write max unit is limited by BARRIER_UNIT_SECTOR_SIZE.
>

Yes, that makes sense. I was thinking in terms of a boundary
restriction, but since atomic writes are naturally aligned, it should be
sufficient to cap the advertised atomic write max unit at
BARRIER_UNIT_SECTOR_SIZE. I'll rework this accordingly and drop the
entry check.

Thanks for pointing out the natural alignment aspect.
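
To make the natural-alignment argument concrete, here is a rough
userspace sketch (not the actual patch). It assumes the barrier unit is
2^17 sectors as in drivers/md/raid1.h and works in sectors for
simplicity; the helper names are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match drivers/md/raid1.h: 2^17 sectors per barrier unit. */
#define BARRIER_UNIT_SECTOR_BITS 17
#define BARRIER_UNIT_SECTOR_SIZE (UINT64_C(1) << BARRIER_UNIT_SECTOR_BITS)

/* Hypothetical helper: largest power of two <= v (v must be non-zero). */
static uint64_t rounddown_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0);
	while (p <= v / 2)
		p *= 2;
	return p;
}

/* Hypothetical helper: cap a stacked atomic write max unit (in sectors). */
static uint64_t cap_atomic_max_sectors(uint64_t stacked_max)
{
	uint64_t max = stacked_max < BARRIER_UNIT_SECTOR_SIZE ?
		       stacked_max : BARRIER_UNIT_SECTOR_SIZE;

	return rounddown_pow2(max);
}

/* True if [start, start + len) spans more than one barrier unit. */
static bool crosses_barrier_unit(uint64_t start, uint64_t len)
{
	return (start >> BARRIER_UNIT_SECTOR_BITS) !=
	       ((start + len - 1) >> BARRIER_UNIT_SECTOR_BITS);
}

/* Count naturally aligned writes of size len in [0, span) that cross. */
static unsigned int count_crossing_aligned_writes(uint64_t len, uint64_t span)
{
	unsigned int n = 0;

	for (uint64_t start = 0; start + len <= span; start += len)
		n += crosses_barrier_unit(start, len);
	return n;
}
```

With the cap in place, count_crossing_aligned_writes() finds no
crossings for any naturally aligned, power-of-two size up to the capped
maximum, whereas a size of two barrier units crosses on every write.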

>> 
>>>> then I agree the block layer could reject
>>>> such requests earlier and the RAID1 entry check would become
>>>> unnecessary.
>>>>
>>>> However, there are also cases where max_sectors is reduced later within
>>>> raid1_write_request(), for example when bad blocks are present on some
>>>> mirrors (or due to other RAID1-specific constraints such as write-behind
>>>> limits). Those reductions depend on RAID1 runtime state and mirror
>>>> health, so they are not readily visible to the block layer during atomic
>>>> write validation. In those cases RAID1 still needs to detect that the
>>>> atomic write can no longer be serviced as requested and fail it
>>>> appropriately.
>>>
>>> Sure, and we do this. As I remember, we should return -EIO in this case.
>>>
>> 
>> Right, and that's the main motivation for this patch. The original
>> atomic write support already returned -EIO for one bad-block path, but
>> there are other cases where max_sectors can be reduced (e.g. the
>> first_bad <= sector path and write-behind limits).
>> 
>> After a4c55c902670, those cases can end up completing with EINVAL or
>> NOTSUPP instead. This patch is intended to restore consistent -EIO.
>> 
>
> ok, but I could not check this as I did not recognize the baseline code.
>
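
For reference, the behaviour I am aiming for can be modelled in
userspace roughly as below. This is only a sketch of the intended
outcome, not the kernel code: struct model_bio and
model_write_decision() are hypothetical names, and max_sectors stands
for whatever value remains after the runtime reductions (bad blocks,
write-behind limits).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of a bio this sketch needs. */
struct model_bio {
	uint64_t sectors;	/* size of the request, in sectors */
	bool atomic;		/* models REQ_ATOMIC being set */
};

/*
 * Hypothetical model of the decision taken once max_sectors is known.
 * Returns 0 to submit the bio as-is, a positive number of sectors to
 * split off for a regular write, or -EIO when an atomic write would
 * have to be split.
 */
static int64_t model_write_decision(const struct model_bio *bio,
				    uint64_t max_sectors)
{
	assert(bio);

	if (max_sectors >= bio->sectors)
		return 0;
	if (bio->atomic)
		return -EIO;
	return (int64_t)max_sectors;
}
```

The point is that every reduction of max_sectors funnels into the same
check, so an atomic write that can no longer be serviced whole always
fails with -EIO regardless of which constraint reduced it.
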
>>>>
>>>>>
>>>>>> Along the way, the bio may instead
>>>>>> fail due to other conditions and return a different status, even though
>>>>>> the request was invalid as an atomic write from the beginning.
>>>>>>
>>>
>> 
>

-- 
Best Regards,
Abd-Alrhman