Linux Btrfs filesystem development
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: new scrub code vs zoned file systems
Date: Thu, 1 Jun 2023 13:45:49 +0800
Message-ID: <7939e360-27fb-119f-8339-36a86c2b3f94@gmx.com>
In-Reply-To: <gn6vj3mlwsm53iu4ktso2dts4ifyxaky54ivb22laq3mqy27lv@guvvxohmkxy6>



On 2023/6/1 13:17, Naohiro Aota wrote:
> On Thu, Jun 01, 2023 at 01:00:40PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2023/6/1 12:40, Christoph Hellwig wrote:
>>> On Thu, Jun 01, 2023 at 10:09:24AM +0800, Qu Wenruo wrote:
>>>> So far the various wrappers around the write operations work as expected,
>>>> and hide the details well enough that most of us didn't even notice.
>>>>
>>>> E.g. all the zoned code is already handled in scrub_write_sectors().
>>>>
>>>> The crash itself is caused by the fact that the end I/O part relies on
>>>> the inode pointer; that itself is a simple fix.
>>>
>>> But the reason why it is relying on the inode pointer is that it needs
>>> to record the actual written LBA after I/O completion.  So it's not
>>> just a case of adding a NULL check, it needs a way to adjust the
>>> logical to physical mapping from the dummy added before the I/O.
>>
>> That's all handled by scrub.
>>
>> For scrub we're doing the writes just like metadata, with QD=1, aka,
>> always write and wait (and know where the write would land), and for the
>> gaps we would call fill_writer_pointer_gap() to fill them.
>>
>> Thus we don't need to do any adjustment (unless you're talking about
>> RST, but I believe that's a different beast).
>
> True. For the dev_replace, we need to place the moved data at the same
> address on the destination device as the source device. Thus, we need to
> use WRITE command to ensure that.

Oh, that looks like the cause.

In btrfs_submit_repair_write() we change bi_opf from WRITE to ZONE_APPEND,
which later triggers btrfs_record_physical_zoned().

So this means we should not change the WRITE into ZONE_APPEND in
btrfs_submit_repair_write() for the dev-replace case at all.

I stupidly thought a zoned device could not accept the WRITE command at
all, only ZONE_APPEND.
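
To make the idea concrete, here is a rough userspace sketch of the decision
I have in mind. It is not the kernel code: every model_* name is a
hypothetical stand-in (MODEL_OP_WRITE / MODEL_OP_ZONE_APPEND only mimic the
block layer ops), and keeping ZONE_APPEND for zoned repair writes outside
dev-replace is an assumption based on the current behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the WRITE and ZONE_APPEND block operations. */
enum model_op { MODEL_OP_WRITE, MODEL_OP_ZONE_APPEND };

/* Hypothetical description of one repair write; not a kernel structure. */
struct model_repair_write {
	bool zoned;       /* the target device is a zoned device */
	bool dev_replace; /* the write is issued on behalf of dev-replace */
};

/*
 * Pick the operation for a repair write.
 *
 * dev-replace must place the data at the same address as on the source
 * device, so it keeps a plain WRITE even on a zoned device.  Only zoned
 * writes outside dev-replace are converted to ZONE_APPEND (assumption).
 */
enum model_op model_pick_repair_op(const struct model_repair_write *w)
{
	assert(w != NULL);

	if (w->zoned && !w->dev_replace)
		return MODEL_OP_ZONE_APPEND;
	return MODEL_OP_WRITE;
}

/*
 * The write location only has to be recorded at I/O completion when the
 * device chose it, i.e. for ZONE_APPEND.  A plain WRITE lands where it
 * was submitted, so the recording step is skipped.
 */
bool model_needs_record_physical(enum model_op op)
{
	return op == MODEL_OP_ZONE_APPEND;
}
```

With this, a zoned dev-replace write yields MODEL_OP_WRITE and never reaches
the recording step, which is the path that depends on the inode pointer.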

Let me try it locally first.

Thanks,
Qu
>
> So, calling into the record_physical function looks strange to me. Is it
> missing some condition for using ZONE_APPEND?
>
>>>
>>>> But I'm more concerned about why we have a full zone before that crash.
>>>
>>> I think this is happening because we can't account for the zone filling
>>> without the proper context.
>>
>> I believe it's a different problem, maybe some de-sync between scrub
>> write_pointer and the real zoned pointer inside the device.
>>
>> My current guess is, the target zone inside the target device is not
>> properly reset before dev-replace.
>
> This must be a different issue. Are we choosing that zone for zone finish
> to free the active zone resource?
