From: Sergei Shtepa <sergei.shtepa@veeam.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Damien Le Moal <Damien.LeMoal@wdc.com>,
"axboe@kernel.dk" <axboe@kernel.dk>,
"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
"hch@infradead.org" <hch@infradead.org>,
"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
"len.brown@intel.com" <len.brown@intel.com>,
"pavel@ucw.cz" <pavel@ucw.cz>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
"ming.lei@redhat.com" <ming.lei@redhat.com>,
"jack@suse.cz" <jack@suse.cz>, "tj@kernel.org" <tj@kernel.org>,
"gustavo@embeddedor.com" <gustavo@embeddedor.com>,
"bvanassche@acm.org" <bvanassche@acm.org>,
"osandov@fb.com" <osandov@fb.com>,
"koct9i@gmail.com" <koct9i@gmail.com>,
"steve@sk2.org" <steve@sk2.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] Block layer filter - second version
Date: Wed, 21 Oct 2020 17:35:53 +0300
Message-ID: <20201021143553.GG20749@veeam.com>
In-Reply-To: <20201021130753.GM20115@casper.infradead.org>
The 10/21/2020 16:07, Matthew Wilcox wrote:
> On Wed, Oct 21, 2020 at 03:55:55PM +0300, Sergei Shtepa wrote:
> > The 10/21/2020 14:44, Matthew Wilcox wrote:
> > > I don't understand why O_DIRECT gets to bypass the block filter. Nor do
> > > I understand why anybody would place a block filter on the swap device.
> > > But if somebody did place a filter on the swap device, why should swap
> > > be able to bypass the filter?
> >
> > Yes, intercepting the swap partition is absurd. But we can't guarantee
> > that the filter won't intercept swap.
> >
> > Swap operation is related to the memory allocation logic. If swap on
> > the block device is accessed during a memory allocation made by the
> > filter, a deadlock occurs. We could allow filters to occasionally shoot
> > themselves in the foot, especially under high load. But I think it's
> > better not to.
>
> We already have logic to prevent this in Linux. Filters need to
> call memalloc_noio_save() while they might cause swap to happen and
> memalloc_noio_restore() once it's safe for them to cause swap again.
Yes, I looked at this function; it could really be useful for the filter.
Then I don't need to introduce the submit_bio_direct() function, but the
wait loop associated with the queue polling function blk_mq_poll() will
have to be rewritten.
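
To make the save/restore pairing concrete, here is a minimal userspace
sketch of how I understand the scope API. It is not kernel code: the two
functions below are stand-ins that only model the per-task flag, the bit
value is arbitrary, and alloc_may_start_io() and filter_handle_bio() are
hypothetical names used for illustration.

```c
#include <stdio.h>

/* Userspace model only. In the kernel the flag lives in current->flags;
 * the bit value here is arbitrary and chosen for illustration. */
#define PF_MEMALLOC_NOIO 0x1u

static unsigned int task_flags; /* stands in for current->flags */

/* Enter a no-I/O allocation scope; return the previous state of the bit. */
static unsigned int memalloc_noio_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOIO;

	task_flags |= PF_MEMALLOC_NOIO;
	return old;
}

/* Leave the scope by putting back whatever state was saved. */
static void memalloc_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOIO) | old;
}

/* Hypothetical helper: may an allocation made now start I/O (e.g. swap)? */
static int alloc_may_start_io(void)
{
	return !(task_flags & PF_MEMALLOC_NOIO);
}

/* Hypothetical filter callback: every allocation between save and restore
 * is treated as no-I/O, so it cannot recurse into the filtered device. */
static int filter_handle_bio(void)
{
	unsigned int noio = memalloc_noio_save();
	int io_allowed = alloc_may_start_io(); /* 0 inside the scope */

	memalloc_noio_restore(noio);
	return io_allowed;
}
```

Because restore puts back the saved state instead of clearing the bit, the
scopes nest: if the caller of the filter is already inside a no-I/O scope,
it stays in that scope after the filter returns.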
>
> > "directly access" - it is not O_DIRECT. This means (I think) direct
> > reading from the device file, like "dd if=/dev/sda1".
> > As for intercepting direct reading, I don't know how to do the right thing.
> >
> > The problem here is that __blkdev_direct_IO() in fs/block_dev.c
> > uses qc, the value returned by the submit_bio() function.
> > This value is used below when calling
> > blk_poll(bdev_get_queue(dev), qc, true).
> > The filter cannot return a meaningful value of the blk_qc_t type when
> > intercepting a request, because at that time it does not know which queue
> > the request will fall into.
> >
> > If submit_bio() always returns BLK_QC_T_NONE, I think the
> > algorithm of __blkdev_direct_IO() will not work correctly.
> > If we need to intercept direct access to a block device, we need to at
> > least rework the __blkdev_direct_IO() function, getting rid of blk_poll.
> > I'm not sure it's necessary yet.
>
> This isn't part of the block layer that I'm familiar with, so I can't
> help solve this problem, but allowing O_DIRECT to bypass the block filter
> is a hole that needs to be fixed before these patches can be considered.
I think there is no such problem, but I will check, of course.
--
Sergei Shtepa
Veeam Software developer.