From mboxrd@z Thu Jan  1 00:00:00 1970
From: Maxim Patlasov <mpatlasov@parallels.com>
Subject: Re: [PATCH 09/14] fuse: Implement writepages and write_begin/write_end
 callbacks - v3
Date: Fri, 14 Jun 2013 18:03:01 +0400
Message-ID: <51BB2295.1020105@parallels.com>
References: <20130401103749.19027.89833.stgit@maximpc.sw.ru> <20130401104202.19027.99619.stgit@maximpc.sw.ru> <CAJfpegskt2zmKyx5sB49em86HqEJH3mZFH5hnuDXPuD_EtYzyw@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8;
	format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: Kirill Korotaev <dev@parallels.com>,
	Pavel Emelianov <xemul@parallels.com>,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>,
	James Bottomley <jbottomley@parallels.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linux-Fsdevel <linux-fsdevel@vger.kernel.org>, <devel@openvz.org>
To: Miklos Szeredi <miklos@szeredi.hu>
Return-path: <linux-kernel-owner@vger.kernel.org>
In-Reply-To: <CAJfpegskt2zmKyx5sB49em86HqEJH3mZFH5hnuDXPuD_EtYzyw@mail.gmail.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Hi Miklos,

04/25/2013 02:35 PM, Miklos Szeredi =D0=BF=D0=B8=D1=88=D0=B5=D1=82:
> On Mon, Apr 1, 2013 at 12:42 PM, Maxim V. Patlasov
> <MPatlasov@parallels.com> wrote:
>> The .writepages one is required to make each writeback request carry=
 more than
>> one page on it.
> I'd split this into two parts:
>
> 1) implement ->writepages() and enable it unconditionally for mmaped
> writeback (why is it not enabled by this patch?)
>
> 2) implement ->write_begin() and ->write_end() and conditionally
> enable cached writeback
>
> More comments inline.

Thanks a lot for careful review. I agree with most of your comments and=
=20
will address them in the next version of patchset. The only point I=20
disagree is the following:

>> +
>> +       spin_lock(&fc->lock);
>> +       list_add(&req->writepages_entry, &fi->writepages);
>> +       spin_unlock(&fc->lock);
>> +
>> +       for (i =3D 0; i < req->num_pages; i++) {
>> +               struct page *page =3D req->pages[i];
>> +               struct page *tmp_page;
>> +
>> +               tmp_page =3D alloc_page(GFP_NOFS | __GFP_HIGHMEM);
>> +               if (tmp_page) {
>> +                       copy_highpage(tmp_page, page);
>> +                       inc_bdi_stat(bdi, BDI_WRITEBACK);
>> +                       inc_zone_page_state(tmp_page, NR_WRITEBACK_T=
EMP);
> Is there a reason why we do the page allocation/copying here instead
> of at fill time?  I'd guess it would be simpler and more logical
> there.

There is a problem to have in mind: we can't call=20
end_page_writeback(page) before update of fuse writeback state=20
(fi->writepages). Otherwise a nasty race would be possible when an=20
activity for that particular page offset intervenes in the middle of=20
writeback leading to multiple in-flight fuse requests for given page of=
fset.

What you suggested is doable but would require additional space to stas=
h=20
pointers to original pages because we cannot release them immediately=20
after copying (due to the problem described above). The size of the=20
space is not known in advance because we don't know how many pages=20
write_cache_pages() will process. The size is currently limited by=20
sizeof(struct page *) * FUSE_MAX_PAGES_PER_REQ, but may become variable=
=20
in future (a lot of people complain about 128K limit of fuse request).=20
Adding dynamically growing array of pages would make the code neither=20
simpler nor logical.

The approach I implemented utilizes the fact that=20
fuse_page_is_writeback() and friends require only=20
req->misc.write.in.offset and req->num_pages to be set correctly. Actua=
l=20
pointers in req->pages[] doesn't matter. Thus, as soon as the two=20
parameters are known, I add the request to fi->writepages (blocking=20
other operations on given page offset) and perform "in place"=20
allocation/copying avoiding need for extra space to stash page pointers=
=2E

Thanks,
Maxim