From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Maxim V. Patlasov" Subject: Re: [PATCH 6/10] fuse: Trust kernel i_size only Date: Fri, 16 Nov 2012 14:32:33 +0400 Message-ID: <50A61641.4070309@parallels.com> References: <4FF3156E.8030109@parallels.com> <4FF3160B.6090501@parallels.com> <8762a3pp3m.fsf@tucsk.pomaz.szeredi.hu> <5069D344.9090501@parallels.com> <87lie1ho7j.fsf@tucsk.pomaz.szeredi.hu> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Pavel Emelianov , "fuse-devel@lists.sourceforge.net" , Alexander Viro , linux-fsdevel , James Bottomley , Kirill Korotaev To: Miklos Szeredi Return-path: Received: from relay.parallels.com ([195.214.232.42]:50816 "EHLO relay.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750817Ab2KPKch convert rfc822-to-8bit (ORCPT ); Fri, 16 Nov 2012 05:32:37 -0500 In-Reply-To: <87lie1ho7j.fsf@tucsk.pomaz.szeredi.hu> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Hi Miklos, Thanks a lot for reply. See please inline comments below... 11/16/2012 01:49 PM, Miklos Szeredi =D0=BF=D0=B8=D1=88=D0=B5=D1=82: > "Maxim V. Patlasov" writes: > >> We should probably look at what NFS is doing. >> >> >> In case of NFS, the flush does updates the modification time on serv= er. And on >> client, getattr triggers flush: >> >> >> int nfs_getattr(struct vfsmount *mnt, struct dentry *dentry, st= ruct kstat >> *stat) >> { >> ... >> >> /* Flush out writes to the server in order to update c/mtim= e. */ >> if (S_ISREG(inode->i_mode)) { >> nfs_inode_dio_wait(inode); >> err =3D filemap_write_and_wait(inode->i_mapping); >> if (err) >> goto out; >> } >> >> >> In another email of this thread you suggested some approach where in= -kernel >> fuse flushes i_mtime to userspace: >> >> >> So basically what we need is a per-inode flag that says that i_= mtime has >> been updated (it is more recent then what userspace has) and we= must >> update i_mtime *only* in write and not other operations which s= till do >> the mtime update in the userspace filesystem. Any operation th= at >> modifies i_mtime (and hence invalidate the attributes) must cle= ar the >> flag. Any other operation which updates or invalidates the att= ributes >> must first flush the i_mtime to userspace if the flag is set. >> >> In addition the userspace fileystem need to implement the polic= y similar >> to NFS, in which it only updates mtime if it is greater than th= e current >> one. This means that we must differentiate between an mtime up= date due >> to a buffered write from an mtime update due to an utime (and f= riends) >> system call. >> >> >> My question is why do we need all these complications if we could fo= llow NFS >> way: trigger flush and wait for its (and fuse write-back) completion= before >> sending FUSE_GETATTR to userspace? > > Yes, the NFS way seems like a good approach assuming that getattrs ar= e > not too frequent. But I guess the fact that NFS does this is a prett= y > good assurance that will work fine. My first intention was to follow NFS way because it's very simple and=20 straightforward. But then I realized it would impact performance too=20 badly. You correctly noticed that frequent getattrs will be a problem.=20 But there will be a problem even without it: a single innocent 'ls' wil= l=20 wait for pretty long till all dirty memory is flushed. >> Another concern is about the idea of sending i_mtime to userspace pe= r se. You >> wrote: >> >> >> If we are doing buffered writes, then the kernel must update i_= mtime on >> write(2) and must flush that to the userspace filesystem at som= e point >> (with a SETTATTR operation). >> >> >> Fuse userspace may have its own non-trivial concept of 'modification= time'. >> It's not obliged to advance its mtime on every write. The only requi= rement is >> to be consistent: if we expose new data handling READs, mtime must b= e advanced >> properly as well. But, for example, the granularity of changes is up= to >> implementation. From this view, in-kenel fuse pushing i_mtime with a= SETATTR >> operation would look like a cheating userspace. What do you think? > I think you are right in that mixing kernel mtime updates with usersp= ace > mtime updates doesn't work. Either the kernel should be wholly > responsible (which works only for "local" filesystems) or the userspa= ce > is fully responsible for mtime updates (which works in all cases but = may > be suboptimal). Not having any feedback from you for long while I worked pretty hard to= =20 implement the approach you suggested early (update mtime locally on=20 buffered writes, flush it to userspace when needed). Now I have an=20 implementation that works in my tests. I'll send patches soon. Thanks, Maxim -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel= " in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html