From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tyler Hicks Subject: Fwd: Re: [PATCH] eCryptfs: Quota check incorrectly ignored Date: Fri, 10 Feb 2012 16:58:05 -0600 Message-ID: <20120210225805.GA7493@boyd> References: <20120210194142.GB15606@boyd> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="9jxsPFA5p3P2qPhR" Cc: Li Wang , Jan Kara , Yong Peng , viro@zeniv.linux.org.uk, akpm@linux-foundation.org, taysom@chromium.org, Thieu Le To: ecryptfs@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Return-path: Content-Disposition: inline In-Reply-To: <20120210194142.GB15606@boyd> Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org --9jxsPFA5p3P2qPhR Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Thieu says his emails to the vger.kernel.org lists are bouncing. Forwarding for completeness. Tyler ----- Forwarded message from Thieu Le ----- > Date: Fri, 10 Feb 2012 14:31:12 -0800 > From: Thieu Le > To: Tyler Hicks > Cc: Li Wang , Jan Kara , ecryptfs@vger.= kernel.org, > linux-kernel@vger.kernel.org, Yong Peng , viro@= zeniv.linux.org.uk, > linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org, taysom@chromiu= m.org > Subject: Re: [PATCH] eCryptfs: Quota check incorrectly ignored >=20 > +taysom >=20 > I am onboard with backing out the write-back cache patch. It looks like > we're just peeling the layers of this onion. >=20 > On Fri, Feb 10, 2012 at 11:41 AM, Tyler Hicks wro= te: >=20 > > On 2012-02-10 13:31:49, 'Tyler Hicks' wrote: > > > On 2012-02-10 22:44:05, Li Wang wrote: > > > > > -----Original Message----- > > > > > From: Jan Kara [mailto:jack@suse.cz] > > > > > Sent: Friday, February 10, 2012 6:32 PM > > > > > To: Li Wang > > > > > Cc: Tyler Hicks; ecryptfs@vger.kernel.org; > > linux-kernel@vger.kernel.org; > > > > Jan > > > > > Kara; Yong Peng > > > > > Subject: Re: [PATCH] eCryptfs: Quota check incorrectly ignored > > > > > > > > > > On Thu 09-02-12 19:39:32, Li Wang wrote: > > > > > > eCryptfs recently modified the write path to perform encryption > > > > > > and write down in ecryptfs_writepage(). This function invokes > > > > vfs_write() > > > > > > to write down the encrypted data to lower page cache. vfs_write= () > > will > > > > > > first make sure this write will not exceed the quota limit for = the > > owner > > > > > > of the file being written into, if quota is supported by > > > > > > the underlying file system, and it is turned on. Normally, it > > > > accomplishs > > > > > > this job by calling check_idq()/check_bdq() (fs/quota/dquot.c). > > When > > > > system > > > > > > dirty ratio is not high, ecryptfs_writepage() is normally invok= ed > > by the > > > > > > write back kernel thread, who has the capability CAP_SYS_RESOUR= CE, > > > > > > this priority will let check_idq()/check_bdq() directly bypass = the > > quota > > > > > > check. This sometimes makes data safely written into the disk in > > spite > > > > of > > > > > > exceeding the quota limit. > > > > > > > > > > > > This patch temporarily removes the CAP_SYS_RESOURCE capability > > from the > > > > > kernel > > > > > > thread before invoking vfs_write_lower(), to let it undergo quo= ta > > check > > > > by > > > > > > the lower file system, if necessary. After that, reassign the > > > > capability. > > > > > Hmm, but then the error will just be thrown away by the flusher > > thread > > > > > and the application never learns about it? That doesn't sound lik= e a > > good > > > > > solution. > > > > > > > > > > Honza > > > > > > > > Yes. we are aware of that, but we have not found better solution, > > since it > > > > seems > > > > VFS does not supply a file system independent interface to check qu= ota > > early > > > > and only > > > > (fix us if we are wrong), The file systems just perform the quota > > check in > > > > their own way, > > > > the routine is wrapped deeply inside the file system specific code > > path. > > > > Maybe we could > > > > copy some codes from _dquot_alloc_space() & dquot_alloc_inode > > > > (fs/quota/dquot.c) to > > > > perform quota check on lower inode ourselves, but that is ugly, and= we > > are > > > > not sure if it works > > > > for any file system or not..., On the other side, due to the existe= nce > > of > > > > write buffer, and io schedule, > > > > the successful write call does not gurantee the data will be written > > into > > > > disk, in terms of that, > > > > we do not change much of the semantic of write call. > > > > > > Maybe I should just revert 57db4e8d73ef2b5e94a3f412108dff2576670a8a > > > (and friends) and go back to the write-through cache model for eCrypt= fs. > > > The write-back model has some nice performance improvements, but we k= eep > > > running into little issues like this. For write-through to work > > > reliably, we're going to need some support from the VFS that isn't th= ere > > > yet. > > > > Sorry, that should be "For write-back to work reliably, we're going to > > need some support from the VFS that isn't there yet." > > > > Tyler > > > > > > > > Handling ENOSPC correctly is another area that is lacking after the > > > switch to write-back. Currently, an application will continue to write > > > out data without having any indication that the disk is full. Thieu > > > proposed a fix, but it was never merged: > > > > > > http://article.gmane.org/gmane.linux.kernel/1209265 > > > > > > So, what do folks think about going back to a write-through cache unt= il > > > we have the ability to alert the application of these error condition= s? > > > > > > Tyler > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Li Wang > > > > > > Singed-off-by: Yong Peng > > > > > > --- > > > > > > > > > > > > To repeat this bug, > > > > > > mount -o usrquota /dev/sda3 /tmp > > > > > > cd /tmp > > > > > > edquota -u foo // set the disk quota limit for user foo= be > > m1 bytes > > > > > > quotaon -a > > > > > > mount -t ecryptfs cipher plain > > > > > > > > > > > > login the system as user foo > > > > > > cd /tmp/plain > > > > > > execute the following simple program > > > > > > > > > > > > int main() > > > > > > { > > > > > > char buf[m2]; // m2>m1 > > > > > > FILE *f =3D fopen("dummy", "w"); > > > > > > fwrite(buf, 1, m2, f); > > > > > > sleep(60); // let the kernel thread do the write > > back job > > > > > > fclose(f); > > > > > > return 0; > > > > > > } > > > > > > > > > > > > This program can write as much of data as it wants, > > provided sleep > > > > > enough long time before closing the file. > > > > > > > > > > > > fs/ecryptfs/crypto.c | 27 +++++++++++++++++++++++++++ > > > > > > 1 files changed, 27 insertions(+), 0 deletions(-) > > > > > > > > > > > > diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c > > > > > > index 63ab245..2c1da29 100644 > > > > > > --- a/fs/ecryptfs/crypto.c > > > > > > +++ b/fs/ecryptfs/crypto.c > > > > > > @@ -457,6 +457,7 @@ int ecryptfs_encrypt_page(struct page *page) > > > > > > struct page *enc_extent_page =3D NULL; > > > > > > loff_t extent_offset; > > > > > > int rc =3D 0; > > > > > > + struct cred *cred; > > > > > > > > > > > > ecryptfs_inode =3D page->mapping->host; > > > > > > crypt_stat =3D > > > > > > @@ -487,8 +488,34 @@ int ecryptfs_encrypt_page(struct page *pag= e) > > > > > > * (PAGE_CACHE_SIZE > > > > > > / crypt_stat->extent_size= )) > > > > > > + extent_offset), crypt_stat); > > > > > > + if (current->flags & PF_KTHREAD) { > > > > > > + /* > > > > > > + * Temporarily remove the > > > > > CAP_SYS_RESOURCE capability > > > > > > + * from the write back kernel thread to > > let it > > > > > undergo > > > > > > + * quota check by the lower file system > > > > > > + */ > > > > > > + cred =3D prepare_creds(); > > > > > > + if (unlikely(!cred)) { > > > > > > + rc =3D -ENOMEM; > > > > > > + goto out; > > > > > > + } > > > > > > + cap_lower(cred->cap_effective, > > CAP_SYS_RESOURCE); > > > > > > + commit_creds(cred); > > > > > > + } > > > > > > rc =3D ecryptfs_write_lower(ecryptfs_inode, > > enc_extent_virt, > > > > > > offset, > > crypt_stat->extent_size); > > > > > > + if (current->flags & PF_KTHREAD) { > > > > > > + /* > > > > > > + * Reassign the CAP_SYS_RESOURCE > > > > > capability > > > > > > + */ > > > > > > + cred =3D prepare_creds(); > > > > > > + if (unlikely(!cred)) { > > > > > > + rc =3D -ENOMEM; > > > > > > + goto out; > > > > > > + } > > > > > > + cap_raise(cred->cap_effective, > > CAP_SYS_RESOURCE); > > > > > > + commit_creds(cred); > > > > > > + } > > > > > > if (rc < 0) { > > > > > > ecryptfs_printk(KERN_ERR, "Error > > attempting " > > > > > > "to write lower page; r= c =3D > > [%d]" > > > > > > -- > > > > > > 1.7.6.5 > > > > > -- > > > > > Jan Kara > > > > > SUSE Labs, CR > > > > > > > > > > ----- End forwarded message ----- --9jxsPFA5p3P2qPhR Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAEBCgAGBQJPNaD9AAoJENaSAD2qAscKk+UP/iIEzADa5e1x9MIPz4edoZIU FMHGBwlPFytoUHbIcEfu2TuRZdg3udZTfMZoRhZoDmRn4NKsB2uctIT+/BRd85SJ na8vON31vsRBsdxcuKzYUcfPCzE+IQvixwbgjHdBFZlAUG47tB9Wufco1kXURJS9 0pDEoh1V+mXQb0sNM5ufaEup1XLeWvBB+1nej2L/MTJYrIWo0gcgj80pOlIIdHPd lzq4iDUpp6wM8qKh247vu7kODJDI+tTmQPfzwJqWmCCCcheCvrM84sZnDtgzjUC6 8tzAK1cerWmvyzEtwfUS6ZGpPpWZlxhYu5Kk2PyI4G+suAr6vf89EliQYPRp7a6o FtqsgsjSdFzDshK+ACa4Y6yHQLbGTph1w3hSqDr2Zp0OkcCkXytWWd+SAcFgOy0Q QSueNLEV/nnpk1NYpCbx2rvADMAMuuGrEBOgr17qtiquXMSOz5h+rKnP0An4ybYy 7JFUPHKOMqv7MFt9Lv5djlUASQJbBdIRdgBHjYwqG4rDFT+aoZPvDK9DYrjok/P0 GLs9Pyoydtcz0yyaBRiyS2N5qJ9XEqOnzop8lzrXvHMISnGNCHjx9eYvlCy6MHSn m3ydzNuMaOyM34EYhvR3cgKeuB9M5zrU/rMJkcysUXBlFqXCHQY8uufKuSQ4mF/b JASMKsAtsA0gN9vlbW40 =Pq5l -----END PGP SIGNATURE----- --9jxsPFA5p3P2qPhR--