* bug: truncate to zero + setuid
@ 2007-10-28 14:36 Roger Willcocks
2007-10-29 0:54 ` Tim Shimmin
0 siblings, 1 reply; 8+ messages in thread
From: Roger Willcocks @ 2007-10-28 14:36 UTC (permalink / raw)
To: xfs
The nfsv3 setattr call permits a simultaneous truncate + setuid/gid
operation. Normally XFS handles this fine, but if the file's being
truncated to zero, and the file's already empty, XFS simply ignores the
setuid/gid part, returning 'success'.
The error's in xfs_vnodeops.c/xfs_setattr below the comment
'Short circuit the truncate case for zero length files', which bypasses
all other changes.
The simplest fix is to test whether this is the only change that's
happening, otherwise you get tangled in transactions.
if (mask & XFS_AT_SIZE) {
/* Short circuit the truncate case for zero length files */
- if ((vap->va_size == 0) &&
+ if (((mask & ~XFS_AT_SIZE) == 0) && (vap->va_size == 0) &&
(ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
--
Roger
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-10-28 14:36 bug: truncate to zero + setuid Roger Willcocks
@ 2007-10-29 0:54 ` Tim Shimmin
2007-10-29 18:56 ` Roger Willcocks
0 siblings, 1 reply; 8+ messages in thread
From: Tim Shimmin @ 2007-10-29 0:54 UTC (permalink / raw)
To: Roger Willcocks; +Cc: xfs
Hi Roger,
Roger Willcocks wrote:
> The nfsv3 setattr call permits a simultaneous truncate + setuid/gid
> operation. Normally XFS handles this fine, but if the file's being
> truncated to zero, and the file's already empty, XFS simply ignores the
> setuid/gid part, returning 'success'.
>
> The error's in xfs_vnodeops.c/xfs_setattr below the comment 'Short
> circuit the truncate case for zero length files', which bypasses all
> other changes.
>
> The simplest fix is to test whether this is the only change that's
> happening, otherwise you get tangled in transactions.
>
> if (mask & XFS_AT_SIZE) {
> /* Short circuit the truncate case for zero length files */
> - if ((vap->va_size == 0) &&
> + if (((mask & ~XFS_AT_SIZE) == 0) && (vap->va_size == 0) &&
> (ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
>
>
> --
> Roger
>
Yeah, I see your problem.
It is interesting to know how much of multiple actions (different
bits of mask) are supported.
XFS_AT_UPDTIMES doesn't support anything else and will return quickly.
As far as code which goes via error_return, it looks like the
only non-error case is the short-circuit one mentioned above -
truncating to zero an already zeroed file.
I see your fix will get what we want although, it will mean that we
will go thru the truncating path when we don't need to.
It would be nice if we could act like the XFS_AT_SIZE was never set
on the call in the case that other bits are set.
Though when we first test the AT_SIZE, we don't have the inode locked.
And I presume in that case we don't necessarily need to send a
DMAPI truncate event?
So I wonder if before the 1st AT_SIZE test that we have now,
in the AT_SIZE case and with other bits set,
we could lock the inode and test
for the zero truncate of zero-len file and if so then remove the AT_SIZE bit
and then go on to the other testing as if AT_SIZE was never set.
Hmmm, may be that just complicates things too much.
--Tim
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-10-29 0:54 ` Tim Shimmin
@ 2007-10-29 18:56 ` Roger Willcocks
2007-10-30 4:06 ` Timothy Shimmin
0 siblings, 1 reply; 8+ messages in thread
From: Roger Willcocks @ 2007-10-29 18:56 UTC (permalink / raw)
To: Tim Shimmin; +Cc: xfs
Tim Shimmin wrote:
> Hi Roger,
>
> Roger Willcocks wrote:
>> The nfsv3 setattr call permits a simultaneous truncate + setuid/gid
>> operation. Normally XFS handles this fine, but if the file's being
>> truncated to zero, and the file's already empty, XFS simply ignores
>> the setuid/gid part, returning 'success'.
>>
>> The error's in xfs_vnodeops.c/xfs_setattr below the comment 'Short
>> circuit the truncate case for zero length files', which bypasses all
>> other changes.
>>
>> The simplest fix is to test whether this is the only change that's
>> happening, otherwise you get tangled in transactions.
>>
>> if (mask & XFS_AT_SIZE) {
>> /* Short circuit the truncate case for zero length
>> files */
>> - if ((vap->va_size == 0) &&
>> + if (((mask & ~XFS_AT_SIZE) == 0) && (vap->va_size ==
>> 0) &&
>> (ip->i_d.di_size == 0) && (ip->i_d.di_nextents ==
>> 0)) {
>>
>>
>> --
>> Roger
>>
>
> Yeah, I see your problem.
> It is interesting to know how much of multiple actions (different
> bits of mask) are supported.
> XFS_AT_UPDTIMES doesn't support anything else and will return quickly.
> As far as code which goes via error_return, it looks like the
> only non-error case is the short-circuit one mentioned above -
> truncating to zero an already zeroed file.
>
> I see your fix will get what we want although, it will mean that we
> will go thru the truncating path when we don't need to.
> It would be nice if we could act like the XFS_AT_SIZE was never set
> on the call in the case that other bits are set.
> Though when we first test the AT_SIZE, we don't have the inode locked.
> And I presume in that case we don't necessarily need to send a
> DMAPI truncate event?
> So I wonder if before the 1st AT_SIZE test that we have now,
> in the AT_SIZE case and with other bits set,
> we could lock the inode and test
> for the zero truncate of zero-len file and if so then remove the
> AT_SIZE bit
> and then go on to the other testing as if AT_SIZE was never set.
> Hmmm, may be that just complicates things too much.
>
Yes I looked at simply unsetting the XFS_AT_SIZE bit (you'd also need to set
timeflags appropriately) but the problem is the bit's used to check when
to create
a transaction - either up front as XFS_TRANS_SETATTR_NOT_SIZE, or
later on, after the inode's been locked and unlocked again, as
XFS_TRANS_SETATTR_SIZE. So if you unset the bit (and there's still more
to do) you need to build a not_size transaction, at which point you
might as well
rewrite the whole routine...
--
Roger
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-10-29 18:56 ` Roger Willcocks
@ 2007-10-30 4:06 ` Timothy Shimmin
2007-10-30 17:28 ` Roger Willcocks
0 siblings, 1 reply; 8+ messages in thread
From: Timothy Shimmin @ 2007-10-30 4:06 UTC (permalink / raw)
To: Roger Willcocks; +Cc: xfs
Roger Willcocks wrote:
> Tim Shimmin wrote:
>> Hi Roger,
>>
>> Roger Willcocks wrote:
>>> The nfsv3 setattr call permits a simultaneous truncate + setuid/gid
>>> operation. Normally XFS handles this fine, but if the file's being
>>> truncated to zero, and the file's already empty, XFS simply ignores
>>> the setuid/gid part, returning 'success'.
>>>
>>> The error's in xfs_vnodeops.c/xfs_setattr below the comment 'Short
>>> circuit the truncate case for zero length files', which bypasses all
>>> other changes.
>>>
>>> The simplest fix is to test whether this is the only change that's
>>> happening, otherwise you get tangled in transactions.
>>>
>>> if (mask & XFS_AT_SIZE) {
>>> /* Short circuit the truncate case for zero length
>>> files */
>>> - if ((vap->va_size == 0) &&
>>> + if (((mask & ~XFS_AT_SIZE) == 0) && (vap->va_size ==
>>> 0) &&
>>> (ip->i_d.di_size == 0) && (ip->i_d.di_nextents ==
>>> 0)) {
>>>
>>>
>>> --
>>> Roger
>>>
>>
>> Yeah, I see your problem.
>> It is interesting to know how much of multiple actions (different
>> bits of mask) are supported.
>> XFS_AT_UPDTIMES doesn't support anything else and will return quickly.
>> As far as code which goes via error_return, it looks like the
>> only non-error case is the short-circuit one mentioned above -
>> truncating to zero an already zeroed file.
>>
>> I see your fix will get what we want although, it will mean that we
>> will go thru the truncating path when we don't need to.
>> It would be nice if we could act like the XFS_AT_SIZE was never set
>> on the call in the case that other bits are set.
>> Though when we first test the AT_SIZE, we don't have the inode locked.
>> And I presume in that case we don't necessarily need to send a
>> DMAPI truncate event?
>> So I wonder if before the 1st AT_SIZE test that we have now,
>> in the AT_SIZE case and with other bits set,
>> we could lock the inode and test
>> for the zero truncate of zero-len file and if so then remove the
>> AT_SIZE bit
>> and then go on to the other testing as if AT_SIZE was never set.
>> Hmmm, may be that just complicates things too much.
>>
> Yes I looked at simply unsetting the XFS_AT_SIZE bit (you'd also need to
> set
> timeflags appropriately) but the problem is the bit's used to check when
> to create
> a transaction - either up front as XFS_TRANS_SETATTR_NOT_SIZE, or
> later on, after the inode's been locked and unlocked again, as
> XFS_TRANS_SETATTR_SIZE. So if you unset the bit (and there's still more
> to do) you need to build a not_size transaction, at which point you
> might as well
> rewrite the whole routine...
>
> --
> Roger
So this was done by Steve L. for AIM performance...
----------------------------
revision 1.446
date: 2000/06/09 02:59:45; author: lord; state: Exp; lines: +11 -0
modid: 2.4.0-test1-xfs:slinx:55891a
Merge of 2.3.99pre2-xfs:slinx:55891a by ananth.
Short circuit the case where we truncate a zero length
file down to zero length so that we do not do a transaction.
This close to doubles create/close AIM run performance.
----------------------------
> p_rdiff -r1.445,1.446 xfs_vnodeops.c
746a747,757
> /* Short circuit the truncate case for zero length files */
> if ((vap->va_size == 0) &&
> (ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
> xfs_iunlock(ip, XFS_ILOCK_EXCL);
> lock_flags &= ~XFS_ILOCK_EXCL;
> if (mask & AT_CTIME)
> xfs_ichgtime(ip, XFS_ICHGTIME_CHG);
> code = 0;
> goto error_return;
> }
>
----------------------------
revision 1.506
date: 2001/06/26 22:07:29; author: lord; state: Exp; lines: +1 -1
modid: 2.4.x-xfs:slinx:97694a
Set the mtime on a zero length file when a create is being done on
top of it.
----------------------------
> p_rdiff -r1.505,1.506 xfs_vnodeops.c
526c526
< xfs_ichgtime(ip, XFS_ICHGTIME_CHG);
---
> xfs_ichgtime(ip, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
I presume it was done where it was done so that the inode was locked and we
were under the XFS_AT_SIZE predicate.
I was just thinking of something like...
but I'm probably missing something.
Index: 2.6.x-xfs/fs/xfs/xfs_vnodeops.c
===================================================================
--- 2.6.x-xfs.orig/fs/xfs/xfs_vnodeops.c 2007-10-12 16:06:15.000000000 +1000
+++ 2.6.x-xfs/fs/xfs/xfs_vnodeops.c 2007-10-30 14:59:46.418837757 +1100
@@ -304,6 +304,24 @@
}
/*
+ * Short circuit the truncate case for zero length files.
+ * If more mask bits are set, then just remove the SIZE one
+ * and keep going.
+ */
+ if (mask & XFS_AT_SIZE) {
+ xfs_ilock(ip, XFS_ILOCK_SHARED);
+ if ((vap->va_size == 0) && (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
+ if (mask & ~XFS_AT_SIZE) {
+ mask &= ~XFS_AT_SIZE;
+ } else {
+ xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ return 0;
+ }
+ }
+ xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ }
+
+ /*
* For the other attributes, we acquire the inode lock and
* first do an error checking pass.
*/
@@ -451,17 +469,6 @@
* Truncate file. Must have write permission and not be a directory.
*/
if (mask & XFS_AT_SIZE) {
- /* Short circuit the truncate case for zero length files */
- if ((vap->va_size == 0) &&
- (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- lock_flags &= ~XFS_ILOCK_EXCL;
- if (mask & XFS_AT_CTIME)
- xfs_ichgtime(ip, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
- code = 0;
- goto error_return;
- }
-
if (VN_ISDIR(vp)) {
code = XFS_ERROR(EISDIR);
goto error_return;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-10-30 4:06 ` Timothy Shimmin
@ 2007-10-30 17:28 ` Roger Willcocks
2007-11-02 1:11 ` Timothy Shimmin
0 siblings, 1 reply; 8+ messages in thread
From: Roger Willcocks @ 2007-10-30 17:28 UTC (permalink / raw)
To: Timothy Shimmin; +Cc: xfs
Timothy Shimmin wrote:
> Roger Willcocks wrote:
>> Tim Shimmin wrote:
>>> Hi Roger,
>>>
>>> Roger Willcocks wrote:
>>>> The nfsv3 setattr call permits a simultaneous truncate + setuid/gid
>>>> operation. Normally XFS handles this fine, but if the file's being
>>>> truncated to zero, and the file's already empty, XFS simply ignores
>>>> the setuid/gid part, returning 'success'.
>>>>
>>>> The error's in xfs_vnodeops.c/xfs_setattr below the comment 'Short
>>>> circuit the truncate case for zero length files', which bypasses
>>>> all other changes.
>>>>
>>>> The simplest fix is to test whether this is the only change that's
>>>> happening, otherwise you get tangled in transactions.
>>>>
>>>> if (mask & XFS_AT_SIZE) {
>>>> /* Short circuit the truncate case for zero length
>>>> files */
>>>> - if ((vap->va_size == 0) &&
>>>> + if (((mask & ~XFS_AT_SIZE) == 0) && (vap->va_size
>>>> == 0) &&
>>>> (ip->i_d.di_size == 0) && (ip->i_d.di_nextents ==
>>>> 0)) {
>>>>
>>>>
>>>> --
>>>> Roger
>>>>
>>>
>>> Yeah, I see your problem.
>>> It is interesting to know how much of multiple actions (different
>>> bits of mask) are supported.
>>> XFS_AT_UPDTIMES doesn't support anything else and will return quickly.
>>> As far as code which goes via error_return, it looks like the
>>> only non-error case is the short-circuit one mentioned above -
>>> truncating to zero an already zeroed file.
>>>
>>> I see your fix will get what we want although, it will mean that we
>>> will go thru the truncating path when we don't need to.
>>> It would be nice if we could act like the XFS_AT_SIZE was never set
>>> on the call in the case that other bits are set.
>>> Though when we first test the AT_SIZE, we don't have the inode locked.
>>> And I presume in that case we don't necessarily need to send a
>>> DMAPI truncate event?
>>> So I wonder if before the 1st AT_SIZE test that we have now,
>>> in the AT_SIZE case and with other bits set,
>>> we could lock the inode and test
>>> for the zero truncate of zero-len file and if so then remove the
>>> AT_SIZE bit
>>> and then go on to the other testing as if AT_SIZE was never set.
>>> Hmmm, may be that just complicates things too much.
>>>
>> Yes I looked at simply unsetting the XFS_AT_SIZE bit (you'd also need
>> to set
>> timeflags appropriately) but the problem is the bit's used to check
>> when to create
>> a transaction - either up front as XFS_TRANS_SETATTR_NOT_SIZE, or
>> later on, after the inode's been locked and unlocked again, as
>> XFS_TRANS_SETATTR_SIZE. So if you unset the bit (and there's still more
>> to do) you need to build a not_size transaction, at which point you
>> might as well
>> rewrite the whole routine...
>>
>> --
>> Roger
>
> So this was done by Steve L. for AIM performance...
>
> ----------------------------
> revision 1.446
> date: 2000/06/09 02:59:45; author: lord; state: Exp; lines: +11 -0
> modid: 2.4.0-test1-xfs:slinx:55891a
> Merge of 2.3.99pre2-xfs:slinx:55891a by ananth.
>
> Short circuit the case where we truncate a zero length
> file down to zero length so that we do not do a transaction.
> This close to doubles create/close AIM run performance.
> ----------------------------
>
> > p_rdiff -r1.445,1.446 xfs_vnodeops.c
> 746a747,757
> > /* Short circuit the truncate case for zero length
> files */
> > if ((vap->va_size == 0) &&
> > (ip->i_d.di_size == 0) && (ip->i_d.di_nextents ==
> 0)) {
> > xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > lock_flags &= ~XFS_ILOCK_EXCL;
> > if (mask & AT_CTIME)
> > xfs_ichgtime(ip, XFS_ICHGTIME_CHG);
> > code = 0;
> > goto error_return;
> > }
> >
>
>
> ----------------------------
> revision 1.506
> date: 2001/06/26 22:07:29; author: lord; state: Exp; lines: +1 -1
> modid: 2.4.x-xfs:slinx:97694a
> Set the mtime on a zero length file when a create is being done on
> top of it.
> ----------------------------
>
> > p_rdiff -r1.505,1.506 xfs_vnodeops.c
> 526c526
> < xfs_ichgtime(ip, XFS_ICHGTIME_CHG);
> ---
> > xfs_ichgtime(ip, XFS_ICHGTIME_MOD |
> XFS_ICHGTIME_CHG);
>
>
>
>
> I presume it was done where it was done so that the inode was locked
> and we
> were under the XFS_AT_SIZE predicate.
>
> I was just thinking of something like...
> but I'm probably missing something.
>
> Index: 2.6.x-xfs/fs/xfs/xfs_vnodeops.c
> ===================================================================
> --- 2.6.x-xfs.orig/fs/xfs/xfs_vnodeops.c 2007-10-12
> 16:06:15.000000000 +1000
> +++ 2.6.x-xfs/fs/xfs/xfs_vnodeops.c 2007-10-30 14:59:46.418837757
> +1100
> @@ -304,6 +304,24 @@
> }
>
> /*
> + * Short circuit the truncate case for zero length files.
> + * If more mask bits are set, then just remove the SIZE one
> + * and keep going.
> + */
> + if (mask & XFS_AT_SIZE) {
> + xfs_ilock(ip, XFS_ILOCK_SHARED);
> + if ((vap->va_size == 0) && (ip->i_size == 0) &&
> (ip->i_d.di_nextents == 0)) {
> + if (mask & ~XFS_AT_SIZE) {
> + mask &= ~XFS_AT_SIZE;
> + } else {
> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
> + return 0;
> + }
> + }
> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
> + }
> +
> + /*
> * For the other attributes, we acquire the inode lock and
> * first do an error checking pass.
> */
> @@ -451,17 +469,6 @@
> * Truncate file. Must have write permission and not be a
> directory.
> */
> if (mask & XFS_AT_SIZE) {
> - /* Short circuit the truncate case for zero length
> files */
> - if ((vap->va_size == 0) &&
> - (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
> - xfs_iunlock(ip, XFS_ILOCK_EXCL);
> - lock_flags &= ~XFS_ILOCK_EXCL;
> - if (mask & XFS_AT_CTIME)
> - xfs_ichgtime(ip, XFS_ICHGTIME_MOD |
> XFS_ICHGTIME_CHG);
> - code = 0;
> - goto error_return;
> - }
> -
> if (VN_ISDIR(vp)) {
> code = XFS_ERROR(EISDIR);
> goto error_return;
This misses setting the access and changed times which still need to be
touched even if the file's already zero bytes. How about this:
(noting that open.c/do_truncate uses XFS_AT_SIZE | XFS_AT_CTIME)
--- xfs_vnodeops.c 2007-09-04 15:57:40.000000000 +0100
+++ /tmp/xfs_vnodeops.c 2007-10-30 17:11:32.000000000 +0000
@@ -378,6 +378,24 @@
return (code);
}
+
+ if ((mask & XFS_AT_SIZE) && (vap->va_size == 0)) {
+
+ /* Short circuit the truncate case for zero length files */
+
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
+ if ((ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ if (mask & XFS_AT_CTIME)
+ xfs_ichgtime(ip,
XFS_ICHGTIME_MOD|XFS_ICHGTIME_CHG);
+ mask &= ~(XFS_AT_SIZE|XFS_AT_CTIME);
+ if (mask == 0) {
+ code = 0;
+ goto error_return;
+ }
+ }
+ }
+
/*
* For the other attributes, we acquire the inode lock and
* first do an error checking pass.
@@ -528,17 +546,6 @@
* Truncate file. Must have write permission and not be a
directory.
*/
if (mask & XFS_AT_SIZE) {
- /* Short circuit the truncate case for zero length files */
- if ((vap->va_size == 0) &&
- (ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- lock_flags &= ~XFS_ILOCK_EXCL;
- if (mask & XFS_AT_CTIME)
- xfs_ichgtime(ip, XFS_ICHGTIME_MOD |
XFS_ICHGTIME_CHG);
- code = 0;
- goto error_return;
- }
-
if (vp->v_type == VDIR) {
code = XFS_ERROR(EISDIR);
goto error_return;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-10-30 17:28 ` Roger Willcocks
@ 2007-11-02 1:11 ` Timothy Shimmin
2007-11-04 11:59 ` Roger Willcocks
0 siblings, 1 reply; 8+ messages in thread
From: Timothy Shimmin @ 2007-11-02 1:11 UTC (permalink / raw)
To: Roger Willcocks; +Cc: xfs
Hi Roger,
Roger Willcocks wrote:
> Timothy Shimmin wrote:
>> I presume it was done where it was done so that the inode was locked
>> and we
>> were under the XFS_AT_SIZE predicate.
>>
>> I was just thinking of something like...
>> but I'm probably missing something.
>>
>> Index: 2.6.x-xfs/fs/xfs/xfs_vnodeops.c
>> ===================================================================
>> --- 2.6.x-xfs.orig/fs/xfs/xfs_vnodeops.c 2007-10-12
>> 16:06:15.000000000 +1000
>> +++ 2.6.x-xfs/fs/xfs/xfs_vnodeops.c 2007-10-30 14:59:46.418837757
>> +1100
>> @@ -304,6 +304,24 @@
>> }
>>
>> /*
>> + * Short circuit the truncate case for zero length files.
>> + * If more mask bits are set, then just remove the SIZE one
>> + * and keep going.
>> + */
>> + if (mask & XFS_AT_SIZE) {
>> + xfs_ilock(ip, XFS_ILOCK_SHARED);
>> + if ((vap->va_size == 0) && (ip->i_size == 0) &&
>> (ip->i_d.di_nextents == 0)) {
>> + if (mask & ~XFS_AT_SIZE) {
>> + mask &= ~XFS_AT_SIZE;
>> + } else {
>> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
>> + return 0;
>> + }
>> + }
>> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
>> + }
>> +
>> + /*
>> * For the other attributes, we acquire the inode lock and
>> * first do an error checking pass.
>> */
>> @@ -451,17 +469,6 @@
>> * Truncate file. Must have write permission and not be a
>> directory.
>> */
>> if (mask & XFS_AT_SIZE) {
>> - /* Short circuit the truncate case for zero length
>> files */
>> - if ((vap->va_size == 0) &&
>> - (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
>> - xfs_iunlock(ip, XFS_ILOCK_EXCL);
>> - lock_flags &= ~XFS_ILOCK_EXCL;
>> - if (mask & XFS_AT_CTIME)
>> - xfs_ichgtime(ip, XFS_ICHGTIME_MOD |
>> XFS_ICHGTIME_CHG);
>> - code = 0;
>> - goto error_return;
>> - }
>> -
>> if (VN_ISDIR(vp)) {
>> code = XFS_ERROR(EISDIR);
>> goto error_return;
>
> This misses setting the access and changed times which still need to be
> touched even if the file's already zero bytes. How about this:
> (noting that open.c/do_truncate uses XFS_AT_SIZE | XFS_AT_CTIME)
>
Well, if XFS_AT_CTIME was set then my patch wouldn't
return straight away and would continue processing the mask fields.
And I was presuming that the times would be set doing this.
However, it doesn't look quite so simple and consistent:
* going down the normal (non-short-circuit) path for AT_SIZE,
it doesn't test for CTIME but rather just does:
/*
* Have to do this even if the file's size doesn't change.
*/
timeflags |= XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
And yet our short-circuit case only does it if AT_CTIME is set.
Doesn't look consistent to me.
* So what in the vfs would call it...
-----------------------
open(O_TRUNC)...
int may_open(struct nameidata *nd, int acc_mode, int flag)
{
if (flag & O_TRUNC) {
...
error = do_truncate(dentry, 0, ATTR_MTIME|ATTR_CTIME, NULL);
------------------------
static long do_sys_ftruncate(unsigned int fd, loff_t length, int small)
{
if (!error)
error = do_truncate(dentry, length, ATTR_MTIME|ATTR_CTIME, file);
------------------------
This is NOR the case for
* do_sys_truncate() and
* do_coredump(),
however, which send in zero for those bits.
Not to mention other ways to get to these calls.
Anyway, open(O_TRUNC) will be a good candidate for the optimization
by the looks of it as it always trunc's to zero.
So in the 2 truncate cases, it is setting MTIME and CTIME.
And how is MTIME handled in the xfs code...
/*
* Change file access or modified times.
*/
if (mask & (XFS_AT_ATIME|XFS_AT_MTIME)) {
if (mask & XFS_AT_ATIME) {
ip->i_d.di_atime.t_sec = vap->va_atime.tv_sec;
ip->i_d.di_atime.t_nsec = vap->va_atime.tv_nsec;
ip->i_update_core = 1;
timeflags &= ~XFS_ICHGTIME_ACC;
}
if (mask & XFS_AT_MTIME) {
ip->i_d.di_mtime.t_sec = vap->va_mtime.tv_sec;
ip->i_d.di_mtime.t_nsec = vap->va_mtime.tv_nsec;
timeflags &= ~XFS_ICHGTIME_MOD;
timeflags |= XFS_ICHGTIME_CHG;
}
if (tp && (flags & ATTR_UTIME))
xfs_trans_log_inode (tp, ip, XFS_ILOG_CORE);
}
/*
* Send out timestamp changes that need to be set to the
* current time. Not done when called by a DMI function.
*/
if (timeflags && !(flags & ATTR_DMI))
xfs_ichgtime(ip, timeflags);
And note that MTIME, will actually turn on the timeflags for XFS_ICHGTIME_CHG,
which is the time associated with XFS_AT_CTIME.
So for the 2 do_truncate calls paths,
it will set the mtime based
on the va_mtime and set the ctime based on current time (nanotime()).
Our shortcut is setting both to current time.
I don't know if anyone really cares.
I don't like all these inconsistencies.
One way to reduce inconsistencies is to allow code to go thru
common paths so we can do the same strange thing in the one spot ;-)
It looks like in the AT_SIZE, we should always set those timeflags
irrespective of AT_CTIME.
BTW, your locking looks wrong - it appears you don't unlock when the
file is non-zero size.
--Tim
> --- xfs_vnodeops.c 2007-09-04 15:57:40.000000000 +0100
> +++ /tmp/xfs_vnodeops.c 2007-10-30 17:11:32.000000000 +0000
> @@ -378,6 +378,24 @@
> return (code);
> }
>
> +
> + if ((mask & XFS_AT_SIZE) && (vap->va_size == 0)) {
> +
> + /* Short circuit the truncate case for zero length files */
> +
> + xfs_ilock(ip, XFS_ILOCK_EXCL);
> + if ((ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
> + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> + if (mask & XFS_AT_CTIME)
> + xfs_ichgtime(ip,
> XFS_ICHGTIME_MOD|XFS_ICHGTIME_CHG);
> + mask &= ~(XFS_AT_SIZE|XFS_AT_CTIME);
> + if (mask == 0) {
> + code = 0;
> + goto error_return;
> + }
> + }
> + }
> +
> /*
> * For the other attributes, we acquire the inode lock and
> * first do an error checking pass.
> @@ -528,17 +546,6 @@
> * Truncate file. Must have write permission and not be a
> directory.
> */
> if (mask & XFS_AT_SIZE) {
> - /* Short circuit the truncate case for zero length files */
> - if ((vap->va_size == 0) &&
> - (ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
> - xfs_iunlock(ip, XFS_ILOCK_EXCL);
> - lock_flags &= ~XFS_ILOCK_EXCL;
> - if (mask & XFS_AT_CTIME)
> - xfs_ichgtime(ip, XFS_ICHGTIME_MOD |
> XFS_ICHGTIME_CHG);
> - code = 0;
> - goto error_return;
> - }
> -
> if (vp->v_type == VDIR) {
> code = XFS_ERROR(EISDIR);
> goto error_return;
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-11-02 1:11 ` Timothy Shimmin
@ 2007-11-04 11:59 ` Roger Willcocks
2007-11-08 3:13 ` Timothy Shimmin
0 siblings, 1 reply; 8+ messages in thread
From: Roger Willcocks @ 2007-11-04 11:59 UTC (permalink / raw)
To: Timothy Shimmin; +Cc: xfs
[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]
Timothy Shimmin wrote:
> Hi Roger,
>
...
> I don't like all these inconsistencies.
Take a look at the attached patch relative to the current cvs (it's a bit
big to put
inline). The basic problem is it's currently unclear when to set the times
from
va_atime etc. and when to set them to the current time. So I've used the
already
defined XFS_AT_UPDxTIME flags to indicate that a time should be set to 'now'
and XFS_AT_xTIME to mean set it using va_xtime. This seems to fit well with
the current code and I wonder if that's how it was meant to work in the
first
place. I've also removed the now redundant ATTR_UTIME flag and pulled
the null truncate to the top, which simplifies things.
One query: in both xfs_iops.c/xfs_vn_setattr and
xfs_dm.c/xfs_dm_set_fileattr the
ATIME branch sets the inode's atime directly. This is probably something to
do with
the comment above xfs_iops.c/xfs_ichgtime ('to make sure the access time
update
will take') but it could probably be handled better.
> BTW, your locking looks wrong - it appears you don't unlock when the
> file is non-zero size.
Oops...
--
Roger
[-- Attachment #2: xfs_setattr.patch --]
[-- Type: application/octet-stream, Size: 5478 bytes --]
diff -ur xfs.orig/linux-2.6/xfs_iops.c xfs/linux-2.6/xfs_iops.c
--- xfs.orig/linux-2.6/xfs_iops.c 2007-11-04 10:59:00.923480296 +0000
+++ xfs/linux-2.6/xfs_iops.c 2007-11-04 12:43:15.702609336 +0000
@@ -655,13 +655,21 @@
vattr.va_size = attr->ia_size;
}
if (ia_valid & ATTR_ATIME) {
- vattr.va_mask |= XFS_AT_ATIME;
- vattr.va_atime = attr->ia_atime;
- inode->i_atime = attr->ia_atime;
+ if (ia_valid & ATTR_ATIME_SET) {
+ vattr.va_mask |= XFS_AT_ATIME;
+ vattr.va_atime = attr->ia_atime;
+ inode->i_atime = attr->ia_atime;
+ } else {
+ vattr.va_mask |= XFS_AT_UPDATIME;
+ }
}
if (ia_valid & ATTR_MTIME) {
- vattr.va_mask |= XFS_AT_MTIME;
- vattr.va_mtime = attr->ia_mtime;
+ if (ia_valid & ATTR_MTIME_SET) {
+ vattr.va_mask |= XFS_AT_MTIME;
+ vattr.va_mtime = attr->ia_mtime;
+ } else {
+ vattr.va_mask |= XFS_AT_UPDMTIME;
+ }
}
if (ia_valid & ATTR_CTIME) {
vattr.va_mask |= XFS_AT_CTIME;
@@ -674,8 +682,6 @@
inode->i_mode &= ~S_ISGID;
}
- if (ia_valid & (ATTR_MTIME_SET | ATTR_ATIME_SET))
- flags |= ATTR_UTIME;
#ifdef ATTR_NO_BLOCK
if ((ia_valid & ATTR_NO_BLOCK))
flags |= ATTR_NONBLOCK;
diff -ur xfs.orig/linux-2.6/xfs_vnode.h xfs/linux-2.6/xfs_vnode.h
--- xfs.orig/linux-2.6/xfs_vnode.h 2007-11-04 10:59:00.923480296 +0000
+++ xfs/linux-2.6/xfs_vnode.h 2007-11-04 11:01:33.338309720 +0000
@@ -270,7 +270,6 @@
/*
* Flags to vop_setattr/getattr.
*/
-#define ATTR_UTIME 0x01 /* non-default utime(2) request */
#define ATTR_DMI 0x08 /* invocation from a DMI function */
#define ATTR_LAZY 0x80 /* set/get attributes lazily */
#define ATTR_NONBLOCK 0x100 /* return EAGAIN if operation would block */
diff -ur xfs.orig/xfs_vnodeops.c xfs/xfs_vnodeops.c
--- xfs.orig/xfs_vnodeops.c 2007-11-04 10:59:00.917481208 +0000
+++ xfs/xfs_vnodeops.c 2007-11-04 12:07:44.917537904 +0000
@@ -214,10 +214,10 @@
{
bhv_vnode_t *vp = XFS_ITOV(ip);
xfs_mount_t *mp = ip->i_mount;
- xfs_trans_t *tp;
+ xfs_trans_t *tp = NULL;
int mask;
int code;
- uint lock_flags;
+ uint lock_flags=0;
uint commit_flags=0;
uid_t uid=0, iuid=0;
gid_t gid=0, igid=0;
@@ -244,21 +244,51 @@
if (XFS_FORCED_SHUTDOWN(mp))
return XFS_ERROR(EIO);
+ olddquot1 = olddquot2 = NULL;
+ udqp = gdqp = NULL;
+
+ /*
+ * Truncate is special because it changes the file as well as
+ * the attributes.
+ */
+ if (mask & XFS_AT_SIZE) {
+ /* Must have write permission and not be a directory. */
+ if (VN_ISDIR(vp)) {
+ code = XFS_ERROR(EISDIR);
+ goto error_return;
+ } else if (!VN_ISREG(vp)) {
+ code = XFS_ERROR(EINVAL);
+ goto error_return;
+ }
+ /*
+ * Short circuit the truncate case for zero length files.
+ */
+ if (vap->va_size == 0) {
+ xfs_ilock(ip, XFS_ILOCK_EXCL);
+ if ((ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
+ mask |= XFS_AT_UPDCTIME|XFS_AT_UPDMTIME;
+ mask &= ~XFS_AT_SIZE;
+ }
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ }
+ }
+
/*
* Timestamps do not need to be logged and hence do not
* need to be done within a transaction.
*/
if (mask & XFS_AT_UPDTIMES) {
- ASSERT((mask & ~XFS_AT_UPDTIMES) == 0);
timeflags = ((mask & XFS_AT_UPDATIME) ? XFS_ICHGTIME_ACC : 0) |
((mask & XFS_AT_UPDCTIME) ? XFS_ICHGTIME_CHG : 0) |
((mask & XFS_AT_UPDMTIME) ? XFS_ICHGTIME_MOD : 0);
- xfs_ichgtime(ip, timeflags);
- return 0;
+ mask &= ~XFS_AT_UPDTIMES;
}
- olddquot1 = olddquot2 = NULL;
- udqp = gdqp = NULL;
+ if (mask == 0) {
+ if (timeflags && !(flags & ATTR_DMI))
+ xfs_ichgtime(ip, timeflags);
+ return 0;
+ }
/*
* If disk quotas is on, we make sure that the dquots do exist on disk,
@@ -307,12 +337,11 @@
* For the other attributes, we acquire the inode lock and
* first do an error checking pass.
*/
- tp = NULL;
lock_flags = XFS_ILOCK_EXCL;
if (flags & ATTR_NOLOCK)
need_iolock = 0;
if (!(mask & XFS_AT_SIZE)) {
- if ((mask != (XFS_AT_CTIME|XFS_AT_ATIME|XFS_AT_MTIME)) ||
+ if ((mask & ~(XFS_AT_CTIME|XFS_AT_ATIME|XFS_AT_MTIME)) != 0 ||
(mp->m_flags & XFS_MOUNT_WSYNC)) {
tp = xfs_trans_alloc(mp, XFS_TRANS_SETATTR_NOT_SIZE);
commit_flags = 0;
@@ -451,24 +480,6 @@
* Truncate file. Must have write permission and not be a directory.
*/
if (mask & XFS_AT_SIZE) {
- /* Short circuit the truncate case for zero length files */
- if ((vap->va_size == 0) &&
- (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- lock_flags &= ~XFS_ILOCK_EXCL;
- if (mask & XFS_AT_CTIME)
- xfs_ichgtime(ip, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
- code = 0;
- goto error_return;
- }
-
- if (VN_ISDIR(vp)) {
- code = XFS_ERROR(EISDIR);
- goto error_return;
- } else if (!VN_ISREG(vp)) {
- code = XFS_ERROR(EINVAL);
- goto error_return;
- }
/*
* Make sure that the dquots are attached to the inode.
*/
@@ -481,8 +492,7 @@
*/
if (mask & (XFS_AT_ATIME|XFS_AT_MTIME)) {
if (!file_owner) {
- if ((flags & ATTR_UTIME) &&
- !capable(CAP_FOWNER)) {
+ if (!capable(CAP_FOWNER)) {
code = XFS_ERROR(EPERM);
goto error_return;
}
@@ -760,7 +770,7 @@
timeflags &= ~XFS_ICHGTIME_MOD;
timeflags |= XFS_ICHGTIME_CHG;
}
- if (tp && (flags & ATTR_UTIME))
+ if (tp)
xfs_trans_log_inode (tp, ip, XFS_ILOG_CORE);
}
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: bug: truncate to zero + setuid
2007-11-04 11:59 ` Roger Willcocks
@ 2007-11-08 3:13 ` Timothy Shimmin
0 siblings, 0 replies; 8+ messages in thread
From: Timothy Shimmin @ 2007-11-08 3:13 UTC (permalink / raw)
To: Roger Willcocks; +Cc: xfs
Hi Roger,
Roger Willcocks wrote:
> Timothy Shimmin wrote:
>> Hi Roger,
>>
> ...
>> I don't like all these inconsistencies.
>
> Take a look at the attached patch relative to the current cvs (it's a
> bit big to put
> inline). The basic problem is it's currently unclear when to set the
> times from
> va_atime etc. and when to set them to the current time. So I've used the
> already
> defined XFS_AT_UPDxTIME flags to indicate that a time should be set to
> 'now'
> and XFS_AT_xTIME to mean set it using va_xtime. This seems to fit well with
> the current code and I wonder if that's how it was meant to work in the
> first
> place.
Yeah, I've looked at this a few times now ;-) and this _seems_ like
a reasonable thing to do to me.
So patch:
ATTR_ATIME_SET => XFS_AT_ATIME (& set va_atime etc) (used to set to given time)
ATTR_ATIME => XFS_AT_UPDATIME (used to set to "now")
likewise for M variant.
Previously:
ATTR_ATIME_SET => ATTR_UTIME flag (used to set given time)
must expect ATTR_ATIME to be set too to get va_atime
ATTR_ATIME => XFS_AT_ATIME (& set va_atime) (used to set to "now")
a bit confusing since it can store va_atime even if
ATTR_ATIME_SET is not on
> I've also removed the now redundant ATTR_UTIME flag and pulled
> the null truncate to the top, which simplifies things.
>
So these changes of:
if (mask & (XFS_AT_ATIME|XFS_AT_MTIME)) {
if (!file_owner) {
- if ((flags & ATTR_UTIME) &&
- !capable(CAP_FOWNER)) {
+ if (!capable(CAP_FOWNER)) {
Where you take out ATTR_UTIME make sense since XFS_AT_ATIME et al,
now refer to the case where a given time is provided
instead of requiring ATTR_UTIME to be set.
> One query: in both xfs_iops.c/xfs_vn_setattr and
> xfs_dm.c/xfs_dm_set_fileattr the
> ATIME branch sets the inode's atime directly.
xfs_vn_setattr()
if (ia_valid & ATTR_ATIME) {
vattr.va_mask |= XFS_AT_ATIME;
vattr.va_atime = attr->ia_atime;
inode->i_atime = attr->ia_atime;
}
xfs_dm_set_fileattr()
if (mask & DM_AT_ATIME) {
vat.va_mask |= XFS_AT_ATIME;
vat.va_atime.tv_sec = stat.fa_atime;
vat.va_atime.tv_nsec = 0;
inode->i_atime.tv_sec = stat.fa_atime;
}
Hmmm....
So this could change behavior for xfs_vn_setattr().
If previously we had ATTR_ATIME set but NOT ATTR_ATIME_SET,
then we would set inode->i_atime.
Now with the patch, in this case, we don't set inode->i_atime
at this point.
However, in this case we wouldn't want i_atime to be set to ia_atime
as we would want it to be set to "now" in xfs_ichgtime().
> This is probably something
> to do with
> the comment above xfs_iops.c/xfs_ichgtime ('to make sure the access time
> update
> will take') but it could probably be handled better.
>
I'll need to look.
>> BTW, your locking looks wrong - it appears you don't unlock when the
>> file is non-zero size.
>
> Oops...
>
I was also thinking of a read lock here.
And initializing quot vars to zero in variable definition at top.
This stuff really needs to be QA'ed well.
It would be too easy to get a regression in expected behavior.
Need to hunt out qa tests.
Thanks for the effort,
Tim.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-11-08 3:12 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-28 14:36 bug: truncate to zero + setuid Roger Willcocks
2007-10-29 0:54 ` Tim Shimmin
2007-10-29 18:56 ` Roger Willcocks
2007-10-30 4:06 ` Timothy Shimmin
2007-10-30 17:28 ` Roger Willcocks
2007-11-02 1:11 ` Timothy Shimmin
2007-11-04 11:59 ` Roger Willcocks
2007-11-08 3:13 ` Timothy Shimmin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox