* Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-15 5:09 ` Jeff Liu
0 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-15 5:09 UTC (permalink / raw)
To: Richard Yao
Cc: linux-fsdevel, linux-kernel, kernel, Ocfs2-Devel, Joel Becker,
Mark Fasheh
[Add ocfs2-devel to CC-list]
Hello Richard,
Thanks for your patch.
On 06/15/2013 03:23 AM, Richard Yao wrote:
> There are multiple issues with the custom llseek implemented in ocfs2 for
> implementing SEEK_HOLE/SEEK_DATA.
>
> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
> is unnecessary.
Agree, but please see my comments below.
>
> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
> filp->f_version, which differs from generic_file_llseek().
>
> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
> the maximum file size possible on the ocfs2 filesystem, even when it is past
> the end of the file. Seeking beyond that (if possible), would return EINVAL
> instead of ENXIO.
>
> 4. The switch statement tries to cover all whence values when in reality it
> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
> to generic_file_llseek().
I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
to run a comprehensive tests. It can solve the existing issues but also improved the
SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
loff_t ocfs2_file_llseek()
{
switch (origin) {
case SEEK_END:
case SEEK_CUR:
case SEEK_SET:
return generic_file_llseek(file, offset, origin);
case SEEK_DATA:
return ocfs2_seek_data(file, offset);
case SEEK_HOLE:
return ocfs2_seek_hole(file, offset);
default:
return -EINVAL;
}
}
I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
than dealing with them in a condition check block.
Thanks,
-Jeff
>
> btrfs_file_llseek() and ocfs2_file_llseek() are extremely similar and
> consequently, contain many of the same flaws. Li Dongyang filed a pull
> request with ZFSOnLinux for SEEK_HOLE/SEEK_DATA support that included a
> custom llseek function that appears to have been modelled after the one
> in ocfs2. The similarity was strong enough that it suffered from many of
> the same flaws, which I caught during review. I addressed the issues
> with his patch with one that I wrote. Since a small percentage of Gentoo
> Linux users are affected by these flaws, I decided to adapt that code
> that to btrfs (separate patch) and ocfs2.
>
> Note that commit 48802c8ae2a9d618ec734a61283d645ad527e06c by Jeff Liu at
> Oracle mostly addressed #3 in btrfs. The only lingering issue was that
> the offset > inode->i_sb->s_maxbytes check became dead code. The ocfs2
> code was not fortunate enough to have had a similar correction until
> now.
>
> Signed-off-by: Richard Yao <ryao@gentoo.org>
> ---
> fs/ocfs2/file.c | 65 ++++++++++++++++++++++-----------------------------------
> 1 file changed, 25 insertions(+), 40 deletions(-)
>
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index ff54014..84f8c9c 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -2615,54 +2615,39 @@ bail:
> }
>
> /* Refer generic_file_llseek_unlocked() */
> -static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
> +static loff_t ocfs2_file_llseek(struct file *filp, loff_t offset, int whence)
> {
> - struct inode *inode = file->f_mapping->host;
> - int ret = 0;
> + if (whence == SEEK_DATA || whence == SEEK_HOLE) {
> + struct inode *inode = filp->f_mapping->host;
> + int ret;
>
> - mutex_lock(&inode->i_mutex);
> + if (offset < 0 && !(filp->f_mode & FMODE_UNSIGNED_OFFSET))
> + return -EINVAL;
>
> - switch (whence) {
> - case SEEK_SET:
> - break;
> - case SEEK_END:
> - offset += inode->i_size;
> - break;
> - case SEEK_CUR:
> - if (offset == 0) {
> - offset = file->f_pos;
> - goto out;
> + if (offset >= i_size_read(inode)) {
> + return -ENXIO;
> }
> - offset += file->f_pos;
> - break;
> - case SEEK_DATA:
> - case SEEK_HOLE:
> - ret = ocfs2_seek_data_hole_offset(file, &offset, whence);
> - if (ret)
> - goto out;
> - break;
> - default:
> - ret = -EINVAL;
> - goto out;
> - }
>
> - if (offset < 0 && !(file->f_mode & FMODE_UNSIGNED_OFFSET))
> - ret = -EINVAL;
> - if (!ret && offset > inode->i_sb->s_maxbytes)
> - ret = -EINVAL;
> - if (ret)
> - goto out;
> + mutex_lock(&inode->i_mutex);
> + ret = ocfs2_seek_data_hole_offset(filp, &offset, whence);
> + mutex_unlock(&inode->i_mutex);
> +
> + if (ret) {
> + return ret;
> + }
>
> - if (offset != file->f_pos) {
> - file->f_pos = offset;
> - file->f_version = 0;
> + if (offset != filp->f_pos) {
> + spin_lock(&filp->f_lock);
> + filp->f_pos = offset;
> + filp->f_version = 0;
> + spin_unlock(&filp->f_lock);
> + }
> +
> + return offset;
> }
>
> -out:
> - mutex_unlock(&inode->i_mutex);
> - if (ret)
> - return ret;
> - return offset;
> + return generic_file_llseek(filp, offset, whence);
> +
> }
>
> const struct inode_operations ocfs2_file_iops = {
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-15 5:09 ` Jeff Liu
0 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-15 5:09 UTC (permalink / raw)
To: Richard Yao; +Cc: Mark Fasheh, linux-kernel, kernel, linux-fsdevel, Ocfs2-Devel
[Add ocfs2-devel to CC-list]
Hello Richard,
Thanks for your patch.
On 06/15/2013 03:23 AM, Richard Yao wrote:
> There are multiple issues with the custom llseek implemented in ocfs2 for
> implementing SEEK_HOLE/SEEK_DATA.
>
> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
> is unnecessary.
Agree, but please see my comments below.
>
> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
> filp->f_version, which differs from generic_file_llseek().
>
> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
> the maximum file size possible on the ocfs2 filesystem, even when it is past
> the end of the file. Seeking beyond that (if possible), would return EINVAL
> instead of ENXIO.
>
> 4. The switch statement tries to cover all whence values when in reality it
> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
> to generic_file_llseek().
I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
to run a comprehensive tests. It can solve the existing issues but also improved the
SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
loff_t ocfs2_file_llseek()
{
switch (origin) {
case SEEK_END:
case SEEK_CUR:
case SEEK_SET:
return generic_file_llseek(file, offset, origin);
case SEEK_DATA:
return ocfs2_seek_data(file, offset);
case SEEK_HOLE:
return ocfs2_seek_hole(file, offset);
default:
return -EINVAL;
}
}
I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
than dealing with them in a condition check block.
Thanks,
-Jeff
>
> btrfs_file_llseek() and ocfs2_file_llseek() are extremely similar and
> consequently, contain many of the same flaws. Li Dongyang filed a pull
> request with ZFSOnLinux for SEEK_HOLE/SEEK_DATA support that included a
> custom llseek function that appears to have been modelled after the one
> in ocfs2. The similarity was strong enough that it suffered from many of
> the same flaws, which I caught during review. I addressed the issues
> with his patch with one that I wrote. Since a small percentage of Gentoo
> Linux users are affected by these flaws, I decided to adapt that code
> that to btrfs (separate patch) and ocfs2.
>
> Note that commit 48802c8ae2a9d618ec734a61283d645ad527e06c by Jeff Liu at
> Oracle mostly addressed #3 in btrfs. The only lingering issue was that
> the offset > inode->i_sb->s_maxbytes check became dead code. The ocfs2
> code was not fortunate enough to have had a similar correction until
> now.
>
> Signed-off-by: Richard Yao <ryao@gentoo.org>
> ---
> fs/ocfs2/file.c | 65 ++++++++++++++++++++++-----------------------------------
> 1 file changed, 25 insertions(+), 40 deletions(-)
>
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index ff54014..84f8c9c 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -2615,54 +2615,39 @@ bail:
> }
>
> /* Refer generic_file_llseek_unlocked() */
> -static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
> +static loff_t ocfs2_file_llseek(struct file *filp, loff_t offset, int whence)
> {
> - struct inode *inode = file->f_mapping->host;
> - int ret = 0;
> + if (whence == SEEK_DATA || whence == SEEK_HOLE) {
> + struct inode *inode = filp->f_mapping->host;
> + int ret;
>
> - mutex_lock(&inode->i_mutex);
> + if (offset < 0 && !(filp->f_mode & FMODE_UNSIGNED_OFFSET))
> + return -EINVAL;
>
> - switch (whence) {
> - case SEEK_SET:
> - break;
> - case SEEK_END:
> - offset += inode->i_size;
> - break;
> - case SEEK_CUR:
> - if (offset == 0) {
> - offset = file->f_pos;
> - goto out;
> + if (offset >= i_size_read(inode)) {
> + return -ENXIO;
> }
> - offset += file->f_pos;
> - break;
> - case SEEK_DATA:
> - case SEEK_HOLE:
> - ret = ocfs2_seek_data_hole_offset(file, &offset, whence);
> - if (ret)
> - goto out;
> - break;
> - default:
> - ret = -EINVAL;
> - goto out;
> - }
>
> - if (offset < 0 && !(file->f_mode & FMODE_UNSIGNED_OFFSET))
> - ret = -EINVAL;
> - if (!ret && offset > inode->i_sb->s_maxbytes)
> - ret = -EINVAL;
> - if (ret)
> - goto out;
> + mutex_lock(&inode->i_mutex);
> + ret = ocfs2_seek_data_hole_offset(filp, &offset, whence);
> + mutex_unlock(&inode->i_mutex);
> +
> + if (ret) {
> + return ret;
> + }
>
> - if (offset != file->f_pos) {
> - file->f_pos = offset;
> - file->f_version = 0;
> + if (offset != filp->f_pos) {
> + spin_lock(&filp->f_lock);
> + filp->f_pos = offset;
> + filp->f_version = 0;
> + spin_unlock(&filp->f_lock);
> + }
> +
> + return offset;
> }
>
> -out:
> - mutex_unlock(&inode->i_mutex);
> - if (ret)
> - return ret;
> - return offset;
> + return generic_file_llseek(filp, offset, whence);
> +
> }
>
> const struct inode_operations ocfs2_file_iops = {
^ permalink raw reply [flat|nested] 21+ messages in thread* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-15 5:09 ` Jeff Liu
@ 2013-06-15 6:22 ` shencanquan
-1 siblings, 0 replies; 21+ messages in thread
From: shencanquan @ 2013-06-15 6:22 UTC (permalink / raw)
To: Jeff Liu
Cc: Richard Yao, Mark Fasheh, linux-kernel, kernel, linux-fsdevel,
Ocfs2-Devel
Hello, Richard and Jeff,
we found that llseek has another bug when in SEEK_END. it should be
add the inode lock and unlock.
this bug can be reproduce the following scenario:
on one nodeA, open the file and then write some data to file and
close the file .
on another nodeB , open the file and llseek the end of file . the
position of file is old.
On 2013/6/15 13:09, Jeff Liu wrote:
> [Add ocfs2-devel to CC-list]
>
> Hello Richard,
>
> Thanks for your patch.
>
> On 06/15/2013 03:23 AM, Richard Yao wrote:
>
>> There are multiple issues with the custom llseek implemented in ocfs2 for
>> implementing SEEK_HOLE/SEEK_DATA.
>>
>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>> is unnecessary.
> Agree, but please see my comments below.
>
>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>> filp->f_version, which differs from generic_file_llseek().
>>
>> 3. It does a offset> inode->i_sb->s_maxbytes check that permits seeking up to
>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>> instead of ENXIO.
>>
>> 4. The switch statement tries to cover all whence values when in reality it
>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>> to generic_file_llseek().
> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
> to run a comprehensive tests. It can solve the existing issues but also improved the
> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>
> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_END:
> case SEEK_CUR:
> case SEEK_SET:
> return generic_file_llseek(file, offset, origin);
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return -EINVAL;
> }
> }
>
> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
> than dealing with them in a condition check block.
>
> Thanks,
> -Jeff
>
>> btrfs_file_llseek() and ocfs2_file_llseek() are extremely similar and
>> consequently, contain many of the same flaws. Li Dongyang filed a pull
>> request with ZFSOnLinux for SEEK_HOLE/SEEK_DATA support that included a
>> custom llseek function that appears to have been modelled after the one
>> in ocfs2. The similarity was strong enough that it suffered from many of
>> the same flaws, which I caught during review. I addressed the issues
>> with his patch with one that I wrote. Since a small percentage of Gentoo
>> Linux users are affected by these flaws, I decided to adapt that code
>> that to btrfs (separate patch) and ocfs2.
>>
>> Note that commit 48802c8ae2a9d618ec734a61283d645ad527e06c by Jeff Liu at
>> Oracle mostly addressed #3 in btrfs. The only lingering issue was that
>> the offset> inode->i_sb->s_maxbytes check became dead code. The ocfs2
>> code was not fortunate enough to have had a similar correction until
>> now.
>>
>> Signed-off-by: Richard Yao<ryao@gentoo.org>
>> ---
>> fs/ocfs2/file.c | 65 ++++++++++++++++++++++-----------------------------------
>> 1 file changed, 25 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>> index ff54014..84f8c9c 100644
>> --- a/fs/ocfs2/file.c
>> +++ b/fs/ocfs2/file.c
>> @@ -2615,54 +2615,39 @@ bail:
>> }
>>
>> /* Refer generic_file_llseek_unlocked() */
>> -static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
>> +static loff_t ocfs2_file_llseek(struct file *filp, loff_t offset, int whence)
>> {
>> - struct inode *inode = file->f_mapping->host;
>> - int ret = 0;
>> + if (whence == SEEK_DATA || whence == SEEK_HOLE) {
>> + struct inode *inode = filp->f_mapping->host;
>> + int ret;
>>
>> - mutex_lock(&inode->i_mutex);
>> + if (offset< 0&& !(filp->f_mode& FMODE_UNSIGNED_OFFSET))
>> + return -EINVAL;
>>
>> - switch (whence) {
>> - case SEEK_SET:
>> - break;
>> - case SEEK_END:
>> - offset += inode->i_size;
>> - break;
>> - case SEEK_CUR:
>> - if (offset == 0) {
>> - offset = file->f_pos;
>> - goto out;
>> + if (offset>= i_size_read(inode)) {
>> + return -ENXIO;
>> }
>> - offset += file->f_pos;
>> - break;
>> - case SEEK_DATA:
>> - case SEEK_HOLE:
>> - ret = ocfs2_seek_data_hole_offset(file,&offset, whence);
>> - if (ret)
>> - goto out;
>> - break;
>> - default:
>> - ret = -EINVAL;
>> - goto out;
>> - }
>>
>> - if (offset< 0&& !(file->f_mode& FMODE_UNSIGNED_OFFSET))
>> - ret = -EINVAL;
>> - if (!ret&& offset> inode->i_sb->s_maxbytes)
>> - ret = -EINVAL;
>> - if (ret)
>> - goto out;
>> + mutex_lock(&inode->i_mutex);
>> + ret = ocfs2_seek_data_hole_offset(filp,&offset, whence);
>> + mutex_unlock(&inode->i_mutex);
>> +
>> + if (ret) {
>> + return ret;
>> + }
>>
>> - if (offset != file->f_pos) {
>> - file->f_pos = offset;
>> - file->f_version = 0;
>> + if (offset != filp->f_pos) {
>> + spin_lock(&filp->f_lock);
>> + filp->f_pos = offset;
>> + filp->f_version = 0;
>> + spin_unlock(&filp->f_lock);
>> + }
>> +
>> + return offset;
>> }
>>
>> -out:
>> - mutex_unlock(&inode->i_mutex);
>> - if (ret)
>> - return ret;
>> - return offset;
>> + return generic_file_llseek(filp, offset, whence);
>> +
>> }
>>
>> const struct inode_operations ocfs2_file_iops = {
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
> .
>
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-15 6:22 ` shencanquan
0 siblings, 0 replies; 21+ messages in thread
From: shencanquan @ 2013-06-15 6:22 UTC (permalink / raw)
To: Jeff Liu
Cc: Richard Yao, Mark Fasheh, linux-kernel, kernel, linux-fsdevel,
Ocfs2-Devel
Hello, Richard and Jeff,
we found that llseek has another bug when in SEEK_END. it should be
add the inode lock and unlock.
this bug can be reproduce the following scenario:
on one nodeA, open the file and then write some data to file and
close the file .
on another nodeB , open the file and llseek the end of file . the
position of file is old.
On 2013/6/15 13:09, Jeff Liu wrote:
> [Add ocfs2-devel to CC-list]
>
> Hello Richard,
>
> Thanks for your patch.
>
> On 06/15/2013 03:23 AM, Richard Yao wrote:
>
>> There are multiple issues with the custom llseek implemented in ocfs2 for
>> implementing SEEK_HOLE/SEEK_DATA.
>>
>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>> is unnecessary.
> Agree, but please see my comments below.
>
>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>> filp->f_version, which differs from generic_file_llseek().
>>
>> 3. It does a offset> inode->i_sb->s_maxbytes check that permits seeking up to
>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>> instead of ENXIO.
>>
>> 4. The switch statement tries to cover all whence values when in reality it
>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>> to generic_file_llseek().
> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
> to run a comprehensive tests. It can solve the existing issues but also improved the
> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>
> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_END:
> case SEEK_CUR:
> case SEEK_SET:
> return generic_file_llseek(file, offset, origin);
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return -EINVAL;
> }
> }
>
> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
> than dealing with them in a condition check block.
>
> Thanks,
> -Jeff
>
>> btrfs_file_llseek() and ocfs2_file_llseek() are extremely similar and
>> consequently, contain many of the same flaws. Li Dongyang filed a pull
>> request with ZFSOnLinux for SEEK_HOLE/SEEK_DATA support that included a
>> custom llseek function that appears to have been modelled after the one
>> in ocfs2. The similarity was strong enough that it suffered from many of
>> the same flaws, which I caught during review. I addressed the issues
>> with his patch with one that I wrote. Since a small percentage of Gentoo
>> Linux users are affected by these flaws, I decided to adapt that code
>> that to btrfs (separate patch) and ocfs2.
>>
>> Note that commit 48802c8ae2a9d618ec734a61283d645ad527e06c by Jeff Liu at
>> Oracle mostly addressed #3 in btrfs. The only lingering issue was that
>> the offset> inode->i_sb->s_maxbytes check became dead code. The ocfs2
>> code was not fortunate enough to have had a similar correction until
>> now.
>>
>> Signed-off-by: Richard Yao<ryao@gentoo.org>
>> ---
>> fs/ocfs2/file.c | 65 ++++++++++++++++++++++-----------------------------------
>> 1 file changed, 25 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>> index ff54014..84f8c9c 100644
>> --- a/fs/ocfs2/file.c
>> +++ b/fs/ocfs2/file.c
>> @@ -2615,54 +2615,39 @@ bail:
>> }
>>
>> /* Refer generic_file_llseek_unlocked() */
>> -static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
>> +static loff_t ocfs2_file_llseek(struct file *filp, loff_t offset, int whence)
>> {
>> - struct inode *inode = file->f_mapping->host;
>> - int ret = 0;
>> + if (whence == SEEK_DATA || whence == SEEK_HOLE) {
>> + struct inode *inode = filp->f_mapping->host;
>> + int ret;
>>
>> - mutex_lock(&inode->i_mutex);
>> + if (offset< 0&& !(filp->f_mode& FMODE_UNSIGNED_OFFSET))
>> + return -EINVAL;
>>
>> - switch (whence) {
>> - case SEEK_SET:
>> - break;
>> - case SEEK_END:
>> - offset += inode->i_size;
>> - break;
>> - case SEEK_CUR:
>> - if (offset == 0) {
>> - offset = file->f_pos;
>> - goto out;
>> + if (offset>= i_size_read(inode)) {
>> + return -ENXIO;
>> }
>> - offset += file->f_pos;
>> - break;
>> - case SEEK_DATA:
>> - case SEEK_HOLE:
>> - ret = ocfs2_seek_data_hole_offset(file,&offset, whence);
>> - if (ret)
>> - goto out;
>> - break;
>> - default:
>> - ret = -EINVAL;
>> - goto out;
>> - }
>>
>> - if (offset< 0&& !(file->f_mode& FMODE_UNSIGNED_OFFSET))
>> - ret = -EINVAL;
>> - if (!ret&& offset> inode->i_sb->s_maxbytes)
>> - ret = -EINVAL;
>> - if (ret)
>> - goto out;
>> + mutex_lock(&inode->i_mutex);
>> + ret = ocfs2_seek_data_hole_offset(filp,&offset, whence);
>> + mutex_unlock(&inode->i_mutex);
>> +
>> + if (ret) {
>> + return ret;
>> + }
>>
>> - if (offset != file->f_pos) {
>> - file->f_pos = offset;
>> - file->f_version = 0;
>> + if (offset != filp->f_pos) {
>> + spin_lock(&filp->f_lock);
>> + filp->f_pos = offset;
>> + filp->f_version = 0;
>> + spin_unlock(&filp->f_lock);
>> + }
>> +
>> + return offset;
>> }
>>
>> -out:
>> - mutex_unlock(&inode->i_mutex);
>> - if (ret)
>> - return ret;
>> - return offset;
>> + return generic_file_llseek(filp, offset, whence);
>> +
>> }
>>
>> const struct inode_operations ocfs2_file_iops = {
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
> .
>
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-15 6:22 ` shencanquan
@ 2013-06-16 0:45 ` Richard Yao
-1 siblings, 0 replies; 21+ messages in thread
From: Richard Yao @ 2013-06-16 0:44 UTC (permalink / raw)
To: shencanquan
Cc: Jeff Liu, Mark Fasheh, linux-kernel, kernel, linux-fsdevel,
Ocfs2-Devel
[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]
On 06/15/2013 02:22 AM, shencanquan wrote:
> Hello, Richard and Jeff,
> we found that llseek has another bug when in SEEK_END. it should be
> add the inode lock and unlock.
> this bug can be reproduce the following scenario:
> on one nodeA, open the file and then write some data to file and
> close the file .
> on another nodeB , open the file and llseek the end of file . the
> position of file is old.
Did these operations occur sequentially or did they occur concurrently?
If you meant the former, the inode cache is not being invalidated. That
should be a bug because Oracle claims OCFS2 is cache-coherent. However,
it is possible that this case was left out of the cache-coherence
protocol for performance purposes. If that is the case, then this would
be by design. someone who works for Oracle would need to comment on that
though.
If you meant the latter, you should ask yourself what would happen when
you run two separate programs on the same file in a local filesystem.
There should be no way to avoid a race without some kind of a locking
mechanism.
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 901 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 0:45 ` Richard Yao
0 siblings, 0 replies; 21+ messages in thread
From: Richard Yao @ 2013-06-16 0:45 UTC (permalink / raw)
To: shencanquan
Cc: Jeff Liu, Mark Fasheh, linux-kernel, kernel, linux-fsdevel,
Ocfs2-Devel
On 06/15/2013 02:22 AM, shencanquan wrote:
> Hello, Richard and Jeff,
> we found that llseek has another bug when in SEEK_END. it should be
> add the inode lock and unlock.
> this bug can be reproduce the following scenario:
> on one nodeA, open the file and then write some data to file and
> close the file .
> on another nodeB , open the file and llseek the end of file . the
> position of file is old.
Did these operations occur sequentially or did they occur concurrently?
If you meant the former, the inode cache is not being invalidated. That
should be a bug because Oracle claims OCFS2 is cache-coherent. However,
it is possible that this case was left out of the cache-coherence
protocol for performance purposes. If that is the case, then this would
be by design. someone who works for Oracle would need to comment on that
though.
If you meant the latter, you should ask yourself what would happen when
you run two separate programs on the same file in a local filesystem.
There should be no way to avoid a race without some kind of a locking
mechanism.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 901 bytes
Desc: OpenPGP digital signature
Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20130616/818c9778/attachment.bin
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-16 0:45 ` Richard Yao
@ 2013-06-16 1:57 ` shencanquan
-1 siblings, 0 replies; 21+ messages in thread
From: shencanquan @ 2013-06-16 1:57 UTC (permalink / raw)
To: Richard Yao
Cc: Jeff Liu, Mark Fasheh, linux-kernel, kernel, linux-fsdevel,
Ocfs2-Devel
On 2013/6/16 8:44, Richard Yao wrote:
> On 06/15/2013 02:22 AM, shencanquan wrote:
>> Hello, Richard and Jeff,
>> we found that llseek has another bug when in SEEK_END. it should be
>> add the inode lock and unlock.
>> this bug can be reproduce the following scenario:
>> on one nodeA, open the file and then write some data to file and
>> close the file .
>> on another nodeB , open the file and llseek the end of file . the
>> position of file is old.
> Did these operations occur sequentially or did they occur concurrently?
>
> If you meant the former, the inode cache is not being invalidated. That
> should be a bug because Oracle claims OCFS2 is cache-coherent. However,
> it is possible that this case was left out of the cache-coherence
> protocol for performance purposes. If that is the case, then this would
> be by design. someone who works for Oracle would need to comment on that
> though.
it is a occur sequentially. after close the file on NodeA , on
nodeB and then open the file and llseek the end of file.
>
> If you meant the latter, you should ask yourself what would happen when
> you run two separate programs on the same file in a local filesystem.
> There should be no way to avoid a race without some kind of a locking
> mechanism.
>
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 1:57 ` shencanquan
0 siblings, 0 replies; 21+ messages in thread
From: shencanquan @ 2013-06-16 1:57 UTC (permalink / raw)
To: Richard Yao
Cc: Jeff Liu, Mark Fasheh, linux-kernel, kernel, linux-fsdevel,
Ocfs2-Devel
On 2013/6/16 8:44, Richard Yao wrote:
> On 06/15/2013 02:22 AM, shencanquan wrote:
>> Hello, Richard and Jeff,
>> we found that llseek has another bug when in SEEK_END. it should be
>> add the inode lock and unlock.
>> this bug can be reproduce the following scenario:
>> on one nodeA, open the file and then write some data to file and
>> close the file .
>> on another nodeB , open the file and llseek the end of file . the
>> position of file is old.
> Did these operations occur sequentially or did they occur concurrently?
>
> If you meant the former, the inode cache is not being invalidated. That
> should be a bug because Oracle claims OCFS2 is cache-coherent. However,
> it is possible that this case was left out of the cache-coherence
> protocol for performance purposes. If that is the case, then this would
> be by design. someone who works for Oracle would need to comment on that
> though.
it is a occur sequentially. after close the file on NodeA , on
nodeB and then open the file and llseek the end of file.
>
> If you meant the latter, you should ask yourself what would happen when
> you run two separate programs on the same file in a local filesystem.
> There should be no way to avoid a race without some kind of a locking
> mechanism.
>
^ permalink raw reply [flat|nested] 21+ messages in thread* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-16 1:57 ` shencanquan
@ 2013-06-16 7:16 ` Jeff Liu
-1 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-16 7:16 UTC (permalink / raw)
To: shencanquan; +Cc: Richard Yao, Mark Fasheh, kernel, linux-fsdevel, Ocfs2-Devel
Remove kernel-dev from the CC-list as this is particular to OCFS2.
On 06/16/2013 09:57 AM, shencanquan wrote:
> On 2013/6/16 8:44, Richard Yao wrote:
>> On 06/15/2013 02:22 AM, shencanquan wrote:
>>> Hello, Richard and Jeff,
>>> we found that llseek has another bug when in SEEK_END. it should be
>>> add the inode lock and unlock.
>>> this bug can be reproduce the following scenario:
>>> on one nodeA, open the file and then write some data to file and
>>> close the file .
>>> on another nodeB , open the file and llseek the end of file . the
>>> position of file is old.
This sounds like a bug because SEEK_END references the file size, hence it
requires the OCFS2 specified inode lock protection.
So patch is always welcome.
Thanks,
-Jeff
>> Did these operations occur sequentially or did they occur concurrently?
>>
>> If you meant the former, the inode cache is not being invalidated. That
>> should be a bug because Oracle claims OCFS2 is cache-coherent. However,
>> it is possible that this case was left out of the cache-coherence
>> protocol for performance purposes. If that is the case, then this would
>> be by design. someone who works for Oracle would need to comment on that
>> though.
>
> it is a occur sequentially. after close the file on NodeA , on nodeB
> and then open the file and llseek the end of file.
>
>>
>> If you meant the latter, you should ask yourself what would happen when
>> you run two separate programs on the same file in a local filesystem.
>> There should be no way to avoid a race without some kind of a locking
>> mechanism.
>>
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 7:16 ` Jeff Liu
0 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-16 7:16 UTC (permalink / raw)
To: shencanquan; +Cc: Richard Yao, Mark Fasheh, kernel, linux-fsdevel, Ocfs2-Devel
Remove kernel-dev from the CC-list as this is particular to OCFS2.
On 06/16/2013 09:57 AM, shencanquan wrote:
> On 2013/6/16 8:44, Richard Yao wrote:
>> On 06/15/2013 02:22 AM, shencanquan wrote:
>>> Hello, Richard and Jeff,
>>> we found that llseek has another bug when in SEEK_END. it should be
>>> add the inode lock and unlock.
>>> this bug can be reproduce the following scenario:
>>> on one nodeA, open the file and then write some data to file and
>>> close the file .
>>> on another nodeB , open the file and llseek the end of file . the
>>> position of file is old.
This sounds like a bug because SEEK_END references the file size, hence it
requires the OCFS2 specified inode lock protection.
So patch is always welcome.
Thanks,
-Jeff
>> Did these operations occur sequentially or did they occur concurrently?
>>
>> If you meant the former, the inode cache is not being invalidated. That
>> should be a bug because Oracle claims OCFS2 is cache-coherent. However,
>> it is possible that this case was left out of the cache-coherence
>> protocol for performance purposes. If that is the case, then this would
>> be by design. someone who works for Oracle would need to comment on that
>> though.
>
> it is a occur sequentially. after close the file on NodeA , on nodeB
> and then open the file and llseek the end of file.
>
>>
>> If you meant the latter, you should ask yourself what would happen when
>> you run two separate programs on the same file in a local filesystem.
>> There should be no way to avoid a race without some kind of a locking
>> mechanism.
>>
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-15 5:09 ` Jeff Liu
@ 2013-06-16 0:47 ` Richard Yao
-1 siblings, 0 replies; 21+ messages in thread
From: Richard Yao @ 2013-06-16 0:46 UTC (permalink / raw)
To: Jeff Liu
Cc: linux-fsdevel, linux-kernel, kernel, Ocfs2-Devel, Joel Becker,
Mark Fasheh
[-- Attachment #1: Type: text/plain, Size: 2945 bytes --]
On 06/15/2013 01:09 AM, Jeff Liu wrote:
> [Add ocfs2-devel to CC-list]
>
> Hello Richard,
>
> Thanks for your patch.
>
> On 06/15/2013 03:23 AM, Richard Yao wrote:
>
>> There are multiple issues with the custom llseek implemented in ocfs2 for
>> implementing SEEK_HOLE/SEEK_DATA.
>>
>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>> is unnecessary.
>
> Agree, but please see my comments below.
>
>>
>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>> filp->f_version, which differs from generic_file_llseek().
>>
>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>> instead of ENXIO.
>>
>> 4. The switch statement tries to cover all whence values when in reality it
>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>> to generic_file_llseek().
>
> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
> to run a comprehensive tests. It can solve the existing issues but also improved the
> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>
> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_END:
> case SEEK_CUR:
> case SEEK_SET:
> return generic_file_llseek(file, offset, origin);
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return -EINVAL;
> }
> }
>
> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
> than dealing with them in a condition check block.
I would prefer to see the code structured like this:
loff_t ocfs2_file_llseek()
{
switch (origin) {
case SEEK_DATA:
return ocfs2_seek_data(file, offset);
case SEEK_HOLE:
return ocfs2_seek_hole(file, offset);
default:
return generic_file_llseek(file, offset, origin);
}
}
Unfortunately, I just noticed that this code has a problem. In specific,
generic_file_llseek() calls generic_file_llseek_size(), which has a
switch statement for whence that fails to distinguish between SEEK_SET
and invalid whence values. Invalid whence values are mapped to SEEK_SET
instead of returning EINVAL, which is wrong. That issue affects all
filesystems that do not specify a custom llseek() function and it would
affect ocfs2 if my version of the function is used.
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 901 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 0:47 ` Richard Yao
0 siblings, 0 replies; 21+ messages in thread
From: Richard Yao @ 2013-06-16 0:47 UTC (permalink / raw)
To: Jeff Liu
Cc: linux-fsdevel, linux-kernel, kernel, Ocfs2-Devel, Joel Becker,
Mark Fasheh
On 06/15/2013 01:09 AM, Jeff Liu wrote:
> [Add ocfs2-devel to CC-list]
>
> Hello Richard,
>
> Thanks for your patch.
>
> On 06/15/2013 03:23 AM, Richard Yao wrote:
>
>> There are multiple issues with the custom llseek implemented in ocfs2 for
>> implementing SEEK_HOLE/SEEK_DATA.
>>
>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>> is unnecessary.
>
> Agree, but please see my comments below.
>
>>
>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>> filp->f_version, which differs from generic_file_llseek().
>>
>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>> instead of ENXIO.
>>
>> 4. The switch statement tries to cover all whence values when in reality it
>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>> to generic_file_llseek().
>
> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
> to run a comprehensive tests. It can solve the existing issues but also improved the
> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>
> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_END:
> case SEEK_CUR:
> case SEEK_SET:
> return generic_file_llseek(file, offset, origin);
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return -EINVAL;
> }
> }
>
> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
> than dealing with them in a condition check block.
I would prefer to see the code structured like this:
loff_t ocfs2_file_llseek()
{
switch (origin) {
case SEEK_DATA:
return ocfs2_seek_data(file, offset);
case SEEK_HOLE:
return ocfs2_seek_hole(file, offset);
default:
return generic_file_llseek(file, offset, origin);
}
}
Unfortunately, I just noticed that this code has a problem. In specific,
generic_file_llseek() calls generic_file_llseek_size(), which has a
switch statement for whence that fails to distinguish between SEEK_SET
and invalid whence values. Invalid whence values are mapped to SEEK_SET
instead of returning EINVAL, which is wrong. That issue affects all
filesystems that do not specify a custom llseek() function and it would
affect ocfs2 if my version of the function is used.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 901 bytes
Desc: OpenPGP digital signature
Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20130616/2188112a/attachment.bin
^ permalink raw reply [flat|nested] 21+ messages in thread* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-16 0:47 ` [Ocfs2-devel] " Richard Yao
(?)
@ 2013-06-16 7:00 ` Jeff Liu
-1 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-16 7:00 UTC (permalink / raw)
To: Richard Yao; +Cc: Mark Fasheh, linux-kernel, kernel, linux-fsdevel, Ocfs2-Devel
On 06/16/2013 08:46 AM, Richard Yao wrote:
> On 06/15/2013 01:09 AM, Jeff Liu wrote:
>> [Add ocfs2-devel to CC-list]
>>
>> Hello Richard,
>>
>> Thanks for your patch.
>>
>> On 06/15/2013 03:23 AM, Richard Yao wrote:
>>
>>> There are multiple issues with the custom llseek implemented in ocfs2 for
>>> implementing SEEK_HOLE/SEEK_DATA.
>>>
>>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>>> is unnecessary.
>>
>> Agree, but please see my comments below.
>>
>>>
>>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>>> filp->f_version, which differs from generic_file_llseek().
>>>
>>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>>> instead of ENXIO.
>>>
>>> 4. The switch statement tries to cover all whence values when in reality it
>>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>>> to generic_file_llseek().
>>
>> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
>> to run a comprehensive tests. It can solve the existing issues but also improved the
>> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>>
>> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
>> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>>
>> loff_t ocfs2_file_llseek()
>> {
>> switch (origin) {
>> case SEEK_END:
>> case SEEK_CUR:
>> case SEEK_SET:
>> return generic_file_llseek(file, offset, origin);
>> case SEEK_DATA:
>> return ocfs2_seek_data(file, offset);
>> case SEEK_HOLE:
>> return ocfs2_seek_hole(file, offset);
>> default:
>> return -EINVAL;
>> }
>> }
>>
>> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
>> than dealing with them in a condition check block.
>
> I would prefer to see the code structured like this:
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return generic_file_llseek(file, offset, origin);
> }
> }
>
> Unfortunately, I just noticed that this code has a problem. In specific,
> generic_file_llseek() calls generic_file_llseek_size(), which has a
> switch statement for whence that fails to distinguish between SEEK_SET
> and invalid whence values. Invalid whence values are mapped to SEEK_SET
> instead of returning EINVAL, which is wrong. That issue affects all
> filesystems that do not specify a custom llseek() function and it would
> affect ocfs2 if my version of the function is used.
Hmm?? Did you mean to say that an invalid whence(i.e, whence > SEEK_MAX)
can be passed into generic_file_llseek()?
If so, I don't think that is a problem because any invalid whence should
be rejected at the entrance of VFS lseek(2) with EINVAL.
Thanks,
-Jeff
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 7:00 ` Jeff Liu
0 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-16 7:00 UTC (permalink / raw)
To: Richard Yao
Cc: linux-fsdevel, linux-kernel, kernel, Ocfs2-Devel, Joel Becker,
Mark Fasheh
On 06/16/2013 08:46 AM, Richard Yao wrote:
> On 06/15/2013 01:09 AM, Jeff Liu wrote:
>> [Add ocfs2-devel to CC-list]
>>
>> Hello Richard,
>>
>> Thanks for your patch.
>>
>> On 06/15/2013 03:23 AM, Richard Yao wrote:
>>
>>> There are multiple issues with the custom llseek implemented in ocfs2 for
>>> implementing SEEK_HOLE/SEEK_DATA.
>>>
>>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>>> is unnecessary.
>>
>> Agree, but please see my comments below.
>>
>>>
>>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>>> filp->f_version, which differs from generic_file_llseek().
>>>
>>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>>> instead of ENXIO.
>>>
>>> 4. The switch statement tries to cover all whence values when in reality it
>>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>>> to generic_file_llseek().
>>
>> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
>> to run a comprehensive tests. It can solve the existing issues but also improved the
>> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>>
>> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
>> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>>
>> loff_t ocfs2_file_llseek()
>> {
>> switch (origin) {
>> case SEEK_END:
>> case SEEK_CUR:
>> case SEEK_SET:
>> return generic_file_llseek(file, offset, origin);
>> case SEEK_DATA:
>> return ocfs2_seek_data(file, offset);
>> case SEEK_HOLE:
>> return ocfs2_seek_hole(file, offset);
>> default:
>> return -EINVAL;
>> }
>> }
>>
>> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
>> than dealing with them in a condition check block.
>
> I would prefer to see the code structured like this:
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return generic_file_llseek(file, offset, origin);
> }
> }
>
> Unfortunately, I just noticed that this code has a problem. In specific,
> generic_file_llseek() calls generic_file_llseek_size(), which has a
> switch statement for whence that fails to distinguish between SEEK_SET
> and invalid whence values. Invalid whence values are mapped to SEEK_SET
> instead of returning EINVAL, which is wrong. That issue affects all
> filesystems that do not specify a custom llseek() function and it would
> affect ocfs2 if my version of the function is used.
Hmm?? Did you mean to say that an invalid whence(i.e, whence > SEEK_MAX)
can be passed into generic_file_llseek()?
If so, I don't think that is a problem because any invalid whence should
be rejected at the entrance of VFS lseek(2) with EINVAL.
Thanks,
-Jeff
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 7:00 ` Jeff Liu
0 siblings, 0 replies; 21+ messages in thread
From: Jeff Liu @ 2013-06-16 7:00 UTC (permalink / raw)
To: Richard Yao; +Cc: Mark Fasheh, linux-kernel, kernel, linux-fsdevel, Ocfs2-Devel
On 06/16/2013 08:46 AM, Richard Yao wrote:
> On 06/15/2013 01:09 AM, Jeff Liu wrote:
>> [Add ocfs2-devel to CC-list]
>>
>> Hello Richard,
>>
>> Thanks for your patch.
>>
>> On 06/15/2013 03:23 AM, Richard Yao wrote:
>>
>>> There are multiple issues with the custom llseek implemented in ocfs2 for
>>> implementing SEEK_HOLE/SEEK_DATA.
>>>
>>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>>> is unnecessary.
>>
>> Agree, but please see my comments below.
>>
>>>
>>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>>> filp->f_version, which differs from generic_file_llseek().
>>>
>>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>>> instead of ENXIO.
>>>
>>> 4. The switch statement tries to cover all whence values when in reality it
>>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>>> to generic_file_llseek().
>>
>> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
>> to run a comprehensive tests. It can solve the existing issues but also improved the
>> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>>
>> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
>> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>>
>> loff_t ocfs2_file_llseek()
>> {
>> switch (origin) {
>> case SEEK_END:
>> case SEEK_CUR:
>> case SEEK_SET:
>> return generic_file_llseek(file, offset, origin);
>> case SEEK_DATA:
>> return ocfs2_seek_data(file, offset);
>> case SEEK_HOLE:
>> return ocfs2_seek_hole(file, offset);
>> default:
>> return -EINVAL;
>> }
>> }
>>
>> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
>> than dealing with them in a condition check block.
>
> I would prefer to see the code structured like this:
>
> loff_t ocfs2_file_llseek()
> {
> switch (origin) {
> case SEEK_DATA:
> return ocfs2_seek_data(file, offset);
> case SEEK_HOLE:
> return ocfs2_seek_hole(file, offset);
> default:
> return generic_file_llseek(file, offset, origin);
> }
> }
>
> Unfortunately, I just noticed that this code has a problem. In specific,
> generic_file_llseek() calls generic_file_llseek_size(), which has a
> switch statement for whence that fails to distinguish between SEEK_SET
> and invalid whence values. Invalid whence values are mapped to SEEK_SET
> instead of returning EINVAL, which is wrong. That issue affects all
> filesystems that do not specify a custom llseek() function and it would
> affect ocfs2 if my version of the function is used.
Hmm?? Did you mean to say that an invalid whence(i.e, whence > SEEK_MAX)
can be passed into generic_file_llseek()?
If so, I don't think that is a problem because any invalid whence should
be rejected at the entrance of VFS lseek(2) with EINVAL.
Thanks,
-Jeff
^ permalink raw reply [flat|nested] 21+ messages in thread* Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
2013-06-16 7:00 ` Jeff Liu
@ 2013-06-16 7:17 ` Richard Yao
-1 siblings, 0 replies; 21+ messages in thread
From: Richard Yao @ 2013-06-16 7:17 UTC (permalink / raw)
To: Jeff Liu
Cc: linux-fsdevel, linux-kernel, kernel, Ocfs2-Devel, Joel Becker,
Mark Fasheh
[-- Attachment #1: Type: text/plain, Size: 3536 bytes --]
On 06/16/2013 03:00 AM, Jeff Liu wrote:
> On 06/16/2013 08:46 AM, Richard Yao wrote:
>
>> On 06/15/2013 01:09 AM, Jeff Liu wrote:
>>> [Add ocfs2-devel to CC-list]
>>>
>>> Hello Richard,
>>>
>>> Thanks for your patch.
>>>
>>> On 06/15/2013 03:23 AM, Richard Yao wrote:
>>>
>>>> There are multiple issues with the custom llseek implemented in ocfs2 for
>>>> implementing SEEK_HOLE/SEEK_DATA.
>>>>
>>>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>>>> is unnecessary.
>>>
>>> Agree, but please see my comments below.
>>>
>>>>
>>>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>>>> filp->f_version, which differs from generic_file_llseek().
>>>>
>>>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>>>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>>>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>>>> instead of ENXIO.
>>>>
>>>> 4. The switch statement tries to cover all whence values when in reality it
>>>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>>>> to generic_file_llseek().
>>>
>>> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
>>> to run a comprehensive tests. It can solve the existing issues but also improved the
>>> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>>>
>>> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
>>> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>>>
>>> loff_t ocfs2_file_llseek()
>>> {
>>> switch (origin) {
>>> case SEEK_END:
>>> case SEEK_CUR:
>>> case SEEK_SET:
>>> return generic_file_llseek(file, offset, origin);
>>> case SEEK_DATA:
>>> return ocfs2_seek_data(file, offset);
>>> case SEEK_HOLE:
>>> return ocfs2_seek_hole(file, offset);
>>> default:
>>> return -EINVAL;
>>> }
>>> }
>>>
>>> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
>>> than dealing with them in a condition check block.
>>
>> I would prefer to see the code structured like this:
>>
>> loff_t ocfs2_file_llseek()
>> {
>> switch (origin) {
>> case SEEK_DATA:
>> return ocfs2_seek_data(file, offset);
>> case SEEK_HOLE:
>> return ocfs2_seek_hole(file, offset);
>> default:
>> return generic_file_llseek(file, offset, origin);
>> }
>> }
>>
>> Unfortunately, I just noticed that this code has a problem. In specific,
>> generic_file_llseek() calls generic_file_llseek_size(), which has a
>> switch statement for whence that fails to distinguish between SEEK_SET
>> and invalid whence values. Invalid whence values are mapped to SEEK_SET
>> instead of returning EINVAL, which is wrong. That issue affects all
>> filesystems that do not specify a custom llseek() function and it would
>> affect ocfs2 if my version of the function is used.
>
> Hmm?? Did you mean to say that an invalid whence(i.e, whence > SEEK_MAX)
> can be passed into generic_file_llseek()?
> If so, I don't think that is a problem because any invalid whence should
> be rejected at the entrance of VFS lseek(2) with EINVAL.
>
> Thanks,
> -Jeff
>
You are correct. I had not looked there.
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 901 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread* [Ocfs2-devel] [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup
@ 2013-06-16 7:17 ` Richard Yao
0 siblings, 0 replies; 21+ messages in thread
From: Richard Yao @ 2013-06-16 7:17 UTC (permalink / raw)
To: Jeff Liu
Cc: linux-fsdevel, linux-kernel, kernel, Ocfs2-Devel, Joel Becker,
Mark Fasheh
On 06/16/2013 03:00 AM, Jeff Liu wrote:
> On 06/16/2013 08:46 AM, Richard Yao wrote:
>
>> On 06/15/2013 01:09 AM, Jeff Liu wrote:
>>> [Add ocfs2-devel to CC-list]
>>>
>>> Hello Richard,
>>>
>>> Thanks for your patch.
>>>
>>> On 06/15/2013 03:23 AM, Richard Yao wrote:
>>>
>>>> There are multiple issues with the custom llseek implemented in ocfs2 for
>>>> implementing SEEK_HOLE/SEEK_DATA.
>>>>
>>>> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(), which
>>>> is unnecessary.
>>>
>>> Agree, but please see my comments below.
>>>
>>>>
>>>> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
>>>> filp->f_version, which differs from generic_file_llseek().
>>>>
>>>> 3. It does a offset > inode->i_sb->s_maxbytes check that permits seeking up to
>>>> the maximum file size possible on the ocfs2 filesystem, even when it is past
>>>> the end of the file. Seeking beyond that (if possible), would return EINVAL
>>>> instead of ENXIO.
>>>>
>>>> 4. The switch statement tries to cover all whence values when in reality it
>>>> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passsed
>>>> to generic_file_llseek().
>>>
>>> I have another patch set for refactoring ocfs2_file_llseek() but not yet found time
>>> to run a comprehensive tests. It can solve the existing issues but also improved the
>>> SEEK_DATA/SEEK_HOLE for unwritten extents, i.e. OCFS2_EXT_UNWRITTEN.
>>>
>>> With this change, SEEK_DATA/SEEK_HOLE will go into separate function with a little code
>>> duplication instead of the current mix-ups in ocfs2_seek_data_hole_offset(), i.e,
>>>
>>> loff_t ocfs2_file_llseek()
>>> {
>>> switch (origin) {
>>> case SEEK_END:
>>> case SEEK_CUR:
>>> case SEEK_SET:
>>> return generic_file_llseek(file, offset, origin);
>>> case SEEK_DATA:
>>> return ocfs2_seek_data(file, offset);
>>> case SEEK_HOLE:
>>> return ocfs2_seek_hole(file, offset);
>>> default:
>>> return -EINVAL;
>>> }
>>> }
>>>
>>> I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
>>> than dealing with them in a condition check block.
>>
>> I would prefer to see the code structured like this:
>>
>> loff_t ocfs2_file_llseek()
>> {
>> switch (origin) {
>> case SEEK_DATA:
>> return ocfs2_seek_data(file, offset);
>> case SEEK_HOLE:
>> return ocfs2_seek_hole(file, offset);
>> default:
>> return generic_file_llseek(file, offset, origin);
>> }
>> }
>>
>> Unfortunately, I just noticed that this code has a problem. In specific,
>> generic_file_llseek() calls generic_file_llseek_size(), which has a
>> switch statement for whence that fails to distinguish between SEEK_SET
>> and invalid whence values. Invalid whence values are mapped to SEEK_SET
>> instead of returning EINVAL, which is wrong. That issue affects all
>> filesystems that do not specify a custom llseek() function and it would
>> affect ocfs2 if my version of the function is used.
>
> Hmm?? Did you mean to say that an invalid whence(i.e, whence > SEEK_MAX)
> can be passed into generic_file_llseek()?
> If so, I don't think that is a problem because any invalid whence should
> be rejected at the entrance of VFS lseek(2) with EINVAL.
>
> Thanks,
> -Jeff
>
You are correct. I had not looked there.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 901 bytes
Desc: OpenPGP digital signature
Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20130616/8e3d7b2a/attachment-0001.bin
^ permalink raw reply [flat|nested] 21+ messages in thread