linux-mm.kvack.org archive mirror
* [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
@ 2016-03-24 11:08 Nicolai Stange
  2016-03-24 11:45 ` Jan Kara
  2016-03-29  8:46 ` Mel Gorman
  0 siblings, 2 replies; 4+ messages in thread
From: Nicolai Stange @ 2016-03-24 11:08 UTC (permalink / raw)
  To: Andrew Morton, Al Viro
  Cc: Jan Kara, Johannes Weiner, Michal Hocko, Ross Zwisler, Mel Gorman,
	Junichi Nomura, Hugh Dickins, Matthew Wilcox, linux-mm,
	linux-kernel, Nicolai Stange

If
- generic_file_read_iter() gets called with a zero read length,
- the read offset is at a page boundary,
- IOCB_DIRECT is not set,
- and the page in question hasn't made it into the page cache yet,
then do_generic_file_read() will trigger a readahead with a req_size hint
of zero.

Since roundup_pow_of_two(0) is undefined, UBSAN reports

  UBSAN: Undefined behaviour in include/linux/log2.h:63:13
  shift exponent 64 is too large for 64-bit type 'long unsigned int'
  CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
  [...]
  Call Trace:
   [...]
   [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
   [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
   [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
   [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
   [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
   [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
   [...]
   [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
   [...]

when get_init_ra_size() gets called from ondemand_readahead().
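
For illustration only, the arithmetic behind that report can be sketched in
userspace. The sketch assumes that the non-constant path of
roundup_pow_of_two(n) computes 1UL << fls_long(n - 1), which is what the
log2.h line in the report appears to point at. fls_long_sketch(),
roundup_shift_count() and roundup_shift_is_defined() are hypothetical names,
not kernel functions. The code only computes the shift count and never
performs the undefined shift itself.

```c
#include <limits.h>

/*
 * Hypothetical userspace stand-in for the kernel's fls_long(): position
 * of the most significant set bit, counted from 1; 0 for an input of 0.
 */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Shift count that 1UL << fls_long(n - 1) would use for a given n. */
static unsigned int roundup_shift_count(unsigned long n)
{
	/* For n == 0, n - 1 wraps around to ULONG_MAX. */
	return fls_long_sketch(n - 1);
}

/* Nonzero if that shift count is valid for an unsigned long operand. */
static int roundup_shift_is_defined(unsigned long n)
{
	return roundup_shift_count(n) < sizeof(unsigned long) * CHAR_BIT;
}
```

For n == 0 the shift count equals the full width of unsigned long (64 on a
64-bit build), which matches the "shift exponent 64" in the report. For
n == 5 the count is 3, giving the expected rounded value of 8.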

The net effect is that, for a requested read length of zero, the initial
readahead size depends on the architecture. For example, since

  1UL << (sizeof(unsigned long) * 8)

evaluates to 1 on x86 but to 0 on ARMv7, the initial readahead size becomes
4 on the former and 0 on the latter.
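
A rough sketch of how those two values come about. It assumes that
get_init_ra_size() scales the rounded request size as shown below, which is
how I recall mm/readahead.c from around that time; treat the thresholds as an
assumption rather than a quote of the kernel source. init_ra_size_sketch() is
a hypothetical name, and the already rounded value is passed in so that the
undefined shift is never executed.

```c
/*
 * Userspace sketch of the sizing policy assumed for get_init_ra_size().
 * "rounded" stands for the result of roundup_pow_of_two(size) and "max"
 * for the maximum readahead window, both in pages.
 */
static unsigned long init_ra_size_sketch(unsigned long rounded,
					 unsigned long max)
{
	if (rounded <= max / 32)
		return rounded * 4;	/* small request: quadruple it */
	if (rounded <= max / 4)
		return rounded * 2;	/* medium request: double it */
	return max;			/* large request: clamp to max */
}
```

With an assumed max of 32 pages, a rounded value of 1 (the x86 outcome)
yields 4 and a rounded value of 0 (the ARMv7 outcome) yields 0.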

What's more, zero length reads treat the file access timestamp differently
depending on whether IOCB_DIRECT is set: with IOCB_DIRECT set,
generic_file_read_iter() explicitly skips updating that timestamp, while
with IOCB_DIRECT cleared, it is always updated through the call to
do_generic_file_read().

According to POSIX, zero length reads "do not modify the last data access
timestamp" and thus, the IOCB_DIRECT behaviour is the one that conforms to
POSIX.
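
A minimal userspace sketch for observing this from outside the kernel.
zero_read_changes_atime() is a hypothetical helper, not part of the patch. It
compares st_atim before and after a zero length read() on an already open
descriptor. Whether an atime update is visible at all also depends on mount
options such as noatime or relatime and on the filesystem in use, so an
unchanged timestamp alone does not prove the absence of the inconsistency.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Returns 1 if a zero length read() on fd changed the access timestamp,
 * 0 if it did not, and -1 on error.
 */
static int zero_read_changes_atime(int fd)
{
	struct stat before, after;
	char dummy;

	if (fstat(fd, &before) < 0)
		return -1;

	/* POSIX: a zero length read returns 0 and leaves atime alone. */
	if (read(fd, &dummy, 0) != 0)
		return -1;

	if (fstat(fd, &after) < 0)
		return -1;

	return before.st_atim.tv_sec != after.st_atim.tv_sec ||
	       before.st_atim.tv_nsec != after.st_atim.tv_nsec;
}
```

To exercise the buffered path, open the file without O_DIRECT; to compare
against the direct I/O path, open it with O_DIRECT where the filesystem
supports that flag.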

Let generic_file_read_iter() unconditionally check the requested read
length at its entry and return immediately with success if it is zero.

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 Applicable to linux-next-20160324

 mm/filemap.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 7c00f10..a8c69c8 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	ssize_t retval = 0;
 	loff_t *ppos = &iocb->ki_pos;
 	loff_t pos = *ppos;
+	size_t count = iov_iter_count(iter);
+
+	if (!count)
+		goto out; /* skip atime */
 
 	if (iocb->ki_flags & IOCB_DIRECT) {
 		struct address_space *mapping = file->f_mapping;
 		struct inode *inode = mapping->host;
-		size_t count = iov_iter_count(iter);
 		loff_t size;
 
-		if (!count)
-			goto out; /* skip atime */
 		size = i_size_read(inode);
 		retval = filemap_write_and_wait_range(mapping, pos,
 					pos + count - 1);
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
  2016-03-24 11:08 [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally Nicolai Stange
@ 2016-03-24 11:45 ` Jan Kara
  2016-03-25  7:50   ` Nicolai Stange
  2016-03-29  8:46 ` Mel Gorman
  1 sibling, 1 reply; 4+ messages in thread
From: Jan Kara @ 2016-03-24 11:45 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Andrew Morton, Al Viro, Jan Kara, Johannes Weiner, Michal Hocko,
	Ross Zwisler, Mel Gorman, Junichi Nomura, Hugh Dickins,
	Matthew Wilcox, linux-mm, linux-kernel

On Thu 24-03-16 12:08:58, Nicolai Stange wrote:
> If
> - generic_file_read_iter() gets called with a zero read length,
> - the read offset is at a page boundary,
> - IOCB_DIRECT is not set
> - and the page in question hasn't made it into the page cache yet,
> then do_generic_file_read() will trigger a readahead with a req_size hint
> of zero.
> 
> Since roundup_pow_of_two(0) is undefined, UBSAN reports
> 
>   UBSAN: Undefined behaviour in include/linux/log2.h:63:13
>   shift exponent 64 is too large for 64-bit type 'long unsigned int'
>   CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
>   [...]
>   Call Trace:
>    [...]
>    [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
>    [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
>    [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
>    [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
>    [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
>    [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
>    [...]
>    [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
>    [...]
> 
> when get_init_ra_size() gets called from ondemand_readahead().
> 
> The net effect is that the initial readahead size is arch dependent for
> requested read lengths of zero: for example, since
> 
>   1UL << (sizeof(unsigned long) * 8)
> 
> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
> size becomes 4 on the former and 0 on the latter.
> 
> What's more, whether or not the file access timestamp is updated for zero
> length reads is decided differently for the two cases of IOCB_DIRECT
> being set or cleared: in the first case, generic_file_read_iter()
> explicitly skips updating that timestamp while in the latter case, it is
> always updated through the call to do_generic_file_read().
> 
> According to POSIX, zero length reads "do not modify the last data access
> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
> 
> Let generic_file_read_iter() unconditionally check the requested read
> length at its entry and return immediately with success if it is zero.
> 
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>

Makes sense to me. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> diff --git a/mm/filemap.c b/mm/filemap.c
> index 7c00f10..a8c69c8 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>  	ssize_t retval = 0;
>  	loff_t *ppos = &iocb->ki_pos;
>  	loff_t pos = *ppos;
> +	size_t count = iov_iter_count(iter);
> +
> +	if (!count)
> +		goto out; /* skip atime */
>  
>  	if (iocb->ki_flags & IOCB_DIRECT) {
>  		struct address_space *mapping = file->f_mapping;
>  		struct inode *inode = mapping->host;
> -		size_t count = iov_iter_count(iter);
>  		loff_t size;
>  
> -		if (!count)
> -			goto out; /* skip atime */
>  		size = i_size_read(inode);
>  		retval = filemap_write_and_wait_range(mapping, pos,
>  					pos + count - 1);
> -- 
> 2.7.4
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
  2016-03-24 11:45 ` Jan Kara
@ 2016-03-25  7:50   ` Nicolai Stange
  0 siblings, 0 replies; 4+ messages in thread
From: Nicolai Stange @ 2016-03-25  7:50 UTC (permalink / raw)
  To: Jan Kara
  Cc: Nicolai Stange, Andrew Morton, Al Viro, Jan Kara, Johannes Weiner,
	Michal Hocko, Ross Zwisler, Mel Gorman, Junichi Nomura,
	Hugh Dickins, Matthew Wilcox, linux-mm, linux-kernel

Jan Kara <jack@suse.cz> writes:

> On Thu 24-03-16 12:08:58, Nicolai Stange wrote:
>> If
>> - generic_file_read_iter() gets called with a zero read length,
>> - the read offset is at a page boundary,
>> - IOCB_DIRECT is not set
>> - and the page in question hasn't made it into the page cache yet,
>> then do_generic_file_read() will trigger a readahead with a req_size hint
>> of zero.
>> 
>> Since roundup_pow_of_two(0) is undefined, UBSAN reports
>> 
>>   UBSAN: Undefined behaviour in include/linux/log2.h:63:13
>>   shift exponent 64 is too large for 64-bit type 'long unsigned int'
>>   CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
>>   [...]
>>   Call Trace:
>>    [...]
>>    [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
>>    [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
>>    [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
>>    [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
>>    [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
>>    [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
>>    [...]
>>    [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
>>    [...]
>> 
>> when get_init_ra_size() gets called from ondemand_readahead().
>> 
>> The net effect is that the initial readahead size is arch dependent for
>> requested read lengths of zero: for example, since
>> 
>>   1UL << (sizeof(unsigned long) * 8)
>> 
>> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
>> size becomes 4 on the former and 0 on the latter.
>> 
>> What's more, whether or not the file access timestamp is updated for zero
>> length reads is decided differently for the two cases of IOCB_DIRECT
>> being set or cleared: in the first case, generic_file_read_iter()
>> explicitly skips updating that timestamp while in the latter case, it is
>> always updated through the call to do_generic_file_read().
>> 
>> According to POSIX, zero length reads "do not modify the last data access
>> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
>> 
>> Let generic_file_read_iter() unconditionally check the requested read
>> length at its entry and return immediately with success if it is zero.
>> 
>> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
>
> Makes sense to me. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>

Thank you very much for reviewing this!

Nicolai


>
> 								Honza
>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 7c00f10..a8c69c8 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -1840,15 +1840,16 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>>  	ssize_t retval = 0;
>>  	loff_t *ppos = &iocb->ki_pos;
>>  	loff_t pos = *ppos;
>> +	size_t count = iov_iter_count(iter);
>> +
>> +	if (!count)
>> +		goto out; /* skip atime */
>>  
>>  	if (iocb->ki_flags & IOCB_DIRECT) {
>>  		struct address_space *mapping = file->f_mapping;
>>  		struct inode *inode = mapping->host;
>> -		size_t count = iov_iter_count(iter);
>>  		loff_t size;
>>  
>> -		if (!count)
>> -			goto out; /* skip atime */
>>  		size = i_size_read(inode);
>>  		retval = filemap_write_and_wait_range(mapping, pos,
>>  					pos + count - 1);
>> -- 
>> 2.7.4
>> 
>> 


* Re: [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
  2016-03-24 11:08 [PATCH] mm/filemap: generic_file_read_iter(): check for zero reads unconditionally Nicolai Stange
  2016-03-24 11:45 ` Jan Kara
@ 2016-03-29  8:46 ` Mel Gorman
  1 sibling, 0 replies; 4+ messages in thread
From: Mel Gorman @ 2016-03-29  8:46 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Andrew Morton, Al Viro, Jan Kara, Johannes Weiner, Michal Hocko,
	Ross Zwisler, Junichi Nomura, Hugh Dickins, Matthew Wilcox,
	linux-mm, linux-kernel

On Thu, Mar 24, 2016 at 12:08:58PM +0100, Nicolai Stange wrote:
> If
> - generic_file_read_iter() gets called with a zero read length,
> - the read offset is at a page boundary,
> - IOCB_DIRECT is not set
> - and the page in question hasn't made it into the page cache yet,
> then do_generic_file_read() will trigger a readahead with a req_size hint
> of zero.
> 
> Since roundup_pow_of_two(0) is undefined, UBSAN reports
> 
>   UBSAN: Undefined behaviour in include/linux/log2.h:63:13
>   shift exponent 64 is too large for 64-bit type 'long unsigned int'
>   CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-20160318+ #14
>   [...]
>   Call Trace:
>    [...]
>    [<ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
>    [<ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
>    [<ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
>    [<ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
>    [<ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
>    [<ffffffff813cc955>] generic_file_read_iter+0x185/0x420
>    [...]
>    [<ffffffff81510b06>] __vfs_read+0x256/0x3d0
>    [...]
> 
> when get_init_ra_size() gets called from ondemand_readahead().
> 
> The net effect is that the initial readahead size is arch dependent for
> requested read lengths of zero: for example, since
> 
>   1UL << (sizeof(unsigned long) * 8)
> 
> evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
> size becomes 4 on the former and 0 on the latter.
> 
> What's more, whether or not the file access timestamp is updated for zero
> length reads is decided differently for the two cases of IOCB_DIRECT
> being set or cleared: in the first case, generic_file_read_iter()
> explicitly skips updating that timestamp while in the latter case, it is
> always updated through the call to do_generic_file_read().
> 
> According to POSIX, zero length reads "do not modify the last data access
> timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
> 
> Let generic_file_read_iter() unconditionally check the requested read
> length at its entry and return immediately with success if it is zero.
> 
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs

