* [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
@ 2023-03-10 6:07 Yangtao Li
2023-03-10 6:17 ` Eric Biggers
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Yangtao Li @ 2023-03-10 6:07 UTC (permalink / raw)
To: tytso, adilger.kernel; +Cc: linux-ext4, linux-kernel, viro, Yangtao Li
Just for better readability, no code logic change.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
---
fs/ext4/inode.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d251d705c276..d121cde74522 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
{
struct inode *inode = mpd->inode;
int err;
- ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
- >> inode->i_blkbits;
+ ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
if (ext4_verity_in_progress(inode))
blocks = EXT_MAX_BLOCKS;
--
2.25.1
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li @ 2023-03-10 6:17 ` Eric Biggers 2023-03-10 6:27 ` Yangtao Li 2023-03-10 6:37 ` Al Viro 2023-03-10 19:07 ` kernel test robot ` (2 subsequent siblings) 3 siblings, 2 replies; 18+ messages in thread From: Eric Biggers @ 2023-03-10 6:17 UTC (permalink / raw) To: Yangtao Li; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, viro On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > Just for better readability, no code logic change. > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > --- > fs/ext4/inode.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index d251d705c276..d121cde74522 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > { > struct inode *inode = mpd->inode; > int err; > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > - >> inode->i_blkbits; > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > Please don't do this. This makes the code compile down to a division, which is far less efficient. I've verified this by checking the assembly generated. - Eric ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:17 ` Eric Biggers @ 2023-03-10 6:27 ` Yangtao Li 2023-03-10 6:35 ` Gao Xiang 2023-03-10 6:41 ` Eric Biggers 2023-03-10 6:37 ` Al Viro 1 sibling, 2 replies; 18+ messages in thread From: Yangtao Li @ 2023-03-10 6:27 UTC (permalink / raw) To: tytso, adilger.kernel Cc: linux-ext4, linux-kernel, viro, hsiangkao, linux-erofs > Please don't do this. This makes the code compile down to a division, which is > far less efficient. I've verified this by checking the assembly generated. How much is the performance impact? So should the following be modified as shift operations as well? fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1; fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1) fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 : fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES, fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); Thx, Yangtao ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:27 ` Yangtao Li @ 2023-03-10 6:35 ` Gao Xiang 2023-03-10 6:37 ` Gao Xiang 2023-03-10 6:41 ` Eric Biggers 1 sibling, 1 reply; 18+ messages in thread From: Gao Xiang @ 2023-03-10 6:35 UTC (permalink / raw) To: Yangtao Li, tytso, adilger.kernel Cc: linux-ext4, linux-kernel, viro, linux-erofs On 2023/3/10 14:27, Yangtao Li wrote: >> Please don't do this. This makes the code compile down to a division, which is >> far less efficient. I've verified this by checking the assembly generated. > > How much is the performance impact? So should the following be modified as shift operations as well? > > fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1; > fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); > fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1) > fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 : > fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES, > fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); Please stop taking EROFS as example on ext4 patches and they will all be changed due to subpage support. > > Thx, > Yangtao ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:35 ` Gao Xiang @ 2023-03-10 6:37 ` Gao Xiang 0 siblings, 0 replies; 18+ messages in thread From: Gao Xiang @ 2023-03-10 6:37 UTC (permalink / raw) To: Yangtao Li, tytso, adilger.kernel Cc: linux-ext4, linux-kernel, viro, linux-erofs On 2023/3/10 14:35, Gao Xiang wrote: > > > On 2023/3/10 14:27, Yangtao Li wrote: >>> Please don't do this. This makes the code compile down to a division, which is >>> far less efficient. I've verified this by checking the assembly generated. >> >> How much is the performance impact? So should the following be modified as shift operations as well? >> >> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1; >> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); >> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1) >> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 : >> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES, >> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); > > Please stop taking EROFS as example on ext4 patches > and they will all be changed due to subpage support. Here EROFS_BLKSIZ == PAGE_SIZE is a constant, so no difference to use shift or division. > >> >> Thx, >> Yangtao ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:27 ` Yangtao Li 2023-03-10 6:35 ` Gao Xiang @ 2023-03-10 6:41 ` Eric Biggers 1 sibling, 0 replies; 18+ messages in thread From: Eric Biggers @ 2023-03-10 6:41 UTC (permalink / raw) To: Yangtao Li Cc: tytso, adilger.kernel, hsiangkao, linux-ext4, linux-erofs, linux-kernel, viro On Fri, Mar 10, 2023 at 02:27:38PM +0800, Yangtao Li wrote: > > Please don't do this. This makes the code compile down to a division, which is > > far less efficient. I've verified this by checking the assembly generated. > > How much is the performance impact? So should the following be modified as shift operations as well? > > fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1; > fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); > fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1) > fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 : > fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES, > fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ); > No, compilers can handle division by a power-of-2 constant just fine. It's just division by a non-constant that is inefficient. - Eric ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:17 ` Eric Biggers 2023-03-10 6:27 ` Yangtao Li @ 2023-03-10 6:37 ` Al Viro 2023-03-10 6:43 ` Al Viro 2023-03-10 6:43 ` Eric Biggers 1 sibling, 2 replies; 18+ messages in thread From: Al Viro @ 2023-03-10 6:37 UTC (permalink / raw) To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > Just for better readability, no code logic change. > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > --- > > fs/ext4/inode.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index d251d705c276..d121cde74522 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > { > > struct inode *inode = mpd->inode; > > int err; > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > - >> inode->i_blkbits; > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > Please don't do this. This makes the code compile down to a division, which is > far less efficient. I've verified this by checking the assembly generated. Which compiler is doing that? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:37 ` Al Viro @ 2023-03-10 6:43 ` Al Viro 2023-03-10 6:43 ` Eric Biggers 1 sibling, 0 replies; 18+ messages in thread From: Al Viro @ 2023-03-10 6:43 UTC (permalink / raw) To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote: > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > > Just for better readability, no code logic change. > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > > --- > > > fs/ext4/inode.c | 3 +-- > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > index d251d705c276..d121cde74522 100644 > > > --- a/fs/ext4/inode.c > > > +++ b/fs/ext4/inode.c > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > > { > > > struct inode *inode = mpd->inode; > > > int err; > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > > - >> inode->i_blkbits; > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > > > > Please don't do this. This makes the code compile down to a division, which is > > far less efficient. I've verified this by checking the assembly generated. > > Which compiler is doing that? While we are at it, replace return (1 << node->i_blkbits); with return (1u << node->i_blkbits); and see if that changes the things. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:37 ` Al Viro 2023-03-10 6:43 ` Al Viro @ 2023-03-10 6:43 ` Eric Biggers 2023-03-10 6:46 ` Al Viro 1 sibling, 1 reply; 18+ messages in thread From: Eric Biggers @ 2023-03-10 6:43 UTC (permalink / raw) To: Al Viro; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote: > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > > Just for better readability, no code logic change. > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > > --- > > > fs/ext4/inode.c | 3 +-- > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > index d251d705c276..d121cde74522 100644 > > > --- a/fs/ext4/inode.c > > > +++ b/fs/ext4/inode.c > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > > { > > > struct inode *inode = mpd->inode; > > > int err; > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > > - >> inode->i_blkbits; > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > > > > Please don't do this. This makes the code compile down to a division, which is > > far less efficient. I've verified this by checking the assembly generated. > > Which compiler is doing that? $ gcc --version gcc (GCC) 12.2.1 20230201 i_blocksize(inode) is not a constant, so this should not be particularly surprising. One might hope that a / (1 << b) would be optimized into a >> b, but that doesn't seem to happen. - Eric ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:43 ` Eric Biggers @ 2023-03-10 6:46 ` Al Viro 2023-03-10 6:54 ` Eric Biggers 0 siblings, 1 reply; 18+ messages in thread From: Al Viro @ 2023-03-10 6:46 UTC (permalink / raw) To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote: > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote: > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > > > Just for better readability, no code logic change. > > > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > > > --- > > > > fs/ext4/inode.c | 3 +-- > > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > > index d251d705c276..d121cde74522 100644 > > > > --- a/fs/ext4/inode.c > > > > +++ b/fs/ext4/inode.c > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > > > { > > > > struct inode *inode = mpd->inode; > > > > int err; > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > > > - >> inode->i_blkbits; > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > > > > > > > Please don't do this. This makes the code compile down to a division, which is > > > far less efficient. I've verified this by checking the assembly generated. > > > > Which compiler is doing that? > > $ gcc --version > gcc (GCC) 12.2.1 20230201 > > i_blocksize(inode) is not a constant, so this should not be particularly > surprising. One might hope that a / (1 << b) would be optimized into a >> b, > but that doesn't seem to happen. It really ought to be a / (1u << b), though... ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:46 ` Al Viro @ 2023-03-10 6:54 ` Eric Biggers 2023-03-10 7:05 ` Al Viro 0 siblings, 1 reply; 18+ messages in thread From: Eric Biggers @ 2023-03-10 6:54 UTC (permalink / raw) To: Al Viro; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote: > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote: > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote: > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > > > > Just for better readability, no code logic change. > > > > > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > > > > --- > > > > > fs/ext4/inode.c | 3 +-- > > > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > > > index d251d705c276..d121cde74522 100644 > > > > > --- a/fs/ext4/inode.c > > > > > +++ b/fs/ext4/inode.c > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > > > > { > > > > > struct inode *inode = mpd->inode; > > > > > int err; > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > > > > - >> inode->i_blkbits; > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > > > > > > > > > > Please don't do this. This makes the code compile down to a division, which is > > > > far less efficient. I've verified this by checking the assembly generated. > > > > > > Which compiler is doing that? > > > > $ gcc --version > > gcc (GCC) 12.2.1 20230201 > > > > i_blocksize(inode) is not a constant, so this should not be particularly > > surprising. One might hope that a / (1 << b) would be optimized into a >> b, > > but that doesn't seem to happen. > > It really ought to be a / (1u << b), though... Sure, that does better: uint64_t f(uint64_t a, int b) { return a / (1U << b); } gcc: 0000000000000000 <f>: 0: 48 89 f8 mov %rdi,%rax 3: 89 f1 mov %esi,%ecx 5: 48 d3 e8 shr %cl,%rax 8: c3 ret clang: 0000000000000000 <f>: 0: 89 f1 mov %esi,%ecx 2: 48 89 f8 mov %rdi,%rax 5: 48 d3 e8 shr %cl,%rax 8: c3 ret But with a signed dividend (which is the case here) it gets messed up: int64_t f(int64_t a, int b) { return a / (1U << b); } gcc: 0000000000000000 <f>: 0: 48 89 f8 mov %rdi,%rax 3: 89 f1 mov %esi,%ecx 5: bf 01 00 00 00 mov $0x1,%edi a: d3 e7 shl %cl,%edi c: 48 99 cqto e: 48 f7 ff idiv %rdi 11: c3 ret clang: 0000000000000000 <f>: 0: 89 f1 mov %esi,%ecx 2: be 01 00 00 00 mov $0x1,%esi 7: d3 e6 shl %cl,%esi 9: 48 89 f8 mov %rdi,%rax c: 48 89 f9 mov %rdi,%rcx f: 48 c1 e9 20 shr $0x20,%rcx 13: 74 06 je 1b <f+0x1b> 15: 48 99 cqto 17: 48 f7 fe idiv %rsi 1a: c3 ret 1b: 31 d2 xor %edx,%edx 1d: f7 f6 div %esi 1f: c3 ret ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:54 ` Eric Biggers @ 2023-03-10 7:05 ` Al Viro 2023-03-10 7:12 ` Al Viro 0 siblings, 1 reply; 18+ messages in thread From: Al Viro @ 2023-03-10 7:05 UTC (permalink / raw) To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Thu, Mar 09, 2023 at 10:54:38PM -0800, Eric Biggers wrote: > On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote: > > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote: > > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote: > > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > > > > > Just for better readability, no code logic change. > > > > > > > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > > > > > --- > > > > > > fs/ext4/inode.c | 3 +-- > > > > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > > > > index d251d705c276..d121cde74522 100644 > > > > > > --- a/fs/ext4/inode.c > > > > > > +++ b/fs/ext4/inode.c > > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > > > > > { > > > > > > struct inode *inode = mpd->inode; > > > > > > int err; > > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > > > > > - >> inode->i_blkbits; > > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > > > > > > > > > > > > > Please don't do this. This makes the code compile down to a division, which is > > > > > far less efficient. I've verified this by checking the assembly generated. > > > > > > > > Which compiler is doing that? > > > > > > $ gcc --version > > > gcc (GCC) 12.2.1 20230201 > > > > > > i_blocksize(inode) is not a constant, so this should not be particularly > > > surprising. One might hope that a / (1 << b) would be optimized into a >> b, > > > but that doesn't seem to happen. > > > > It really ought to be a / (1u << b), though... > > Sure, that does better: > > uint64_t f(uint64_t a, int b) > { > return a / (1U << b); > } > > gcc: > 0000000000000000 <f>: > 0: 48 89 f8 mov %rdi,%rax > 3: 89 f1 mov %esi,%ecx > 5: 48 d3 e8 shr %cl,%rax > 8: c3 ret > > clang: > 0000000000000000 <f>: > 0: 89 f1 mov %esi,%ecx > 2: 48 89 f8 mov %rdi,%rax > 5: 48 d3 e8 shr %cl,%rax > 8: c3 ret > > But with a signed dividend (which is the case here) it gets messed up: > > int64_t f(int64_t a, int b) > { > return a / (1U << b); > } *ow* And i_size_read() is long long ;-/ Right. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 7:05 ` Al Viro @ 2023-03-10 7:12 ` Al Viro 2023-03-10 21:47 ` Eric Biggers 0 siblings, 1 reply; 18+ messages in thread From: Al Viro @ 2023-03-10 7:12 UTC (permalink / raw) To: Eric Biggers; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Fri, Mar 10, 2023 at 07:05:27AM +0000, Al Viro wrote: > On Thu, Mar 09, 2023 at 10:54:38PM -0800, Eric Biggers wrote: > > On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote: > > > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote: > > > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote: > > > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote: > > > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > > > > > > > Just for better readability, no code logic change. > > > > > > > > > > > > > > Signed-off-by: Yangtao Li <frank.li@vivo.com> > > > > > > > --- > > > > > > > fs/ext4/inode.c | 3 +-- > > > > > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > > > > > index d251d705c276..d121cde74522 100644 > > > > > > > --- a/fs/ext4/inode.c > > > > > > > +++ b/fs/ext4/inode.c > > > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd, > > > > > > > { > > > > > > > struct inode *inode = mpd->inode; > > > > > > > int err; > > > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1) > > > > > > > - >> inode->i_blkbits; > > > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode)); > > > > > > > > > > > > > > > > > > > Please don't do this. This makes the code compile down to a division, which is > > > > > > far less efficient. I've verified this by checking the assembly generated. > > > > > > > > > > Which compiler is doing that? > > > > > > > > $ gcc --version > > > > gcc (GCC) 12.2.1 20230201 > > > > > > > > i_blocksize(inode) is not a constant, so this should not be particularly > > > > surprising. One might hope that a / (1 << b) would be optimized into a >> b, > > > > but that doesn't seem to happen. > > > > > > It really ought to be a / (1u << b), though... > > > > Sure, that does better: > > > > uint64_t f(uint64_t a, int b) > > { > > return a / (1U << b); > > } > > > > gcc: > > 0000000000000000 <f>: > > 0: 48 89 f8 mov %rdi,%rax > > 3: 89 f1 mov %esi,%ecx > > 5: 48 d3 e8 shr %cl,%rax > > 8: c3 ret > > > > clang: > > 0000000000000000 <f>: > > 0: 89 f1 mov %esi,%ecx > > 2: 48 89 f8 mov %rdi,%rax > > 5: 48 d3 e8 shr %cl,%rax > > 8: c3 ret > > > > But with a signed dividend (which is the case here) it gets messed up: > > > > int64_t f(int64_t a, int b) > > { > > return a / (1U << b); > > } > > *ow* > > And i_size_read() is long long ;-/ Right. Out of curiosity (and that's already too brittle for practical use) - does DIV_ROUND_UP_ULL() do any better on full example? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 7:12 ` Al Viro @ 2023-03-10 21:47 ` Eric Biggers 0 siblings, 0 replies; 18+ messages in thread From: Eric Biggers @ 2023-03-10 21:47 UTC (permalink / raw) To: Al Viro; +Cc: Yangtao Li, tytso, adilger.kernel, linux-ext4, linux-kernel On Fri, Mar 10, 2023 at 07:12:31AM +0000, Al Viro wrote: > > Out of curiosity (and that's already too brittle for practical use) - > does DIV_ROUND_UP_ULL() do any better on full example? 'DIV_ROUND_UP_ULL(i_size_read(inode), i_blocksize(inode))' works properly with clang but not gcc. If i_blocksize() is changed to do '1U << inode->i_blkbits' instead of '1 << inode->i_blkbits', it works with gcc too. - Eric ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li 2023-03-10 6:17 ` Eric Biggers @ 2023-03-10 19:07 ` kernel test robot 2023-03-10 19:38 ` kernel test robot 2023-03-10 23:21 ` Theodore Ts'o 3 siblings, 0 replies; 18+ messages in thread From: kernel test robot @ 2023-03-10 19:07 UTC (permalink / raw) To: Yangtao Li, tytso, adilger.kernel Cc: oe-kbuild-all, linux-ext4, linux-kernel, viro, Yangtao Li Hi Yangtao, I love your patch! Yet something to improve: [auto build test ERROR on tytso-ext4/dev] [also build test ERROR on linus/master v6.3-rc1 next-20230310] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903 base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev patch link: https://lore.kernel.org/r/20230310060734.8780-1-frank.li%40vivo.com patch subject: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() config: i386-defconfig (https://download.01.org/0day-ci/archive/20230311/202303110211.LXeNm5uw-lkp@intel.com/config) compiler: gcc-11 (Debian 11.3.0-8) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903 git checkout f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=i386 olddefconfig make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202303110211.LXeNm5uw-lkp@intel.com/ All errors (new ones prefixed by >>): ld: fs/ext4/inode.o: in function `mpage_process_page_bufs': >> inode.c:(.text+0xbda): undefined reference to `__divdi3' -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li 2023-03-10 6:17 ` Eric Biggers 2023-03-10 19:07 ` kernel test robot @ 2023-03-10 19:38 ` kernel test robot 2023-03-10 23:21 ` Theodore Ts'o 3 siblings, 0 replies; 18+ messages in thread From: kernel test robot @ 2023-03-10 19:38 UTC (permalink / raw) To: Yangtao Li, tytso, adilger.kernel Cc: oe-kbuild-all, linux-ext4, linux-kernel, viro, Yangtao Li Hi Yangtao, I love your patch! Yet something to improve: [auto build test ERROR on tytso-ext4/dev] [also build test ERROR on linus/master v6.3-rc1 next-20230310] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903 base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev patch link: https://lore.kernel.org/r/20230310060734.8780-1-frank.li%40vivo.com patch subject: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() config: i386-debian-10.3 (https://download.01.org/0day-ci/archive/20230311/202303110358.NxL6UI32-lkp@intel.com/config) compiler: gcc-11 (Debian 11.3.0-8) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903 git checkout f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=i386 olddefconfig make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202303110358.NxL6UI32-lkp@intel.com/ All errors (new ones prefixed by >>, old ones prefixed by <<): >> ERROR: modpost: "__divdi3" [fs/ext4/ext4.ko] undefined! -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li ` (2 preceding siblings ...) 2023-03-10 19:38 ` kernel test robot @ 2023-03-10 23:21 ` Theodore Ts'o 2023-03-13 9:25 ` David Laight 3 siblings, 1 reply; 18+ messages in thread From: Theodore Ts'o @ 2023-03-10 23:21 UTC (permalink / raw) To: Yangtao Li; +Cc: adilger.kernel, linux-ext4, linux-kernel, viro On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote: > Just for better readability, no code logic change. All aside from the arguments over performance, I'm not at *all* convinced by the "it's more readable" argument. So yeah, let's not. We have i_blkbits for a reason, and it's because shifting right is just simpler and easier. BTW, doing a 64-bit division on a 32-bit platforms causes compile failures, which was the cause of the test bot complaint: ld: fs/ext4/inode.o: in function `mpage_process_page_bufs': >> inode.c:(.text+0xbda): undefined reference to `__divdi3' On 32-bit platforms --- i386 in particular --- the 64-bit division results in an out-of-line call to a helper function that is not supplied in the kernel compilation environment, so not only is it slower, it Just Doesn't Work. - Ted ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() 2023-03-10 23:21 ` Theodore Ts'o @ 2023-03-13 9:25 ` David Laight 0 siblings, 0 replies; 18+ messages in thread From: David Laight @ 2023-03-13 9:25 UTC (permalink / raw) To: 'Theodore Ts'o', Yangtao Li Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, viro@zeniv.linux.org.uk ... > On 32-bit platforms --- i386 in particular --- the 64-bit division > results in an out-of-line call to a helper function that is not > supplied in the kernel compilation environment, so not only is it > slower, it Just Doesn't Work. Even on some 64-bit systems a 64bit divide can be significantly slower than a 32-bit divide - even with the same arguments. IIRC Intel x86-64 a 64-bit divide (128-bit dividend) is about twice the clocks of a 32-bit one. On AMD cpu they are much the same. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2023-03-13 9:25 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-03-10 6:07 [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs() Yangtao Li 2023-03-10 6:17 ` Eric Biggers 2023-03-10 6:27 ` Yangtao Li 2023-03-10 6:35 ` Gao Xiang 2023-03-10 6:37 ` Gao Xiang 2023-03-10 6:41 ` Eric Biggers 2023-03-10 6:37 ` Al Viro 2023-03-10 6:43 ` Al Viro 2023-03-10 6:43 ` Eric Biggers 2023-03-10 6:46 ` Al Viro 2023-03-10 6:54 ` Eric Biggers 2023-03-10 7:05 ` Al Viro 2023-03-10 7:12 ` Al Viro 2023-03-10 21:47 ` Eric Biggers 2023-03-10 19:07 ` kernel test robot 2023-03-10 19:38 ` kernel test robot 2023-03-10 23:21 ` Theodore Ts'o 2023-03-13 9:25 ` David Laight
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).