* [PATCH] patch move-ext4_update_i_disksize-into-mpage_map_and_submit_extent
@ 2014-04-11 14:35 Theodore Ts'o
2014-04-12 13:45 ` [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent() Theodore Ts'o
0 siblings, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2014-04-11 14:35 UTC (permalink / raw)
To: Ext4 Developers List; +Cc: pfonseca, Theodore Ts'o
---
fs/ext4/ext4.h | 17 -----------------
fs/ext4/inode.c | 16 +++++++++++++---
2 files changed, 13 insertions(+), 20 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f1c65dc..66946aa 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2466,23 +2466,6 @@ static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize)
up_write(&EXT4_I(inode)->i_data_sem);
}
-/*
- * Update i_disksize after writeback has been started. Races with truncate
- * are avoided by checking i_size under i_data_sem.
- */
-static inline void ext4_wb_update_i_disksize(struct inode *inode, loff_t newsize)
-{
- loff_t i_size;
-
- down_write(&EXT4_I(inode)->i_data_sem);
- i_size = i_size_read(inode);
- if (newsize > i_size)
- newsize = i_size;
- if (newsize > EXT4_I(inode)->i_disksize)
- EXT4_I(inode)->i_disksize = newsize;
- up_write(&EXT4_I(inode)->i_data_sem);
-}
-
struct ext4_group_info {
unsigned long bb_state;
struct rb_root bb_free_root;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 7b93df9..f023f0c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2247,13 +2247,23 @@ static int mpage_map_and_submit_extent(handle_t *handle,
return err;
} while (map->m_len);
- /* Update on-disk size after IO is submitted */
+ /*
+ * Update on-disk size after IO is submitted. Races with
+ * truncate are avoided by checking i_size under i_data_sem.
+ */
disksize = ((loff_t)mpd->first_page) << PAGE_CACHE_SHIFT;
if (disksize > EXT4_I(inode)->i_disksize) {
int err2;
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent()
2014-04-11 14:35 [PATCH] patch move-ext4_update_i_disksize-into-mpage_map_and_submit_extent Theodore Ts'o
@ 2014-04-12 13:45 ` Theodore Ts'o
2014-04-15 16:26 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2014-04-12 13:45 UTC (permalink / raw)
To: Ext4 Developers List; +Cc: pfonseca, Theodore Ts'o, stable
The function ext4_update_i_disksize() is used in only one place, in
the function mpage_map_and_submit_extent(). Move there to simplify
the code paths, and also move the call to ext4_mark_inode_dirty() into
the i_data_sem's critical region, to be consistent with all of the
other places where we update i_disksize. That way, we also keep the
raw_inode's i_disksize protected.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
Oops, the first version was missing the commit description
fs/ext4/ext4.h | 17 -----------------
fs/ext4/inode.c | 16 +++++++++++++---
2 files changed, 13 insertions(+), 20 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f1c65dc..66946aa 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2466,23 +2466,6 @@ static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize)
up_write(&EXT4_I(inode)->i_data_sem);
}
-/*
- * Update i_disksize after writeback has been started. Races with truncate
- * are avoided by checking i_size under i_data_sem.
- */
-static inline void ext4_wb_update_i_disksize(struct inode *inode, loff_t newsize)
-{
- loff_t i_size;
-
- down_write(&EXT4_I(inode)->i_data_sem);
- i_size = i_size_read(inode);
- if (newsize > i_size)
- newsize = i_size;
- if (newsize > EXT4_I(inode)->i_disksize)
- EXT4_I(inode)->i_disksize = newsize;
- up_write(&EXT4_I(inode)->i_data_sem);
-}
-
struct ext4_group_info {
unsigned long bb_state;
struct rb_root bb_free_root;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 7b93df9..f023f0c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2247,13 +2247,23 @@ static int mpage_map_and_submit_extent(handle_t *handle,
return err;
} while (map->m_len);
- /* Update on-disk size after IO is submitted */
+ /*
+ * Update on-disk size after IO is submitted. Races with
+ * truncate are avoided by checking i_size under i_data_sem.
+ */
disksize = ((loff_t)mpd->first_page) << PAGE_CACHE_SHIFT;
if (disksize > EXT4_I(inode)->i_disksize) {
int err2;
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent()
2014-04-12 13:45 ` [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent() Theodore Ts'o
@ 2014-04-15 16:26 ` Jan Kara
2014-04-15 19:26 ` Theodore Ts'o
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2014-04-15 16:26 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Ext4 Developers List, pfonseca, stable
On Sat 12-04-14 09:45:27, Ted Tso wrote:
> The function ext4_update_i_disksize() is used in only one place, in
> the function mpage_map_and_submit_extent(). Move there to simplify
> the code paths, and also move the call to ext4_mark_inode_dirty() into
> the i_data_sem's critical region, to be consistent with all of the
> other places where we update i_disksize. That way, we also keep the
> raw_inode's i_disksize protected.
>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Cc: stable@vger.kernel.org
I agree that it makes sense to have all the places consistent and protect
raw disk inode i_disksize with i_data_sem. OTOH I don't see a way how this
can cause any real harm (but I guess you expect there might be something as
you CCed stable), so can you explain it please?
Honza
> ---
>
> Oops, the first version was missing the commit description
>
> fs/ext4/ext4.h | 17 -----------------
> fs/ext4/inode.c | 16 +++++++++++++---
> 2 files changed, 13 insertions(+), 20 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index f1c65dc..66946aa 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2466,23 +2466,6 @@ static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize)
> up_write(&EXT4_I(inode)->i_data_sem);
> }
>
> -/*
> - * Update i_disksize after writeback has been started. Races with truncate
> - * are avoided by checking i_size under i_data_sem.
> - */
> -static inline void ext4_wb_update_i_disksize(struct inode *inode, loff_t newsize)
> -{
> - loff_t i_size;
> -
> - down_write(&EXT4_I(inode)->i_data_sem);
> - i_size = i_size_read(inode);
> - if (newsize > i_size)
> - newsize = i_size;
> - if (newsize > EXT4_I(inode)->i_disksize)
> - EXT4_I(inode)->i_disksize = newsize;
> - up_write(&EXT4_I(inode)->i_data_sem);
> -}
> -
> struct ext4_group_info {
> unsigned long bb_state;
> struct rb_root bb_free_root;
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 7b93df9..f023f0c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2247,13 +2247,23 @@ static int mpage_map_and_submit_extent(handle_t *handle,
> return err;
> } while (map->m_len);
>
> - /* Update on-disk size after IO is submitted */
> + /*
> + * Update on-disk size after IO is submitted. Races with
> + * truncate are avoided by checking i_size under i_data_sem.
> + */
> disksize = ((loff_t)mpd->first_page) << PAGE_CACHE_SHIFT;
> if (disksize > EXT4_I(inode)->i_disksize) {
> int err2;
> -
> - ext4_wb_update_i_disksize(inode, disksize);
> + loff_t i_size;
> +
> + down_write(&EXT4_I(inode)->i_data_sem);
> + i_size = i_size_read(inode);
> + if (disksize > i_size)
> + disksize = i_size;
> + if (disksize > EXT4_I(inode)->i_disksize)
> + EXT4_I(inode)->i_disksize = disksize;
> err2 = ext4_mark_inode_dirty(handle, inode);
> + up_write(&EXT4_I(inode)->i_data_sem);
> if (err2)
> ext4_error(inode->i_sb,
> "Failed to mark inode %lu dirty",
> --
> 1.9.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent()
2014-04-15 16:26 ` Jan Kara
@ 2014-04-15 19:26 ` Theodore Ts'o
2014-04-15 20:18 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2014-04-15 19:26 UTC (permalink / raw)
To: Jan Kara; +Cc: Ext4 Developers List, pfonseca, stable
On Tue, Apr 15, 2014 at 06:26:27PM +0200, Jan Kara wrote:
> On Sat 12-04-14 09:45:27, Ted Tso wrote:
> > The function ext4_update_i_disksize() is used in only one place, in
> > the function mpage_map_and_submit_extent(). Move there to simplify
> > the code paths, and also move the call to ext4_mark_inode_dirty() into
> > the i_data_sem's critical region, to be consistent with all of the
> > other places where we update i_disksize. That way, we also keep the
> > raw_inode's i_disksize protected.
> >
> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> > Cc: stable@vger.kernel.org
> I agree that it makes sense to have all the places consistent and protect
> raw disk inode i_disksize with i_data_sem. OTOH I don't see a way how this
> can cause any real harm (but I guess you expect there might be something as
> you CCed stable), so can you explain it please?
This was the case I was worried about:
CPU #1 CPU #2
1. down_write(&i_data_sem)
2. Modify i_disk_size
4. up_write(&i_data_sem)
5. down_write(&i_data_sem)
6. Modify i_disk_size
7. Copy i_disk_size to on-disk inode
8. up_write(&i_data_sem)
9. Copy i_disk_size to on-disk inode
It's the standard data race; it might not be a problem on Intel CPU's,
but in general, cpu #1 might still have a stale copy of i_disk_size in
its cache, and hence it might copying the old, outdated value into the
on-disk inode.
- Ted
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent()
2014-04-15 19:26 ` Theodore Ts'o
@ 2014-04-15 20:18 ` Jan Kara
0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2014-04-15 20:18 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Jan Kara, Ext4 Developers List, pfonseca, stable
On Tue 15-04-14 15:26:42, Ted Tso wrote:
> On Tue, Apr 15, 2014 at 06:26:27PM +0200, Jan Kara wrote:
> > On Sat 12-04-14 09:45:27, Ted Tso wrote:
> > > The function ext4_update_i_disksize() is used in only one place, in
> > > the function mpage_map_and_submit_extent(). Move there to simplify
> > > the code paths, and also move the call to ext4_mark_inode_dirty() into
> > > the i_data_sem's critical region, to be consistent with all of the
> > > other places where we update i_disksize. That way, we also keep the
> > > raw_inode's i_disksize protected.
> > >
> > > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> > > Cc: stable@vger.kernel.org
> > I agree that it makes sense to have all the places consistent and protect
> > raw disk inode i_disksize with i_data_sem. OTOH I don't see a way how this
> > can cause any real harm (but I guess you expect there might be something as
> > you CCed stable), so can you explain it please?
>
> This was the case I was worried about:
>
> CPU #1 CPU #2
>
> 1. down_write(&i_data_sem)
> 2. Modify i_disk_size
> 4. up_write(&i_data_sem)
> 5. down_write(&i_data_sem)
> 6. Modify i_disk_size
> 7. Copy i_disk_size to on-disk inode
> 8. up_write(&i_data_sem)
> 9. Copy i_disk_size to on-disk inode
>
>
> It's the standard data race; it might not be a problem on Intel CPU's,
> but in general, cpu #1 might still have a stale copy of i_disk_size in
> its cache, and hence it might copying the old, outdated value into the
> on-disk inode.
Yes, that could be a problem even on Intel CPU - not because of cache
coherency but because old i_disk_size value might be speculatively
preloaded before CPU#2 updates its value. So feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-04-15 20:18 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-11 14:35 [PATCH] patch move-ext4_update_i_disksize-into-mpage_map_and_submit_extent Theodore Ts'o
2014-04-12 13:45 ` [PATCH -v2] ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent() Theodore Ts'o
2014-04-15 16:26 ` Jan Kara
2014-04-15 19:26 ` Theodore Ts'o
2014-04-15 20:18 ` Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).