* [PATCH 0/2] Fix direct write with respect to inode locking
@ 2020-12-08 18:42 Goldwyn Rodrigues
2020-12-08 18:42 ` [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check() Goldwyn Rodrigues
2020-12-08 18:42 ` [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock Goldwyn Rodrigues
0 siblings, 2 replies; 7+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-08 18:42 UTC
To: linux-btrfs; +Cc: Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
In my previous attempt to fix direct I/O using iomap, the inode locks
were pushed into respective direct and buffered writes. However, in case
of fallback to buffered write, direct-io would release the inode lock and
reacquire it for buffered. This can cause corruption if another process
acquires the lock in between and writes around the same offset. Change
the flow so that the lock is acquired at the beginning and released only
after the fallback buffered write is complete.
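
A user-space sketch of the locking shape described above, for illustration only. It is not btrfs code: `toy_inode`, `toy_write()` and the `*_nolock` helpers are hypothetical names, a pthread mutex stands in for the inode lock, and "direct I/O" is modelled as something that can only handle whole 4096-byte blocks. The point is just that one lock/unlock pair covers both the direct attempt and the buffered fallback.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_BLOCK 4096u

/* Hypothetical stand-in for an inode; not a kernel structure. */
struct toy_inode {
	pthread_mutex_t lock;		/* plays the role of the inode lock */
	unsigned int acquisitions;	/* how many times the lock was taken */
	size_t size;			/* bytes "written" so far */
};

/* Toy direct path: handles only whole blocks. Caller holds the lock. */
static size_t toy_direct_nolock(struct toy_inode *ino, size_t len)
{
	size_t done = len - (len % TOY_BLOCK);

	ino->size += done;
	return done;
}

/* Toy buffered path: handles any length. Caller holds the lock. */
static size_t toy_buffered_nolock(struct toy_inode *ino, size_t len)
{
	ino->size += len;
	return len;
}

/* The lock is taken once and held until the fallback has finished. */
static size_t toy_write(struct toy_inode *ino, size_t len)
{
	size_t written;

	pthread_mutex_lock(&ino->lock);
	ino->acquisitions++;

	written = toy_direct_nolock(ino, len);
	if (written < len)	/* fall back without dropping the lock */
		written += toy_buffered_nolock(ino, len - written);

	assert(written == len);
	pthread_mutex_unlock(&ino->lock);
	return written;
}
```

With the lock dropped and retaken between the two helpers, another writer could run in the gap; here no such window exists because there is a single critical section.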
Goldwyn Rodrigues (2):
btrfs: Fold generic_write_checks() in btrfs_write_check()
btrfs: Make btrfs_direct_write atomic with respect to inode_lock
fs/btrfs/file.c | 86 ++++++++++++++++++++++++++-----------------------
1 file changed, 46 insertions(+), 40 deletions(-)
--
2.29.2
* [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check()
2020-12-08 18:42 [PATCH 0/2] Fix direct write with respect to inode locking Goldwyn Rodrigues
@ 2020-12-08 18:42 ` Goldwyn Rodrigues
2020-12-10 8:43 ` Nikolay Borisov
2020-12-10 11:47 ` David Sterba
2020-12-08 18:42 ` [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock Goldwyn Rodrigues
1 sibling, 2 replies; 7+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-08 18:42 UTC
To: linux-btrfs; +Cc: Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
Code Cleanup.
Fold generic_write_checks() in btrfs_write_check(), because
generic_write_checks() is called before btrfs_write_check() in both
cases. The prototype of btrfs_write_check() has been changed to return
ssize_t, and zero is a valid, non-error return value (no write is
required). btrfs_write_check() now returns the number of bytes to be written.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/btrfs/file.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
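
The return convention described in the commit message above (positive count, zero, or negative errno) can be sketched in user space as follows. This is an illustration under assumptions, not the kernel implementation: `toy_generic_checks()` is a hypothetical stand-in that only clamps the length to a size limit, and the `read_only` flag is an invented placeholder for the filesystem-specific checks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Hypothetical stand-in for the generic checks: returns the (possibly
 * clamped) byte count, 0 if there is nothing to write, or a negative errno.
 */
static ssize_t toy_generic_checks(size_t pos, size_t len, size_t limit)
{
	if (len == 0)
		return 0;
	if (pos >= limit)
		return -EFBIG;
	if (len > limit - pos)
		len = limit - pos;
	return (ssize_t)len;
}

/*
 * Folded check: runs the generic checks first, then the
 * filesystem-specific ones, and hands the count back to the caller.
 */
static ssize_t toy_write_check(size_t pos, size_t len, size_t limit,
			       int read_only)
{
	ssize_t count = toy_generic_checks(pos, len, limit);

	if (count <= 0)		/* error, or no write required */
		return count;
	if (read_only)		/* placeholder for fs-specific checks */
		return -EROFS;
	return count;		/* number of bytes to write */
}
```

A caller then needs a single `if (ret <= 0)` test after one call instead of two separate checks.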
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0e41459b8de6..272660a8279f 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1583,17 +1583,28 @@ static void update_time_for_write(struct inode *inode)
inode_inc_iversion(inode);
}
-static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
- size_t count)
+/* btrfs_write_check - checks if a write can be performed
+ *
+ * Returns:
+ * count - in case the write can be successfully performed
+ * < 0 - error in case write cannot be performed
+ * 0 - if the write is not required
+ */
+static ssize_t btrfs_write_check(struct kiocb *iocb, struct iov_iter *from)
{
struct file *file = iocb->ki_filp;
struct inode *inode = file_inode(file);
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
loff_t pos = iocb->ki_pos;
- int ret;
+ ssize_t ret;
+ ssize_t count;
loff_t oldsize;
loff_t start_pos;
+ count = generic_write_checks(iocb, from);
+ if (count <= 0)
+ return count;
+
if (iocb->ki_flags & IOCB_NOWAIT) {
size_t nocow_bytes = count;
@@ -1635,7 +1646,7 @@ static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
}
}
- return 0;
+ return count;
}
static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
@@ -1665,14 +1676,10 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
if (ret < 0)
return ret;
- ret = generic_write_checks(iocb, i);
+ ret = btrfs_write_check(iocb, i);
if (ret <= 0)
goto out;
- ret = btrfs_write_check(iocb, i, ret);
- if (ret < 0)
- goto out;
-
pos = iocb->ki_pos;
nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
PAGE_SIZE / (sizeof(struct page *)));
@@ -1920,14 +1927,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
if (err < 0)
return err;
- err = generic_write_checks(iocb, from);
+ err = btrfs_write_check(iocb, from);
if (err <= 0) {
- btrfs_inode_unlock(inode, ilock_flags);
- return err;
- }
-
- err = btrfs_write_check(iocb, from, err);
- if (err < 0) {
btrfs_inode_unlock(inode, ilock_flags);
goto out;
}
--
2.29.2
* [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock
2020-12-08 18:42 [PATCH 0/2] Fix direct write with respect to inode locking Goldwyn Rodrigues
2020-12-08 18:42 ` [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check() Goldwyn Rodrigues
@ 2020-12-08 18:42 ` Goldwyn Rodrigues
2020-12-10 8:52 ` Nikolay Borisov
1 sibling, 1 reply; 7+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-08 18:42 UTC
To: linux-btrfs; +Cc: Goldwyn Rodrigues
From: Goldwyn Rodrigues <rgoldwyn@suse.com>
btrfs_direct_write() falls back to a buffered write when btrfs is not
able to perform or complete a direct I/O. During the fallback the
inode lock is released and reacquired. This does not guarantee the
atomicity of the entire write, since the lock can be acquired by another
writer between the unlock and the relock.
Introduce __btrfs_buffered_write(), which performs the write without
taking the lock or running the checks, and call it from btrfs_direct_write().
fa54fc76db94 ("btrfs: push inode locking and unlocking into buffered/direct write")
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
fs/btrfs/file.c | 55 +++++++++++++++++++++++++++----------------------
1 file changed, 30 insertions(+), 25 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 272660a8279f..03569fe20237 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1649,11 +1649,11 @@ static ssize_t btrfs_write_check(struct kiocb *iocb, struct iov_iter *from)
return count;
}
-static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
+static noinline ssize_t __btrfs_buffered_write(struct kiocb *iocb,
struct iov_iter *i)
{
struct file *file = iocb->ki_filp;
- loff_t pos;
+ loff_t pos = iocb->ki_pos;
struct inode *inode = file_inode(file);
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct page **pages = NULL;
@@ -1667,20 +1667,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
bool only_release_metadata = false;
bool force_page_uptodate = false;
loff_t old_isize = i_size_read(inode);
- unsigned int ilock_flags = 0;
-
- if (iocb->ki_flags & IOCB_NOWAIT)
- ilock_flags |= BTRFS_ILOCK_TRY;
-
- ret = btrfs_inode_lock(inode, ilock_flags);
- if (ret < 0)
- return ret;
-
- ret = btrfs_write_check(iocb, i);
- if (ret <= 0)
- goto out;
- pos = iocb->ki_pos;
nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
PAGE_SIZE / (sizeof(struct page *)));
nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
@@ -1884,10 +1871,33 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
iocb->ki_pos += num_written;
}
out:
- btrfs_inode_unlock(inode, ilock_flags);
return num_written ? num_written : ret;
}
+static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
+ struct iov_iter *i)
+{
+ struct inode *inode = file_inode(iocb->ki_filp);
+ unsigned int ilock_flags = 0;
+ ssize_t ret;
+
+ if (iocb->ki_flags & IOCB_NOWAIT)
+ ilock_flags |= BTRFS_ILOCK_TRY;
+
+ ret = btrfs_inode_lock(inode, ilock_flags);
+ if (ret < 0)
+ return ret;
+
+ ret = btrfs_write_check(iocb, i);
+ if (ret <= 0)
+ goto out;
+
+ ret = __btrfs_buffered_write(iocb, i);
+out:
+ btrfs_inode_unlock(inode, ilock_flags);
+ return ret;
+}
+
static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info,
const struct iov_iter *iter, loff_t offset)
{
@@ -1928,10 +1938,8 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
return err;
err = btrfs_write_check(iocb, from);
- if (err <= 0) {
- btrfs_inode_unlock(inode, ilock_flags);
+ if (err <= 0)
goto out;
- }
pos = iocb->ki_pos;
/*
@@ -1945,16 +1953,12 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
goto relock;
}
- if (check_direct_IO(fs_info, from, pos)) {
- btrfs_inode_unlock(inode, ilock_flags);
+ if (check_direct_IO(fs_info, from, pos))
goto buffered;
- }
dio = __iomap_dio_rw(iocb, from, &btrfs_dio_iomap_ops,
&btrfs_dio_ops, is_sync_kiocb(iocb));
- btrfs_inode_unlock(inode, ilock_flags);
-
if (IS_ERR_OR_NULL(dio)) {
err = PTR_ERR_OR_ZERO(dio);
if (err < 0 && err != -ENOTBLK)
@@ -1970,7 +1974,7 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
buffered:
pos = iocb->ki_pos;
- written_buffered = btrfs_buffered_write(iocb, from);
+ written_buffered = __btrfs_buffered_write(iocb, from);
if (written_buffered < 0) {
err = written_buffered;
goto out;
@@ -1991,6 +1995,7 @@ static ssize_t btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
endbyte >> PAGE_SHIFT);
out:
+ btrfs_inode_unlock(inode, ilock_flags);
return written ? written : err;
}
--
2.29.2
* Re: [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check()
2020-12-08 18:42 ` [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check() Goldwyn Rodrigues
@ 2020-12-10 8:43 ` Nikolay Borisov
2020-12-10 11:47 ` David Sterba
1 sibling, 0 replies; 7+ messages in thread
From: Nikolay Borisov @ 2020-12-10 8:43 UTC
To: Goldwyn Rodrigues, linux-btrfs; +Cc: Goldwyn Rodrigues
On 8.12.20 at 20:42, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> Code Cleanup.
>
> Fold generic_write_checks() in btrfs_write_check(), because
> generic_write_checks() is called before btrfs_write_check() in both
> cases. The prototype of btrfs_write_check() has been changed to return
> ssize_t, and zero is a valid, non-error return value (no write is
> required). btrfs_write_check() now returns the number of bytes to be written.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Codewise LGTM, just one minor nit below; I guess David can fix it up
during merge. With it addressed you can add my:
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
> fs/btrfs/file.c | 33 +++++++++++++++++----------------
> 1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 0e41459b8de6..272660a8279f 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1583,17 +1583,28 @@ static void update_time_for_write(struct inode *inode)
> inode_inc_iversion(inode);
> }
>
> -static int btrfs_write_check(struct kiocb *iocb, struct iov_iter *from,
> - size_t count)
> +/* btrfs_write_check - checks if a write can be performed
> + *
> + * Returns:
> + * count - in case the write can be successfully performed
nit: count - in case the write can be successfully performed number of
bytes to write
<snip>
* Re: [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock
2020-12-08 18:42 ` [PATCH 2/2] btrfs: Make btrfs_direct_write atomic with respect to inode_lock Goldwyn Rodrigues
@ 2020-12-10 8:52 ` Nikolay Borisov
0 siblings, 0 replies; 7+ messages in thread
From: Nikolay Borisov @ 2020-12-10 8:52 UTC
To: Goldwyn Rodrigues, linux-btrfs; +Cc: Goldwyn Rodrigues
On 8.12.20 at 20:42, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> btrfs_direct_write() falls back to a buffered write when btrfs is not
> able to perform or complete a direct I/O. During the fallback the
> inode lock is released and reacquired. This does not guarantee the
> atomicity of the entire write, since the lock can be acquired by another
> writer between the unlock and the relock.
>
> Introduce __btrfs_buffered_write(), which performs the write without
> taking the lock or running the checks, and call it from btrfs_direct_write().
>
> fa54fc76db94 ("btrfs: push inode locking and unlocking into buffered/direct write")
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
> fs/btrfs/file.c | 55 +++++++++++++++++++++++++++----------------------
> 1 file changed, 30 insertions(+), 25 deletions(-)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 272660a8279f..03569fe20237 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1649,11 +1649,11 @@ static ssize_t btrfs_write_check(struct kiocb *iocb, struct iov_iter *from)
> return count;
> }
>
> -static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
> +static noinline ssize_t __btrfs_buffered_write(struct kiocb *iocb,
> struct iov_iter *i)
> {
> struct file *file = iocb->ki_filp;
> - loff_t pos;
> + loff_t pos = iocb->ki_pos;
> struct inode *inode = file_inode(file);
> struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> struct page **pages = NULL;
> @@ -1667,20 +1667,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
> bool only_release_metadata = false;
> bool force_page_uptodate = false;
> loff_t old_isize = i_size_read(inode);
> - unsigned int ilock_flags = 0;
> -
> - if (iocb->ki_flags & IOCB_NOWAIT)
> - ilock_flags |= BTRFS_ILOCK_TRY;
> -
> - ret = btrfs_inode_lock(inode, ilock_flags);
> - if (ret < 0)
> - return ret;
> -
> - ret = btrfs_write_check(iocb, i);
> - if (ret <= 0)
> - goto out;
>
> - pos = iocb->ki_pos;
Add lockdep_assert_held(&inode->i_rwsem); since __btrfs_buffered_write
does require the lock to be held.
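
A user-space analogue of this suggestion, for illustration only. lockdep is a kernel facility, so the sketch below substitutes a plain `held` flag and `assert()`; `toy_lock`, `toy_assert_held()` and the `toy_*` functions are hypothetical names, not kernel APIs, and the flag is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock that only records whether it is held (single-threaded demo). */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Rough stand-in for a "lock must be held here" assertion. */
#define toy_assert_held(l) assert((l)->held)

/* Lock-free helper: documents and enforces that the caller holds the lock. */
static size_t toy_buffered_write_nolock(struct toy_lock *l, size_t *file_size,
					size_t len)
{
	toy_assert_held(l);
	*file_size += len;
	return len;
}

/* Locked wrapper: takes the lock, calls the helper, releases the lock. */
static size_t toy_buffered_write(struct toy_lock *l, size_t *file_size,
				 size_t len)
{
	size_t ret;

	toy_lock_acquire(l);
	ret = toy_buffered_write_nolock(l, file_size, len);
	toy_lock_release(l);
	return ret;
}
```

Calling `toy_buffered_write_nolock()` without first acquiring the lock trips the assertion, which is the kind of misuse such a check is meant to catch early.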
<snip>
* Re: [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check()
2020-12-08 18:42 ` [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check() Goldwyn Rodrigues
2020-12-10 8:43 ` Nikolay Borisov
@ 2020-12-10 11:47 ` David Sterba
2020-12-10 16:10 ` Goldwyn Rodrigues
1 sibling, 1 reply; 7+ messages in thread
From: David Sterba @ 2020-12-10 11:47 UTC
To: Goldwyn Rodrigues; +Cc: linux-btrfs, Goldwyn Rodrigues, osandov
On Tue, Dec 08, 2020 at 12:42:40PM -0600, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues <rgoldwyn@suse.com>
>
> Code Cleanup.
>
> Fold generic_write_checks() in btrfs_write_check(), because
> generic_write_checks() is called before btrfs_write_check() in both
> cases. The prototype of btrfs_write_check() has been changed to return
> ssize_t, and zero is a valid, non-error return value (no write is
> required). btrfs_write_check() now returns the number of bytes to be written.
That's effectively reverting what Omar sent as a fix to your initial
patch:
https://lore.kernel.org/linux-btrfs/b096cecce8277b30e1c7e26efd0450c0bc12ff31.1605723568.git.osandov@fb.com/
fixing a problem. Now you revert that to fix another problem, now with
the lock added. I'd rather have one patch without this cleanup and given
that this is technically fixing a regression in the new 5.11 code it'll
go to post rc1 pull request.
* Re: [PATCH 1/2] btrfs: Fold generic_write_checks() in btrfs_write_check()
2020-12-10 11:47 ` David Sterba
@ 2020-12-10 16:10 ` Goldwyn Rodrigues
0 siblings, 0 replies; 7+ messages in thread
From: Goldwyn Rodrigues @ 2020-12-10 16:10 UTC
To: dsterba, linux-btrfs, osandov
On 12:47 10/12, David Sterba wrote:
> On Tue, Dec 08, 2020 at 12:42:40PM -0600, Goldwyn Rodrigues wrote:
> > From: Goldwyn Rodrigues <rgoldwyn@suse.com>
> >
> > Code Cleanup.
> >
> > Fold generic_write_checks() in btrfs_write_check(), because
> > generic_write_checks() is called before btrfs_write_check() in both
> > cases. The prototype of btrfs_write_check() has been changed to return
> > ssize_t, and zero is a valid, non-error return value (no write is
> > required). btrfs_write_check() now returns the number of bytes to be written.
>
> That's effectively reverting what Omar sent as a fix to your initial
> patch:
>
> https://lore.kernel.org/linux-btrfs/b096cecce8277b30e1c7e26efd0450c0bc12ff31.1605723568.git.osandov@fb.com/
>
> fixing a problem. Now you revert that to fix another problem, now with
> the lock added. I'd rather have one patch without this cleanup and given
> that this is technically fixing a regression in the new 5.11 code it'll
> go to post rc1 pull request.
This patchset fixes the problems mentioned there, since both count and
pos are accessed only after btrfs_write_check() is called. However, this
should be rejected because of the RWF_ENCODED work.
I will post a patch that fixes only the regression.
--
Goldwyn