* clean up kernel_{read,write} & friends v2
@ 2020-05-13  6:56 Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 01/14] cachefiles: switch to kernel_write Christoph Hellwig
                   ` (16 more replies)
  0 siblings, 17 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Hi Al,
this series fixes a few issues and cleans up the helpers that read from
or write to kernel space buffers, and ensures that we don't change the
address limit if we are using the ->read_iter and ->write_iter methods
that don't need the changed address limit.
Changes since v1:
 - __kernel_write must not take sb_writers
 - unexported __kernel_write
^ permalink raw reply	[flat|nested] 25+ messages in thread
* [PATCH 01/14] cachefiles: switch to kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 02/14] autofs: " Christoph Hellwig
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
__kernel_write doesn't take a sb_writers references, which we need here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/cachefiles/rdwr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 1dc97f2d62013..c0b99ccd9a79f 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -937,7 +937,7 @@ int cachefiles_write_page(struct fscache_storage *op, struct page *page)
 	}
 
 	data = kmap(page);
-	ret = __kernel_write(file, data, len, &pos);
+	ret = kernel_write(file, data, len, &pos);
 	kunmap(page);
 	fput(file);
 	if (ret != len)
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 02/14] autofs: switch to kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 01/14] cachefiles: switch to kernel_write Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-14  8:25   ` Ian Kent
  2020-05-13  6:56 ` [PATCH 03/14] bpfilter: " Christoph Hellwig
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
While pipes don't really need sb_writers projection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/autofs/waitq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
index b04c528b19d34..74c886f7c51cb 100644
--- a/fs/autofs/waitq.c
+++ b/fs/autofs/waitq.c
@@ -53,7 +53,7 @@ static int autofs_write(struct autofs_sb_info *sbi,
 
 	mutex_lock(&sbi->pipe_mutex);
 	while (bytes) {
-		wr = __kernel_write(file, data, bytes, &file->f_pos);
+		wr = kernel_write(file, data, bytes, &file->f_pos);
 		if (wr <= 0)
 			break;
 		data += wr;
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 03/14] bpfilter: switch to kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 01/14] cachefiles: switch to kernel_write Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 02/14] autofs: " Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 04/14] fs: unexport __kernel_write Christoph Hellwig
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
While pipes don't really need sb_writers projection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 net/bpfilter/bpfilter_kern.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index c0f0990f30b60..1905e01c3aa9a 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -50,7 +50,7 @@ static int __bpfilter_process_sockopt(struct sock *sk, int optname,
 	req.len = optlen;
 	if (!bpfilter_ops.info.pid)
 		goto out;
-	n = __kernel_write(bpfilter_ops.info.pipe_to_umh, &req, sizeof(req),
+	n = kernel_write(bpfilter_ops.info.pipe_to_umh, &req, sizeof(req),
 			   &pos);
 	if (n != sizeof(req)) {
 		pr_err("write fail %zd\n", n);
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 04/14] fs: unexport __kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 03/14] bpfilter: " Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 05/14] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
This is a very special interface that skips sb_writes protection, and not
used by modules anymore.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 1 -
 1 file changed, 1 deletion(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index bbfa9b12b15eb..2c601d853ff3d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -522,7 +522,6 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	inc_syscw(current);
 	return ret;
 }
-EXPORT_SYMBOL(__kernel_write);
 
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 05/14] fs: check FMODE_WRITE in __kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 04/14] fs: unexport __kernel_write Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 06/14] fs: remove the call_{read,write}_iter functions Christoph Hellwig
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
We still need to check if the fѕ is open write, even for the low-level
helper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/fs/read_write.c b/fs/read_write.c
index 2c601d853ff3d..76be155ad9824 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -505,6 +505,8 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	const char __user *p;
 	ssize_t ret;
 
+	if (!(file->f_mode & FMODE_WRITE))
+		return -EBADF;
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 06/14] fs: remove the call_{read,write}_iter functions
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 05/14] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 07/14] fs: implement kernel_write using __kernel_write Christoph Hellwig
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Just open coding the methods calls is a lot easier to follow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c              |  4 ++--
 drivers/target/target_core_file.c |  4 ++--
 fs/aio.c                          |  4 ++--
 fs/io_uring.c                     |  4 ++--
 fs/read_write.c                   | 12 ++++++------
 fs/splice.c                       |  2 +-
 include/linux/fs.h                | 12 ------------
 7 files changed, 15 insertions(+), 27 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index da693e6a834e5..ad167050a4ec4 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -572,9 +572,9 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
 		kthread_associate_blkcg(cmd->css);
 
 	if (rw == WRITE)
-		ret = call_write_iter(file, &cmd->iocb, &iter);
+		ret = file->f_op->write_iter(&cmd->iocb, &iter);
 	else
-		ret = call_read_iter(file, &cmd->iocb, &iter);
+		ret = file->f_op->read_iter(&cmd->iocb, &iter);
 
 	lo_rw_aio_do_completion(cmd);
 	kthread_associate_blkcg(NULL);
diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index 7143d03f0e027..79f0707877917 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -303,9 +303,9 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 		aio_cmd->iocb.ki_flags |= IOCB_DSYNC;
 
 	if (is_write)
-		ret = call_write_iter(file, &aio_cmd->iocb, &iter);
+		ret = file->f_op->write_iter(&aio_cmd->iocb, &iter);
 	else
-		ret = call_read_iter(file, &aio_cmd->iocb, &iter);
+		ret = file->f_op->read_iter(&aio_cmd->iocb, &iter);
 
 	kfree(bvec);
 
diff --git a/fs/aio.c b/fs/aio.c
index 5f3d3d8149287..1ccc0efdc357d 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1540,7 +1540,7 @@ static int aio_read(struct kiocb *req, const struct iocb *iocb,
 		return ret;
 	ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret)
-		aio_rw_done(req, call_read_iter(file, req, &iter));
+		aio_rw_done(req, file->f_op->read_iter(req, &iter));
 	kfree(iovec);
 	return ret;
 }
@@ -1580,7 +1580,7 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
 			__sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
 		}
 		req->ki_flags |= IOCB_WRITE;
-		aio_rw_done(req, call_write_iter(file, req, &iter));
+		aio_rw_done(req, file->f_op->write_iter(req, &iter));
 	}
 	kfree(iovec);
 	return ret;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 979d9f977409a..59514051383e4 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2582,7 +2582,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
 		ssize_t ret2;
 
 		if (req->file->f_op->read_iter)
-			ret2 = call_read_iter(req->file, kiocb, &iter);
+			ret2 = req->file->f_op->read_iter(kiocb, &iter);
 		else
 			ret2 = loop_rw_iter(READ, req->file, kiocb, &iter);
 
@@ -2697,7 +2697,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock)
 			current->signal->rlim[RLIMIT_FSIZE].rlim_cur = req->fsize;
 
 		if (req->file->f_op->write_iter)
-			ret2 = call_write_iter(req->file, kiocb, &iter);
+			ret2 = req->file->f_op->write_iter(kiocb, &iter);
 		else
 			ret2 = loop_rw_iter(WRITE, req->file, kiocb, &iter);
 
diff --git a/fs/read_write.c b/fs/read_write.c
index 76be155ad9824..f0768313ea010 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -412,7 +412,7 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	kiocb.ki_pos = (ppos ? *ppos : 0);
 	iov_iter_init(&iter, READ, &iov, 1, len);
 
-	ret = call_read_iter(filp, &kiocb, &iter);
+	ret = filp->f_op->read_iter(&kiocb, &iter);
 	BUG_ON(ret == -EIOCBQUEUED);
 	if (ppos)
 		*ppos = kiocb.ki_pos;
@@ -481,7 +481,7 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 	kiocb.ki_pos = (ppos ? *ppos : 0);
 	iov_iter_init(&iter, WRITE, &iov, 1, len);
 
-	ret = call_write_iter(filp, &kiocb, &iter);
+	ret = filp->f_op->write_iter(&kiocb, &iter);
 	BUG_ON(ret == -EIOCBQUEUED);
 	if (ret > 0 && ppos)
 		*ppos = kiocb.ki_pos;
@@ -690,9 +690,9 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
 	kiocb.ki_pos = (ppos ? *ppos : 0);
 
 	if (type == READ)
-		ret = call_read_iter(filp, &kiocb, iter);
+		ret = filp->f_op->read_iter(&kiocb, iter);
 	else
-		ret = call_write_iter(filp, &kiocb, iter);
+		ret = filp->f_op->write_iter(&kiocb, iter);
 	BUG_ON(ret == -EIOCBQUEUED);
 	if (ppos)
 		*ppos = kiocb.ki_pos;
@@ -961,7 +961,7 @@ ssize_t vfs_iocb_iter_read(struct file *file, struct kiocb *iocb,
 	if (ret < 0)
 		return ret;
 
-	ret = call_read_iter(file, iocb, iter);
+	ret = file->f_op->read_iter(iocb, iter);
 out:
 	if (ret >= 0)
 		fsnotify_access(file);
@@ -1025,7 +1025,7 @@ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
 	if (ret < 0)
 		return ret;
 
-	ret = call_write_iter(file, iocb, iter);
+	ret = file->f_op->write_iter(iocb, iter);
 	if (ret > 0)
 		fsnotify_modify(file);
 
diff --git a/fs/splice.c b/fs/splice.c
index fd0a1e7e5959a..a59d4fadf27fe 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -310,7 +310,7 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
 	i_head = to.head;
 	init_sync_kiocb(&kiocb, in);
 	kiocb.ki_pos = *ppos;
-	ret = call_read_iter(in, &kiocb, &to);
+	ret = in->f_op->read_iter(&kiocb, &to);
 	if (ret > 0) {
 		*ppos = kiocb.ki_pos;
 		file_accessed(in);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 45cc10cdf6ddd..21f126957c2cf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1895,18 +1895,6 @@ struct inode_operations {
 	int (*set_acl)(struct inode *, struct posix_acl *, int);
 } ____cacheline_aligned;
 
-static inline ssize_t call_read_iter(struct file *file, struct kiocb *kio,
-				     struct iov_iter *iter)
-{
-	return file->f_op->read_iter(kio, iter);
-}
-
-static inline ssize_t call_write_iter(struct file *file, struct kiocb *kio,
-				      struct iov_iter *iter)
-{
-	return file->f_op->write_iter(kio, iter);
-}
-
 static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	return file->f_op->mmap(file, vma);
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 07/14] fs: implement kernel_write using __kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (5 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 06/14] fs: remove the call_{read,write}_iter functions Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 08/14] fs: remove __vfs_write Christoph Hellwig
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Consolidate the two in-kernel write helpers to make upcoming changes
easier.  The only difference are the missing call to rw_verify_area
in kernel_write, and an access_ok check that doesn't make sense for
kernel buffers to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index f0768313ea010..abb84391cfbc5 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -499,6 +499,7 @@ static ssize_t __vfs_write(struct file *file, const char __user *p,
 		return -EINVAL;
 }
 
+/* caller is responsible for file_start_write/file_end_write */
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
@@ -528,16 +529,16 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t res;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_write(file, (__force const char __user *)buf, count, pos);
-	set_fs(old_fs);
+	ret = rw_verify_area(WRITE, file, pos, count);
+	if (ret)
+		return ret;
 
-	return res;
+	file_start_write(file);
+	ret =  __kernel_write(file, buf, count, pos);
+	file_end_write(file);
+	return ret;
 }
 EXPORT_SYMBOL(kernel_write);
 
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 08/14] fs: remove __vfs_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (6 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 07/14] fs: implement kernel_write using __kernel_write Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Fold it into the two callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index abb84391cfbc5..3bcb084f160de 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -488,17 +488,6 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 	return ret;
 }
 
-static ssize_t __vfs_write(struct file *file, const char __user *p,
-			   size_t count, loff_t *pos)
-{
-	if (file->f_op->write)
-		return file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		return new_sync_write(file, p, count, pos);
-	else
-		return -EINVAL;
-}
-
 /* caller is responsible for file_start_write/file_end_write */
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
@@ -516,7 +505,12 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	ret = __vfs_write(file, p, count, pos);
+	if (file->f_op->write)
+		ret = file->f_op->write(file, p, count, pos);
+	else if (file->f_op->write_iter)
+		ret = new_sync_write(file, p, count, pos);
+	else
+		ret = -EINVAL;
 	set_fs(old_fs);
 	if (ret > 0) {
 		fsnotify_modify(file);
@@ -554,19 +548,23 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
 		return -EFAULT;
 
 	ret = rw_verify_area(WRITE, file, pos, count);
-	if (!ret) {
-		if (count > MAX_RW_COUNT)
-			count =  MAX_RW_COUNT;
-		file_start_write(file);
-		ret = __vfs_write(file, buf, count, pos);
-		if (ret > 0) {
-			fsnotify_modify(file);
-			add_wchar(current, ret);
-		}
-		inc_syscw(current);
-		file_end_write(file);
+	if (ret)
+		return ret;
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
+	file_start_write(file);
+	if (file->f_op->write)
+		ret = file->f_op->write(file, buf, count, pos);
+	else if (file->f_op->write_iter)
+		ret = new_sync_write(file, buf, count, pos);
+	else
+		ret = -EINVAL;
+	if (ret > 0) {
+		fsnotify_modify(file);
+		add_wchar(current, ret);
 	}
-
+	inc_syscw(current);
+	file_end_write(file);
 	return ret;
 }
 
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (7 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 08/14] fs: remove __vfs_write Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 10/14] fs: add a __kernel_read helper Christoph Hellwig
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
If we write to a file that implements ->write_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->write with a changed address
limit if available.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 3bcb084f160de..8cfca5f8fc3ce 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -489,10 +489,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 }
 
 /* caller is responsible for file_start_write/file_end_write */
-ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
+ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
+		loff_t *pos)
 {
-	mm_segment_t old_fs;
-	const char __user *p;
 	ssize_t ret;
 
 	if (!(file->f_mode & FMODE_WRITE))
@@ -500,18 +499,29 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write)
-		ret = file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		ret = new_sync_write(file, p, count, pos);
-	else
+	if (file->f_op->write_iter) {
+		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, WRITE, &iov, 1, count);
+		ret = file->f_op->write_iter(&kiocb, &iter);
+		if (ret > 0)
+			*pos = kiocb.ki_pos;
+	} else if (file->f_op->write) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
+		ret = file->f_op->write(file, (__force const char __user *)buf,
+				count, pos);
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_modify(file);
 		add_wchar(current, ret);
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 10/14] fs: add a __kernel_read helper
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (8 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 11/14] integrity/ima: switch to using __kernel_read Christoph Hellwig
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
This is the counterpart to __kernel_write, and skip the rw_verify_area
call compared to kernel_read.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    | 21 +++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 22 insertions(+)
diff --git a/fs/read_write.c b/fs/read_write.c
index 8cfca5f8fc3ce..bd12af8a895c8 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -430,6 +430,27 @@ ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
 		return -EINVAL;
 }
 
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
+{
+	mm_segment_t old_fs = get_fs();
+	ssize_t ret;
+
+	if (!(file->f_mode & FMODE_CAN_READ))
+		return -EINVAL;
+
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
+	set_fs(KERNEL_DS);
+	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	set_fs(old_fs);
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
+	return ret;
+}
+
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21f126957c2cf..6441aaa25f8f2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3011,6 +3011,7 @@ extern int kernel_read_file_from_path_initns(const char *, void **, loff_t *, lo
 extern int kernel_read_file_from_fd(int, void **, loff_t *, loff_t,
 				    enum kernel_read_file_id);
 extern ssize_t kernel_read(struct file *, void *, size_t, loff_t *);
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos);
 extern ssize_t kernel_write(struct file *, const void *, size_t, loff_t *);
 extern ssize_t __kernel_write(struct file *, const void *, size_t, loff_t *);
 extern struct file * open_exec(const char *);
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 11/14] integrity/ima: switch to using __kernel_read
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (9 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 10/14] fs: add a __kernel_read helper Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 12/14] fs: implement kernel_read " Christoph Hellwig
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
__kernel_read has a bunch of additional sanity checks, and this moves
the set_fs out of non-core code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 security/integrity/iint.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index e12c4900510f6..1d20003243c3f 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -188,19 +188,7 @@ DEFINE_LSM(integrity) = {
 int integrity_kernel_read(struct file *file, loff_t offset,
 			  void *addr, unsigned long count)
 {
-	mm_segment_t old_fs;
-	char __user *buf = (char __user *)addr;
-	ssize_t ret;
-
-	if (!(file->f_mode & FMODE_READ))
-		return -EBADF;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, buf, count, &offset);
-	set_fs(old_fs);
-
-	return ret;
+	return __kernel_read(file, addr, count, &offset);
 }
 
 /*
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 12/14] fs: implement kernel_read using __kernel_read
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (10 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 11/14] integrity/ima: switch to using __kernel_read Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 13/14] fs: remove __vfs_read Christoph Hellwig
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Consolidate the two in-kernel read helpers to make upcoming changes
easier.  The only difference are the missing call to rw_verify_area
in kernel_read, and an access_ok check that doesn't make sense for
kernel buffers to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index bd12af8a895c8..4e19152a7efe0 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -453,15 +453,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t result;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	result = vfs_read(file, (void __user *)buf, count, pos);
-	set_fs(old_fs);
-	return result;
+	ret = rw_verify_area(READ, file, pos, count);
+	if (ret)
+		return ret;
+	return __kernel_read(file, buf, count, pos);
 }
 EXPORT_SYMBOL(kernel_read);
 
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 13/14] fs: remove __vfs_read
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (11 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 12/14] fs: implement kernel_read " Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-13  6:56 ` [PATCH 14/14] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Fold it into the two callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    | 43 +++++++++++++++++++++----------------------
 include/linux/fs.h |  1 -
 2 files changed, 21 insertions(+), 23 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 4e19152a7efe0..46ddfce17e839 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -419,17 +419,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	return ret;
 }
 
-ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
-		   loff_t *pos)
-{
-	if (file->f_op->read)
-		return file->f_op->read(file, buf, count, pos);
-	else if (file->f_op->read_iter)
-		return new_sync_read(file, buf, count, pos);
-	else
-		return -EINVAL;
-}
-
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs = get_fs();
@@ -441,7 +430,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
 	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	if (file->f_op->read)
+		ret = file->f_op->read(file, (void __user *)buf, count, pos);
+	else if (file->f_op->read_iter)
+		ret = new_sync_read(file, (void __user *)buf, count, pos);
+	else
+		ret = -EINVAL;
 	set_fs(old_fs);
 	if (ret > 0) {
 		fsnotify_access(file);
@@ -474,17 +468,22 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
 		return -EFAULT;
 
 	ret = rw_verify_area(READ, file, pos, count);
-	if (!ret) {
-		if (count > MAX_RW_COUNT)
-			count =  MAX_RW_COUNT;
-		ret = __vfs_read(file, buf, count, pos);
-		if (ret > 0) {
-			fsnotify_access(file);
-			add_rchar(current, ret);
-		}
-		inc_syscr(current);
-	}
+	if (ret)
+		return ret;
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
 
+	if (file->f_op->read)
+		ret = file->f_op->read(file, buf, count, pos);
+	else if (file->f_op->read_iter)
+		ret = new_sync_read(file, buf, count, pos);
+	else
+		ret = -EINVAL;
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
 	return ret;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6441aaa25f8f2..4c10a07a36178 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1905,7 +1905,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
 			      struct iovec *fast_pointer,
 			      struct iovec **ret_pointer);
 
-extern ssize_t __vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* [PATCH 14/14] fs: don't change the address limit for ->read_iter in __kernel_read
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (12 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 13/14] fs: remove __vfs_read Christoph Hellwig
@ 2020-05-13  6:56 ` Christoph Hellwig
  2020-05-15 15:21 ` [PATCH 01/14] cachefiles: switch to kernel_write David Howells
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-13  6:56 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
If we read to a file that implements ->read_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->read with a changed address
limit if available.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 46ddfce17e839..c93acbd8bf5a3 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -421,7 +421,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs = get_fs();
 	ssize_t ret;
 
 	if (!(file->f_mode & FMODE_CAN_READ))
@@ -429,14 +428,25 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	set_fs(KERNEL_DS);
-	if (file->f_op->read)
+	if (file->f_op->read_iter) {
+		struct kvec iov = { .iov_base = buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, READ, &iov, 1, count);
+		ret = file->f_op->read_iter(&kiocb, &iter);
+		*pos = kiocb.ki_pos;
+	} else if (file->f_op->read) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
 		ret = file->f_op->read(file, (void __user *)buf, count, pos);
-	else if (file->f_op->read_iter)
-		ret = new_sync_read(file, (void __user *)buf, count, pos);
-	else
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_access(file);
 		add_rchar(current, ret);
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* Re: [PATCH 02/14] autofs: switch to kernel_write
  2020-05-13  6:56 ` [PATCH 02/14] autofs: " Christoph Hellwig
@ 2020-05-14  8:25   ` Ian Kent
  0 siblings, 0 replies; 25+ messages in thread
From: Ian Kent @ 2020-05-14  8:25 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro
  Cc: David Howells, linux-kernel, linux-fsdevel, linux-security-module,
	netfilter-devel
On Wed, 2020-05-13 at 08:56 +0200, Christoph Hellwig wrote:
> While pipes don't really need sb_writers projection, __kernel_write
> is an
> interface better kept private, and the additional rw_verify_area does
> not
> hurt here.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Right, should be fine AFAICS.
Acked-by: Ian Kent <raven@themaw.net>
> ---
>  fs/autofs/waitq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
> index b04c528b19d34..74c886f7c51cb 100644
> --- a/fs/autofs/waitq.c
> +++ b/fs/autofs/waitq.c
> @@ -53,7 +53,7 @@ static int autofs_write(struct autofs_sb_info *sbi,
>  
>  	mutex_lock(&sbi->pipe_mutex);
>  	while (bytes) {
> -		wr = __kernel_write(file, data, bytes, &file->f_pos);
> +		wr = kernel_write(file, data, bytes, &file->f_pos);
>  		if (wr <= 0)
>  			break;
>  		data += wr;
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 01/14] cachefiles: switch to kernel_write
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (13 preceding siblings ...)
  2020-05-13  6:56 ` [PATCH 14/14] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
@ 2020-05-15 15:21 ` David Howells
       [not found] ` <20200516030436.19448-1-hdanton@sina.com>
  2020-05-20 15:59 ` clean up kernel_{read,write} & friends v2 Christoph Hellwig
  16 siblings, 0 replies; 25+ messages in thread
From: David Howells @ 2020-05-15 15:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: dhowells, Al Viro, Ian Kent, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
Christoph Hellwig <hch@lst.de> wrote:
> __kernel_write doesn't take a sb_writers references, which we need here.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Howells <dhowells@redhat.com>
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
       [not found] ` <20200516030436.19448-1-hdanton@sina.com>
@ 2020-05-18  6:42   ` Christoph Hellwig
  0 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-18  6:42 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Christoph Hellwig, Al Viro, Ian Kent, David Howells, linux-kernel,
	linux-fsdevel, linux-security-module, netfilter-devel
On Sat, May 16, 2020 at 11:04:36AM +0800, Hillf Danton wrote:
> > +	if (file->f_op->write_iter) {
> > +		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
> > +		struct kiocb kiocb;
> > +		struct iov_iter iter;
> > +
> > +		init_sync_kiocb(&kiocb, file);
> > +		kiocb.ki_pos = *pos;
> > +		iov_iter_kvec(&iter, WRITE, &iov, 1, count);
> > +		ret = file->f_op->write_iter(&kiocb, &iter);
> > +		if (ret > 0)
> > +			*pos = kiocb.ki_pos;
> > +	} else if (file->f_op->write) {
> > +		mm_segment_t old_fs = get_fs();
> > +
> > +		set_fs(KERNEL_DS);
> 
> Would you please shed light on who need it if a workqueue worker does
> not, given the access to buf? 
Can you rephrase the question, I unfortunately do not understand it at
all as-is.
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: clean up kernel_{read,write} & friends v2
  2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
                   ` (15 preceding siblings ...)
       [not found] ` <20200516030436.19448-1-hdanton@sina.com>
@ 2020-05-20 15:59 ` Christoph Hellwig
  16 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-20 15:59 UTC (permalink / raw)
  To: Al Viro
  Cc: Ian Kent, David Howells, linux-kernel, linux-fsdevel,
	linux-security-module, netfilter-devel
ping?
On Wed, May 13, 2020 at 08:56:42AM +0200, Christoph Hellwig wrote:
> Hi Al,
> 
> this series fixes a few issues and cleans up the helpers that read from
> or write to kernel space buffers, and ensures that we don't change the
> address limit if we are using the ->read_iter and ->write_iter methods
> that don't need the changed address limit.
> 
> Changes since v1:
>  - __kernel_write must not take sb_writers
>  - unexported __kernel_write
---end quoted text---
^ permalink raw reply	[flat|nested] 25+ messages in thread
* [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-28  5:40 Christoph Hellwig
@ 2020-05-28  5:40 ` Christoph Hellwig
  2020-05-28 18:43   ` Linus Torvalds
  2020-05-28 19:00   ` Al Viro
  0 siblings, 2 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-28  5:40 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Ian Kent, David Howells, linux-kernel,
	linux-fsdevel, linux-security-module, netfilter-devel
If we write to a file that implements ->write_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->write with a changed address
limit if available.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 3bcb084f160de..8cfca5f8fc3ce 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -489,10 +489,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 }
 
 /* caller is responsible for file_start_write/file_end_write */
-ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
+ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
+		loff_t *pos)
 {
-	mm_segment_t old_fs;
-	const char __user *p;
 	ssize_t ret;
 
 	if (!(file->f_mode & FMODE_WRITE))
@@ -500,18 +499,29 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write)
-		ret = file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		ret = new_sync_write(file, p, count, pos);
-	else
+	if (file->f_op->write_iter) {
+		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, WRITE, &iov, 1, count);
+		ret = file->f_op->write_iter(&kiocb, &iter);
+		if (ret > 0)
+			*pos = kiocb.ki_pos;
+	} else if (file->f_op->write) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
+		ret = file->f_op->write(file, (__force const char __user *)buf,
+				count, pos);
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_modify(file);
 		add_wchar(current, ret);
-- 
2.26.2
^ permalink raw reply related	[flat|nested] 25+ messages in thread
* Re: [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-28  5:40 ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
@ 2020-05-28 18:43   ` Linus Torvalds
  2020-05-29 12:32     ` Christoph Hellwig
  2020-05-28 19:00   ` Al Viro
  1 sibling, 1 reply; 25+ messages in thread
From: Linus Torvalds @ 2020-05-28 18:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Ian Kent, David Howells, Linux Kernel Mailing List,
	linux-fsdevel, LSM List, NetFilter
On Wed, May 27, 2020 at 10:41 PM Christoph Hellwig <hch@lst.de> wrote:
>
> -ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
> +ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
> +               loff_t *pos)
Please don't do these kinds of pointless whitespace changes.
If you have an actual 80x25 vt100 sitting in a corner, it's not really
conducive to kernel development any more.
Yes, yes, we'd like to have shorter lines for new code, but no, don't
do silly line breaks that just makes old code look and grep worse.
             Linus
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-28  5:40 ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
  2020-05-28 18:43   ` Linus Torvalds
@ 2020-05-28 19:00   ` Al Viro
  2020-05-29  5:57     ` Christoph Hellwig
  1 sibling, 1 reply; 25+ messages in thread
From: Al Viro @ 2020-05-28 19:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linus Torvalds, Ian Kent, David Howells, linux-kernel,
	linux-fsdevel, linux-security-module, netfilter-devel
On Thu, May 28, 2020 at 07:40:38AM +0200, Christoph Hellwig wrote:
> If we write to a file that implements ->write_iter there is no need
> to change the address limit if we send a kvec down.  Implement that
> case, and prefer it over using plain ->write with a changed address
> limit if available.
Umm...  It needs a comment along the lines of "weird shits like
/dev/sg that currently check for uaccess_kernel() will just
have to make sure they never switch to ->write_iter()"
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-28 19:00   ` Al Viro
@ 2020-05-29  5:57     ` Christoph Hellwig
  2020-05-29 13:37       ` Christoph Hellwig
  0 siblings, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-29  5:57 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Ian Kent, David Howells,
	linux-kernel, linux-fsdevel, linux-security-module,
	netfilter-devel
On Thu, May 28, 2020 at 08:00:52PM +0100, Al Viro wrote:
> On Thu, May 28, 2020 at 07:40:38AM +0200, Christoph Hellwig wrote:
> > If we write to a file that implements ->write_iter there is no need
> > to change the address limit if we send a kvec down.  Implement that
> > case, and prefer it over using plain ->write with a changed address
> > limit if available.
> 
> Umm...  It needs a comment along the lines of "weird shits like
> /dev/sg that currently check for uaccess_kernel() will just
> have to make sure they never switch to ->write_iter()"
sg and hid has the uaccess_kernel because it accesses userspace memory not
in the range passed to it.  Something using write_iter/read_iter should
never access any memory outside the iter passed to.  rdma has it because
it uses write as a bidirectional interface, which obviously can't work at
all with an iter.  So I'm not sure what we should comment on, but if
you have a desire and a proposal for a comment I'll happily add it.
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-28 18:43   ` Linus Torvalds
@ 2020-05-29 12:32     ` Christoph Hellwig
  0 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-29 12:32 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Al Viro, Ian Kent, David Howells,
	Linux Kernel Mailing List, linux-fsdevel, LSM List, NetFilter
On Thu, May 28, 2020 at 11:43:13AM -0700, Linus Torvalds wrote:
> On Wed, May 27, 2020 at 10:41 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > -ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
> > +ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
> > +               loff_t *pos)
> 
> Please don't do these kinds of pointless whitespace changes.
> 
> If you have an actual 80x25 vt100 sitting in a corner, it's not really
> conducive to kernel development any more.
I have real 80x25 xterms, as that allows me to comfortably fit 4 of
them onto my latop screen.
^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-29  5:57     ` Christoph Hellwig
@ 2020-05-29 13:37       ` Christoph Hellwig
  0 siblings, 0 replies; 25+ messages in thread
From: Christoph Hellwig @ 2020-05-29 13:37 UTC (permalink / raw)
  To: Al Viro
  Cc: Christoph Hellwig, Linus Torvalds, Ian Kent, David Howells,
	linux-kernel, linux-fsdevel, linux-security-module,
	netfilter-devel
On Fri, May 29, 2020 at 07:57:36AM +0200, Christoph Hellwig wrote:
> On Thu, May 28, 2020 at 08:00:52PM +0100, Al Viro wrote:
> > On Thu, May 28, 2020 at 07:40:38AM +0200, Christoph Hellwig wrote:
> > > If we write to a file that implements ->write_iter there is no need
> > > to change the address limit if we send a kvec down.  Implement that
> > > case, and prefer it over using plain ->write with a changed address
> > > limit if available.
> > 
> > Umm...  It needs a comment along the lines of "weird shits like
> > /dev/sg that currently check for uaccess_kernel() will just
> > have to make sure they never switch to ->write_iter()"
> 
> sg and hid has the uaccess_kernel because it accesses userspace memory not
> in the range passed to it.  Something using write_iter/read_iter should
> never access any memory outside the iter passed to.  rdma has it because
> it uses write as a bidirectional interface, which obviously can't work at
> all with an iter.  So I'm not sure what we should comment on, but if
> you have a desire and a proposal for a comment I'll happily add it.
And looking over all three again they actually comment why they
check uaccess_kernel.  More importantly if someone switched them to
the ->write_iter carelessly that means the uaccess outside of the range
would actually aways fail now as we didn't allow access to userspace
memory, so this should show up when testing instantly.
^ permalink raw reply	[flat|nested] 25+ messages in thread
end of thread, other threads:[~2020-05-29 13:37 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-05-13  6:56 clean up kernel_{read,write} & friends v2 Christoph Hellwig
2020-05-13  6:56 ` [PATCH 01/14] cachefiles: switch to kernel_write Christoph Hellwig
2020-05-13  6:56 ` [PATCH 02/14] autofs: " Christoph Hellwig
2020-05-14  8:25   ` Ian Kent
2020-05-13  6:56 ` [PATCH 03/14] bpfilter: " Christoph Hellwig
2020-05-13  6:56 ` [PATCH 04/14] fs: unexport __kernel_write Christoph Hellwig
2020-05-13  6:56 ` [PATCH 05/14] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
2020-05-13  6:56 ` [PATCH 06/14] fs: remove the call_{read,write}_iter functions Christoph Hellwig
2020-05-13  6:56 ` [PATCH 07/14] fs: implement kernel_write using __kernel_write Christoph Hellwig
2020-05-13  6:56 ` [PATCH 08/14] fs: remove __vfs_write Christoph Hellwig
2020-05-13  6:56 ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
2020-05-13  6:56 ` [PATCH 10/14] fs: add a __kernel_read helper Christoph Hellwig
2020-05-13  6:56 ` [PATCH 11/14] integrity/ima: switch to using __kernel_read Christoph Hellwig
2020-05-13  6:56 ` [PATCH 12/14] fs: implement kernel_read " Christoph Hellwig
2020-05-13  6:56 ` [PATCH 13/14] fs: remove __vfs_read Christoph Hellwig
2020-05-13  6:56 ` [PATCH 14/14] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
2020-05-15 15:21 ` [PATCH 01/14] cachefiles: switch to kernel_write David Howells
     [not found] ` <20200516030436.19448-1-hdanton@sina.com>
2020-05-18  6:42   ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
2020-05-20 15:59 ` clean up kernel_{read,write} & friends v2 Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2020-05-28  5:40 Christoph Hellwig
2020-05-28  5:40 ` [PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
2020-05-28 18:43   ` Linus Torvalds
2020-05-29 12:32     ` Christoph Hellwig
2020-05-28 19:00   ` Al Viro
2020-05-29  5:57     ` Christoph Hellwig
2020-05-29 13:37       ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).