From: Catalin Marinas <catalin.marinas@arm.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Chen Huang <chenhuang5@huawei.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Randy Dunlap <rdunlap@infradead.org>,
	Will Deacon <will@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	linux-mm <linux-mm@kvack.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] arm64: an infinite loop in generic_perform_write()
Date: Thu, 24 Jun 2021 16:09:11 +0100
Message-ID: <20210624150911.GA25097@arm.com>
In-Reply-To: <YNRpYli/5/GWvaTT@casper.infradead.org>

On Thu, Jun 24, 2021 at 12:15:46PM +0100, Matthew Wilcox wrote:
> On Thu, Jun 24, 2021 at 08:04:07AM +0100, Christoph Hellwig wrote:
> > On Thu, Jun 24, 2021 at 04:24:46AM +0100, Matthew Wilcox wrote:
> > > On Thu, Jun 24, 2021 at 11:10:41AM +0800, Chen Huang wrote:
> > > > In userspace, I perform such operation:
> > > > 
> > > >  	fd = open("/tmp/test", O_RDWR | O_SYNC);
> > > >         access_address = (char *)mmap(NULL, uio_size, PROT_READ, MAP_SHARED, uio_fd, 0);
> > > >         ret = write(fd, access_address + 2, sizeof(long));
> > > 
> > > ... you know that accessing this at unaligned offsets isn't going to
> > > work.  It's completely meaningless.  Why are you trying to do it?
> > 
> > We still should not cause an infinite loop in kernel space due to a
> > userspace programmer error.
> 
> They're running as root and they've mapped some device memory.  We can't
> save them from themself.  Imagine if they'd done this to the NVMe BAR.

Ignoring the MMIO case for now, I can trigger the same infinite loop
with MTE (memory tagging), something like:

	char *a;

	a = mmap(0, page_sz, PROT_READ | PROT_WRITE | PROT_MTE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	/* tag 0 is the default, set tag 1 for the next 16 bytes */
	set_tag((unsigned long)(a + 16) | (1UL << 56));

	/* uaccess to a[16] expected to fail */
	bytes = write(fd, a + 14, 8);

The iov_iter_fault_in_readable() check succeeds since a[14] has tag 0.
However, copy_from_user() attempts an unaligned 8-byte load which
fails because of the mismatched tag at a[16], so nothing is copied.
The fallback path recomputes the same length and retries, and the loop
continues indefinitely.
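
To make the sequence concrete, here is a rough user-space model of the
retry loop. It is only a sketch: struct tagged_buf, the model_*()
helpers and the retry cap are hypothetical names invented for
illustration, and the copy is assumed to be all-or-nothing, which is a
simplification of what an architecture's copy_from_user() may do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GRANULE		16	/* tag granule size in bytes */
#define BUF_SZ		64
#define MODEL_EFAULT	(-1)	/* the 1-byte probe itself faulted */
#define MODEL_LIVELOCK	(-2)	/* no progress within the retry cap */

/* Hypothetical tagged user buffer: one tag per 16-byte granule. */
struct tagged_buf {
	unsigned char data[BUF_SZ];
	unsigned char tags[BUF_SZ / GRANULE];
};

/* Models the fault-in probe: only the first byte is checked (tag 0 ok). */
static int model_fault_in_readable(const struct tagged_buf *b, size_t off)
{
	return b->tags[off / GRANULE] == 0 ? 0 : MODEL_EFAULT;
}

/* Assumed all-or-nothing copy: any mismatched granule means 0 bytes. */
static size_t model_copy_from_user(unsigned char *dst,
				   const struct tagged_buf *b,
				   size_t off, size_t len)
{
	for (size_t i = off; i < off + len; i++)
		if (b->tags[i / GRANULE] != 0)
			return 0;
	memcpy(dst, b->data + off, len);
	return len;
}

/* Returns bytes copied, MODEL_EFAULT, or MODEL_LIVELOCK. */
static int model_write_loop(const struct tagged_buf *b, size_t off,
			    size_t len, int max_retries)
{
	unsigned char dst[BUF_SZ];
	size_t bytes = len;

	assert(off + len <= BUF_SZ);
	for (int tries = 0; tries < max_retries; tries++) {
		if (model_fault_in_readable(b, off))
			return MODEL_EFAULT;
		if (model_copy_from_user(dst, b, off, bytes) > 0)
			return (int)bytes;
		/*
		 * copied == 0: with a single segment inside one page the
		 * recomputed length is the same as before, so the next
		 * iteration repeats the identical failing access.
		 */
		bytes = len;
	}
	return MODEL_LIVELOCK;
}
```

With tags[1] set to 1, model_write_loop(&b, 14, 8, ...) never gets past
the first copy no matter how large the retry cap is, while the same call
on an untagged buffer copies all 8 bytes on the first attempt.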

copy_from_user() is not required to squeeze in as much as possible. So I
think the 1-byte read per page via iov_iter_fault_in_readable() is not
sufficient to guarantee progress unless copy_from_user() also reads at
least 1 byte.

We could change raw_copy_from_user() to fall back to a 1-byte read in
case of a fault, or fix this corner case in the generic code. A quick
hack, re-attempting the access with one byte:

------------------8<-------------------------
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..67059071460c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3686,8 +3686,18 @@ ssize_t generic_perform_write(struct file *file,
 			 * because not all segments in the iov can be copied at
 			 * once without a pagefault.
 			 */
-			bytes = min_t(unsigned long, PAGE_SIZE - offset,
-						iov_iter_single_seg_count(i));
+			unsigned long single_seg_bytes =
+				min_t(unsigned long, PAGE_SIZE - offset,
+				      iov_iter_single_seg_count(i));
+
+			/*
+			 * Check for intra-page faults (arm64 MTE, SPARC ADI)
+			 * and fall back to single byte.
+			 */
+			if (bytes > single_seg_bytes)
+				bytes = single_seg_bytes;
+			else
+				bytes = 1;
 			goto again;
 		}
 		pos += copied;
------------------8<-------------------------
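
As a sanity check of the retry policy in the hack above, a small
user-space model may help. Everything here is a sketch under stated
assumptions: next_retry_len(), model_copy() and model_write() are
hypothetical names, the copy is assumed to be all-or-nothing, and only
a single segment within a single page is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define GRANULE	16	/* tag granule size in bytes */
#define NGRAN	4	/* granules in the modelled buffer */

/* Retry length: shrink to the segment length first, then to one byte. */
static size_t next_retry_len(size_t bytes, size_t single_seg_bytes)
{
	return bytes > single_seg_bytes ? single_seg_bytes : 1;
}

/* Assumed all-or-nothing copy: 0 if any granule in range has tag != 0. */
static size_t model_copy(const unsigned char *tags, size_t off, size_t len)
{
	for (size_t i = off; i < off + len; i++)
		if (tags[i / GRANULE] != 0)
			return 0;
	return len;
}

/* Returns the bytes "written" before the first inaccessible byte. */
static size_t model_write(const unsigned char *tags, size_t off, size_t count)
{
	size_t written = 0;

	assert(off + count <= (size_t)GRANULE * NGRAN);
	while (count) {
		size_t bytes = count;	/* single segment, single page */
		size_t copied;
again:
		if (tags[off / GRANULE] != 0)	/* 1-byte fault-in probe */
			break;
		copied = model_copy(tags, off, bytes);
		if (copied == 0) {
			/* the probed byte is readable, so 1 byte succeeds */
			bytes = next_retry_len(bytes, count);
			goto again;
		}
		off += copied;
		count -= copied;
		written += copied;
	}
	return written;
}
```

In this model the write(fd, a + 14, 8) example ends as a short write of
2 bytes (a[14] and a[15]) instead of spinning, because the probe at
a[16] then fails and the loop exits.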

Or a slightly different hack, trying to detect if the first segment was
crossing a page boundary:

------------------8<-------------------------
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..7d1c03f5f559 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3678,16 +3678,24 @@ ssize_t generic_perform_write(struct file *file,
 
 		iov_iter_advance(i, copied);
 		if (unlikely(copied == 0)) {
+			struct iovec v = iov_iter_iovec(i);
+
 			/*
 			 * If we were unable to copy any data at all, we must
-			 * fall back to a single segment length write.
+			 * fall back to a single segment length write or a
+			 * single byte write (for intra-page faults - arm64
+			 * MTE or SPARC ADI).
 			 *
 			 * If we didn't fallback here, we could livelock
-			 * because not all segments in the iov can be copied at
-			 * once without a pagefault.
+			 * because not all segments in the iov or data within
+			 * a segment can be copied at once without a fault.
 			 */
-			bytes = min_t(unsigned long, PAGE_SIZE - offset,
-						iov_iter_single_seg_count(i));
+			if (((unsigned long)v.iov_base & PAGE_MASK) ==
+			    ((unsigned long)(v.iov_base + bytes) & PAGE_MASK))
+				bytes = 1;
+			else
+				bytes = min_t(unsigned long, PAGE_SIZE - offset,
+					      iov_iter_single_seg_count(i));
 			goto again;
 		}
 		pos += copied;
------------------8<-------------------------
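
The page-boundary test in the second hack can be looked at in isolation
with a user-space sketch. MODEL_PAGE_SIZE, MODEL_PAGE_MASK and
ends_in_same_page() are hypothetical stand-ins (4 KiB pages assumed),
not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE	4096UL			/* assumed page size */
#define MODEL_PAGE_MASK	(~(MODEL_PAGE_SIZE - 1))

/*
 * Mirrors the comparison in the hack: true when base and base + bytes
 * fall in the same page, in which case a zero-length copy is treated as
 * an intra-page fault and the retry uses a single byte.
 */
static bool ends_in_same_page(unsigned long base, unsigned long bytes)
{
	assert(bytes > 0);
	return (base & MODEL_PAGE_MASK) ==
	       ((base + bytes) & MODEL_PAGE_MASK);
}
```

Note that base + bytes is the address one past the last byte, so a range
ending exactly at the end of a page is classified as crossing and takes
the segment-length fallback rather than the single-byte one.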

-- 
Catalin

