All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Andreas Gruenbacher <agruenba@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-s390 <linux-s390@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Buffered I/O broken on s390x with page faults disabled (gfs2)
Date: Tue, 8 Mar 2022 09:21:23 +0100	[thread overview]
Message-ID: <bcafacea-7e67-405c-a969-e5a58a3c727e@redhat.com> (raw)
In-Reply-To: <CAHk-=wgmCuuJdf96WiT6WXzQQTEeSK=cgBy24J4U9V2AvK4KdQ@mail.gmail.com>

On 08.03.22 00:18, Linus Torvalds wrote:
> On Mon, Mar 7, 2022 at 2:52 PM Andreas Gruenbacher <agruenba@redhat.com> wrote:
>>
>> After generic_file_read_iter() returns a short or empty read, we fault
>> in some pages with fault_in_iov_iter_writeable(). This succeeds, but
>> the next call to generic_file_read_iter() returns -EFAULT and we're
>> not making any progress.
> 
> Since this is s390-specific, I get the very strong feeling that the
> 
>   fault_in_iov_iter_writeable ->
>     fault_in_safe_writeable ->
>       __get_user_pages_locked ->
>         __get_user_pages
> 
> path somehow successfully finds the page, despite it not being
> properly accessible in the page tables.

As raised offline already, I suspect

shrink_active_list()
->page_referenced()
 ->page_referenced_one()
  ->ptep_clear_flush_young_notify()
   ->ptep_clear_flush_young()

which results on s390x in:

static inline pte_t pte_mkold(pte_t pte)
{
	pte_val(pte) &= ~_PAGE_YOUNG;
	pte_val(pte) |= _PAGE_INVALID;
	return pte;
}

static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
					    unsigned long addr, pte_t *ptep)
{
	pte_t pte = *ptep;

	pte = ptep_xchg_direct(vma->vm_mm, addr, ptep, pte_mkold(pte));
	return pte_young(pte);
}


_PAGE_INVALID is the actual HW bit, _PAGE_PRESENT is a
pure SW bit. AFAIU, pte_present() still holds:

static inline int pte_present(pte_t pte)
{
	/* Bit pattern: (pte & 0x001) == 0x001 */
	return (pte_val(pte) & _PAGE_PRESENT) != 0;
}


pte_mkyoung() will revert that action:

static inline pte_t pte_mkyoung(pte_t pte)
{
	pte_val(pte) |= _PAGE_YOUNG;
	if (pte_val(pte) & _PAGE_READ)
		pte_val(pte) &= ~_PAGE_INVALID;
	return pte;
}


and pte_modify() will adjust it properly again:

/*
 * The following pte modification functions only work if
 * pte_present() is true. Undefined behaviour if not..
 */
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
{
	pte_val(pte) &= _PAGE_CHG_MASK;
	pte_val(pte) |= pgprot_val(newprot);
	/*
	 * newprot for PAGE_NONE, PAGE_RO, PAGE_RX, PAGE_RW and PAGE_RWX
	 * has the invalid bit set, clear it again for readable, young pages
	 */
	if ((pte_val(pte) & _PAGE_YOUNG) && (pte_val(pte) & _PAGE_READ))
		pte_val(pte) &= ~_PAGE_INVALID;
	/*
	 * newprot for PAGE_RO, PAGE_RX, PAGE_RW and PAGE_RWX has the page
	 * protection bit set, clear it again for writable, dirty pages
	 */
	if ((pte_val(pte) & _PAGE_DIRTY) && (pte_val(pte) & _PAGE_WRITE))
		pte_val(pte) &= ~_PAGE_PROTECT;
	return pte;
}



Which leaves me wondering if there is a way in GUP whereby
we would lookup that page and not clear _PAGE_INVALID,
resulting in GUP succeeding but faults via the MMU still
faulting on _PAGE_INVALID.


-- 
Thanks,

David / dhildenb


  reply	other threads:[~2022-03-08  8:21 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-07 22:52 Buffered I/O broken on s390x with page faults disabled (gfs2) Andreas Gruenbacher
2022-03-07 23:18 ` Linus Torvalds
2022-03-08  8:21   ` David Hildenbrand [this message]
2022-03-08  8:37     ` David Hildenbrand
2022-03-08 12:11       ` David Hildenbrand
2022-03-08 12:24         ` David Hildenbrand
2022-03-08 13:20           ` Gerald Schaefer
2022-03-08 13:32             ` David Hildenbrand
2022-03-08 14:14               ` Gerald Schaefer
2022-03-08 17:23           ` David Hildenbrand
2022-03-08 17:26     ` Linus Torvalds
2022-03-08 17:40       ` Linus Torvalds
2022-03-08 19:27         ` Linus Torvalds
2022-03-08 20:03           ` Linus Torvalds
2022-03-08 23:24             ` Andreas Gruenbacher
2022-03-09  0:22               ` Linus Torvalds
2022-03-09 18:42                 ` Andreas Gruenbacher
2022-03-09 19:08                   ` Linus Torvalds
2022-03-09 20:57                     ` Andreas Gruenbacher
2022-03-09 21:08                     ` Andreas Gruenbacher
2022-03-10 12:13                       ` Filipe Manana
2022-03-09 19:21                   ` Linus Torvalds
2022-03-09 19:35                     ` Andreas Gruenbacher
2022-03-09 20:18                       ` Linus Torvalds
2022-03-09 20:36                         ` Andreas Gruenbacher
2022-03-09 20:48                           ` Linus Torvalds
2022-03-09 20:54                             ` Linus Torvalds
2022-03-10 17:13           ` David Hildenbrand
2022-03-10 18:00             ` Andreas Gruenbacher
2022-03-10 18:35             ` Linus Torvalds
2022-03-10 18:38               ` David Hildenbrand
2022-03-10 18:47               ` Andreas Gruenbacher
2022-03-10 19:22                 ` Linus Torvalds
2022-03-10 19:56                   ` Linus Torvalds
2022-03-10 20:23                     ` Andreas Gruenbacher
2022-03-08 17:47       ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcafacea-7e67-405c-a969-e5a58a3c727e@redhat.com \
    --to=david@redhat.com \
    --cc=agruenba@redhat.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.