From: Izik Eidus <ieidus-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Andrea Arcangeli <aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Greg KH <greg-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>,
KOSAKI Motohiro
<kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
KAMEZAWA Hiroyuki
<kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Nick Piggin <npiggin-l3A5Bk7waGM@public.gmane.org>,
Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org>
Subject: Re: open(2) says O_DIRECT works on 512 byte boundries?
Date: Sat, 07 Feb 2009 15:32:03 +0200 [thread overview]
Message-ID: <498D8D53.6030007@redhat.com> (raw)
In-Reply-To: <20090206175414.GQ14011-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
Andrea Arcangeli wrote:
>
> Here the latest patch. Now that I consider the 'production' trouble
> closed I'll be porting it to mainline while addressing gup-fast too
> which is an additional complication I didn't have to take care of
> yet. So expect a patch that works for you in the next few days, either
> that or complains about an unfixable gup-fast ;). But frankly I've
> been thinking it should be possible in a much simpler way that I
> ever thought before, by entirely relaying on the tlb flush.
>
> In short if I simply do the page-count vs page-mapcount check _after_
> ptep_set_wrprotect (which implies a tlb flush and after that any
> gup-fast running in other cpus should be forced through the slow path
> and block) I think I'm done. The code now does:
>
> check mapcount
> ptep_set_wrprotect
>
> I think to make the thing working with gup-fast I've only to
> ptep_set_wrprotect before the mapcount check.
>
> The reason why the normal pagetable walking of the cpu is ok with
> current code is that ptep_set_wrprotect done later will block any
> access to the page from the other cpus. Not so if it was gup-fast
> taking reference on the page. So we need to stop with
> ptep_set_wrprotect any subsequential gup-fast _before_ we check count
> vs mapcount and the fact the get_page is run inside the critical
> section with local irq disabled in gup-fast should close the race for
> gup-fast too. Will try that ASAP...
>
>
Hi Andrea,
Good approach, but I think that it isn't enough to just change the order
of check mapcount and ptep_set_wrprotect(),
the reason is that gup_fast do the get_page() AFTER it fetched the pte,
so we are opening here a tiny race:
cpu#1 do get_user_pages_fast and fetch the pte (it think the pte is
writeable)
cpu#2 do ptep_set_wrprotect()
cpu#2 check the mapcount against pagecount (it think that everything is
fine and continue)
cpu#1 only now do get_page()
Anyway this is minor issue that can be probably solved by just:
rechecking if the pte isnt read_only in gup_fast after we do the get_page()
Anyway sound like a great idea to fix this issue!
good work.
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
WARNING: multiple messages have this Message-ID (diff)
From: Izik Eidus <ieidus@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Greg KH <greg@kroah.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
mtk.manpages@gmail.com, linux-man@vger.kernel.org,
linux-kernel@vger.kernel.org, Nick Piggin <npiggin@suse.de>,
Hugh Dickins <hugh@veritas.com>
Subject: Re: open(2) says O_DIRECT works on 512 byte boundries?
Date: Sat, 07 Feb 2009 15:32:03 +0200 [thread overview]
Message-ID: <498D8D53.6030007@redhat.com> (raw)
In-Reply-To: <20090206175414.GQ14011@random.random>
Andrea Arcangeli wrote:
>
> Here the latest patch. Now that I consider the 'production' trouble
> closed I'll be porting it to mainline while addressing gup-fast too
> which is an additional complication I didn't have to take care of
> yet. So expect a patch that works for you in the next few days, either
> that or complains about an unfixable gup-fast ;). But frankly I've
> been thinking it should be possible in a much simpler way that I
> ever thought before, by entirely relaying on the tlb flush.
>
> In short if I simply do the page-count vs page-mapcount check _after_
> ptep_set_wrprotect (which implies a tlb flush and after that any
> gup-fast running in other cpus should be forced through the slow path
> and block) I think I'm done. The code now does:
>
> check mapcount
> ptep_set_wrprotect
>
> I think to make the thing working with gup-fast I've only to
> ptep_set_wrprotect before the mapcount check.
>
> The reason why the normal pagetable walking of the cpu is ok with
> current code is that ptep_set_wrprotect done later will block any
> access to the page from the other cpus. Not so if it was gup-fast
> taking reference on the page. So we need to stop with
> ptep_set_wrprotect any subsequential gup-fast _before_ we check count
> vs mapcount and the fact the get_page is run inside the critical
> section with local irq disabled in gup-fast should close the race for
> gup-fast too. Will try that ASAP...
>
>
Hi Andrea,
Good approach, but I think that it isn't enough to just change the order
of check mapcount and ptep_set_wrprotect(),
the reason is that gup_fast do the get_page() AFTER it fetched the pte,
so we are opening here a tiny race:
cpu#1 do get_user_pages_fast and fetch the pte (it think the pte is
writeable)
cpu#2 do ptep_set_wrprotect()
cpu#2 check the mapcount against pagecount (it think that everything is
fine and continue)
cpu#1 only now do get_page()
Anyway this is minor issue that can be probably solved by just:
rechecking if the pte isnt read_only in gup_fast after we do the get_page()
Anyway sound like a great idea to fix this issue!
good work.
next prev parent reply other threads:[~2009-02-07 13:32 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-28 21:33 open(2) says O_DIRECT works on 512 byte boundries? Greg KH
2009-01-28 21:33 ` Greg KH
[not found] ` <20090128213322.GA15789-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-01-29 0:41 ` Robert Hancock
2009-01-29 0:41 ` Robert Hancock
[not found] ` <4980FB4D.9090009-fVOoFLC7IWo@public.gmane.org>
2009-01-29 1:17 ` Greg KH
[not found] ` <20090129011758.GA26534-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-01-29 2:59 ` Michael Kerrisk
2009-01-29 2:59 ` Michael Kerrisk
2009-01-29 3:13 ` Greg KH
[not found] ` <20090129031349.GA23722-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-01-29 15:40 ` Jeff Moyer
2009-01-29 15:40 ` Jeff Moyer
[not found] ` <x49iqnyi0ih.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2009-01-30 6:16 ` Greg KH
2009-01-30 6:16 ` Greg KH
2009-01-29 5:13 ` KAMEZAWA Hiroyuki
2009-01-29 5:13 ` KAMEZAWA Hiroyuki
[not found] ` <20090129141338.34e44a1f.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-01-29 7:10 ` KOSAKI Motohiro
2009-01-29 7:10 ` KOSAKI Motohiro
[not found] ` <20090129160826.701E.KOSAKI.MOTOHIRO-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-01-30 6:17 ` Greg KH
2009-01-30 6:17 ` Greg KH
[not found] ` <20090130061714.GC31209-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-02-02 22:08 ` Andrea Arcangeli
2009-02-02 22:08 ` Andrea Arcangeli
2009-02-03 1:29 ` KAMEZAWA Hiroyuki
[not found] ` <20090203102920.684e7b67.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-02-03 2:31 ` Andrea Arcangeli
2009-02-03 2:31 ` Andrea Arcangeli
[not found] ` <20090203023147.GZ20323-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
2009-02-03 2:55 ` KAMEZAWA Hiroyuki
2009-02-03 2:55 ` KAMEZAWA Hiroyuki
[not found] ` <20090203115540.86a01273.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-02-03 3:42 ` KAMEZAWA Hiroyuki
2009-02-03 3:42 ` KAMEZAWA Hiroyuki
2009-02-06 17:55 ` Andrea Arcangeli
2009-02-06 17:55 ` Andrea Arcangeli
[not found] ` <20090202220856.GY20323-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
2009-02-03 3:50 ` Greg KH
2009-02-03 3:50 ` Greg KH
[not found] ` <20090203035012.GC1867-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-02-03 15:01 ` Andrea Arcangeli
2009-02-03 15:01 ` Andrea Arcangeli
2009-02-03 4:13 ` KAMEZAWA Hiroyuki
2009-02-03 4:13 ` KAMEZAWA Hiroyuki
2009-02-03 4:38 ` KAMEZAWA Hiroyuki
2009-02-03 4:38 ` KAMEZAWA Hiroyuki
[not found] ` <20090203133811.47324d80.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-02-03 15:08 ` Andrea Arcangeli
2009-02-03 15:08 ` Andrea Arcangeli
2009-02-04 23:41 ` Greg KH
2009-02-04 23:41 ` Greg KH
[not found] ` <20090204234153.GA32244-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-02-06 17:54 ` Andrea Arcangeli
2009-02-06 17:54 ` Andrea Arcangeli
[not found] ` <20090206175414.GQ14011-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
2009-02-06 18:38 ` Andrea Arcangeli
2009-02-06 18:38 ` Andrea Arcangeli
2009-02-07 13:32 ` Izik Eidus [this message]
2009-02-07 13:32 ` Izik Eidus
[not found] ` <498D8D53.6030007-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-02-07 15:33 ` Andrea Arcangeli
2009-02-07 15:33 ` Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=498D8D53.6030007@redhat.com \
--to=ieidus-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
--cc=aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=greg-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org \
--cc=hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org \
--cc=kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
--cc=kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
--cc=npiggin-l3A5Bk7waGM@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.