All of lore.kernel.org
 help / color / mirror / Atom feed
From: Izik Eidus <ieidus-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Andrea Arcangeli <aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Greg KH <greg-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>,
	KOSAKI Motohiro
	<kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
	KAMEZAWA Hiroyuki
	<kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Nick Piggin <npiggin-l3A5Bk7waGM@public.gmane.org>,
	Hugh Dickins <hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org>
Subject: Re: open(2) says O_DIRECT works on 512 byte boundries?
Date: Sat, 07 Feb 2009 15:32:03 +0200	[thread overview]
Message-ID: <498D8D53.6030007@redhat.com> (raw)
In-Reply-To: <20090206175414.GQ14011-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>

Andrea Arcangeli wrote:
>
> Here the latest patch. Now that I consider the 'production' trouble
> closed I'll be porting it to mainline while addressing gup-fast too
> which is an additional complication I didn't have to take care of
> yet. So expect a patch that works for you in the next few days, either
> that or complains about an unfixable gup-fast ;). But frankly I've
> been thinking it should be possible in a much simpler way that I
> ever thought before, by entirely relaying on the tlb flush.
>
> In short if I simply do the page-count vs page-mapcount check _after_
> ptep_set_wrprotect (which implies a tlb flush and after that any
> gup-fast running in other cpus should be forced through the slow path
> and block) I think I'm done. The code now does:
>
>     check mapcount
>     ptep_set_wrprotect
>
> I think to make the thing working with gup-fast I've only to
> ptep_set_wrprotect before the mapcount check.
>
> The reason why the normal pagetable walking of the cpu is ok with
> current code is that ptep_set_wrprotect done later will block any
> access to the page from the other cpus. Not so if it was gup-fast
> taking reference on the page. So we need to stop with
> ptep_set_wrprotect any subsequential gup-fast _before_ we check count
> vs mapcount and the fact the get_page is run inside the critical
> section with local irq disabled in gup-fast should close the race for
> gup-fast too. Will try that ASAP...
>
>   
Hi Andrea,
Good approach, but I think that it isn't enough to just change the order 
of check mapcount and ptep_set_wrprotect(),
the reason is that gup_fast do the get_page() AFTER it fetched the pte, 
so we are opening here a tiny race:

cpu#1 do get_user_pages_fast and fetch the pte (it think the pte is 
writeable)
cpu#2 do ptep_set_wrprotect()
cpu#2 check the mapcount against pagecount (it think that everything is 
fine and continue)
cpu#1 only now do get_page()

Anyway this is minor issue that can be probably solved by just:
rechecking if the pte isnt read_only in gup_fast after we do the get_page()

Anyway sound like a great idea to fix this issue!
good work.


--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: Izik Eidus <ieidus@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Greg KH <greg@kroah.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	mtk.manpages@gmail.com, linux-man@vger.kernel.org,
	linux-kernel@vger.kernel.org, Nick Piggin <npiggin@suse.de>,
	Hugh Dickins <hugh@veritas.com>
Subject: Re: open(2) says O_DIRECT works on 512 byte boundries?
Date: Sat, 07 Feb 2009 15:32:03 +0200	[thread overview]
Message-ID: <498D8D53.6030007@redhat.com> (raw)
In-Reply-To: <20090206175414.GQ14011@random.random>

Andrea Arcangeli wrote:
>
> Here the latest patch. Now that I consider the 'production' trouble
> closed I'll be porting it to mainline while addressing gup-fast too
> which is an additional complication I didn't have to take care of
> yet. So expect a patch that works for you in the next few days, either
> that or complains about an unfixable gup-fast ;). But frankly I've
> been thinking it should be possible in a much simpler way that I
> ever thought before, by entirely relaying on the tlb flush.
>
> In short if I simply do the page-count vs page-mapcount check _after_
> ptep_set_wrprotect (which implies a tlb flush and after that any
> gup-fast running in other cpus should be forced through the slow path
> and block) I think I'm done. The code now does:
>
>     check mapcount
>     ptep_set_wrprotect
>
> I think to make the thing working with gup-fast I've only to
> ptep_set_wrprotect before the mapcount check.
>
> The reason why the normal pagetable walking of the cpu is ok with
> current code is that ptep_set_wrprotect done later will block any
> access to the page from the other cpus. Not so if it was gup-fast
> taking reference on the page. So we need to stop with
> ptep_set_wrprotect any subsequential gup-fast _before_ we check count
> vs mapcount and the fact the get_page is run inside the critical
> section with local irq disabled in gup-fast should close the race for
> gup-fast too. Will try that ASAP...
>
>   
Hi Andrea,
Good approach, but I think that it isn't enough to just change the order 
of check mapcount and ptep_set_wrprotect(),
the reason is that gup_fast do the get_page() AFTER it fetched the pte, 
so we are opening here a tiny race:

cpu#1 do get_user_pages_fast and fetch the pte (it think the pte is 
writeable)
cpu#2 do ptep_set_wrprotect()
cpu#2 check the mapcount against pagecount (it think that everything is 
fine and continue)
cpu#1 only now do get_page()

Anyway this is minor issue that can be probably solved by just:
rechecking if the pte isnt read_only in gup_fast after we do the get_page()

Anyway sound like a great idea to fix this issue!
good work.



  parent reply	other threads:[~2009-02-07 13:32 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-28 21:33 open(2) says O_DIRECT works on 512 byte boundries? Greg KH
2009-01-28 21:33 ` Greg KH
     [not found] ` <20090128213322.GA15789-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-01-29  0:41   ` Robert Hancock
2009-01-29  0:41     ` Robert Hancock
     [not found]     ` <4980FB4D.9090009-fVOoFLC7IWo@public.gmane.org>
2009-01-29  1:17       ` Greg KH
     [not found]         ` <20090129011758.GA26534-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-01-29  2:59           ` Michael Kerrisk
2009-01-29  2:59             ` Michael Kerrisk
2009-01-29  3:13             ` Greg KH
     [not found]               ` <20090129031349.GA23722-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-01-29 15:40                 ` Jeff Moyer
2009-01-29 15:40                   ` Jeff Moyer
     [not found]                   ` <x49iqnyi0ih.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>
2009-01-30  6:16                     ` Greg KH
2009-01-30  6:16                       ` Greg KH
2009-01-29  5:13   ` KAMEZAWA Hiroyuki
2009-01-29  5:13     ` KAMEZAWA Hiroyuki
     [not found]     ` <20090129141338.34e44a1f.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-01-29  7:10       ` KOSAKI Motohiro
2009-01-29  7:10         ` KOSAKI Motohiro
     [not found]         ` <20090129160826.701E.KOSAKI.MOTOHIRO-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-01-30  6:17           ` Greg KH
2009-01-30  6:17             ` Greg KH
     [not found]             ` <20090130061714.GC31209-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-02-02 22:08               ` Andrea Arcangeli
2009-02-02 22:08                 ` Andrea Arcangeli
2009-02-03  1:29                 ` KAMEZAWA Hiroyuki
     [not found]                   ` <20090203102920.684e7b67.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-02-03  2:31                     ` Andrea Arcangeli
2009-02-03  2:31                       ` Andrea Arcangeli
     [not found]                       ` <20090203023147.GZ20323-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
2009-02-03  2:55                         ` KAMEZAWA Hiroyuki
2009-02-03  2:55                           ` KAMEZAWA Hiroyuki
     [not found]                           ` <20090203115540.86a01273.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-02-03  3:42                             ` KAMEZAWA Hiroyuki
2009-02-03  3:42                               ` KAMEZAWA Hiroyuki
2009-02-06 17:55                             ` Andrea Arcangeli
2009-02-06 17:55                               ` Andrea Arcangeli
     [not found]                 ` <20090202220856.GY20323-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
2009-02-03  3:50                   ` Greg KH
2009-02-03  3:50                     ` Greg KH
     [not found]                     ` <20090203035012.GC1867-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-02-03 15:01                       ` Andrea Arcangeli
2009-02-03 15:01                         ` Andrea Arcangeli
2009-02-03  4:13                   ` KAMEZAWA Hiroyuki
2009-02-03  4:13                     ` KAMEZAWA Hiroyuki
2009-02-03  4:38                   ` KAMEZAWA Hiroyuki
2009-02-03  4:38                     ` KAMEZAWA Hiroyuki
     [not found]                     ` <20090203133811.47324d80.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2009-02-03 15:08                       ` Andrea Arcangeli
2009-02-03 15:08                         ` Andrea Arcangeli
2009-02-04 23:41                   ` Greg KH
2009-02-04 23:41                     ` Greg KH
     [not found]                     ` <20090204234153.GA32244-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
2009-02-06 17:54                       ` Andrea Arcangeli
2009-02-06 17:54                         ` Andrea Arcangeli
     [not found]                         ` <20090206175414.GQ14011-ulAA2RpUuRJvbPj5LQlJ3g@public.gmane.org>
2009-02-06 18:38                           ` Andrea Arcangeli
2009-02-06 18:38                             ` Andrea Arcangeli
2009-02-07 13:32                           ` Izik Eidus [this message]
2009-02-07 13:32                             ` Izik Eidus
     [not found]                             ` <498D8D53.6030007-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2009-02-07 15:33                               ` Andrea Arcangeli
2009-02-07 15:33                                 ` Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=498D8D53.6030007@redhat.com \
    --to=ieidus-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=greg-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org \
    --cc=hugh-DTz5qymZ9yRBDgjK7y7TUQ@public.gmane.org \
    --cc=kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
    --cc=kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=npiggin-l3A5Bk7waGM@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.