public inbox for linux-kernel@vger.kernel.org
* [RFC] copy_from_user races with readpage
@ 2006-04-19 17:18 Chris Mason
  2006-04-19 20:41 ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Chris Mason @ 2006-04-19 17:18 UTC (permalink / raw)
  To: linux-kernel, akpm, andrea

Hello everyone,

I've been working with IBM on a long-standing bug where zeros unexpectedly pop
up during a disk certification test.  We tracked it down to copy_from_user.
A simplified form of the test works like this:

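char buffer[4096], buf2[4096];

/* fill one block of the device with a known pattern */
memset(buffer, 0x5a, 4096);
fd = open("/dev/some_disk", O_RDWR);
write(fd, buffer, 4096);
pid = fork();
if (pid) {
    /* parent: read the block back over and over */
    while (1) {
        lseek(fd, 0, SEEK_SET);
        read(fd, buf2, 4096);
    }
} else {
    /* child: rewrite the same pattern over and over */
    while (1) {
        lseek(fd, 0, SEEK_SET);
        write(fd, buffer, 4096);
    }
}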

First we fill a given block in the file with a specific pattern.  Then we 
fork.  One proc writes that exact same pattern over and over, and the other 
proc reads from the block over and over.
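
For anyone who wants to try this, here is a rough, self-contained sketch of the same test that also checks what the reader sees. It is not the certification test itself. The function name race_test, the bounded read count, and the use of pread/pwrite are assumptions of this sketch: pread/pwrite are swapped in for lseek+read/write because the two processes share one file offset after fork().

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BLOCK 4096

/*
 * Hypothetical helper: run the write/read race on `path` for `reads`
 * read iterations.  Returns the number of reads that contained a byte
 * other than 0x5a, or -1 if setup or a read failed.
 */
long race_test(const char *path, long reads)
{
    static unsigned char buffer[BLOCK], buf2[BLOCK];
    long bad = 0;
    pid_t pid;
    int fd;

    memset(buffer, 0x5a, BLOCK);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, buffer, BLOCK) != BLOCK) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: rewrite the same pattern until killed */
        for (;;) {
            if (pwrite(fd, buffer, BLOCK, 0) != BLOCK)
                _exit(1);
        }
    }

    /* parent: read the block back and verify every byte */
    for (long i = 0; i < reads; i++) {
        if (pread(fd, buf2, BLOCK, 0) != BLOCK) {
            bad = -1;
            break;
        }
        for (size_t j = 0; j < BLOCK; j++) {
            if (buf2[j] != 0x5a) {
                bad++;
                break;
            }
        }
    }

    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    close(fd);
    return bad;
}
```

A non-zero return value means at least one read saw something other than the pattern. Hitting the bug also needs memory pressure, which this sketch does not generate.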

The reads and writes race, but you would expect the read to always see the 
0x5a pattern.  If we introduce enough memory pressure, sometimes the read 
sees zeros instead of the pattern because of kmap_atomic:

cpu1                                            cpu2
file_write
(page now up to date)
file_write                                      file_read
__copy_from_user (atomic)
                                                   file_read_actor
                                                   copy_to_user
__copy_from_user (non-atomic)

The first (atomic) copy_from_user fails because of a page fault on the
source buffer.  As a result the destination page is zero filled, and those
zeros are what file_read_actor() finds.  The second (non-atomic)
copy_from_user succeeds and puts the proper data in the page.
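
The sequence can be replayed in a small userspace model. This is only a sketch, not kernel code: model_copy, count_bad, reader_sees_bad and the fault_at knob are hypothetical names, and a "fault" is simulated by stopping the copy early.

```c
#include <stddef.h>
#include <string.h>

/*
 * Model of a user-to-kernel copy that "faults" after fault_at bytes.
 * Returns the number of bytes NOT copied, like copy_from_user.
 * zero_tail selects whether the uncopied part of dst is zeroed.
 */
size_t model_copy(unsigned char *dst, const unsigned char *src,
                  size_t n, size_t fault_at, int zero_tail)
{
    size_t done = fault_at < n ? fault_at : n;

    memcpy(dst, src, done);
    if (zero_tail && done < n)
        memset(dst + done, 0, n - done);   /* the behaviour at issue */
    return n - done;
}

/* Count the bytes in a page that differ from the expected pattern. */
size_t count_bad(const unsigned char *page, size_t n, unsigned char pat)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        if (page[i] != pat)
            bad++;
    return bad;
}

/*
 * Replay the diagram: the page already holds the pattern, the first
 * (atomic) copy faults at once, a reader samples the page, then the
 * retry succeeds.  Returns how many wrong bytes the reader saw.
 */
size_t reader_sees_bad(int zero_tail)
{
    enum { PAGE = 4096 };
    static unsigned char user[PAGE], page[PAGE];
    size_t seen;

    memset(user, 0x5a, PAGE);
    memset(page, 0x5a, PAGE);                      /* page now up to date */
    model_copy(page, user, PAGE, 0, zero_tail);    /* atomic attempt faults */
    seen = count_bad(page, PAGE, 0x5a);            /* concurrent read */
    model_copy(page, user, PAGE, PAGE, zero_tail); /* retry succeeds */
    return seen;
}
```

With zero_tail set, the reader in this model sees a full page of wrong bytes between the two attempts. With it clear, the reader sees none, because the old and new contents are the same pattern.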

The solution seems to be a non-zeroing copy_from_user, but this is only
required on arches where kmap_atomic increments the preemption count.  Andrea
has a patch for i386 that does this (small and obvious), along with some
memsets to zero out the kernel page when copy_from_user fails.

This behavior has been present for quite a while, and I think it should be
fixed.  But before we go through making a patch for ppc (any other arches
affected?) I wanted to poll here and make sure people agree the zeros are
not correct.

-chris

* [PATCH - RESEND - 000 of 2] Avoid subtle cache consistancy problem
@ 2006-05-22  4:46 NeilBrown
  2006-05-22  4:46 ` [PATCH 002 of 2] Make copy_from_user_inatomic NOT zero the tail on i386 NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2006-05-22  4:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

This is a resend of a pair of patches that didn't get a lot of attention
last time.
I've cleaned up the second one a bit, as it had some ugliness that might
have put some people off...

The problem is that when we write to a file, the copy from userspace
to pagecache is first done with preemption disabled, so if the source
address is not immediately available the copy fails *and* *zeros*
*the* *destination*.

This is a problem because a concurrent read (which admittedly is an
odd thing to do) might see zeros rather than what was there before the
write, or what was there after, or some mixture of the two (any of
these being a reasonable thing to see).

If the copy did fail, it will immediately be retried with preemption
re-enabled so any transient problem with accessing the source won't
cause an error.

The first copy attempt does not need to zero any uncopied bytes, and
doing so causes the problem.
It uses __copy_from_user_inatomic rather than copy_from_user, so the
simple expedient is to change __copy_from_user_inatomic to *not* zero
out bytes on failure.

The first of these two patches prepares for the change by fixing two
places which assume __copy_from_user_inatomic does zero the tail.  The
two usages are very similar pieces of code which copy from
a userspace iovec into one or more page-cache pages.  These are
changed to remove the assumption.
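
As a rough illustration of a caller that no longer relies on the tail being zeroed, here is a userspace sketch. It is not the code from the patch: fill_page, copy_fn, copy_fails and copy_works are hypothetical names, and the copy primitives are stand-ins that are assumed to leave uncopied destination bytes untouched.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical copy primitive: returns the number of bytes NOT copied
 * and leaves the uncopied part of dst untouched (no zeroing).
 */
typedef size_t (*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Fill n bytes of a page-cache page from a user buffer.
 *  1. Try the fast copy that must not sleep.
 *  2. On a short copy, retry the remainder with the copy that may fault.
 *  3. Only if that also falls short, zero the tail here in the caller.
 * Returns the number of bytes actually copied.
 */
size_t fill_page(unsigned char *dst, const unsigned char *src, size_t n,
                 copy_fn fast, copy_fn slow)
{
    size_t left = fast(dst, src, n);

    if (left) {
        size_t done = n - left;

        left = slow(dst + done, src + done, left);
    }
    if (left)
        memset(dst + (n - left), 0, left);
    return n - left;
}

/* Stand-in that always "faults" immediately and copies nothing. */
size_t copy_fails(void *dst, const void *src, size_t n)
{
    (void)dst;
    (void)src;
    return n;
}

/* Stand-in that always succeeds. */
size_t copy_works(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}
```

In this sketch the page is only zeroed when the final, sleeping copy also fails, so a transient failure of the first attempt never exposes zeros to a concurrent reader.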

The second patch changes __copy_from_user_inatomic* to not zero the
tail.
Once these are accepted, I will look at similar patches for other
architectures where this is important (ppc, mips and sparc being the
ones I can find).

Feedback very welcome.

Thanks.
NeilBrown


 [PATCH 001 of 2] Prepare for __copy_from_user_inatomic to not zero missed bytes.
 [PATCH 002 of 2] Make copy_from_user_inatomic NOT zero the tail on i386


Thread overview: 9+ messages
2006-04-19 17:18 [RFC] copy_from_user races with readpage Chris Mason
2006-04-19 20:41 ` Andrew Morton
2006-04-19 21:38   ` Andrew Morton
2006-04-19 22:18   ` Neil Brown
2006-04-19 23:36     ` Andrea Arcangeli
2006-04-28  2:04   ` [PATCH INTRO] Re: [RFC] copy_from_user races with readpage, [PATCH 000 of 2] Introduction NeilBrown
2006-04-28  2:10     ` [PATCH 001 of 2] Prepare for __copy_from_user_inatomic to not zero missed bytes NeilBrown
2006-04-28  2:10     ` [PATCH 002 of 2] Make copy_from_user_inatomic NOT zero the tail on i386 NeilBrown
  -- strict thread matches above, loose matches on Subject: below --
2006-05-22  4:46 [PATCH - RESEND - 000 of 2] Avoid subtle cache consistancy problem NeilBrown
2006-05-22  4:46 ` [PATCH 002 of 2] Make copy_from_user_inatomic NOT zero the tail on i386 NeilBrown
