public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
       [not found]               ` <4B4C7297.5030905@redhat.com>
@ 2010-01-12 16:38                 ` Christoph Hellwig
  2010-01-12 16:43                   ` Michal Novotny
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2010-01-12 16:38 UTC (permalink / raw)
  To: Michal Novotny; +Cc: Christoph Hellwig, Ric Wheeler, linux-ext4, linux-kernel

Ok, I looked at the issue.  The problem is that the Xen backend drivers
are (as expected) utterly braindead and submit bios directly from the
virtualization backend without using proper abstractions and thus
bypassing all the cache coherency features in the filesystems (the block
device nodes are just another mini-filesystem in that respect).  So
when you first have buffered access in the host, pages may stay in cache
and get overwritten directly on disk by a Xen guest, and once the guest
is down the host may still use the now stale cached data.

I would recommend migrating your customers to KVM, which uses the proper
abstractions and thus doesn't have this problem.  There's a reason after
all why all the Xen dom0 mess never got merged to mainline.


* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:38                 ` [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option Christoph Hellwig
@ 2010-01-12 16:43                   ` Michal Novotny
  2010-01-12 16:47                     ` Christoph Hellwig
  2010-01-12 16:50                     ` Ric Wheeler
  0 siblings, 2 replies; 10+ messages in thread
From: Michal Novotny @ 2010-01-12 16:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Ric Wheeler, linux-ext4, linux-kernel

On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
> Ok, I looked at the issue.  The problem is that the Xen backend drivers
> are (as expected) utterly braindead and submit bios directly from the
> virtualization backend without using proper abstractions and thus
> bypassing all the cache coherency features in the filesystems (the block
> device nodes are just another mini-filesystem in that respect).  So
> when you first have buffered access in the host, pages may stay in cache
> and get overwritten directly on disk by a Xen guest, and once the guest
> is down the host may still use the now stale cached data.
>
> I would recommend migrating your customers to KVM, which uses the proper
> abstractions and thus doesn't have this problem.  There's a reason after
> all why all the Xen dom0 mess never got merged to mainline.
>    
So, do you think the problem is in the Xen backend drivers and that a
driver fix is needed to make it work right in Xen?


* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:43                   ` Michal Novotny
@ 2010-01-12 16:47                     ` Christoph Hellwig
  2010-01-12 16:51                       ` Michal Novotny
  2010-01-12 16:50                     ` Ric Wheeler
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2010-01-12 16:47 UTC (permalink / raw)
  To: Michal Novotny; +Cc: Christoph Hellwig, Ric Wheeler, linux-ext4, linux-kernel

On Tue, Jan 12, 2010 at 05:43:21PM +0100, Michal Novotny wrote:
> So, do you think the problem is in the Xen backend drivers and that a
> driver fix is needed to make it work right in Xen?

Yes, the Xen blkback driver just submits I/O directly without using
the right interfaces to force cache coherency.  It might be relatively
easy to hack a call in to flush all caches when it starts up, but given
how it bypasses all abstractions it's almost impossible to give full
coherency as if using the normal block device interfaces.



* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:43                   ` Michal Novotny
  2010-01-12 16:47                     ` Christoph Hellwig
@ 2010-01-12 16:50                     ` Ric Wheeler
  2010-01-12 16:53                       ` Michal Novotny
  1 sibling, 1 reply; 10+ messages in thread
From: Ric Wheeler @ 2010-01-12 16:50 UTC (permalink / raw)
  To: Michal Novotny; +Cc: Christoph Hellwig, linux-ext4, linux-kernel

On 01/12/2010 11:43 AM, Michal Novotny wrote:
> On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
>> Ok, I looked at the issue. The problem is that the Xen backend drivers
>> are (as expected) utterly braindead and submit bios directly from the
>> virtualization backend without using proper abstractions and thus
>> bypassing all the cache coherency features in the filesystems (the block
>> device nodes are just another mini-filesystem in that respect). So
>> when you first have buffered access in the host, pages may stay in cache
>> and get overwritten directly on disk by a Xen guest, and once the guest
>> is down the host may still use the now stale cached data.
>>
>> I would recommend migrating your customers to KVM, which uses the proper
>> abstractions and thus doesn't have this problem. There's a reason after
>> all why all the Xen dom0 mess never got merged to mainline.
> So, do you think the problem is in the Xen backend drivers and that a
> driver fix is needed to make it work right in Xen?

If the Xen drivers bypass the normal IO and FS stack on the host, then I can
understand why the hack to e2fsprogs works but it does not seem like a good fix.

Specifically, the data will continue to be cached (and if dirty, might be 
written back to the storage eventually).

If we need a workaround, you need to drop VM caches for that device before you
update the guest's files and possibly again afterwards (and make sure that 
nothing pulls the data into cache during the operation).
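
A minimal sketch of the cache-dropping step described above, assuming Python on
Linux. The helper name `drop_cached_pages` is hypothetical and is not part of
e2fsprogs or Xen. It writes back dirty pages first, because
POSIX_FADV_DONTNEED does not discard dirty pages, and then asks the kernel to
drop the cached pages for that file or block device. The request is only
advisory, and nothing here prevents another reader from pulling data back into
the cache afterwards.

```python
import os


def drop_cached_pages(path):
    """Write back, then ask the kernel to drop, cached pages for `path`.

    `path` may be a regular file or a block device node. This is a
    best-effort hint: the kernel is free to keep pages that are still
    dirty, mapped, or otherwise in use.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Flush dirty pages to storage so that they become droppable.
        os.fsync(fd)
        # offset=0, len=0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Following the advice above, the call would be made on the guest's backing
device before touching the guest's files and possibly again afterwards. For a
whole block device, the BLKFLSBUF ioctl (what `blockdev --flushbufs` issues) is
a root-only alternative.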

Basically, this sounds like the backend drivers are doing something really, 
really dangerous....

ric



* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:47                     ` Christoph Hellwig
@ 2010-01-12 16:51                       ` Michal Novotny
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Novotny @ 2010-01-12 16:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Ric Wheeler, linux-ext4, linux-kernel

On 01/12/2010 05:47 PM, Christoph Hellwig wrote:
> On Tue, Jan 12, 2010 at 05:43:21PM +0100, Michal Novotny wrote:
>    
>> So, do you think the problem is in the Xen backend drivers and that a
>> driver fix is needed to make it work right in Xen?
>>      
> Yes, the Xen blkback driver just submits I/O directly without using
> the right interfaces to force cache coherency.  It might be relatively
> easy to hack a call in to flush all caches when it starts up, but given
> how it bypasses all abstractions it's almost impossible to give full
> coherency as if using the normal block device interfaces.
>
>    
Ok, so the fix should be made in the drivers. Thanks for your
investigation. I just forwarded your reply to other members of my team.

Thanks,
Michal


* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:50                     ` Ric Wheeler
@ 2010-01-12 16:53                       ` Michal Novotny
  2010-01-12 16:56                         ` Eric Sandeen
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Novotny @ 2010-01-12 16:53 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: Christoph Hellwig, linux-ext4, linux-kernel

On 01/12/2010 05:50 PM, Ric Wheeler wrote:
> On 01/12/2010 11:43 AM, Michal Novotny wrote:
>> On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
>>> Ok, I looked at the issue. The problem is that the Xen backend drivers
>>> are (as expected) utterly braindead and submit bios directly from the
>>> virtualization backend without using proper abstractions and thus
>>> bypassing all the cache coherency features in the filesystems (the block
>>> device nodes are just another mini-filesystem in that respect). So
>>> when you first have buffered access in the host, pages may stay in cache
>>> and get overwritten directly on disk by a Xen guest, and once the guest
>>> is down the host may still use the now stale cached data.
>>>
>>> I would recommend migrating your customers to KVM, which uses the proper
>>> abstractions and thus doesn't have this problem. There's a reason after
>>> all why all the Xen dom0 mess never got merged to mainline.
>> So, do you think the problem is in the Xen backend drivers and that a
>> driver fix is needed to make it work right in Xen?
>
> If the Xen drivers bypass the normal IO and FS stack on the host, then I
> can understand why the hack to e2fsprogs works but it does not seem 
> like a good fix.
>
> Specifically, the data will continue to be cached (and if dirty, might 
> be written back to the storage eventually).
>
> If we need a workaround, you need to drop VM caches for that device
> before you update the guest's files and possibly again afterwards (and 
> make sure that nothing pulls the data into cache during the operation).
>
> Basically, this sounds like the backend drivers are doing something 
> really, really dangerous....
>
> ric
>
Ok, so you think it is not good to do this patch for e2fsprogs for
direct access support? The only things we could do now are to fix the
backend drivers or create a workaround to drop caches? I need to discuss
this further with the guys on my team...

Thanks,
Michal


* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:53                       ` Michal Novotny
@ 2010-01-12 16:56                         ` Eric Sandeen
  2010-01-12 16:59                           ` Ric Wheeler
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Sandeen @ 2010-01-12 16:56 UTC (permalink / raw)
  To: Michal Novotny; +Cc: Ric Wheeler, Christoph Hellwig, linux-ext4, linux-kernel

Michal Novotny wrote:
> On 01/12/2010 05:50 PM, Ric Wheeler wrote:
>> On 01/12/2010 11:43 AM, Michal Novotny wrote:
>>> On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
>>>> Ok, I looked at the issue. The problem is that the Xen backend drivers
>>>> are (as expected) utterly braindead and submit bios directly from the
>>>> virtualization backend without using proper abstractions and thus
>>>> bypassing all the cache coherency features in the filesystems (the block
>>>> device nodes are just another mini-filesystem in that respect). So
>>>> when you first have buffered access in the host, pages may stay in cache
>>>> and get overwritten directly on disk by a Xen guest, and once the guest
>>>> is down the host may still use the now stale cached data.
>>>>
>>>> I would recommend migrating your customers to KVM, which uses the proper
>>>> abstractions and thus doesn't have this problem. There's a reason after
>>>> all why all the Xen dom0 mess never got merged to mainline.
>>> So, do you think the problem is in the Xen backend drivers and that a
>>> driver fix is needed to make it work right in Xen?
>>
>> If the Xen drivers bypass the normal IO and FS stack on the host, then I
>> can understand why the hack to e2fsprogs works but it does not seem
>> like a good fix.
>>
>> Specifically, the data will continue to be cached (and if dirty, might
>> be written back to the storage eventually).
>>
>> If we need a workaround, you need to drop VM caches for that device
>> before you update the guest's files and possibly again afterwards (and
>> make sure that nothing pulls the data into cache during the operation).
>>
>> Basically, this sounds like the backend drivers are doing something
>> really, really dangerous....
>>
>> ric
>>
> Ok, so you think it is not good to do this patch for e2fsprogs for
> direct access support? The only things we could do now are to fix the
> backend drivers or create a workaround to drop caches? I need to discuss
> this further with the guys on my team...

I do think that patching it up in e2fsprogs is unnecessarily invasive;
it's fixing it at the wrong spot.

Any block dev IO from the host is dangerous; fixing it only in e2fsprogs
for this one case doesn't seem like the right course of action.

-Eric

> Thanks,
> Michal



* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:56                         ` Eric Sandeen
@ 2010-01-12 16:59                           ` Ric Wheeler
  2010-01-12 17:00                             ` Michal Novotny
  0 siblings, 1 reply; 10+ messages in thread
From: Ric Wheeler @ 2010-01-12 16:59 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Michal Novotny, Christoph Hellwig, linux-ext4, linux-kernel

On 01/12/2010 11:56 AM, Eric Sandeen wrote:
> Michal Novotny wrote:
>> On 01/12/2010 05:50 PM, Ric Wheeler wrote:
>>> On 01/12/2010 11:43 AM, Michal Novotny wrote:
>>>> On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
>>>>> Ok, I looked at the issue. The problem is that the Xen backend drivers
>>>>> are (as expected) utterly braindead and submit bios directly from the
>>>>> virtualization backend without using proper abstractions and thus
>>>>> bypassing all the cache coherency features in the filesystems (the block
>>>>> device nodes are just another mini-filesystem in that respect). So
>>>>> when you first have buffered access in the host, pages may stay in cache
>>>>> and get overwritten directly on disk by a Xen guest, and once the guest
>>>>> is down the host may still use the now stale cached data.
>>>>>
>>>>> I would recommend migrating your customers to KVM, which uses the proper
>>>>> abstractions and thus doesn't have this problem. There's a reason after
>>>>> all why all the Xen dom0 mess never got merged to mainline.
>>>> So, do you think the problem is in the Xen backend drivers and that a
>>>> driver fix is needed to make it work right in Xen?
>>>
>>> If the Xen drivers bypass the normal IO and FS stack on the host, then I
>>> can understand why the hack to e2fsprogs works but it does not seem
>>> like a good fix.
>>>
>>> Specifically, the data will continue to be cached (and if dirty, might
>>> be written back to the storage eventually).
>>>
>>> If we need a workaround, you need to drop VM caches for that device
>>> before you update the guest's files and possibly again afterwards (and
>>> make sure that nothing pulls the data into cache during the operation).
>>>
>>> Basically, this sounds like the backend drivers are doing something
>>> really, really dangerous....
>>>
>>> ric
>>>
>> Ok, so you think it is not good to do this patch for e2fsprogs for
>> direct access support? The only things we could do now are to fix the
>> backend drivers or create a workaround to drop caches? I need to discuss
>> this further with the guys on my team...
>
> I do think that patching it up in e2fsprogs is unnecessarily invasive;
> it's fixing it at the wrong spot.
>
> Any block dev IO from the host is dangerous; fixing it only in e2fsprogs
> for this one case doesn't seem like the right course of action.
>
> -Eric

It actually could produce some nastier issues where it would work a bit, the bad 
data gets flushed back to the backing store and then your O_DIRECT read would be 
broken.

Also, for normal users of e2fsprogs, they should never bypass the cache...

ric


* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 16:59                           ` Ric Wheeler
@ 2010-01-12 17:00                             ` Michal Novotny
  2010-01-14 13:46                               ` Michal Novotny
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Novotny @ 2010-01-12 17:00 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: Eric Sandeen, Christoph Hellwig, linux-ext4, linux-kernel

On 01/12/2010 05:59 PM, Ric Wheeler wrote:
> On 01/12/2010 11:56 AM, Eric Sandeen wrote:
>> Michal Novotny wrote:
>>> On 01/12/2010 05:50 PM, Ric Wheeler wrote:
>>>> On 01/12/2010 11:43 AM, Michal Novotny wrote:
>>>>> On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
>>>>>> Ok, I looked at the issue. The problem is that the Xen backend drivers
>>>>>> are (as expected) utterly braindead and submit bios directly from the
>>>>>> virtualization backend without using proper abstractions and thus
>>>>>> bypassing all the cache coherency features in the filesystems (the block
>>>>>> device nodes are just another mini-filesystem in that respect). So
>>>>>> when you first have buffered access in the host, pages may stay in cache
>>>>>> and get overwritten directly on disk by a Xen guest, and once the guest
>>>>>> is down the host may still use the now stale cached data.
>>>>>>
>>>>>> I would recommend migrating your customers to KVM, which uses the proper
>>>>>> abstractions and thus doesn't have this problem. There's a reason after
>>>>>> all why all the Xen dom0 mess never got merged to mainline.
>>>>> So, do you think the problem is in the Xen backend drivers and that a
>>>>> driver fix is needed to make it work right in Xen?
>>>>
>>>> If the Xen drivers bypass the normal IO and FS stack on the host, then I
>>>> can understand why the hack to e2fsprogs works but it does not seem
>>>> like a good fix.
>>>>
>>>> Specifically, the data will continue to be cached (and if dirty, might
>>>> be written back to the storage eventually).
>>>>
>>>> If we need a workaround, you need to drop VM caches for that device
>>>> before you update the guest's files and possibly again afterwards (and
>>>> make sure that nothing pulls the data into cache during the 
>>>> operation).
>>>>
>>>> Basically, this sounds like the backend drivers are doing something
>>>> really, really dangerous....
>>>>
>>>> ric
>>>>
>>> Ok, so you think it is not good to do this patch for e2fsprogs for
>>> direct access support? The only things we could do now are to fix the
>>> backend drivers or create a workaround to drop caches? I need to discuss
>>> this further with the guys on my team...
>>
>> I do think that patching it up in e2fsprogs is unnecessarily invasive;
>> it's fixing it at the wrong spot.
>>
>> Any block dev IO from the host is dangerous; fixing it only in e2fsprogs
>> for this one case doesn't seem like the right course of action.
>>
>> -Eric
>
> It actually could produce some nastier issues where it would work a 
> bit, the bad data gets flushed back to the backing store and then your 
> O_DIRECT read would be broken.
>
> Also, for normal users of e2fsprogs, they should never bypass the 
> cache...
>
> ric
Ok, good to know that normal users should never bypass the cache. In that
case this seems to be some kind of kernel bug...

Thank you all for your input,
Michal


* Re: [PATCH] extend e2fsprogs functionality to add EXT2_FLAG_DIRECT option
  2010-01-12 17:00                             ` Michal Novotny
@ 2010-01-14 13:46                               ` Michal Novotny
  0 siblings, 0 replies; 10+ messages in thread
From: Michal Novotny @ 2010-01-14 13:46 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: Eric Sandeen, Christoph Hellwig, linux-ext4, linux-kernel

On 01/12/2010 06:00 PM, Michal Novotny wrote:
> On 01/12/2010 05:59 PM, Ric Wheeler wrote:
>> On 01/12/2010 11:56 AM, Eric Sandeen wrote:
>>> Michal Novotny wrote:
>>>> On 01/12/2010 05:50 PM, Ric Wheeler wrote:
>>>>> On 01/12/2010 11:43 AM, Michal Novotny wrote:
>>>>>> On 01/12/2010 05:38 PM, Christoph Hellwig wrote:
>>>>>>> Ok, I looked at the issue. The problem is that the Xen backend drivers
>>>>>>> are (as expected) utterly braindead and submit bios directly from the
>>>>>>> virtualization backend without using proper abstractions and thus
>>>>>>> bypassing all the cache coherency features in the filesystems (the block
>>>>>>> device nodes are just another mini-filesystem in that respect). So
>>>>>>> when you first have buffered access in the host, pages may stay in cache
>>>>>>> and get overwritten directly on disk by a Xen guest, and once the guest
>>>>>>> is down the host may still use the now stale cached data.
>>>>>>>
>>>>>>> I would recommend migrating your customers to KVM, which uses the proper
>>>>>>> abstractions and thus doesn't have this problem. There's a reason after
>>>>>>> all why all the Xen dom0 mess never got merged to mainline.
>>>>>> So, do you think the problem is in the Xen backend drivers and that a
>>>>>> driver fix is needed to make it work right in Xen?
>>>>>
>>>>> If the Xen drivers bypass the normal IO and FS stack on the host, then I
>>>>> can understand why the hack to e2fsprogs works but it does not seem
>>>>> like a good fix.
>>>>>
>>>>> Specifically, the data will continue to be cached (and if dirty, 
>>>>> might
>>>>> be written back to the storage eventually).
>>>>>
>>>>> If we need a workaround, you need to drop VM caches for that device
>>>>> before you update the guest's files and possibly again afterwards 
>>>>> (and
>>>>> make sure that nothing pulls the data into cache during the 
>>>>> operation).
>>>>>
>>>>> Basically, this sounds like the backend drivers are doing something
>>>>> really, really dangerous....
>>>>>
>>>>> ric
>>>>>
>>>> Ok, so you think it is not good to do this patch for e2fsprogs for
>>>> direct access support? The only things we could do now are to fix the
>>>> backend drivers or create a workaround to drop caches? I need to discuss
>>>> this further with the guys on my team...
>>>
>>> I do think that patching it up in e2fsprogs is unnecessarily invasive;
>>> it's fixing it at the wrong spot.
>>>
>>> Any block dev IO from the host is dangerous; fixing it only in 
>>> e2fsprogs
>>> for this one case doesn't seem like the right course of action.
>>>
>>> -Eric
>>
>> It actually could produce some nastier issues where it would work a 
>> bit, the bad data gets flushed back to the backing store and then 
>> your O_DIRECT read would be broken.
>>
>> Also, for normal users of e2fsprogs, they should never bypass the 
>> cache...
>>
>> ric
> Ok, good to know that normal users should never bypass the cache. In that
> case this seems to be some kind of kernel bug...
>
Hi,
after discussing this with several people, we finally decided it's
better to solve it in the kernel, since Xen is using some weird IO path
as mentioned. Since this patch is about using O_DIRECT in e2fsprogs,
which I've been told normal e2fsprogs users should never do, I'd like
to thank you all for your help and ask you to ignore this patch; we're
going to solve it some other way, i.e. fix this in the kernel itself.

Thanks for your help!

Michal

