* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
@ 2010-06-17 16:07 ` James Bottomley
0 siblings, 0 replies; 39+ messages in thread
From: James Bottomley @ 2010-06-17 16:07 UTC (permalink / raw)
To: Christof Schmitt; +Cc: linux-scsi, linux-fsdevel, linux-mm, lsf10-pc
On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
> > Given that we're under two months out, I thought it would be time to
> > post a summary of the topics we've collected so far (Nick will post the
> > MM summit ones later). Look this over, and if there's anything missing,
> > propose it ... or if you have cross Storage/FS/MM topics, post them too.
> >
> > Oh, and since we're not the most organised bunch, if you posted a topic
> > and don't see it in the list, please resend ... we probably lost it in
> > an email shuffle.
> >
> > Current Filesystem Topics:
> >
> > Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
> > Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
> > Anshul Madan reflink for NFS
> > Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
> > Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
> > James Lentini reflink for NFS
> > Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
> > Michael Rubin Writeback scaling
> > Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
> > Al Viro Sorting out d_revalidate and other dcache issues
> > Coly Li directory/large file scalability
> > Sorin Faibish Cache writeback discussion
> >
> > Current Storage Topics:
> >
> > Eric Seppanen Next generation SSDs, performance implications on Linux I/O
> > Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
> > FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
> > Hannes Reinecke libfc/multipath/error handing
> > James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
> > Jeff Moyer IO scheduler
> > Joel Becker SAN management plugin
> > Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
> >
> > Plus some MM summit ones which Nick will summarise.
> [...]
>
> What about the topic "Stable pages while IO"?
> http://www.spinics.net/lists/linux-scsi/msg44074.html
>
> Was it lost during the e-mail shuffle or will it be part of the MM topics?
It's actually listed under 'dma issues' ... but there's really been no
satisfactory resolution or discussion of how one might be achieved.
Most filesystems rely on modifications to in-flight pages for efficiency
and copying every fs I/O page would be horrendous both for performance
and memory consumption. Nor has there really been an indication that
it's a serious issue. The two sufferers are DIF and iSCSI checksum.
The latter generates the checksum late enough that it can just discard
incorrect pages ... the former might need simply to turn off DIF for
everything other than DIRECT IO.
James
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 39+ messages in thread
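The DIF concern James describes above can be illustrated with a small sketch. It is only a model, not kernel code: `submit_write` and `device_verify` are hypothetical names, and `zlib.crc32` stands in for the real guard-tag checksum. It assumes the checksum is computed when the write is submitted and checked against whatever data reaches the device.

```python
import zlib

def submit_write(page):
    """Compute a guard value over the page at submission time.

    Hypothetical stand-in for a DIF guard tag; CRC32 is used only
    because it is in the standard library.
    """
    return zlib.crc32(bytes(page))

def device_verify(page, guard):
    """The receiving end recomputes the checksum over the data it actually gets."""
    return zlib.crc32(bytes(page)) == guard

page = bytearray(b"A" * 4096)
guard = submit_write(page)

# Case 1: the page stays stable while the write is in flight.
stable_ok = device_verify(page, guard)

# Case 2: the page is modified after the checksum was taken but
# before the data is transferred, so the guard no longer matches.
page[0:3] = b"new"
unstable_ok = device_verify(page, guard)
```

In this model the second check fails although nothing is wrong with the storage path, which is the false error that motivates the stable pages discussion.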
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 16:07 ` James Bottomley
@ 2010-06-17 16:13 ` Boaz Harrosh
-1 siblings, 0 replies; 39+ messages in thread
From: Boaz Harrosh @ 2010-06-17 16:13 UTC (permalink / raw)
To: James Bottomley
Cc: Christof Schmitt, linux-scsi, linux-fsdevel, linux-mm, lsf10-pc
On 06/17/2010 12:07 PM, James Bottomley wrote:
> On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
>> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
>>> Given that we're under two months out, I thought it would be time to
>>> post a summary of the topics we've collected so far (Nick will post the
>>> MM summit ones later). Look this over, and if there's anything missing,
>>> propose it ... or if you have cross Storage/FS/MM topics, post them too.
>>>
>>> Oh, and since we're not the most organised bunch, if you posted a topic
>>> and don't see it in the list, please resend ... we probably lost it in
>>> an email shuffle.
>>>
>>> Current Filesystem Topics:
>>>
>>> Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
>>> Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
>>> Anshul Madan reflink for NFS
>>> Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
>>> Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
>>> James Lentini reflink for NFS
>>> Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
>>> Michael Rubin Writeback scaling
>>> Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
>>> Al Viro Sorting out d_revalidate and other dcache issues
>>> Coly Li directory/large file scalability
>>> Sorin Faibish Cache writeback discussion
>>>
>>> Current Storage Topics:
>>>
>>> Eric Seppanen Next generation SSDs, performance implications on Linux I/O
>>> Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
>>> FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
>>> Hannes Reinecke libfc/multipath/error handing
>>> James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
>>> Jeff Moyer IO scheduler
>>> Joel Becker SAN management plugin
>>> Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
>>>
>>> Plus some MM summit ones which Nick will summarise.
>> [...]
>>
>> What about the topic "Stable pages while IO"?
>> http://www.spinics.net/lists/linux-scsi/msg44074.html
>>
>> Was it lost during the e-mail shuffle or will it be part of the MM topics?
>
> It's actually listed under 'dma issues' ... but there's really been no
> satisfactory resolution or discussion of how one might be achieved.
> Most filesystems rely on modifications to in-flight pages for efficiency
> and copying every fs I/O page would be horrendous both for performance
> and memory consumption. Nor has there really been an indication that
> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
> The latter generates the checksum late enough that it can just discard
> incorrect pages ... the former might need simply to turn off DIF for
> everything other than DIRECT IO.
What about raid and mirror that do copy the complete IO load
just because of that? They gave up long ago but wouldn't they
gain if that was revisited? (And so would DIF/checksum)
>
> James
>
Boaz
^ permalink raw reply [flat|nested] 39+ messages in thread
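Boaz's point about raid and mirror can be sketched in the same way. This is a simplified model under stated assumptions, not the md implementation: `mirror_write` and `redirty` are hypothetical helpers, and each mirror leg is assumed to read the source data at a different moment.

```python
def mirror_write(page, copy_first, modify_between):
    """Write one page to two mirror legs.

    If copy_first is True, both legs read from a private snapshot
    (a bounce copy); otherwise they read the live page directly.
    modify_between models the page being redirtied between the
    two transfers.
    """
    source = bytes(page) if copy_first else page
    leg0 = bytes(source)      # first leg transfers its data
    modify_between(page)      # page changes while I/O is in flight
    leg1 = bytes(source)      # second leg transfers its data
    return leg0, leg1

def redirty(page):
    page[0:3] = b"new"

zero_copy = mirror_write(bytearray(b"old data"), False, redirty)
bounced = mirror_write(bytearray(b"old data"), True, redirty)
```

Without the copy the two legs end up with different contents; with the copy they agree, at the cost of duplicating every page written. Stable pages would, in principle, give the same guarantee without the copy.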
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 16:07 ` James Bottomley
@ 2010-06-17 16:34 ` Vladislav Bolkhovitin
-1 siblings, 0 replies; 39+ messages in thread
From: Vladislav Bolkhovitin @ 2010-06-17 16:34 UTC (permalink / raw)
To: James Bottomley
Cc: Christof Schmitt, linux-scsi, linux-fsdevel, linux-mm, lsf10-pc
James Bottomley, on 06/17/2010 08:07 PM wrote:
> On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
>> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
>>> Given that we're under two months out, I thought it would be time to
>>> post a summary of the topics we've collected so far (Nick will post the
>>> MM summit ones later). Look this over, and if there's anything missing,
>>> propose it ... or if you have cross Storage/FS/MM topics, post them too.
>>>
>>> Oh, and since we're not the most organised bunch, if you posted a topic
>>> and don't see it in the list, please resend ... we probably lost it in
>>> an email shuffle.
>>>
>>> Current Filesystem Topics:
>>>
>>> Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
>>> Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
>>> Anshul Madan reflink for NFS
>>> Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
>>> Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
>>> James Lentini reflink for NFS
>>> Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
>>> Michael Rubin Writeback scaling
>>> Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
>>> Al Viro Sorting out d_revalidate and other dcache issues
>>> Coly Li directory/large file scalability
>>> Sorin Faibish Cache writeback discussion
>>>
>>> Current Storage Topics:
>>>
>>> Eric Seppanen Next generation SSDs, performance implications on Linux I/O
>>> Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
>>> FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
>>> Hannes Reinecke libfc/multipath/error handing
>>> James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
>>> Jeff Moyer IO scheduler
>>> Joel Becker SAN management plugin
>>> Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
>>>
>>> Plus some MM summit ones which Nick will summarise.
>> [...]
>>
>> What about the topic "Stable pages while IO"?
>> http://www.spinics.net/lists/linux-scsi/msg44074.html
>>
>> Was it lost during the e-mail shuffle or will it be part of the MM topics?
>
> It's actually listed under 'dma issues' ... but there's really been no
> satisfactory resolution or discussion of how one might be achieved.
> Most filesystems rely on modifications to in-flight pages for efficiency
> and copying every fs I/O page would be horrendous both for performance
> and memory consumption. Nor has there really been an indication that
> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
You forgot the third: advanced storage, including MPIO clusters, where
retrying the write of modified in-flight pages while the original
write for them has not yet completed might cause the writes to execute
out of the expected order, corrupting data (old data written instead
of new).
Vlad
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 16:34 ` Vladislav Bolkhovitin
@ 2010-06-17 16:42 ` James Bottomley
-1 siblings, 0 replies; 39+ messages in thread
From: James Bottomley @ 2010-06-17 16:42 UTC (permalink / raw)
To: Vladislav Bolkhovitin
Cc: Christof Schmitt, linux-scsi, linux-fsdevel, linux-mm, lsf10-pc
On Thu, 2010-06-17 at 20:34 +0400, Vladislav Bolkhovitin wrote:
> James Bottomley, on 06/17/2010 08:07 PM wrote:
> > On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
> >> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
> >>> Given that we're under two months out, I thought it would be time to
> >>> post a summary of the topics we've collected so far (Nick will post the
> >>> MM summit ones later). Look this over, and if there's anything missing,
> >>> propose it ... or if you have cross Storage/FS/MM topics, post them too.
> >>>
> >>> Oh, and since we're not the most organised bunch, if you posted a topic
> >>> and don't see it in the list, please resend ... we probably lost it in
> >>> an email shuffle.
> >>>
> >>> Current Filesystem Topics:
> >>>
> >>> Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
> >>> Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
> >>> Anshul Madan reflink for NFS
> >>> Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
> >>> Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
> >>> James Lentini reflink for NFS
> >>> Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
> >>> Michael Rubin Writeback scaling
> >>> Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
> >>> Al Viro Sorting out d_revalidate and other dcache issues
> >>> Coly Li directory/large file scalability
> >>> Sorin Faibish Cache writeback discussion
> >>>
> >>> Current Storage Topics:
> >>>
> >>> Eric Seppanen Next generation SSDs, performance implications on Linux I/O
> >>> Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
> >>> FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
> >>> Hannes Reinecke libfc/multipath/error handing
> >>> James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
> >>> Jeff Moyer IO scheduler
> >>> Joel Becker SAN management plugin
> >>> Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
> >>>
> >>> Plus some MM summit ones which Nick will summarise.
> >> [...]
> >>
> >> What about the topic "Stable pages while IO"?
> >> http://www.spinics.net/lists/linux-scsi/msg44074.html
> >>
> >> Was it lost during the e-mail shuffle or will it be part of the MM topics?
> >
> > It's actually listed under 'dma issues' ... but there's really been no
> > satisfactory resolution or discussion of how one might be achieved.
> > Most filesystems rely on modifications to in-flight pages for efficiency
> > and copying every fs I/O page would be horrendous both for performance
> > and memory consumption. Nor has there really been an indication that
> > it's a serious issue. The two sufferers are DIF and iSCSI checksum.
>
> You forgot the third: advanced storage, including MPIO clusters, where
> retry of the write of the modified in-flight pages while the original
> write for them not yet completed might cause out of the expected order
> execution of the writes and data corruption (old data written instead of
> new).
I don't think that's a problem. Multiple commands in flight to the same
I/O region can get reordered because we only use simple tagging,
regardless of whether the storage is advanced or not. However, the VM
seems to wait for one write to complete before starting another because
of the way the flush threads work.
James
^ permalink raw reply [flat|nested] 39+ messages in thread
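The disagreement between Vlad and James comes down to whether two writes to the same region can ever be in flight at once. A minimal sketch of the two cases, assuming (as James says) that simple tagging gives no ordering guarantee between in-flight commands; the function names are hypothetical:

```python
from itertools import permutations

def outcomes_if_overlapping(writes):
    """Possible final block contents when all writes are in flight
    together and the target may execute them in any order."""
    results = set()
    for order in permutations(writes):
        block = None
        for data in order:
            block = data          # last write executed wins
        results.add(block)
    return results

def outcome_if_serialized(writes):
    """Final block contents when each write completes before the
    next one is issued, so issue order equals execution order."""
    block = None
    for data in writes:
        block = data
    return block

overlapping = outcomes_if_overlapping([b"old", b"new"])
serialized = outcome_if_serialized([b"old", b"new"])
```

With overlapping writes the block may end up holding either value, including the stale one Vlad warns about. If writeback really does wait for completion before reissuing, only the newest data can land, which is the behaviour James expects.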
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 16:42 ` James Bottomley
@ 2010-06-17 17:11 ` Vladislav Bolkhovitin
-1 siblings, 0 replies; 39+ messages in thread
From: Vladislav Bolkhovitin @ 2010-06-17 17:11 UTC (permalink / raw)
To: James Bottomley, Gennadiy Nerubayev
Cc: Christof Schmitt, linux-scsi, linux-fsdevel, linux-mm, lsf10-pc,
Boaz Harrosh
James Bottomley, on 06/17/2010 08:42 PM wrote:
> On Thu, 2010-06-17 at 20:34 +0400, Vladislav Bolkhovitin wrote:
>> James Bottomley, on 06/17/2010 08:07 PM wrote:
>>> On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
>>>> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
>>>>> Given that we're under two months out, I thought it would be time to
>>>>> post a summary of the topics we've collected so far (Nick will post the
>>>>> MM summit ones later). Look this over, and if there's anything missing,
>>>>> propose it ... or if you have cross Storage/FS/MM topics, post them too.
>>>>>
>>>>> Oh, and since we're not the most organised bunch, if you posted a topic
>>>>> and don't see it in the list, please resend ... we probably lost it in
>>>>> an email shuffle.
>>>>>
>>>>> Current Filesystem Topics:
>>>>>
>>>>> Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
>>>>> Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
>>>>> Anshul Madan reflink for NFS
>>>>> Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
>>>>> Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
>>>>> James Lentini reflink for NFS
>>>>> Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
>>>>> Michael Rubin Writeback scaling
>>>>> Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
>>>>> Al Viro Sorting out d_revalidate and other dcache issues
>>>>> Coly Li directory/large file scalability
>>>>> Sorin Faibish Cache writeback discussion
>>>>>
>>>>> Current Storage Topics:
>>>>>
>>>>> Eric Seppanen Next generation SSDs, performance implications on Linux I/O
>>>>> Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
>>>>> FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
>>>>> Hannes Reinecke libfc/multipath/error handing
>>>>> James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
>>>>> Jeff Moyer IO scheduler
>>>>> Joel Becker SAN management plugin
>>>>> Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
>>>>>
>>>>> Plus some MM summit ones which Nick will summarise.
>>>> [...]
>>>>
>>>> What about the topic "Stable pages while IO"?
>>>> http://www.spinics.net/lists/linux-scsi/msg44074.html
>>>>
>>>> Was it lost during the e-mail shuffle or will it be part of the MM topics?
>>> It's actually listed under 'dma issues' ... but there's really been no
>>> satisfactory resolution or discussion of how one might be achieved.
>>> Most filesystems rely on modifications to in-flight pages for efficiency
>>> and copying every fs I/O page would be horrendous both for performance
>>> and memory consumption. Nor has there really been an indication that
>>> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
>> You forgot the third: advanced storage, including MPIO clusters, where
>> retry of the write of the modified in-flight pages while the original
>> write for them not yet completed might cause out of the expected order
>> execution of the writes and data corruption (old data written instead of
>> new).
>
> I don't think that's a problem. Multiple commands in flight to the same
> I/O region can get reordered because we only use simple tagging
> regardless of advanced or otherwise storage. The VM seems to wait for
> one write to complete before starting another because of the way the
> flush threads work.
I hope so, but: (1) we can see such writes (see
http://lists.linbit.com/pipermail/drbd-user/2009-April/011891.html, for
instance) and (2) Boaz said it's possible. From the "seems" you wrote,
it looks like you are also not too sure. So, if it isn't possible, it
would be good if someone familiar with VM internals confirmed this.
Gennadiy,
If possible, can you recheck in your setup with a real Linux as
initiator to confirm whether or not Linux suffers from the concurrent
writes you've seen, please?
Vlad
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 17:11 ` Vladislav Bolkhovitin
(?)
@ 2010-06-17 17:37 ` James Bottomley
2010-06-17 17:55 ` Vladislav Bolkhovitin
-1 siblings, 1 reply; 39+ messages in thread
From: James Bottomley @ 2010-06-17 17:37 UTC (permalink / raw)
To: Vladislav Bolkhovitin
Cc: Gennadiy Nerubayev, Christof Schmitt, linux-scsi, linux-fsdevel,
linux-mm, lsf10-pc, Boaz Harrosh
On Thu, 2010-06-17 at 21:11 +0400, Vladislav Bolkhovitin wrote:
> James Bottomley, on 06/17/2010 08:42 PM wrote:
> > On Thu, 2010-06-17 at 20:34 +0400, Vladislav Bolkhovitin wrote:
> >> James Bottomley, on 06/17/2010 08:07 PM wrote:
> >>> On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
> >>>> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
> >>>>> Given that we're under two months out, I thought it would be time to
> >>>>> post a summary of the topics we've collected so far (Nick will post the
> >>>>> MM summit ones later). Look this over, and if there's anything missing,
> >>>>> propose it ... or if you have cross Storage/FS/MM topics, post them too.
> >>>>>
> >>>>> Oh, and since we're not the most organised bunch, if you posted a topic
> >>>>> and don't see it in the list, please resend ... we probably lost it in
> >>>>> an email shuffle.
> >>>>>
> >>>>> Current Filesystem Topics:
> >>>>>
> >>>>> Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
> >>>>> Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
> >>>>> Anshul Madan reflink for NFS
> >>>>> Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
> >>>>> Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
> >>>>> James Lentini reflink for NFS
> >>>>> Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
> >>>>> Michael Rubin Writeback scaling
> >>>>> Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
> >>>>> Al Viro Sorting out d_revalidate and other dcache issues
> >>>>> Coly Li directory/large file scalability
> >>>>> Sorin Faibish Cache writeback discussion
> >>>>>
> >>>>> Current Storage Topics:
> >>>>>
> >>>>> Eric Seppanen Next generation SSDs, performance implications on Linux I/O
> >>>>> Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
> >>>>> FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
> >>>>> Hannes Reinecke libfc/multipath/error handing
> >>>>> James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
> >>>>> Jeff Moyer IO scheduler
> >>>>> Joel Becker SAN management plugin
> >>>>> Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
> >>>>>
> >>>>> Plus some MM summit ones which Nick will summarise.
> >>>> [...]
> >>>>
> >>>> What about the topic "Stable pages while IO"?
> >>>> http://www.spinics.net/lists/linux-scsi/msg44074.html
> >>>>
> >>>> Was it lost during the e-mail shuffle or will it be part of the MM topics?
> >>> It's actually listed under 'dma issues' ... but there's really been no
> >>> satisfactory resolution or discussion of how one might be achieved.
> >>> Most filesystems rely on modifications to in-flight pages for efficiency
> >>> and copying every fs I/O page would be horrendous both for performance
> >>> and memory consumption. Nor has there really been an indication that
> >>> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
> >> You forgot the third: advanced storage, including MPIO clusters, where
> >> retry of the write of the modified in-flight pages while the original
> >> write for them not yet completed might cause out of the expected order
> >> execution of the writes and data corruption (old data written instead of
> >> new).
> >
> > I don't think that's a problem. Multiple commands in flight to the same
> > I/O region can get reordered because we only use simple tagging
> > regardless of advanced or otherwise storage. The VM seems to wait for
> > one write to complete before starting another because of the way the
> > flush threads work.
>
> I hope so, but: (1) we can see such writes (see
> http://lists.linbit.com/pipermail/drbd-user/2009-April/011891.html, for
> instance)
So the email says blockio mode ... which I take it isn't through the
pagecache cleaning? All bets are off if the user initiates the
writeback ... and certainly you can get two blocks in flight for the
same destination using DIRECT IO ... but that's up to the applications
to fix ... we don't guarantee ordering in that case.
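A minimal sketch of that last point, as a toy Python model rather than real block-layer code: two writes to the same block are in flight at once, and with simple tagging the device may complete them in either order. The names `Device`, `complete_in_order` and `final_contents` are hypothetical, invented for this illustration.

```python
from itertools import permutations


class Device:
    """Toy block device: a dict mapping block number to its contents."""

    def __init__(self):
        self.blocks = {}

    def complete_in_order(self, writes):
        # Apply in-flight writes in the order the device chose to complete them.
        for block, data in writes:
            self.blocks[block] = data


def final_contents(in_flight, block):
    """Return every possible final content of `block` over all completion orders."""
    outcomes = set()
    for order in permutations(in_flight):
        dev = Device()
        dev.complete_in_order(order)
        outcomes.add(dev.blocks[block])
    return outcomes


# Two writes to block 7 issued back to back, without waiting for completion.
in_flight = [(7, b"old"), (7, b"new")]
print(final_contents(in_flight, 7))
```

Both `b"old"` and `b"new"` are possible final contents here; only waiting for the first write to complete before issuing the second narrows that to one.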
James
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 17:37 ` James Bottomley
@ 2010-06-17 17:55 ` Vladislav Bolkhovitin
0 siblings, 0 replies; 39+ messages in thread
From: Vladislav Bolkhovitin @ 2010-06-17 17:55 UTC (permalink / raw)
To: James Bottomley, Gennadiy Nerubayev
Cc: Christof Schmitt, linux-scsi, linux-fsdevel, linux-mm, lsf10-pc,
Boaz Harrosh
James Bottomley, on 06/17/2010 09:37 PM wrote:
> On Thu, 2010-06-17 at 21:11 +0400, Vladislav Bolkhovitin wrote:
>> James Bottomley, on 06/17/2010 08:42 PM wrote:
>>> On Thu, 2010-06-17 at 20:34 +0400, Vladislav Bolkhovitin wrote:
>>>> James Bottomley, on 06/17/2010 08:07 PM wrote:
>>>>> On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
>>>>>> On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
>>>>>>> Given that we're under two months out, I thought it would be time to
>>>>>>> post a summary of the topics we've collected so far (Nick will post the
>>>>>>> MM summit ones later). Look this over, and if there's anything missing,
>>>>>>> propose it ... or if you have cross Storage/FS/MM topics, post them too.
>>>>>>>
>>>>>>> Oh, and since we're not the most organised bunch, if you posted a topic
>>>>>>> and don't see it in the list, please resend ... we probably lost it in
>>>>>>> an email shuffle.
>>>>>>>
>>>>>>> Current Filesystem Topics:
>>>>>>>
>>>>>>> Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
>>>>>>> Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
>>>>>>> Anshul Madan reflink for NFS
>>>>>>> Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
>>>>>>> Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
>>>>>>> James Lentini reflink for NFS
>>>>>>> Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
>>>>>>> Michael Rubin Writeback scaling
>>>>>>> Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
>>>>>>> Al Viro Sorting out d_revalidate and other dcache issues
>>>>>>> Coly Li directory/large file scalability
>>>>>>> Sorin Faibish Cache writeback discussion
>>>>>>>
>>>>>>> Current Storage Topics:
>>>>>>>
>>>>>>> Eric Seppanen Next generation SSDs, performance implications on Linux I/O
>>>>>>> Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
>>>>>>> FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
>>>>>>> Hannes Reinecke libfc/multipath/error handing
>>>>>>> James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
>>>>>>> Jeff Moyer IO scheduler
>>>>>>> Joel Becker SAN management plugin
>>>>>>> Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
>>>>>>>
>>>>>>> Plus some MM summit ones which Nick will summarise.
>>>>>> [...]
>>>>>>
>>>>>> What about the topic "Stable pages while IO"?
>>>>>> http://www.spinics.net/lists/linux-scsi/msg44074.html
>>>>>>
>>>>>> Was it lost during the e-mail shuffle or will it be part of the MM topics?
>>>>> It's actually listed under 'dma issues' ... but there's really been no
>>>>> satisfactory resolution or discussion of how one might be achieved.
>>>>> Most filesystems rely on modifications to in-flight pages for efficiency
>>>>> and copying every fs I/O page would be horrendous both for performance
>>>>> and memory consumption. Nor has there really been an indication that
>>>>> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
>>>> You forgot the third: advanced storage, including MPIO clusters, where
>>>> retry of the write of the modified in-flight pages while the original
>>>> write for them not yet completed might cause out of the expected order
>>>> execution of the writes and data corruption (old data written instead of
>>>> new).
>>> I don't think that's a problem. Multiple commands in flight to the same
>>> I/O region can get reordered because we only use simple tagging
>>> regardless of advanced or otherwise storage. The VM seems to wait for
>>> one write to complete before starting another because of the way the
>>> flush threads work.
>> I hope so, but: (1) we can see such writes (see
>> http://lists.linbit.com/pipermail/drbd-user/2009-April/011891.html, for
>> instance)
>
> So the email says blockio mode ... which I take it isn't through the
> pagecache cleaning? All bets are off if the user initiates the
> writeback ... and certainly you can get two blocks in flight for the
> same destination using DIRECT IO ... but that's up to the applications
> to fix ... we don't guarantee ordering in that case.
That's blockio on the target. The target stack just passed the SCSI
commands coming in from the initiator down to its backing storage as
block requests in a 1:1 mapping. But the SCSI commands were sent by the
initiator. So, the concurrent writes came from the initiator.
Gennadiy,
Which application did you use on the initiator to generate load when you
had the concurrent writes?
Vlad
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 16:07 ` James Bottomley
@ 2010-06-18 11:41 ` Christof Schmitt
-1 siblings, 0 replies; 39+ messages in thread
From: Christof Schmitt @ 2010-06-18 11:41 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-scsi, linux-fsdevel, linux-mm, lsf10-pc
On Thu, Jun 17, 2010 at 11:07:30AM -0500, James Bottomley wrote:
> On Thu, 2010-06-17 at 18:00 +0200, Christof Schmitt wrote:
> > On Wed, Jun 16, 2010 at 03:50:59PM -0500, James Bottomley wrote:
> > > Given that we're under two months out, I thought it would be time to
> > > post a summary of the topics we've collected so far (Nick will post the
> > > MM summit ones later). Look this over, and if there's anything missing,
> > > propose it ... or if you have cross Storage/FS/MM topics, post them too.
> > >
> > > Oh, and since we're not the most organised bunch, if you posted a topic
> > > and don't see it in the list, please resend ... we probably lost it in
> > > an email shuffle.
> > >
> > > Current Filesystem Topics:
> > >
> > > Alex Elder Upstream maintainer for XFS, general discussion on FS/IO
> > > Aneesh Kumar Rich-acl patches which work better with NFSv4 acl and CIFS acl
> > > Anshul Madan reflink for NFS
> > > Chuck Lever NFS/IPV6 and NFS O_DIRECT, Wu's read-ahead work, vitro perf tools
> > > Eric Sandeen Advances in testing, TRIM/DISCARD/Alignment, writeback sanity
> > > James Lentini reflink for NFS
> > > Jan Kara Discuss/drive sanity review of writeback and general ext*/jbd
> > > Michael Rubin Writeback scaling
> > > Sage Weil Statlite, generic interface for describing file striping for distributed FS, VFS scalability
> > > Al Viro Sorting out d_revalidate and other dcache issues
> > > Coly Li directory/large file scalability
> > > Sorin Faibish Cache writeback discussion
> > >
> > > Current Storage Topics:
> > >
> > > Eric Seppanen Next generation SSDs, performance implications on Linux I/O
> > > Boaz Harrosh PNFS performance considerations, bio_list based/async raidN for generic use; stable pages for I/O
> > > FUJITA Tomonori SCSI target mode, iSCSI, block layer SG (bsg), sg, IOMMU, DMA issues
> > > Hannes Reinecke libfc/multipath/error handing
> > > James Smart FCOE proposal for rework of the FC sysfs tree, work with Hannes on other transport/SCSI subsystem topics
> > > Jeff Moyer IO scheduler
> > > Joel Becker SAN management plugin
> > > Martin Petersen Updates on DIF/DIX, TRIM/DISCARD/UNMAP, generic support for WRITE_SAME
> > >
> > > Plus some MM summit ones which Nick will summarise.
> > [...]
> >
> > What about the topic "Stable pages while IO"?
> > http://www.spinics.net/lists/linux-scsi/msg44074.html
> >
> > Was it lost during the e-mail shuffle or will it be part of the MM topics?
>
> It's actually listed under 'dma issues' ... but there's really been no
> satisfactory resolution or discussion of how one might be achieved.
> Most filesystems rely on modifications to in-flight pages for efficiency
> and copying every fs I/O page would be horrendous both for performance
> and memory consumption. Nor has there really been an indication that
> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
> The latter generates the checksum late enough that it can just discard
> incorrect pages ... the former might need simply to turn off DIF for
> everything other than DIRECT IO.
It is a serious problem when using DIF, so turning off this feature or
only using XFS and direct I/O does not sound very satisfactory. But then
I also see the points that have been discussed and that there is no
simple solution.
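To make the failure mode concrete, a minimal Python sketch, assuming CRC32 as a stand-in for the DIF guard tag (the real guard tag is a 16-bit CRC, and `submit_write` and `device_verify` are hypothetical names, not kernel functions): the checksum is computed when the write is prepared, the page is modified while the write is in flight, and verification at the far end fails even though nothing was corrupted in transit.

```python
import zlib


def submit_write(page):
    """Compute the guard checksum at submission time; the page is NOT copied."""
    return zlib.crc32(bytes(page))


def device_verify(page, guard):
    """Check made by the HBA or target when the data actually arrives."""
    return zlib.crc32(bytes(page)) == guard


page = bytearray(b"A" * 512)      # one 512-byte sector held in the page cache
guard = submit_write(page)        # checksum generated, write now in flight

stable_ok = device_verify(page, guard)    # page untouched: verification passes

page[0:4] = b"BBBB"               # filesystem redirties the in-flight page
unstable_ok = device_verify(page, guard)  # same guard, different data: fails

print(stable_ok, unstable_ok)     # prints "True False"
```

Taking a private copy of the page before computing the checksum would make the second check pass, which is exactly the per-page copy whose performance and memory cost is the objection quoted above.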
Christof
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [Lsf10-pc] Current topics for LSF10/MM Summit 8-9 August in Boston
2010-06-17 16:07 ` James Bottomley
@ 2010-06-18 12:18 ` J. Bruce Fields
-1 siblings, 0 replies; 39+ messages in thread
From: J. Bruce Fields @ 2010-06-18 12:18 UTC (permalink / raw)
To: James Bottomley
Cc: Christof Schmitt, linux-fsdevel, linux-mm, lsf10-pc, linux-scsi
On Thu, Jun 17, 2010 at 11:07:30AM -0500, James Bottomley wrote:
> It's actually listed under 'dma issues' ... but there's really been no
> satisfactory resolution or discussion of how one might be achieved.
> Most filesystems rely on modifications to in-flight pages for efficiency
> and copying every fs I/O page would be horrendous both for performance
> and memory consumption. Nor has there really been an indication that
> it's a serious issue. The two sufferers are DIF and iSCSI checksum.
And, again, NFS (both client (on writes) and server (on reads)), when
using sec=krb5i. Haven't tried to reproduce the problem, but I believe
it would result in spurious IO errors.
--b.
^ permalink raw reply [flat|nested] 39+ messages in thread