* CephFS hangs when writing 10GB files in loop
From: Wido den Hollander @ 2014-12-17 16:35 UTC
To: ceph-devel

Hi,

Today I've been playing with CephFS and the morning started great with
CephFS playing along just fine.

Some information first:
- Ceph 0.89
- Linux kernel 3.18
- Ceph fuse 0.89
- One Active MDS, one Standby

This morning I could write a 10GB file like this using the kclient:
$ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync

That gave me 850MB/sec (all 10G network) and I could read the same file
again with 610MB/sec.

After writing to it multiple times it suddenly started to hang.

No real evidence on the MDS (debug mds set to 20) or anything on the
client. That specific operation just blocked, but I could still 'ls' the
filesystem in a second terminal.

The MDS was showing in its log that it was checking active sessions of
clients. It showed the active session of my single client.

The client renewed its caps and proceeded.

I currently don't have any logs, but I'm just looking for a direction to
be pointed towards.

Any ideas?

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on

* Re: CephFS hangs when writing 10GB files in loop
From: Sage Weil @ 2014-12-17 16:40 UTC
To: Wido den Hollander; +Cc: ceph-devel

On Wed, 17 Dec 2014, Wido den Hollander wrote:
> Hi,
>
> Today I've been playing with CephFS and the morning started great with
> CephFS playing along just fine.
>
> Some information first:
> - Ceph 0.89
> - Linux kernel 3.18
> - Ceph fuse 0.89
> - One Active MDS, one Standby
>
> This morning I could write a 10GB file like this using the kclient:
> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>
> That gave me 850MB/sec (all 10G network) and I could read the same file
> again with 610MB/sec.
>
> After writing to it multiple times it suddenly started to hang.
>
> No real evidence on the MDS (debug mds set to 20) or anything on the
> client. That specific operation just blocked, but I could still 'ls' the
> filesystem in a second terminal.
>
> The MDS was showing in its log that it was checking active sessions of
> clients. It showed the active session of my single client.
>
> The client renewed its caps and proceeded.
>
> I currently don't have any logs, but I'm just looking for a direction to
> be pointed towards.

Hmm. Try

 cat /sys/kernel/debug/ceph/*/mdsc
 cat /sys/kernel/debug/ceph/*/osdc

to see requests in flight (you may need to mount -t debugfs none
/sys/kernel/debug first). What kernel version?

sage

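The check Sage describes could be wrapped roughly as follows. This is only a sketch, assuming a kernel CephFS client with debugfs already mounted; the `CEPH_DEBUGFS` variable and the `dump_inflight` function name are made up for illustration, and only the `mdsc` and `osdc` paths come from the message above.

```shell
#!/bin/sh
# Sketch: print in-flight MDS and OSD requests for every kernel Ceph client.
# CEPH_DEBUGFS is a hypothetical override for illustration; the default is
# the directory named in the message above.
CEPH_DEBUGFS="${CEPH_DEBUGFS:-/sys/kernel/debug/ceph}"

dump_inflight() {
    found=0
    # One sub-directory per mounted client instance.
    for dir in "$CEPH_DEBUGFS"/*/; do
        [ -d "$dir" ] || continue
        found=1
        for f in mdsc osdc; do
            echo "== ${dir}${f}"
            # An empty file means nothing of that kind is pending.
            cat "${dir}${f}" 2>/dev/null
        done
    done
    if [ "$found" -eq 0 ]; then
        echo "no ceph client directories under $CEPH_DEBUGFS" >&2
        return 1
    fi
}
```

Calling `dump_inflight` as root while the `dd` is blocked should show whether the stuck request is waiting on the MDS (`mdsc`) or on an OSD (`osdc`). If the directory is missing, debugfs probably needs mounting first, as noted in the message.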
* Re: CephFS hangs when writing 10GB files in loop
From: Wido den Hollander @ 2014-12-17 16:43 UTC
To: Sage Weil; +Cc: ceph-devel

On 12/17/2014 05:40 PM, Sage Weil wrote:
> On Wed, 17 Dec 2014, Wido den Hollander wrote:
>> Hi,
>>
>> Today I've been playing with CephFS and the morning started great with
>> CephFS playing along just fine.
>>
>> Some information first:
>> - Ceph 0.89
>> - Linux kernel 3.18
>> - Ceph fuse 0.89
>> - One Active MDS, one Standby
>>
>> This morning I could write a 10GB file like this using the kclient:
>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>
>> That gave me 850MB/sec (all 10G network) and I could read the same file
>> again with 610MB/sec.
>>
>> After writing to it multiple times it suddenly started to hang.
>>
>> No real evidence on the MDS (debug mds set to 20) or anything on the
>> client. That specific operation just blocked, but I could still 'ls' the
>> filesystem in a second terminal.
>>
>> The MDS was showing in its log that it was checking active sessions of
>> clients. It showed the active session of my single client.
>>
>> The client renewed its caps and proceeded.
>>
>> I currently don't have any logs, but I'm just looking for a direction to
>> be pointed towards.
>
> Hmm. Try
>
>  cat /sys/kernel/debug/ceph/*/mdsc
>  cat /sys/kernel/debug/ceph/*/osdc
>

I'll check that, good point.

> to see requests in flight (you may need to mount -t debugfs none
> /sys/kernel/debug first). What kernel version?
>

I tried with 3.18

Also tried with ceph-fuse 0.89, same result. It is slower, but it also
hangs at some point.

> sage
>

--
Wido den Hollander

* Re: CephFS hangs when writing 10GB files in loop
From: Gregory Farnum @ 2014-12-17 18:42 UTC
To: Wido den Hollander; +Cc: ceph-devel

On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@42on.com> wrote:
> Hi,
>
> Today I've been playing with CephFS and the morning started great with
> CephFS playing along just fine.
>
> Some information first:
> - Ceph 0.89
> - Linux kernel 3.18
> - Ceph fuse 0.89
> - One Active MDS, one Standby
>
> This morning I could write a 10GB file like this using the kclient:
> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>
> That gave me 850MB/sec (all 10G network) and I could read the same file
> again with 610MB/sec.
>
> After writing to it multiple times it suddenly started to hang.
>
> No real evidence on the MDS (debug mds set to 20) or anything on the
> client. That specific operation just blocked, but I could still 'ls' the
> filesystem in a second terminal.
>
> The MDS was showing in its log that it was checking active sessions of
> clients. It showed the active session of my single client.
>
> The client renewed its caps and proceeded.

Can you clarify this? I'm not quite sure what you mean.

> I currently don't have any logs, but I'm just looking for a direction to
> be pointed towards.
>
> Any ideas?

Well, now that you're on v0.89 you should explore the admin
socket...there are commands on the MDS to dump ops in flight (and
maybe to look at session states? I don't remember when that merged).
-Greg

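A rough sketch of querying the MDS admin socket as Greg suggests, assuming it runs on the MDS host itself. The `mds_admin` helper and the `CEPH_BIN` variable are hypothetical names for illustration. The `dump_ops_in_flight` command name is an assumption borrowed from the OSD admin socket, so it is worth listing the supported commands with `help` first.

```shell
#!/bin/sh
# Sketch: send a command to a local MDS through its admin socket.
# CEPH_BIN is a hypothetical override for illustration; it defaults to the
# regular ceph CLI.
CEPH_BIN="${CEPH_BIN:-ceph}"

mds_admin() {
    # $1 = MDS id (for example "a"); remaining arguments = admin command.
    id="$1"
    shift
    "$CEPH_BIN" daemon "mds.$id" "$@"
}

# List what this MDS build actually supports before relying on a command:
#   mds_admin a help
# Then, if it appears in the help output (assumed name, see above):
#   mds_admin a dump_ops_in_flight
```

Starting from `help` avoids guessing at command names, which matters here because Greg himself was unsure which session-related commands had merged by v0.89.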
* Re: CephFS hangs when writing 10GB files in loop
From: Wido den Hollander @ 2014-12-18 10:13 UTC
To: Gregory Farnum; +Cc: ceph-devel

On 12/17/2014 07:42 PM, Gregory Farnum wrote:
> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> Hi,
>>
>> Today I've been playing with CephFS and the morning started great with
>> CephFS playing along just fine.
>>
>> Some information first:
>> - Ceph 0.89
>> - Linux kernel 3.18
>> - Ceph fuse 0.89
>> - One Active MDS, one Standby
>>
>> This morning I could write a 10GB file like this using the kclient:
>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>
>> That gave me 850MB/sec (all 10G network) and I could read the same file
>> again with 610MB/sec.
>>
>> After writing to it multiple times it suddenly started to hang.
>>
>> No real evidence on the MDS (debug mds set to 20) or anything on the
>> client. That specific operation just blocked, but I could still 'ls' the
>> filesystem in a second terminal.
>>
>> The MDS was showing in its log that it was checking active sessions of
>> clients. It showed the active session of my single client.
>>
>> The client renewed its caps and proceeded.
>
> Can you clarify this? I'm not quite sure what you mean.
>

I currently don't have the logs available. That was my problem when
typing the original e-mail.

>> I currently don't have any logs, but I'm just looking for a direction to
>> be pointed towards.
>>
>> Any ideas?
>
> Well, now that you're on v0.89 you should explore the admin
> socket...there are commands on the MDS to dump ops in flight (and
> maybe to look at session states? I don't remember when that merged).

Sage's pointer towards the kernel debugging and the new admin socket
showed me that it was RADOS calls that were hanging.

I investigated even further and it seems that this is not a CephFS
problem, but a local TCP issue which is only triggered when using CephFS.

At some point, which is still unclear to me, data transfer becomes very
slow. The MDS doesn't seem to be able to update the journal and the
client can't write to the OSDs anymore.

It happened after I did some very basic TCP tuning (timestamp, rmem,
wmem, sack, fastopen).

Reverting to the Ubuntu 14.04 defaults resolved it all and CephFS
is running happily now.

I'll dig a bit deeper to see why this system was affected by those
changes. I applied these settings earlier on an RBD-only cluster without
any problems.

> -Greg
>

--
Wido den Hollander

* Re: CephFS hangs when writing 10GB files in loop
From: Wido den Hollander @ 2014-12-18 15:54 UTC
To: Gregory Farnum; +Cc: ceph-devel

On 12/18/2014 11:13 AM, Wido den Hollander wrote:
> On 12/17/2014 07:42 PM, Gregory Farnum wrote:
>> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@42on.com> wrote:
>>> Hi,
>>>
>>> Today I've been playing with CephFS and the morning started great with
>>> CephFS playing along just fine.
>>>
>>> Some information first:
>>> - Ceph 0.89
>>> - Linux kernel 3.18
>>> - Ceph fuse 0.89
>>> - One Active MDS, one Standby
>>>
>>> This morning I could write a 10GB file like this using the kclient:
>>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>>
>>> That gave me 850MB/sec (all 10G network) and I could read the same file
>>> again with 610MB/sec.
>>>
>>> After writing to it multiple times it suddenly started to hang.
>>>
>>> No real evidence on the MDS (debug mds set to 20) or anything on the
>>> client. That specific operation just blocked, but I could still 'ls' the
>>> filesystem in a second terminal.
>>>
>>> The MDS was showing in its log that it was checking active sessions of
>>> clients. It showed the active session of my single client.
>>>
>>> The client renewed its caps and proceeded.
>>
>> Can you clarify this? I'm not quite sure what you mean.
>>
>
> I currently don't have the logs available. That was my problem when
> typing the original e-mail.
>
>>> I currently don't have any logs, but I'm just looking for a direction to
>>> be pointed towards.
>>>
>>> Any ideas?
>>
>> Well, now that you're on v0.89 you should explore the admin
>> socket...there are commands on the MDS to dump ops in flight (and
>> maybe to look at session states? I don't remember when that merged).
>
> Sage's pointer towards the kernel debugging and the new admin socket
> showed me that it was RADOS calls that were hanging.
>
> I investigated even further and it seems that this is not a CephFS
> problem, but a local TCP issue which is only triggered when using CephFS.
>
> At some point, which is still unclear to me, data transfer becomes very
> slow. The MDS doesn't seem to be able to update the journal and the
> client can't write to the OSDs anymore.
>
> It happened after I did some very basic TCP tuning (timestamp, rmem,
> wmem, sack, fastopen).
>

So it was tcp_sack. With tcp_sack=0 the MDS has problems talking to
OSDs. Other clients still work fine, but the MDS couldn't replay its
journal and such.

Enabling tcp_sack again resolved the problem. The new admin socket
really helped there!

> Reverting to the Ubuntu 14.04 defaults resolved it all and CephFS
> is running happily now.
>
> I'll dig a bit deeper to see why this system was affected by those
> changes. I applied these settings earlier on an RBD-only cluster without
> any problems.
>
>> -Greg
>>
>

--
Wido den Hollander

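A minimal sketch for checking the setting identified above, assuming a Linux host. `net.ipv4.tcp_sack` is the standard sysctl behind "tcp_sack"; the `sack_state` function and the `PROC_SYS` override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: report whether TCP selective acknowledgements are enabled.
# PROC_SYS is a hypothetical override for illustration; the real file is
# /proc/sys/net/ipv4/tcp_sack.
PROC_SYS="${PROC_SYS:-/proc/sys}"

sack_state() {
    val=$(cat "$PROC_SYS/net/ipv4/tcp_sack") || return 2
    case "$val" in
        1) echo "tcp_sack enabled" ;;
        0) echo "tcp_sack disabled"; return 1 ;;
        *) echo "tcp_sack unexpected value: $val"; return 2 ;;
    esac
}

# To re-enable at runtime (as root):
#   sysctl -w net.ipv4.tcp_sack=1
```

Running `sack_state` on the MDS, OSD and client hosts shows quickly whether any of them still carries the tuned value. A runtime `sysctl -w` change does not survive a reboot, so any tuning file that set it to 0 would need reverting as well.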
* Re: CephFS hangs when writing 10GB files in loop
From: Atchley, Scott @ 2014-12-18 16:32 UTC
To: Wido den Hollander; +Cc: Gregory Farnum, ceph-devel

On Dec 18, 2014, at 10:54 AM, Wido den Hollander <wido@42on.com> wrote:

> On 12/18/2014 11:13 AM, Wido den Hollander wrote:
>> On 12/17/2014 07:42 PM, Gregory Farnum wrote:
>>> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@42on.com> wrote:
>>>> Hi,
>>>>
>>>> Today I've been playing with CephFS and the morning started great with
>>>> CephFS playing along just fine.
>>>>
>>>> Some information first:
>>>> - Ceph 0.89
>>>> - Linux kernel 3.18
>>>> - Ceph fuse 0.89
>>>> - One Active MDS, one Standby
>>>>
>>>> This morning I could write a 10GB file like this using the kclient:
>>>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>>>
>>>> That gave me 850MB/sec (all 10G network) and I could read the same file
>>>> again with 610MB/sec.
>>>>
>>>> After writing to it multiple times it suddenly started to hang.
>>>>
>>>> No real evidence on the MDS (debug mds set to 20) or anything on the
>>>> client. That specific operation just blocked, but I could still 'ls' the
>>>> filesystem in a second terminal.
>>>>
>>>> The MDS was showing in its log that it was checking active sessions of
>>>> clients. It showed the active session of my single client.
>>>>
>>>> The client renewed its caps and proceeded.
>>>
>>> Can you clarify this? I'm not quite sure what you mean.
>>>
>>
>> I currently don't have the logs available. That was my problem when
>> typing the original e-mail.
>>
>>>> I currently don't have any logs, but I'm just looking for a direction to
>>>> be pointed towards.
>>>>
>>>> Any ideas?
>>>
>>> Well, now that you're on v0.89 you should explore the admin
>>> socket...there are commands on the MDS to dump ops in flight (and
>>> maybe to look at session states? I don't remember when that merged).
>>
>> Sage's pointer towards the kernel debugging and the new admin socket
>> showed me that it was RADOS calls that were hanging.
>>
>> I investigated even further and it seems that this is not a CephFS
>> problem, but a local TCP issue which is only triggered when using CephFS.
>>
>> At some point, which is still unclear to me, data transfer becomes very
>> slow. The MDS doesn't seem to be able to update the journal and the
>> client can't write to the OSDs anymore.
>>
>> It happened after I did some very basic TCP tuning (timestamp, rmem,
>> wmem, sack, fastopen).
>>
>
> So it was tcp_sack. With tcp_sack=0 the MDS has problems talking to
> OSDs. Other clients still work fine, but the MDS couldn't replay its
> journal and such.
>
> Enabling tcp_sack again resolved the problem. The new admin socket
> really helped there!

What was the reasoning behind disabling SACK to begin with? Without it,
any drops or reordering might require resending potentially a lot of
data.

>
>> Reverting to the Ubuntu 14.04 defaults resolved it all and CephFS
>> is running happily now.
>>
>> I'll dig a bit deeper to see why this system was affected by those
>> changes. I applied these settings earlier on an RBD-only cluster without
>> any problems.
>>
>>> -Greg
>>>

* Re: CephFS hangs when writing 10GB files in loop
From: Wido den Hollander @ 2014-12-18 20:50 UTC
To: Atchley, Scott; +Cc: Gregory Farnum, ceph-devel

On 12/18/2014 05:32 PM, Atchley, Scott wrote:
> On Dec 18, 2014, at 10:54 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> On 12/18/2014 11:13 AM, Wido den Hollander wrote:
>>> On 12/17/2014 07:42 PM, Gregory Farnum wrote:
>>>> On Wed, Dec 17, 2014 at 8:35 AM, Wido den Hollander <wido@42on.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Today I've been playing with CephFS and the morning started great with
>>>>> CephFS playing along just fine.
>>>>>
>>>>> Some information first:
>>>>> - Ceph 0.89
>>>>> - Linux kernel 3.18
>>>>> - Ceph fuse 0.89
>>>>> - One Active MDS, one Standby
>>>>>
>>>>> This morning I could write a 10GB file like this using the kclient:
>>>>> $ dd if=/dev/zero of=10GB.bin bs=1M count=10240 conv=fsync
>>>>>
>>>>> That gave me 850MB/sec (all 10G network) and I could read the same file
>>>>> again with 610MB/sec.
>>>>>
>>>>> After writing to it multiple times it suddenly started to hang.
>>>>>
>>>>> No real evidence on the MDS (debug mds set to 20) or anything on the
>>>>> client. That specific operation just blocked, but I could still 'ls' the
>>>>> filesystem in a second terminal.
>>>>>
>>>>> The MDS was showing in its log that it was checking active sessions of
>>>>> clients. It showed the active session of my single client.
>>>>>
>>>>> The client renewed its caps and proceeded.
>>>>
>>>> Can you clarify this? I'm not quite sure what you mean.
>>>>
>>>
>>> I currently don't have the logs available. That was my problem when
>>> typing the original e-mail.
>>>
>>>>> I currently don't have any logs, but I'm just looking for a direction to
>>>>> be pointed towards.
>>>>>
>>>>> Any ideas?
>>>>
>>>> Well, now that you're on v0.89 you should explore the admin
>>>> socket...there are commands on the MDS to dump ops in flight (and
>>>> maybe to look at session states? I don't remember when that merged).
>>>
>>> Sage's pointer towards the kernel debugging and the new admin socket
>>> showed me that it was RADOS calls that were hanging.
>>>
>>> I investigated even further and it seems that this is not a CephFS
>>> problem, but a local TCP issue which is only triggered when using CephFS.
>>>
>>> At some point, which is still unclear to me, data transfer becomes very
>>> slow. The MDS doesn't seem to be able to update the journal and the
>>> client can't write to the OSDs anymore.
>>>
>>> It happened after I did some very basic TCP tuning (timestamp, rmem,
>>> wmem, sack, fastopen).
>>>
>>
>> So it was tcp_sack. With tcp_sack=0 the MDS has problems talking to
>> OSDs. Other clients still work fine, but the MDS couldn't replay its
>> journal and such.
>>
>> Enabling tcp_sack again resolved the problem. The new admin socket
>> really helped there!
>
> What was the reasoning behind disabling SACK to begin with? Without it,
> any drops or reordering might require resending potentially a lot of
> data.
>

I was testing with various TCP settings and sack was one of those.

It didn't occur to me earlier that it might be the problem.

>>
>>> Reverting to the Ubuntu 14.04 defaults resolved it all and CephFS
>>> is running happily now.
>>>
>>> I'll dig a bit deeper to see why this system was affected by those
>>> changes. I applied these settings earlier on an RBD-only cluster without
>>> any problems.
>>>
>>>> -Greg
>>>>

--
Wido den Hollander

end of thread, newest message: 2014-12-18 20:50 UTC

Thread overview: 8+ messages
2014-12-17 16:35 CephFS hangs when writing 10GB files in loop Wido den Hollander
2014-12-17 16:40 ` Sage Weil
2014-12-17 16:43   ` Wido den Hollander
2014-12-17 18:42 ` Gregory Farnum
2014-12-18 10:13   ` Wido den Hollander
2014-12-18 15:54     ` Wido den Hollander
2014-12-18 16:32       ` Atchley, Scott
2014-12-18 20:50         ` Wido den Hollander