CephFS and Samba hang on copy of large file


* CephFS and Samba hang on copy of large file
@ 2016-08-15 13:23 Wido den Hollander
       [not found] ` <CAAPGDwLxomxLJMZNsW7qs9vqjnxH9zbBkUjBok7tsHXA_rt3yA@mail.gmail.com>
  2016-08-16 14:05 ` Jeff Layton
  0 siblings, 2 replies; 14+ messages in thread
From: Wido den Hollander @ 2016-08-15 13:23 UTC (permalink / raw)
  To: ceph-devel

Hi,

I'm running into an issue with the combination of CephFS and Samba, and I was wondering if a dev knows what is happening here.

The situation:
- Jewel cluster
- CephFS kernel client version 4.7
- Samba re-export of CephFS
- Mount options: rw,noatime,acl

Copying a 15GB file results in Samba processes hanging in state D (uninterruptible sleep):

root@hlms-zaken-01:~# ps aux|grep smb|grep D
jongh       8887  0.0  0.0 376656 19068 ?        D    14:42   0:00 /usr/sbin/smbd -D
jongh       9740  0.0  0.0 377380 19244 ?        D    14:49   0:00 /usr/sbin/smbd -D
root@hlms-zaken-01:~# cat /proc/8887/stack
[<ffffffff8132d353>] call_rwsem_down_write_failed+0x13/0x20
[<ffffffff8121a145>] vfs_setxattr+0x55/0xb0
[<ffffffff8121a2a5>] setxattr+0x105/0x170
[<ffffffff81203aa1>] filename_lookup+0xf1/0x180
[<ffffffff8120369f>] getname_flags+0x6f/0x1e0
[<ffffffff8121a3bd>] path_setxattr+0xad/0xe0
[<ffffffff8121a4f0>] SyS_setxattr+0x10/0x20
[<ffffffff815e8b76>] entry_SYSCALL_64_fastpath+0x1e/0xa8
[<ffffffffffffffff>] 0xffffffffffffffff
root@hlms-zaken-01:~# cat /proc/9740/stack
[<ffffffff8132d353>] call_rwsem_down_write_failed+0x13/0x20
[<ffffffff8121a145>] vfs_setxattr+0x55/0xb0
[<ffffffff8121a2a5>] setxattr+0x105/0x170
[<ffffffff81203aa1>] filename_lookup+0xf1/0x180
[<ffffffff8120369f>] getname_flags+0x6f/0x1e0
[<ffffffff8121a3bd>] path_setxattr+0xad/0xe0
[<ffffffff8121a4f0>] SyS_setxattr+0x10/0x20
[<ffffffff815e8b76>] entry_SYSCALL_64_fastpath+0x1e/0xa8
[<ffffffffffffffff>] 0xffffffffffffffff
root@hlms-zaken-01:~#
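
The manual check above can be scripted. The following is a minimal sketch, assuming a Linux /proc layout; the function names task_state and blocked_tasks are illustrative and not part of any tool mentioned in this thread. It only looks at thread-group leaders (the numeric entries directly under /proc), and reading /proc/<pid>/stack normally requires root.

```python
import os


def task_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' instead of on whitespace.
    return stat_text.rsplit(")", 1)[1].split()[0]


def blocked_tasks(proc_root="/proc"):
    """Yield (pid, comm, kernel_stack) for every process in state D."""
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # not a PID directory
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                stat_text = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        if task_state(stat_text) != "D":
            continue
        comm = stat_text[stat_text.index("(") + 1:stat_text.rindex(")")]
        try:
            with open(os.path.join(base, "stack")) as f:
                stack = f.read()
        except OSError:
            stack = "(stack not readable)\n"
        yield int(entry), comm, stack
```

Iterating over blocked_tasks() as root and printing each tuple gives roughly the same information as the ps and cat commands shown above.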

Now, when I look in /sys/kernel/debug/ceph/*/osdc and /sys/kernel/debug/ceph/*/mdsc, there are no outstanding requests to the OSDs or the MDS.
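
A minimal sketch of collecting those debugfs files in one pass, assuming debugfs is mounted at /sys/kernel/debug and the caller is root. The helper name pending_lines is hypothetical. The exact layout of osdc and mdsc differs between kernel versions, and some kernels print section headers even when idle, so a non-empty result is not by itself proof of an in-flight request.

```python
import glob
import os


def pending_lines(debug_root="/sys/kernel/debug/ceph"):
    """Map each osdc/mdsc file under debug_root to its non-empty lines."""
    result = {}
    for name in ("osdc", "mdsc"):
        # One subdirectory per mounted CephFS client instance.
        for path in sorted(glob.glob(os.path.join(debug_root, "*", name))):
            try:
                with open(path) as f:
                    result[path] = [ln.rstrip("\n") for ln in f if ln.strip()]
            except OSError as exc:
                result[path] = [f"(unreadable: {exc})"]
    return result
```

An empty list for every file matches the observation above: nothing is waiting on the OSDs or the MDS, which suggests the tasks are blocked on a lock inside the client rather than on the cluster.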

Both of these calls just hang forever and never continue.

Any pointers on where to start looking? I tried the 4.4 kernel before and it gave me the same hang, which is why I upgraded to 4.7 to see if it was fixed there.

The Ceph cluster is currently backfilling 17 PGs, but this also happened when the cluster was HEALTH_OK.

There are no blocked or slow requests in the cluster.

Wido


[parent not found: <CAAPGDwLxomxLJMZNsW7qs9vqjnxH9zbBkUjBok7tsHXA_rt3yA@mail.gmail.com>]

* Re: CephFS and Samba hang on copy of large file
       [not found] ` <CAAPGDwLxomxLJMZNsW7qs9vqjnxH9zbBkUjBok7tsHXA_rt3yA@mail.gmail.com>
@ 2016-08-15 13:35   ` Wido den Hollander
  2016-08-15 13:43     ` Ira Cooper
  0 siblings, 1 reply; 14+ messages in thread
From: Wido den Hollander @ 2016-08-15 13:35 UTC (permalink / raw)
  To: Ira Cooper; +Cc: ceph-devel


> On 15 August 2016 at 15:27, Ira Cooper <ira@wakeful.net> wrote:
> 
> 
> Have you tried the ceph VFS module in Samba yet?
> 

Yes. That works, but the performance is a lot lower, which is why we are testing/using the CephFS kernel client.

I would also like to see the VFS module upstream in Samba; you still need to patch it in manually.

Wido

> -Ira
> 
> On Mon, Aug 15, 2016 at 9:23 AM, Wido den Hollander <wido@42on.com> wrote:
> 
> > Hi,
> >
> > I'm running into a issue in combination of CephFS and Samba and I was
> > wondering if a dev knew what is happening here.
> >
> > The situation:
> > - Jewel cluster
> > - CephFS kernel client version 4.7
> > - Samba re-export of CephFS
> > - Mount options: rw,noatime,acl
> >
> > A copy of a 15GB file results in Samba hanging in status D:
> >
> > root@hlms-zaken-01:~# ps aux|grep smb|grep D
> > jongh       8887  0.0  0.0 376656 19068 ?        D    14:42   0:00
> > /usr/sbin/smbd -D
> > jongh       9740  0.0  0.0 377380 19244 ?        D    14:49   0:00
> > /usr/sbin/smbd -D
> > root@hlms-zaken-01:~# cat /proc/8887/stack
> > [<ffffffff8132d353>] call_rwsem_down_write_failed+0x13/0x20
> > [<ffffffff8121a145>] vfs_setxattr+0x55/0xb0
> > [<ffffffff8121a2a5>] setxattr+0x105/0x170
> > [<ffffffff81203aa1>] filename_lookup+0xf1/0x180
> > [<ffffffff8120369f>] getname_flags+0x6f/0x1e0
> > [<ffffffff8121a3bd>] path_setxattr+0xad/0xe0
> > [<ffffffff8121a4f0>] SyS_setxattr+0x10/0x20
> > [<ffffffff815e8b76>] entry_SYSCALL_64_fastpath+0x1e/0xa8
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > root@hlms-zaken-01:~# cat /proc/9740/stack
> > [<ffffffff8132d353>] call_rwsem_down_write_failed+0x13/0x20
> > [<ffffffff8121a145>] vfs_setxattr+0x55/0xb0
> > [<ffffffff8121a2a5>] setxattr+0x105/0x170
> > [<ffffffff81203aa1>] filename_lookup+0xf1/0x180
> > [<ffffffff8120369f>] getname_flags+0x6f/0x1e0
> > [<ffffffff8121a3bd>] path_setxattr+0xad/0xe0
> > [<ffffffff8121a4f0>] SyS_setxattr+0x10/0x20
> > [<ffffffff815e8b76>] entry_SYSCALL_64_fastpath+0x1e/0xa8
> > [<ffffffffffffffff>] 0xffffffffffffffff
> > root@hlms-zaken-01:~#
> >
> > Now, when I look in /sys/kernel/debug/ceph/*/osdc / mdsc there are no
> > outstanding requests to the OSDs or MDS.
> >
> > Both these calls just hang for ever on these requests and they don't
> > continue.
> >
> > Any pointers where to start looking for this? I tried the 4.4 kernel
> > before, it gave me the same hang. So that's why I upgraded to 4.7 to see if
> > it was fixed there.
> >
> > The Ceph cluster is currently backfilling 17 PGs, but this also happend
> > when HEALTH_OK was around.
> >
> > There are no block or slow requests in the cluster.
> >
> > Wido
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >


* Re: CephFS and Samba hang on copy of large file
  2016-08-15 13:35   ` Wido den Hollander
@ 2016-08-15 13:43     ` Ira Cooper
  2016-08-15 13:58       ` Wido den Hollander
  0 siblings, 1 reply; 14+ messages in thread
From: Ira Cooper @ 2016-08-15 13:43 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel

On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
>>
>>
>> Have you tried the ceph VFS module in Samba yet?
>>
>
> Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
>
> I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.

It is in upstream Samba.

https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757

It's been there a while now.

I'm curious what you are patching in, and what the performance numbers are. :)

Thanks,

-Ira


* Re: CephFS and Samba hang on copy of large file
  2016-08-15 13:43     ` Ira Cooper
@ 2016-08-15 13:58       ` Wido den Hollander
  2016-08-15 14:08         ` Ira Cooper
  2016-08-15 21:36         ` Milosz Tanski
  0 siblings, 2 replies; 14+ messages in thread
From: Wido den Hollander @ 2016-08-15 13:58 UTC (permalink / raw)
  To: Ira Cooper; +Cc: ceph-devel


> On 15 August 2016 at 15:43, Ira Cooper <ira@wakeful.net> wrote:
> 
> 
> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
> >
> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
> >>
> >>
> >> Have you tried the ceph VFS module in Samba yet?
> >>
> >
> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
> >
> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
> 
> It is in upstream Samba.
> 
> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
> 
> It's been there a while now.
> 
> I'm curious what you are patching in, and what the performance numbers are. :)

No, sorry for the confusion. I meant DEB and/or RPM packages. You have to compile it manually, which not all companies like.

With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.

However, I still wonder why Samba went into state D. That should not happen.

Wido

> 
> Thanks,
> 
> -Ira


* Re: CephFS and Samba hang on copy of large file
  2016-08-15 13:58       ` Wido den Hollander
@ 2016-08-15 14:08         ` Ira Cooper
  2016-08-15 21:36         ` Milosz Tanski
  1 sibling, 0 replies; 14+ messages in thread
From: Ira Cooper @ 2016-08-15 14:08 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel

On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
>>
>>
>> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> >
>> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
>> >>
>> >>
>> >> Have you tried the ceph VFS module in Samba yet?
>> >>
>> >
>> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
>> >
>> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
>>
>> It is in upstream Samba.
>>
>> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
>>
>> It's been there a while now.
>>
>> I'm curious what you are patching in, and what the performance numbers are. :)
>
> No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
>
> With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
>
> However, I still wonder why samba went in status D. That should not happen.

I totally agree about the kernel issue; it should work.

I'm really curious why the kernel client blows the VFS module away.
Can you share any info on your workload?

As far as packaging goes, Fedora does build it.

Thanks,

-Ira


* Re: CephFS and Samba hang on copy of large file
  2016-08-15 13:58       ` Wido den Hollander
  2016-08-15 14:08         ` Ira Cooper
@ 2016-08-15 21:36         ` Milosz Tanski
  2016-08-16 10:32           ` Wido den Hollander
  1 sibling, 1 reply; 14+ messages in thread
From: Milosz Tanski @ 2016-08-15 21:36 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: Ira Cooper, ceph-devel

On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
>>
>>
>> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> >
>> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
>> >>
>> >>
>> >> Have you tried the ceph VFS module in Samba yet?
>> >>
>> >
>> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
>> >
>> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
>>
>> It is in upstream Samba.
>>
>> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
>>
>> It's been there a while now.
>>
>> I'm curious what you are patching in, and what the performance numbers are. :)
>
> No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
>
> With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.

I'm willing to bet that at least some of it is page cache & readahead.

>
> However, I still wonder why samba went in status D. That should not happen.
>
> Wido
>



-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@adfin.com


* Re: CephFS and Samba hang on copy of large file
  2016-08-15 21:36         ` Milosz Tanski
@ 2016-08-16 10:32           ` Wido den Hollander
  2016-08-16 10:37             ` Wido den Hollander
  2016-08-16 10:38             ` Ira Cooper
  0 siblings, 2 replies; 14+ messages in thread
From: Wido den Hollander @ 2016-08-16 10:32 UTC (permalink / raw)
  To: Milosz Tanski; +Cc: Ira Cooper, ceph-devel


> On 15 August 2016 at 23:36, Milosz Tanski <milosz@adfin.com> wrote:
> 
> 
> On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
> >
> >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
> >>
> >>
> >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
> >> >
> >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
> >> >>
> >> >>
> >> >> Have you tried the ceph VFS module in Samba yet?
> >> >>
> >> >
> >> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
> >> >
> >> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
> >>
> >> It is in upstream Samba.
> >>
> >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
> >>
> >> It's been there a while now.
> >>
> >> I'm curious what you are patching in, and what the performance numbers are. :)
> >
> > No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
> >
> > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
> 
> I'm willing to bet that at lest some of it is page cache & read ahead.
> 

Probably, indeed. We tested with Jewel and the VFS module and now see much higher throughput.

We had to disable sendfile in Samba though.
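
For reference, a share definition along the following lines uses the Ceph VFS module with sendfile turned off. This is a sketch, not the configuration used on this cluster: the share name, path, and ceph.conf location are assumptions for illustration, and which options are available depends on the Samba version.

```ini
[cephfs]
    # Hypothetical share name and path, for illustration only.
    path = /
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    # vfs_ceph talks to the cluster from userspace, so kernel-level
    # helpers such as sendfile and kernel share modes do not apply.
    kernel share modes = no
    use sendfile = no
```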

Still, it's not good that Samba locked up and stayed in state D. That should not happen.

Wido

> >
> > However, I still wonder why samba went in status D. That should not happen.
> >
> > Wido
> >
> 
> 
> 
> -- 
> Milosz Tanski
> CTO
> 16 East 34th Street, 15th floor
> New York, NY 10016
> 
> p: 646-253-9055
> e: milosz@adfin.com

* Re: CephFS and Samba hang on copy of large file
  2016-08-16 10:32           ` Wido den Hollander
@ 2016-08-16 10:37             ` Wido den Hollander
  2016-08-16 10:38             ` Ira Cooper
  1 sibling, 0 replies; 14+ messages in thread
From: Wido den Hollander @ 2016-08-16 10:37 UTC (permalink / raw)
  To: Milosz Tanski; +Cc: Ira Cooper, ceph-devel


> On 16 August 2016 at 12:32, Wido den Hollander <wido@42on.com> wrote:
> 
> 
> 
> > Op 15 augustus 2016 om 23:36 schreef Milosz Tanski <milosz@adfin.com>:
> > 
> > 
> > On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
> > >
> > >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
> > >>
> > >>
> > >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
> > >> >
> > >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
> > >> >>
> > >> >>
> > >> >> Have you tried the ceph VFS module in Samba yet?
> > >> >>
> > >> >
> > >> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
> > >> >
> > >> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
> > >>
> > >> It is in upstream Samba.
> > >>
> > >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
> > >>
> > >> It's been there a while now.
> > >>
> > >> I'm curious what you are patching in, and what the performance numbers are. :)
> > >
> > > No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
> > >
> > > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
> > 
> > I'm willing to bet that at lest some of it is page cache & read ahead.
> > 
> 
> Probably indeed. Tested with Jewel and VFS and we see a much higher throughput right now.
> 
> We had to disable sendfile in Samba though.
> 
> Still, it's not good that Samba locked up and stayed in status D. That should not happen.
> 

Forgot to add: this is an 864-OSD cluster (36 nodes with 24 OSDs each), all connected via 10GbE.

Write speed through Samba is about 200MB/sec, though with heavy drops at the end of the copy. Probably 150MB/sec on average with a 15GB file.
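
To put these figures in perspective, the following sketch works out the OSD count and how long the 15GB copy takes at the rates quoted in this thread. It assumes decimal units (1 GB = 1000 MB) and a sustained rate; copy_seconds is an illustrative helper name.

```python
def copy_seconds(size_gb, rate_mb_per_s):
    """Seconds needed to move size_gb at a sustained rate (1 GB = 1000 MB)."""
    return size_gb * 1000 / rate_mb_per_s


osds = 36 * 24  # nodes * OSDs per node

for rate in (50, 150, 200):
    print(f"{rate} MB/sec -> {copy_seconds(15, rate):.0f} s for a 15GB file")
```

At the 150MB/sec average the copy takes about 100 seconds, against about 300 seconds at the 50MB/sec seen earlier with the VFS module.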

Wido

> Wido
> 
> > >
> > > However, I still wonder why samba went in status D. That should not happen.
> > >
> > > Wido
> > >
> > 
> > 
> > 
> > -- 
> > Milosz Tanski
> > CTO
> > 16 East 34th Street, 15th floor
> > New York, NY 10016
> > 
> > p: 646-253-9055
> > e: milosz@adfin.com

* Re: CephFS and Samba hang on copy of large file
  2016-08-16 10:32           ` Wido den Hollander
  2016-08-16 10:37             ` Wido den Hollander
@ 2016-08-16 10:38             ` Ira Cooper
  2016-08-16 10:44               ` Wido den Hollander
  1 sibling, 1 reply; 14+ messages in thread
From: Ira Cooper @ 2016-08-16 10:38 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: Milosz Tanski, ceph-devel

On Tue, Aug 16, 2016 at 6:32 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> Op 15 augustus 2016 om 23:36 schreef Milosz Tanski <milosz@adfin.com>:
>>
>>
>> On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
>> >
>> >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
>> >>
>> >>
>> >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> >> >
>> >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
>> >> >>
>> >> >>
>> >> >> Have you tried the ceph VFS module in Samba yet?
>> >> >>
>> >> >
>> >> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
>> >> >
>> >> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
>> >>
>> >> It is in upstream Samba.
>> >>
>> >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
>> >>
>> >> It's been there a while now.
>> >>
>> >> I'm curious what you are patching in, and what the performance numbers are. :)
>> >
>> > No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
>> >
>> > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
>>
>> I'm willing to bet that at lest some of it is page cache & read ahead.
>>
>
> Probably indeed. Tested with Jewel and VFS and we see a much higher throughput right now.
>
> We had to disable sendfile in Samba though.
>
> Still, it's not good that Samba locked up and stayed in status D. That should not happen.

Sendfile doesn't make much sense with a userspace filesystem like vfs_ceph.

You said performance improved; by how much?

Also, are you using copy, robocopy, or Windows Explorer? And which
version of Windows?

Thanks,

-Ira


* Re: CephFS and Samba hang on copy of large file
  2016-08-16 10:38             ` Ira Cooper
@ 2016-08-16 10:44               ` Wido den Hollander
  2016-08-16 10:48                 ` Ira Cooper
  0 siblings, 1 reply; 14+ messages in thread
From: Wido den Hollander @ 2016-08-16 10:44 UTC (permalink / raw)
  To: Ira Cooper; +Cc: ceph-devel, Milosz Tanski


> On 16 August 2016 at 12:38, Ira Cooper <ira@wakeful.net> wrote:
> 
> 
> On Tue, Aug 16, 2016 at 6:32 AM, Wido den Hollander <wido@42on.com> wrote:
> >
> >> Op 15 augustus 2016 om 23:36 schreef Milosz Tanski <milosz@adfin.com>:
> >>
> >>
> >> On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
> >> >
> >> >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
> >> >>
> >> >>
> >> >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
> >> >> >
> >> >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
> >> >> >>
> >> >> >>
> >> >> >> Have you tried the ceph VFS module in Samba yet?
> >> >> >>
> >> >> >
> >> >> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
> >> >> >
> >> >> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
> >> >>
> >> >> It is in upstream Samba.
> >> >>
> >> >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
> >> >>
> >> >> It's been there a while now.
> >> >>
> >> >> I'm curious what you are patching in, and what the performance numbers are. :)
> >> >
> >> > No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
> >> >
> >> > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
> >>
> >> I'm willing to bet that at lest some of it is page cache & read ahead.
> >>
> >
> > Probably indeed. Tested with Jewel and VFS and we see a much higher throughput right now.
> >
> > We had to disable sendfile in Samba though.
> >
> > Still, it's not good that Samba locked up and stayed in status D. That should not happen.
> 
> Sendfile doesn't make much sense with a userspace filesystem like vfs_ceph.
> 

True, but it was still in the config. Had to debug that.

> You said performance improved, how much?
> 

We went from 50MB/sec to 150-200MB/sec write speed.

Writing directly to CephFS (kernel client) runs at 900MB/sec.

> Also are you using copy, robocopy, the windows explorer?  And which
> version of Windows?
> 

A wide range of operating systems. In this case it was an Ubuntu 16.04 desktop, but the clients also include Windows 7, 8, and 10. On Windows, we just use Explorer.

Wido

> Thanks,
> 
> -Ira

* Re: CephFS and Samba hang on copy of large file
  2016-08-16 10:44               ` Wido den Hollander
@ 2016-08-16 10:48                 ` Ira Cooper
  2016-08-16 11:43                   ` Wido den Hollander
  0 siblings, 1 reply; 14+ messages in thread
From: Ira Cooper @ 2016-08-16 10:48 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel, Milosz Tanski

On Tue, Aug 16, 2016 at 6:44 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> Op 16 augustus 2016 om 12:38 schreef Ira Cooper <ira@wakeful.net>:
>>
>>
>> On Tue, Aug 16, 2016 at 6:32 AM, Wido den Hollander <wido@42on.com> wrote:
>> >
>> >> Op 15 augustus 2016 om 23:36 schreef Milosz Tanski <milosz@adfin.com>:
>> >>
>> >>
>> >> On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
>> >> >
>> >> >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
>> >> >>
>> >> >>
>> >> >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> >> >> >
>> >> >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
>> >> >> >>
>> >> >> >>
>> >> >> >> Have you tried the ceph VFS module in Samba yet?
>> >> >> >>
>> >> >> >
>> >> >> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
>> >> >> >
>> >> >> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
>> >> >>
>> >> >> It is in upstream Samba.
>> >> >>
>> >> >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
>> >> >>
>> >> >> It's been there a while now.
>> >> >>
>> >> >> I'm curious what you are patching in, and what the performance numbers are. :)
>> >> >
>> >> > No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
>> >> >
>> >> > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
>> >>
>> >> I'm willing to bet that at lest some of it is page cache & read ahead.
>> >>
>> >
>> > Probably indeed. Tested with Jewel and VFS and we see a much higher throughput right now.
>> >
>> > We had to disable sendfile in Samba though.
>> >
>> > Still, it's not good that Samba locked up and stayed in status D. That should not happen.
>>
>> Sendfile doesn't make much sense with a userspace filesystem like vfs_ceph.
>>
>
> True, but it was still in the config. Had to debug that.
>
>> You said performance improved, how much?
>>
>
> We went from 50MB/sec to 150 ~ 200MB/sec write speed.
>
> Writing directly to CephFS (kernel) goes with 900MB/sec.
>
>> Also are you using copy, robocopy, the windows explorer?  And which
>> version of Windows?
>>
>
> A wide range of Operating Systems. In this case it was a Ubuntu 16.04 desktop, but the clients are also Windows 7, 8 and Windows 10. Just using the Explorer in Windows's case.

Were you talking about read or write performance before?

Also, any chance you could run these tests from a Windows 7+ machine?

I know it is an odd request on a Linux list, but for debugging Samba,
it helps at times.

Thanks,

-Ira / ira@(samba.org|redhat.com|wakeful.net)

Technical Lead / Red Hat Storage - SMB (Samba) Team


* Re: CephFS and Samba hang on copy of large file
  2016-08-16 10:48                 ` Ira Cooper
@ 2016-08-16 11:43                   ` Wido den Hollander
  2016-08-16 11:59                     ` Ira Cooper
  0 siblings, 1 reply; 14+ messages in thread
From: Wido den Hollander @ 2016-08-16 11:43 UTC (permalink / raw)
  To: Ira Cooper; +Cc: ceph-devel, Milosz Tanski


> On 16 August 2016 at 12:48, Ira Cooper <ira@wakeful.net> wrote:
> 
> 
> On Tue, Aug 16, 2016 at 6:44 AM, Wido den Hollander <wido@42on.com> wrote:
> >
> >> Op 16 augustus 2016 om 12:38 schreef Ira Cooper <ira@wakeful.net>:
> >>
> >>
> >> On Tue, Aug 16, 2016 at 6:32 AM, Wido den Hollander <wido@42on.com> wrote:
> >> >
> >> >> Op 15 augustus 2016 om 23:36 schreef Milosz Tanski <milosz@adfin.com>:
> >> >>
> >> >>
> >> >> On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
> >> >> >
> >> >> >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
> >> >> >>
> >> >> >>
> >> >> >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
> >> >> >> >
> >> >> >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> Have you tried the ceph VFS module in Samba yet?
> >> >> >> >>
> >> >> >> >
> >> >> >> > Yes. That works, but the performance is a lot lower. So for that we are testing/using the CephFS kernel client.
> >> >> >> >
> >> >> >> > I would also like to see the VFS module upstream in Samba, you still need to manually patch it in.
> >> >> >>
> >> >> >> It is in upstream Samba.
> >> >> >>
> >> >> >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
> >> >> >>
> >> >> >> It's been there a while now.
> >> >> >>
> >> >> >> I'm curious what you are patching in, and what the performance numbers are. :)
> >> >> >
> >> >> > No, sorry for that confusion. I meant DEB and/or RPM packages. You have to compile manually which not all companies like.
> >> >> >
> >> >> > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
> >> >>
> >> >> I'm willing to bet that at lest some of it is page cache & read ahead.
> >> >>
> >> >
> >> > Probably indeed. Tested with Jewel and VFS and we see a much higher throughput right now.
> >> >
> >> > We had to disable sendfile in Samba though.
> >> >
> >> > Still, it's not good that Samba locked up and stayed in status D. That should not happen.
> >>
> >> Sendfile doesn't make much sense with a userspace filesystem like vfs_ceph.
> >>
> >
> > True, but it was still in the config. Had to debug that.
> >
> >> You said performance improved, how much?
> >>
> >
> > We went from 50MB/sec to 150 ~ 200MB/sec write speed.
> >
> > Writing directly to CephFS (kernel) goes with 900MB/sec.
> >
> >> Also are you using copy, robocopy, the windows explorer?  And which
> >> version of Windows?
> >>
> >
> > A wide range of Operating Systems. In this case it was a Ubuntu 16.04 desktop, but the clients are also Windows 7, 8 and Windows 10. Just using the Explorer in Windows's case.
> 
> Were you talking read or write performance before?
> 

All write performance. We don't care that much about read in this environment.

> Also any chance you could run these tests from a Windows 7+ machine?
> 
> I know it is an odd request on a Linux list, but for debugging Samba,
> it helps at times.
> 

I'm currently not at that customer's site, so I can't run the test. I also don't have remote access.

The hang we saw was from a Windows 7 desktop, by the way, not the Ubuntu one. We just used the Ubuntu desktop to see if it also happened there.

Wido

> Thanks,
> 
> -Ira / ira@(samba.org|redhat.com|wakeful.net)
> 
> Technical Lead / Red Hat Storage - SMB (Samba) Team

* Re: CephFS and Samba hang on copy of large file
  2016-08-16 11:43                   ` Wido den Hollander
@ 2016-08-16 11:59                     ` Ira Cooper
  0 siblings, 0 replies; 14+ messages in thread
From: Ira Cooper @ 2016-08-16 11:59 UTC (permalink / raw)
  To: Wido den Hollander; +Cc: ceph-devel, Milosz Tanski

On Tue, Aug 16, 2016 at 7:43 AM, Wido den Hollander <wido@42on.com> wrote:
>
>> Op 16 augustus 2016 om 12:48 schreef Ira Cooper <ira@wakeful.net>:
>>
>>
>> On Tue, Aug 16, 2016 at 6:44 AM, Wido den Hollander <wido@42on.com> wrote:
>> >
>> >> Op 16 augustus 2016 om 12:38 schreef Ira Cooper <ira@wakeful.net>:
>> >>
>> >>
>> >> On Tue, Aug 16, 2016 at 6:32 AM, Wido den Hollander <wido@42on.com> wrote:
>> >> >
>> >> >> Op 15 augustus 2016 om 23:36 schreef Milosz Tanski <milosz@adfin.com>:
>> >> >>
>> >> >>
>> >> >> On Mon, Aug 15, 2016 at 9:58 AM, Wido den Hollander <wido@42on.com> wrote:
>> >> >> >
>> >> >> >> Op 15 augustus 2016 om 15:43 schreef Ira Cooper <ira@wakeful.net>:
>> >> >> >>
>> >> >> >>
>> >> >> >> On Mon, Aug 15, 2016 at 9:35 AM, Wido den Hollander <wido@42on.com> wrote:
>> >> >> >> >
>> >> >> >> >> Op 15 augustus 2016 om 15:27 schreef Ira Cooper <ira@wakeful.net>:
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> Have you tried the ceph VFS module in Samba yet?
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> > Yes. That works, but the performance is a lot lower, which is why we are testing/using the CephFS kernel client.
>> >> >> >> >
>> >> >> >> > I would also like to see the VFS module upstream in Samba; you still need to manually patch it in.
>> >> >> >>
>> >> >> >> It is in upstream Samba.
>> >> >> >>
>> >> >> >> https://git.samba.org/?p=samba.git;a=blob;f=source3/modules/vfs_ceph.c;h=59e9b9cf9b3e8e5313a20823994fcacf9e4b4168;hb=f1b42ec778e08875e076df7fdf67dd69bf9b2757
>> >> >> >>
>> >> >> >> It's been there a while now.
>> >> >> >>
>> >> >> >> I'm curious what you are patching in, and what the performance numbers are. :)
>> >> >> >
>> >> >> > No, sorry for the confusion. I meant DEB and/or RPM packages. You have to compile it manually, which not all companies like.
>> >> >> >
>> >> >> > With the kernel client we see about 200MB/sec and with VFS about 50MB/sec.
>> >> >>
>> >> >> I'm willing to bet that at least some of it is page cache & read ahead.
>> >> >>
>> >> >
>> >> > Probably indeed. Tested with Jewel and VFS and we see a much higher throughput right now.
>> >> >
>> >> > We had to disable sendfile in Samba though.
>> >> >
>> >> > Still, it's not good that Samba locked up and stayed in status D. That should not happen.
>> >>
>> >> Sendfile doesn't make much sense with a userspace filesystem like vfs_ceph.
>> >>
>> >
>> > True, but it was still in the config. Had to debug that.
>> >
>> >> You said performance improved, how much?
>> >>
>> >
>> > We went from 50MB/sec to 150 ~ 200MB/sec write speed.
>> >
>> > Writing directly to CephFS (kernel) runs at 900MB/sec.
>> >
>> >> Also are you using copy, robocopy, the windows explorer?  And which
>> >> version of Windows?
>> >>
>> >
>> > A wide range of operating systems. In this case it was an Ubuntu 16.04 desktop, but the clients are also Windows 7, 8 and 10. Just using the Explorer in the Windows case.
>>
>> Were you talking read or write performance before?
>>
>
> All write performance. We don't care that much about read in this environment.

Ok then AIO won't have much impact, nor much of anything.  I'm now much more
curious about what is going on :).

>> Also any chance you could run these tests from a Windows 7+ machine?
>>
>> I know it is an odd request on a Linux list, but for debugging Samba,
>> it helps at times.
>>
>
> I'm currently not at that customer, so I'm not able to run the test. I also don't have remote access.
>
> The hang we saw was from a Windows 7 desktop btw, not the Ubuntu one. We just used the Ubuntu desktop to see if it also happened there.

Alas, the use pattern will be different between the two. :(

SMB 2.1 vs. SMB1 with unix extensions is a major delta.

Cheers,

-Ira

* Re: CephFS and Samba hang on copy of large file
  2016-08-15 13:23 CephFS and Samba hang on copy of large file Wido den Hollander
       [not found] ` <CAAPGDwLxomxLJMZNsW7qs9vqjnxH9zbBkUjBok7tsHXA_rt3yA@mail.gmail.com>
@ 2016-08-16 14:05 ` Jeff Layton
  1 sibling, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2016-08-16 14:05 UTC (permalink / raw)
  To: Wido den Hollander, ceph-devel

On Mon, 2016-08-15 at 15:23 +0200, Wido den Hollander wrote:
> Hi,
> 
> I'm running into an issue with the combination of CephFS and Samba and I was
> wondering if a dev knew what is happening here.
> 
> The situation:
> - Jewel cluster
> - CephFS kernel client version 4.7
> - Samba re-export of CephFS
> - Mount options: rw,noatime,acl
> 
> A copy of a 15GB file results in Samba hanging in status D:
> 
> root@hlms-zaken-01:~# ps aux|grep smb|grep D
> jongh       8887  0.0  0.0 376656 19068 ?        D    14:42   0:00 /usr/sbin/smbd -D
> jongh       9740  0.0  0.0 377380 19244 ?        D    14:49   0:00 /usr/sbin/smbd -D
> root@hlms-zaken-01:~# cat /proc/8887/stack
> [<ffffffff8132d353>] call_rwsem_down_write_failed+0x13/0x20
> [<ffffffff8121a145>] vfs_setxattr+0x55/0xb0
> [<ffffffff8121a2a5>] setxattr+0x105/0x170
> [<ffffffff81203aa1>] filename_lookup+0xf1/0x180
> [<ffffffff8120369f>] getname_flags+0x6f/0x1e0
> [<ffffffff8121a3bd>] path_setxattr+0xad/0xe0
> [<ffffffff8121a4f0>] SyS_setxattr+0x10/0x20
> [<ffffffff815e8b76>] entry_SYSCALL_64_fastpath+0x1e/0xa8
> [<ffffffffffffffff>] 0xffffffffffffffff
> root@hlms-zaken-01:~# cat /proc/9740/stack
> [<ffffffff8132d353>] call_rwsem_down_write_failed+0x13/0x20
> [<ffffffff8121a145>] vfs_setxattr+0x55/0xb0
> [<ffffffff8121a2a5>] setxattr+0x105/0x170
> [<ffffffff81203aa1>] filename_lookup+0xf1/0x180
> [<ffffffff8120369f>] getname_flags+0x6f/0x1e0
> [<ffffffff8121a3bd>] path_setxattr+0xad/0xe0
> [<ffffffff8121a4f0>] SyS_setxattr+0x10/0x20
> [<ffffffff815e8b76>] entry_SYSCALL_64_fastpath+0x1e/0xa8
> [<ffffffffffffffff>] 0xffffffffffffffff
> root@hlms-zaken-01:~#
> 
> Now, when I look in /sys/kernel/debug/ceph/*/osdc / mdsc there are no
> outstanding requests to the OSDs or MDS.
> 
> Both of these calls just hang forever on these requests; they never
> continue.
> 
> Any pointers where to start looking for this? I tried the 4.4 kernel
> before, it gave me the same hang. So that's why I upgraded to 4.7 to
> see if it was fixed there.
> 
> The Ceph cluster is currently backfilling 17 PGs, but this also
> happened when the cluster was in HEALTH_OK.
> 
> There are no blocked or slow requests in the cluster.
> 
FWIW, looks like those tasks are stuck trying to acquire the inode
rwsem in order to do a setxattr. Most likely, something grabbed that
lock and didn't release it for some reason.

In order to track that down, you'd need to take a look at the state of all of the tasks on the machine and try to track down the one that's holding this lock (or "these locks" -- these could be different inodes after all).
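
As a first pass at surveying the tasks, their states and kernel stacks can be collected from /proc. What follows is only a minimal Python sketch and not something taken from this thread: the helper names parse_stat, read_text and task_stacks are made up for illustration, and it assumes it runs as root on a kernel that exposes the per-task "stack" file. It walks /proc/<pid>/task/ so that individual threads are covered too. The task holding the lock need not be in D state itself, so by default every task is reported.

```python
import os


def parse_stat(stat_text):
    """Split the contents of a /proc/.../stat file into (comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    start = stat_text.index("(")
    end = stat_text.rindex(")")
    return stat_text[start + 1:end], stat_text[end + 1:].split()[0]


def read_text(path):
    """Return the full contents of a small text file."""
    with open(path) as f:
        return f.read()


def task_stacks(states=None, proc="/proc"):
    """Yield (tid, comm, state, stack) for each thread on the machine.

    states: optional set of one-letter task states to keep, e.g. {"D"}.
    """
    try:
        pids = [p for p in os.listdir(proc) if p.isdigit()]
    except OSError:
        return  # no procfs mounted at this path
    for pid in pids:
        task_dir = os.path.join(proc, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            base = os.path.join(task_dir, tid)
            try:
                comm, state = parse_stat(read_text(os.path.join(base, "stat")))
                if states is not None and state not in states:
                    continue
                stack = read_text(os.path.join(base, "stack"))
            except OSError:
                continue  # thread exited, or stack unreadable (not root)
            yield int(tid), comm, state, stack


if __name__ == "__main__":
    for tid, comm, state, stack in task_stacks():
        print(f"== {tid} ({comm}) state {state} ==")
        print(stack)
```

Calling task_stacks({"D"}) narrows the output to tasks in uninterruptible sleep, which should include the smbd processes blocked in setxattr. The task to look for is any other one whose stack shows it somewhere that it could still be holding the inode lock.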

It may be simplest to force a vmcore and then poke around in there with the crash kernel debugger. Identify the inode that has the blocked rwsem, verify that it's still legit (not freed or anything) and then see if you can figure out which task might have failed to release the lock and why.
-- 
Jeff Layton <jlayton@redhat.com>
