* Severe performance issue (Ubuntu 10.04 mounting an XP share)
@ 2010-07-02 15:39 James Green
       [not found] ` <AANLkTilXnPMMoY-54RbO4CUeqP9b-zaM7NZQZQpHH5Gk-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: James Green @ 2010-07-02 15:39 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Hi,

I'm currently seeing a serious performance problem with a cifs-mounted
WinXP share, and I've not found any reference to it yet.

Basically my hardware is a desktop with Windows XP. VMWare server is
installed and I have Ubuntu 10.04 as an image. The ubuntu VM
cifs-mounts this XP share that's called 'ControlPanel'.

This had been running just fine until yesterday morning, when I was
prompted to install an updated kernel and reboot.

Since then, I can mount the share, I can look in the files and do
everything normally, except run phpunit which now runs dog-slow.

Previously, I would expect 15-30s for total phpunit execution. Now,
it's between 2m30s and 5m30s. I traced through the code and found that
the time is being spent loading the files.

syslog shows the following roughly every fifteen seconds:

Jul  2 16:09:13 ubuntu kernel: [ 1033.820126] CIFS VFS: No response for cmd 50 mid 6469
Jul  2 16:09:27 ubuntu kernel: [ 1048.779718] CIFS VFS: No response for cmd 50 mid 6593
Jul  2 16:09:42 ubuntu kernel: [ 1063.816102] CIFS VFS: No response for cmd 50 mid 6722
Jul  2 16:09:57 ubuntu kernel: [ 1078.787182] CIFS VFS: No response for cmd 50 mid 6855
Jul  2 16:10:12 ubuntu kernel: [ 1093.820243] CIFS VFS: No response for cmd 50 mid 6991
Jul  2 16:10:27 ubuntu kernel: [ 1108.793428] CIFS VFS: No response for cmd 50 mid 7120
Jul  2 16:10:43 ubuntu kernel: [ 1123.828834] CIFS VFS: No response for cmd 50 mid 7260
Jul  2 16:10:57 ubuntu kernel: [ 1138.798723] CIFS VFS: No response for cmd 50 mid 7395
Jul  2 16:11:13 ubuntu kernel: [ 1153.836951] CIFS VFS: No response for cmd 50 mid 7544
Jul  2 16:11:27 ubuntu kernel: [ 1168.803846] CIFS VFS: No response for cmd 50 mid 7677
Jul  2 16:11:43 ubuntu kernel: [ 1183.844863] CIFS VFS: No response for cmd 50 mid 7829
Jul  2 16:11:57 ubuntu kernel: [ 1198.808690] CIFS VFS: No response for cmd 50 mid 7955
Jul  2 16:12:13 ubuntu kernel: [ 1213.840801] CIFS VFS: No response for cmd 50 mid 8073
Jul  2 16:12:27 ubuntu kernel: [ 1228.813492] CIFS VFS: No response for cmd 50 mid 8219
Jul  2 16:12:43 ubuntu kernel: [ 1243.844349] CIFS VFS: No response for cmd 50 mid 8338
Jul  2 16:12:57 ubuntu kernel: [ 1258.817835] CIFS VFS: No response for cmd 50 mid 8425
Jul  2 16:13:13 ubuntu kernel: [ 1273.852591] CIFS VFS: No response for cmd 50 mid 8557
Jul  2 16:13:27 ubuntu kernel: [ 1288.822119] CIFS VFS: No response for cmd 50 mid 8687
Jul  2 16:14:26 ubuntu kernel: [ 1347.512906] CIFS VFS: No response for cmd 50 mid 10970
Jul  2 16:14:43 ubuntu kernel: [ 1363.848978] CIFS VFS: No response for cmd 50 mid 11040
Jul  2 16:14:58 ubuntu kernel: [ 1378.833956] CIFS VFS: No response for cmd 50 mid 11112

You get the idea...
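
For anyone wanting to check the cadence of these messages rather than
eyeball it, a minimal Python sketch along the following lines may help.
It is only an illustration: the function names and the SAMPLE text are
assumptions made for this example, and the regular expression is written
against the message format shown above, not taken from any cifs tooling.
Whitespace is collapsed first so that entries wrapped by a mail client
still match.

```python
import re

# Matches one "No response" entry once whitespace has been normalised.
PATTERN = re.compile(
    r"\[ ?(\d+\.\d+)\] CIFS VFS: No response for cmd (\d+) mid (\d+)"
)


def parse_timeouts(text):
    """Return a list of (uptime_seconds, cmd, mid) tuples found in text."""
    # Collapse every run of whitespace (including mail line wraps) to one space.
    flat = " ".join(text.split())
    return [
        (float(m.group(1)), int(m.group(2)), int(m.group(3)))
        for m in PATTERN.finditer(flat)
    ]


def gaps(events):
    """Seconds between consecutive timeouts, rounded to 0.1 s."""
    return [round(b[0] - a[0], 1) for a, b in zip(events, events[1:])]


# Hypothetical input: the first three entries from the log above.
SAMPLE = """
Jul  2 16:09:13 ubuntu kernel: [ 1033.820126] CIFS VFS: No response
for cmd 50 mid 6469
Jul  2 16:09:27 ubuntu kernel: [ 1048.779718] CIFS VFS: No response
for cmd 50 mid 6593
Jul  2 16:09:42 ubuntu kernel: [ 1063.816102] CIFS VFS: No response
for cmd 50 mid 6722
"""

events = parse_timeouts(SAMPLE)
print(gaps(events))
```

The mid values in the same tuples can be subtracted in the same way. If
the mid increases once per request (an assumption here), the difference
gives a rough count of requests sent between timeouts.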

I downgraded to the previously installed kernel without any
improvement. I upgraded to
http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/linux-image-2.6.35-999-generic_2.6.35-999.201007021009_i386.deb
again without any improvement.

Using wireshark I noticed lots of QUERY_PATH_INFO requests but I have
no idea what is normal.

When I connect a remote Ubuntu 9.10 machine to my share, I again see the
QUERY_PATH_INFO requests, but performance is as expected: much better.

I've raised a ticket with Ubuntu which I've been making notes in:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/600565

Can anyone help or give me further instructions for debugging this?
I'm working at a snail's pace otherwise :(

FYI my smbfs version is 2:3.4.7~dfsg-1ubuntu3.

Thanks,

James


* Re: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found] ` <AANLkTilXnPMMoY-54RbO4CUeqP9b-zaM7NZQZQpHH5Gk-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-07-02 18:16   ` Jeff Layton
       [not found]     ` <AANLkTim4xY4n2y-jE0FzCAKc6NWK6DZcHx15d2owR-wV@mail.gmail.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2010-07-02 18:16 UTC (permalink / raw)
  To: James Green; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

On Fri, 2 Jul 2010 16:39:06 +0100
James Green <james.mk.green-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> Jul  2 16:14:58 ubuntu kernel: [ 1378.833956] CIFS VFS: No response
> for cmd 50 mid 11112

The client is sending a SMB_COM_TRANSACTION2 and isn't getting a
response. You might want to sniff traffic and identify calls that
aren't getting responses. If there is some commonality between them
then that may point to where the problem is.
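
As a rough illustration of that approach: the "cmd 50" in the log is
decimal, i.e. 0x32, which is SMB_COM_TRANSACTION2 in SMB1. The Python
sketch below names a few SMB1 command codes and pairs requests with
responses by mid to list the calls that never got an answer. The packet
tuple format (is_response, mid, cmd) is a hypothetical one chosen for
this example; real capture data would have to be exported into it first,
and mids can be reused after a reconnect, so treat the result as a hint
only.

```python
# A few SMB1 command codes (not an exhaustive list).
SMB_COMMANDS = {
    0x25: "SMB_COM_TRANSACTION",
    0x32: "SMB_COM_TRANSACTION2",
    0x72: "SMB_COM_NEGOTIATE",
    0x73: "SMB_COM_SESSION_SETUP_ANDX",
    0x75: "SMB_COM_TREE_CONNECT_ANDX",
}

# A few TRANSACTION2 subcommands, including the two seen in this thread.
TRANS2_SUBCOMMANDS = {
    0x0001: "TRANS2_FIND_FIRST2",
    0x0005: "TRANS2_QUERY_PATH_INFORMATION",
}


def command_name(cmd):
    """Translate a numeric SMB1 command code into its symbolic name."""
    return SMB_COMMANDS.get(cmd, f"unknown (0x{cmd:02X})")


def unanswered(packets):
    """Return {mid: cmd} for requests with no later response.

    packets is an iterable of (is_response, mid, cmd) tuples in capture
    order; this layout is an assumption made for the example.
    """
    pending = {}
    for is_response, mid, cmd in packets:
        if is_response:
            pending.pop(mid, None)  # the request with this mid was answered
        else:
            pending[mid] = cmd      # remember the outstanding request
    return pending


# Hypothetical capture: mid 1 is answered, mid 2 is not.
capture = [(False, 1, 0x32), (True, 1, 0x32), (False, 2, 0x32)]
for mid, cmd in unanswered(capture).items():
    print(mid, command_name(cmd))
```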

-- 
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>


* Fwd: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found]       ` <AANLkTim4xY4n2y-jE0FzCAKc6NWK6DZcHx15d2owR-wV-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-07-05  9:14         ` James Green
       [not found]           ` <AANLkTimdE4q2q3w0DccJUv_steLH-0kIhgK7_jv6-IMi-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: James Green @ 2010-07-05  9:14 UTC (permalink / raw)
  To: linux-cifs-u79uwXL29TY76Z2rM5mHXA

Sorry I meant this to go to the list rather than solely to Jeff personally.

On 2 July 2010 19:16, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, 2 Jul 2010 16:39:06 +0100
> James Green <james.mk.green-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> Jul  2 16:14:58 ubuntu kernel: [ 1378.833956] CIFS VFS: No response
>> for cmd 50 mid 11112
>
> The client is sending a SMB_COM_TRANSACTION2 and isn't getting a
> response. You might want to sniff traffic and identify calls that
> aren't getting responses. If there is some commonality between them
> then that may point to where the problem is.

Jeff,

I fear I'm looking for a needle in a haystack, and I've no idea what
the needle looks like.

I ran Wireshark on my XP host and repeatedly ran the phpunit tests observing:

1. There really is an awful lot of FIND_FIRST2, Pattern: <blah>,
followed immediately by FIND_FIRST2, Files: <blank>. I assume this to
be normal.

2. Occasionally, and on a random Pattern argument, a FIND_FIRST2 or
QUERY_PATH_INFO is followed immediately not by an SMB-level response
but by a TCP "microsoft-ds" ACK. The Len is 0 and all other fields
have values. At this point there may be several seconds of delay
before SMB re-establishes with a Session Setup AndX Request, User:
\jgreen, then a Tree Connect AndX Request for the mount point being
used.

It feels as though the XP host itself is simply not returning a
response some of the time. Other than changing my password and
rebooting the whole lot I can't think of anything I might have done.

If you have any further tests you'd like me to perform I'll try to
oblige as this is killing my workflow.

Thanks,

James


* Re: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found]           ` <AANLkTimdE4q2q3w0DccJUv_steLH-0kIhgK7_jv6-IMi-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-07-05 13:04             ` Jeff Layton
       [not found]               ` <20100705090425.05c3f568-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2010-07-05 13:04 UTC (permalink / raw)
  To: James Green; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Shirish Pargaonkar

On Mon, 5 Jul 2010 10:14:29 +0100
James Green <james.mk.green-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> Sorry I meant this to go to the list rather than solely to Jeff personally.
> 
> On 2 July 2010 19:16, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On Fri, 2 Jul 2010 16:39:06 +0100
> > James Green <james.mk.green-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >> Jul  2 16:14:58 ubuntu kernel: [ 1378.833956] CIFS VFS: No response
> >> for cmd 50 mid 11112
> >
> > The client is sending a SMB_COM_TRANSACTION2 and isn't getting a
> > response. You might want to sniff traffic and identify calls that
> > aren't getting responses. If there is some commonality between them
> > then that may point to where the problem is.
> 
> Jeff,
> 
> I fear I'm looking for a needle in a haystack, and I've no idea what
> the needle looks like.
> 
> I ran Wireshark on my XP host and repeatedly ran the phpunit tests observing:
> 
> 1. There really is an awful lot of FIND_FIRST2, Pattern: <blah>,
> followed immediately by FIND_FIRST2, Files: <blank>. I assume this to
> be normal.
> 
> 2. Occasionally, and on a random Pattern argument, a FIND_FIRST2 or
> QUERY_PATH_INFO is followed immediately not by an SMB level response
> but by a TCP "microsoft-ds" ACK. The Len is 0 and all other arguments
> have values.

Can't be certain without seeing the capture, but it sounds like a
"normal" TCP ACK. That means that the server has acknowledged that it
received the packet.

> At this point there may be several seconds of delay
> before SMB re-establishes with a Session Setup AndX Request, User:
> \jgreen then a Tree Connect AndX Request for the mount point being
> used.
> 
> It feels as though the XP host itself is simply not returning a
> response some of the time. Other than changing my password and
> rebooting the whole lot I can't think of anything I might have done.
> 
> If you have any further tests you'd like me to perform I'll try to
> oblige as this is killing my workflow.
> 

Yep, sounds like the server just isn't responding here. When that
happens, cifs currently forces a reconnect after the call times out.
Ugly and I believe unnecessary...when the server ACKs the request we
shouldn't force a reconnect just because the server is taking a while
to respond. Right now though we don't change the mid state to show that
the call was successfully sent, so that probably ought to be fixed so
we can deal with that differently.

It's possible that by making the client wait indefinitely, the server
will eventually respond. It's also possible however that the server is
just dropping these calls on the floor for some reason and will never
respond. Hard to say which it is.

Shirish was working recently on fixing the "hard" mount option for
cifs. It might be worth trying out his patches.

Shirish, any thoughts?
-- 
Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>


* Re: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found]               ` <20100705090425.05c3f568-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
@ 2010-07-05 13:21                 ` James Green
       [not found]                   ` <AANLkTimguAhx_fL-N_wsckwj4EbiIMqAi93PofSohMSo-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: James Green @ 2010-07-05 13:21 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Shirish Pargaonkar

On 5 July 2010 14:04, Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:
> Yep, sounds like the server just isn't responding here. When that
> happens, cifs currently forces a reconnect after the call times out.
> Ugly and I believe unnecessary...when the server ACKs the request we
> shouldn't force a reconnect just because the server is taking a while
> to respond. Right now though we don't change the mid state to show that
> the call was successfully sent, so that probably ought to be fixed so
> we can deal with that differently.
>
> It's possible that by making the client wait indefinitely, the server
> will eventually respond. It's also possible however that the server is
> just dropping these calls on the floor for some reason and will never
> respond. Hard to say which it is.
>
> Shirish was working recently on fixing the "hard" mount option for
> cifs. It might be worth trying out his patches.

This doesn't quite feel right. I mean, why one day should an XP host
just stop responding at random to requests? I know it's Microsoft but
even so...

I took a look at the Event Viewer and under Security there are
0xC0000064 errors as smbfs makes its accesses. Perhaps this is a
clue?

Applying a patch might not be particularly easy as this box is stock
Ubuntu (excepting the upstream kernel) and I don't particularly want
to start compiling bits just to make phpunit run as it was a few days
ago.

James


* Re: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found]                   ` <AANLkTimguAhx_fL-N_wsckwj4EbiIMqAi93PofSohMSo-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-07-05 13:32                     ` Jeff Layton
       [not found]                       ` <20100705093204.5064b612-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2010-07-05 13:32 UTC (permalink / raw)
  To: James Green; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA, Shirish Pargaonkar

On Mon, 5 Jul 2010 14:21:08 +0100
James Green <james.mk.green-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On 5 July 2010 14:04, Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:
> > Yep, sounds like the server just isn't responding here. When that
> > happens, cifs currently forces a reconnect after the call times out.
> > Ugly and I believe unnecessary...when the server ACKs the request we
> > shouldn't force a reconnect just because the server is taking a while
> > to respond. Right now though we don't change the mid state to show that
> > the call was successfully sent, so that probably ought to be fixed so
> > we can deal with that differently.
> >
> > It's possible that by making the client wait indefinitely, the server
> > will eventually respond. It's also possible however that the server is
> > just dropping these calls on the floor for some reason and will never
> > respond. Hard to say which it is.
> >
> > Shirish was working recently on fixing the "hard" mount option for
> > cifs. It might be worth trying out his patches.
> 
> This doesn't quite feel right. I mean, why one day should an XP host
> just stop responding at random to requests? I know it's Microsoft but
> even so...
> 
> I took a look at the Event Viewer and under Security there are
> 0xC0000064 errors as smbfs makes its accesses. Perhaps this is a
> clue?
> 
> Applying a patch might not be particularly easy as this box is stock
> Ubuntu (excepting the upstream kernel) and I don't particularly want
> to start compiling bits just to make phpunit run as it was a few days
> ago.
> 

You may be right, but unfortunately I can't do much beyond tell you
what I think is happening now. It's certainly possible there was a
change in the cifs code that is tickling a server-side bug.

I think 0xC0000064 == NO_SUCH_USER, so there may be some sort of
problem with authentication.
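
For reference, an NTSTATUS value carries its severity in the top two
bits, and 0xC0000064 is STATUS_NO_SUCH_USER. The Python sketch below
decodes a handful of logon-related codes; the table is deliberately
small, and the describe() helper is a name made up for this example, not
part of any Windows or cifs API.

```python
# A small, non-exhaustive table of NTSTATUS codes related to logon failures.
NTSTATUS_NAMES = {
    0x00000000: "STATUS_SUCCESS",
    0xC0000022: "STATUS_ACCESS_DENIED",
    0xC0000064: "STATUS_NO_SUCH_USER",
    0xC000006A: "STATUS_WRONG_PASSWORD",
    0xC000006D: "STATUS_LOGON_FAILURE",
}

# Bits 31-30 of an NTSTATUS value encode the severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def describe(code):
    """Return a short human-readable description of an NTSTATUS code."""
    severity = SEVERITY[(code >> 30) & 0x3]
    name = NTSTATUS_NAMES.get(code, "unknown")
    return f"0x{code:08X} {name} ({severity})"


print(describe(0xC0000064))
```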

-- 
Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>


* Re: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found]                       ` <20100705093204.5064b612-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
@ 2010-07-06 11:24                         ` James Green
       [not found]                           ` <AANLkTimKQAM1xbM-Ggkfs7M8atcPkDbBL5cVazf3R0dR-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: James Green @ 2010-07-06 11:24 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA

On 5 July 2010 14:32, Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:
> You may be right, but unfortunately I can't do much beyond tell you
> what I think is happening now. It's certainly possible there was a
> change in the cifs code that is tickling a server-side bug.
>
> I think 0xC0000064 == NO_SUCH_USER, so there may be some sort of
> problem with authentication.

I've rebooted using 2.6.28 and phpunit is far, far faster (although
not as quick as it could be). Instead of 4m30s or more, it just ran
in 48s.

I suspect a regression somewhere between .28 and .32 that causes a
massive performance penalty. It would be interesting to see whether
this is limited to mounting XP shares; I might have a go later.

I have absolutely no idea why XP is reporting these 0xC0000064 errors
- the user clearly does exist as I'm him!

James


* Re: Severe performance issue (Ubuntu 10.04 mounting an XP share)
       [not found]                           ` <AANLkTimKQAM1xbM-Ggkfs7M8atcPkDbBL5cVazf3R0dR-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-07-06 13:11                             ` Suresh Jayaraman
  0 siblings, 0 replies; 8+ messages in thread
From: Suresh Jayaraman @ 2010-07-06 13:11 UTC (permalink / raw)
  To: James Green; +Cc: Jeff Layton, linux-cifs-u79uwXL29TY76Z2rM5mHXA

On 07/06/2010 04:54 PM, James Green wrote:
> On 5 July 2010 14:32, Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org> wrote:
>> You may be right, but unfortunately I can't do much beyond tell you
>> what I think is happening now. It's certainly possible there was a
>> change in the cifs code that is tickling a server-side bug.
>>
>> I think 0xC0000064 == NO_SUCH_USER, so there may be some sort of
>> problem with authentication.
> 
> I've rebooted using 2.6.28 and phpunit is far, far faster (although
> not as quick as it could be). Instead of 4m30s or more, it just ran
> in 48s.
> 
> I suspect a regression somewhere between .28 and .32 that causes a
> massive performance penalty. It would be interesting to see whether
> this is limited to mounting XP shares; I might have a go later.
> 

I'm not sure whether this should be seen as a performance issue; it
sounds more like a bug that results in the server dropping packets.

Could you please compare the network traces between the suspected
versions and observe the difference?
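
One possible way to make that comparison less manual is to count how
often each SMB call appears in each trace and look at the differences.
The Python sketch below assumes the call names have already been
extracted from the two captures into plain lists; that extraction step,
the compare_traces() name and the sample data are all assumptions made
for illustration.

```python
from collections import Counter


def compare_traces(old, new):
    """Return {call_name: new_count - old_count} for calls whose count changed."""
    old_counts, new_counts = Counter(old), Counter(new)
    names = sorted(set(old_counts) | set(new_counts))
    return {
        name: new_counts[name] - old_counts[name]
        for name in names
        if new_counts[name] != old_counts[name]
    }


# Hypothetical call lists from a trace under each kernel version.
trace_old_kernel = ["FIND_FIRST2", "QUERY_PATH_INFO"]
trace_new_kernel = [
    "FIND_FIRST2",
    "QUERY_PATH_INFO",
    "QUERY_PATH_INFO",
    "SESSION_SETUP_ANDX",
]

print(compare_traces(trace_old_kernel, trace_new_kernel))
```

A positive number means the call is more frequent in the second trace.
Extra session setups there would be consistent with the forced
reconnects discussed earlier in the thread.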


Thanks,

--
Suresh Jayaraman


