public inbox for linux-kernel@vger.kernel.org
* make -j4 gets stuck w/ ccache over NFS
@ 2004-12-07  2:24 Mark M. Hoffman
  2004-12-07  6:22 ` Martin Pool
  2004-12-08  7:25 ` tridge
  0 siblings, 2 replies; 6+ messages in thread
From: Mark M. Hoffman @ 2004-12-07  2:24 UTC (permalink / raw)
  To: LKML

Hello:

I'm using ccache version 2.4 [1].  I just changed ~/.ccache to a symbolic
link to a directory which is NFS mounted [2].  The kernel source itself is
on a local FS.  With the ccache suitably primed, when I do a kernel compile
using 'make -j4' it seems to get stuck for seconds at a time.  When it gets
unstuck, it blows through a handful of files and then gets stuck again.

When it is stuck, both NFS client and server are almost totally idle.  The
network itself has almost no other traffic.  It doesn't seem to matter if I
mount NFS w/ udp or tcp (v3 in both cases).

If I move ~/.ccache to a local FS, it never gets stuck.  If I just use 'make'
or even 'make -j2', (I'm pretty sure but not 100%) it never gets stuck.

The NFS client is FC3 with kernel.org 2.6.10-rc3.  I also tried it with -rc1,
2.6.9, and 2.6.8.1 - same problem.

Weirdest part: when I boot the distro stock kernel (2.6.9-1.681_FC3smp) the
problem goes away.  Configs for all of the above are here [3].

Please CC me with replies - thanks!

[1] http://ccache.samba.org/

[2] The NFS server uses kernel 2.4.28-rc2.

[3] http://members.dca.net/mhoffman/lkml-20041213/

-- 
Mark M. Hoffman
mhoffman@lightlink.com


* Re: make -j4 gets stuck w/ ccache over NFS
  2004-12-07  2:24 make -j4 gets stuck w/ ccache over NFS Mark M. Hoffman
@ 2004-12-07  6:22 ` Martin Pool
  2004-12-08  7:25 ` tridge
  1 sibling, 0 replies; 6+ messages in thread
From: Martin Pool @ 2004-12-07  6:22 UTC (permalink / raw)
  To: linux-kernel

On Mon, 06 Dec 2004 21:24:29 -0500, Mark M. Hoffman wrote:

> Hello:
> 
> I'm using ccache version 2.4 [1].  I just changed ~/.ccache to a symbolic
> link to a directory which is NFS mounted [2].  The kernel source itself is
> on a local FS.  With the ccache suitably primed, when I do a kernel
> compile using 'make -j4' it seems to get stuck for seconds at a time. 
> When it gets unstuck, it blows through a handful of files and then gets
> stuck again.
> 
> When it is stuck, both NFS client and server are almost totally idle.  The
> network itself has almost no other traffic.  It doesn't seem to matter if
> I mount NFS w/ udp or tcp (v3 in both cases).
> 
> If I move ~/.ccache to a local FS, it never gets stuck.  If I just use
> 'make' or even 'make -j2', (I'm pretty sure but not 100%) it never gets
> stuck.

Perhaps ccache is getting jammed/deadlocked trying to take a lock on NFS. 
Maybe you should try getting an ethereal dump of the network traffic.

-- 
Martin



* Re: make -j4 gets stuck w/ ccache over NFS
  2004-12-07  2:24 make -j4 gets stuck w/ ccache over NFS Mark M. Hoffman
  2004-12-07  6:22 ` Martin Pool
@ 2004-12-08  7:25 ` tridge
  2005-03-10  5:47   ` make -j4 gets stuck w/ ccache over NFS - solved! Mark M. Hoffman
  1 sibling, 1 reply; 6+ messages in thread
From: tridge @ 2004-12-08  7:25 UTC (permalink / raw)
  To: Mark M. Hoffman; +Cc: LKML

Mark,

 > I'm using ccache version 2.4 [1].  I just changed ~/.ccache to a symbolic
 > link to a directory which is NFS mounted [2].  The kernel source itself is
 > on a local FS.  With the ccache suitably primed, when I do a kernel compile
 > using 'make -j4' it seems to get stuck for seconds at a time.  When it gets
 > unstuck, it blows through a handful of files and then gets stuck again.

I'd suggest you first narrow down the problem to either being a
locking problem or a file IO problem. To do that, change lock_fd() in
util.c in ccache to just "return 0;". That will mean the ccache stats
file could become corrupted, but if it runs fast then you know that it
is a locking problem. I have noticed severe speed problems with NFS
locking on Linux previously, which is why I mention this as a
possibility.

Note that removing this locking will not cause ccache to produce
incorrect object files, it will just mean the stats printed with
"ccache -s" may be inaccurate.
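
A complementary check that leaves ccache untouched is to time the lock
round trips directly. The following is only a minimal sketch, not code
from ccache: time_lock_cycles() and set_lock() are hypothetical helper
names, and the sketch assumes the lock of interest is a blocking
fcntl() write lock on one byte of a file in the cache directory.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Apply one fcntl() lock operation (F_WRLCK or F_UNLCK) to the first
 * byte of fd, blocking until it succeeds and retrying on EINTR. */
static int set_lock(int fd, short type)
{
	struct flock fl = {0};
	int ret;

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 1;

	do {
		ret = fcntl(fd, F_SETLKW, &fl);
	} while (ret == -1 && errno == EINTR);
	return ret;
}

/* Hypothetical probe: lock and unlock `path` `cycles` times and store
 * the elapsed wall-clock time in *seconds.
 * Returns 0 on success, -1 if the file cannot be opened or a lock
 * operation fails (*seconds is left untouched if open() fails). */
int time_lock_cycles(const char *path, int cycles, double *seconds)
{
	struct timespec start, end;
	int fd, i, ret = 0;

	assert(path != NULL && seconds != NULL && cycles >= 0);

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd == -1)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < cycles; i++) {
		if (set_lock(fd, F_WRLCK) == -1 ||
		    set_lock(fd, F_UNLCK) == -1) {
			ret = -1;
			break;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	close(fd);

	*seconds = (double)(end.tv_sec - start.tv_sec) +
		   (double)(end.tv_nsec - start.tv_nsec) / 1e9;
	return ret;
}
```

Calling time_lock_cycles() once on a file under the NFS-mounted cache
directory and once on a file on a local FS, with the same cycle count,
should give a rough idea of whether lock round trips alone could
account for the stalls. A single process will not reproduce the
contention seen with 'make -j4', so several copies may need to run in
parallel against the same file.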

Cheers, Tridge

PS: I also wonder why you're not just using distcc. It's usually a lot
more appropriate in a distributed environment.


* Re: make -j4 gets stuck w/ ccache over NFS - solved!
  2004-12-08  7:25 ` tridge
@ 2005-03-10  5:47   ` Mark M. Hoffman
  2005-03-10 16:24     ` J. Bruce Fields
  2005-03-10 17:11     ` Greg KH
  0 siblings, 2 replies; 6+ messages in thread
From: Mark M. Hoffman @ 2005-03-10  5:47 UTC (permalink / raw)
  To: tridge, Greg KH; +Cc: LKML, J. Bruce Fields, Neil Brown, Andrew Morton

Hi Tridge, Greg, et al.:

I wrote, some months ago:
>  > I'm using ccache version 2.4 [1].  I just changed ~/.ccache to a symbolic
>  > link to a directory which is NFS mounted [2].  The kernel source itself is
>  > on a local FS.  With the ccache suitably primed, when I do a kernel compile
>  > using 'make -j4' it seems to get stuck for seconds at a time.  When it gets
>  > unstuck, it blows through a handful of files and then gets stuck again.

* tridge@samba.org <tridge@samba.org> [2004-12-08 18:25:59 +1100]:
> I'd suggest you first narrow down the problem to either being a
> locking problem or a file IO problem. To do that, change lock_fd() in
> util.c in ccache to just "return 0;". That will mean the ccache stats
> file could become corrupted, but if it runs fast then you know that it
> is a locking problem. I have noticed severe speed problems with NFS
> locking on Linux previously, which is why I mention this as a
> possibility.
> 
> Note that removing this locking will not cause ccache to produce
> incorrect object files, it will just mean the stats printed with
> "ccache -s" may be inaccurate.

Thanks for the suggestions.  It wasn't very important to me so I didn't
make time to follow up on it.  I was just playing w/ ccache at the time.

Finally I noticed this patch from -mm1... and it solves the problem.

nfsd--lockd-dont-try-to-match-callback-requests-against-export-table.patch

How I tested: I applied the first 12 patches in 2.6.11-mm1; the above
mentioned was last - couldn't reproduce the bug.  When I unapplied just
that one, I saw it again.

original bug report:
http://marc.theaimsgroup.com/?l=linux-kernel&m=110238645132535&w=3

Greg: have you considered this one for 2.6.11.x?

Thanks,

-- 
Mark M. Hoffman
mhoffman@lightlink.com


* Re: make -j4 gets stuck w/ ccache over NFS - solved!
  2005-03-10  5:47   ` make -j4 gets stuck w/ ccache over NFS - solved! Mark M. Hoffman
@ 2005-03-10 16:24     ` J. Bruce Fields
  2005-03-10 17:11     ` Greg KH
  1 sibling, 0 replies; 6+ messages in thread
From: J. Bruce Fields @ 2005-03-10 16:24 UTC (permalink / raw)
  To: Mark M. Hoffman; +Cc: tridge, Greg KH, LKML, Neil Brown, Andrew Morton

On Thu, Mar 10, 2005 at 12:47:37AM -0500, Mark M. Hoffman wrote:
> Thanks for the suggestions.  It wasn't very important to me so I didn't
> make time to follow up on it.  I was just playing w/ ccache at the time.
> 
> Finally I noticed this patch from -mm1... and it solves the problem.
> 
> nfsd--lockd-dont-try-to-match-callback-requests-against-export-table.patch
> 
> How I tested: I applied the first 12 patches in 2.6.11-mm1; the above
> mentioned was last - couldn't reproduce the bug.  When I unapplied just
> that one, I saw it again.
> 
> original bug report:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110238645132535&w=3
> 
> Greg: have you considered this one for 2.6.11.x?

That patch depends on 3 of the previous 4 patches.  Taken together I
doubt they meet the criteria for 2.6.11.x.

It's probably possible to write a shorter and more obvious one-off fix
just for that tree, but I'm not sure it's worth it for a bug that, while
it's obviously extremely annoying for some workloads, doesn't quite
reach the level of, say, a root exploit.

--b.

* Re: make -j4 gets stuck w/ ccache over NFS - solved!
  2005-03-10  5:47   ` make -j4 gets stuck w/ ccache over NFS - solved! Mark M. Hoffman
  2005-03-10 16:24     ` J. Bruce Fields
@ 2005-03-10 17:11     ` Greg KH
  1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2005-03-10 17:11 UTC (permalink / raw)
  To: Mark M. Hoffman; +Cc: tridge, LKML, J. Bruce Fields, Neil Brown, Andrew Morton

On Thu, Mar 10, 2005 at 12:47:37AM -0500, Mark M. Hoffman wrote:
> Finally I noticed this patch from -mm1... and it solves the problem.
> 
> nfsd--lockd-dont-try-to-match-callback-requests-against-export-table.patch
> 
> How I tested: I applied the first 12 patches in 2.6.11-mm1; the above
> mentioned was last - couldn't reproduce the bug.  When I unapplied just
> that one, I saw it again.
> 
> original bug report:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110238645132535&w=3
> 
> Greg: have you considered this one for 2.6.11.x?

No, it hasn't been submitted to the stable@kernel.org address :)

thanks,

greg k-h
