From: Peter Kruse <pk@q-leap.com>
To: linux-kernel@vger.kernel.org
Cc: Peter Kruse <pk@q-leap.com>
Subject: Random packets loss under x86_64 - routing?
Date: Fri, 14 Jan 2005 16:35:51 +0100
Message-ID: <41E7E6D7.10303@q-leap.com>

kernel: 2.4.28 smp x86_64

Hello,

We are seeing a problem on our amd64 Beowulf clusters and could use
some help.
When pinging other machines in a cluster on the same
subnet, the ping fails for some machines, but only right after boot
and after a day or so of idle time.  After a few minutes the
ping packets go through.

Other things we observed:

1. it is not always the same machines that fail
2. if it fails then no packets are sent or received (checked with
    tcpdump on sending and target host) although all hosts are up.
3. There is no difference if using a 64bit or 32bit ping
4. It does not depend on the network adapter or other hardware: we have
    machines with different NICs connected to different switches with the
    same problem.
5. It does, however, only happen on amd64 (biarch) systems and not on
    pure i386 systems, so it looks kernel-related.
6. I have to reboot to reproduce the problem, it's not enough to
    unload and load the network module.
7. It only happens with ping, not with ssh.

The ping always succeeds when run with the "-r" switch,
which bypasses "the normal routing tables and send directly to a host
on an attached interface".  This makes us think that it is indeed
related to routing - but how?
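
For reference, as far as I understand it, "-r" makes ping set the
SO_DONTROUTE socket option on its socket.  Below is a minimal sketch of
that option, not ping's actual code: it uses a UDP socket so it runs
without root, and the helper names are made up for illustration.

```python
import socket


def make_dontroute_socket():
    """Hypothetical helper: UDP socket with SO_DONTROUTE enabled.

    With SO_DONTROUTE set, the kernel skips gateway routes and only
    sends to hosts on a directly attached network.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, 1)
    return sock


def dontroute_enabled(sock):
    """Hypothetical helper: report whether SO_DONTROUTE is set."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE) != 0


if __name__ == "__main__":
    s = make_dontroute_socket()
    print("SO_DONTROUTE set:", dontroute_enabled(s))
    s.close()
```

Sending a datagram to a failing host from such a socket and from a
default one would show whether the difference is tied to this option
rather than to ping itself.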

I can provide strace output if you think that could help.
What else can I do to gather more information?

Please cc to me, as I'm not subscribed, thanks.

	Peter

-- 
Peter Kruse <pk@q-leap.com>, Chief Software Architect
Q-Leap Networks GmbH
phone: +497071-703171, mobile: +49172-6340044



