* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: Andi Kleen @ 2004-07-17 18:12 UTC
To: R. J. Wysocki; +Cc: linux-kernel
On Sat, Jul 17, 2004 at 06:26:03PM +0200, R. J. Wysocki wrote:
> I got this on a dual Opteron system on 2.6.8-rc1 with the latest x86-64
> patchset from Andi:
Does it happen with x86_64-2.6.8-1 too ?
-Andi
* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: R. J. Wysocki @ 2004-07-17 19:09 UTC
To: Andi Kleen; +Cc: linux-kernel
On Saturday 17 of July 2004 20:12, Andi Kleen wrote:
> On Sat, Jul 17, 2004 at 06:26:03PM +0200, R. J. Wysocki wrote:
> > I got this on a dual Opteron system on 2.6.8-rc1 with the latest x86-64
> > patchset from Andi:
>
> Does it happen with x86_64-2.6.8-1 too ?
It did not happen when I was running that kernel, but I had only run it a
couple of times. It generally does not happen with this one either. It has
happened only once and I reported it immediately. But ...
... I saw something very similar on 2.6.7-rc1 and I'm now looking at the log
(attached - it's partially corrupted, because /var was on /dev/sdb5 that
failed too). The hardware configuration was similar to the current one,
AFAIR.
Well, it looks like the whole SCSI bus sometimes goes south for some reason
and it may very well be a hardware problem that manifests itself in such a
(strange?) way. Or not. Anyway, it certainly had not happened on this
hardware _before_ 2.6.7-rc1.
I have no idea what to do to make it happen again (suggestions welcome).
rjw
--
Rafael J. Wysocki
----------------------------
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
-- Richard P. Feynman
[-- Attachment #2: 2.6.7-rc1.log.gz --]
[-- Type: application/x-gzip, Size: 3833 bytes --]
* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: R. J. Wysocki @ 2004-07-18 12:48 UTC
To: Andi Kleen; +Cc: linux-kernel
Andi,
On Saturday 17 of July 2004 21:09, R. J. Wysocki wrote:
> On Saturday 17 of July 2004 20:12, Andi Kleen wrote:
> > On Sat, Jul 17, 2004 at 06:26:03PM +0200, R. J. Wysocki wrote:
> > > I got this on a dual Opteron system on 2.6.8-rc1 with the latest x86-64
> > > patchset from Andi:
> >
> > Does it happen with x86_64-2.6.8-1 too ?
>
> It did not happen when I was running that kernel, but I had only run it for
> a couple of times. It generally does not happen with this one either.
> It's happened only once and I reported it immediately. But ...
>
[snip]
I had this problem again this morning. I was unpacking the kernel tarball to
/dev/sda8 and it went south (the tarball had been partially unpacked before
the partition was remounted read-only). Then I went back to 2.6.7 and ran
fsck, which found some errors (obviously) and fixed them. Next (still on
2.6.7), I unpacked the kernel to /dev/sda8 again and compiled 2.6.8-rc2. I
ran it, unpacked the kernel to /dev/sda8 again and compiled it, and
everything worked. Then I applied your patch on top of the newly created
2.6.8-rc2 tree and compiled the kernel. After installing and running it, I
tried to unpack the kernel to /dev/sda8 again and it went south, so I went
back to the "plain" 2.6.8-rc2, ran fsck and fixed the partition, unpacked the
kernel to /dev/sda8, and it all worked.
So it seems there's something in your patch that causes this misbehavior.
rjw
* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: R. J. Wysocki @ 2004-07-18 21:38 UTC
To: Andi Kleen; +Cc: linux-kernel
On Sunday 18 of July 2004 14:48, R. J. Wysocki wrote:
> Andi,
>
> On Saturday 17 of July 2004 21:09, R. J. Wysocki wrote:
> > On Saturday 17 of July 2004 20:12, Andi Kleen wrote:
> > > On Sat, Jul 17, 2004 at 06:26:03PM +0200, R. J. Wysocki wrote:
> > > > I got this on a dual Opteron system on 2.6.8-rc1 with the latest
> > > > x86-64 patchset from Andi:
> > >
> > > Does it happen with x86_64-2.6.8-1 too ?
> >
> > It did not happen when I was running that kernel, but I had only run it
> > for a couple of times. It generally does not happen with this one
> > either. It's happened only once and I reported it immediately. But ...
>
> [snip]
>
> I had this problem again this morning. I was unpacking the kernel tarball
> to /dev/sda8 and it went south (the tarball had been partially unpacked
> before the partition was remounted r-o). Then, I got back to 2.6.7 and ran
> fsck - now it found some errors (obviously) and fixed them. Next (on
> 2.6.7), I unpacked the kernel to /dev/sda8 (again) and compiled the
> 2.6.8-rc2. I ran it, unpacked the kernel to /dev/sda8 (again) and compiled
> it - everything worked. Then, I applied your patch on top of the newly
> created 2.6.8-rc2 tree and compiled the kernel. After installing and
> running it I tried to unpack the kernel to /dev/sda8 (again) and it went
> south, so I got back to the "plain" 2.6.8-rc2, ran fsck and fixed the
> partition, unpacked the kernel to /dev/sda8 - and it all worked.
>
> So, it seems, there's something in your patch that causes this misbehavior.
To clarify: in the above "your patch" means "x86_64-2.6.8rc1-1". I should
have called it by name. Sorry for the confusion,
rjw
* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: Andi Kleen @ 2004-07-20 12:02 UTC
To: R. J. Wysocki; +Cc: linux-kernel
> I had this problem again this morning. I was unpacking the kernel tarball to
> /dev/sda8 and it went south (the tarball had been partially unpacked before
> the partition was remounted r-o). Then, I got back to 2.6.7 and ran fsck -
> now it found some errors (obviously) and fixed them. Next (on 2.6.7), I
> unpacked the kernel to /dev/sda8 (again) and compiled the 2.6.8-rc2. I ran
> it, unpacked the kernel to /dev/sda8 (again) and compiled it - everything
> worked. Then, I applied your patch on top of the newly created 2.6.8-rc2
> tree and compiled the kernel. After installing and running it I tried to
> unpack the kernel to /dev/sda8 (again) and it went south, so I got back to
> the "plain" 2.6.8-rc2, ran fsck and fixed the partition, unpacked the kernel
> to /dev/sda8 - and it all worked.
>
> So, it seems, there's something in your patch that causes this misbehavior.
In which patch exactly? x86_64-2.6.8rc1-1 or x86_64-2.6.8rc1-2?
If it started with -2, can you check whether -1 has the problem too?
-Andi
* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: Andi Kleen @ 2004-07-20 12:04 UTC
To: R. J. Wysocki; +Cc: linux-kernel
> To clarify: in the above "your patch" means "x86_64-2.6.8rc1-1". I should
> have called it by name. Sorry for the confusion,
Hmm. Do you have CONFIG_IOMMU_DEBUG on or iommu=force set?
-Andi
* Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA
From: R. J. Wysocki @ 2004-07-20 14:23 UTC
To: Andi Kleen; +Cc: linux-kernel
On Tuesday 20 of July 2004 14:04, Andi Kleen wrote:
> > To clarify: in the above "your patch" means "x86_64-2.6.8rc1-1". I
> > should have called it by name. Sorry for the confusion,
>
> Hmm. Do you have CONFIG_IOMMU_DEBUG on or iommu=force set?
No, unless iommu=force is the default (I haven't turned it on explicitly).
rjw
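[Editor's sketch, not part of the original message.] One way to answer the
question about CONFIG_IOMMU_DEBUG and iommu=force is to inspect the boot
command line and the kernel configuration text. The Python below is a minimal
sketch under stated assumptions: the function names are hypothetical, the
iommu= value is assumed to be a comma-separated option list, and the config
text is assumed to be in the usual `CONFIG_FOO=y` form. It only detects an
explicit setting; it cannot tell whether the kernel forces the IOMMU by
default, which would have to be checked in the kernel source.

```python
def cmdline_has_iommu_force(cmdline: str) -> bool:
    """Return True if an iommu= parameter on the command line lists 'force'.

    Assumption: iommu= takes a comma-separated option list,
    e.g. "iommu=noagp,force".
    """
    for token in cmdline.split():
        if token.startswith("iommu="):
            options = token[len("iommu="):].split(",")
            if "force" in options:
                return True
    return False


def config_option_enabled(config_text: str, option: str) -> bool:
    """Return True if `option` is built in (=y) in kernel .config text.

    Lines such as "# CONFIG_FOO is not set" do not match and count as disabled.
    """
    wanted = f"{option}=y"
    return any(line.strip() == wanted for line in config_text.splitlines())


# Example with made-up inputs (not taken from the reporter's system):
sample_cmdline = "root=/dev/sda1 ro iommu=noagp,force"
sample_config = "CONFIG_GART_IOMMU=y\n# CONFIG_IOMMU_DEBUG is not set\n"
print(cmdline_has_iommu_force(sample_cmdline))
print(config_option_enabled(sample_config, "CONFIG_IOMMU_DEBUG"))
```

On a running system the command line could be read from /proc/cmdline and the
configuration from the .config file used to build the kernel.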