2.5.18 / ext3 / oracle trouble

public inbox for linux-kernel@vger.kernel.org

* 2.5.18 / ext3 / oracle trouble
@ 2002-05-26 15:26 Zlatko Calusic
  2002-05-26 19:35 ` Andrew Morton
  2002-05-27  7:23 ` Christoph Rohland
  0 siblings, 2 replies; 8+ messages in thread
From: Zlatko Calusic @ 2002-05-26 15:26 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, cr

Hi!

After lots of testing, I can say that 2.5.18 works great in all three
modes of ext3 for all but one purpose. Oracle database still gets
corrupted during insert load. More precisely, online redo log gets
corrupted, database panics and restore is in order.

This leads me to thinking that there's something wrong with sysv
shared memory in 2.5.x. Although the problem could also be in fsync()
or swap_out() & co. paths, it's yet to be discovered.

It could also be that journaled mode helps the trouble, and it could
be that some swapping makes it more certain, but none of these two
facts are proved for sure. Take it as an observation.

Christoph, I don't know if you're still taking care of shmem in 2.5.x,
so take my apologies if you didn't want to see this email.

Regards,
-- 
Zlatko


* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-26 15:26 2.5.18 / ext3 / oracle trouble Zlatko Calusic
@ 2002-05-26 19:35 ` Andrew Morton
  2002-05-27  7:23 ` Christoph Rohland
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2002-05-26 19:35 UTC (permalink / raw)
  To: zlatko.calusic; +Cc: linux-kernel, cr

Zlatko Calusic wrote:
> 
> Hi!
> 
> After lots of testing, I can say that 2.5.18 works great in all three
> modes of ext3 for all but one purpose. Oracle database still gets
> corrupted during insert load. More precisely, online redo log gets
> corrupted, database panics and restore is in order.
> 
> This leads me to thinking that there's something wrong with sysv
> shared memory in 2.5.x. Although the problem could also be in fsync()
> or swap_out() & co. paths, it's yet to be discovered.
> 
> It could also be that journaled mode helps the trouble, and it could
> be that some swapping makes it more certain, but none of these two
> facts are proved for sure. Take it as an observation.
> 
> Christoph, I don't know if you're still taking care of shmem in 2.5.x,
> so take my apologies if you didn't want to see this email.
> 

Are you able to try it on ext2?

Thanks.

-


* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-26 15:26 2.5.18 / ext3 / oracle trouble Zlatko Calusic
  2002-05-26 19:35 ` Andrew Morton
@ 2002-05-27  7:23 ` Christoph Rohland
  2002-05-27  7:52   ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Rohland @ 2002-05-27  7:23 UTC (permalink / raw)
  To: zlatko.calusic; +Cc: linux-kernel, Andrew Morton, Hugh Dickins

Hi Zlatko,

On Sun, 26 May 2002, Zlatko Calusic wrote:
> Hi!
> 
> After lots of testing, I can say that 2.5.18 works great in all
> three modes of ext3 for all but one purpose. Oracle database still
> gets corrupted during insert load. More precisely, online redo log
> gets corrupted, database panics and restore is in order.
> 
> This leads me to thinking that there's something wrong with sysv
> shared memory in 2.5.x. Although the problem could also be in
> fsync() or swap_out() & co. paths, it's yet to be discovered.
> 
> It could also be that journaled mode helps the trouble, and it could
> be that some swapping makes it more certain, but none of these two
> facts are proved for sure. Take it as an observation.
> 
> Christoph, I don't know if you're still taking care of shmem in
> 2.5.x, so take my apologies if you didn't want to see this email.
> 
> Regards,
> -- 
> Zlatko

Unfortunately I do not have the time to work on shmem right now. Hugh
Dickins is the right guy to contact nowadays.

Greetings
		Christoph



* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-27  7:23 ` Christoph Rohland
@ 2002-05-27  7:52   ` Andrew Morton
  2002-05-27  8:43     ` Zlatko Calusic
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2002-05-27  7:52 UTC (permalink / raw)
  To: Christoph Rohland; +Cc: zlatko.calusic, linux-kernel, Hugh Dickins

Christoph Rohland wrote:
> 
> Hi Zlatko,
> 
> On Sun, 26 May 2002, Zlatko Calusic wrote:
> > Hi!
> >
> > After lots of testing, I can say that 2.5.18 works great in all
> > three modes of ext3 for all but one purpose. Oracle database still
> > gets corrupted during insert load. More precisely, online redo log
> > gets corrupted, database panics and restore is in order.
> >
> > This leads me to thinking that there's something wrong with sysv
> > shared memory in 2.5.x. Although the problem could also be in
> > fsync() or swap_out() & co. paths, it's yet to be discovered.
> >
> > It could also be that journaled mode helps the trouble, and it could
> > be that some swapping makes it more certain, but none of these two
> > facts are proved for sure. Take it as an observation.
> >
> > Christoph, I don't know if you're still taking care of shmem in
> > 2.5.x, so take my apologies if you didn't want to see this email.
> >
> > Regards,
> > --
> > Zlatko
> 
> Unfortunately I do not have the time to work on shmem right now. Hugh
> Dickins is the right guy to contact nowadays.
> 

The most likely suspect here is that the heavy fsync() load is triggering
some timing problem in ext3 - it'll be pushing the commits through
at a high rate.

I'll teach fsx-linux (great test app, btw) about fsync() and see
how it stands up.  And if Zlatko can retest on ext2 that would be a
big help.
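
As a rough illustration of the kind of fsync()-heavy check described
above, the following minimal Python sketch writes blocks, calls fsync()
after each one, and then reads everything back for comparison. It is not
fsx-linux and does not reproduce its operation mix; the function name,
file name, block size and round count are all assumptions made for the
example. Note that the read-back may be served from the page cache, so a
clean result does not prove that the on-disk copy is intact.

```python
import os
import random
import tempfile


def fsync_write_verify(path, rounds=50, block_size=4096, seed=0):
    """Write pseudo-random blocks with an fsync() after each write,
    then re-read every block and return the indices that differ."""
    rng = random.Random(seed)
    expected = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for i in range(rounds):
            block = bytes(rng.getrandbits(8) for _ in range(block_size))
            os.pwrite(fd, block, i * block_size)
            os.fsync(fd)  # force a flush (and, on ext3, a journal commit) per write
            expected.append(block)

        mismatches = []
        for i, block in enumerate(expected):
            if os.pread(fd, block_size, i * block_size) != block:
                mismatches.append(i)
        return mismatches
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical scratch file; point this at the filesystem under test.
    with tempfile.TemporaryDirectory() as scratch:
        bad = fsync_write_verify(os.path.join(scratch, "fsync_test.dat"))
        print("mismatching blocks:", bad)
```

An empty list means every block read back as written. To catch the sort
of corruption reported in this thread, the loop would presumably have to
run alongside memory pressure and be checked again after a remount.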

-


* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-27  7:52   ` Andrew Morton
@ 2002-05-27  8:43     ` Zlatko Calusic
  2002-05-27 20:02       ` Zlatko Calusic
  0 siblings, 1 reply; 8+ messages in thread
From: Zlatko Calusic @ 2002-05-27  8:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Christoph Rohland, linux-kernel, Hugh Dickins

Andrew Morton <akpm@zip.com.au> writes:

> Christoph Rohland wrote:
>> 
>> Hi Zlatko,
>> 
>> On Sun, 26 May 2002, Zlatko Calusic wrote:
>> > Hi!
>> >
>> > After lots of testing, I can say that 2.5.18 works great in all
>> > three modes of ext3 for all but one purpose. Oracle database still
>> > gets corrupted during insert load. More precisely, online redo log
>> > gets corrupted, database panics and restore is in order.
>> >
>> > This leads me to thinking that there's something wrong with sysv
>> > shared memory in 2.5.x. Although the problem could also be in
>> > fsync() or swap_out() & co. paths, it's yet to be discovered.
>> >
>> > It could also be that journaled mode helps the trouble, and it could
>> > be that some swapping makes it more certain, but none of these two
>> > facts are proved for sure. Take it as an observation.
>> >
>> > Christoph, I don't know if you're still taking care of shmem in
>> > 2.5.x, so take my apologies if you didn't want to see this email.
>> >
>> > Regards,
>> > --
>> > Zlatko
>> 
>> Unfortunately I do not have the time to work on shmem right now. Hugh
>> Dickins is the right guy to contact nowadays.
>> 
>
> The most likely suspect here is that the heavy fsync() load is triggering
> some timing problem in ext3 - it'll be pushing the commits through
> at a high rate.
>
> I'll teach fsx-linux (great test app, btw) about fsync() and see
> how it stands up.  And if Zlatko can retest on ext2 that would be a
> big help.
>

This is just a short notice so that you know I'm working on it.

I did some testing last evening, but I need to do some more
comprehensive ones before any meaningful conclusion.

1 test: compiled ext2 in, mounted partitions as ext2, tests passed
        (no corruption)
2 test: rebooted, mounted as ext3(journal/writeback). This time even
        ext3 passed tests, so I got confused :)
3 test: pushed things harder on ext3, machine started swapping,
        restarted tests and finally it choked (some kind of smon
        non-fatal error 1/100, problem with writing scn, and instance
        shutdown)

Obviously I need to perform tests on ext2 with swap load, and repeat
them a few times. Will do this evening (it takes some time to recover a
database after a corruption, so it's slightly time consuming).

Later,
-- 
Zlatko


* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-27  8:43     ` Zlatko Calusic
@ 2002-05-27 20:02       ` Zlatko Calusic
  2002-05-27 20:28         ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Zlatko Calusic @ 2002-05-27 20:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Hugh Dickins

Zlatko Calusic <zlatko.calusic@iskon.hr> writes:
>
> Obviously I need to perform tests on ext2 with swap load, and repeat
> them a few times. Will do this evening (it takes some time to recover a
> database after a corruption, so it's slightly time consuming).
>

And I did get some interesting results. :)
I found a great test case, rebuilding database after corruption. :)

It consists of recreation of all tablespaces, initializing data
dictionary and finally importing useful data. The whole process takes
between 11 and 14 minutes, depending on the type of FS. It's a
write-intensive workload and induces some paging even with the 768 MB
of RAM I have. Did I forget to say that all this is on an SMP machine,
a dual PIII? It might matter.

And you know what, corruption doesn't depend on the type of FS. It
happens on both ext2 & ext3. It's just more likely to see it when
running on ext3.

Anyway, I managed to pinpoint the problem, it's paging that's the
culprit. When I turned off my swap partition (swapoff -a), rebuild
went correctly. So I was right, swapping will get you in
trouble.
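
A run with swap disabled only means something if swap really is off for
the whole test. As a small sketch, the check below parses the text of
/proc/swaps (one header line, then one line per active swap area) and
reports which areas are still enabled. The function names are invented
for this example and are not part of any existing tool.

```python
def active_swap_areas(swaps_text):
    """Return the device or file names listed in /proc/swaps-style text.

    The first line is the column header; every later non-empty line
    describes one active swap area, with the name in the first column.
    """
    lines = swaps_text.strip().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


def swap_is_off(path="/proc/swaps"):
    """True if no swap area is currently active (Linux only)."""
    with open(path) as f:
        return not active_swap_areas(f.read())
```

Calling swap_is_off() before and after the rebuild would show whether
swap stayed disabled for the whole run.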

I also tried to push the machine harder into swap, with artificial
load (typical malloc() in the loop), and it locked up hard after some
time (minute or two).
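
For readers who want to reproduce that kind of artificial load, here is
a bounded Python sketch of the idea. The test described above used
malloc() in a loop (presumably in C and without a limit); this version
is a stand-in that allocates a fixed total and touches every page so the
memory is actually backed rather than lazily mapped. The function name
and the default sizes are assumptions.

```python
def apply_memory_pressure(total_mb, chunk_mb=16, page_size=4096):
    """Allocate roughly total_mb of memory in chunk_mb pieces and write
    one byte to every page so each page is actually faulted in.

    Returns the list of buffers; the memory stays allocated for as long
    as the caller holds on to that list.
    """
    chunks = []
    for _ in range(total_mb // chunk_mb):
        buf = bytearray(chunk_mb * 1024 * 1024)
        # A zero-filled allocation may not be resident until written to,
        # so dirty the first byte of each page.
        for offset in range(0, len(buf), page_size):
            buf[offset] = 1
        chunks.append(buf)
    return chunks
```

Start with a total well below physical RAM and raise it gradually: as
reported above, pushing hard into swap locked the machine up, so this
belongs only on a box that can be rebooted.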

And during one of the tests on ext3, when machine actually survived,
just after exiting X I had a welcome message waiting, saying something
like this:

 Assertion failure: journal_dirty_metadata() at transaction.c:1146
 "jh->b_frozen_data == 0"

Don't know if it's related, but could be useful to someone.

That's it. I'm back to 2.4.19-pre8 for the time being, but if anybody
needs more testing...

Regards,
-- 
Zlatko


* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-27 20:02       ` Zlatko Calusic
@ 2002-05-27 20:28         ` Andrew Morton
  2002-05-27 20:29           ` Zlatko Calusic
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2002-05-27 20:28 UTC (permalink / raw)
  To: zlatko.calusic; +Cc: linux-kernel, Hugh Dickins

Zlatko Calusic wrote:
> 
> Zlatko Calusic <zlatko.calusic@iskon.hr> writes:
> >
> > Obviously I need to perform tests on ext2 with swap load, and repeat
> > them a few times. Will do this evening (it takes some time to recover a
> > database after a corruption, so it's slightly time consuming).
> >
> 
> And I did get some interesting results. :)
> I found a great test case, rebuilding database after corruption. :)
> 
> It consists of recreation of all tablespaces, initializing data
> dictionary and finally importing useful data. The whole process takes
> between 11 and 14 minutes, depending on the type of FS. It's a
> write-intensive workload and induces some paging even with the 768 MB
> of RAM I have. Did I forget to say that all this is on an SMP machine,
> a dual PIII? It might matter.
> 
> And you know what, corruption doesn't depend on the type of FS. It
> happens on both ext2 & ext3. It's just more likely to see it when
> running on ext3.
> 
> Anyway, I managed to pinpoint the problem, it's paging that's the
> culprit. When I turned off my swap partition (swapoff -a), rebuild
> went correctly. So I was right, swapping will get you in
> trouble.

Thanks.  I'll cook up a test for that.

> I also tried to push the machine harder into swap, with artificial
> load (typical malloc() in the loop), and it locked up hard after some
> time (minute or two).
> 
> And during one of the tests on ext3, when machine actually survived,
> just after exiting X I had a welcome message waiting, saying something
> like this:
> 
>  Assertion failure: journal_dirty_metadata() at transaction.c:1146
>  "jh->b_frozen_data == 0"

I've seen them under load with data=journal.  Were you using data=journal
at the time?

-


* Re: 2.5.18 / ext3 / oracle trouble
  2002-05-27 20:28         ` Andrew Morton
@ 2002-05-27 20:29           ` Zlatko Calusic
  0 siblings, 0 replies; 8+ messages in thread
From: Zlatko Calusic @ 2002-05-27 20:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Hugh Dickins

Andrew Morton <akpm@zip.com.au> writes:

>> 
>> And during one of the tests on ext3, when machine actually survived,
>> just after exiting X I had a welcome message waiting, saying something
>> like this:
>> 
>>  Assertion failure: journal_dirty_metadata() at transaction.c:1146
>>  "jh->b_frozen_data == 0"
>
> I've seen them under load with data=journal.  Were you using data=journal
> at the time?
>

Yes.

Don't know if it's strictly needed, as Oracle tries to keep consistency
on its own, but it can't hurt, I think (except performance-wise, but
sometimes it can be a win, too).
-- 
Zlatko

