public inbox for linux-mtd@lists.infradead.org
* Database on JFFS2?
@ 2003-04-14  8:08 Esben Nielsen
  2003-04-14 23:03 ` Charles Manning
  0 siblings, 1 reply; 22+ messages in thread
From: Esben Nielsen @ 2003-04-14  8:08 UTC (permalink / raw)
  To: linux-mtd

Is it possible to run an SQL database (MySQL, for instance) on a JFFS2 filesystem?

Running a database unmodified seems to be out of the question because of the 
many writes a database does. At least some tweaking is needed.

Does anyone have experience with MySQL or other database engines running on a 
JFFS2 volume?

Esben Nielsen

* Re: Database on JFFS2?
  2003-04-14  8:08 Esben Nielsen
@ 2003-04-14 23:03 ` Charles Manning
  2003-04-15  8:06   ` Esben Nielsen
  0 siblings, 1 reply; 22+ messages in thread
From: Charles Manning @ 2003-04-14 23:03 UTC (permalink / raw)
  To: esn, linux-mtd

On Mon, 14 Apr 2003 20:08, Esben Nielsen wrote:
> Is it possible to run an SQL database (MySQL, for instance) on a JFFS2
> filesystem?

What kind of flash? NOR flash is very slow for writing, making some database 
solutions impractical. 

People are using NAND with YAFFS in database-like solutions.

-- Charles

* Re: Database on JFFS2?
  2003-04-14 23:03 ` Charles Manning
@ 2003-04-15  8:06   ` Esben Nielsen
  2003-04-15  8:51     ` Holger Schurig
  2003-04-15  9:13     ` Jörn Engel
  0 siblings, 2 replies; 22+ messages in thread
From: Esben Nielsen @ 2003-04-15  8:06 UTC (permalink / raw)
  To: manningc2, linux-mtd

Our flash is NOR (Intel StrataFlash). What we are going for is a standard SQL 
solution, so it isn't enough that the filesystem has database-like 
behaviour.

Someone from this list pointed out sqlite to me yesterday. I am looking into 
that now. It has a log system. The ideal would be a database that can be 
tuned not to write too often but that, when it does write, flushes to flash 
immediately to prevent data corruption. I am not quite sure how to make it 
work with JFFS2. Does JFFS2 write immediately on fsync(), or is the data 
buffered in RAM, making the database believe it is safe to delete its 
transaction log?
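
As a point of reference for the question above, here is a minimal sketch (in
Python, for brevity) of what a database does at commit time: append a record,
then call fsync() before treating the record as durable. The helper name
`append_durably` and the file name are made up for illustration. Whether the
bytes are physically in flash when fsync() returns is decided by the
filesystem underneath, not by this code.

```python
import os
import tempfile


def append_durably(path, record):
    """Append record (bytes) to path, then ask the kernel to flush it to storage.

    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = 0
        while written < len(record):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, record[written:])
        # Only after fsync() returns may the caller discard its own copy
        # (for a database: its transaction log entry).
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    # Hypothetical journal file in a scratch directory.
    scratch = tempfile.mkdtemp()
    append_durably(os.path.join(scratch, "journal.log"), b"BEGIN;INSERT;COMMIT\n")
```

Every call costs at least one write to the medium, which is why the number of
commits matters as much as the amount of data.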

Esben

* Re: Database on JFFS2?
  2003-04-15  8:06   ` Esben Nielsen
@ 2003-04-15  8:51     ` Holger Schurig
  2003-04-15 15:06       ` Esben Nielsen
  2003-04-15  9:13     ` Jörn Engel
  1 sibling, 1 reply; 22+ messages in thread
From: Holger Schurig @ 2003-04-15  8:51 UTC (permalink / raw)
  To: linux-mtd

> into that now.  It has a log system. The ideal would be a database that
> can be tuned not to write too often but that, when it does write, flushes
> to flash immediately to prevent data corruption.

That is simple to find out. sqlite is so small, you surely could add some 
printf() statements in the right places to find out when it writes data.

Hey, even "strace -efile" may help you. "man strace" is your friend.

* Re: Database on JFFS2?
  2003-04-15  8:06   ` Esben Nielsen
  2003-04-15  8:51     ` Holger Schurig
@ 2003-04-15  9:13     ` Jörn Engel
  1 sibling, 0 replies; 22+ messages in thread
From: Jörn Engel @ 2003-04-15  9:13 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: manningc2, linux-mtd

On Tue, 15 April 2003 10:06:51 +0200, Esben Nielsen wrote:
> 
> Our flash is NOR (Intel StrataFlash). What we are going for is a standard SQL 
> solution, so it isn't enough that the filesystem has database-like 
> behaviour.
> 
> Someone from this list pointed out sqlite to me yesterday. I am looking into 
> that now. It has a log system. The ideal would be a database that can be 
> tuned not to write too often but that, when it does write, flushes to flash 
> immediately to prevent data corruption. I am not quite sure how to make it 
> work with JFFS2. Does JFFS2 write immediately on fsync(), or is the data 
> buffered in RAM, making the database believe it is safe to delete its 
> transaction log?

afaik jffs2 writes immediately to flash, even without fsync, as long
as you use NOR. NAND holds a buffer for 2s, I'd have to check when
this is flushed exactly.

Jörn

-- 
Those who come seeking peace without a treaty are plotting.
-- Sun Tzu

* Re: Database on JFFS2?
  2003-04-15  8:51     ` Holger Schurig
@ 2003-04-15 15:06       ` Esben Nielsen
  2003-04-15 15:39         ` Jörn Engel
  0 siblings, 1 reply; 22+ messages in thread
From: Esben Nielsen @ 2003-04-15 15:06 UTC (permalink / raw)
  To: Holger Schurig, linux-mtd

Yes, but are those writes synchronized with the actual flash writes, so that I 
can be sure a write is _physically_ complete when fsync() returns? 
They certainly ought to be, or many applications would get into trouble with 
corrupt data at a sudden reboot.

Let us assume it is so. Then I need to tweak sqlite to only force an fsync() 
when the JFFS2 layer is ready to commit anyway, to avoid too many writes.
Any idea how to do that?

Esben

* Re: Database on JFFS2?
  2003-04-15 15:06       ` Esben Nielsen
@ 2003-04-15 15:39         ` Jörn Engel
  2003-04-15 16:11           ` Jasmine Strong
  0 siblings, 1 reply; 22+ messages in thread
From: Jörn Engel @ 2003-04-15 15:39 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: linux-mtd, Holger Schurig

On Tue, 15 April 2003 17:06:36 +0200, Esben Nielsen wrote:
> 
> Yes, but are those writes synchronized with the actual flash writes, so that I 
> can be sure a write is _physically_ complete when fsync() returns? 
> They certainly ought to be, or many applications would get into trouble with 
> corrupt data at a sudden reboot.
> 
> Let us assume it is so. Then I need to tweak sqlite to only force an fsync() 
> when the JFFS2 layer is ready to commit anyway, to avoid too many writes.
> Any idea how to do that?

On NOR, there is absolutely no point in avoiding many writes. Write
the same amount of data in fewer bigger chunks and you have the same
problems.

If you can reduce the amount of data written to flash, do that.

Jörn

-- 
Courage is not the absence of fear, but rather the judgement that
something else is more important than fear.
-- Ambrose Redmoon

* Re: Database on JFFS2?
  2003-04-15 15:39         ` Jörn Engel
@ 2003-04-15 16:11           ` Jasmine Strong
  2003-04-15 16:14             ` Jörn Engel
  0 siblings, 1 reply; 22+ messages in thread
From: Jasmine Strong @ 2003-04-15 16:11 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-mtd, Holger Schurig, Esben Nielsen

On Tuesday, Apr 15, 2003, at 16:39 Europe/London, Jörn Engel wrote:
>
> On NOR, there is absolutely no point in avoiding many writes.

Unless it would cause many erases, which would slow things down a lot...
If data integrity is the crucial issue (which it usually is) there's no 
way to avoid this, though.

The only other issue I can foresee is the overhead of the jffs structures 
at the start of each write; if there are only a few bytes being written 
each time, it quickly becomes very significant.

Jas.

* Re: Database on JFFS2?
  2003-04-15 16:11           ` Jasmine Strong
@ 2003-04-15 16:14             ` Jörn Engel
  2003-04-15 16:23               ` Jasmine Strong
  0 siblings, 1 reply; 22+ messages in thread
From: Jörn Engel @ 2003-04-15 16:14 UTC (permalink / raw)
  To: Jasmine Strong; +Cc: linux-mtd, Holger Schurig, Esben Nielsen

On Tue, 15 April 2003 17:11:44 +0100, Jasmine Strong wrote:
> On Tuesday, Apr 15, 2003, at 16:39 Europe/London, Jörn Engel wrote:
> >
> >On NOR, there is absolutely no point in avoiding many writes.
> 
> Unless it would cause many erases, which would slow things down a lot...
> If data integrity is the crucial issue (which it usually is) there's no 
> way to avoid this, though.

Erases get triggered by garbage collection, which depends on the
amount of data written, not the chunk size.

> The only other issue I can foresee is the overhead of the jffs structures 
> at the start of each write; if there are only a few bytes being written 
> each time, it quickly becomes very significant.

Correct. iirc, node header was 70 Bytes, so writes below 700 Bytes
will have significant overhead (10% in my book). Point taken.
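
To put rough numbers on this, a small sketch. It assumes the 70-byte node
header recalled above, exactly one header per write and no compression; all
three are simplifications, and the function names are made up for
illustration.

```python
import math

NODE_HEADER_BYTES = 70  # figure quoted from memory above, not checked against the source


def overhead_ratio(payload_bytes, header_bytes=NODE_HEADER_BYTES):
    """Header bytes as a fraction of the payload bytes in one write."""
    return header_bytes / payload_bytes


def flash_bytes(total_payload, chunk_bytes, header_bytes=NODE_HEADER_BYTES):
    """Bytes that reach flash when total_payload is written in chunk_bytes pieces."""
    chunks = math.ceil(total_payload / chunk_bytes)
    return total_payload + chunks * header_bytes


if __name__ == "__main__":
    for chunk in (7, 70, 700, 7000):
        print(chunk, overhead_ratio(chunk), flash_bytes(7000, chunk))
```

Since garbage collection and erases follow the number of bytes that reach
flash, the same payload written in very small pieces costs correspondingly
more erase traffic.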

Jörn

-- 
Fantasy is more important than knowledge. Knowledge is limited,
while fantasy embraces the whole world.
-- Albert Einstein

* Re: Database on JFFS2?
  2003-04-15 16:14             ` Jörn Engel
@ 2003-04-15 16:23               ` Jasmine Strong
  2003-04-16 11:16                 ` Jörn Engel
  0 siblings, 1 reply; 22+ messages in thread
From: Jasmine Strong @ 2003-04-15 16:23 UTC (permalink / raw)
  To: Jörn Engel; +Cc: Holger Schurig, linux-mtd, Esben Nielsen


On Tuesday, Apr 15, 2003, at 17:14 Europe/London, Jörn Engel wrote:

> On Tue, 15 April 2003 17:11:44 +0100, Jasmine Strong wrote:
>> Unless it would cause many erases, which would slow things down a 
>> lot...
> Erases get triggered by garbage collection, which depends on the
> amount of data written, not the chunk size.

yes.  I think my two points were actually the same point taken twice :-)
If you're only updating a few bytes of data you will end up writing
a large proportion of log control data.  That'll end up being
responsible for most of the erase traffic.

Still, if you need to be powerfail-safe, I can't see any way of not
doing this.

-Jas.

* Re: Database on JFFS2?
       [not found] <20030415171123.GH7721@wohnheim.fh-wedel.de>
@ 2003-04-16 10:04 ` Esben Nielsen
  2003-04-16 11:35   ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Esben Nielsen @ 2003-04-16 10:04 UTC (permalink / raw)
  To: Jörn Engel, Jasmine Strong; +Cc: linux-mtd, Holger Schurig

I appreciate that you are looking into my problem and that a good discussion 
has come out of it :-)

I did try to follow it, and as far as I can see JFFS2 _only_ writes to flash 
in the GC thread? That contradicts the manual page for fsync(): "fsync copies 
all in core parts of a file to disk, and waits until the device reports 
that all parts are on stable storage."

I am not much into flash technology itself (NOR versus NAND), but I know that 
on our device one write erases and rewrites one block of 128k. Each block can 
only be erased between 1E5 and 1E6 times. We thus have to be careful, as our 
device has to live for 20 years (yes, that is what we promise our 
customers!).

If JFFS2 does do a real write on fsync(), we will get too many physical writes 
unless we buffer our inserts/updates into one big transaction in the 
application layer above the database. On the other hand, if JFFS2 doesn't 
commit changes, we risk getting our database file inconsistent if we lose 
power in the middle :-(
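
Buffering inserts into larger transactions at the application layer could
look roughly like the sketch below. It uses Python's bundled sqlite3 module
purely for illustration (not the sqlite release under discussion here), and
the table name, column names and `insert_batched` helper are hypothetical.
Each commit is the point at which the database has to make its changes
durable, so fewer commits mean fewer forced writes, at the price of losing up
to one uncommitted batch on power failure.

```python
import sqlite3


def insert_batched(conn, rows, batch_size=100):
    """Insert rows, committing once per batch_size rows.

    Returns the number of commits issued.
    """
    commits = 0
    pending = 0
    for row in rows:
        # The sqlite3 module opens a transaction implicitly before the INSERT.
        conn.execute("INSERT INTO samples (ts, value) VALUES (?, ?)", row)
        pending += 1
        if pending == batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final, partially filled batch.
        conn.commit()
        commits += 1
    return commits


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (ts INTEGER, value REAL)")
    print(insert_batched(conn, [(i, i * 0.5) for i in range(250)]))
```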

Now in our application the most important thing is that our database file is 
self-consistent at reboot so that we can get up and running again. If we lose 
the last few inserts before the reboot, we won't go out of business. The ideal 
would be to tweak sqlite to synchronize its fsync with JFFS2, so that the last 
written data resides purely in memory until we have enough for a full erase 
block on flash, and then we sync it all. That way we can minimize the 
number of writes and at the same time avoid inconsistent data at boot.

Another option would be to have JFFS2 use battery-backed SRAM as temporary 
storage. fsync() would then work "correctly" by just writing to that SRAM. 
Is it possible to make JFFS2 use a specific part of memory as a "cache" 
before the GC thread writes it to flash? Is it even possible to make JFFS2 
re-establish that cache at mount, so that no data would be lost if it was 
synced with the cache before a crash?

Esben

On Tuesday 15 April 2003 19:11, Jörn Engel wrote:
> On Tue, 15 April 2003 17:23:59 +0100, Jasmine Strong wrote:
> > On Tuesday, Apr 15, 2003, at 17:14 Europe/London, Jörn Engel wrote:
> > >On Tue, 15 April 2003 17:11:44 +0100, Jasmine Strong wrote:
> > >>Unless it would cause many erases, which would slow things down a
> > >>lot...
> > >
> > >Erases get triggered by garbage collection, which depends on the
> > >amount of data written, not the chunk size.
> >
> > yes.  I think my two points were actually the same point taken twice :-)
> > If you're only updating a few bytes of data you will end up writing
> > a large proportion of log control data.  That'll end up being
> > responsible for most of the erase traffic.
>
> Actually, that shouldn't matter too much. For comparison, I did some
> benchmarks using jffs2 (without compression) as a filesystem for a
> ramdisk.
>
> The benchmark wrote data to jffs2, deleted it and repeated this
> several times to remove statistical noise. Horrible results.
> Then I got a clue and added "sleep 6" after both writing and deleting,
> getting roughly twice the performance. Why?
>
> Under normal operation, the system is idle a lot and the garbage
> collector (GC) has plenty of time to clean up the mess you made. But
> the first benchmark was measuring a system without idle times, so all
> writes were waiting for GC to finally free some space. Wrong.
>
> Back to the Database:
> Even if you write data in very small chunks, the system should have
> enough free time to GC those fragments and reassemble them into larger
> chunks with less overhead, so this doesn't matter.
>
> Unless you permanently operate near the limit. Without the free time
> for GC, this does matter.
>
> > Still, if you need to be powerfail-safe, I can't see any way of not
> > doing this.
>
> Right.
>
> Jörn

* Re: Database on JFFS2?
  2003-04-16 11:35   ` Thomas Gleixner
@ 2003-04-16 11:13     ` Jörn Engel
  2003-04-16 14:05       ` Thomas Gleixner
  2003-04-22  8:07     ` Esben Nielsen
  1 sibling, 1 reply; 22+ messages in thread
From: Jörn Engel @ 2003-04-16 11:13 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Holger Schurig, linux-mtd, esn, Jasmine Strong

On Wed, 16 April 2003 13:35:50 +0200, Thomas Gleixner wrote:
> [lots of good information]
> 
> JFFS2 does no write on fsync, as data are already written.

This is true for NOR, but how about the wbuf for NAND? How does that
respond to sync/fsync?
iirc it used to ignore sync completely so people could potentially get
into trouble if they write-sync-poweroff with <2s between write and
poweroff.

Jörn

-- 
With a PC, I always felt limited by the software available. On Unix, 
I am limited only by my knowledge.
-- Peter J. Schoenster

* Re: Database on JFFS2?
  2003-04-15 16:23               ` Jasmine Strong
@ 2003-04-16 11:16                 ` Jörn Engel
  0 siblings, 0 replies; 22+ messages in thread
From: Jörn Engel @ 2003-04-16 11:16 UTC (permalink / raw)
  To: Jasmine Strong; +Cc: linux-mtd, Holger Schurig, Esben Nielsen

On Tue, 15 April 2003 17:23:59 +0100, Jasmine Strong wrote:
> On Tuesday, Apr 15, 2003, at 17:14 Europe/London, Jörn Engel wrote:
> >On Tue, 15 April 2003 17:11:44 +0100, Jasmine Strong wrote:
> >>Unless it would cause many erases, which would slow things down a 
> >>lot...
> >Erases get triggered by garbage collection, which depends on the
> >amount of data written, not the chunk size.
> 
> yes.  I think my two points were actually the same point taken twice :-)
> If you're only updating a few bytes of data you will end up writing
> a large proportion of log control data.  That'll end up being
> responsible for most of the erase traffic.

Actually, that shouldn't matter too much. For comparison, I did some
benchmarks using jffs2 (without compression) as a filesystem for a
ramdisk.

The benchmark wrote data to jffs2, deleted it and repeated this
several times to remove statistical noise. Horrible results.
Then I got a clue and added "sleep 6" after both writing and deleting,
getting roughly twice the performance. Why?

Under normal operation, the system is idle a lot and the garbage
collector (GC) has plenty of time to clean up the mess you made. But
the first benchmark was measuring a system without idle times, so all
writes were waiting for GC to finally free some space. Wrong.

Back to the Database:
Even if you write data in very small chunks, the system should have
enough free time to GC those fragments and reassemble them into larger
chunks with less overhead, so this doesn't matter.

Unless you permanently operate near the limit. Without the free time
for GC, this does matter.

> Still, if you need to be powerfail-safe, I can't see any way of not
> doing this.

Right.

Jörn

-- 
Sometimes, asking the right question is already the answer.
-- Unknown

* Re: Database on JFFS2?
  2003-04-16 10:04 ` Database on JFFS2? Esben Nielsen
@ 2003-04-16 11:35   ` Thomas Gleixner
  2003-04-16 11:13     ` Jörn Engel
  2003-04-22  8:07     ` Esben Nielsen
  0 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2003-04-16 11:35 UTC (permalink / raw)
  To: esn, Jörn Engel, Jasmine Strong; +Cc: Holger Schurig, linux-mtd

On Wednesday 16 April 2003 12:04, Esben Nielsen wrote:
> I did try to follow it, and as far as I can see JFFS2 _only_ writes to
> flash in the GC thread? That contradicts the manual page for fsync():
> "fsync copies all in core parts of a file to disk, and waits until the
> device reports that all parts are on stable storage."

No, JFFS2 writes all data immediately to the flash chip. GC takes care of 
reclaiming blocks that hold obsolete information. Obsolete data is created 
whenever you change the contents of a file. At some point that space is 
garbage collected: valid nodes residing in the same block are moved to a 
different block, and once the block being GC'ed contains no more valid 
information, it is erased and can be used for writing again.
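
The mechanism described above can be sketched with a toy model. This is not
JFFS2's code or its on-flash format: the class `ToyLog`, the block and node
sizes, and the victim selection rule (most obsolete nodes first) are all
assumptions chosen to keep the illustration short.

```python
class ToyLog:
    """Toy log-structured store: nodes are only ever appended."""

    def __init__(self, n_blocks=3, nodes_per_block=2):
        self.blocks = [[] for _ in range(n_blocks)]  # each block holds (key, value) nodes
        self.nodes_per_block = nodes_per_block
        self.erase_counts = [0] * n_blocks
        self.live = {}    # key -> (block, slot) of the one valid node for that key
        self.current = 0  # block currently being filled

    def _is_valid(self, block, slot):
        key, _ = self.blocks[block][slot]
        return self.live.get(key) == (block, slot)

    def write(self, key, value):
        """Append a node; any older node for the same key becomes obsolete."""
        if len(self.blocks[self.current]) == self.nodes_per_block:
            empty = [b for b, nodes in enumerate(self.blocks) if not nodes]
            if not empty:
                raise RuntimeError("no erased block left; gc() should have run earlier")
            self.current = empty[0]
        nodes = self.blocks[self.current]
        nodes.append((key, value))
        self.live[key] = (self.current, len(nodes) - 1)

    def read(self, key):
        block, slot = self.live[key]
        return self.blocks[block][slot][1]

    def gc(self):
        """Reclaim the block with the most obsolete nodes; return its index or None."""
        def obsolete(block):
            return sum(not self._is_valid(block, s) for s in range(len(self.blocks[block])))

        candidates = [b for b in range(len(self.blocks)) if b != self.current and self.blocks[b]]
        if not candidates:
            return None
        victim = max(candidates, key=obsolete)
        if obsolete(victim) == 0:
            return None
        survivors = [node for s, node in enumerate(self.blocks[victim]) if self._is_valid(victim, s)]
        for key, value in survivors:
            self.write(key, value)    # move still-valid nodes to another block
        self.blocks[victim] = []      # erase the whole block
        self.erase_counts[victim] += 1
        return victim
```

Writes never wait for an erase in this model as long as gc() runs while an
erased block is still available.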

> I am not much into flash technology itself (NOR versus NAND), but I know
> that on our device one write erases and rewrites one block of 128k. Each
> block can only be erased between 1E5 and 1E6 times. We thus have to be
> careful, as our device has to live for 20 years (yes, that is what we
> promise our customers!).

Erase is always a full block (e.g. 128K), but writes happen in small chunks. 

Have you really thought about the numbers?
20 years of 365 days is 7300 days, so per block:
Erase cycles max. 1E5:  13.7 erases per day
Erase cycles max. 1E6:  137 erases per day

I assume a compressed database size of 1MB and a flash size of 8MB, with
the erases spread evenly over all blocks.
Even if you have to write the full database every time, you will be allowed
to do it 
109.6 times per day for max. erase cycles of 1E5 and
1096 times per day for max. erase cycles of 1E6

which means a write every
13 minutes for max. erase cycles of 1E5 and
1.3 minutes for max. erase cycles of 1E6

assuming that your device operates for 20 years, 365 days/year and 24h/day.
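
The arithmetic above can be rechecked with a few lines. This is only a sketch
of the same back-of-the-envelope estimate: it assumes ideal wear levelling
across the whole flash and ignores node headers and garbage-collection
copies. The function name is made up for illustration.

```python
def db_write_budget(erase_cycles, flash_bytes, db_bytes, years=20.0):
    """Return (erases per block per day, full-database writes per day,
    minutes between full-database writes)."""
    days = years * 365
    erases_per_block_per_day = erase_cycles / days
    # With ideal wear levelling, one full-database write uses up
    # db_bytes / flash_bytes of one erase cycle on every block.
    writes_per_day = erases_per_block_per_day * flash_bytes / db_bytes
    minutes_between_writes = 24 * 60 / writes_per_day
    return erases_per_block_per_day, writes_per_day, minutes_between_writes


if __name__ == "__main__":
    MB = 2 ** 20
    for cycles in (1e5, 1e6):
        print(cycles, db_write_budget(cycles, 8 * MB, 1 * MB))
```

The result is within rounding of the figures quoted in the message.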

If you buffer your small changes in sqlite or on a ramfs, which may be located 
in battery-backed SRAM, you can reduce the write cycles further and extend the 
lifetime beyond your retirement age. :)

> If JFFS2 does do a real write on fsync(), we will get too many physical
> writes unless we buffer our inserts/updates into one big transaction in the
> application layer above the database. On the other hand, if JFFS2 doesn't
> commit changes, we risk getting our database file inconsistent if we lose
> power in the middle :-(

JFFS2 does no write on fsync, as data are already written.

-- 
Thomas
________________________________________________________________________
linutronix - competence in embedded & realtime linux
http://www.linutronix.de
mail: tglx@linutronix.de

* Re: Database on JFFS2?
  2003-04-16 14:05       ` Thomas Gleixner
@ 2003-04-16 13:08         ` Jörn Engel
  2003-04-16 14:30         ` matsunaga
  1 sibling, 0 replies; 22+ messages in thread
From: Jörn Engel @ 2003-04-16 13:08 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-mtd, Holger Schurig, esn, Jasmine Strong

On Wed, 16 April 2003 16:05:33 +0200, Thomas Gleixner wrote:
> write - sync -poweroff works on NAND. flush_wbuf is also triggered on umount.

Good. Then either my memory was wrong, or the problem got fixed in the
meantime. Either way, good work!

Jörn

-- 
Those who come seeking peace without a treaty are plotting.
-- Sun Tzu

* Re: Database on JFFS2?
  2003-04-16 11:13     ` Jörn Engel
@ 2003-04-16 14:05       ` Thomas Gleixner
  2003-04-16 13:08         ` Jörn Engel
  2003-04-16 14:30         ` matsunaga
  0 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2003-04-16 14:05 UTC (permalink / raw)
  To: Jörn Engel; +Cc: linux-mtd, Holger Schurig, esn, Jasmine Strong

On Wednesday 16 April 2003 13:13, Jörn Engel wrote:
> On Wed, 16 April 2003 13:35:50 +0200, Thomas Gleixner wrote:
> > [lots of good information]
> >
> > JFFS2 does no write on fsync, as data are already written.
>
> This is true for NOR, but how about the wbuf for NAND? How does that
> respond to sync/fsync?
Sync triggers flush_wbuf. 
NAND uses a writebuffer of page_size, which is 256 / 512 Byte depending on the 
chip type.
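
A toy model of such a page-sized write buffer, to make the behaviour concrete.
This is not the JFFS2 wbuf code: the class name, the zero padding on flush and
the absence of any timer are assumptions made for illustration only.

```python
class ToyWriteBuffer:
    """Collect writes until a full page can be programmed in one go."""

    def __init__(self, page_size=512):
        self.page_size = page_size
        self.buf = bytearray()  # data not yet on the medium
        self.flash = []         # pages programmed so far

    def write(self, data):
        self.buf += data
        # Program every complete page; the remainder stays in RAM.
        while len(self.buf) >= self.page_size:
            self.flash.append(bytes(self.buf[:self.page_size]))
            del self.buf[:self.page_size]

    def flush(self):
        """What a sync or umount would trigger: pad the partial page and program it."""
        if self.buf:
            padding = bytes(self.page_size - len(self.buf))  # zero bytes, illustrative
            self.flash.append(bytes(self.buf) + padding)
            self.buf.clear()
```

Anything still in `buf` at power-off is lost in this model, which is why the
flush on sync matters.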

> iirc it used to ignore sync completely so people could potentially get
> into trouble if they write-sync-poweroff with <2s between write and
> poweroff.
write - sync -poweroff works on NAND. flush_wbuf is also triggered on umount.

-- 
Thomas
________________________________________________________________________
linutronix - competence in embedded & realtime linux
http://www.linutronix.de
mail: tglx@linutronix.de

* Re: Database on JFFS2?
  2003-04-16 14:05       ` Thomas Gleixner
  2003-04-16 13:08         ` Jörn Engel
@ 2003-04-16 14:30         ` matsunaga
  2003-04-16 18:19           ` Thomas Gleixner
  1 sibling, 1 reply; 22+ messages in thread
From: matsunaga @ 2003-04-16 14:30 UTC (permalink / raw)
  To: tglx, Jörn Engel; +Cc: Holger Schurig, linux-mtd, esn, Jasmine Strong

Hi.

> On Wednesday 16 April 2003 13:13, Jörn Engel wrote:
> > On Wed, 16 April 2003 13:35:50 +0200, Thomas Gleixner wrote:
> > > [lots of good information]
> > >
> > > JFFS2 does no write on fsync, as data are already written.
> >
> > This is true for NOR, but how about the wbuf for NAND? How does that
> > respond to sync/fsync?
> Sync triggers flush_wbuf.
> NAND uses a writebuffer of page_size, which is 256 / 512 Byte depending on the
> chip type.
>
> > iirc it used to ignore sync completely so people could potentially get
> > into trouble if they write-sync-poweroff with <2s between write and
> > poweroff.
> write - sync -poweroff works on NAND. flush_wbuf is also triggered on umount.

I think that sys_sync is not supported yet, as is written in jffs2/Todo, right?

* Re: Database on JFFS2?
       [not found] <00512BA4F9D3D311912A009027E9B8F407E4D9@NT>
@ 2003-04-16 15:23 ` Jörn Engel
  0 siblings, 0 replies; 22+ messages in thread
From: Jörn Engel @ 2003-04-16 15:23 UTC (permalink / raw)
  To: Dave Ellis; +Cc: linux-mtd

On Wed, 16 April 2003 10:17:49 -0400, Dave Ellis wrote:
> 
> I have also seen poor performance with jffs2 and a ramdisk. At first
> writes are fast, but after a little while there are pauses of almost
> 5 seconds for a single write. Then it is fast again for less than a 
> second, then the big pause. The problem is worse with a small (512K)
> ramdisk. I don't think it is possible to write to flash fast enough
> to cause the problem.
> 
> The problem is that garbage collection only puts the block to erase
> on the erase list. It then triggers the erase by marking the superblock
> dirty. It can be up to 5 seconds later (with default kupdate times)
> when kupdate writes the superblock and starts the erase.
> 
> I don't want to run kupdate 10 times a second, so I patched nodemgmt.c
> to do the erase directly for MTD_RAM devices. This solves the problem,
> but I am not sure if it is the right (or even safe) solution.

Interesting patch. Thanks!

> +			if (c->mtd->type == MTD_RAM) {

It might be cleaner to put another capability in the flags field, set
it for MTD_RAM and do something like this

			if (c->mtd->flags & MTD_DIRECT_ERASE) {

Jörn

-- 
Do not stop an army on its way home.
-- Sun Tzu

* Re: Database on JFFS2?
  2003-04-16 14:30         ` matsunaga
@ 2003-04-16 18:19           ` Thomas Gleixner
  2003-04-17 16:02             ` matsunaga
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2003-04-16 18:19 UTC (permalink / raw)
  To: matsunaga, Jörn Engel; +Cc: linux-mtd, Holger Schurig, esn, Jasmine Strong

On Wednesday 16 April 2003 16:30, matsunaga wrote:
> > iirc it used to ignore sync completely so people could potentially get
> > into trouble if they write-sync-poweroff with <2s between write and
> > poweroff.
>
> write - sync -poweroff works on NAND. flush_wbuf is also triggered on
> umount.
>
> I think that sys_sync is not supported yet, as is written in jffs2/Todo,
> right?

Sorry, I mixed this up with fsync. sys_sync is not working, just flushing the 
buffer on umount.

When I brought in the timed flush, I tried to use kupdated 
(superblock->s_dirty) and had a bunch of unwanted flushes during consecutive 
writes, because setting sb->s_dirty is asynchronous to the kupdated interval. 
That's why I used the 2-second timer.

I will think about it again.

-- 
Thomas
________________________________________________________________________
linutronix - competence in embedded & realtime linux
http://www.linutronix.de
mail: tglx@linutronix.de

* Re: Database on JFFS2?
  2003-04-16 18:19           ` Thomas Gleixner
@ 2003-04-17 16:02             ` matsunaga
  0 siblings, 0 replies; 22+ messages in thread
From: matsunaga @ 2003-04-17 16:02 UTC (permalink / raw)
  To: tglx, Jörn Engel; +Cc: Holger Schurig, linux-mtd, esn, Jasmine Strong

----- Original Message -----
From: "Thomas Gleixner" <tglx@linutronix.de>
To: "matsunaga" <matsunaga_kazuhisa@yahoo.co.jp>; "Jörn Engel" <joern@wohnheim.fh-wedel.de>
Cc: <linux-mtd@lists.infradead.org>; "Holger Schurig" <h.schurig@mn-logistik.de>; <esn@cotas.dk>; "Jasmine Strong"
<jasmine@regolith.co.uk>
Sent: Thursday, April 17, 2003 3:19 AM
Subject: Re: Database on JFFS2?


> On Wednesday 16 April 2003 16:30, matsunaga wrote:
> > > iirc it used to ignore sync completely so people could potentially get
> > > into trouble if they write-sync-poweroff with <2s between write and
> > > poweroff.
> >
> > write - sync -poweroff works on NAND. flush_wbuf is also triggered on
> > umount.
> >
> > I think that sys_sync is not supported yet, as is written in jffs2/Todo,
> > right?
>
> Sorry, I mixed this up with fsync. sys_sync is not working, just flushing the
> buffer on umount.
>
> When I brought in the timed flush, I tried to use kupdated
> (superblock->s_dirty) and had a bunch of unwanted flushes during consecutive
> writes, because setting sb->s_dirty is asynchronous to the kupdated interval.
> That's why I used the 2-second timer.
>
> I will think about it again.

I understand the situation. An asynchronous sync could prevent consecutive writes.
I will wait for it ;-)
Ideally, sys_sync while a file is being written should wait for the last wbuf
write of the file and then flush it...

BR.

* Re: Database on JFFS2?
  2003-04-16 11:35   ` Thomas Gleixner
  2003-04-16 11:13     ` Jörn Engel
@ 2003-04-22  8:07     ` Esben Nielsen
  2003-04-22  8:24       ` Jörn Engel
  1 sibling, 1 reply; 22+ messages in thread
From: Esben Nielsen @ 2003-04-22  8:07 UTC (permalink / raw)
  To: tglx, Jörn Engel, Jasmine Strong; +Cc: Holger Schurig, linux-mtd

On Wednesday 16 April 2003 13:35, Thomas Gleixner wrote:
> [...]
> On Wednesday 16 April 2003 12:04, Esben Nielsen wrote:
> > I am not much into flash technology itself (NOR versus NAND), but I know
> > that on our device one write erases and rewrites one block of 128k. Each
> > block can only be erased between 1E5 and 1E6 times. We thus have to be
> > careful, as our device has to live for 20 years (yes, that is what we
> > promise our customers!).
>
> Erase is always a full block (e.g. 128K), but writes happen in small
> chunks.
>

Ah! Now I get it: you have to erase a block as a whole, and then you can write 
parts of it until it is full? I always believed you had to write a full block 
just to write a single byte, but this makes much more sense to me. We 
have been through similar discussions with TFFS under VxWorks, where we 
couldn't find out how many times we were actually allowed to write.
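
That understanding can be captured in a toy model of a single NOR erase block:
erasing sets every bit to 1, and programming can only clear bits, so separate
parts of an erased block can be written one after another without a new
erase. The class name and the tiny block size are made up for illustration.

```python
class ToyEraseBlock:
    """One erase block: erase sets all bits to 1, programming only clears bits."""

    def __init__(self, size=16):
        self.data = bytearray([0xFF] * size)  # erased state
        self.erase_count = 0

    def program(self, offset, payload):
        """Write payload at offset; bits can go from 1 to 0 but never back."""
        for i, byte in enumerate(payload):
            self.data[offset + i] &= byte

    def erase(self):
        """The only way to get 1 bits back, and always for the whole block."""
        self.data[:] = b"\xFF" * len(self.data)
        self.erase_count += 1
```

Only `erase()` counts against the endurance figure from the data sheet; the
many small `program()` calls in between do not.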

> [...]
>
> JFFS2 does no write on fsync, as data are already written.

Doesn't Linux by default hold a write cache for any filesystem, which needs to 
be flushed with fsync()?

Esben

* Re: Database on JFFS2?
  2003-04-22  8:07     ` Esben Nielsen
@ 2003-04-22  8:24       ` Jörn Engel
  0 siblings, 0 replies; 22+ messages in thread
From: Jörn Engel @ 2003-04-22  8:24 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: Holger Schurig, tglx, linux-mtd, Jasmine Strong

On Tue, 22 April 2003 10:07:59 +0200, Esben Nielsen wrote:
> On Wednesday 16 April 2003 13:35, Thomas Gleixner wrote:
> >
> > Erase is always a full block (e.g. 128K), but writes happen in small
> > chunks.
> 
> Ah! Now I get it: you have to erase a block as a whole, and then you can write 
> parts of it until it is full? I always believed you had to write a full block 
> just to write a single byte, but this makes much more sense to me. We 
> have been through similar discussions with TFFS under VxWorks, where we 
> couldn't find out how many times we were actually allowed to write.

:)

> > JFFS2 does no write on fsync, as data are already written.
> 
> Doesn't Linux by default hold a write cache for any filesystem, which needs to 
> be flushed with fsync()?

Linux is designed for disk io, it holds a buffer cache that is flushed
to disk as a whole on sync, for any mount point on umount and for any
file on fsync. Sync also happens regularly every 30s.

With flash, many things are different. Latency is very small, compared
to disks, and it doesn't (usually) make sense to hold any writes back
from the medium. No cache -> no need for (f)sync.

The cache came back with NAND flash, but you don't use it and don't
need to worry about it.

Jörn

-- 
Write programs that do one thing and do it well. Write programs to work
together. Write programs to handle text streams, because that is a
universal interface. 
-- Doug McIlroy
