public inbox for linux-mtd@lists.infradead.org
* jffs_file_write
@ 2000-07-21 23:41 Rogelio M. Serrano Jr.
  2000-07-24 15:09 ` jffs_file_write Finn Hakansson
  0 siblings, 1 reply; 14+ messages in thread
From: Rogelio M. Serrano Jr. @ 2000-07-21 23:41 UTC (permalink / raw)
  To: Finn Hakansson; +Cc: mtd@infradead.org

How do we handle these:

  Appending to an existing file. What fields must be updated with what?
We write a new inode right? What values should I place in the raw inode
struct?

  Rewriting file? Writing a file?



To unsubscribe, send "unsubscribe mtd" to majordomo@infradead.org


* Re: jffs_file_write
  2000-07-21 23:41 jffs_file_write Rogelio M. Serrano Jr.
@ 2000-07-24 15:09 ` Finn Hakansson
  2000-07-25  9:29   ` jffs_file_write David Woodhouse
  0 siblings, 1 reply; 14+ messages in thread
From: Finn Hakansson @ 2000-07-24 15:09 UTC (permalink / raw)
  To: Rogelio M. Serrano Jr.; +Cc: mtd@infradead.org, jffs-dev

On Sat, 22 Jul 2000, Rogelio M. Serrano Jr. wrote:

> How do we handle these:
> 
>   Appending to an existing file. What fields must be updated with what?
> We write a new inode right? What values should I place in the raw inode
> struct?

There must be a raw inode preceding each new chunk of data. The kind of
stuff that has to be inserted into each raw inode in jffs_file_write is
approximately what is in there today. The things that have to be changed
are the raw inode's version, offset and dsize. Of course one has to
compute new checksums. I don't think there is anything other than that.
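
A minimal sketch of the fields listed above, for the append case. The struct
is a simplified stand-in and not the real JFFS raw inode layout: only
version, offset and dsize come from the description above, while the other
fields, the byte-sum checksum and the function names are assumptions made
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for the JFFS raw inode. */
struct raw_inode_sketch {
	uint32_t ino;      /* inode number, unchanged by an append */
	uint32_t version;  /* must be newer than every earlier node */
	uint32_t offset;   /* where in the file this chunk starts */
	uint32_t dsize;    /* number of data bytes following the node */
	uint32_t dchksum;  /* checksum of the data (assumed: byte sum) */
};

/* Assumed checksum for the sketch: plain sum of the data bytes. */
static uint32_t sketch_checksum(const unsigned char *data, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += data[i];
	return sum;
}

/* Fill in the node that precedes an appended chunk of data. */
static void sketch_prepare_append(struct raw_inode_sketch *ri,
				  uint32_t ino, uint32_t last_version,
				  uint32_t file_size,
				  const unsigned char *data, uint32_t len)
{
	ri->ino = ino;
	ri->version = last_version + 1;  /* one newer than the latest node */
	ri->offset = file_size;          /* appending: start at the old end */
	ri->dsize = len;
	ri->dchksum = sketch_checksum(data, len);
}
```

For a rewrite in the middle of a file, the same fields would presumably be
filled in with offset set to the position being overwritten instead of the
old file size.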


>   Rewriting file? Writing a file?

Yeah. One day, write and rewrite should be merged into one single function...

/Finn




* Re: jffs_file_write
  2000-07-24 15:09 ` jffs_file_write Finn Hakansson
@ 2000-07-25  9:29   ` David Woodhouse
  2000-07-25  9:44     ` jffs_file_write Finn Hakansson
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: David Woodhouse @ 2000-07-25  9:29 UTC (permalink / raw)
  To: Finn Hakansson; +Cc: Rogelio M. Serrano Jr., mtd@infradead.org, jffs-dev


finn@axis.com said:
>  Yeah. One day, write and rewrite should be merged into one single
> function... 

I've been thinking about this, and about the problems with garbage collection
taking too long. What about shifting all node writes into a kernel thread,
which also does the GC?

The jffs_file_write() function then only needs to queue the node(s) to be 
written, and can return immediately. Obviously we have to implement a way 
of flushing a particular file, but that shouldn't be too difficult.

--
dwmw2





* Re: jffs_file_write
  2000-07-25  9:29   ` jffs_file_write David Woodhouse
@ 2000-07-25  9:44     ` Finn Hakansson
  2000-07-25 10:01       ` jffs_file_write David Woodhouse
  2000-07-25 10:02     ` jffs_file_write Bjorn Wesen
  2000-07-25 14:29     ` jffs_file_write Philipp Rumpf
  2 siblings, 1 reply; 14+ messages in thread
From: Finn Hakansson @ 2000-07-25  9:44 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Rogelio M. Serrano Jr., mtd@infradead.org, jffs-dev

On Tue, 25 Jul 2000, David Woodhouse wrote:

> 
> finn@axis.com said:
> >  Yeah. One day, write and rewrite should be merged into one single
> > function... 
> 
> I've been thinking about this, and about the problems with garbage collection
> taking too long.

I cannot understand that. How much time are we talking about? How large
is the flash? How long does an erase take? One garbage collect cannot
consume more time than (erase time per sector * number of sectors), I
think.


> What about shifting all node writes into a kernel thread,
> which also does the GC?

Hmmm...

> The jffs_file_write() function then only needs to queue the node(s) to be 
> written, and can return immediately. Obviously we have to implement a way 
> of flushing a particular file, but that shouldn't be too difficult.

Well. Sounds like a good idea. I'll think about it.

/Finn




* Re: jffs_file_write
  2000-07-25  9:44     ` jffs_file_write Finn Hakansson
@ 2000-07-25 10:01       ` David Woodhouse
  2000-07-25 10:11         ` jffs_file_write Finn Hakansson
  0 siblings, 1 reply; 14+ messages in thread
From: David Woodhouse @ 2000-07-25 10:01 UTC (permalink / raw)
  To: Finn Hakansson; +Cc: Rogelio M. Serrano Jr., mtd@infradead.org, jffs-dev


finn@axis.com said:
>  I cannot understand that. How much time are we talking about? How
> large is the flash? How long does an erase take? One garbage collect
> cannot consume more time than (erase time per sector * number of
> sectors), I think.

After making a few copies of /usr and /lib, then deleting them, I saw it
take an hour and a half with a 16 MB flash which was about half-full.

Now that I've implemented buffer writes on chips which support them, that
should be down to ten minutes. I'm sure the verbose logging over the serial
port doesn't help :)

But it's still done in process context, and writes of new nodes have to 
wait while it's happening. Shifting that into its own kernel thread would 
be nice, and would also allow us to merge writes. 

I'm far more concerned by the thing getting corrupted when I abruptly 
remove power - I'm about to see if I can reproduce it with the patches you 
committed just before I left - do you have any ideas from the log I posted 
on the 17th of July?



--
dwmw2





* Re: jffs_file_write
  2000-07-25  9:29   ` jffs_file_write David Woodhouse
  2000-07-25  9:44     ` jffs_file_write Finn Hakansson
@ 2000-07-25 10:02     ` Bjorn Wesen
  2000-07-25 10:15       ` jffs_file_write David Woodhouse
  2000-07-25 14:29     ` jffs_file_write Philipp Rumpf
  2 siblings, 1 reply; 14+ messages in thread
From: Bjorn Wesen @ 2000-07-25 10:02 UTC (permalink / raw)
  To: David Woodhouse; +Cc: mtd@infradead.org, jffs-dev

On Tue, 25 Jul 2000, David Woodhouse wrote:
> I've been thinking about this, and about the problems with garbage collection
> taking too long. What about shifting all node writes into a kernel thread,
> which also does the GC?
> 
> The jffs_file_write() function then only needs to queue the node(s) to be 
> written, and can return immediately. Obviously we have to implement a way 
> of flushing a particular file, but that shouldn't be too difficult.

But then you will run into the same problem that the buffer cache, which we
avoid, solves. If you start queueing stuff, the reads will need to check
the queue before reading from the flash for example (the in-core node
which keeps track of the data contents of the files can of course have a
pointer to the queued data if it's not on flash yet). You cannot rely
on the page cache caching the changes because the pages might have become
invalidated.
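
A minimal sketch of the read path described above, where a read starts from
the flash contents and then overlays any queued data that is not on flash
yet. The queue layout and all names are hypothetical, and a flat byte array
stands in for the flash; this is not how JFFS stores its in-core nodes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical queued write that has not reached the flash yet. */
struct pending_write {
	size_t offset;              /* file offset of the queued data */
	size_t len;                 /* number of queued bytes */
	const unsigned char *data;  /* the queued bytes themselves */
};

/*
 * Read len bytes at offset. Start from the flash contents, then overlay
 * every queued write (oldest first) that overlaps the requested range.
 * The caller must make sure flash holds at least offset + len bytes.
 */
static void sketch_read(unsigned char *dst, size_t offset, size_t len,
			const unsigned char *flash,
			const struct pending_write *queue, size_t nqueued)
{
	memcpy(dst, flash + offset, len);

	for (size_t i = 0; i < nqueued; i++) {
		size_t start = queue[i].offset;
		size_t end = start + queue[i].len;
		size_t lo = start > offset ? start : offset;
		size_t hi = end < offset + len ? end : offset + len;

		/* Copy only the part of the queued write inside the read. */
		if (lo < hi)
			memcpy(dst + (lo - offset),
			       queue[i].data + (lo - start), hi - lo);
	}
}
```

Every read pays for the walk over the queue, which is part of the extra
complexity being pointed out here.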

So it's more complicated than just queueing the writes and erases. Also if
you defer writes too long you start losing the point of the flash
filesystem since you'll lose a lot of changes upon power-failure. Imagine
a user changing a configuration entry with his web-browser. When he has
hit OK and gotten the "Saved" webpage, he pulls the plug. OK that is
perhaps fixed by writing files with O_SYNC or whatever that mechanism is
called like you say...

However with suitable locking, the GC _should_ indeed be a kernel thread,
that either is woken up on-demand (like it is now) or wakes up once in a
while and collects if necessary (both probably). That is not the same as
queueing writes though.

-Bjorn





* Re: jffs_file_write
  2000-07-25 10:01       ` jffs_file_write David Woodhouse
@ 2000-07-25 10:11         ` Finn Hakansson
  2000-07-25 13:02           ` jffs_file_write Bjorn Wesen
  2000-07-25 13:54           ` jffs_file_write David Woodhouse
  0 siblings, 2 replies; 14+ messages in thread
From: Finn Hakansson @ 2000-07-25 10:11 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Rogelio M. Serrano Jr., mtd@infradead.org, jffs-dev

On Tue, 25 Jul 2000, David Woodhouse wrote:

> 
> finn@axis.com said:
> >  I cannot understand that. How much time are we talking about? How
> > large is the flash? How long does an erase take? One garbage collect
> > cannot consume more time than (erase time per sector * number of
> > sectors), I think.
> 
> After making a few copies of /usr and /lib, then deleting them, I saw it
> take an hour and a half with a 16 MB flash which was about half-full.

Then something else must be very wrong. A garbage collect shouldn't take
more than a few seconds at most. Unless you have a really slow flash chip.


> Now that I've implemented buffer writes on chips which support them, that
> should be down to ten minutes. I'm sure the verbose logging over the serial
> port doesn't help :)
> 
> But it's still done in process context, and writes of new nodes have to 
> wait while it's happening. Shifting that into its own kernel thread would 
> be nice, and would also allow us to merge writes. 
> 
> I'm far more concerned by the thing getting corrupted when I abruptly 
> remove power - I'm about to see if I can reproduce it with the patches you 
> committed just before I left - do you have any ideas from the log I posted 
> on the 17th of July?

I'll take a look at it after lunch.

/Finn




* Re: jffs_file_write
  2000-07-25 10:02     ` jffs_file_write Bjorn Wesen
@ 2000-07-25 10:15       ` David Woodhouse
  0 siblings, 0 replies; 14+ messages in thread
From: David Woodhouse @ 2000-07-25 10:15 UTC (permalink / raw)
  To: Bjorn Wesen; +Cc: mtd@infradead.org, jffs-dev


bjornw@axis.com said:
> But then you will run into the same problem that the buffer cache, which
> we avoid, solves. If you start queueing stuff, the reads will need to
> check the queue before reading from the flash for example (the in-core
> node which keeps track of the data contents of the files can of course
> have a pointer to the queued data if it's not on flash yet). You
> cannot rely on the page cache caching the changes because the pages
> might have become invalidated. 

True. It should be possible to make JFFS use the page cache and
generic_file_{read,write}() though, so the data in the page cache are always
valid. There are apparently ways to do it without having to write 4 KB nodes
each time we're told a page has been dirtied.

I need to schedule myself a crash course on Linux VFS so that I know what 
I'm talking about :)

bjornw@axis.com said:
> When he has hit OK and gotten the "Saved" webpage, he pulls the
> plug. OK that is perhaps fixed by writing files with O_SYNC or
> whatever that mechanism is called like you say...

Netscape does actually write changes to its config files with O_SYNC, and I
think it's reasonable to expect that to be necessary. A journalling
filesystem should always have a consistent state, but it is allowed to have
write-behind. It doesn't necessarily have to commit writes to the media
before returning - that's what O_SYNC is for.
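
A minimal userspace sketch of what an application relying on O_SYNC might
do when saving a small config file. The function name is hypothetical and
the error handling is deliberately simple; the open() and write() calls and
the O_SYNC flag are the standard POSIX ones.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace the contents of path with text. Because the file is opened
 * with O_SYNC, write() should not return until the data has been
 * committed to the underlying storage. Returns 0 on success, -1 on error.
 */
static int save_config_sync(const char *path, const char *text)
{
	size_t len = strlen(text);
	ssize_t written;
	int rc;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0644);

	if (fd < 0)
		return -1;

	written = write(fd, text, len);
	rc = (written == (ssize_t)len) ? 0 : -1;

	if (close(fd) != 0)
		rc = -1;
	return rc;
}
```

Without O_SYNC the same write() may return while the data is still only
queued, which is the write-behind case described above.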




--
dwmw2





* Re: jffs_file_write
  2000-07-25 10:11         ` jffs_file_write Finn Hakansson
@ 2000-07-25 13:02           ` Bjorn Wesen
  2000-07-25 13:19             ` jffs_file_write David Woodhouse
  2000-07-25 13:54           ` jffs_file_write David Woodhouse
  1 sibling, 1 reply; 14+ messages in thread
From: Bjorn Wesen @ 2000-07-25 13:02 UTC (permalink / raw)
  To: Finn Hakansson; +Cc: mtd@infradead.org, jffs-dev

On Tue, 25 Jul 2000, Finn Hakansson wrote:
> > After making a few copies of /usr and /lib, then deleting them, I saw it
> > take an hour and a half with a 16 MB flash which was about half-full.
> 
> Then something else must be very wrong. A garbage collect shouldn't take
> more than a few seconds at most. Unless you have a really slow flash chip.

It usually takes about 1-2 seconds to erase a sector, so if you really
need to erase the entire 16 MB flash during GC, that's 250 sectors times 2
seconds, about 8 minutes. Mileage may vary depending on flash manufacturer
- I've never used 16 MB flashes in anything.

Now of course this should not be necessary, because under normal
circumstances you're not dirtying that many sectors, but we have not
implemented the more flexible sector allocation scheme yet so we cannot
take advantage of that. So instead of the GC needing to erase redundant
sectors * erasetime, it needs to erase used-sectors * erasetime which is
usually much more.
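
The estimate above is just sectors to erase times time per erase. A small
sketch of that arithmetic, with a hypothetical function name:

```c
/* Erase-time estimate in seconds: sectors to erase times time per erase. */
static unsigned int sketch_gc_erase_seconds(unsigned int sectors,
					    unsigned int secs_per_erase)
{
	return sectors * secs_per_erase;
}
```

sketch_gc_erase_seconds(250, 2) gives 500 seconds, a little over 8 minutes.
If only 20 of those sectors actually had to be erased (a made-up figure,
purely for comparison), the same formula would give 40 seconds.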

> > But it's still done in process context, and writes of new nodes have to 
> > wait while it's happening. Shifting that into its own kernel thread would 
> > be nice, and would also allow us to merge writes. 

Yes, that would be nicer. The GC could interleave its operations
incrementally in the thread. There is no state in the GC operation (except
keeping writes from coming into the sector we're about to clear :) so it
should be easy to do.

-Bjorn





* Re: jffs_file_write
  2000-07-25 13:02           ` jffs_file_write Bjorn Wesen
@ 2000-07-25 13:19             ` David Woodhouse
  0 siblings, 0 replies; 14+ messages in thread
From: David Woodhouse @ 2000-07-25 13:19 UTC (permalink / raw)
  To: Bjorn Wesen; +Cc: Finn Hakansson, mtd@infradead.org, jffs-dev


bjornw@axis.com said:
>  It usually takes about 1-2 seconds to erase a sector, so if you
> really need to erase the entire 16 MB flash during GC, that's 250
> sectors times 2 seconds, about 8 minutes. Mileage may vary depending
> on flash manufacturer - I've never used 16 MB flashes in anything.

On these chips, erases only take a second. At the time, it was also 128
microseconds per _word_ write. So although a complete erase only takes 128
seconds, writing the whole device was more like 20 minutes.
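
A quick check of the write-time figure, as a sketch. It assumes 16-bit
(2-byte) words and a device of 16 * 1024 * 1024 bytes; neither is stated
above, and the function name is hypothetical.

```c
#include <stdint.h>

/* Time in microseconds to program a whole device one word at a time. */
static uint64_t sketch_write_time_us(uint64_t device_bytes,
				     uint64_t bytes_per_word,
				     uint64_t us_per_word)
{
	return (device_bytes / bytes_per_word) * us_per_word;
}
```

sketch_write_time_us(16 * 1024 * 1024, 2, 128) gives 1073741824
microseconds, roughly 1074 seconds or just under 18 minutes, which is in
line with the "more like 20 minutes" figure.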

Even so, it should be far less than the 90 minutes that I observed. Is it
possible that the GC was compacting the data on the flash repeatedly, before
it had completed removing all the files? The command I was running was:
 rm -r /mnt/usr

If it removed a couple of MB, then garbage collected, then removed another 
couple of MB, then would it have ended up moving the same data twice?



--
dwmw2





* Re: jffs_file_write
  2000-07-25 10:11         ` jffs_file_write Finn Hakansson
  2000-07-25 13:02           ` jffs_file_write Bjorn Wesen
@ 2000-07-25 13:54           ` David Woodhouse
  1 sibling, 0 replies; 14+ messages in thread
From: David Woodhouse @ 2000-07-25 13:54 UTC (permalink / raw)
  To: Finn Hakansson; +Cc: Rogelio M. Serrano Jr., mtd@infradead.org, jffs-dev


finn@axis.com said:
> On Tue, 25 Jul 2000, David Woodhouse wrote:
> > I'm far more concerned by the thing getting corrupted when I abruptly 
> > remove power - I'm about to see if I can reproduce it with the patches
> > you committed just before I left - do you have any ideas from the log I
> > posted on the 17th of July?

> I'll take a look at it after lunch. 

I've now updated my kernel to the latest code from CVS, and I can still 
reproduce it. I'll start logging and try to do it again. 

--
dwmw2





* Re: jffs_file_write
  2000-07-25  9:29   ` jffs_file_write David Woodhouse
  2000-07-25  9:44     ` jffs_file_write Finn Hakansson
  2000-07-25 10:02     ` jffs_file_write Bjorn Wesen
@ 2000-07-25 14:29     ` Philipp Rumpf
  2000-07-25 15:12       ` jffs_file_write David Woodhouse
  2 siblings, 1 reply; 14+ messages in thread
From: Philipp Rumpf @ 2000-07-25 14:29 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Finn Hakansson, Rogelio M. Serrano Jr., mtd@infradead.org,
	jffs-dev

On Tue, Jul 25, 2000 at 10:29:47AM +0100, David Woodhouse wrote:
> I've been thinking about this, and about the problems with garbage collection
> taking too long. What about shifting all node writes into a kernel thread,
> which also does the GC?
> 
> The jffs_file_write() function then only needs to queue the node(s) to be 
> written, and can return immediately. Obviously we have to implement a way 
> of flushing a particular file, but that shouldn't be too difficult.

I disagree with queueing nodes - just updating a "which byte ranges of the
file have been modified" list should be both simpler and allow more efficient
write merging/compression.



* Re: jffs_file_write
  2000-07-25 14:29     ` jffs_file_write Philipp Rumpf
@ 2000-07-25 15:12       ` David Woodhouse
  2000-07-25 15:24         ` jffs_file_write Philipp Rumpf
  0 siblings, 1 reply; 14+ messages in thread
From: David Woodhouse @ 2000-07-25 15:12 UTC (permalink / raw)
  To: Philipp Rumpf
  Cc: Finn Hakansson, Rogelio M. Serrano Jr., mtd@infradead.org,
	jffs-dev


prumpf@uzix.org said:
>  I disagree with queueing nodes - just updating a "which byte ranges
> of the file have been modified" list should be both simpler and allow
> more efficient write merging/compression. 

OK. As long as we're aware of the possibility that a certain byte range 
could be changed _again_ in a different transaction. 

i.e.	pwrite(fd, "aaaaaaaaa", 10, 0);
	pwrite(fd, "bbbbbbbbb", 10, 5);

If you're not going to take a copy of data in the first write, but just 
keep it in the page cache and remember where it is, then you cannot write 
that transaction unless you combine it with the second one. 

That is - you _must_ combine the two into a single node write. It's not 
just an optimisation. If you were to write "aaaaabbbbb" to the beginning of 
the file and lose power before writing the rest of the 'bbbbb', I believe 
you're violating POSIX by having non-atomic write().

So if you're going to write data to the flash directly from the page cache, 
you have to have some lock in place which prevents anything else from 
dirtying it during the mtd_write() call.
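
A small sketch of the hazard, replaying the two pwrite() calls above into a
buffer that stands in for the cached page. The buffer size and the function
name are assumptions made for illustration.

```c
#include <string.h>

#define SKETCH_PAGE_SIZE 16

/*
 * Replay the two pwrite() calls into a zeroed buffer standing in for the
 * cached page. No separate copy of the first write's data is kept.
 */
static void sketch_replay_writes(unsigned char page[SKETCH_PAGE_SIZE])
{
	memset(page, 0, SKETCH_PAGE_SIZE);
	memcpy(page + 0, "aaaaaaaaa", 10);  /* pwrite(fd, "aaaaaaaaa", 10, 0) */
	memcpy(page + 5, "bbbbbbbbb", 10);  /* pwrite(fd, "bbbbbbbbb", 10, 5) */
}
```

After the replay the first ten bytes of the page read "aaaaabbbbb", so
flushing only the first write's range (bytes 0 to 9) from the page would
put a mixture of both writes on the flash. The merged range, bytes 0 to 14,
holds "aaaaabbbbbbbbb" plus the terminating NUL.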

--
dwmw2





* Re: jffs_file_write
  2000-07-25 15:12       ` jffs_file_write David Woodhouse
@ 2000-07-25 15:24         ` Philipp Rumpf
  0 siblings, 0 replies; 14+ messages in thread
From: Philipp Rumpf @ 2000-07-25 15:24 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Finn Hakansson, Rogelio M. Serrano Jr., mtd@infradead.org,
	jffs-dev

On Tue, Jul 25, 2000 at 04:12:11PM +0100, David Woodhouse wrote:
> OK. As long as we're aware of the possibility that a certain byte range 
> could be changed _again_ in a different transaction. 
> 
> i.e.	pwrite(fd, "aaaaaaaaa", 10, 0);
> 	pwrite(fd, "bbbbbbbbb", 10, 5);
> 
> If you're not going to take a copy of data in the first write, but just 
> keep it in the page cache and remember where it is, then you cannot write 
> that transaction unless you combine it with the second one. 
> 
> That is - you _must_ combine the two into a single node write. It's not 
> just an optimisation. If you were to write "aaaaabbbbb" to the beginning of 
> the file and lose power before writing the rest of the 'bbbbb', I believe 
> you're violating POSIX by having non-atomic write().

prepare_write can look like this:

	if (try_to_abort_pending_conflicting_writes_fails()) {
		/* sleep until the writes have happened */
	} else {
		/* reschedule the merged writes */
	}

I'm not actually sure there's a POSIX requirement to do this - as long as
the fs doesn't crash the page cache will keep our view of it consistent.

BTW, the simplistic implementation would simply add

	__u32 first_modified_byte;
	__u32 last_modified_byte;

to jffs_file and keep track of only one range that has been modified - it
should work well enough for the common case (non-conflicting consecutive
writes) and we can easily detect the cases in which it doesn't work and
fall back to sleeping until the writes are finished.
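
A minimal sketch of the single-range tracking. The two field names follow
the ones above; the struct name, the dirty flag and the function are
hypothetical. Which cases have to fall back is not spelled out above, so
treating a write that leaves a gap as the fallback case is an assumption.

```c
#include <stdint.h>

/* Hypothetical per-file state; the two range fields use the names above. */
struct jffs_file_sketch {
	uint32_t first_modified_byte;
	uint32_t last_modified_byte;
	int dirty;  /* assumed flag: nonzero once a range is recorded */
};

/*
 * Record a write of count bytes at offset. Returns 1 if the write was
 * folded into the tracked range, 0 if it leaves a gap, in which case the
 * caller has to wait for the pending write to finish first.
 */
static int sketch_track_write(struct jffs_file_sketch *f,
			      uint32_t offset, uint32_t count)
{
	uint64_t first = offset;
	uint64_t last;

	if (count == 0)
		return 1;
	last = first + count - 1;

	if (!f->dirty) {
		f->first_modified_byte = (uint32_t)first;
		f->last_modified_byte = (uint32_t)last;
		f->dirty = 1;
		return 1;
	}

	/* A gap on either side means one range cannot describe both. */
	if (first > (uint64_t)f->last_modified_byte + 1 ||
	    last + 1 < (uint64_t)f->first_modified_byte)
		return 0;

	/* Adjacent or overlapping: grow the tracked range. */
	if (first < f->first_modified_byte)
		f->first_modified_byte = (uint32_t)first;
	if (last > f->last_modified_byte)
		f->last_modified_byte = (uint32_t)last;
	return 1;
}
```

Two consecutive writes of 10 and 5 bytes starting at offset 0 end up as the
single range 0 to 14, while a later write at offset 100 returns 0 and leaves
the range untouched.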

> So if you're going to write data to the flash directly from the page cache, 
> you have to have some lock in place which prevents anything else from 
> dirtying it during the mtd_write() call.

If all writes are done by one thread that shouldn't be necessary.



end of thread, other threads:[~2000-07-25 15:24 UTC | newest]

Thread overview: 14+ messages
2000-07-21 23:41 jffs_file_write Rogelio M. Serrano Jr.
2000-07-24 15:09 ` jffs_file_write Finn Hakansson
2000-07-25  9:29   ` jffs_file_write David Woodhouse
2000-07-25  9:44     ` jffs_file_write Finn Hakansson
2000-07-25 10:01       ` jffs_file_write David Woodhouse
2000-07-25 10:11         ` jffs_file_write Finn Hakansson
2000-07-25 13:02           ` jffs_file_write Bjorn Wesen
2000-07-25 13:19             ` jffs_file_write David Woodhouse
2000-07-25 13:54           ` jffs_file_write David Woodhouse
2000-07-25 10:02     ` jffs_file_write Bjorn Wesen
2000-07-25 10:15       ` jffs_file_write David Woodhouse
2000-07-25 14:29     ` jffs_file_write Philipp Rumpf
2000-07-25 15:12       ` jffs_file_write David Woodhouse
2000-07-25 15:24         ` jffs_file_write Philipp Rumpf
