All of lore.kernel.org
* ext3 file system
@ 2002-01-17 21:45 Justin Smith
  2002-01-18 15:14 ` Noah silva
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Justin Smith @ 2002-01-17 21:45 UTC (permalink / raw)
  To: selinux

When trying to label files with the setfiles utilities, the process
crashes when they encounter the .journal files. These are user-visible,
but are not regular files (they are immutable in the sense that even
root cannot change them in any way). The setfiles utilities issue a
warning that they cannot write to .journal and quit.

Any ideas?

Perhaps there should be some way of exempting the .journal files from
the labelling process. It is unfortunate that they are visible to users
(since users cannot actually use them for anything).

--


--
You have received this message because you are subscribed to the selinux list.
If you no longer wish to subscribe, send mail to majordomo@tycho.nsa.gov with
the words "unsubscribe selinux" without quotes as the message.


* Re: ext3 file system
  2002-01-17 21:45 Justin Smith
@ 2002-01-18 15:14 ` Noah silva
  2002-01-18 15:43 ` Stephen Smalley
  2002-01-18 15:53 ` Stephen Smalley
  2 siblings, 0 replies; 19+ messages in thread
From: Noah silva @ 2002-01-18 15:14 UTC (permalink / raw)
  To: Justin Smith; +Cc: selinux

Can't you disable the visibility? I thought that was an option in
ext3.  Does SELinux prevent this from functioning in some way?

 -- noah silva 

On 17 Jan 2002, Justin Smith wrote:

> When trying to label files with the setfiles utilities, the process
> crashes when they encounter the .journal files. These are user-visible,
> but are not regular files (they are immutable in the sense that even
> root cannot change them in any way). The setfiles utilities issue a
> warning that they cannot write to .journal and quit.
> 
> Any ideas?
> 
> Perhaps there should be some way of exempting the .journal files from
> the labelling process. It is unfortunate that they are visible to users
> (since users cannot actually use them for anything).
> 



* Re: ext3 file system
  2002-01-17 21:45 Justin Smith
  2002-01-18 15:14 ` Noah silva
@ 2002-01-18 15:43 ` Stephen Smalley
  2002-01-18 15:53 ` Stephen Smalley
  2 siblings, 0 replies; 19+ messages in thread
From: Stephen Smalley @ 2002-01-18 15:43 UTC (permalink / raw)
  To: Justin Smith; +Cc: selinux


On 17 Jan 2002, Justin Smith wrote:

> When trying to label files with the setfiles utilities, the process
> crashes when they encounter the .journal files. These are user-visible,
> but are not regular files (they are immutable in the sense that even
> root cannot change them in any way). The setfiles utilities issue a
> warning that they cannot write to .journal and quit.
>
> Any ideas?
>
> Perhaps there should be some way of exempting the .journal files from
> the labelling process. It is unfortunate that they are visible to users
> (since users cannot actually use them for anything).

First, the journals aren't visible to users on my RH7.2 systems running
the SELinux kernel, so I'm not sure why they are visible on your system.

Second, setfiles should not need to write to .journal.  However, it does
need to be able to set the label of .journal.  If you are running setfiles
on a non-SELinux kernel, setfiles directly updates the persistent label
mapping files in /...security.  If you are running setfiles on a SELinux
kernel, setfiles uses the lchsid system call to set the label for each
file.  At present, lchsid fails on immutable files, just like chmod and
chown.  However, there isn't any strong reason to prevent relabeling of
immutable files, so we could remove this restriction from the [l|f|]chsid
calls in the SELinux module.

Third, you can exempt files from the labeling process by using the
<<none>> specification in the file_contexts configuration.  So, to exempt
.journal files from being relabeled by setfiles, you might add the
following to the end of file_contexts:

.*/\.journal		<<none>>

Of course, this is a questionable choice.  We should probably define a
type for the journal files so that they can be rigorously protected by
the SELinux nondiscretionary access controls.
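
As a rough illustration of which paths the specification above would exempt, here is a minimal Python sketch. It assumes that setfiles matches each regular expression against the whole path (as if anchored at both ends); the helper name is_exempt and the sample paths are made up for the example and are not part of setfiles.

```python
import re

# The pattern from the file_contexts entry that is paired with <<none>>.
JOURNAL_SPEC = re.compile(r".*/\.journal")


def is_exempt(path: str) -> bool:
    """Return True if path matches the <<none>> specification.

    Assumption: the pattern has to match the entire path, as if it
    were written ^.*/\\.journal$.
    """
    return JOURNAL_SPEC.fullmatch(path) is not None


# Hypothetical sample paths.
for sample in ["/.journal", "/home/.journal", "/home/journal", "/home/.journal.bak"]:
    print(sample, is_exempt(sample))
```

Only a file named exactly .journal, in any directory, matches; similarly named files such as .journal.bak would still be labeled.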

--
Stephen D. Smalley, NAI Labs
ssmalley@nai.com





* Re: ext3 file system
  2002-01-17 21:45 Justin Smith
  2002-01-18 15:14 ` Noah silva
  2002-01-18 15:43 ` Stephen Smalley
@ 2002-01-18 15:53 ` Stephen Smalley
  2002-01-18 17:14   ` Paul Kronenwetter
  2 siblings, 1 reply; 19+ messages in thread
From: Stephen Smalley @ 2002-01-18 15:53 UTC (permalink / raw)
  To: Justin Smith; +Cc: selinux


On 17 Jan 2002, Justin Smith wrote:

> When trying to label files with the setfiles utilities, the process
> crashes when they encounter the .journal files. These are user-visible,
> but are not regular files (they are immutable in the sense that even
> root cannot change them in any way). The setfiles utilities issue a
> warning that they cannot write to .journal and quit.
>
> Any ideas?
>
> Perhaps there should be some way of exempting the .journal files from
> the labelling process. It is unfortunate that they are visible to users
> (since users cannot actually use them for anything).






* Re: ext3 file system
  2002-01-18 15:53 ` Stephen Smalley
@ 2002-01-18 17:14   ` Paul Kronenwetter
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Kronenwetter @ 2002-01-18 17:14 UTC (permalink / raw)
  To: Stephen Smalley; +Cc: Justin Smith, selinux

Actually the .journal files are only visible when they're created on a 
mounted filesystem.  If they're created by mke2fs -j they're made invisible.

So Stephen's machine was a clean install of Red Hat 7.2 and Justin's was 
an upgrade or a manual enhancement of an ext2 fs.
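
A quick way to tell which case applies to a given filesystem is to look for a .journal entry at its root. The following Python sketch does only that; the function name visible_journal is hypothetical, and the scratch directory merely stands in for a real mount point.

```python
import os
import tempfile


def visible_journal(mount_point: str) -> bool:
    """Return True if a .journal entry is visible at the root of mount_point."""
    return os.path.lexists(os.path.join(mount_point, ".journal"))


# Demonstration on a scratch directory standing in for a mount point.
with tempfile.TemporaryDirectory() as root:
    before = visible_journal(root)
    open(os.path.join(root, ".journal"), "w").close()
    after = visible_journal(root)

print(before, after)
```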

-Paul

Stephen Smalley wrote:

>On 17 Jan 2002, Justin Smith wrote:
>
>>When trying to label files with the setfiles utilities, the process
>>crashes when they encounter the .journal files. These are user-visible,
>>but are not regular files (they are immutable in the sense that even
>>root cannot change them in any way). The setfiles utilities issue a
>>warning that they cannot write to .journal and quit.
>>
>>Any ideas?
>>
>>Perhaps there should be some way of exempting the .journal files from
>>the labelling process. It is unfortunate that they are visible to users
>>(since users cannot actually use them for anything).
>>





* ext3 file system
@ 2002-07-20  6:13 Peter
  2002-07-20 11:26 ` pa3gcu
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Peter @ 2002-07-20  6:13 UTC (permalink / raw)
  To: linux

Hi,

I got myself RH7.3 and it says it is recommended, but not required, to move to 
the ext3 file system.

Is it safe to do?

Regards

-- 
Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs


* Re: ext3 file system
  2002-07-20  6:13 Peter
@ 2002-07-20 11:26 ` pa3gcu
  2002-07-20 16:22 ` 1stFlight
  2002-08-03 20:24 ` Benny Pedersen
  2 siblings, 0 replies; 19+ messages in thread
From: pa3gcu @ 2002-07-20 11:26 UTC (permalink / raw)
  To: Peter, linux

On Saturday 20 July 2002 06:13, Peter wrote:
> Hi,
>
> I got myself RH7.3 and it says it is recommended, but not required to move
> to ext3 file system.
>
> Is it safe to do?

Well, whether it is safe I don't know; I know of one instance where things 
went wrong and data was lost.
However, when one installs Red Hat 7.3 one can choose to install it all onto 
ext3 partitions. I did that and kept all other partitions as ext2.

An ext3 partition can always be mounted as ext2 as well.

>
> Regards

-- 
Regards Richard
pa3gcu@zeelandnet.nl
http://people.zeelandnet.nl/pa3gcu/


* Re: ext3 file system
  2002-07-20  6:13 Peter
  2002-07-20 11:26 ` pa3gcu
@ 2002-07-20 16:22 ` 1stFlight
  2002-08-03 20:24 ` Benny Pedersen
  2 siblings, 0 replies; 19+ messages in thread
From: 1stFlight @ 2002-07-20 16:22 UTC (permalink / raw)
  To: Peter, linux

In my experience, it has proven to be as reliable as ext2, with the added 
benefit of faster recovery. My perspective is that of a home workstation user, 
not of someone running fielded servers, so take it for what it is worth. I've 
powered off my machine in the middle of operations just to see if it worked as 
advertised, and it has, far better than ext2 would have done under similar 
circumstances. So in my opinion it's quite safe. 

Darryl

On Saturday 20 July 2002 02:13 am, Peter wrote:
> Hi,
>
> I got myself RH7.3 and it says it is recommended, but not required to move
> to ext3 file system.
>
> Is it safe to do?
>
> Regards


* ext3 file system
  2002-07-20  6:13 Peter
  2002-07-20 11:26 ` pa3gcu
  2002-07-20 16:22 ` 1stFlight
@ 2002-08-03 20:24 ` Benny Pedersen
  2 siblings, 0 replies; 19+ messages in thread
From: Benny Pedersen @ 2002-08-03 20:24 UTC (permalink / raw)
  To: linux-newbie

Originally to: Peter

Hello Peter.

20 Jul 02 14:13, you wrote to all:

 P> I got myself RH7.3 and it says it is recommended, but not required to
 P> move to ext3 file system.

 P> Is it safe to do?

You need 10% more free space on all partitions; other than that, the update is safe if you do it from a Red Hat 7.3 disk upgrade.

If you try to do it manually, no :-)

Benny

<-> Gateway Information.
This message originated from a Fidonet System (http://www.fidonet.org)
and was gated at TCOB1 (http://www.tcob1.net)
Please do not respond direct to this message but via the list



* ext3 file system
@ 2003-12-17 22:13 jshankar
  2003-12-17 22:25 ` Richard B. Johnson
  0 siblings, 1 reply; 19+ messages in thread
From: jshankar @ 2003-12-17 22:13 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel

Hello,

Does the ext3 file system have to wait for the acknowledgement of a block of 
data written to the SCSI device before writing the next block of data?

Is there a parallel I/O mode where the file system goes on writing blocks of 
data without waiting for the acknowledgement?

Please let me know your opinion.

Thanks
Jayshankar



* Re: ext3 file system
  2003-12-17 22:13 ext3 file system jshankar
@ 2003-12-17 22:25 ` Richard B. Johnson
  2003-12-17 23:02   ` Mike Fedyk
  0 siblings, 1 reply; 19+ messages in thread
From: Richard B. Johnson @ 2003-12-17 22:25 UTC (permalink / raw)
  To: jshankar; +Cc: linux-fsdevel, linux-kernel

On Wed, 17 Dec 2003, jshankar wrote:

> Hello,
>
> Does the  ext3 file systems have to wait for the acknowledgement of block of
> data written to the SCSI device before writing the next block of data.
>

No. Many SCSI drives and adapters allow queued commands and disconnect
operation.

> Is there a parallel I/O where the file system goes on writing the block of
> data
> without waiting for the acknowledgement.
>

This is the normal mode of operation.

> Please let me know your opinion.
>
> Thanks
> Jayshankar
>

Normal Unix/Linux file-systems write data to RAM. At some unknown
time, when memory gets tight, some data are written to the device.

Basically with Unix/Linux, you are using a RAM-Disk that overflows
to the physical media. There are special file-systems (journaling)
that guarantee that something, enough to recover the data, is
written at periodic intervals.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




* Re: ext3 file system
  2003-12-17 22:25 ` Richard B. Johnson
@ 2003-12-17 23:02   ` Mike Fedyk
  0 siblings, 0 replies; 19+ messages in thread
From: Mike Fedyk @ 2003-12-17 23:02 UTC (permalink / raw)
  To: Richard B. Johnson; +Cc: jshankar, linux-fsdevel, linux-kernel

On Wed, Dec 17, 2003 at 05:25:49PM -0500, Richard B. Johnson wrote:
> to the physical media. There are special file-systems (journaling)
> that guarantee that something, enough to recover the data, is
> written at periodic intervals.

Most journaling filesystems make guarantees on the filesystem meta-data, but
not on the data.  Some, like ext3 and reiserfs (with suse's journaling
patch), can journal the data, or order things so that the data is written
before any pointers (i.e. meta-data) make it to the disk, so it will be harder
to lose data.
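
The "data before pointers" ordering has a familiar application-level analogue, sketched below in Python: write the new contents to a temporary file, flush them, and only then rename the file into place. This is an analogy under stated assumptions, not how ext3 or reiserfs implement ordering internally; the function name publish and the file names are made up for the example.

```python
import os
import tempfile


def publish(path: str, data: bytes) -> None:
    """Write data to a temporary file, flush it, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the kernel to flush the data first
        os.replace(tmp, path)     # only then make the name point at it
    except BaseException:
        # On failure, remove the temporary file if it is still there.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


# Demonstration in a scratch directory.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "settings.txt")
    publish(target, b"hello")
    with open(target, "rb") as f:
        result = f.read()

print(result)
```

A reader of the target name sees either the old contents or the complete new contents, never a pointer to data that was not yet flushed.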


* RE: ext3 file system
@ 2003-12-17 23:25 jshankar
  2003-12-17 23:59 ` Brad Boyer
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: jshankar @ 2003-12-17 23:25 UTC (permalink / raw)
  To: Richard B. Johnson, Mike Fedyk; +Cc: linux-fsdevel, linux-kernel

Hello,

Please provide some more insight.

Suppose a filesystem issues a write command to the disk with around ten 4K 
blocks to be written. From the SCSI device's point of view, I don't see where 
the parallel I/O is.
It has only one write command. If someone else sends a write request, it needs 
to be queued. But then the question arises of how the write data is handled. 
Does it mean the SCSI device does not give a response for each block of data 
written? In other words, does it mean that the response is given only after 
all the blocks of data are written for a single write request?
 
Thanks
Jay




>===== Original Message From Mike Fedyk <mfedyk@matchmail.com> =====
>On Wed, Dec 17, 2003 at 05:25:49PM -0500, Richard B. Johnson wrote:
>> to the physical media. There are special file-systems (journaling)
>> that guarantee that something, enough to recover the data, is
>> written at periodic intervals.
>
>Most journaling filesystems make guarantees on the filesystem meta-data, but
>not on the data.  Some like ext3, and reiserfs (with suse's journaling
>patch) can journal the data, or order things so that the data is written
>before any pointers (i.e. meta-data) make it to the disk so it will be harder
>to lose data.
>-
>To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html



* Re: ext3 file system
  2003-12-17 23:25 jshankar
@ 2003-12-17 23:59 ` Brad Boyer
  2003-12-18  1:25 ` Hans Reiser
  2003-12-18 14:17 ` Richard B. Johnson
  2 siblings, 0 replies; 19+ messages in thread
From: Brad Boyer @ 2003-12-17 23:59 UTC (permalink / raw)
  To: jshankar; +Cc: Richard B. Johnson, Mike Fedyk, linux-fsdevel, linux-kernel


I think the big thing that you're missing is that block device requests
are totally asynchronous. In general, a block gets sent down to the
block layer as needing to be transferred one way or the other. It gets
queued up in the driver for that block device, such as sd.o for SCSI.
That driver will be notified that it has requests to process, and
can handle them however it wants. When it is done with any specific
request, it calls back up and sets that request as done. You could
have multiple requests in the queue at the same time, and a driver
can be working on more than one at a time if it supports that.

In the specific case of SCSI, the host adapter and disk drives may
support various queues along the way, with any number of outstanding
requests in various buffers. The controller may be able to merge
requests on the fly in order to improve performance.

Obviously this is a fairly abstract view of the whole process. For
details you would need to read the code. You can trace the process
down (filesystem -> page buffers -> block devices -> block driver).
In the SCSI case, the block driver is sd.o, and you then can follow
down into the generic SCSI mid-layer and the controller driver.
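
The queue-and-callback pattern described above can be modelled in a few lines of Python. This is a toy sketch, not kernel code: the names driver and on_done, the use of a thread, and the None sentinel are all assumptions made for illustration.

```python
import queue
import threading


def driver(requests: queue.Queue) -> None:
    """Toy block driver: take requests off the queue and report completion."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: no more requests
            break
        block, data, done = item
        # A real driver would program the hardware here.
        done(block, len(data))  # call back up: this request is done


completed = []
lock = threading.Lock()


def on_done(block: int, nbytes: int) -> None:
    """Completion callback, invoked by the driver for each finished request."""
    with lock:
        completed.append((block, nbytes))


requests = queue.Queue()
worker = threading.Thread(target=driver, args=(requests,))
worker.start()

# The submitter queues ten 4K blocks without waiting for any of them.
for block in range(10):
    requests.put((block, b"\0" * 4096, on_done))

requests.put(None)
worker.join()
print(len(completed), "requests completed")
```

The submitting loop never blocks on an individual request; it learns about completions only through the callback, which is the point of the asynchronous design.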

	Brad Boyer
	flar@allandria.com

On Wed, Dec 17, 2003 at 04:25:11PM -0700, jshankar wrote:
> Please provide some more insight.
> 
> Suppose a filesystem issues a write command to the disk with around 10 4K 
> Blocks  to be written. SCSI device point of view i don't get what is the 
> parallel I/O.
> It has only 1 write command. If some other sends a write request it needs to 
> be queued. But the next question arises how the write data would be handled. 
> Does it mean the SCSI does not give a response for the block of data written. 
> In otherwords does it mean that the response would be given after all the 
> block of data is written for a single write request.


* Re: ext3 file system
  2003-12-17 23:25 jshankar
  2003-12-17 23:59 ` Brad Boyer
@ 2003-12-18  1:25 ` Hans Reiser
  2003-12-18 14:17 ` Richard B. Johnson
  2 siblings, 0 replies; 19+ messages in thread
From: Hans Reiser @ 2003-12-18  1:25 UTC (permalink / raw)
  To: jshankar; +Cc: Richard B. Johnson, Mike Fedyk, linux-fsdevel, linux-kernel

jshankar wrote:

>Hello,
>
>Please provide some more insight.
>
>Suppose a filesystem issues a write command to the disk with around 10 4K 
>Blocks  to be written. SCSI device point of view i don't get what is the 
>parallel I/O.
>It has only 1 write command. If some other sends a write request it needs to 
>be queued. But the next question arises how the write data would be handled. 
>Does it mean the SCSI does not give a response for the block of data written. 
>In otherwords does it mean that the response would be given after all the 
>block of data is written for a single write request.
> 
>Thanks
>Jay
>
>
>
>
>  
>
>>===== Original Message From Mike Fedyk <mfedyk@matchmail.com> =====
>>On Wed, Dec 17, 2003 at 05:25:49PM -0500, Richard B. Johnson wrote:
>>    
>>
>>>to the physical media. There are special file-systems (journaling)
>>>that guarantee that something, enough to recover the data, is
>>>written at periodic intervals.
>>>      
>>>
>>Most journaling filesystems make guarantees on the filesystem meta-data, but
>>not on the data.  Some like ext3, and reiserfs (with suse's journaling
>>patch) can journal the data, or order things so that the data is written
>>before any pointers (i.e. meta-data) make it to the disk so it will be harder
>>to lose data.
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>
>
>  
>
Filesystems don't usually wait on the IO to complete before submitting 
more IO in response to the next write() syscall.  They can do this by 
batching a whole bunch of operations into one committed transaction.

In reiser4 we do this more carefully than other filesystems such as 
reiserfs v3, and as a result every fs operation is fully atomic.
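
To make the batching idea concrete, here is a minimal Python sketch in which operations accumulate in an open transaction that is committed once it grows too large or too old. The class name Batch, the two limits, and their default values are hypothetical; this is not reiser4 or ext3 code.

```python
import time


class Batch:
    """Collect operations and commit them together as one transaction."""

    def __init__(self, max_ops=4, max_age=5.0, clock=time.monotonic):
        self.max_ops = max_ops    # assumed size limit (operations)
        self.max_age = max_age    # assumed age limit (seconds)
        self.clock = clock
        self.pending = []         # operations in the open transaction
        self.opened_at = None     # when the open transaction started
        self.committed = []       # list of committed transactions

    def submit(self, op):
        """Queue an operation and return at once; commit if a limit is hit."""
        if not self.pending:
            self.opened_at = self.clock()
        self.pending.append(op)
        too_big = len(self.pending) >= self.max_ops
        too_old = self.clock() - self.opened_at >= self.max_age
        if too_big or too_old:
            self.commit()

    def commit(self):
        """Close the open transaction, if it contains anything."""
        if self.pending:
            self.committed.append(self.pending)
            self.pending = []
            self.opened_at = None


batch = Batch(max_ops=3)
for n in range(7):
    batch.submit(f"write-{n}")
batch.commit()  # flush whatever is left, as a sync would
print([len(t) for t in batch.committed])
```

The caller of submit() never waits for a commit triggered by someone else's operations; it only pays for a commit when its own operation pushes the batch over a limit.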

-- 
Hans




* RE: ext3 file system
@ 2003-12-18  4:47 jshankar
  2003-12-18  8:39 ` Mike Fedyk
  0 siblings, 1 reply; 19+ messages in thread
From: jshankar @ 2003-12-18  4:47 UTC (permalink / raw)
  To: Hans Reiser; +Cc: linux-fsdevel, linux-kernel

Hello Hans,

>Filesystems don't usually wait on the IO to complete before submitting
>more IO in response to the next write() syscall.  They can do this by
>batching a whole bunch of operations into one committed transaction.
>

Is there a timeout mechanism for batching operations? What if a certain 
operation is issued after the batch operation is executed? Does it mean that 
the new operation has to wait?

Thanks
Jay


>===== Original Message From Hans Reiser <reiser@namesys.com> =====
>jshankar wrote:
>
>>Hello,
>>
>>Please provide some more insight.
>>
>>Suppose a filesystem issues a write command to the disk with around 10 4K
>>Blocks  to be written. SCSI device point of view i don't get what is the
>>parallel I/O.
>>It has only 1 write command. If some other sends a write request it needs to
>>be queued. But the next question arises how the write data would be handled.
>>Does it mean the SCSI does not give a response for the block of data written.
>>In otherwords does it mean that the response would be given after all the
>>block of data is written for a single write request.
>>
>>Thanks
>>Jay
>>
>>
>>
>>
>>
>>
>>>===== Original Message From Mike Fedyk <mfedyk@matchmail.com> =====
>>>On Wed, Dec 17, 2003 at 05:25:49PM -0500, Richard B. Johnson wrote:
>>>
>>>
>>>to the physical media. There are special file-systems (journaling)
>>>that guarantee that something, enough to recover the data, is
>>>written at periodic intervals.
>>>
>>>
>>>Most journaling filesystems make guarantees on the filesystem meta-data, but
>>>not on the data.  Some like ext3, and reiserfs (with suse's journaling
>>>patch) can journal the data, or order things so that the data is written
>>>before any pointers (i.e. meta-data) make it to the disk so it will be harder
>>>to lose data.
>In reiser4 we do this more carefully than other filesystems such as
>reiserfs v3, and as a result every fs operation is fully atomic.
>
>--
>Hans
>
>

* Re: ext3 file system
  2003-12-18  4:47 jshankar
@ 2003-12-18  8:39 ` Mike Fedyk
  2003-12-18 10:41   ` Hans Reiser
  0 siblings, 1 reply; 19+ messages in thread
From: Mike Fedyk @ 2003-12-18  8:39 UTC (permalink / raw)
  To: jshankar; +Cc: Hans Reiser, linux-fsdevel, linux-kernel

On Wed, Dec 17, 2003 at 09:47:59PM -0700, jshankar wrote:
> Hello Hans,
> 
> >Filesystems don't usually wait on the IO to complete before submitting
> >more IO in response to the next write() syscall.  They can do this by
> >batching a whole bunch of operations into one committed transaction.
> >
> 
> Is there a timeout mechanism for batching operations. What if certain 
> operation
> is done after the batch operation is executed. Does it mean that the new 
> operation has to wait.

You don't have to wait unless you run out of available non-dirty memory, or
issue a call to sync to the disks.


* Re: ext3 file system
  2003-12-18  8:39 ` Mike Fedyk
@ 2003-12-18 10:41   ` Hans Reiser
  0 siblings, 0 replies; 19+ messages in thread
From: Hans Reiser @ 2003-12-18 10:41 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: jshankar, linux-fsdevel, linux-kernel

Mike Fedyk wrote:

>On Wed, Dec 17, 2003 at 09:47:59PM -0700, jshankar wrote:
>  
>
>>Hello Hans,
>>
>>    
>>
>>>Filesystems don't usually wait on the IO to complete before submitting
>>>more IO in response to the next write() syscall.  They can do this by
>>>batching a whole bunch of operations into one committed transaction.
>>>
>>>      
>>>
>>Is there a timeout mechanism for batching operations.
>>
At some point due to its age or size you decide the batch needs to commit.

>> What if certain 
>>operation
>>is done after the batch operation is executed. Does it mean that the new 
>>operation has to wait.
>>    
>>
>
>You don't have to wait unless you run out of available non-dirty memory, or
>issue a call to sync to the disks.
>
>
>  
>


-- 
Hans




* RE: ext3 file system
  2003-12-17 23:25 jshankar
  2003-12-17 23:59 ` Brad Boyer
  2003-12-18  1:25 ` Hans Reiser
@ 2003-12-18 14:17 ` Richard B. Johnson
  2 siblings, 0 replies; 19+ messages in thread
From: Richard B. Johnson @ 2003-12-18 14:17 UTC (permalink / raw)
  To: jshankar; +Cc: Mike Fedyk, linux-fsdevel, linux-kernel

On Wed, 17 Dec 2003, jshankar wrote:

> Hello,
>
> Please provide some more insight.
>
> Suppose a filesystem issues a write command to the disk with around 10 4K
> Blocks  to be written. SCSI device point of view i don't get what is the
> parallel I/O.
> It has only 1 write command. If some other sends a write request it needs to
> be queued. But the next question arises how the write data would be handled.
> Does it mean the SCSI does not give a response for the block of data written.
> In otherwords does it mean that the response would be given after all the
> block of data is written for a single write request.
>
> Thanks
> Jay


I guess you completely misunderstand. Any I/O to the physical devices
is completely asynchronous. There is no relationship between when
an application writes a buffer of data to a file, and when it gets
written to the physical media. This includes the device-file,
i.e., the raw device with no file-system.

What is implemented sits behind the VFS (Virtual File System) layer: a
cache of kernel buffers that, in effect, behaves like a RAM-Disk, with all
user data going to and from it.

In principle, many temporary files never even get written to
the physical device. They are created, written, read, then
deleted long before there is any reason to write to the physical
media. Writing to physical media is a performance bottle-neck.

Eventually, the supply of kernel buffers used to keep the
file-system data might get short. When it does, the kernel
writes (through the drivers) data to the devices using a
LRU (least recently used) algorithm. This write also is
asynchronous. It gets handed-off to a SCSI, or IDE, or whatever,
driver which should eventually get the data into the drives.

In the meantime, the devices may time-out, there may be errors
that require the writes to be retried, etc. Eventually the
operating system will be notified that a write succeeded so
that particular amount of RAM containing the data can be
freed.

Even if there are errors, a subsequent read of the data, which
comes from RAM will succeed. It is only after that data gets
to the drive that subsequent reads may require the data to be
re-read from the drive.
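
The behaviour described in the last few paragraphs can be modelled with a small write-back cache. The Python sketch below is a toy model under simplifying assumptions (every buffer is treated as dirty, and a dictionary stands in for the drive); the class name WriteBackCache and its methods are hypothetical and do not mirror the kernel's actual data structures.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy model: writes land in RAM and reach the 'disk' only on eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = OrderedDict()  # block -> data, least recently used first
        self.disk = {}                # stands in for the physical media

    def write(self, block, data):
        self.buffers[block] = data
        self.buffers.move_to_end(block)
        # When buffers run short, write out the least recently used one.
        while len(self.buffers) > self.capacity:
            old_block, old_data = self.buffers.popitem(last=False)
            self.disk[old_block] = old_data

    def read(self, block):
        if block in self.buffers:     # served from RAM
            self.buffers.move_to_end(block)
            return self.buffers[block]
        return self.disk[block]       # must be re-read from the drive

    def delete(self, block):
        self.buffers.pop(block, None)
        self.disk.pop(block, None)


cache = WriteBackCache(capacity=2)
cache.write(0, b"a")
cache.write(1, b"tmp")
cache.delete(1)        # a temporary block that never reaches the disk
cache.write(2, b"b")
cache.write(3, b"c")   # buffers run short: block 0 is written out
print(cache.disk)
```

Block 1 is created and deleted entirely in RAM, while block 0 is written out only when the cache fills up, which matches the description of temporary files and LRU write-out above.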

All this work executes in parallel with the work of the
application software. Notification of the success or failure
of a particular operation is handled in the drivers using
an interrupt. With common SCSI controllers, data are transferred
using Bus-Master DMA so the CPU continues handling user and
kernel code while the DMA is occurring. The CPU is not locked-
off the bus during DMA so there is additional parallelism
under these conditions.

At the SCSI device-driver level, typically a data block is
built that tells the SCSI controller all it needs to know
about the transfer. The controller is then "told" to execute
the command. The success or failure of the command is determined
by some status read in an interrupt. The controller does whatever
it needs to do, to get the data to the drive without using
the CPU at all. This means that the CPU can be executing code
(doing useful work) in parallel with the data transfer.

You can force the file-systems to write their data to the
physical media by executing sync(). This is not a good thing
to do very often if you expect any reasonable performance.
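
From an application, the same request can be made for a single file or for everything. The Python sketch below uses os.fsync() for one file and wraps os.sync() (available on Unix) in a helper that is defined but not called; the helper name sync_all and the file name are made up for the example.

```python
import os
import tempfile


def sync_all() -> None:
    """Ask the kernel to flush all file systems (Unix only); use sparingly."""
    if hasattr(os, "sync"):
        os.sync()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "example.dat")
    with open(path, "wb") as f:
        f.write(b"x" * 4096)   # buffered in the process, then in the kernel
        f.flush()              # hand the process's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to push this file to the device
    size = os.path.getsize(path)

print(size)
```

Flushing a single file with os.fsync() is usually much cheaper than a global sync, which is one reason to reach for the latter only rarely.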

Short of an explicit sync(), the only time all the data is guaranteed to
reach the drive(s) is when the file-systems are unmounted (umount). This
gets all the data into the drives and severs the logical connection between
your applications and the file-systems that you just unmounted.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (797.90 BogoMips).
            Note 96.31% of all statistics are fiction.




Thread overview: 19+ messages
2003-12-17 22:13 ext3 file system jshankar
2003-12-17 22:25 ` Richard B. Johnson
2003-12-17 23:02   ` Mike Fedyk
  -- strict thread matches above, loose matches on Subject: below --
2003-12-18  4:47 jshankar
2003-12-18  8:39 ` Mike Fedyk
2003-12-18 10:41   ` Hans Reiser
2003-12-17 23:25 jshankar
2003-12-17 23:59 ` Brad Boyer
2003-12-18  1:25 ` Hans Reiser
2003-12-18 14:17 ` Richard B. Johnson
2002-07-20  6:13 Peter
2002-07-20 11:26 ` pa3gcu
2002-07-20 16:22 ` 1stFlight
2002-08-03 20:24 ` Benny Pedersen
2002-01-17 21:45 Justin Smith
2002-01-18 15:14 ` Noah silva
2002-01-18 15:43 ` Stephen Smalley
2002-01-18 15:53 ` Stephen Smalley
2002-01-18 17:14   ` Paul Kronenwetter
