* Can ReiserFS solve this?
@ 2003-04-28 14:04 Erik Terpstra
  2003-04-28 14:17 ` Valdis.Kletnieks
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Erik Terpstra @ 2003-04-28 14:04 UTC (permalink / raw)
  To: reiserfs-list

Hi,

I am looking for a solution for the following problem:

On a legacy system for newspaper workflow, files are delivered to a 
certain directory (for example ~/input).
These files (in TIFF format) can be quite large (10 to 400 MB). They 
may be copied over the local filesystem, a Samba share or via FTP.
When large files are copied over the network these files show up in 
~/input while they are being copied (you can see the filesize grow).

However TIFF files are only useful for further processing when they are 
complete.

Initially we solved this problem by monitoring the input directory and 
having our applications watch the files until they stop growing, but 
this isn't a very elegant or reliable method.
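
For illustration, the "watch until it stops growing" approach could be
sketched in Python roughly as follows. The helper name wait_until_stable
and its parameters are hypothetical, not part of our actual applications,
and an unchanged size only suggests the writer has stopped; it cannot tell
a finished transfer from a stalled or broken one.

```python
import os
import time


def wait_until_stable(path, interval=1.0, checks=3, sleep=time.sleep):
    """Return the file size once it stayed unchanged for `checks` polls.

    Hypothetical helper: a stable size is only a hint that the writer
    has stopped, not proof that the file is complete.
    """
    last = os.path.getsize(path)
    unchanged = 0
    while unchanged < checks:
        sleep(interval)
        size = os.path.getsize(path)
        if size == last:
            unchanged += 1
        else:
            # The file grew (or shrank): start counting again.
            last, unchanged = size, 0
    return last
```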

After a while we discovered the UNIX 'fuser' command so that we could 
see if the incoming file is still being transferred or not.
This works fine but it's still not very elegant, and it requires root 
privileges for the applications involved in the workflow.

Naturally, the best solution would be for the sender to notify the 
completion of the transfer. But this is not an option because several 
organizations are involved that do not wish to adapt their software.

Right now I am wondering if this is something that could be solved on 
the filesystem level, i.e. is it possible to 'only see files that are 
not in the process of being transferred'.

Is this possible with Reiser3? Reiser4? Should it be solved on the 
filesystem level?

Any thoughts on this matter are appreciated.

Kind regards,

   Erik Terpstra.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Can ReiserFS solve this?
  2003-04-28 14:04 Can ReiserFS solve this? Erik Terpstra
@ 2003-04-28 14:17 ` Valdis.Kletnieks
  2003-04-28 14:26   ` Erik Terpstra
       [not found]   ` <1742847756.20030428162843@tnonline.net>
  2003-04-28 14:32 ` Oleg Drokin
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Valdis.Kletnieks @ 2003-04-28 14:17 UTC (permalink / raw)
  To: Erik Terpstra; +Cc: reiserfs-list

On Mon, 28 Apr 2003 16:04:05 +0200, Erik Terpstra <erik@solidcode.net>  said:
> Hi,
> 
> I am looking for a solution for the following problem:
> 
> On a legacy system for newspaper workflow, files are delivered to a 
> certain directory (for example ~/input).
> These files (in TIFF format) can be quite large (10 to 400 MB), they 
> could be copied over the local filesystem, a Samba share or via FTP.
> When large files are copied over the network these files show up in 
> ~/input while they are being copied (you can see the filesize grow).
> 
> However TIFF files are only useful for further processing when they are 
> complete.

Note that *any* solution has to deal elegantly with the case of a transfer
that fails partway through.  No amount of 'fuser' or reiser trickery will
fix the case where you are receiving a 400 MB TIFF, and the connection is
closed after 250 MB is transferred.


* Re: Can ReiserFS solve this?
  2003-04-28 14:17 ` Valdis.Kletnieks
@ 2003-04-28 14:26   ` Erik Terpstra
  2003-04-28 16:11     ` Christian Mayrhuber
       [not found]   ` <1742847756.20030428162843@tnonline.net>
  1 sibling, 1 reply; 18+ messages in thread
From: Erik Terpstra @ 2003-04-28 14:26 UTC (permalink / raw)
  Cc: reiserfs-list

Valdis.Kletnieks@vt.edu wrote:

>On Mon, 28 Apr 2003 16:04:05 +0200, Erik Terpstra <erik@solidcode.net>  said:
>  
>
>>Hi,
>>
>>I am looking for a solution for the following problem:
>>
>>On a legacy system for newspaper workflow, files are delivered to a 
>>certain directory (for example ~/input).
>>These files (in TIFF format) can be quite large (10 to 400 MB), they 
>>could be copied over the local filesystem, a Samba share or via FTP.
>>When large files are copied over the network these files show up in 
>>~/input while they are being copied (you can see the filesize grow).
>>
>>However TIFF files are only useful for further processing when they are 
>>complete.
>>    
>>
>
>Note that *any* solution has to deal elegantly with the case of a transfer
>that fails partway through.  No amount of 'fuser' or reiser trickery will
>fix the case where you are receiving a 400mb TIFF, and the connection is
>closed after 250M is transferred.
>
I agree, but shouldn't it be possible to see a distinction between files 
that are actually there and those that are coming in?
In my problem domain there are three cases:

1) Files that are arriving
2) Files that have arrived and are complete (i.e. correct TIFF format)
3) Files that have arrived and are incomplete (invalid TIFFs)

I am happy to deal with case 2 and 3, but I am not interested in case 1.

If my problem domain is not related to filesystems then that's okay, but 
I appreciate your opinion (I am not a filesystem expert).

  Erik.



* Re: Can ReiserFS solve this?
  2003-04-28 14:04 Can ReiserFS solve this? Erik Terpstra
  2003-04-28 14:17 ` Valdis.Kletnieks
@ 2003-04-28 14:32 ` Oleg Drokin
  2003-04-28 14:42   ` Valdis.Kletnieks
  2003-04-28 15:45   ` Yury Umanets
  2003-04-28 14:53 ` Hans Reiser
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Oleg Drokin @ 2003-04-28 14:32 UTC (permalink / raw)
  To: Erik Terpstra; +Cc: reiserfs-list

Hello!

On Mon, Apr 28, 2003 at 04:04:05PM +0200, Erik Terpstra wrote:
> I am looking for a solution for the following problem:
> On a legacy system for newspaper workflow, files are delivered to a 
> certain directory (for example ~/input).
> These files (in TIFF format) can be quite large (10 to 400 MB), they 
> could be copied over the local filesystem, a Samba share or via FTP.
> When large files are copied over the network these files show up in 
> ~/input while they are being copied (you can see the filesize grow).
> Naturally, the best solution would be for the sender to notify the 
> completion of the transfer. But this is not an option because several 
> organizations are involved that do not wish to adapt their software.
> Right now I am wondering if this is something that could be solved on 
> the filesystem level, i.e. is it possible to 'only see files that are 
> not in the process of being transferred'.
> Is this possible with Reiser3? Reiser4? Should it be solved on the 
> filesystem level?

How about such a generic solution:

You create /incoming/.temp (or /incoming.tmp), and all the files are written
there. When the write is complete, you just do a rename(2) from the tempdir to
/incoming. This is an atomic operation; files will appear immediately in place
at their full size.

FTP knows how to rename stuff at remote location.
I will be surprised if samba does not know how to do that.
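
A minimal sketch of the temp-directory-plus-rename idea in Python, for
whoever controls the writing side. The function name deliver and the
directory layout are assumptions for illustration, and the rename is only
atomic when the temp directory and /incoming live on the same filesystem.

```python
import os
import shutil


def deliver(src_stream, incoming_dir, name, tmp_name=".temp"):
    """Write an upload into incoming_dir/.temp, then rename it into place."""
    tmp_dir = os.path.join(incoming_dir, tmp_name)
    os.makedirs(tmp_dir, exist_ok=True)
    tmp_path = os.path.join(tmp_dir, name)
    final_path = os.path.join(incoming_dir, name)
    with open(tmp_path, "wb") as out:
        shutil.copyfileobj(src_stream, out)
        out.flush()
        os.fsync(out.fileno())  # make sure the data is on disk first
    # rename(2) is atomic only when both paths are on the same filesystem.
    os.rename(tmp_path, final_path)
    return final_path
```

Readers of /incoming then only need to skip the .temp entry; everything
else they see has been written completely.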

Bye,
    Oleg


* Re: Can ReiserFS solve this?
       [not found]   ` <1742847756.20030428162843@tnonline.net>
@ 2003-04-28 14:39     ` Anders Widman
  0 siblings, 0 replies; 18+ messages in thread
From: Anders Widman @ 2003-04-28 14:39 UTC (permalink / raw)
  To: reiserfs-list

> Note that *any* solution has to deal elegantly with the case of a transfer
> that fails partway through.  No amount of 'fuser' or reiser trickery will
> fix the case where you are receiving a 400mb TIFF, and the connection is
> closed after 250M is transferred.


   Does not Samba take care of this?



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt



* Re: Can ReiserFS solve this?
  2003-04-28 14:32 ` Oleg Drokin
@ 2003-04-28 14:42   ` Valdis.Kletnieks
  2003-04-28 14:52     ` Chris Dukes
                       ` (2 more replies)
  2003-04-28 15:45   ` Yury Umanets
  1 sibling, 3 replies; 18+ messages in thread
From: Valdis.Kletnieks @ 2003-04-28 14:42 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Erik Terpstra, reiserfs-list

On Mon, 28 Apr 2003 18:32:31 +0400, Oleg Drokin said:

> How about such a generic solution:
> 
> you create /incoming/.temp (or /incoming.tmp), all the files are being
> written there. When write is complete, you just do rename(2) from tempdir
> to /incoming

Oleg:  A very good solution, except there are chucklehead admins at the
remote site that refuse to make changes to their end, and this solution
would require a change at the remote end to do a FTP rename.. ;)


* Re: Can ReiserFS solve this?
  2003-04-28 14:42   ` Valdis.Kletnieks
@ 2003-04-28 14:52     ` Chris Dukes
  2003-04-28 14:53     ` Oleg Drokin
  2003-04-28 14:53     ` Anders Widman
  2 siblings, 0 replies; 18+ messages in thread
From: Chris Dukes @ 2003-04-28 14:52 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Oleg Drokin, Erik Terpstra, reiserfs-list

On Mon, Apr 28, 2003 at 10:42:30AM -0400, Valdis.Kletnieks@vt.edu wrote:
> 
> Oleg:  A very good solution, except there are chucklehead admins at the
> remote site that refuse to make changes to their end, and this solution
> would require a change at the remote end to do a FTP rename.. ;)

Technology is never a good solution to social problems.

As a gut feeling, I suspect that FTP is the wrong tool for the job.
Something like rsync would be a better choice.
If you must go with FTP, you might be able to peruse the logs for
truncated transfers.

-- 
Chris Dukes
I tried being reasonable once--I didn't like it.


* Re: Can ReiserFS solve this?
  2003-04-28 14:04 Can ReiserFS solve this? Erik Terpstra
  2003-04-28 14:17 ` Valdis.Kletnieks
  2003-04-28 14:32 ` Oleg Drokin
@ 2003-04-28 14:53 ` Hans Reiser
  2003-04-28 15:24   ` Erik Terpstra
  2003-04-28 16:36 ` Kristian Koehntopp
  2003-04-28 17:37 ` Anders Widman
  4 siblings, 1 reply; 18+ messages in thread
From: Hans Reiser @ 2003-04-28 14:53 UTC (permalink / raw)
  To: Erik Terpstra; +Cc: reiserfs-list

Erik Terpstra wrote:

> Hi,
>
> I am looking for a solution for the following problem:
>
> On a legacy system for newspaper workflow, files are delivered to a 
> certain directory (for example ~/input).
> These files (in TIFF format) can be quite large (10 to 400 MB), they 
> could be copied over the local filesystem, a Samba share or via FTP.
> When large files are copied over the network these files show up in 
> ~/input while they are being copied (you can see the filesize grow).
>
> However TIFF files are only useful for further processing when they 
> are complete.
>
> Initially we solved this problem by monitoring the input directory and 
> make our applications look at the files until they stop growing, but 
> this isn't a very elegant and reliable method.
>
> After a while we discovered the UNIX 'fuser' command so that we could 
> see if the incoming file is still being transferred or not.
> This works fine but it's still not very elegant, and it requires root 
> privileges for the applications involved in the workflow.
>
> Naturally, the best solution would be for the sender to notify the 
> completion of the transfer. But this is not an option because several 
> organizations are involved that do not wish to adapt their software.
>
> Right now I am wondering if this is something that could be solved on 
> the filesystem level, i.e. is it possible to 'only see files that are 
> not in the process of being transferred'.
>
> Is this possible with Reiser3? Reiser4? Should it be solved on the 
> filesystem level?
>
> Any thoughts on this matter are appreciated.
>
> Kind regards,
>
>   Erik Terpstra.
>
>
>
Is there a test for completeness of the tiff?  It could be done as a 
reiser4 plugin.

-- 
Hans




* Re: Can ReiserFS solve this?
  2003-04-28 14:42   ` Valdis.Kletnieks
  2003-04-28 14:52     ` Chris Dukes
@ 2003-04-28 14:53     ` Oleg Drokin
  2003-04-28 14:53     ` Anders Widman
  2 siblings, 0 replies; 18+ messages in thread
From: Oleg Drokin @ 2003-04-28 14:53 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Erik Terpstra, reiserfs-list

Hello!

On Mon, Apr 28, 2003 at 10:42:30AM -0400, Valdis.Kletnieks@vt.edu wrote:
> > How about such a generic solution:
> > you create /incoming/.temp (or /incoming.tmp), all the files are being
> > written there. When write is complete, you just do rename(2) from tempdir
> > to /incoming
> Oleg:  A very good solution, except there are chucklehead admins at the
> remote site that refuse to make changes to their end, and this solution
> would require a change at the remote end to do a FTP rename.. ;)

Huh? No changes at the remote end are needed.
The FTP rename command is sent from our end (to the remote server) after the transfer is done.
As I understand it, the sending side is under "our" control?
Well, if this assumption is wrong, then there are less reliable things to look at
(keeping broken connections in mind).
Samba and most FTP clients (though I am not sure whether FTP clients do this on put)
change the file's mtime to that of the original file.
Just compare ctime and mtime: if ctime is greater than mtime, then the transfer is finished ;)
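
As a rough illustration only, the ctime/mtime comparison could be written
like this in Python. The function name looks_finished is hypothetical, and
the check relies on the assumption above: that the transfer tool sets the
file's mtime back to the original's timestamp when it finishes, which in
turn updates ctime. On Windows, st_ctime is the creation time, so this would
not apply there.

```python
import os


def looks_finished(path):
    """Heuristic: treat the file as finished if ctime is later than mtime.

    Assumption: the writer resets mtime to the source file's (older)
    timestamp on completion, which updates ctime to the current time.
    """
    st = os.stat(path)
    return st.st_ctime > st.st_mtime
```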

Or just change the FTP server to store files internally in /incoming/.tmp and move them to /incoming
once the upload is finished. (This will need some tricks to handle uploads that continue after a broken connection.)

Bye,
    Oleg


* Re: Can ReiserFS solve this?
  2003-04-28 14:42   ` Valdis.Kletnieks
  2003-04-28 14:52     ` Chris Dukes
  2003-04-28 14:53     ` Oleg Drokin
@ 2003-04-28 14:53     ` Anders Widman
  2003-04-28 15:21       ` Hans Reiser
  2 siblings, 1 reply; 18+ messages in thread
From: Anders Widman @ 2003-04-28 14:53 UTC (permalink / raw)
  To: reiserfs-list

> On Mon, 28 Apr 2003 18:32:31 +0400, Oleg Drokin said:

>> How about such a generic solution:
>> 
>> you create /incoming/.temp (or /incoming.tmp), all the files are being
>> written there. When write is complete, you just do rename(2) from tempdir
>> to /incoming

> Oleg:  A very good solution, except there are chucklehead admins at the
> remote site that refuse to make changes to their end, and this solution
> would require a change at the remote end to do a FTP rename.. ;)


   Which still should be less expensive than having to resort to
   making custom programs on your side. Use money as an argument.. Works
   most of the time ;)

   //Anders


--------
PGP public key: https://tnonline.net/secure/pgp_key.txt



* Re: Can ReiserFS solve this?
  2003-04-28 14:53     ` Anders Widman
@ 2003-04-28 15:21       ` Hans Reiser
  0 siblings, 0 replies; 18+ messages in thread
From: Hans Reiser @ 2003-04-28 15:21 UTC (permalink / raw)
  To: Anders Widman; +Cc: reiserfs-list

Anders Widman wrote:

>>On Mon, 28 Apr 2003 18:32:31 +0400, Oleg Drokin said:
>
>>>How about such a generic solution:
>>>
>>>you create /incoming/.temp (or /incoming.tmp), all the files are being
>>>written there. When write is complete, you just do rename(2) from tempdir
>>>to /incoming
>
>>Oleg:  A very good solution, except there are chucklehead admins at the
>>remote site that refuse to make changes to their end, and this solution
>>would require a change at the remote end to do a FTP rename.. ;)
>
>   Which still should be less expensive than having to resort to
>   making custom programs on your side. Use money as argument.. Works
>   most of the time ;)
>
>   //Anders
>
>--------
>PGP public key: https://tnonline.net/secure/pgp_key.txt
>
We can quote a price that you can use to bang their heads in with if you 
want.;-)

-- 
Hans




* Re: Can ReiserFS solve this?
  2003-04-28 14:53 ` Hans Reiser
@ 2003-04-28 15:24   ` Erik Terpstra
  2003-04-28 15:52     ` Erik Terpstra
  0 siblings, 1 reply; 18+ messages in thread
From: Erik Terpstra @ 2003-04-28 15:24 UTC (permalink / raw)
  To: reiserfs-list

Hi Hans,

Hans Reiser wrote:

> Is there a test for completeness of the tiff?  It could be done as a 
> reiser4 plugin.

Yes, tiffinfo.c from libtiff (http://www.libtiff.org).
That would indeed be an elegant solution!
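
To give an idea of what such a test involves, here is a rough Python sketch
that does not use libtiff at all. It only checks that the TIFF header is
sane and that every image file directory (IFD) in the chain lies inside the
file; the function name is hypothetical and this is far weaker than what
tiffinfo does. It handles classic TIFF only (not BigTIFF) and does not
verify the image data itself.

```python
import os
import struct


def tiff_ifds_within_file(path):
    """Rough completeness check for a classic (non-BigTIFF) TIFF file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8:
            return False
        if header[:2] == b"II":      # little-endian byte order
            endian = "<"
        elif header[:2] == b"MM":    # big-endian byte order
            endian = ">"
        else:
            return False
        magic, offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            return False
        seen = set()
        while offset != 0:
            # Reject loops and IFDs that start beyond the end of the file.
            if offset in seen or offset + 2 > size:
                return False
            seen.add(offset)
            f.seek(offset)
            (count,) = struct.unpack(endian + "H", f.read(2))
            # An IFD is: 2-byte count, 12 bytes per entry, 4-byte next offset.
            end = offset + 2 + count * 12 + 4
            if end > size:
                return False
            f.seek(end - 4)
            (offset,) = struct.unpack(endian + "I", f.read(4))
        return bool(seen)
```

Writers that place the IFD at the end of the file would fail this check
while the transfer is still running, but that placement is not guaranteed.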

I just realized that the code in the fuser command could also be done as 
a reiser4 plugin (as to provide a more generic solution).
A fuser reiser4 plugin probably implies the following:

An 'ls' command on an input directory would indeed show only 
'transferred files', which I realize could better be described as 'files 
that are in-use by a process'.

So when such a plugin is active on my home directory and I do an ls I 
only see 'complete files'. However when one of my files is opened by a 
program, for example bla.doc is opened by OpenOffice, then bla.doc won't 
be visible in the results of the 'ls' command (which is kind of weird 
behaviour but could be handy at times).

Is my assumption correct? There is no further distinction between files 
that are being received and files that are in use by a process?

Erik.




* Re: Can ReiserFS solve this?
  2003-04-28 14:32 ` Oleg Drokin
  2003-04-28 14:42   ` Valdis.Kletnieks
@ 2003-04-28 15:45   ` Yury Umanets
  2003-04-28 19:48     ` Soeren Sonnenburg
  1 sibling, 1 reply; 18+ messages in thread
From: Yury Umanets @ 2003-04-28 15:45 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Erik Terpstra, reiserfs-list

Oleg Drokin wrote:

>Hello!
>
>On Mon, Apr 28, 2003 at 04:04:05PM +0200, Erik Terpstra wrote:
>  
>
>>I am looking for a solution for the following problem:
>>On a legacy system for newspaper workflow, files are delivered to a 
>>certain directory (for example ~/input).
>>These files (in TIFF format) can be quite large (10 to 400 MB), they 
>>could be copied over the local filesystem, a Samba share or via FTP.
>>When large files are copied over the network these files show up in 
>>~/input while they are being copied (you can see the filesize grow).
>>Naturally, the best solution would be for the sender to notify the 
>>completion of the transfer. But this is not an option because several 
>>organizations are involved that do not wish to adapt their software.
>>Right now I am wondering if this is something that could be solved on 
>>the filesystem level, i.e. is it possible to 'only see files that are 
>>not in the process of being transferred'.
>>Is this possible with Reiser3? Reiser4? Should it be solved on the 
>>filesystem level?
>>    
>>

>How about such a generic solution:
>
>you create /incoming/.temp (or /incoming.tmp), all the files are being written
>there. When write is complete, you just do rename(2) from tempdir to /incoming
>This is atomic operation, files will appear immediately in place in their full
>size.
>
>FTP knows how to rename stuff at remote location.
>I will be surprised if samba does not know how to do that.
>
>Bye,
>    Oleg
>
>
>  
>
Then you might also use some kind of filesystem notification, like GNOME 
or KDE do for their file managers. I mean that Konqueror or Nautilus 
notices changes in a particular directory without refreshing.

-- 
Yury Umanets
"We're flying high, we're watching the world passes by..."





* Re: Can ReiserFS solve this?
  2003-04-28 15:24   ` Erik Terpstra
@ 2003-04-28 15:52     ` Erik Terpstra
  0 siblings, 0 replies; 18+ messages in thread
From: Erik Terpstra @ 2003-04-28 15:52 UTC (permalink / raw)
  To: reiserfs-list

Erik Terpstra wrote:

> I just realized that the code in the fuser command could also be done 
> as a reiser4 plugin (as to provide a more generic solution).
> A fuser reiser4 plugin probably implies the following:
>
> An 'ls' command on an input directory would indeed show only 
> 'transferred files', which I realize could better be described as 
> 'files that are in-use by a process'. 

Duhh, I meant 'transferred files' could better be described as 'files 
that are *not* in-use by a process'.

>
>
> So when such a plugin is active on my home directory and I do an ls I 
> only see 'complete files'. However when one of my files is opened by a 
> program, for example bla.doc is opened by OpenOffice, then bla.doc 
> won't be visible in the results of the 'ls' command (which is kind of 
> weird behaviour but could be handy at times).
>
> Is my assumption correct? There is no further distinction between 
> files that are being received and files that are in use by a process?
>
> Erik.
>
>



* Re: Can ReiserFS solve this?
  2003-04-28 14:26   ` Erik Terpstra
@ 2003-04-28 16:11     ` Christian Mayrhuber
  0 siblings, 0 replies; 18+ messages in thread
From: Christian Mayrhuber @ 2003-04-28 16:11 UTC (permalink / raw)
  To: Erik Terpstra, reiserfs-list

Am Montag, 28. April 2003 16:26 schrieb Erik Terpstra:
> Valdis.Kletnieks@vt.edu wrote:
> >On Mon, 28 Apr 2003 16:04:05 +0200, Erik Terpstra <erik@solidcode.net>  said:
> >>Hi,
> >>
> >>I am looking for a solution for the following problem:
> >>
> >>On a legacy system for newspaper workflow, files are delivered to a
> >>certain directory (for example ~/input).
> >>These files (in TIFF format) can be quite large (10 to 400 MB), they
> >>could be copied over the local filesystem, a Samba share or via FTP.
> >>When large files are copied over the network these files show up in
> >>~/input while they are being copied (you can see the filesize grow).
> >>
> >>However TIFF files are only useful for further processing when they are
> >>complete.
> >
> >Note that *any* solution has to deal elegantly with the case of a transfer
> >that fails partway through.  No amount of 'fuser' or reiser trickery will
> >fix the case where you are receiving a 400mb TIFF, and the connection is
> >closed after 250M is transferred.
>
> I agree, but shouldn't it be possible to see a distinction between files
> that are actually there and those that are coming in?
>  From my problem domain there are three cases:
>
> 1) Files that are arriving
> 2) Files that have arrived and are complete (i.e. correct TIFF format)
> 3) Files that have arrived and are incomplete (invalid TIFFs)
>
> I am happy to deal with case 2 and 3, but I am not interested in case 1.
>
> If my problem domain is not related to filesystems then that's okay, but
> I appreciate your opinion (I am not a filesystem expert).
>
>   Erik.

Have a look at pure-ftpd (http://www.pureftpd.org/). It supports running a script
(pure-uploadscript) after a file has been successfully uploaded.
In this script you could check whether the TIFF is OK and, if so, move it
to your ./input directory. It's one of the most secure FTP servers out there
and is controllable through command-line switches (bandwidth throttling, etc.).
It provides a program called pure-ftpwho which displays the current connections
and their status.
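
Such an upload hook might look roughly like the following Python sketch.
It assumes the hook receives the uploaded file's path as its first
argument; the function name handle_upload and both directory names are
placeholders, and the TIFF check here only looks at the four magic bytes,
so a real script would want a stricter test.

```python
import os
import sys

# First four bytes of a classic TIFF: little-endian or big-endian form.
TIFF_MAGICS = (b"II*\x00", b"MM\x00*")


def handle_upload(path, input_dir, reject_dir):
    """Move a finished upload to input_dir, or to reject_dir if not a TIFF."""
    with open(path, "rb") as f:
        ok = f.read(4) in TIFF_MAGICS
    target_dir = input_dir if ok else reject_dir
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, os.path.basename(path))
    os.rename(path, target)  # atomic only within one filesystem
    return target


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the hook is called with the uploaded file as argv[1];
    # both directories below are placeholders.
    handle_upload(sys.argv[1], "/srv/input", "/srv/rejected")
```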

I guess, it can do what you need ;-)
-- 
lg, Chris



* Re: Can ReiserFS solve this?
  2003-04-28 14:04 Can ReiserFS solve this? Erik Terpstra
                   ` (2 preceding siblings ...)
  2003-04-28 14:53 ` Hans Reiser
@ 2003-04-28 16:36 ` Kristian Koehntopp
  2003-04-28 17:37 ` Anders Widman
  4 siblings, 0 replies; 18+ messages in thread
From: Kristian Koehntopp @ 2003-04-28 16:36 UTC (permalink / raw)
  To: Erik Terpstra; +Cc: reiserfs-list

On Mon, Apr 28, 2003 at 04:04:05PM +0200, Erik Terpstra wrote:
> I am looking for a solution for the following problem:
>
> [ I need to touch only these arriving files that are already
>   complete ]

You should have a look at the "maildir" format for mail
directories as opposed to mbox. It is described in
http://cr.yp.to/proto/maildir.html.

Arriving mail has a problem very similar to yours, and DJB
proposes a solution to this by having multiple directories.
Essentially, DJB observes that the rename(2) system call is
atomic within the boundaries of a filesystem, while
open(2)/write(2)/close(2) sequences are not, and proposes
uploads of arriving new mails into a "tmp" directory, which is
parallel to a "new" directory:

.../foldername/tmp
              /new

New mails arrive in tmp until they are completely transferred.
They are then moved into new with a single atomic rename(2)
call, guaranteeing that all files in new are complete. Also,
files that are being worked on can be moved into a third
parallel directory in your system ("work"), thus making sure that
only one instance of a worker process can work on a certain
file, enabling parallelism in your application:

.../foldername/tmp/partial_file.tiff
              /new/somefile_1.tiff
                  /somefile_2.tiff
              /work/worker_<pid>.tiff
                   /worker_<pid>.tiff

Thus, each worker process will move its current
new/somefile_<n>.tiff to work/worker_<pid>.tiff, and check whether
that succeeds. If so, it is the only worker currently working on
that file, and can proceed. At the same time, any number of
workers may be active, without a chance of two workers running
into each other.
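
The claim step could be sketched in Python like this. The function name
claim is hypothetical, and the sketch assumes that each worker handles one
file at a time, so its own work/worker_<pid>.tiff does not already exist
when it tries to claim the next file (rename would silently replace it).

```python
import os


def claim(new_dir, work_dir, name):
    """Try to claim new_dir/name for this process.

    Returns the path under work_dir on success, or None if another
    worker renamed the file away first.
    """
    src = os.path.join(new_dir, name)
    dst = os.path.join(work_dir, "worker_%d.tiff" % os.getpid())
    try:
        os.rename(src, dst)  # atomic within one filesystem
    except FileNotFoundError:
        return None  # somebody else was faster
    return dst
```

Only one of several competing rename calls can succeed for a given source
file, which is what makes the scheme safe without any extra locking.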


I once had a similar problem to yours, where I had to push and
pull batchfiles with Siemens HiCom data to a rather primitive
SVR4 that controlled the installation. Here, I was not allowed
to use directories, so I used trigger files named
"<original_name>.x" which were uploaded after the
"<original_name>" file. An uploaded "q" was complete when the
matching empty trigger file "q.x" showed up (which, due to
sorting, was always transmitted immediately after "q"). If there was
a "q.w" work indicator (containing the PID of the worker), the
"q" file was being processed and thus untouchable.

This works with any filesystem that has atomic rename(2)
semantics.
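
A small Python sketch of the trigger-file variant, with hypothetical
function names. Creating the ".w" marker with O_CREAT | O_EXCL is an
assumption of this sketch, chosen so that two workers cannot both mark the
same file; the original setup is not described in that much detail.

```python
import os


def ready_files(directory):
    """List uploads that have a ".x" trigger and no ".w" work marker."""
    names = set(os.listdir(directory))
    return sorted(
        n for n in names
        if not n.endswith((".x", ".w"))
        and n + ".x" in names
        and n + ".w" not in names
    )


def mark_in_progress(directory, name):
    """Create "<name>.w" holding our PID; raise if it already exists."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(os.path.join(directory, name + ".w"), flags)
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
```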

Kristian


* Re: Can ReiserFS solve this?
  2003-04-28 14:04 Can ReiserFS solve this? Erik Terpstra
                   ` (3 preceding siblings ...)
  2003-04-28 16:36 ` Kristian Koehntopp
@ 2003-04-28 17:37 ` Anders Widman
  4 siblings, 0 replies; 18+ messages in thread
From: Anders Widman @ 2003-04-28 17:37 UTC (permalink / raw)
  To: reiserfs-list

> Hi,

> I am looking for a solution for the following problem:

> On a legacy system for newspaper workflow, files are delivered to a 
> certain directory (for example ~/input).
> These files (in TIFF format) can be quite large (10 to 400 MB), they 
> could be copied over the local filesystem, a Samba share or via FTP.
> When large files are copied over the network these files show up in 
> ~/input while they are being copied (you can see the filesize grow).

> However TIFF files are only useful for further processing when they are 
> complete.

> Initially we solved this problem by monitoring the input directory and
> make our applications look at the files until they stop growing, but 
> this isn't a very elegant and reliable method.

  No, and it won't give confirmation that the file actually is
  complete.

  You can set Samba to issue a lock on the files while transferring.
  This should be enough to see if the file has finished transferring
  or not.

> After a while we discovered the UNIX 'fuser' command so that we could 
> see if the incoming file is still being transferred or not.
> This works fine but it's still not very elegant, and it requires root 
> privileges for the applications involved in the workflow.

> Naturally, the best solution would be for the sender to notify the 
> completion of the transfer. But this is not an option because several 
> organizations are involved that do not wish to adapt their software.

> Right now I am wondering if this is something that could be solved on 
> the filesystem level, i.e. is it possible to 'only see files that are 
> not in the process of being transferred'.

> Is this possible with Reiser3? Reiser4? Should it be solved on the
> filesystem level?

  That sounds like a rather extreme measure.

> Any thoughts on this matter are appreciated.

> Kind regards,

>    Erik Terpstra.



--------
PGP public key: https://tnonline.net/secure/pgp_key.txt



* Re: Can ReiserFS solve this?
  2003-04-28 15:45   ` Yury Umanets
@ 2003-04-28 19:48     ` Soeren Sonnenburg
  0 siblings, 0 replies; 18+ messages in thread
From: Soeren Sonnenburg @ 2003-04-28 19:48 UTC (permalink / raw)
  To: reiserfs-list

On Mon, 2003-04-28 at 17:45, Yury Umanets wrote:
[...]
> Then you also might use some kind of filesystem notification, like GNOME 
> or KDE do for their file managers. I mean, that konqueror or nautilus 
> knows about changing in particular directory without refreshing.

Right, they use FAM... the Courier mail suite also makes use of FAM... it
is really cool to see the size/mtime change while one is copying :-)

S.

