linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict
@ 2014-02-03 23:07 Andreas Dilger
  2014-02-04  0:09 ` Dilger, Andreas
  2014-02-04  4:10 ` Greg KH
  0 siblings, 2 replies; 4+ messages in thread
From: Andreas Dilger @ 2014-02-03 23:07 UTC
  To: greg; +Cc: linux-fsdevel, Andreas Dilger, Oleg Drokin, Peng Tao

In kernel 3.11 O_TMPFILE was introduced, but the open flag value
conflicts with the O_LOV_DELAY_CREATE flag 020000000 previously used
by Lustre-aware applications.  O_LOV_DELAY_CREATE allows applications
to defer file layout and object creation from open time (the default)
until it can instead be specified by the application using an ioctl.
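
To make the conflict concrete, here is a minimal user-space sketch (it is
not part of the patch).  The SKETCH_* names and the helper are made up for
illustration.  The values are assumptions taken from the asm-generic
definition of __O_TMPFILE and from the two bits named in the comment that
this patch removes; some architectures use different values.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, hard-coded so the sketch does not depend on kernel headers. */
#define SKETCH_O_TMPFILE_BIT	020000000u	/* asm-generic __O_TMPFILE */
#define SKETCH_FMODE_NONOTIFY	0100000000u	/* bit named in the removed comment */
#define SKETCH_OLD_DELAY_CREATE	0120000000u	/* old O_LOV_DELAY_CREATE */

/* The old value is simply the OR of the two historical flag bits. */
static_assert(SKETCH_OLD_DELAY_CREATE ==
	      (SKETCH_O_TMPFILE_BIT | SKETCH_FMODE_NONOTIFY),
	      "old O_LOV_DELAY_CREATE is the union of both bits");

/* Hypothetical helper: true if two flag masks share at least one bit. */
static inline bool sketch_flags_overlap(unsigned int a, unsigned int b)
{
	return (a & b) != 0;
}
```

With these values, sketch_flags_overlap(SKETCH_OLD_DELAY_CREATE,
SKETCH_O_TMPFILE_BIT) is true, so an "any bit set" test on the old mask
would also fire on an O_TMPFILE open.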

Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
or define a Lustre-specific flag that isn't of use to most/any other
filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flags
are not meaningful for newly-created regular files and should be
OK since O_LOV_DELAY_CREATE is only meaningful for new files.
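
Because the new value is a two-bit mask, it has to be matched as a whole.
The following user-space sketch mirrors the helpers added in lustre_mdc.h
below, under the assumption that O_ASYNC is the user-space spelling of
FASYNC.  The sketch_* names are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Assumption: O_ASYNC stands in for the kernel's FASYNC. */
#define SKETCH_DELAY_CREATE	(O_NOCTTY | O_ASYNC)

/* The mask must stay clear of the access mode bits (O_RDONLY/O_WRONLY/O_RDWR). */
static_assert((SKETCH_DELAY_CREATE & O_ACCMODE) == 0,
	      "mask must not overlap the access mode");

/* Correct test: every bit of the mask has to be present. */
static inline bool sketch_is_delay_create(unsigned int flags)
{
	return (flags & SKETCH_DELAY_CREATE) == SKETCH_DELAY_CREATE;
}

/* Naive test, shown for contrast: fires when only one of the bits is set. */
static inline bool sketch_any_bit_set(unsigned int flags)
{
	return (flags & SKETCH_DELAY_CREATE) != 0;
}

/* Clear the mask only when it is fully set, so a lone O_NOCTTY survives. */
static inline unsigned int sketch_clear_delay_create(unsigned int flags)
{
	if (sketch_is_delay_create(flags))
		flags &= ~(unsigned int)SKETCH_DELAY_CREATE;
	return flags;
}
```

An open that passes only O_NOCTTY is left alone by the full-mask test and
by the clear helper, whereas the naive test would treat it as a delayed
create.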

I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
ioctl() on the minimally-opened fd and is close to what is needed,
but that doesn't allow specifying the actual read or write mode for
the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
to be set after the file is opened.
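
The fcntl(F_SETFL) limitation can be checked from user space.  This is a
minimal sketch, not Lustre code: it uses the read end of a pipe as a
convenient O_RDONLY descriptor, and the sketch_* function names are
hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to switch an O_RDONLY descriptor to O_RDWR with F_SETFL and return
 * the access mode reported afterwards, or -1 if a call fails.
 */
static inline int sketch_accmode_after_setfl(void)
{
	int fds[2];
	int mode = -1;

	if (pipe(fds) != 0)
		return -1;

	/* fds[0] is the read end, which is opened O_RDONLY. */
	if (fcntl(fds[0], F_SETFL, O_RDWR) == 0) {
		int fl = fcntl(fds[0], F_GETFL);

		if (fl != -1)
			mode = fl & O_ACCMODE;
	}

	close(fds[0]);
	close(fds[1]);
	return mode;
}

/* F_SETFL ignores the access mode bits, so the mode stays O_RDONLY. */
static inline void sketch_check_accmode(void)
{
	assert(sketch_accmode_after_setfl() == O_RDONLY);
}
```

The call itself succeeds, but the access mode bits in the argument are
ignored, which is why the read/write mode has to be given at open time.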

Lustre-change: http://review.whamcloud.com/8312
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4209
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Peng Tao <bergwolf@gmail.com>
---
 .../lustre/lustre/include/lustre/lustre_user.h     |   12 ++++------
 drivers/staging/lustre/lustre/include/lustre_mdc.h |   11 ++++++++++
 drivers/staging/lustre/lustre/llite/file.c         |   21 ++++++++++---------
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    2 +-
 4 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 6b6c0240..91f22a8 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -265,13 +265,11 @@ struct ost_id {
 
 #define MAX_OBD_NAME 128 /* If this changes, a NEW ioctl must be added */
 
-/* Hopefully O_LOV_DELAY_CREATE does not conflict with standard O_xxx flags.
- * Previously it was defined as 0100000000 and conflicts with FMODE_NONOTIFY
- * which was added since kernel 2.6.36, so we redefine it as 020000000.
- * To be compatible with old version's statically linked binary, finally we
- * define it as (020000000 | 0100000000).
- * */
-#define O_LOV_DELAY_CREATE      0120000000
+/* Define O_LOV_DELAY_CREATE to be a mask that is not useful for regular
+ * files, but are unlikely to be used in practice and are not harmful if
+ * used incorrectly.  O_NOCTTY and FASYNC are only meaningful for character
+ * devices and are safe for use on new files (See LU-812, LU-4209). */
+#define O_LOV_DELAY_CREATE	(O_NOCTTY | FASYNC)
 
 #define LL_FILE_IGNORE_LOCK     0x00000001
 #define LL_FILE_GROUP_LOCKED    0x00000002
diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
index c1e0270..468f363 100644
--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
@@ -166,6 +166,17 @@ void it_clear_disposition(struct lookup_intent *it, int flag);
 void it_set_disposition(struct lookup_intent *it, int flag);
 int it_open_error(int phase, struct lookup_intent *it);
 
+static inline bool cl_is_lov_delay_create(unsigned int flags)
+{
+	return (flags & O_LOV_DELAY_CREATE) == O_LOV_DELAY_CREATE;
+}
+
+static inline void cl_lov_delay_create_clear(unsigned int *flags)
+{
+	if ((*flags & O_LOV_DELAY_CREATE) == O_LOV_DELAY_CREATE)
+		*flags &= ~O_LOV_DELAY_CREATE;
+}
+
 /** @} mdc */
 
 #endif
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index c12821a..dc9da77 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -671,14 +671,13 @@ restart:
 
 	ll_capa_open(inode);
 
-	if (!lli->lli_has_smd) {
-		if (file->f_flags & O_LOV_DELAY_CREATE ||
-		    !(file->f_mode & FMODE_WRITE)) {
-			CDEBUG(D_INODE, "object creation was delayed\n");
-			GOTO(out_och_free, rc);
-		}
+	if (!lli->lli_has_smd &&
+	    (cl_is_lov_delay_create(file->f_flags) ||
+	     (file->f_mode & FMODE_WRITE) == 0)) {
+		CDEBUG(D_INODE, "object creation was delayed\n");
+		GOTO(out_och_free, rc);
 	}
-	file->f_flags &= ~O_LOV_DELAY_CREATE;
+	cl_lov_delay_create_clear(&file->f_flags);
 	GOTO(out_och_free, rc);
 
 out_och_free:
@@ -1381,23 +1380,25 @@ int ll_lov_setstripe_ea_info(struct inode *inode, struct file *file,
 		ccc_inode_lsm_put(inode, lsm);
 		CDEBUG(D_IOCTL, "stripe already exists for ino %lu\n",
 		       inode->i_ino);
-		return -EEXIST;
+		GOTO(out, rc = -EEXIST);
 	}
 
 	ll_inode_size_lock(inode);
 	rc = ll_intent_file_open(file, lum, lum_size, &oit);
 	if (rc)
-		GOTO(out, rc);
+		GOTO(out_unlock, rc);
 	rc = oit.d.lustre.it_status;
 	if (rc < 0)
 		GOTO(out_req_free, rc);
 
 	ll_release_openhandle(file->f_dentry, &oit);
 
- out:
+out_unlock:
 	ll_inode_size_unlock(inode);
 	ll_intent_release(&oit);
 	ccc_inode_lsm_put(inode, lsm);
+out:
+	cl_lov_delay_create_clear(&file->f_flags);
 	return rc;
 out_req_free:
 	ptlrpc_req_finished((struct ptlrpc_request *) oit.d.lustre.it_data);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 91f6876..5b9f371 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -197,7 +197,7 @@ static __u64 mds_pack_open_flags(__u64 flags, __u32 mode)
 	if (flags & FMODE_EXEC)
 		cr_flags |= MDS_FMODE_EXEC;
 #endif
-	if (flags & O_LOV_DELAY_CREATE)
+	if (cl_is_lov_delay_create(flags))
 		cr_flags |= MDS_OPEN_DELAY_CREATE;
 
 	if (flags & O_NONBLOCK)
-- 
1.7.3.4



* Re: [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict
  2014-02-03 23:07 [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict Andreas Dilger
@ 2014-02-04  0:09 ` Dilger, Andreas
  2014-02-04  4:10 ` Greg KH
  1 sibling, 0 replies; 4+ messages in thread
From: Dilger, Andreas @ 2014-02-04  0:09 UTC
  To: greg@kroah.com; +Cc: linux-fsdevel@vger.kernel.org, Drokin, Oleg, Peng Tao

On 2014/02/03, 4:07 PM, "Andreas Dilger" <andreas.dilger@intel.com> wrote:
>In kernel 3.11 O_TMPFILE was introduced, but the open flag value
>conflicts with the O_LOV_DELAY_CREATE flag 020000000 previously used
>by Lustre-aware applications.  O_LOV_DELAY_CREATE allows applications
>to defer file layout and object creation from open time (the default)
>until it can instead be specified by the application using an ioctl.
>
>Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
>or define a Lustre-specific flag that isn't of use to most/any other
>filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flags
>are not meaningful for newly-created regular files and should be
>OK since O_LOV_DELAY_CREATE is only meaningful for new files.
>
>I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
>ioctl() on the minimally-opened fd and is close to what is needed,
>but that doesn't allow specifying the actual read or write mode for
>the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
>to be set after the file is opened.

A few extra comments here that I wasn't sure should be in the commit
comment.

The main goal of the O_LOV_DELAY_CREATE flag is to allow the file to be
opened in a "preliminary" manner, so that the application can specify the
layout of the file across the Lustre storage targets.  For example, the
app may have millions of separate files, each one written to a single
server, or a single huge file spread across all of the servers, or some
combination of the two, and the layout may be RAID-0, RAID-1, or whatever.

I'm open to a separate flag value for this, but I don't know if that is
of much interest to other filesystems since they don't have similar needs.
We want to avoid needing lots of syscalls to do this, since they
translate into multiple RPCs, which we want to avoid when creating
potentially millions of files over the network.

Cheers, Andreas

-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division




* Re: [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict
  2014-02-03 23:07 [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict Andreas Dilger
  2014-02-04  0:09 ` Dilger, Andreas
@ 2014-02-04  4:10 ` Greg KH
  2014-02-04 12:13   ` Bernd Schubert
  1 sibling, 1 reply; 4+ messages in thread
From: Greg KH @ 2014-02-04  4:10 UTC
  To: Andreas Dilger; +Cc: linux-fsdevel, Oleg Drokin, Peng Tao

On Mon, Feb 03, 2014 at 04:07:57PM -0700, Andreas Dilger wrote:
> In kernel 3.11 O_TMPFILE was introduced, but the open flag value
> conflicts with the O_LOV_DELAY_CREATE flag 020000000 previously used
> by Lustre-aware applications.  O_LOV_DELAY_CREATE allows applications
> to defer file layout and object creation from open time (the default)
> until it can instead be specified by the application using an ioctl.
> 
> Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
> or define a Lustre-specific flag that isn't of use to most/any other
> filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flags
> are not meaningful for newly-created regular files and should be
> OK since O_LOV_DELAY_CREATE is only meaningful for new files.
> 
> I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
> ioctl() on the minimally-opened fd and is close to what is needed,
> but that doesn't allow specifying the actual read or write mode for
> the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
> to be set after the file is opened.
> 
> Lustre-change: http://review.whamcloud.com/8312
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4209
> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
> Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: Peng Tao <bergwolf@gmail.com>
> ---
>  .../lustre/lustre/include/lustre/lustre_user.h     |   12 ++++------
>  drivers/staging/lustre/lustre/include/lustre_mdc.h |   11 ++++++++++
>  drivers/staging/lustre/lustre/llite/file.c         |   21 ++++++++++---------
>  drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    2 +-
>  4 files changed, 28 insertions(+), 18 deletions(-)

As you are abusing well-known file flags, I can't accept any patch that
doesn't get accepted by the linux-fsdevel developers as well.  Please
repost this there and cc: me for the fun discussion :)

thanks,

greg k-h


* Re: [PATCH] staging/lustre/llite: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict
  2014-02-04  4:10 ` Greg KH
@ 2014-02-04 12:13   ` Bernd Schubert
  0 siblings, 0 replies; 4+ messages in thread
From: Bernd Schubert @ 2014-02-04 12:13 UTC
  To: Greg KH, Andreas Dilger; +Cc: linux-fsdevel, Oleg Drokin, Peng Tao

On 02/04/2014 05:10 AM, Greg KH wrote:
> On Mon, Feb 03, 2014 at 04:07:57PM -0700, Andreas Dilger wrote:
>> In kernel 3.11 O_TMPFILE was introduced, but the open flag value
>> conflicts with the O_LOV_DELAY_CREATE flag 020000000 previously used
>> by Lustre-aware applications.  O_LOV_DELAY_CREATE allows applications
>> to defer file layout and object creation from open time (the default)
>> until it can instead be specified by the application using an ioctl.
>>
>> Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
>> or define a Lustre-specific flag that isn't of use to most/any other
>> filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flags
>> are not meaningful for newly-created regular files and should be
>> OK since O_LOV_DELAY_CREATE is only meaningful for new files.
>>
>> I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
>> ioctl() on the minimally-opened fd and is close to what is needed,
>> but that doesn't allow specifying the actual read or write mode for
>> the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
>> to be set after the file is opened.
>>
>> Lustre-change: http://review.whamcloud.com/8312
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4209
>> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
>> Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
>> Signed-off-by: Peng Tao <bergwolf@gmail.com>
>> ---
>>   .../lustre/lustre/include/lustre/lustre_user.h     |   12 ++++------
>>   drivers/staging/lustre/lustre/include/lustre_mdc.h |   11 ++++++++++
>>   drivers/staging/lustre/lustre/llite/file.c         |   21 ++++++++++---------
>>   drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    2 +-
>>   4 files changed, 28 insertions(+), 18 deletions(-)
>
> As you are abusing well-known file flags, I can't accept any patch that
> doesn't get accepted by the linux-fsdevel developers as well.  Please
> repost this there and cc: me for the fun discussion :)
>

I think it makes sense anyway to make this flag a generic feature, since
delayed file layout creation solves the splice-copy-offload issue for
most (all?) object storage file systems.

https://lkml.org/lkml/2013/9/30/421


Thanks,
Bernd


