All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tommi Virtanen <tommi.virtanen@dreamhost.com>
To: Yehuda Sadeh Weinraub <yehudasa@gmail.com>
Cc: Gregory Farnum <gregory.farnum@dreamhost.com>,
	ceph-devel@vger.kernel.org
Subject: Re: long object names
Date: Thu, 21 Apr 2011 16:07:44 -0700	[thread overview]
Message-ID: <20110421230744.GC26545@dreamer> (raw)
In-Reply-To: <BANLkTikCG487GB=n+9eT+66hxBVk_gfwqw@mail.gmail.com>

On Thu, Apr 21, 2011 at 03:25:35PM -0700, Yehuda Sadeh Weinraub wrote:
> Yeah, we're well aware of those races. Note that splitting to
> subdirectories is racey too. Imagine one thread/process creating an
> object, while the other one removing a similar object with the same
> prefix. The first one tries to create a subtree, while the other is
> trying to remove the same subtree. I've seen these issues before,
> they're real.

Yup, that's why I said there's a rmdir/mkdir race. You can fix that
two ways:

1. Don't rmdir; there's not going to be that much junk there
   (punting it, but not badly; no harm done, just littering).

2. Make the mkdir & create file case just handle the race; all you
   need is a simple retry loop, there's no problems and the races
   can't cause actual harm.

   And more to the point, this is the only kind of race there is.
   If FileStore needs to support arbitrary rename etc operations,
   they all need this same retry loop, but it's still just the
   same retry loop, and can probably put in a nice utility function.

   *There are no other kinds of races*, and it seems FileStore doesn't
   really do renames etc anyway.



// try to create a file, using the dynamic dirs trick for long
// filenames. note that this is only needed for file creation; opening
// an existing file needs no mkdir trickery. overwrites pathname,
// returns fd or <0 on errors. pathname is relative to dirfd.
int really_create(int dirfd, char *pathname, int flags, mode_t mode) {
  int ret;

  // split into leading path and base filename
  const char *filename = strrchr(pathname, '/');

  if (!filename) {
    // pathname has no slashes, safe to just open
    return openat(dirfd, pathname, flags, mode);
  }

  // nul terminate leading path
  filename = '\0';
  // move from slash to actual filename
  filename++;

  // go through leading prefixes and mkdir them
 retry:
  char *cursor = pathname;
  while (1) {
    printf("cursor=%p %s\n", cursor, cursor);
    cursor = strchr(cursor, '/');
    if (!cursor)
      break;
    // terminate the string here temporarily, mkdir that
    *cursor = '\0';
    ret = mkdirat(dirfd, pathname, 0755);
    // restore the slash so we don't forget
    *cursor = '/';
    // and nudge us past the slash
    cursor++;
    if (ret < 0) {
      switch (errno) {
      case EEXIST:
	// it already exists; ignore
	break;
      case ENOENT:
	// somebody rmdir'd a parent path; retry from the top
	goto retry;
      default:
	return -errno;
      }
    }
    // loop back to find the next slash and mkdir that
  }

  // leading path is created (unless we lost a race just now); now do
  // the file operation
  ret = openat(dirfd, pathname, flags, mode);
  if (ret<0) {
    switch (errno) {
    case ENOENT:
      // it seems we lost a race at the last second; do mkdirs again
      goto retry;
    default:
      return -errno;
    }
  }
  return ret;
}


-- 
:(){ :|:&};:

  reply	other threads:[~2011-04-21 23:07 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-21  4:42 long object names Sage Weil
2011-04-21 18:56 ` Tommi Virtanen
2011-04-21 19:27   ` Colin McCabe
2011-04-21 19:32     ` Tommi Virtanen
2011-04-21 20:03       ` Gregory Farnum
2011-04-21 21:09         ` Colin McCabe
2011-04-21 21:23           ` Yehuda Sadeh Weinraub
2011-04-21 21:44             ` Colin McCabe
2011-04-21 21:54               ` Yehuda Sadeh Weinraub
2011-04-21 22:01                 ` Colin McCabe
2011-04-21 22:58                   ` Zenon Panoussis
2011-04-21 23:04                     ` Yehuda Sadeh Weinraub
2011-04-21 22:00         ` Tommi Virtanen
2011-04-21 22:23           ` Gregory Farnum
2011-04-21 22:25           ` Yehuda Sadeh Weinraub
2011-04-21 23:07             ` Tommi Virtanen [this message]
2011-04-22 15:44               ` Sage Weil
2011-04-22 16:34                 ` Tommi Virtanen
2011-04-22 17:36                 ` Colin McCabe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110421230744.GC26545@dreamer \
    --to=tommi.virtanen@dreamhost.com \
    --cc=ceph-devel@vger.kernel.org \
    --cc=gregory.farnum@dreamhost.com \
    --cc=yehudasa@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.