From: Loic Dachary <ldachary@redhat.com>
To: Ken Dreyer <kdreyer@redhat.com>
Cc: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Backporting stability fixes for ceph-disk
Date: Tue, 2 Feb 2016 12:53:58 +0700
Message-ID: <56B04476.5020203@redhat.com>

Hi Ken,

https://github.com/ceph/ceph/pull/6926 and https://github.com/ceph/ceph/pull/5999 fixed a number of stability problems related to the udev / ceph-disk / init system code path. After a few weeks with no surprising failures when running the ceph-disk teuthology suite, I'm now convinced that we have something stable. I'm not saying all problems have been found and fixed, but at least we have something stable and repeatable to work with. I've used it as a base for a partial refactor of ceph-disk to support Bluestore ( https://github.com/ceph/ceph/pull/7218 ).

I think all stability fixes have been backported to infernalis ( https://github.com/ceph/ceph/pull/7001 etc. ). Unfortunately backporting to hammer can't be done by cherry-picking commits from https://github.com/ceph/ceph/pull/6926 and https://github.com/ceph/ceph/pull/5999. In hammer things go like this:

    * ceph-disk prepare
    * triggers a udev event
    * udev action runs ceph-disk activate
    * ceph-disk activate runs ceph-osd via the init system

In infernalis Sage implemented an intermediate step so that the udev action does as little as possible:

    * ceph-disk prepare
    * triggers a udev event
    * udev action asynchronously delegates activation to the init system
    * the init system runs ceph-disk activate
    * ceph-disk activate runs ceph-osd via the init system

This helps with stability for two reasons: ceph-disk activate may itself trigger udev events, which is not recommended when running as a child process of a udev action, and ceph-disk activate may take minutes to complete in some cases. Backporting this logic to hammer would require shipping new init files (a ceph-disk unit for systemd, for instance) and new udev rules (calling ceph-disk trigger instead of ceph-disk activate, to add the delegation step).
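
To make the difference between the two flows concrete, here is a rough Python model of the delegation step. It is only an illustration, not the actual udev or init system code: InitSystemStandIn, hammer_style_handler and infernalis_style_handler are hypothetical names, and a worker thread stands in for the init system.

```python
import queue
import threading


def hammer_style_handler(device, activate):
    # Hammer-like flow: the udev action itself runs the (possibly slow)
    # activation and does not return until it has finished.
    activate(device)


class InitSystemStandIn:
    """Hypothetical stand-in for the init system.

    Activation requests are queued and executed by a worker thread,
    outside of the udev action that requested them.
    """

    def __init__(self, activate):
        self._jobs = queue.Queue()
        self._activate = activate
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def trigger(self, device):
        # Only records the request, so the caller returns immediately.
        self._jobs.put(device)

    def wait(self):
        # Block until every queued activation has been processed.
        self._jobs.join()

    def _run(self):
        while True:
            device = self._jobs.get()
            try:
                self._activate(device)
            finally:
                self._jobs.task_done()


def infernalis_style_handler(device, init_system):
    # Infernalis-like flow: the udev action does as little as possible
    # and delegates the activation.
    init_system.trigger(device)
```

In this model the infernalis-style handler returns as soon as the request is queued, so a slow activation, or one that generates further events, no longer runs as a child of the udev action.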

The conservative approach to the problem would be to cherry-pick what we can ( https://github.com/dachary/ceph/commit/9dce05a8cdfc564c5162885bbb67a04ad7b95c5a for instance ) and document known side effects of ceph-disk instability so people know it's an annoyance but nothing destructive or blocking. In the worst case scenario, deactivating the udev rules and running ceph-disk prepare + ceph-disk activate manually or by writing a script that does things sequentially is a viable workaround.
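
A minimal sketch of such a sequential script, assuming the udev rules have already been deactivated by other means. The device paths are hypothetical examples, the data partition is passed explicitly rather than guessed, and the helper names are made up for illustration.

```python
import subprocess


def build_commands(disk, data_partition):
    # The two steps, in the order they must run. Nothing else is expected
    # to activate the OSD once the udev rules are deactivated.
    return [
        ["ceph-disk", "prepare", disk],
        ["ceph-disk", "activate", data_partition],
    ]


def prepare_then_activate(disk, data_partition, run=subprocess.check_call):
    # Run each step to completion before starting the next one.
    # subprocess.check_call raises CalledProcessError on a non-zero exit
    # status, so activate is never attempted after a failed prepare.
    for command in build_commands(disk, data_partition):
        run(command)


if __name__ == "__main__":
    # Hypothetical devices: adjust to the disk being provisioned.
    prepare_then_activate("/dev/sdb", "/dev/sdb1")
```

The run parameter only exists so the command sequence can be checked without touching a real disk.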

The better approach would be to backport the udev / init system changes together with most of ceph-disk as it stands in infernalis. Not only would that solve the problems we know about, but it would also give us solid ground for fixing future problems. Unfortunately it is, IMHO, too much of a risk at this stage of the hammer release.

I'm quite conflicted about how to approach this in a sane way and your input would be most valuable.

Cheers


