From: Lennart Poettering <lennart@poettering.net>
To: NeilBrown <neilb@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Andrey Borzenkov <arvidjaar@mail.ru>,
	Tomasz Torcz <tomek@pipebreaker.pl>,
	systemd-devel@lists.freedesktop.org, linux-raid@vger.kernel.org
Subject: Re: [systemd-devel] systemd kills mdmon if it was started manually by user
Date: Mon, 7 Nov 2011 13:00:49 +0100
Message-ID: <20111107120048.GD25130@tango.0pointer.de>
In-Reply-To: <20111107135241.64ae261d@notabene.brown>

On Mon, 07.11.11 13:52, NeilBrown (neilb@suse.de) wrote:

> > Why doesn't the kernel do that on its own?
> 
> Because the kernel doesn't know about the format of the metadata that
> describes the array.

Yup, my suggestion would be to change that.

> > What we do right now is this:
> > 
> > kill_all_processes();
> > do {
> >      umount_all_file_systems_we_can();
> >      read_only_mount_all_remaining_file_systems();
> > } while (we_had_some_success_with_that());
> > jump_into_initrd();
> > 
> > As long as mdmon references a file from the root disk we cannot umount
> > it, so the loop wouldn't be effective.
> 
> What exactly is "kill_all_processes()"?   is it SIGTERM or SIGKILL or both
> with a gap or ???

SIGTERM followed by SIGKILL after 5s if the programs do not react to
that in time. But note that this logic only applies to processes which
for some reason managed to escape systemd's usual cgroup-based killing
logic. Normal services are hence already killed at that time, and only
processes which moved themselves out of any cgroup or for which the
service files disabled killing might survive to this point.
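
A minimal sketch of the SIGTERM-then-SIGKILL escalation described above. It is
not systemd's code: it is simplified to a single direct child (so waitpid() can
be used), and the function name and the grace_ms parameter are made up for
illustration. The 5 s grace period from the text would be grace_ms = 5000.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: ask a child to exit with SIGTERM, give it up to
 * grace_ms milliseconds, then fall back to SIGKILL.
 * Returns SIGTERM if the polite request was enough, SIGKILL if the
 * fallback was needed, or -1 on error. */
static int terminate_then_kill(pid_t pid, int grace_ms)
{
    /* pid 0 and negative pids address process groups; refuse them here. */
    assert(pid > 0);

    if (kill(pid, SIGTERM) < 0)
        return -1;

    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, NULL, WNOHANG);
        if (r == pid)
            return SIGTERM;   /* child exited within the grace period */
        if (r < 0)
            return -1;
        nanosleep(&step, NULL);
    }

    /* Grace period expired: SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) < 0)
        return -1;
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return SIGKILL;
}
```

A process that handles SIGTERM promptly never sees the second signal; one that
ignores it is killed once the grace period runs out.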

> I assume a SIGKILL.  I don't mind a SIGTERM and it could be useful to
> expedite mdmon cleaning up.
> 
> However there is an important piece missing.  When you remount,ro a
> filesystem, the block device doesn't get told so it thinks it is still open
> read/write.  So md cannot tell mdmon that the array is now read-only.
> It would make a lot of sense for mdmon to exit after receiving a SIGTERM as
> soon as the device is marked read-only.  But it just doesn't know.

As mentioned by Kay, you can get notifications for this by poll()ing on
/proc/self/mountinfo. Note again however, that we kill first, and only
then try to unmount/remount.
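
The poll() approach mentioned above can be sketched roughly as follows. This is
only an illustration: the helper names are invented, and it relies on the Linux
behaviour that a change to the mount table is reported on an open
/proc/self/mountinfo descriptor as POLLERR|POLLPRI.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>   /* close(), for callers of open_mountinfo() */

/* Hypothetical helper: open the mount table of the calling process. */
static int open_mountinfo(void)
{
    return open("/proc/self/mountinfo", O_RDONLY);
}

/* Hypothetical helper: wait up to timeout_ms for a mount table change.
 * Returns 1 if a change was signalled, 0 on timeout, -1 on error. */
static int wait_for_mount_change(int fd, int timeout_ms)
{
    /* The file is always readable, so only ask for POLLPRI;
     * POLLERR is reported regardless of the requested events. */
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };

    int r = poll(&pfd, 1, timeout_ms);
    if (r < 0)
        return -1;
    if (r == 0)
        return 0;
    return (pfd.revents & (POLLERR | POLLPRI)) ? 1 : 0;
}
```

A daemon would call wait_for_mount_change() in its main loop and, on a return
value of 1, re-read the file from offset 0 to see whether the file system it
cares about is now mounted read-only or gone.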

> We can probably fix that, but that doesn't really help for now.
> 
> I think I would like:
> 
>  - add to the above loop "stop any virtual devices that we can".
>    Exactly how to do that if /proc and /sys are already unmounted
>    is unclear.  Is one or both of these kept around somewhere?

/proc and /sys are not unmounted in this loop. Since they are virtual
API file systems, we exclude them from this logic and leave them around
until the initrd unmounts them, if it wants to.

Actually, in the loop above there are three more steps: in each
iteration we also try to detach all swap devices, all loopback devices
and all DM devices. We probably could add a similar operation for MD
devices here too. But note that this loop is more of a last-resort
thing, and normally shouldn't do much.
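
To illustrate why the loop repeats until nothing succeeds any more, here is a
toy model, not systemd's code. All names are invented: each step reports how
many objects it released, and the loop stops after the first pass that releases
nothing. The toy state is a file system mounted from a loop device whose
backing file lives on another file system, so the outer one only becomes
unmountable in the second pass. An MD step would simply be one more entry in
the step array.

```c
#include <stddef.h>

/* Toy state: inner fs -> loop device -> backing file on outer fs. */
struct toy_state {
    int inner_mounted;
    int loop_attached;
    int outer_mounted;
};

/* A step returns how many objects it managed to release in this pass. */
typedef int (*teardown_step)(struct toy_state *s);

static int step_umount(struct toy_state *s)
{
    if (s->inner_mounted) { s->inner_mounted = 0; return 1; }
    /* The outer fs is busy while the loop device holds its backing file. */
    if (s->outer_mounted && !s->loop_attached) { s->outer_mounted = 0; return 1; }
    return 0;
}

static int step_detach_loop(struct toy_state *s)
{
    /* The loop device is busy while the inner fs is mounted from it. */
    if (s->loop_attached && !s->inner_mounted) { s->loop_attached = 0; return 1; }
    return 0;
}

/* Run all steps repeatedly until a full pass makes no progress.
 * Returns the number of passes, including the final idle one. */
static int run_until_no_progress(const teardown_step *steps, size_t n,
                                 struct toy_state *s)
{
    int passes = 0;
    int progress;

    do {
        progress = 0;
        for (size_t i = 0; i < n; i++)
            progress += steps[i](s);
        passes++;
    } while (progress > 0);

    return passes;
}
```

With everything mounted and attached at the start, this takes three passes: two
that release something and a final one that confirms nothing is left to do.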

>  - allow processes to be marked some way so they get SIGTERM but not
>    SIGKILL.  I'm happy adding magic char to argv[0].

Note that you can configure fairly flexibly how you are killed in the
service files, and that the loop shown above is only a last resort,
applied to all processes, mount points, etc. that are still around
after the normal shutdown.

Here's another attempt at explaining how this works:

<snip>
terminate_all_mount_and_service_units();
kill_all_remaining_processes();
do {
     umount_all_remaining_file_systems_we_can();
     read_only_mount_all_remaining_file_systems();
     detach_all_remaining_loop_devices();
     detach_all_remaining_swap_devices();
     detach_all_remaining_dm_devices();
} while (we_had_some_success_with_that());
jump_into_initrd();
</snip>

You have relatively flexible control over the first step in this code. The
second step is then the hammer that tries to fix up what the first step
didn't accomplish. My suggestion to check argv[0][0] was meant as a way
for a process to avoid the hammer.
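
A rough sketch of what such an argv[0][0] check could look like. The message
does not say which character would be used; '@' below is an assumption chosen
for illustration, and both function names are hypothetical. The first byte of
/proc/<pid>/cmdline is the first byte of the process's argv[0].

```c
#include <stdio.h>
#include <sys/types.h>

/* Assumption for illustration only: the marker character is not
 * specified in the text above. */
#define SPARE_MARKER '@'

/* Returns 1 if argv0 starts with the marker, 0 otherwise. */
static int argv0_has_marker(const char *argv0)
{
    return argv0 != NULL && argv0[0] == SPARE_MARKER;
}

/* Returns 1 if the process is marked, 0 if not, -1 if its command
 * line cannot be read (for example, the process is already gone). */
static int pid_is_marked(pid_t pid)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/cmdline", (long)pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int c = fgetc(f);   /* first byte of argv[0], or EOF if empty */
    fclose(f);

    /* Kernel threads have an empty command line; treat them as unmarked. */
    if (c == EOF)
        return 0;
    return c == SPARE_MARKER;
}
```

The final kill pass would then skip every PID for which pid_is_marked()
returns 1 and signal the rest as before.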

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
