From: Yi Yang <yang.y.yi@gmail.com>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
Matt Helsley <matthltc@us.ibm.com>
Subject: Re: [2.6.18 PATCH]: Filesystem Event Reporter V4
Date: Sun, 08 Oct 2006 22:34:02 +0800 [thread overview]
Message-ID: <45290C5A.1020708@gmail.com> (raw)
In-Reply-To: <4c4443230610080629j33bc3685g8bb22029c390727d@mail.gmail.com>
> I do not say that they are broken, but you in some places you
> access per-cpu
> variuables without turning preemption off. I think some locking or
> preemption tweaks should be done there to explicitly mark critical
> regions.
>
You're right, some places have such issues, I just considered how to
avoid the lock
or atomic operation, Andrew ever mentioned the lock is unacceptible in
file system
code path, so I always avoid the lock or atomic operation.
> > >What prevents from adding another skb into the queue between
> above loop
> > >and check for flag?
> > >
> > before adding a fsevent to the queue, a process will check
> exit_flag, if
> > it is set to 1, that
> > process won't queue the fsevent and return immediately.
>
> But you check for exit_flag in fsevent_commit() without any locks.
>
Only rmmod will set exit_flag, other users are readers, so I think the
lock is unnecessary,
only one issue is that I should clear fsevent_queue in the last
section of fsevent_exit.
>
> > >
> > >Above operation seems racy, what prevents from changing
> missed_refcnt
> > >after it was read?
> > >
> > if the case you said is hit, missed_refcnt must be not equal to
> > missed_refcnt, because they are for the same cpu, so no problem,
> it will
> > be checked
> > in the next work schedule.
>
> Since it is called with disabled preemption it is ok, but in that
> case
> you do not need missed_refcnt to be atomic.
>
in include/linux/fsevent.h, it is possibly accessed from diffrent cpus,
so atomic is necessary.
>
> > >Why are you doing this? It looks wrong, since socket's queue is
> cleaned
> > >automatically.
> > >
> > When I release fsevent_sock, the kernel always printk a message
> which
> > says "sk_rmem_alloc isn't zero",
> > I don't know why, I doubt there are some packets in recieve and
> write
> > queue, so try to free them.
> > but sk_rmem_alloc is always non-zero, so I must set it to 0, the
> kernel
> > doesn't printk.
>
> That means that you broke socket accounting in some way.
> sock_release() should do all cleanup for you.
>
> Each time you add skb into socket queue appropriate socket is
> charged for
> value equal to sizeof(skb)+sizeof(skb_shared_info)+aligned size of
> the data.
> That number is added to the one of the sk_r/wmem_alloc, depending
> on the
> direction of the skb way, skb's destructor is set to the function
> which
> will remove appropiate amount of from above variables.
> When you call sock_release() all skbs are removed and freed, so socket
> accounting is corrected in kfree_skb(), which (if there are no users)
> calls destructor and frees skb and data.
> If you see asserions that above variables are not zero, that means
> that
> you either removed skb from the queue and forgot to free it, or
> freed it
> several times (although it will be likely a crash in this case),
> or you
> overwrote that variables after some memory corruption.
>
maybe that surplus skb_get is the root cause.
>
> > >This is racy.
> > >
> > This doesn't take effect in the normal processing, the work
> kthread will
> > do the real
> > work which will ensure no racy.
>
> Then just remove it, and actually the whole modularity does not
> seems a
> good idea, although it is of course your decision to make design
> static
> or not. I would implement such things with dynamic registration of
> the
> clients and just make fsevent statically built into the kernel.
>
It is hard a bit for the subsystem using the hook mechanism to be
implemented as
a module. In fact, all the newly-added code in this patch is for
modularity. :)
Really that is a way to build as a static infrastructure, the filesystem
init code calls
a fsevent register API to enable it, but unregister is not a trivial,
the syncronization
issue still exists. Nevertheless, this is really is a way I can try.
>
> > >This looks really racy.
> > >What prevents from rescheduling here?
> > >
> > This has disabled the preemption, so it is impossible to reshcedule.
>
> No, put_fsevent_refcnt() andbles it again.
> Or is it disabled on higher layer?
>
I think your "reschedule" means process migration, those code just considers
this issue, missed_refcnt is just for this, start_cpuid is used to
identify the cpu
before migration, end_cpuid is used to identify the cpu after migration, if
start_cpuid is equal to end_cpuid, we can think there is no migration
happened,
otherwise, missed_refcnt[start_cpuid] will increase, because there are
possibly
several prcoesses on different cpus to modify this value, so it is
defined as
atomic.
>
> > >
> > >What prevents change for __raise_fsevent in that function?
> > >
> > If reference count is not -1, rmmod won't change
> __raise_fsevent. the
> > key is two new-added
> > refrence counters.
>
> You do it without preemption disabled and any other locks...
>
Only rmmod will change __raise_fsevent and it will set it to 0 just after
all the filesystem code paths nerver call it, if reference count on anuy cpu
is not -1, rmmod will wait for it until this cpu doesn't call raise_fsevent
any more, rmmod will set it to 0 just after all the reference count on
all the
cpu are -1, so only one user -- rmmod -- is accessing it in that time,
this is
safe.
prev parent reply other threads:[~2006-10-08 14:33 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-09-30 15:24 [2.6.18 PATCH]: Filesystem Event Reporter V4 Yi Yang
2006-10-03 16:47 ` Evgeniy Polyakov
2006-10-04 15:15 ` Yi Yang
2006-10-05 9:41 ` Evgeniy Polyakov
[not found] ` <4c4443230610080629j33bc3685g8bb22029c390727d@mail.gmail.com>
2006-10-08 14:34 ` Yi Yang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=45290C5A.1020708@gmail.com \
--to=yang.y.yi@gmail.com \
--cc=akpm@osdl.org \
--cc=johnpol@2ka.mipt.ru \
--cc=linux-kernel@vger.kernel.org \
--cc=matthltc@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox