Re: [2.6.18 PATCH]: Filesystem Event Reporter V4

From: Yi Yang <yang.y.yi@gmail.com>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>,
	Matt Helsley <matthltc@us.ibm.com>
Subject: Re: [2.6.18 PATCH]: Filesystem Event Reporter V4
Date: Sun, 08 Oct 2006 22:34:02 +0800
Message-ID: <45290C5A.1020708@gmail.com>
In-Reply-To: <4c4443230610080629j33bc3685g8bb22029c390727d@mail.gmail.com>


>     I do not say that they are broken, but in some places you access
>     per-cpu variables without turning preemption off. I think some
>     locking or preemption tweaks should be done there to explicitly
>     mark critical regions.
>
You're right, some places have such issues. I was focused on avoiding
locks and atomic operations: Andrew once mentioned that locking is
unacceptable in the filesystem code path, so I have always avoided locks
and atomic operations.

>     > >What prevents from adding another skb into the queue between
>     above loop
>     > >and check for flag?
>     > >
>     > before adding a fsevent to the queue, a process will check
>     exit_flag, if
>     > it is set to 1, that
>     > process won't queue the fsevent and return immediately.
>
>     But you check for exit_flag in fsevent_commit() without any locks.
>
Only rmmod sets exit_flag; all other users are readers, so I think a
lock is unnecessary. The one issue is that I should clear fsevent_queue
in the last section of fsevent_exit.
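
A minimal userspace sketch of the ordering described above, assuming a
fixed-size array stands in for the skb queue. The names fsevent_enqueue()
and fsevent_exit_sketch() are hypothetical and are not taken from the
patch; only exit_flag and fsevent_queue come from this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 16

atomic_int exit_flag;           /* written only by the unload path */
int fsevent_queue[QUEUE_MAX];   /* stand-in for the real skb queue */
size_t queue_len;

/* Reader side: check exit_flag before queueing anything. */
bool fsevent_enqueue(int event)
{
    if (atomic_load(&exit_flag))
        return false;           /* unload in progress: drop the event */
    if (queue_len == QUEUE_MAX)
        return false;           /* queue full */
    fsevent_queue[queue_len++] = event;
    return true;
}

/* Unload side: set the flag first, clear the queue as the last step.
 * Returns how many queued events were discarded. */
size_t fsevent_exit_sketch(void)
{
    size_t dropped;

    atomic_store(&exit_flag, 1);
    dropped = queue_len;
    queue_len = 0;
    return dropped;
}
```

The sketch is single-threaded and only shows the intended order of the
flag write and the final queue clear. It does not by itself close the
window between a reader's flag check and its enqueue.
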
>
>     > >
>     > >Above operation seems racy, what prevents from changing
>     missed_refcnt
>     > >after it was read?
>     > >
>     > if the case you said is hit, missed_refcnt must be not equal to
>     > missed_refcnt, because they are for the same cpu, so no problem,
>     it will
>     > be checked
>     > in the next work schedule.
>
>     Since it is called with disabled preemption it is ok, but in that
>     case
>     you do not need missed_refcnt to be atomic.
>
In include/linux/fsevent.h, it may be accessed from different CPUs, so
it needs to be atomic.
>
>     > >Why are you doing this? It looks wrong, since socket's queue is
>     cleaned
>     > >automatically.
>     > >
>     > When I release fsevent_sock, the kernel always printks a message
>     > which says "sk_rmem_alloc isn't zero". I don't know why; I suspect
>     > there are some packets in the receive and write queues, so I try
>     > to free them. But sk_rmem_alloc is always non-zero, so I must set
>     > it to 0 so that the kernel doesn't printk.
>
>     That means that you broke socket accounting in some way.
>     sock_release() should do all cleanup for you.
>
>     Each time you add an skb to a socket queue, the appropriate socket
>     is charged a value equal to sizeof(skb)+sizeof(skb_shared_info)+
>     the aligned size of the data. That number is added to one of
>     sk_r/wmem_alloc, depending on the direction of the skb, and the
>     skb's destructor is set to the function which will remove the
>     appropriate amount from the above variables.
>     When you call sock_release() all skbs are removed and freed, so
>     socket accounting is corrected in kfree_skb(), which (if there are
>     no users) calls the destructor and frees the skb and data.
>     If you see assertions that the above variables are not zero, that
>     means that you either removed an skb from the queue and forgot to
>     free it, or freed it several times (although that will likely be a
>     crash), or you overwrote those variables after some memory
>     corruption.
>
Maybe that surplus skb_get is the root cause.
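
To illustrate why an extra reference would leave the counter non-zero,
here is a toy userspace model of the accounting described above. It is
only a sketch: the toy_* types and functions are hypothetical, are
loosely modeled on skb_get() and kfree_skb(), and are not the kernel
implementation.

```c
#include <stdlib.h>

struct toy_sock {
    int rmem_alloc;             /* bytes currently charged to the socket */
};

struct toy_skb {
    int users;                  /* reference count */
    int truesize;               /* amount charged for this buffer */
    struct toy_sock *sk;        /* owning socket, if charged */
};

struct toy_skb *toy_alloc_skb(int truesize)
{
    struct toy_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->users = 1;
    skb->truesize = truesize;
    skb->sk = NULL;
    return skb;
}

/* Charge the buffer to the socket when it is queued. */
void toy_set_owner_r(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    sk->rmem_alloc += skb->truesize;
}

/* Take an extra reference. */
void toy_skb_get(struct toy_skb *skb)
{
    skb->users++;
}

/* Drop a reference; only the last drop uncharges and frees. */
void toy_kfree_skb(struct toy_skb *skb)
{
    if (--skb->users > 0)
        return;
    if (skb->sk)
        skb->sk->rmem_alloc -= skb->truesize;
    free(skb);
}
```

In this model, a buffer that received one extra toy_skb_get() stays
charged after a single toy_kfree_skb(), which is the kind of leftover
that a non-zero sk_rmem_alloc at release time would indicate.
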
>
>     > >This is racy.
>     > >
>     > This doesn't take effect in the normal processing, the work
>     kthread will
>     > do the real
>     > work which will ensure no racy.
>
>     Then just remove it, and actually the whole modularity does not
>     seem a good idea, although it is of course your decision whether
>     to make the design static or not. I would implement such things
>     with dynamic registration of the clients and just make fsevent
>     statically built into the kernel.
>
It is a bit hard for a subsystem that uses the hook mechanism to be
implemented as a module. In fact, all the newly added code in this patch
is there for modularity. :)

Building it as static infrastructure really is an option: the filesystem
init code would call an fsevent register API to enable it. But
unregistering is not trivial; the synchronization issue still exists.
Nevertheless, this really is an approach I can try.
>
>     > >This looks really racy.
>     > >What prevents from rescheduling here?
>     > >
>     > This has disabled preemption, so it is impossible to reschedule.
>
>     No, put_fsevent_refcnt() enables it again.
>     Or is it disabled on higher layer?
>
I think by "reschedule" you mean process migration. That code is written
with exactly this issue in mind, and missed_refcnt exists for it.
start_cpuid identifies the CPU before migration and end_cpuid identifies
the CPU after migration. If start_cpuid equals end_cpuid, we can assume
no migration happened; otherwise missed_refcnt[start_cpuid] is
incremented. Because several processes on different CPUs may modify this
value, it is defined as atomic.
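
The migration check can be sketched in userspace roughly as follows.
This is an illustration under assumptions, not the patch code: get_ref(),
put_ref() and reconcile() are hypothetical names, the idle value is 0
here rather than the -1 used in the patch, and folding the missed count
back in from deferred work is only my reading of "checked in the next
work schedule".

```c
#include <stdatomic.h>

#define NR_CPUS_SKETCH 4

int refcnt[NR_CPUS_SKETCH];                 /* touched only on its own CPU */
atomic_int missed_refcnt[NR_CPUS_SKETCH];   /* may be bumped from any CPU */

/* Enter the hook on start_cpuid. */
void get_ref(int start_cpuid)
{
    refcnt[start_cpuid]++;
}

/* Leave the hook; end_cpuid differs from start_cpuid after a migration. */
void put_ref(int start_cpuid, int end_cpuid)
{
    if (start_cpuid == end_cpuid)
        refcnt[end_cpuid]--;                /* no migration: local update */
    else
        atomic_fetch_add(&missed_refcnt[start_cpuid], 1);
}

/* Deferred work running on 'cpu': fold missed puts into the local count. */
void reconcile(int cpu)
{
    int missed = atomic_exchange(&missed_refcnt[cpu], 0);

    refcnt[cpu] -= missed;
}
```

Only missed_refcnt needs to be atomic in this sketch, because it is the
one counter that a task now running on another CPU may write.
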
>
>     > >
>     > >What prevents change for __raise_fsevent in that function?
>     > >
>     > If the reference count is not -1, rmmod won't change
>     > __raise_fsevent. The key is the two newly added reference
>     > counters.
>
>     You do it without preemption disabled and any other locks...
>
Only rmmod changes __raise_fsevent, and it sets it to 0 only once no
filesystem code path can call it any more. If the reference count on any
CPU is not -1, rmmod waits until that CPU no longer calls raise_fsevent.
rmmod sets __raise_fsevent to 0 only after the reference counts on all
CPUs are -1, so at that time only one user, rmmod, is accessing it. This
is safe.
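
A rough userspace sketch of that unload rule, assuming -1 marks an idle
CPU as described above. hook_install(), hook_try_remove(), sample_raise()
and raise_hook are hypothetical names used for illustration; raise_hook
stands in for __raise_fsevent.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4
#define REF_IDLE (-1)           /* no caller is inside the hook on this CPU */

typedef int (*raise_fn)(int event);

atomic_int cpu_refcnt[NR_CPUS_SKETCH];
_Atomic(raise_fn) raise_hook;   /* stand-in for __raise_fsevent */

/* Hypothetical hook body used only by this sketch. */
int sample_raise(int event)
{
    return event;
}

/* Module load: mark every CPU idle, then publish the hook. */
void hook_install(raise_fn fn)
{
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        atomic_store(&cpu_refcnt[cpu], REF_IDLE);
    atomic_store(&raise_hook, fn);
}

/* Module unload: clear the hook only if every CPU is idle.
 * Returns false when the caller has to wait and retry. */
bool hook_try_remove(void)
{
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        if (atomic_load(&cpu_refcnt[cpu]) != REF_IDLE)
            return false;
    atomic_store(&raise_hook, (raise_fn)0);
    return true;
}
```

The sketch does not stop a new caller from entering between the scan and
the clear; how the patch handles that window is the open question in
this part of the thread.
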

Thread overview: 5+ messages
2006-09-30 15:24 [2.6.18 PATCH]: Filesystem Event Reporter V4 Yi Yang
2006-10-03 16:47 ` Evgeniy Polyakov
2006-10-04 15:15   ` Yi Yang
2006-10-05  9:41     ` Evgeniy Polyakov
     [not found]       ` <4c4443230610080629j33bc3685g8bb22029c390727d@mail.gmail.com>
2006-10-08 14:34         ` Yi Yang [this message]
