Re: another semantic storage system (in userspace)

From: Hans Reiser <reiser@namesys.com>
To: Clay Barnes <clay.barnes@gmail.com>
Cc: reiserfs-list@namesys.com
Subject: Re: another semantic storage system (in userspace)
Date: Thu, 13 Jul 2006 10:38:23 -0700
Message-ID: <44B6850F.9010801@namesys.com>
In-Reply-To: <20060713170626.GE31144@HAL_5000D.tc.ph.cox.net>

Clay Barnes wrote:

>I have been thinking lately that though we certainly need to do 
>cleanup of the various bugs and such relating to the storage layer,
>perhaps now is a good time to review and discuss the plans for the
>semantic layer so that any outstanding concerns can be thoroughly
>discussed and resolved before we get close to time to start with actual
>work on that portion of Reiser4.  Remember, we have a real chance at
>being the first semantic storage system with a significant user base,
>and that places a terrible pressure for perfection on us (and I use 'us'
>loosely, since I don't have nearly the code skills in C needed to dare
>touch source in non-trivial ways---I hope however that between my CS and
>Linguistics degrees, I'll be able to at least contribute some ideas).
>If we're first out of the gate, but we have some significant flaw in
>design, we're deeply endangered.  People will wait for our correction of
>it (which may be impossible if it's a fundamental or debated problem),
>or for another system that has less critical flaws.
>
>These are my critical concerns.  I know some of these have been addressed
>before, but this keeps anything from being skipped under the assumption
>that they've already been resolved.
>1) Scope
>  a) Should the semantic content of files be purely user-defined?
>  
>
Yes.

>  b) Should the full extricable content of a file be read into semantic
>  space?
>  
>
If the user wants that.  The user should configure the auto-indexer he
has selected to work as he desires, and to apply only to the files he
chooses.  By default, indexing should be delayed (for example, until the
repacker runs at night) so that we index only what will be around for a
while.  This is for performance reasons.
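
A minimal sketch of the delayed indexing described above, not Reiser4
code: changed files are queued, and a periodic batch (for instance one
run alongside the nightly repacker) indexes only those that have gone
unchanged for a minimum age and have not been deleted in the meantime.
The names DeferredIndexer, note_change, note_delete, run_batch and
min_age are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DeferredIndexer:
    """Hypothetical sketch: queue changed files, index only the survivors."""
    min_age: float                       # seconds a file must stay unchanged
    index_file: Callable[[str], None]    # the user-selected auto-indexer
    pending: Dict[str, float] = field(default_factory=dict)

    def note_change(self, path: str, now: float) -> None:
        # Re-recording a path resets its age, so busy files keep waiting.
        self.pending[path] = now

    def note_delete(self, path: str) -> None:
        # A file deleted before the batch runs is never indexed at all.
        self.pending.pop(path, None)

    def run_batch(self, now: float) -> List[str]:
        # Meant to be called from a periodic job, e.g. a nightly run.
        ready = sorted(p for p, t in self.pending.items()
                       if now - t >= self.min_age)
        for path in ready:
            self.index_file(path)
            del self.pending[path]
        return ready
```

Short-lived files (build output, temporary files) drop out of the queue
before any indexing work is spent on them, which is the performance
argument for the delay.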

>  c) If so, should there be a separation of the two forms of content?
>  d) How would we address the two in a simple, user-transparent way?
>2) Storage
>  a) How do we store the semantic data so it is very rapidly accessible
>  and easy to update, especially if we decide to use the full textual
>  content of parsable files?
>3) Changes
>  a) Should we instantly index at full capacity changes, or should we
>  queue files needing re-indexing for a very low resource daemon to
>  process?
>  b) If we use the latter, how do we avoid disagreement between newly
>  changed/created files and the semantic actions regarding them while the
>  daemon works?
>  c) If we use the former, how do we minimize the impact of this sudden
>  spike in resources to the user without risking letting the index and
>  data get out of sync?
>4) Portability
>  a) Should we provide a way to export semantic data when archiving to
>  formats which standards prevent from using Reiser4 (such as DVD)?
>  b) How do we handle exports from a partial filesystem, if we decide to
>  provide export capabilities?
>  c) Should we provide the ability to import from competing semantic
>  systems?  Export?
>5) Code revisions
>  a) With emerging formats, updates to formats and the numerous ways
>  file standards change, how do we provide easy addition and updates to
>  the filters we use to index files?
>  b) Should we provide a simple user-editable means to change/augment
>  filters?
>  c) Can these both be resolved by placing the actual filters in
>  userspace/filesystemspace instead of into the code?
>
>I hope I haven't overstepped my relevance, and my apologies if I have,
>but I just wanted to raise some concerns while they are easy to
>address---before the code is started.
>
>Further disclaimer:  I'm at work, so I may have been a little hasty
>writing this (though technically, I'm *supposed* to be researching
>semantic storage systems for our documents, so I'm not really goofing
>off), so there may be errors from my minimal review/revision.
>
>Thanks,
>Clay
