From: "Eric S. Johansson" <esj@eggo.org>
To: Paolo Bonzini <pbonzini@redhat.com>, kvm <kvm@vger.kernel.org>,
Gerd Hoffmann <kraxel@redhat.com>,
Hans De Goede <hdegoede@redhat.com>
Subject: Re: can I make this work… (Foundation for accessibility project)
Date: Tue, 18 Nov 2014 09:51:42 -0500
Message-ID: <546B5CFE.5080205@eggo.org>
In-Reply-To: <546B4E97.7050208@redhat.com>
On 11/18/2014 8:50 AM, Paolo Bonzini wrote:
>
> I'm adding two people who might know.
>
> Do you have any idea what the "magic to pipe data back to the Linux
> host" should look like? Does a normal serial port (COM1 for Windows,
> /dev/ttyS0 for Linux) work?
>
The fine magic comes in three forms: keystroke injection, context
feedback, and exporting UI elements such as microphone level,
recognition correction, and partial-recognition pop-ups into the Linux
environment.
All of these have in common the magic trick of using the isolation of
the Windows environment to provide a single dictation target to
NaturallySpeaking. All of the information necessary for the above
capabilities would pass through this target. Initially, this would be an
ssh session with a command redirecting standard input into whatever
accessibility inputs are available.
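A minimal sketch of what the Linux end of such an ssh pipe might look
like. Everything named here is an assumption for illustration: the
"KEYS <text>" line format, the pump() function, and the inject callback
are hypothetical and are not part of NaturallySpeaking or any existing
tool. The callback stands in for whatever accessibility input is
actually available on the host.

```python
from typing import Callable, TextIO

# Hypothetical line protocol (an assumption, not from this thread):
#   "KEYS <text>"  -> type <text> into the focused Linux application
#   anything else  -> ignored for now (room for context feedback later)
PREFIX = "KEYS "


def pump(stream: TextIO, inject: Callable[[str], None]) -> int:
    """Read lines from stream and hand dictated text to inject().

    Returns the number of lines that were injected.
    """
    count = 0
    for line in stream:
        line = line.rstrip("\n")
        if line.startswith(PREFIX):
            inject(line[len(PREFIX):])
            count += 1
    return count
```

In use, stream would be the standard input of the command started by
the ssh session, and inject would wrap the real keystroke-injection
mechanism; lines that do not match the prefix are silently dropped.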
The host side of this gateway would be responsible for all of the
proper input redirection. In theory, it would even be possible to direct
speech recognition towards two targets depending on the grammar. For
example, in the programming-by-speech environment I'm working on, I would
dictate directly into the editor sometimes and into a secondary window
for focused speech UI action. At no time would my hand touch the mouse.
:-) The switch would happen because of the context set by the speech UI
as a deliberate effect of certain commands.
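The two-target idea can be sketched roughly as follows. This is only an
illustration under stated assumptions: the SpeechRouter class, the
target names, and the switch phrases are all hypothetical, and a real
gateway would take its context from the speech UI's grammar rather than
from literal phrases.

```python
from typing import Callable, Dict


class SpeechRouter:
    """Routes recognized utterances to one of several named targets."""

    def __init__(self, targets: Dict[str, Callable[[str], None]],
                 default: str) -> None:
        self.targets = targets
        self.current = default
        # Hypothetical command phrases that change the active target.
        self.switches: Dict[str, str] = {}

    def add_switch(self, phrase: str, target: str) -> None:
        """Register a phrase that makes `target` the active target."""
        if target not in self.targets:
            raise KeyError(target)
        self.switches[phrase] = target

    def handle(self, utterance: str) -> None:
        """Switch context on a command phrase, otherwise forward text."""
        if utterance in self.switches:
            self.current = self.switches[utterance]
        else:
            self.targets[self.current](utterance)
```

A command phrase only changes the context; everything else goes to
whichever target is active, so no mouse or focus change is needed.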
--- longer ramble about speech and nuance issues. ---
Being a crip who's trying to write code with speech, it's not going to
be fast. Once I get basic keystroke injection working, it will be
good enough to continue developing my programming-by-speech environment.
But to discuss that would go down the rathole of current models of
speech user interfaces, why they don't work, things you shouldn't do such
as speaking the keyboard, intentional automation, contextual grammars and
a host of other things I've spent the past 15 years learning about and
figuring out how to change. By the way, that knowledge and
passion is why I've started a consulting practice that focuses on
improving user experience/user interfaces starting from the intent of
the user and the perspective of a disabled person, with the result being
an improved UI for everybody.
The hardest part is going to be everything except keystroke injection.
This is because the rest requires special knowledge that Nuance is loath
to give up. I don't get it. Nuance touts, and gets federal benefits for,
producing something that is "Section 508 compliant", yet the only way
it could be considered an accessibility tool is if you do nothing but
write in Microsoft Word. I worked for a Dragon reseller for a while
with medical record systems, and Nuance doesn't even make an attempt to
speech-enable the medical record environment. They have people
using a couple of solutions that don't work well and effectively provide
no UI automation[1] tied into speech commands.
A bunch of us techno crips have built environments that greatly enhance
the range of solutions NaturallySpeaking could be used for, but Nuance
won't talk to us, won't give us any documentation to keep things running
on our own, won't sell us the documentation either and, worst of all,
they have written terms into the AUP designed to bar extensions like our
environment unless you buy the most expensive version of
NaturallySpeaking available.
And did I mention that they have many bugs that are a significant
problem for every user, not to mention for the scripts? The last time I
checked, it cost about $10 to report a bug (support call cost) and
then there's no guarantee they'll ever fix it. In version 13, I'm seeing
bugs that have been around since version 7 or 8.
I will do what I can to implement the magic and when I get stumped,
then, I'll figure out what I'm going to do technically and politically.
--- eric
[1] This is kind of a lie. They have the tools to let you navigate
blindly through an application (i.e. hit 15 tabs, two down arrows, and a
mouse click, and it might end up in the right UI element to do something).
Unfortunately, they do not have anything to make it predictable,
repeatable or able to survive revisions in the user interface. But this
is one of those rat holes I said I wouldn't go down.