Voice Recognition - This article was
originally published in the Campbell Law Observer, a monthly legal
newsletter published by the Campbell University School of Law in Buies
Creek, N.C. To subscribe, contact Shannon Vandiver at (910) 893-1798.
Voice Recognition
What it can, and still can’t, do for you
Hello to all. I trust everyone made it to the
year 2000 safe and sound. Y2K has done whatever damage it will do and
civilization has survived. Now that we are sure technology will continue
to dominate our near future, I want to caution you to scrutinize your
documents and programs closely in the next few weeks. Y2K can affect
computers in insidious ways as well as shut them down completely. Make
sure calculations are verified in spreadsheets, accounting and real
estate programs. Any program that prorates or calculates based on a time
period could be affected. IT IS NOT SUFFICIENT THAT THE PROGRAM
CONTINUES TO OPERATE! The question is whether it is operating properly.
Hopefully, the answer is yes for all of you.
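To make the point concrete, here is a minimal sketch of how a proration can keep running and still be wrong. The "every two-digit year is 19xx" rule, the function names, and the dollar figures are assumptions made up for illustration; they are not taken from any particular product.

```python
from datetime import date

def naive_year(two_digit_year):
    # Hypothetical legacy rule: every two-digit year is assumed to be 19xx.
    return 1900 + two_digit_year

def prorate(annual_amount, start, end):
    # Share of an annual amount for the days from start to end, 365-day basis.
    days = (end - start).days
    return round(annual_amount * days / 365, 2)

start = date(1999, 12, 1)
end_correct = date(2000, 1, 31)
end_misread = date(naive_year(0), 1, 31)  # stored as "00", read back as 1900

correct = prorate(1200.00, start, end_correct)
broken = prorate(1200.00, start, end_misread)

print(correct)  # a small positive amount for 61 days
print(broken)   # a large negative amount; the program did not crash
```

Neither call produces an error message. Only a comparison against a figure worked out by hand shows that the second result is nonsense.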
Voice recognition programs are very resource
intensive; that is, they need a LOT of computer power to
operate properly. Development has been, and still is, hampered by the
relative lack of power available in the personal computer. We have come
a long way but computers still do not begin to approach the power of an
infant child’s brain. With that said, there is a place in today’s
law office for voice recognition software, as long as you understand its
limits and costs.
Remember our mantra:
Why do I need it?
What does it do?
How does it work?
When should I implement it?
Who should be using it?
Where do I obtain it?
Why do I need it?
You may not need it, yet. Voice recognition
will continue to evolve as computers become more powerful. It will be
gradually integrated into every computer program available. For this
reason alone, it is important you do not casually dismiss it as a
gimmick or luxury.
Windows arrived and many resisted. If you are not in Windows by now you
are losing money. Why? Because there are software programs available to
Windows users that are not available to DOS or Linux users. Software is
the path to ROI (Return On Investment).
Hardware is analogous to the roads, bridges, and highways across the
nation. Software is analogous to the vehicle that takes advantage of it
and makes it valuable. Without the vehicle, the roads become mere
decoration.
Like the vehicles in which we have traveled, software has evolved from
walking to the equivalent of the early automobile. Software has not yet
achieved the equivalent of air travel or space travel. Voice recognition
will be the software equivalent of man’s leap into air travel. First,
it will provide access to the power of computers to those who cannot
type. Eventually, it will do the same for everyone. The science fiction
of talking with computers is only a decade or so away.
Still, since the first effect will be to give access to those who want
to use computers efficiently but cannot type efficiently, those who can
type will not benefit to a large degree over the next few years. Of
course, there is the specter of carpal tunnel syndrome to worry about
and this may be reason enough to make it available to staff who
otherwise may not need it.
Another aspect of the transition to voice can be seen by looking at the
mouse. As Windows software developed, the integration of the mouse into
software operation rose to an entirely new level. Whereas the mouse was
an afterthought to the keyboard in DOS development, the mouse has now
become the primary method to control programs in the Windows world. The
keyboard is used only for data entry by a majority of computer users
today.
The same thing will happen as voice gradually moves into mainstream
usage. The mouse (and keyboard) will eventually take a backseat to the
power of voice command.
Now you may be wondering, “Gee, if I can use my voice to run the
computer then everything will be easy to operate. I can just wait for
the day when the computer talks to me.” WRONG. Unfortunately, it will
take some amount of time until computers can truly carry on a
conversation with humans. Until then, the power of voice will be an
advantage but will require a little bit of work on your part.
What does it do?
Voice recognition (VR) software takes sounds
and attempts to match them up with a standardized version of a language.
When you say “well”, the computer takes the raw sound, compares it
to a dictionary of recorded American English words and tries to match it
up, all in about 1/100th of a second. As you can imagine, as the
dictionary becomes larger, the computer needs more power.
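The matching step can be pictured with a toy sketch. It assumes each sound has already been reduced to a short list of numbers; the three-number "sounds" and the tiny dictionary below are invented for illustration, and real products use far richer models than a simple closest-match search.

```python
import math

# Hypothetical sound dictionary: each word maps to a made-up feature vector.
SOUND_DICTIONARY = {
    "well": (0.9, 0.2, 0.4),
    "will": (0.8, 0.3, 0.5),
    "wall": (0.7, 0.1, 0.9),
}

def closest_word(sound, dictionary=SOUND_DICTIONARY):
    # Compare the raw sound with every entry and keep the nearest one.
    return min(dictionary, key=lambda word: math.dist(sound, dictionary[word]))

print(closest_word((0.88, 0.22, 0.41)))
```

Because the sketch compares the sound with every entry, doubling the dictionary doubles the work, which is the reason a bigger vocabulary demands a more powerful computer.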
There are also a couple of other wrinkles involved. Slang,
pronunciation, enunciation, homonyms, and background noise are just a
few of them. All of the VR products on the market allow you to add a
limited number of words to a custom dictionary. This generally takes
care of the few slang words necessary.
Pronunciation and enunciation, however, are much bigger problems. Since
the computer is not listening to a word and trying to make sense of its
meaning (not yet anyway), regional accents change the raw sound from the
“standard” on which the VR product is based. To help overcome this,
the newest products incorporate various regional accents into the
product and try to match you up with one during the training phase of
the installation.
Unfortunately, there are many variations in regional accents, and
training is a necessary evil if the computer is to hone its ability to
match your voice with its dictionary. This is the stage where most of us
fail to make the effort required for our voice software to function
properly.
Training is performed in two stages. First, you read a pre-determined
selection to the computer. It compares what you say with the sound
dictionary that came from the programmer. It then makes adjustments for
your pronunciation and enunciation. Obviously, the more you read, the
more refined the sound dictionary becomes. This reading stage can take
up to several hours.
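One way to picture the reading stage is as measuring how far your voice sits from the "standard" and shifting the whole dictionary by that amount. The sketch below assumes a simple average offset and two-number "sounds"; both are illustrative assumptions, not a description of how any vendor actually adapts its dictionary.

```python
# Hypothetical "standard" dictionary shipped with the product.
STANDARD = {"well": (0.9, 0.2), "will": (0.8, 0.3)}

def speaker_offset(readings, dictionary):
    # Average difference between what the user said and the standard sound.
    dims = len(next(iter(dictionary.values())))
    totals = [0.0] * dims
    for word, observed in readings:
        for i, (seen, expected) in enumerate(zip(observed, dictionary[word])):
            totals[i] += seen - expected
    return tuple(total / len(readings) for total in totals)

def adapt(dictionary, offset):
    # Shift every entry toward the way this particular speaker sounds.
    return {
        word: tuple(value + shift for value, shift in zip(vector, offset))
        for word, vector in dictionary.items()
    }

# Words from the pre-determined selection, paired with what was heard.
readings = [("well", (1.0, 0.1)), ("will", (0.9, 0.2))]
offset = speaker_offset(readings, STANDARD)
personal = adapt(STANDARD, offset)
```

The more words in `readings`, the less any single mispronounced word skews the average, which matches the advice that more reading gives a better result.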
The second stage is the correction stage. This is where we usually fail
to keep up. We read something of our own choice to the computer, only
this time we are supposed to monitor the result on the screen. If the
program misinterprets a word or phrase, we are to correct it immediately
using the tools provided. Depending on how closely your diction fits the
“standard”, there can be a LOT of correction necessary before the
program begins to reach an acceptable accuracy.
Performing the correction is aggravating. It interrupts the flow and
causes you to lose your train of thought. For this reason, I advise you
not to attempt to use the product for real work until you have taken the
time to do correction for at least an hour or two. Read from your
current novel or textbook. Read from cases you need to read anyway. This
makes the correction stage a lot easier to tolerate.
Last of all, don’t be discouraged by previous versions’ inability to
handle correction properly. All of the major vendors have made
tremendous strides with their respective products. In other words, they
actually work now.
How does it work?
Once you have trained the program for
your voice, you are ready to learn how to use it. Yes, you
read correctly: learn to use the program. Unfortunately, current VR
software is not ‘smart’ enough to figure out context.
This means that homonyms will cause some problems although newer
software is getting more intelligent all the time. For example, suppose
you dictate, “…and the Guarantor hereby affirms he has read the
foregoing …”. The VR program must decide whether you mean ‘red’
or ‘read’. Remember it is only listening for the sound. Newer
programs have rules which help the computer decide but homonym errors
are still common.
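A rule of the kind described above might look like the following sketch. The single rule used here (look only at the word just before the sound) and the list of cue words are assumptions invented for illustration; actual products keep their rules to themselves.

```python
# Hypothetical cue words that suggest a verb is coming next.
VERB_CUES = {"has", "have", "had", "to", "will", "please"}

def choose_homonym(previous_words):
    """Pick 'read' or 'red' for the sound just heard, using only the prior word."""
    last = previous_words[-1].lower() if previous_words else ""
    return "read" if last in VERB_CUES else "red"

print(choose_homonym("and the Guarantor hereby affirms he has".split()))
print(choose_homonym("sign in".split()))
```

The rule handles the Guarantor clause, but dictate “he has red hair” and the same rule picks ‘read’. That is why homonym errors survive even in software that has rules.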
Another requirement in using the program is telling the program when to
put in punctuation. Since the software isn’t yet able to understand
context, it can’t tell when a sentence ends. Fortunately, most VR
programs have adopted standard dictation practices so it is not much
different from standard dictation. Unfortunately, many of us don’t use
standard dictation methods. We rely on our secretary’s computer (i.e.
brain) to figure out punctuation, capitalization, line spacing and other
formatting. Consequently, there is often a learning curve involved here
as well.
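Here is a small sketch of what dictating punctuation amounts to. The spoken command names are assumptions for illustration; each product defines its own list, so check the manual for the exact words.

```python
# Hypothetical spoken commands; real products define their own lists.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "question mark": "?",
}

def render_dictation(tokens):
    """Join dictated tokens, attaching punctuation and capitalizing sentences."""
    text = ""
    capitalize_next = True
    for token in tokens:
        mark = SPOKEN_PUNCTUATION.get(token)
        if mark is not None:
            text += mark
            if mark in ".?":
                capitalize_next = True  # the next word starts a sentence
            continue
        word = token[:1].upper() + token[1:] if capitalize_next else token
        text += (" " if text else "") + word
        capitalize_next = False
    return text

spoken = ["the", "guarantor", "has", "read", "this", "comma", "and",
          "agrees", "period", "sign", "here", "period"]
print(render_dictation(spoken))
# The guarantor has read this, and agrees. Sign here.
```

Leave out the spoken “period” and the software has no way of knowing where the sentence ends, which is exactly the job your secretary has been doing for you.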
Once we have trained the software and learned how to use it, we are
ready to go to work. (Finally!) We have two modes
for operating the program: Voice Control and Voice Dictation. In other
words, we can choose between controlling the program we are using (e.g.,
File Open, File Save, Select All, Delete, etc.) and dictating into the
program. There is usually both a voice and a keyboard command that
switches between the two modes of operation.
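The difference between the two modes can be sketched as a simple switch. The class name and the switching phrases “command mode” and “dictation mode” are hypothetical; the real phrases vary from product to product.

```python
class VoiceSession:
    """Toy model of a session that is either dictating or taking commands."""

    def __init__(self):
        self.mode = "dictation"
        self.document = []       # phrases typed into the document
        self.commands_run = []   # phrases carried out as commands

    def hear(self, phrase):
        if phrase == "command mode":      # hypothetical switching phrase
            self.mode = "control"
        elif phrase == "dictation mode":  # hypothetical switching phrase
            self.mode = "dictation"
        elif self.mode == "control":
            self.commands_run.append(phrase)
        else:
            self.document.append(phrase)

session = VoiceSession()
for phrase in ["dear client", "command mode", "file save",
               "dictation mode", "file save"]:
    session.hear(phrase)
```

Notice that “file save” is carried out once and typed into the document once. Forgetting which mode you are in is one of the most common beginner mistakes.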
When should I implement it?
The implementation needs to be driven by how
beneficial the software will be to the user, balanced against how much
time the user has to devote to voice training and user training. If the
lawyer is in the middle of a big case or a flurry of estate plans, then
it is unlikely he or she will devote the necessary time to training. The
best method is to actually schedule the training as if it were a client
appointment. This clearly establishes the cost (lost billable hours) and
clears out the necessary time. Unlike the setup of case management and
document assembly software, it cannot be farmed out to anyone else.
Who should be using it?
This is easy. Anyone who needs to use the
computer software but is inefficient because they cannot type well. This
could include certain paralegals and would include most lawyers. As
discussed above, it could also include certain injured or disabled
individuals.
As with all software purchases the cost-benefit must be weighed on a
case by case basis. Virtually all lawyers will derive substantial
benefit once they learn how to use the software.
Where do I obtain it?
This is a trickier question than it first
appears. There are so many different programs and versions available.
First, let’s establish something we all know but generally refuse to
acknowledge. You get what you pay for. Of course, this means we can get
less value than we deserve for what we spend but we almost never get
more. This is especially true with computer software. If you buy junk,
you get junk.
The major VR software providers are Dragon, L & H, IBM, and
Kurzweil. All of them use the same underlying programming code and then
make changes in the user interface. All of them have a cheap version of
their software that is intended for recreational home use. Please
don’t buy it expecting it to work the same as the business level
version. Most have a ‘legal’ version which means it has an
additional vocabulary for the legal field. In other words, a legal
specific sound dictionary is added. Generally, the additional cost
outweighs the usefulness of the additional dictionary. Most lawyers are
moving away from ‘legalese’ so buy the best non-legal specific
dictionary and you should be fine. If you practice in an area that still
requires legal jargon then you will need to ante up to the ‘legal’
edition of the software.
One last consideration is the other software in
use on your system. Some software packages have chosen a specific
product and will not work with any others. Find out about this
interaction before making your choice.
Conclusion
Voice recognition will be one of the hottest
technologies for the next few years. It will eventually become the
‘preferred’ method for manipulating your computer and its software.
By keeping up as VR software evolves, your firm will reap the benefits
of the efficiency it provides while avoiding large learning curves down
the road. Don’t be an ostrich. Take your head out of the sand and look
around. Your computer may be trying to talk to you.
Lee D. Cumbie is the founder of Cumbie Law
Office Automation Consulting, one of the Technology Assistance Program
consultants for the N.C. Bar Assn. He is also an Adjunct Professor of
Law at Campbell University where he teaches the Law Firm Computer Lab
course. Lee is a member of Tyson & Cumbie, PLLC in Fayetteville,
N.C.
Lee earned his B.S. degree from Regents College after military service
in the U.S. Navy. He earned his J.D. from Campbell University, cum
laude, in 1997.
Resources to get you started:
Information:
NetsaversCenter Y2K Section www.suttondesigns.com/NetsaversCenter/Y2K/Y2K-Links.html
Year 2000 Software Windowing Solutions www.suttondesigns.com/NetsaversCenter/index.htm
PC Magazine Online www.pcmag.com/y2k
PC Magazine October 6, 1998 "What To Do About the
Year 2000", Jim Seymour, p. 100
Ziff-Davis ZDNet www.zdnet.com/y2k
Year 2000 Information Center www.year2000.com
Legal/Management Issues www.y2k.com
Programs:
Netsavers Y2K TSR Scanner Kit,
V. 4.0.1 www.suttondesigns.com/NetsaversCenter/Y2K/NetY2K
The RighTime Clock Company www.rightime.com
DMX II www.dmx2.com
Computer Experts Ltd www.computerexperts.co.uk
UniComp Products www.unicomp-products.com
Network Associates www.nai.com