Joe Barr

How to get IBM ViaVoice Dictation running on Red Hat 7.3

IBM may be investing $1 billion in Linux, but none of it's going to ViaVoice Dictation

(LinuxWorld) — My boss sent me to a seminar to evaluate a hot technology that might prove useful: voice recognition. Kurzweil, a pioneer in the black art, conducted the seminar. I concluded voice recognition was nifty and gave an impressive demo, but getting reliable, speaker-independent, continuous-speech recognition was beyond the ken of current technology. That was in 1980. It's still a fair summation today.

For some people, the promise of voice recognition is a siren call heard above the noise of reality. The desire to produce text from the spoken word remains strong. A friend of mine, we'll call him Robert, recently asked me to help him get IBM's ViaVoice Dictation working on his Red Hat 7.3 desktop. Robert was on a quest for a friend of his, who for physical reasons finds it difficult to type, making school papers harder to create than they should be. Dictating her papers would make her life much easier.

A year or so ago I tinkered briefly with ViaVoice Dictation while using either Mandrake 8.0 or the SuSE distributions that had included it, but evidently my microphone was not adequate. I thought IBM pulled the Linux version off the market, but Robert said he found it for sale on the IBM Web site. Not only that, but the boxed version included its own microphone. I ordered a copy for $43.25, including tax and shipping.

Early warning sign

There is no printed manual, and installation instructions are printed on the CD. A nice user's guide awaits on the CD in PDF format. The good news about the installation is that it takes just a minute. The bad news is that it doesn't work, at least on a modern distribution like Red Hat 7.3.

IBM says it supports ViaVoice Dictation for Linux on Red Hat 6.2 only, but I dispute that claim. IBM doesn't support it at all, and instead refers people to a mailing list (a non-IBM mailing list, at that) and a FAQ. Nothing else. Worse, IBM doesn't even point you to the most valuable source of information about ViaVoice Dictation for Linux — Volker Kuhlmann's Web site. Without that resource, I would have simply given up. (The links for all three are in Resources below.)

Even if you do happen to run Red Hat 6.2, you're not out of the woods, especially if you've upgraded your version of Java. IBM recommends Blackdown's 1.2.2 RC 4. That's the version IBM used in developing and testing the product. If you're using ViaVoice Dictation out of the box, I don't think you have a choice. At least I haven't been able to make it work on any other version of Java. According to comments I've read on the mailing list, if you are using later versions of ViaVoice Dictation — the versions bundled with Mandrake and SuSE — you can upgrade to a newer version of Java.
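
Because so much hinges on the Java version, it can help to check what is installed before going any further. The following is a minimal Python sketch, not anything IBM or Blackdown provides. It assumes `java -version` prints a line of the form `java version "1.2.2"` (normally to stderr), and the function names are hypothetical. The quoted version alone may not distinguish RC4 from other 1.2.2 builds, so treat a match as necessary rather than sufficient.

```python
import re
import subprocess

# The release the article says IBM developed and tested against.
REQUIRED_PREFIX = "1.2.2"


def parse_java_version(banner):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', banner)
    return match.group(1) if match else None


def is_required_java(banner):
    """True if the banner reports a 1.2.2 JVM."""
    version = parse_java_version(banner)
    return version is not None and version.startswith(REQUIRED_PREFIX)


def installed_java_banner():
    """Run `java -version` and return its output; empty string if java is absent."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except OSError:
        # java is not on PATH or could not be executed
        return ""
    # The banner usually goes to stderr, but check both streams to be safe.
    return result.stderr + result.stdout
```

Calling `is_required_java(installed_java_banner())` reports whether the Java found first on the PATH looks like a 1.2.2 release.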

It is possible, however, to get a slightly more recent version of the ViaVoice Dictation runtime than comes in the box. That's a good thing, because I would never have gotten it to work without it.

I installed VV Dictation for Linux on Red Hat 7.3 running the Ximian Gnome desktop. If you are running Red Hat 7.3, you might be able to get the commercial version of VV Dictation running by following the trail signs here. No promises, though: this is a treacherous trail.

How to make ViaVoice Dictation go

First, get the right version of Java. Go to the Blackdown site (see Resources for the URL) and download JDK-1.2.2 RC4. Next, go to IBM (ditto for the URL) and download viavoice_dict_rtk_3.tar. When you untar that file, you'll find an RPM for ViaVoice_runtime-3.0-1.2. Hang on to it, you'll need it.
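
To confirm that the download really contains the runtime RPM before unpacking it, the archive can be inspected in place. This is a small Python sketch using the standard `tarfile` module. The archive name comes from the text above; the function name `rpm_members` is hypothetical, and the exact RPM file name inside the archive is not assumed.

```python
import tarfile


def rpm_members(tar_path):
    """Return the names of all .rpm files inside a tar archive."""
    with tarfile.open(tar_path) as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".rpm")
        ]
```

Run in the download directory, `rpm_members("viavoice_dict_rtk_3.tar")` should list the ViaVoice runtime RPM among its results.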

My first attempt to install ViaVoice Dictation was a failure, as were the next 20. I'll try to limit my commentary to the highlights. The problems I ran into included:

  • Music playback test results in only a burst of static
  • "Insufficient data" caused first reading sample to fail
  • Segfault occurred at beginning of training session after selecting Treasure Island Part 1 as the training text
  • Segfault occurred at end of training session after reading Treasure Island Part 1
  • Starting the app brought up a window with no menus, or a misshapen window with panels looking like they would in a Halloween horror house mirror, or way-wrong fonts
  • After first use, lost menus and panels from application window

After more than a day of trying to get the music to play, I gave up and just continued to the next step. I believe this problem is related to the early versions of the ViaVoice runtime IBM provides.

The "insufficient data" problem occurred when I did not have the correct version of Java installed and correctly patched. Let me emphasize the patch from Kuhlmann's Web site. Parts of it are rejected unless you are running SuSE 7.2. Don't worry about them. Just make sure you read everything on his page and heed it all. The patch is only a part of the solution.

The segfaults during training (you are training ViaVoice to recognize words the way you speak them) were a direct result of my not reading Kuhlmann's entire page. That and IBM's programming problems. Once I followed all of the tips on the page, and ran vvuser with the parameters he specifies, the segfaults went away.

Almost there

The last problem was the most disheartening. I managed to use ViaVoice Dictation for the first time, and it recognized a surprising number of the words I spoke. I saved the file (containing text from a part of the Declaration of Independence I had dictated) and exited. When I came back a few minutes later and fired it up again, I was back in the land of non-functional windows.

I repeated this a few times, making sure I configured Java and patched the application correctly, and followed all of Kuhlmann's other tips. Each time it failed on second use. I finally decided that differences between SuSE and Red Hat were the problem.

I decided to manually apply what I thought were the essential parts of the patch to the shell scripts that actually run ViaVoice Dictation. I tinkered with them until I could finally install, train, and use ViaVoice Dictation twice in a row without messing up the patching. I was thrilled. However, it isn't soup yet. Since then, I've seen ViaVoice Dictation start with nothing but a big empty window, with commands in the wrong font, and then on the third try back to the way it should be.

This inconsistency might be acceptable for occasional use by nerds, but I don't recommend it for anyone who needs to rely on ViaVoice Dictation.

ViaVoice for Dummies, or rather engineering grad students

Here is an overview of the steps I've followed to get the commercial boxed set of ViaVoice Dictation running on Red Hat 7.3 with an Ensoniq 1371 PCI sound card, and the IBM-supplied microphone:

  1. Remove all existing versions of Java completely.
  2. Download Blackdown JDK 1.2.2 RC 4 and install it.
  3. Download the updated ViaVoice runtime from IBM.
  4. Install ViaVoice Dictation from the CD.
  5. Update the ViaVoice Dictation runtime with the IBM RPM.
  6. Download and apply the patch from Kuhlmann's site.
  7. Run vvuser as documented in Kuhlmann's instructions.
  8. Run vvstartuserguru.
  9. Ignore music track playback failure.
  10. Go through the training session.
  11. Use ViaVoice Dictation!
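
Before starting on the later steps, a quick check that the commands they rely on can actually be found may save a failed attempt. The sketch below is in Python and is only an illustration. `vvuser` and `vvstartuserguru` are the commands named in the steps above, but whether the installer puts them on the PATH is an assumption, and `missing_commands` is a hypothetical helper name.

```python
import shutil

# Commands used in the steps above, plus `rpm` for the runtime update.
# Whether the ViaVoice commands end up on PATH after installation is an assumption.
REQUIRED_COMMANDS = ["java", "rpm", "vvuser", "vvstartuserguru"]


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on PATH, in the order given."""
    return [name for name in commands if which(name) is None]
```

An empty result from `missing_commands(REQUIRED_COMMANDS)` means every command was located. It says nothing about whether the versions are the right ones.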


Given the problems I've run into in subsequent usage of ViaVoice Dictation, I am not going to detail exactly what I applied and what I changed in Kuhlmann's patch. If I can get it to work reliably, I'll create a patch specific to Red Hat 7.3 and make it available for all who ask. I may even do a story on using ViaVoice Dictation instead of how to overcome the frustrations of getting it working.

I wrote to Volker Kuhlmann to thank him for his page. I also wanted to learn a little more about him and his motivation. Kuhlmann is an engineering student and hopes to use ViaVoice Dictation — or something else that runs on Linux — to do his thesis.

Kuhlmann lives in New Zealand. When IBM released ViaVoice for Linux, he tried to buy a copy, but IBM refused to sell it to him. IBM offers it for sale only in the US. He found a good deal on the Mandrake 8.0 release that included a bundled version and bought that instead. He says it took him about 3½ days to get it installed on SuSE and to document what he had done. Now and then, someone writes to tell him they appreciate the page, so he figures, "It's done its job."

And a good job at that. Better than IBM's effort by a long shot. The boxed version of IBM's ViaVoice Dictation for Linux is one of the most poorly delivered commercial offerings I have ever seen for Linux or any other platform. That's a real shame, because there are many who need what ViaVoice Dictation promises.

I admire IBM and its people. I like what they have done for and with Linux. Probably no other firm in the industry can match IBM's contributions to free and open source software projects. However, ViaVoice Dictation is a black eye. The way the Linux version lags behind the Windows and Mac OS X versions reminds me of OS/2 on hospice care. Shabby programming and the complete lack of support don't do much to improve IBM's image.

A word from IBM

I brought my concerns to Toby Maners, director, ViaVoice Segment, Pervasive Computing at IBM. She could not yet say what if anything would be done to improve the Linux version. "We are reassessing it now. We have, as you know, just put a new release of our desktop Windows product, and we did a new release of our Apple Mac product last year. We have significantly more customers on those two platforms, so we get much clearer feedback in terms of what we should do next."

I asked if that assessment included the possibility of withdrawing the product from the market, and she said yes. She also alluded to the possibility of a version for a Linux handheld, noting IBM sees possibilities there. She added, "I don't see that much on the desktop yet."

Maners also cleared up the question of why IBM doesn't sell the latest version. IBM did not produce the versions of ViaVoice Dictation that were bundled with Mandrake and SuSE; those versions represented refinements added to IBM's version by Mandrake and SuSE.

In a future story, I'll review ViaVoice Dictation as bundled with Mandrake 8.0.

Joe Barr is a freelance journalist covering Linux, open source and network security. His 'Version Control' column has been a regular feature of Linux.SYS-CON.com since its inception. As far as we know, he is the only living journalist whose works have appeared both in phrack, the legendary underground zine, and IBM Personal Systems Magazine.

Most Recent Comments
Dan Andersson 10/24/03 05:52:42 PM EDT

I tried to purchase 1000 licenses of IBM's viavoice outloud for a bus computer application of ours...

IBM: No Sir! We don't support it anymore!
Me: I don't want any support! just sell me the license, my testapp on my eval license is working 110% !!
IBM: No can't do Sir, we don't support this on Linux anymore Sir...

(We and IBM use the 'Sir' in formal conversations here in the UK as it sound jolly nice...Mister ( US version, is what you call a soldier boy only, or a Royal Navy Lt ) .

Cheers

Doooh! We are now running 2500 (approx) devices with someone elses text-to-speak app on, not as good as the IBM Viavoice Outloud, but we could at least buy the commercial licenses for embedded systems!!!!!!!!!!!

They (IBM) rather decide if to support Linux or not, removing an already present and working linux app seems DAFT!

DOHHH!

Chauvin Emmons 10/23/03 08:05:46 PM EDT

I am very interested in voice recognition and all things linux. unfortunatly my skills lack painfully all support is appreciated. CME.