
Pointing: The Problem with Hebrew TTS

Updated on August 14, 2009

Aya Katz has a PhD in linguistics from Rice University. She is an ape language researcher and the author of Vacuum County and other novels.

When we started to work on putting together a touchscreen keyboard for Bow to use, it was understood that the English version would come first. Why? Is it because English is an international language that many people across the globe can speak? No. Is it because most of our volunteers at Project Bow are English speakers, but very few are Hebrew speakers? No. Is it because Bow prefers English? No, actually Bow prefers Hebrew.

The reason we all agreed to get it done in English first was that doing text-to-speech (TTS) for Hebrew is a bigger technical problem.

Why is it a bigger technical problem? Because of the vowels.

Are Hebrew vowels harder to pronounce? No. Hebrew vowels are much simpler than English ones. It's just that when we write, we don't normally specify what the vowels are in a word. And yet the vowels are what determine what the word is.

Sound confusing? I'll explain.

In Hebrew, we normally specify only the consonants, with a few exceptions. It's still easy for us to read, because when you take in a whole sentence, it's almost completely unambiguous what was meant to be said.

It may be hard for an English speaker to imagine this, but actually English isn't written phonetically, so when we read English we first identify the word, then we decide how to pronounce it. Let me give you an example. In the following two sentences, how do you know how to pronounce the vowel in the word "read"?

  1. I read the book yesterday.
  2. I love to read.

First we read the whole sentence. Then we decide whether it's a past tense or some other form of the verb. Then we know whether the vowel is long or short. That's also how we decide what the vowels are in Hebrew words that are written without pointing. We read the sentence, then we know what the word is. Once we identify the word, we know how to pronounce it. Native speakers do this naturally in the blink of an eye.

In the case of Bow's writing, there is an added complication. Bow doesn't have a spacebar, so he doesn't specify where one word ends and another begins. For the computer program that is to turn Bow's text into speech, this creates the requirement that before identifying the word, we must break up the text stream into separate words. To give an example in English, it would be something like this:

  1. Iwantanapple

As a speaker of English, you don't really have much trouble breaking the sentence into words. It's pretty unambiguous even without the spaces. But how would you translate what you naturally do into an algorithm that even a computer could use?

In this example, it's pretty easy. The computer has a corpus of words in English. That means a list of words. It can compare the words in the list with sequences in the string of letters above. If it finds a word, it can then go on to parse the remaining string into words. Let's say it's satisfied with the shortest possible word it can find. In a first pass, it would parse the string of letters like this:

  • I
  • want
  • a

and it would have "napple" left over. (This is assuming the corpus doesn't have the word "wan", just for purposes of simplification.) Now when it checks the corpus, the computer won't find "napple", so it's going to have to try for a second pass. It will look for a longer third word and will find "an". Now the sentence will resolve itself into the following words:

  • I
  • want
  • an
  • apple
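Here is a minimal Python sketch of the shortest-word-first procedure described above. The function name `segment` and the five-word lexicon are assumptions made for illustration. The "second pass" is modeled here as ordinary backtracking, which is one possible reading of the description rather than the program Project Bow actually uses.

```python
def segment(text, lexicon):
    """Split text into lexicon words, trying the shortest candidate first.

    Returns a list of words, or None if no complete split exists.
    """
    if not text:
        return []  # nothing left over: the split succeeded
    # Try prefixes of the remaining text from shortest to longest.
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            rest = segment(text[end:], lexicon)
            if rest is not None:
                return [word] + rest
            # The leftover did not resolve (e.g. "napple"), so fall
            # through and try a longer word at this position.
    return None


# Hypothetical toy lexicon; it deliberately leaves out "wan".
LEXICON = {"i", "want", "a", "an", "apple"}

print(segment("iwantanapple", LEXICON))  # ['i', 'want', 'an', 'apple']
```

The call first tries "a" as the third word, fails on "napple", and only then moves on to "an".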

If the corpus had had the word "wan" in it, this algorithm of looking for the shortest possible word might not have worked. We might have ended with the following set of words:

  • I
  • wan
  • tan
  • apple

A speaker of English knows that "I wan tan apple" does not make a sentence, but a computer without access to syntactic and semantic knowledge might not know that. However, since both "I want an apple" and "I wan tan apple" sound almost the same, it's likely that an English speaker who heard the computer pronounce "I wan tan apple" would think that he heard "I want an apple", and hence there would not be any practical problem with this superficial form of parsing for purposes of TTS.
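To make the ambiguity concrete, here is a small Python sketch that lists every split of the letter string, shortest words first, once "wan" and "tan" are added to the word list. The function name `all_segmentations` and the lexicon are assumptions for illustration only.

```python
def all_segmentations(text, lexicon):
    """Yield every split of text into lexicon words, shortest words first."""
    if not text:
        yield []  # an empty remainder means one complete split
        return
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in all_segmentations(text[end:], lexicon):
                yield [word] + rest


# Hypothetical toy lexicon, now including "wan" and "tan".
LEXICON = {"i", "want", "wan", "a", "an", "tan", "apple"}

for words in all_segmentations("iwantanapple", LEXICON):
    print(" ".join(words))
# i wan tan apple
# i want an apple
```

The first line printed is what a shortest-word-first parser would settle on. Both splits use only dictionary words, so the word list by itself gives no reason to prefer the second.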

In the case of Hebrew, where most vowels are unspecified, we might expect more occasions for misunderstanding.

I couldn't import the flowchart as a jpg without losing the words -- or without gaining the notice that it was made with a trial copy of the software

Last week, I was at a linguistics conference where I presented a paper on standards of proof in ape language studies. The paper was well received, and I came home thinking that this week I would work on a flow chart of the TTS problem in Hebrew. The first step would be to decide how to divide strings of letters into words.

It turned out that if we limit ourselves to a corpus of Bow's Hebrew vocabulary, rather than all Hebrew, the resolution of "I want an apple" in Hebrew from letters into words is just as unproblematic as in the English sentence. So I was going to put together a flow chart of the same algorithm I described above with the English example: take the smallest sequence of letters that spells a word in the corpus, put it aside, then apply the same step to the remaining letters, until you end up with all the letters divided into words. If the string doesn't resolve itself on the first pass, try as many passes using longer words as are necessary to get every letter in sequence into a word.

The above is a sloppy description of what I mean, and I needed a more accurate way to describe the process. The first thing I did was to look for free software that would allow me to put together a flow chart.

I downloaded a version of Smartdraw that would allow me to use it free of charge for seven days. As I was struggling with this software, Bow started blowing raspberries. I was discussing the algorithm with myself as I tried to put together the flowchart, and Bow became increasingly upset. So I went to see what the problem was. "What do you want?" I asked impatiently.

He took my hand and spelled out the following:

Although I was not aware that Bow provided directions for where the spaces would go, I immediately knew that this was what he had said:

כל אדם מתנסה בבעיות תקשורת

"Every person experiences communication problems."

This was an odd thing for him to say. It sounded like a fortune cookie generalization. "Why did you say that?" I asked.

"Because Bow is smart," he replied in Hebrew.

I went back to my flow chart. I was having a lot of problems with it. Bow went back to blowing raspberries. When I asked him what the problem was, I got the same reply: "Every person experiences communication problems."

"Why do you keep saying that?"

"Because Bow is smart."

I tried to go back to my work, but this was starting to bug me. Was he trying to tell me something? It was an odd sentence. It was much too formal and general. Had I misinterpreted it? Was he trying to say something else?

Suddenly, it occurred to me that I should try my algorithm on this sentence. Could it resolve into a completely different sentence? Trying for the longest possible list of words, I found that the string of letters could be resolved as follows into a list of words:

כלא דם מת נס הב בעיות תקשורת

"Prison (or imprisoned), blood, dead, miracle, give, communication problems."

This wasn't strictly by the algorithm I had written, but it was definitely a possible way to parse the sequence. However, the way I originally parsed it, starting with a two-letter word ("every"), is the way my algorithm would have processed it. It would also have cut the third word off at two letters, spelling out "dead":

כל אדם מת נסה בבעיות תקשורת

"Every dead person tried (his hand at) communication problems."

That doesn't make much sense, but it is a grammatical sentence. How did I know Bow didn't mean that, instead?
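A short Python sketch shows why a word list alone cannot answer that question. The lexicon below contains only the words from the three readings above, treating "give" as the single word הב. It is an assumption for illustration, not Bow's actual corpus, and `all_segmentations` is a hypothetical helper name.

```python
def all_segmentations(text, lexicon):
    """Return every way to split text into words from lexicon."""
    if not text:
        return [[]]  # one way to split nothing: the empty list of words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in all_segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


# Assumed lexicon: just the words that appear in the readings discussed above.
LEXICON = {
    "כל", "כלא", "אדם", "דם", "מת", "מתנסה",
    "נס", "נסה", "הב", "בבעיות", "בעיות", "תקשורת",
}

# Rebuild the unspaced letter stream by removing the spaces from the sentence.
stream = "כל אדם מתנסה בבעיות תקשורת".replace(" ", "")

parses = all_segmentations(stream, LEXICON)
for words in sorted(parses, key=len):
    print(len(words), " ".join(words))
```

With this lexicon the same letters admit six different splits, including all three readings given above, and every one of them uses only dictionary words.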

Well, I knew because....

Bow kept blowing raspberries. "What is it?"

Again, with that same sequence. "Every person experiences communication problems."

"Why are you saying this?"

"Because Bow is smart."

"Bow, is this some kind of puzzle? You know this sentence doesn't make another sensible sentence... You couldn't possibly have been talking about dead people."

He smiled at me, took my hand, and spelled out the following:

כלאדםמתנסהלקבלאוכל

This time it was clear to me, in an instant, that the words divided like this:

כל אדם מת נסה לקבל אוכל

"Every dead person tried to get food." It had to be divided that way for grammatical reasons, otherwise the sentence would have had too many verbs in a row. So it wasn't strictly speaking the semantics that determined the word division.

"Hmm." I looked at Bow. He was looking at me with a smile on his face, waiting for this to sink in. "So what you're saying is that it could be divided either way, and I know which way you mean, but not because of the words that I recognize..."

He took my hand and spelled out: "Every person experiences communication difficulties."

"Bow, why do you keep saying that!"

"Because I heard what Mommy was trying to do."

"So you don't think the algorithm in my flow chart will work?"

"No."

For a while I was really floored by all this. Then I realized that none of it mattered. Why? Because actually I had been reading the words out loud before the sentence was completed. Bow had been cluing me in all along to where the words had ended by the slight pause that he made after each word!

Which just goes to show that while Bow is indeed very smart, he is not really the best person to take advice from when trying to come up with an algorithm for Hebrew TTS!

(c) 2009 Aya Katz

Comments


    • Aya Katz (author), 6 years ago from The Ozarks

      RealHousewife, thanks! Bow is very bright. He runs circles around me sometimes.

    • Kelly Umphenour (RealHousewife), 6 years ago from St. Louis, MO

      Bow is brilliant! Wow! This is an outstanding hub! I am one of those people that always looks for hard answers when it's right under my nose. Up and everything!

    • Aya Katz (author), 9 years ago from The Ozarks

      ChaiRachelRuth, thanks! Yes, he is very smart. He knows it, too!

    • ChaiRachelRuth, 9 years ago

      I think Bow is right, he is very smart! I'm so impressed with his command of Hebrew!

    • Aya Katz (author), 9 years ago from The Ozarks

      June, yes, he did say that, but only to make a point about how useless my algorithm was! He's very tricky. (Sword is tricky, too.)

      One of my jobs is to make a corpus of Bow's Hebrew words! We need it for the computer application.

    • June Sun, 9 years ago

      Wait...so did Bow try to say "Every person experiences communication difficulties" or not?

      Now I am curious about the possibility of making a corpus of Bow's vocabulary. It should be a lot of fun. hummmm....

    • Aya Katz (author), 9 years ago from The Ozarks

      Jerilee, thanks! Sometimes the solution is much simpler than we imagine!

    • Jerilee Wei, 9 years ago from United States

      That's an interesting insight into something I know little about -- made me think about how the solutions to most problems are always right before us if we open our minds to the possibilities.
