Founder & CTO @SMERGERS & @wealthrox
Past - Dev Infra @Google · RF @NI · CS @USC

Currently
Founder & CTO @SMERGERS & @wealthrox
Previously
Dev Infra Intern @Google NYC RF @NI-WCDMA
Computer Science @USC
Contact
hey [at] krishnabharadwaj.info
02-Jul-2011

Swype is an awesome software which makes typing in mobile phones using the qwerty keyboard very easy. This is how it looks:

swype

I was just thinking how this could be implemented. It boils down to a string sub-sequence problem. The path traced by the user consists of all the characters in a word. This is an ideal situation, but many a times, we do not take care of all the characters in between and miss many of them.

Some of the characteristics i have considered:

1. Filtering of words based on the first and last character.
2. Characters which are buried among other characters in the path traversed by the user.
3. The number of traversals between different rows of the keyboard gives us a fair idea about the length of the word.

We can use a number of such characteristics and increase the chances of suggesting the right word. One such hint i can think of is:

Split the words in sets of three characters:
Lets take the case of the word "English" here. The actual trace may be like "edfgbnbghjkliugfdsdfgh"

Groups like bnb, bgh, kli, dsd strongly suggest that there is a turn which signifies a character which has a higher probability of being present in the word.

I wrote a basic version of this using python and this wordlist. Some basic introduction to some of the functional programming used here is discussed in this link

The results for the same were pretty good.

['hello', 'hero', 'ho']
['quick']
['wed', 'weird', 'weld', 'wild', 'wold', 'word', 'world', 'would']
['doctor', 'door', 'dour']
['architecture']
['adjure', 'agriculture', 'article', 'astute']
['music', 'mystic']
['vocabulary']

The basic version was working within an hour. I must say that the string split method in python is very well thought of. split( delimiter, 1) returns a the string before and after the first match.

Atharva #breakthrough Sept. 20, 2021

HI Krishna,
thanks a lot for this code!
This was an actual Google interview question for SWE positions.

2 suggestions.
1. Even though your subsequence logic takes care of this, after line 65, it might be more efficient to filter by length of candidates <= length of path.
2. You might want to use a 2 pointer solution for checking if a string is a subsequence of the other

I had one question though. I don't quite understand why you do -3 in your line 60. Shouldn't a margin of error of -1 be enough?
ie
``` return len(compressed_row_numbers) - 1 ```

For example: lets say you go from row 3 to row 1 (through row 2). Now since we assume the first and last char *is* correct, shouldn't we assume that the output should be at-least 2 characters long (ie 3-1)?
of even if you move from row 2 to row 1 and back, isnt it safe to assume that you changes rows for a reason?

My thinking is that the only reason to subtract from the row changes would be because we used the middle row as an intermediate in our journey and hence -1

Vikalp Saxena Jan. 14, 2017

Can you explain the try except block in the def match(path,word), please?
I'd be very grateful to you. Thank you.

Menny Even Danan Dec. 10, 2013

Nice explanation.

What is the license of your code? I would like to port it to Android: https://github.com/AnySoftK...

Krishna Bharadwaj Dec. 10, 2013

Menny, Thanks. Feel free to use it.. I just added a MIT license for the code. Let me know if that is fine. Otherwise I can change it to something that works for a larger audience.

Menny Even Danan Dec. 11, 2013

AnySoftKeyboard is Apache2, and MIT is compatible (http://www.quora.com/Open-S... with it.

I'm glad you like it :)

Do you have experience with Java? Or C?
I need to deep a bit more into your solution, but at first glance, I feel that I'll need to implement parts of it in C (since my wordlist is stored and searched upon using a JNI component).

Krishna Bharadwaj Dec. 11, 2013

Menny, I know C/C++. If required I can translate this program to C. Let me know how I can help you with it. Will be more than happy to do it.

Menny Even Danan Dec. 11, 2013

Thank you.

Krishna Bharadwaj Dec. 11, 2013

And I just tried ASK. It's very smooth and I like it :)

Karel Vervaeke April 25, 2013

I somehow doubt that you would get the right performance like this. In reality I suspect they have a more efficient system for quickly looking up words with similar paths (e.g. using locality consistent hashing based on features of the paths)

Krishna Bharadwaj April 25, 2013

Karel, Definitely, this can nowhere be close to the actual swype algorithm :)

1. Performance (Quality of prediction) - this will not give the right performance because the decision making is not strict as i have considered. It is very much fuzzy and we will need lot more complex algorithms as you mentioned to take care of human errors.

2. Speed - honestly, releasing the screen after drawing the pattern and proceeding with the next word easily takes about 0.5 - 0.8 second. on those lines, the code above performs really well.

Annonymous July 6, 2012

Hi,
I was wondering if you have any idea how the lines are drawn over the soft keyboard? I am not able to get any canvas over the keyboard.

Prasanna Nov. 6, 2011

Wonderfully explained.
Love the way how you made use of split, rather than a comparing every character.

Krishna Bharadwaj Nov. 6, 2011

Thanks Prasanna, I am glad you liked it.

Nathaniel Sept. 11, 2014

Even though split compares every character anyway (However, in a highly optimized way)

Vikas Bhat Oct. 23, 2011

impressive man! keep it up!

Bharath S Kallur Sept. 26, 2011

good write up bro.... :) :) helpful to kick start with....

orkanizer.com Aug. 8, 2011

This is a clear and very useful article. I love PYTHON!

Krishna Bharadwaj Aug. 8, 2011

Thanks! I'm glad you found it useful :)

Akshay July 3, 2011

hey! this is good stuff man :)
 

Krishna Bharadwaj July 3, 2011

Thanks akshay! :)

Shafi_bmsce July 2, 2011

Nice ! very coool....
Nice points or characteristics considered

Ramanathan Ganesh July 2, 2011

wow!!! very neatly written code! Can this be done using a  trie instead of a wordlist?

Krishna Bharadwaj July 2, 2011

Since the path will have a number of characters in between the actual characters of the word, Trie kind of becomes useless. With each wrong character we are entering a different branch making things worse for ourselves. 

Sandeep Samdaria July 2, 2011

the Bada OS in few of the samsung phone has this feature.. :)

Krishna Bharadwaj July 2, 2011

Yes, Swype is very famous. Many manufacturers are bundling it with their phones.

Sandra Oct. 3, 2011

The Revolution also has this feature.  I tried it but it was a mess so I need to do some practice.