If you search for something in Google and use a word like "running", Google is smart enough to match "run" or "runs" as well. That's because search engines do what's called stemming before matching words.
In English, stemming involves removing common endings from words to produce a base word. It's hard to come up with a complete set of rules that work for all words, but this simplified set does a pretty good job:
If the word starts with a capital letter, output it without changes.
If the word ends in 's', 'ed', or 'ing', remove those letters. However, if the resulting stemmed word is only 1 or 2 letters long (e.g. chopping the 'ing' from 'sing'), use the original word.
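As a rough sketch of how these two rules could be written with string slicing: the function name `stem` is made up for illustration, and treating an empty result (e.g. the one-letter word 's') the same as a "too short" stem is an assumption, since the rules only mention 1 or 2 letters.

```python
def stem(word):
    """Apply the simplified stemming rules to a single word (hypothetical helper)."""
    # Rule 1: a word starting with a capital letter is returned unchanged.
    # word[:1] is used instead of word[0] so an empty string does not raise.
    if word[:1].isupper():
        return word

    # Rule 2: strip 's', 'ed' or 'ing' from the end of the word.
    # These endings cannot overlap (they end in different letters),
    # so the order in which they are checked does not matter.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix):
            # Slice off the last len(suffix) characters.
            stemmed = word[:-len(suffix)]
            # Keep the original word if the stem is too short.
            # Assumption: an empty stem also counts as too short.
            if len(stemmed) <= 2:
                return word
            return stemmed

    # No listed ending matched, so the word is left as it is.
    return word
```

To match the prompt shown in the examples, the result could be printed with something like print(stem(input("Enter the word: "))).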
Your program should read one word of input and print out the corresponding stemmed word. For example:
Enter the word: states
Another example interaction with your program is:
Enter the word: rowed
Remember that capitalised words should not be stemmed:
Enter the word: James
and nor should words that become too short after stemming:
Enter the word: sing
Google actually does quite sophisticated stemming. They give an example on their search help page.
You should only implement the rules we've listed above, even though they get some words wrong (for example, 'buses' is converted to 'buse'). Stemmers make these kinds of mistakes all the time!
I am sorry, but I really need to be pointed in the right direction. Do I need to slice up the string?
Last edited by micseydel on Wed Aug 21, 2013 8:57 am, edited 1 time in total.