Counting words in OpenOffice spreadsheet

I use OpenOffice for various purposes. The spreadsheet application is particularly useful for statistical projects and making great looking graphs. Every now and then I need to count the number of words in a cell. Unfortunately OpenOffice lacks such a function to find out this information. The word processor does this trick with ease. Writing a formula for the required need is, however, rather easy. It only needs some understanding of character strings and that’s all. Suppose we have the following sentence in a cell:

The quick fox jumped over.

At a glance we can say it contains five separate words. The sentence, or in this case a character string, has total of 26 letters or technically speaking: characters including the ending punctuation. The words are usually separated with spaces, so let’s suppose we have a character string without any spaces like this:

Thequickfoxjumpedover.

Now the length of this string is 22 characters. Subtracting this value from the original length of 26 characters gives us number 4. What does this tell us about? Actually, it tells us the number of spaces in the original sentence. It also reveals (almost) the number of words, clever, isn’t it? But I must underline here only the approximate value so far, as four is not same as five words. Something is still missing. The key explanation to the difference in results is that there are usually a word on both sides of the space character. Only the last (or first) word has one space on the other side. To fix this we must add +1 to the formula. But first we need to figure out how to remove the space in order to get the second string for comparison.

The spaces are removed using the SUBSTITUTE() function. We substitute the space character with an empty string as follows:

=SUBSTITUTE("The quick fox jumped over.", " ", "")

Next we need to find out the number of characters with the LEN() function:

=LEN("The quick fox jumped over.")

The final formula looks as follows:

=LEN("The quick fox jumped over.") -LEN(SUBSTITUTE("The quick fox jumped over.", " ", "")) +1

Counting words in a spreadsheet cell is quite straightforward. It requires no special tricks to accomplish, but the approach here may differ from other solutions. It still has its disadvantages. In case the cell is empty the formula still reports erroneously one word. Instead of using constant value +1 at the end of the formula we need to fix it with another LEN() function call as follows:

=LEN("The quick fox jumped over.") -LEN(SUBSTITUTE("The quick fox jumped over.", " ", "")) +(LEN("The quick fox jumped over.")>0)

This formula checks if the sentence has more than one character. The logical operation >0 returns TRUE, if the sentence is longer than one character and otherwise FALSE. These Boolean values are then interpreted as +1 for TRUE and 0 for FALSE, thus giving the needed addition.

Final words

The formula given here counts the number of spaces in a sentence. Comparing the original sentence with the substituted one actually gives the number of words. You may need to further adjust the formula to avoid misinterpretation of other punctuation in the line, such as dashes and numbers. This formula is a good start, anyway.

Julkaistu lauantaina 19.6.2010 klo 18:42 avainsanoilla harrastukset, ohjelmointi ja OpenOffice.

Edellinen
Tyrni ja Tyyne
Seuraava
Kuluttajan tietoiskut