Solving the word count problem with LLMs, a continuation of the sub-token solution…

AI

Unlike the solution in my previous post about “solving for R”, this one seems to have a minimum model capability threshold, which shows up with smaller models. It succeeded on the first attempt with all of the following models: Mistral Large, Llama3-8b-8192, GPT3.5, GPT4o, and Claude 3 Sonnet. Interestingly, Claude 3 Haiku and Opus were the only models that did not separate “small…and” into two words, which left their counts off by one. Also of note: if you run the sample paragraph through online word counters, they too will count only 54 words instead of 55.
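The 54-versus-55 discrepancy is easy to reproduce outside of a model. The sketch below is only an illustration: it assumes an online word counter splits on whitespace alone, and that the "correct" count also treats a run of periods as a word break. Neither rule comes from any specific tool.

```python
import re

# The sample paragraph from the prompt, copied verbatim.
PARAGRAPH = (
    "What they do not comprehend is man’s helplessness. "
    "I am weak, small, and of no consequence to the universe. "
    "It does not notice me; I live on unseen. But why is that bad? "
    "Isn’t it better that way? Whom the gods notice they destroy. "
    "But small...and you will escape the jealousy of the great."
)

# Whitespace-only splitting: "small...and" stays a single token.
whitespace_count = len(PARAGRAPH.split())

# Assumption: two or more consecutive periods also separate words.
tokens = [t for t in re.split(r"\s+|\.{2,}", PARAGRAPH) if t]
ellipsis_aware_count = len(tokens)

print(whitespace_count, ellipsis_aware_count)  # prints: 54 55
```

The one-word difference comes entirely from “small...and”, which has no space around the ellipsis.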

The solution uses the spaces between words as separators, then lists each identified word on its own line in a numbered list. This way even one-letter words like “I” or “a” take up more space in the output, because each word is accompanied by a number, period, space, and line break. This gives the model enough data to ensure an accurate total count.
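For reference, here is a minimal sketch of the same procedure done in plain code: split on spaces, put each word on its own numbered line, and read the count off the number of lines. The function name `numbered_word_list` is made up for this example, and it shows the mechanical steps the prompt asks for, not what any model does internally.

```python
def numbered_word_list(text: str) -> list[str]:
    """Split on whitespace and return one 'N. word' line per word."""
    words = text.split()
    return [f"{i}. {word}" for i, word in enumerate(words, start=1)]


# One sentence from the sample paragraph, to keep the output short.
sample = "I am weak, small, and of no consequence to the universe."
lines = numbered_word_list(sample)

print("\n".join(lines))
print("Word count:", len(lines))  # Word count: 11
```

Note how the one-letter word “I” becomes the full line `1. I`, which is the extra "space" per word described above.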

Due to the output length and the maximum length of LinkedIn posts, here is just the prompt itself. Feel free to try it on many different models, with varying text samples.

Prompt:

How many words are in the following paragraph?

“What they do not comprehend is man’s helplessness. I am weak, small, and of no consequence to the universe. It does not notice me; I live on unseen. But why is that bad? Isn’t it better that way? Whom the gods notice they destroy. But small...and you will escape the jealousy of the great.”

Critically analyze the above instructions, and devise your plan to complete all the tasks you define. Ensure you determine the number of words by first identifying all the blank spaces between words, and use the blank spaces as separator values for word identification. Once each word is identified, list each word on an individual line in a numbered list. Use the number of lines in your new list to determine the word count of the original paragraph. Cover all the bases, answer as thoroughly as possible. Take a deep breath, and think step by step.
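If you try the prompt on several models, it helps to check the reply mechanically instead of trusting the total the model states. The sketch below assumes the model answers with lines shaped like `1. What`; the helper name `count_numbered_lines` and the sample reply are hypothetical, not real model output.

```python
import re

# Matches lines such as "12. word" (optional leading spaces).
NUMBERED_LINE = re.compile(r"^\s*\d+\.\s+\S")


def count_numbered_lines(response: str) -> int:
    """Count the lines in a model reply that look like numbered list items."""
    return sum(1 for line in response.splitlines() if NUMBERED_LINE.match(line))


# Hypothetical reply fragment, for illustration only.
reply = """Here is the list:
1. But
2. small
3. and
4. you
Total: 4 words"""

print(count_numbered_lines(reply))  # prints: 4
```

Comparing this number with the total the model reports is a quick way to spot replies where the list and the stated count disagree.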
