Solving for “R” with lower-quality models: the challenge and the solution…
A commonly observed issue with LLMs is their apparent inability to solve simple counting problems, but there are ways to work around it if you understand the underlying system behavior that causes the errors. The current test is counting the number of times a letter occurs within a given word. The core issue is that the model works on tokens rather than individual characters, so when the word is tokenized and vectorized, identical letters that sit next to each other get lost in the mathematical conversion. The solution is to make each target variable (each letter of the target word) long enough to occupy full tokens, so its representation is not lost during vectorization and computation by the model.
The solution below was tested successfully on both Mistral Large and GPT-3.5. Feel free to try it with other counting or numerical operations for improved results. Thank you to Alex Semenov for the challenge. 😁🫡
User Input:
How many Rs are in the word “strawberry”?
Critically analyze the above instructions, and devise your plan to complete all the tasks you define. Ensure you first break the target word down into its individual letters, then convert each letter of the target word into its military phonetic pronunciation form before performing any counting operations. Cover all the bases, answer as thoroughly as possible. Take a deep breath, and think step by step.
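To try the same prompt with other words and letters, it can be wrapped in a small template. This is only a sketch: the names PROMPT_TEMPLATE and build_prompt are made up for illustration, the plural is formed by simply appending "s", and sending the prompt to a model is left out because that depends on the API you use.

```python
# Hypothetical helper (not from the original post): fills the prompt above
# with any word/letter pair. The instruction text is copied verbatim.
PROMPT_TEMPLATE = (
    "How many {letter}s are in the word “{word}”?\n"
    "Critically analyze the above instructions, and devise your plan to "
    "complete all the tasks you define. Ensure you first break the target "
    "word down into its individual letters, then convert each letter of the "
    "target word into its military phonetic pronunciation form before "
    "performing any counting operations. Cover all the bases, answer as "
    "thoroughly as possible. Take a deep breath, and think step by step."
)


def build_prompt(word: str, letter: str) -> str:
    """Return the full prompt text for the given word and target letter."""
    return PROMPT_TEMPLATE.format(letter=letter.upper(), word=word)


if __name__ == "__main__":
    # Reproduces the user input shown above.
    print(build_prompt("strawberry", "r"))
```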
Mistral Model Output:
Sure, I'd be happy to help with that! Let's break down the task step by step:
1. Break down the target word into its individual letters: The word "strawberry" can be broken down into nine individual letters: s, t, r, a, w, b, e, r, r, y.
2. Convert each letter of the target word into its military phonetic pronunciation: The military phonetic alphabet is a system used to spell out words and communicate more clearly over noisy or poor communication links. Here's how "strawberry" would be pronounced using this system: Sierra, Tango, Romeo, Alpha, Whiskey, Bravo, Echo, Romeo, Romeo, Yankee.
3. Perform the counting operation: The task is to count the number of "R" or "Romeo" in the word "strawberry". From the breakdown and the military phonetic pronunciation, we can see that the letter "R" or "Romeo" appears three times in the word "strawberry".
So, the answer to your question is that there are three "R"s in the word "strawberry".
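The three steps in the output above can also be reproduced deterministically, which gives a reference answer to compare a model's response against. The sketch below is an illustration, not part of the tested prompt; the names NATO, spell_phonetic and count_letter are hypothetical, and the table uses the spellings from the model output ("Alpha" rather than the official "Alfa"). Note that the model's step 1 says "nine" letters while listing ten; the reference check reports ten.

```python
# Reference check (illustrative): split the word into letters, map each letter
# to its phonetic code word, then count the target code word.
NATO = {
    "a": "Alpha", "b": "Bravo", "c": "Charlie", "d": "Delta", "e": "Echo",
    "f": "Foxtrot", "g": "Golf", "h": "Hotel", "i": "India", "j": "Juliett",
    "k": "Kilo", "l": "Lima", "m": "Mike", "n": "November", "o": "Oscar",
    "p": "Papa", "q": "Quebec", "r": "Romeo", "s": "Sierra", "t": "Tango",
    "u": "Uniform", "v": "Victor", "w": "Whiskey", "x": "X-ray",
    "y": "Yankee", "z": "Zulu",
}


def spell_phonetic(word: str) -> list[str]:
    """Steps 1 and 2: break the word into letters and convert each one.

    Characters outside a-z are skipped.
    """
    return [NATO[ch] for ch in word.lower() if ch in NATO]


def count_letter(word: str, letter: str) -> int:
    """Step 3: count how often the target letter's code word appears."""
    target = NATO[letter.lower()]
    return sum(1 for code in spell_phonetic(word) if code == target)


if __name__ == "__main__":
    codes = spell_phonetic("strawberry")
    print(len(codes), codes)                  # 10 code words
    print(count_letter("strawberry", "R"))    # 3
```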