The most abundant and yet untapped trove of valuable data is in your mind.
With so many people making the pilgrimage to the DC area for AFA this week, I thought this would be a timely topic to bring up. Remember that LLMs fundamentally rely on massive amounts of data to function well. As good as scraping the entirety of the internet is, I think we can all agree that the result is neither authoritative nor necessarily as deep or accurate as a given topic actually goes. So where are the most detailed and accurate sources of knowledge on any particular domain? In the minds of the people who work in those domains every day.
Most people have neither the time nor the inclination to sit down and write a fully documented research paper, much less a mildly concise record of what they know on a topic. We are all busy, and sitting down and just writing away isn’t most people’s forte. However, most people, if given the opportunity, can speak at length about their opinions, knowledge, and experiences. The explosion in podcast content over the last decade is an excellent example of this. Most people don’t like to write at length, but most people can talk you to death.
The trick is to combine LLM and human strengths so that each serves the other. The start is recording a conversation with anyone and everyone you can who might have even remotely useful experience or knowledge. This might be an office full of coworkers, or a conference full of people. The key second step is using LLMs first to transcribe the audio into text, then to distill the totality of that text into meaningful information and insights. In the military we call it lessons learned, but imagine the trove of valuable data that just sits unknown to others in all our collective heads. Speech is the single modality that we as individuals are most comfortable using for mass export of our thoughts, and LLMs can perform all the grunt work at scale to convert and refine that mass of data into useful knowledge for others.
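For anyone who wants to tinker, here is a rough sketch of what the distill step could look like in Python. It assumes the transcription has already happened and you have the conversation as plain text. The function names, the prompt wording, and the `llm` argument are all my own placeholders: `llm` stands in for whatever model call you actually use, so treat this as one possible shape rather than a finished tool.

```python
from typing import Callable, List


def chunk_transcript(text: str, max_words: int = 400, overlap: int = 50) -> List[str]:
    """Split a transcript into overlapping windows of words.

    The overlap keeps a thought that straddles a boundary visible in both chunks.
    """
    if max_words <= overlap:
        raise ValueError("max_words must be larger than overlap")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the transcript
    return chunks


def build_prompt(chunk: str) -> str:
    """Hypothetical prompt wording; adjust to taste."""
    return (
        "Below is part of a recorded conversation. "
        "List the lessons learned as short bullet points.\n\n" + chunk
    )


def distill(transcript: str, llm: Callable[[str], str],
            max_words: int = 400, overlap: int = 50) -> str:
    """Summarize each chunk, then ask the model to merge the per-chunk notes.

    `llm` is an assumption: any function that takes a prompt and returns text.
    """
    chunks = chunk_transcript(transcript, max_words, overlap)
    notes = [llm(build_prompt(chunk)) for chunk in chunks]
    if not notes:
        return ""
    if len(notes) == 1:
        return notes[0]  # nothing to merge
    merged = "\n".join(notes)
    return llm(
        "Merge these notes into one de-duplicated list of lessons learned:\n\n" + merged
    )
```

To use it, you would pass in your transcript and a small wrapper around your model of choice, something like `distill(transcript_text, my_model_call)`. Splitting into chunks first is only there so that a long conversation still fits in whatever context limit your model has.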
So next time you get a quiet moment with a friend or colleague, break out the voice recorder function on your phone and have a chat. Then run the recording through a transcription tool, play around with the resulting text inside Claude 2, and see what kind of insights you find. Just imagine if you scaled that up across your whole office, a conference, or an organization.