Prompt engineering: A brand-new skill 20+ years in the making
What I learned from working with prompt engineers and dialog designers in the early 2000s
Imagine stepping into a time machine and setting the dial back to the early 2000s, a time when Nokia phones were all the rage, and Internet Explorer was duking it out in the browser wars against Firefox.
Yours truly was building speech recognition apps for companies like American Airlines and TiVo at a scrappy little Silicon Valley startup called TuVox.
TuVox was focused on transforming customer care through “Conversational Voice Response” systems “designed to get callers what they want quickly by using speech to route and handle a wide range of calls and improve live agent effectiveness.”
Back then, we carefully crafted systems in VoiceXML with (mostly) static dialogs and using comparatively primitive speech recognition and intent detection systems.
Fast forward to today, and while the gadgets have changed, the core principles of dialog design and prompt engineering remain surprisingly similar. Let’s explore how the lessons from early speech recognition systems apply to the cutting-edge world of Large Language Models (LLMs).
Voice Recognition - Prompt Engineering and Dialog Design 101
One core objective of early dialog designers was to ensure that prompts were clear, easy to understand, in the appropriate tone and persona, and optimized to elicit a machine-understandable input. Speech-to-text engines were much less powerful then and often required dialog-specific “grammar” files that essentially primed the engine to only listen for a limited set of possible utterances. This helped boost recognition accuracy and success rates.
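To make this concrete, here is what such a grammar file might look like in the W3C SRGS (GRXML) format that VoiceXML platforms used. The specific vocabulary is an illustrative sketch, not an actual TuVox grammar: at a routing menu, the recognizer would be primed to listen for only these few words.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative routing grammar: the engine will only "hear" these words. -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" mode="voice" xml:lang="en-US" root="topic">
  <rule id="topic">
    <one-of>
      <item>policy</item>
      <item>billing</item>
      <item>claims</item>
    </one-of>
  </rule>
</grammar>
```

Constraining the input space this aggressively is exactly why prompt wording mattered so much: the prompt had to steer callers toward words the grammar could actually match.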
Part art, part science, Voice User Interface (VUI) design could make or break an entire speech recognition flow. Too many recognition failures early on in the interaction would cause user frustration and dramatically reduce the overall automation potential. For a topical discussion of current VUI design best practices, check out this Alexa-centric tutorial.
Here are some concrete examples from the realm of speech automation, ca. 2004:
Input prompt design
Outdated (even back then): “To renew or change your policy, press or say 1.”
Confusing prompts:
“Say ‘Billing’ to renew or change your policy”
“Do you want to change or remove features? Say ‘Yes’”
Better prompt, including examples: “For questions about your policy like renewals or any other changes, say ‘Policy.’”
Best for callers comfortable with open-ended interaction, but also the most difficult to implement back then, given the breadth of possible responses: “Tell me what you’d like to do. You can say things like ‘Renew my policy’ or ‘Update my phone number.’”
Landmarking
Landmarking is a useful technique in voice applications that helps callers manage their “cognitive load”: it implicitly signals whether the previous transaction succeeded or failed and helps callers understand where they are in the process. Note the landmarking cues in the system turns below (“Got it,” “First,” “next,” “And finally”). Also, note how the system elicits only one piece of information at a time to help manage the input flow and cognitive load:
System: I see you are calling from 650-176-2295. Is that the number you want me to use to look up your claim?
Caller: Yes.
System: Got it. Using the number ending in 2295. There are three more pieces of information to get to your status.
First, what’s your ZIP code?
Caller: It’s 90210
System: Great, next, what’s your car’s make and model? You can say things like “2004 Toyota Camry” or “95 Taurus”
Caller: Sure, it’s a 2001 Cadillac Eldorado
System: And finally, is this a personal or business policy?
Caller: Personal.
…
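The one-piece-at-a-time flow above can be sketched as a simple slot-filling loop. This is a minimal illustration, not production dialog code; the slot names and prompt wording are hypothetical:

```python
# Minimal sketch of one-slot-at-a-time elicitation with landmarking cues.
# Slot order, prompts, and landmark phrases are hypothetical examples.

SLOT_ORDER = ["zip_code", "make_model", "policy_type"]

PROMPTS = {
    "zip_code": "what's your ZIP code?",
    "make_model": "what's your car's make and model?",
    "policy_type": "is this a personal or business policy?",
}

# Ordinal landmarks tell the caller where they are in the flow.
LANDMARKS = ["First,", "Great, next,", "And finally,"]

def next_turn(slots: dict) -> str:
    """Return the next system prompt, prefixed with a landmarking cue."""
    remaining = [s for s in SLOT_ORDER if s not in slots]
    if not remaining:
        return "Thanks, I have everything I need to look up your status."
    filled = len(SLOT_ORDER) - len(remaining)  # how many slots are done
    return f"{LANDMARKS[filled]} {PROMPTS[remaining[0]]}"
```

For example, `next_turn({})` yields the "First, …" prompt, and after the ZIP code is captured, `next_turn({"zip_code": "90210"})` moves on with "Great, next, …" so the caller always knows the previous step landed.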
LLM Prompt Engineering - Everything Old is New Again
As we move from the topic of speech recognition to the world of Large Language Models (LLMs), we find that many of the principles that guided early voice user interface design are still relevant today. Let's explore how these concepts translate to modern AI systems and prompt engineering.
Bridging the Gap: From Speech Recognition to LLMs
Just as speech recognition systems use grammar files to prime the engine for specific utterances, LLM prompt engineering often involves providing context and examples to 'prime' the model for the desired output. For instance, when working with GPT models, you might include a few examples of the desired input-output format before asking the model to generate new responses.
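As a concrete sketch, here is what such few-shot priming might look like using the message format of a chat-completion API. The routing labels and example utterances are hypothetical, chosen to echo the insurance examples above:

```python
# Sketch of few-shot "priming": example input/output pairs go into the
# message list before the real query, much as a grammar file constrained
# early speech engines. Labels and examples here are hypothetical.

FEW_SHOT_EXAMPLES = [
    ("I need to renew my policy", "POLICY"),
    ("Why was I charged twice last month?", "BILLING"),
    ("My car was rear-ended yesterday", "CLAIMS"),
]

def build_messages(user_utterance: str) -> list:
    """Build a chat message list that primes the model with examples."""
    messages = [{
        "role": "system",
        "content": ("Classify each caller utterance as POLICY, BILLING, "
                    "or CLAIMS. Answer with the label only."),
    }]
    for utterance, label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": utterance})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": user_utterance})
    return messages
```

The resulting list would then be passed to the model as the conversation history; the few-shot pairs play much the same role a grammar file once did, narrowing the space of acceptable outputs.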
Let's check in with modern prompt engineering best practices. Here are some relevant tips from the official OpenAI Prompt Engineering Guide:
Write clear instructions (well…obviously!)
Include details in your query to get more relevant answers (seems familiar)
Specify the steps required to complete a task (also familiar)
Provide examples (see above)
Split complex tasks into simpler subtasks (yup)
For dialogue applications that require very long conversations, summarize or filter previous dialogue (aka landmarking)
…and many, many more!
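The "summarize or filter previous dialogue" tip above can be sketched as a simple history filter: keep the system instructions and the most recent turns, and replace older turns with a summary. In this illustrative sketch the summary is a stub; a real system would generate it with a separate LLM call:

```python
def filter_history(messages: list, keep_last: int = 4) -> list:
    """Keep system messages and the last `keep_last` turns; condense the rest.

    The summary text below is a placeholder standing in for a real
    summarization step (e.g. a separate LLM call).
    """
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    if len(turns) <= keep_last:
        return system + turns
    older, recent = turns[:-keep_last], turns[-keep_last:]
    summary = {
        "role": "system",
        "content": f"Summary of {len(older)} earlier turns (omitted).",
    }
    return system + [summary] + recent
```

Functionally, this is landmarking's cousin: the condensed history keeps the model (and, indirectly, the user) oriented about what has already happened without dragging the full transcript along.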
The Timeless Nature of Good Design
Modern LLMs are much more capable than early speech recognition systems. Written or even multi-modal input and output formats open up whole new worlds of use cases beyond the tiny amount of information that could be exchanged via the phone channel. But many of the best practices from way back then remain highly applicable and relevant today!
So when you encounter a usability challenge with your advanced LLM system and struggle to achieve user engagement, consider taking a page from the early days of voice recognition technology or similarly "outdated" technologies. The principles of clear prompts, user-friendly guidance, and reducing cognitive load are just as relevant now as they were then.
Conclusion
As AI technology continues to advance, these fundamental concepts will likely remain crucial in designing effective human-AI interactions. By understanding and applying these timeless principles, we can create more intuitive, efficient, and user-friendly AI systems that truly augment human capabilities. AI should mean Augmented Intelligence: working with and through people, not instead of them.
Sometimes the best and most innovative solutions come from revisiting, reinterpreting, and reincorporating the basics. What are your thoughts on the evolution of AI interaction design? What are some of the other parallels between established best practices in UX/UI design and highly usable modern LLM solutions? Share your experiences and insights in the comments below!