This is a nice post. It would be great if you could explain how to test a prompt for an MCP server and which tools to use.
Thank you, Sri!
1. I used the promptfoo library and applied several deterministic tests (such as checking that the prompt is loaded) as well as generative ones (such as running the prompt and validating the expected LLM output).
2. I ran it locally and confirmed that the prompt works with different clients and produces the results I wanted in at least five separate sessions.
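To give a rough idea of the deterministic side, here is a minimal sketch in plain Python. It is not promptfoo's API and not the actual test suite: `PROMPTS`, `render_prompt`, `check_prompt_loaded`, and the `summarize_issue` prompt are hypothetical stand-ins for whatever the server really registers.

```python
from string import Template

# Hypothetical registry standing in for the prompts an MCP server exposes.
PROMPTS = {
    "summarize_issue": Template("Summarize issue $issue_id in $style style."),
}


def check_prompt_loaded(name: str) -> bool:
    """Deterministic check: is the prompt registered at all?"""
    return name in PROMPTS


def render_prompt(name: str, **arguments: str) -> str:
    """Look up a prompt by name and fill in its arguments."""
    if name not in PROMPTS:
        raise KeyError(f"prompt not registered: {name}")
    # Template.substitute raises KeyError when a placeholder has no value,
    # so a missing argument fails loudly instead of rendering a broken prompt.
    return PROMPTS[name].substitute(**arguments)
```

Checks like these run without calling a model, so they are cheap enough to run on every change; the generative checks then only need to cover what the model does with the rendered prompt.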
I think a more complicated question would be how to test the addition of a new MCP tool and how to validate that clients use it when I want them to.
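One possible starting point for the tool question, sketched under heavy assumptions: if you can record which tools a client called in each session, you can measure how often the expected tool was picked. The `ToolCall` record and `tool_selection_rate` function below are hypothetical; no real client exposes this exact trace format.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Hypothetical record of one tool invocation captured from a client session."""
    tool: str
    arguments: dict = field(default_factory=dict)


def tool_selection_rate(sessions: list[list[ToolCall]], expected_tool: str) -> float:
    """Fraction of sessions in which the client called expected_tool at least once."""
    if not sessions:
        raise ValueError("no sessions recorded")
    hits = sum(
        any(call.tool == expected_tool for call in session)
        for session in sessions
    )
    return hits / len(sessions)
```

Because tool choice is up to the model, a threshold over several sessions (for example, requiring the rate to stay above some value you pick) is probably more realistic than expecting the tool to be chosen every time.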