Smarter test runs: How we use AI and Qase to turn failures into fast fixes

If you’ve ever stared at a wall of logs after a test fails, you know the feeling: it’s not the failure itself that slows you down, it’s the messy detective work that follows.
You jump between tabs, scroll through endless stack traces, and try to piece together what actually happened, or you step through the Playwright trace hoping it will tell you.
It gets even worse when a test fails while you're away and a developer has to work out on their own what exactly happened. It’s exhausting, especially when the problem turns out to be something simple, and on many occasions it is.
We wanted to cut out the noise. We wanted a simple solution, with no massive overhaul.
We added a small AI layer on top of our existing Playwright + Qase setup. Nothing dramatic. But it changed the day-to-day experience more than we expected.
Before: All the data, none of the clarity
A typical failing test gave us:
- An error message that might or might not be helpful
- A long stack trace
- Console logs buried under more logs
- A Playwright trace to dig through: screenshots, network events, and so on
All the information was technically there, but it felt like solving a puzzle every single time.
After: A bit of AI that immediately points you in the right direction
Now, when a test fails:
- Our custom reporter gathers the important pieces (title, steps, error, stack).
- Instead of just writing that out to a file, it sends a short, structured request to an AI model asking for a breakdown: “What went wrong, what’s the evidence, and what should a developer look at first?”
- The model returns a simple explanation.
- We save the summary locally and attach it directly inside Qase.
So when you open a failed test run in Qase, you see an actual narrative instead of a pile of raw data: a diagnostic explanation of the failure and a clear note on what the developer should check and do next.
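To make that flow concrete, here is a rough sketch of what such a custom reporter can look like. Everything Playwright-specific (the `Reporter` interface, `onTestEnd`, `onEnd`, `TestResult.steps`, `TestResult.error`) is real API; the model endpoint `AI_SUMMARY_ENDPOINT`, its request/response shape, the `askModelForBreakdown` helper, and the `ai-summaries` output folder are illustrative placeholders, not part of our actual implementation or of any library.

```ts
// ai-failure-reporter.ts - a minimal sketch of the flow described above.
// The AI endpoint, its payload shape, and the output folder are assumptions.
import * as fs from 'fs';
import * as path from 'path';
import type { Reporter, TestCase, TestResult } from '@playwright/test/reporter';

// Hypothetical helper: send a structured prompt to whatever model you use.
async function askModelForBreakdown(prompt: string): Promise<string> {
  // Uses the global fetch available in Node 18+.
  const res = await fetch(process.env.AI_SUMMARY_ENDPOINT!, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });
  const data = await res.json();
  return data.summary; // assumed response shape
}

export default class AiFailureReporter implements Reporter {
  private failures: Array<{ test: TestCase; result: TestResult }> = [];

  onTestEnd(test: TestCase, result: TestResult) {
    // Only collect the failures; the async work happens later.
    if (result.status === 'failed' || result.status === 'timedOut') {
      this.failures.push({ test, result });
    }
  }

  // Playwright awaits onEnd, so the model calls are safe to do here.
  async onEnd() {
    fs.mkdirSync('ai-summaries', { recursive: true });
    for (const { test, result } of this.failures) {
      // Gather the important pieces: title, steps, error, stack.
      const prompt = [
        `Test: ${test.title}`,
        `Steps:\n${result.steps.map((s) => s.title).join('\n')}`,
        `Error: ${result.error?.message ?? 'unknown'}`,
        `Stack:\n${result.error?.stack ?? ''}`,
        'Explain what went wrong, quote the evidence, and say what a developer should check first.',
      ].join('\n\n');

      const summary = await askModelForBreakdown(prompt);

      // Save the summary locally; attaching it to Qase is a separate step.
      fs.writeFileSync(path.join('ai-summaries', `${test.id}.md`), summary);
    }
  }
}
```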
Why this combo works
Qase already gives us a neat place to look at results. AI adds the connective tissue: the part that explains what the results mean.
No new dashboards. No “AI-only mode.” Just better explanations in the same place everyone is already looking: Qase.
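For the "same place everyone is already looking" part, one possible way to get the summary into Qase is to include it as the comment on the reported result. The sketch below is a hedged illustration based on the Qase API v1 result endpoint; the exact endpoint, field names, and environment variables are assumptions, and in practice your existing Qase reporter integration may already handle result reporting for you.

```ts
// Hypothetical helper: push the AI summary into Qase as the result comment.
// Endpoint, fields, and env vars are assumptions; check the Qase API docs.
async function publishSummaryToQase(runId: number, caseId: number, summary: string) {
  await fetch(`https://api.qase.io/v1/result/${process.env.QASE_PROJECT}/${runId}`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Token: process.env.QASE_API_TOKEN ?? '',
    },
    body: JSON.stringify({
      case_id: caseId,
      status: 'failed',
      comment: summary, // the AI explanation sits right next to the result
    }),
  });
}
```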
What AI actually does
No magic auto-fixing. No rewriting selectors for us. What it does:
- Points out flaky patterns (timeouts, missing elements, navigation jumps)
- Flags questionable selectors
- Suggests practical fixes
- Highlights the exact log lines related to the failure
It’s basically someone saying, “Hey, check this part right here.”
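To make those bullets concrete, this is roughly the shape of breakdown we ask for. The field names and the instruction text below are illustrative, not a standard or the exact prompt we run in production.

```ts
// Illustrative response shape we ask the model to fill in.
interface FailureBreakdown {
  rootCause: string;            // one-line "what went wrong"
  evidence: string[];           // the exact log/stack lines that support it
  suspectedFlakiness: 'timeout' | 'missing-element' | 'navigation' | null;
  selectorConcerns: string[];   // selectors that look brittle
  suggestedFix: string;         // a practical next step for the developer
}

// Illustrative instruction block sent along with the failure details.
const instructions = `
You are helping triage a failed Playwright test.
Return JSON with: rootCause, evidence (quote the relevant lines),
suspectedFlakiness (timeout / missing-element / navigation or null),
selectorConcerns, and suggestedFix. Be brief and concrete.
`;
```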
What we gained
| Pain Before | What We Get Now |
| --- | --- |
| Time wasted figuring out where to start | A short summary you can read in seconds |
| Repeatedly diagnosing the same types of issues | Pattern awareness across runs |
| Junior engineers stuck in logs | Clear pointers on what to investigate |
| Long handoffs between QA and developers | Everyone sees the same explanation |
What we didn’t do
- We didn’t automate everything.
- We didn’t build a huge ML system.
- We never send sensitive data or full internal logs, only the fields we explicitly choose to pass in.
- And if the AI service is down, tests still behave exactly the same.
This is an add-on, not a dependency.
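Keeping it an add-on mostly comes down to one guard around the model call: if the AI side is slow or unavailable, we skip the summary and the run carries on untouched. A minimal sketch of that guard, again assuming a placeholder `AI_SUMMARY_ENDPOINT` and response shape:

```ts
// Non-fatal wrapper around the model call: any failure means "no summary",
// never a broken test report. Endpoint and response shape are assumptions.
async function safeSummarize(prompt: string, timeoutMs = 10_000): Promise<string | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(process.env.AI_SUMMARY_ENDPOINT!, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt }),
      signal: controller.signal,
    });
    if (!res.ok) return null;
    return (await res.json()).summary ?? null;
  } catch {
    // AI down, timed out, or misconfigured: the tests behave exactly as before.
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```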
Developer experience today
Run the tests. If something fails, the failure still shows up, but now with:
- A root-cause line
- A snippet of evidence
- A suggestion for how to fix it
You get the signal, not the noise.
Why this matters even outside our team
This approach has a bunch of second-order benefits:
- New teammates onboard faster
- More consistency in how failures are analyzed
- Better test hygiene (poor selectors get called out immediately)
- You build a historical trail of why things broke, not just that they broke
It turns automated tests into something closer to a feedback system.
Final thought
Automated tests already tell you when something breaks. With a tiny bit of AI, they can start telling you why. You don’t need anything fancy to replicate this:
- Collect the failure details in a structured way.
- Pass them into a simple prompt.
- Put the result somewhere the team already checks.
- Iterate.
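The wiring can stay this small. In a Playwright setup, it is essentially one line in the config; `playwright-qase-reporter` is Qase's reporter package (configured per its own docs and environment variables), and `./ai-failure-reporter.ts` stands in for a custom reporter like the one sketched earlier.

```ts
// playwright.config.ts - a minimal sketch of how the pieces sit together.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['list'],                      // the usual console output
    ['playwright-qase-reporter'],  // results land in Qase as before
    ['./ai-failure-reporter.ts'],  // adds the AI summary on failures
  ],
});
```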
Small change, surprisingly big payoff.






