Smarter test runs: How we use AI and Qase to turn failures into fast fixes


If you’ve ever stared at a wall of logs after a test fails, you know the feeling: it’s not the failure itself that slows you down, it’s the messy detective work that follows.

You jump between tabs, scroll through endless stack traces, and try to piece together what actually happened – or you step through the Playwright trace to figure it all out.

This becomes an even bigger problem when a test fails in your absence and a developer has to troubleshoot what exactly happened. It’s exhausting, especially when the problem turns out to be something simple – and in many cases it is.

We wanted to cut out the noise. We wanted a simple solution with no massive overhaul.

We added a small AI layer on top of our existing Playwright + Qase setup. Nothing dramatic. But it changed the day-to-day experience more than we expected.

Before: All the data, none of the clarity

A typical failing test gave us:

  • An error message that may or may not be helpful
  • A long stack trace
  • Console logs buried under more logs
  • A Playwright trace, where you click through screenshots, network events, and so on

All the information was technically there, but it felt like solving a puzzle every single time.

After: A bit of AI that immediately points you in the right direction

Now, when a test fails:

  1. Our custom reporter gathers the important pieces (title, steps, error, stack).
  2. Instead of just writing that out to a file, it sends a short, structured request to an AI model asking for a breakdown: “What went wrong, what’s the evidence, and what should a developer look at first?”
  3. The model returns a simple explanation.
  4. We save the summary locally and attach it directly inside Qase.

So when you open a failed test run in Qase, you see an actual narrative instead of a pile of raw data: a diagnostic explanation of the failure and what a developer should check first.
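For the curious, here is roughly what that reporter looks like. This is a minimal sketch, not our production code: the model name, endpoint, env vars, and the summaries/ directory are placeholders, and it assumes Node 18+ for the built-in fetch.

```typescript
// ai-summary-reporter.ts – a minimal sketch of the idea, not our exact code.
import * as fs from 'fs';
import type { Reporter, TestCase, TestResult } from '@playwright/test/reporter';

class AiSummaryReporter implements Reporter {
  private failures: { test: TestCase; result: TestResult }[] = [];

  onTestEnd(test: TestCase, result: TestResult) {
    // 1. Collect failing tests; the async work happens in onEnd.
    if (result.status === 'failed') this.failures.push({ test, result });
  }

  async onEnd() {
    fs.mkdirSync('summaries', { recursive: true });
    for (const { test, result } of this.failures) {
      // 2. Gather the important pieces: title, steps, error, stack.
      const payload = {
        title: test.title,
        steps: result.steps.map((s) => s.title),
        error: result.error?.message ?? '',
        stack: result.error?.stack ?? '',
      };
      // 3. Ask the model for a breakdown instead of dumping raw data.
      const summary = await this.summarize(payload);
      // 4. Save locally; a later step attaches the file to the run in Qase.
      fs.writeFileSync(`summaries/${test.id}.md`, summary);
    }
  }

  private async summarize(payload: object): Promise<string> {
    // Assumes an OpenAI-compatible chat endpoint; model name is a placeholder.
    const res = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: 'gpt-4o-mini',
        messages: [{
          role: 'user',
          content:
            'A Playwright test failed. Explain what went wrong, what the ' +
            'evidence is, and what a developer should look at first.\n\n' +
            JSON.stringify(payload, null, 2),
        }],
      }),
    });
    const data = await res.json();
    return data.choices[0].message.content;
  }
}

export default AiSummaryReporter;
```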

Why this combo works

Qase already gives us a neat place to look at results. AI adds the connective tissue: the part that explains what the results mean.

No new dashboards. No “AI-only mode.” Just better explanations in the same place everyone is already looking: Qase.
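Getting the summary into Qase is one attachment upload plus a result that references it. The sketch below follows Qase’s public REST API as we understand it (the Token header, the /attachment/{code} and /result/{code}/{run_id} endpoints); treat the exact shapes as assumptions and check the current API docs before relying on them.

```typescript
// attach-summary.ts – a hedged sketch of pushing a saved summary into Qase.
import * as fs from 'fs';

const QASE = 'https://api.qase.io/v1';
const PROJECT = 'DEMO'; // hypothetical project code
const headers = { Token: process.env.QASE_API_TOKEN ?? '' };

export async function attachSummary(runId: number, caseId: number, file: string) {
  // Upload the markdown summary and get back its attachment hash.
  const form = new FormData();
  form.append('file', new Blob([fs.readFileSync(file)]), 'ai-summary.md');
  const upload = await fetch(`${QASE}/attachment/${PROJECT}`, {
    method: 'POST',
    headers,
    body: form,
  });
  const { result } = await upload.json();

  // Report the failed result with the summary attached, so the explanation
  // shows up right where everyone already looks.
  await fetch(`${QASE}/result/${PROJECT}/${runId}`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      case_id: caseId,
      status: 'failed',
      attachments: [result[0].hash],
    }),
  });
}
```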

What AI actually does

No magic auto-fixing. No rewriting selectors for us. What it does:

  • Points out flaky patterns (timeouts, missing elements, navigation jumps)
  • Flags questionable selectors
  • Suggests practical fixes
  • Highlights the exact log lines related to the failure

It’s basically someone saying, “Hey, check this part right here.”
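To steer the model toward exactly those things, the prompt spells them out. Here is an illustrative template – not our production wording – whose fields match what the reporter sketch above collects:

```typescript
// buildPrompt.ts – illustrative prompt template, not our exact wording.
function buildPrompt(p: { title: string; error: string; stack: string }): string {
  return [
    'A Playwright test failed. Using only the data below:',
    '1. Name the most likely root cause (timeout, missing element, navigation jump, bad selector).',
    '2. Quote the exact log or stack lines that support it.',
    '3. Suggest one practical fix a developer can try first.',
    '',
    `Test: ${p.title}`,
    `Error: ${p.error}`,
    `Stack trace:\n${p.stack}`,
  ].join('\n');
}
```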

What we gained

Pain Before                                    | What We Get Now
-----------------------------------------------|------------------------------------------
Time wasted figuring out where to start        | A short summary you can read in seconds
Repeatedly diagnosing the same types of issues | Pattern awareness across runs
Junior engineers stuck in logs                 | Clear pointers on what to investigate
Long handoffs between QA and developers        | Everyone sees the same explanation

What we didn’t do

  • We didn’t automate everything.
  • We didn’t build a huge ML system.
  • We never send sensitive or full internal logs, only what we decide to pass in.
  • And if AI is down, tests still behave exactly the same.

This is an add-on, not a dependency.
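That last point is just a try/catch around the AI call. A sketch, reusing the hypothetical names from the reporter above:

```typescript
// safe-summarize.ts – graceful degradation: an AI outage never fails the suite.
type FailurePayload = { title: string; error: string; stack: string };

// The AI call from the reporter sketch; declared here to keep the sketch compiling.
declare function summarizeFailure(payload: FailurePayload): Promise<string>;

async function safeSummarize(payload: FailurePayload): Promise<string> {
  try {
    return await summarizeFailure(payload);
  } catch {
    // AI down or unreachable: the test result is unchanged, just without a summary.
    return `AI summary unavailable. Raw error:\n${payload.error}`;
  }
}
```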

Developer experience today

Run the tests. If something fails, the failure still shows up, but now with:

  • A root-cause line
  • A snippet of evidence
  • A suggestion for how to fix it

You get the signal, not the noise.
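For a flavor of what that looks like, here is a made-up example of a summary (illustrative, not real output):

```
Root cause: the "Place order" button never became visible, so the click timed out.
Evidence:   locator('#submit-order') resolved to 0 elements after the redirect to /checkout.
Suggestion: the checkout redesign likely renamed the button; prefer a role-based
            locator such as getByRole('button', { name: 'Place order' }).
```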

Why this matters even outside our team

This approach has a bunch of second-order benefits:

  • New teammates onboard faster
  • More consistency in how failures are analyzed
  • Better test hygiene (poor selectors get called out immediately)
  • You build a historical trail of why things broke, not just that they broke

It turns automated tests into something closer to a feedback system.

Final thought

Automated tests already tell you when something breaks. With a tiny bit of AI, they can start telling you why. You don’t need anything fancy to replicate this:

  1. Collect the failure details in a structured way.
  2. Pass them into a simple prompt.
  3. Put the result somewhere the team already checks.
  4. Iterate.
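Wiring it in is one line in the Playwright config – here with the hypothetical reporter file from the sketch above:

```typescript
// playwright.config.ts – register the custom reporter alongside the ones you already use.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['list'],
    ['./ai-summary-reporter.ts'], // the sketch from earlier in this post
  ],
});
```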

Small change, surprisingly big payoff.
