K6 Performance Testing Interview Questions — What SDET Panels Ask About Load Testing, Stress Testing, and Grafana k6 in 2026
Real k6 performance testing interview questions from SDET panels. Covers k6 vs JMeter vs Gatling, JavaScript scripting, load/stress/soak/spike test types, metrics and thresholds, CI/CD integration, and the Grafana k6 questions that separate candidates who've run a performance test from those who understand performance engineering. Built from panels at HMRC, MoD, Nationwide, and Accenture.
Published 13 May 2026 • By Mitchell Agoma
It's 11pm. Your SDET interview is in 10 hours. You've rehearsed your Playwright framework design answer until it flows like conversation. You can discuss CI/CD pipelines, API testing, and BDD scenarios without breaking stride. Then you re-read the job description one last time and your stomach drops: "Experience with performance testing — k6, JMeter, or Gatling."
You've run JMeter once. You recorded a script, added some users, and watched green lines fill a dashboard. But k6? You've heard it's "developer-friendly" and "JavaScript-based" — and that's where your knowledge ends. Now you're picturing the panel asking you to explain the difference between a load test and a stress test, or describe how you'd set thresholds in k6, or — worst of all — defend why an SDET should care about performance testing when there's a dedicated performance engineering team. And you realise you've never had to articulate performance testing. You've only ever acknowledged it exists.
This guide is for that moment. Built from 20 years of sitting on both sides of the SDET interview table — at HMRC, the Ministry of Defence, Nationwide, and Accenture — it covers exactly what interviewers ask about k6 performance testing, how they separate candidates who've integrated performance testing into their SDET practice from those who've only run a script someone else wrote, and how SDET Interview Coach prepares you for performance-specific questions so you walk into that room with answers that demonstrate performance engineering thinking, not performance testing buzzwords.
Why k6 Questions Are Separating SDET Candidates in 2026
Two years ago, performance testing in an SDET interview was a bonus question — mention JMeter and you'd get a nod. In 2026, k6 has disrupted the performance testing landscape, and candidates who can't discuss performance testing fluently are losing offers to those who can. Here's what's changed:
- Grafana k6 has become the fastest-growing load testing tool — and interviewers have noticed. k6's developer-first approach — write tests in JavaScript, version them in Git, run them in CI/CD — has made it the default choice for organisations adopting DevTestOps. Unlike JMeter's XML-based test plans and GUI-driven workflow, k6 treats performance tests as code. Interviewers at Nationwide and Accenture have told Mitchell they now probe k6 knowledge specifically because it signals a candidate who thinks about performance testing as an engineering practice, not a separate testing phase.
- Performance testing has shifted left — into the SDET's territory. The "shift-left" movement that brought functional testing earlier in the development cycle has now pulled performance testing with it. Organisations that once ran performance tests weeks before release are now embedding performance checks into CI/CD pipelines — smoke tests on every PR, load tests on merge to main, and stress tests on a scheduled cadence. SDETs are being asked to own these performance gates alongside functional testing gates, and k6's CLI-first, code-first design makes it the natural tool for this integration.
- AI-driven traffic patterns are making performance testing more complex. As LLMs generate increasingly realistic user behaviour simulations and traffic patterns, the ability to script complex scenarios in k6 — multi-step user journeys with dynamic data, API chaining, and conditional branching — is becoming a differentiator. The 2026 SDET interview expects you to understand that performance testing isn't just about hitting an endpoint with 1,000 concurrent users; it's about modelling realistic user behaviour and interpreting the results to prevent production incidents.
Performance testing isn't a separate discipline from quality assurance. It's quality assurance applied to the system under load. Interviewers who've been through a production outage know the difference between a tester who's run k6 and a tester who understands what the metrics mean — and they're probing for that distinction in every round.
k6 vs JMeter vs Gatling — The Comparison Question Every Interview Tests
This is the question that appears in nearly every performance testing interview, in some form: "Why k6 instead of JMeter or Gatling?" The interviewer isn't looking for JMeter bashing — they're testing whether you understand the trade-offs between performance testing tools and can justify tool selection on engineering grounds, not personal preference. Here's what a strong answer covers for each tool:
Apache JMeter — The Legacy Workhorse
JMeter has been the industry-standard load testing tool for over two decades. It's GUI-driven (though it has a CLI mode), Java-based, and supports virtually every protocol — HTTP, JDBC, FTP, LDAP, JMS, SOAP, and more through its massive plugin ecosystem. The strengths: mature, well-documented, enormous community, and its GUI makes it accessible to non-developers. The trade-offs: JMeter's thread-per-user model consumes significant memory (one virtual user = one thread), limiting it to thousands of concurrent users on a single machine without distributed testing. Its XML-based test plans are difficult to version-control and code-review. And its GUI-driven workflow makes CI/CD integration clunky — you typically record in GUI, export to XML, and run via CLI, creating a disconnect between authoring and execution. JMeter is a strong choice for: organisations with existing JMeter expertise, complex multi-protocol testing (beyond HTTP), and teams where non-developers need to create and modify performance tests through a GUI. It's less suited for: developer-owned performance testing in CI/CD, teams that treat tests as code, and organisations adopting DevTestOps practices.
Gatling — The Scala Power Tool
Gatling is a developer-focused load testing tool written in Scala, originally scripted through a Scala DSL. Its key innovation: an asynchronous, non-blocking engine (historically built on Akka and Netty) that can simulate tens of thousands of concurrent users on a single machine, dramatically more efficient than JMeter's thread-per-user model. The strengths: exceptional performance efficiency, a rich DSL for complex scenarios (loops, conditions, feeders for test data), and built-in HTML reports that are genuinely useful: response time percentiles, active users over time, and request distribution. The trade-offs: Gatling's Scala DSL has a learning curve. Recent versions also offer Java and Kotlin DSLs, but if your team doesn't work on the JVM, the onboarding cost is real. The Maven/Gradle build integration is Java-ecosystem-centric. And while Gatling has an active community, it's smaller than JMeter's and k6's. Gatling is a strong choice for: Java/Scala shops that need high-throughput testing, teams that already have JVM expertise, and scenarios requiring complex virtual user logic with minimal resource consumption. It's less suited for: JavaScript/TypeScript teams, rapid prototyping of performance tests, and organisations where the test-as-code workflow benefits from a language the whole team already knows.
Grafana k6 — The DevTestOps Native
k6 is the modern entrant that has reshaped the performance testing landscape. It's a Go-based engine with JavaScript test scripting — you write tests in ES6 JavaScript, the engine executes them with Go-level performance. The key innovations: (1) Code-first — tests are JavaScript files you version in Git, review in pull requests, and run in CI/CD, exactly like your functional tests. (2) Go-powered engine — k6 uses Go's goroutine model (not threads) for concurrency, enabling high virtual user counts with minimal resource consumption. (3) CLI-native — designed to run headless in CI/CD from day one, with JSON/CSV output for pipeline integration and native Grafana dashboard integration. (4) Built-in checks, thresholds, and metrics — you define pass/fail criteria in code, and k6 enforces them at runtime. The trade-offs: k6 is HTTP-first (HTTP/1.1, HTTP/2, WebSocket, gRPC) — it doesn't support the breadth of protocols JMeter does. It doesn't have a GUI recorder (though the browser extension and HAR converter help). And browser-based testing in k6 (k6 browser module) is still maturing compared to dedicated browser automation tools. k6 is the strongest choice for: DevTestOps teams, JavaScript/TypeScript shops, CI/CD-integrated performance testing, and organisations adopting shift-left performance engineering. It's the fastest-growing tool in the category for a reason — and that's exactly why interviewers are asking about it.
The senior-level answer acknowledges that tool choice is contextual. A startup shipping a Node.js API to AWS might choose k6 for developer alignment and CI/CD integration. A large enterprise with a decade of JMeter scripts, non-technical testers, and multi-protocol requirements might stay with JMeter. A Scala-based team doing microservice load testing at 50,000 RPS might choose Gatling for resource efficiency. The candidate who can discuss tool trade-offs without evangelising demonstrates engineering maturity — the kind interviewers specifically probe for at senior level and above.
k6 JavaScript Scripting Fundamentals — What Interviewers Expect You to Know
k6's JavaScript scripting model is its defining feature — and the area where interviewers probe deepest. They're not testing whether you can memorise the k6 API. They're testing whether you understand the scripting model well enough to design realistic performance tests. Here's what they ask:
The Test Lifecycle — Four Stages
Every k6 script has a lifecycle that interviewers expect you to understand: (1) Init stage: runs once per VU, before that VU starts iterating. This is where you import modules (import http from 'k6/http'), define options (virtual users, duration, thresholds), and set up global configuration. No HTTP requests can be made here; it's pure setup. (2) Setup stage: runs once, before VU (virtual user) iterations begin. Use export function setup() to perform one-time preparation: authenticate and get a token, create test data, or warm up caches. The return value is passed to every VU's default function and to teardown. (3) VU stage: the default function (export default function()) runs repeatedly for each virtual user, for the duration of the test. This is where your performance test logic lives: HTTP requests, checks, sleep timers, and business-logic flows. (4) Teardown stage: runs once, after all VUs finish. Use export function teardown(data) to clean up: delete test data, revoke tokens, close connections. The candidate who can describe this lifecycle, and explain that the whole script (not just the init stage) runs in k6's embedded Go-based JavaScript engine (Goja-derived, not Node.js) and therefore doesn't have access to Node.js APIs, demonstrates genuine k6 experience, not just documentation reading.
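The call order can be sketched in plain JavaScript. This is a simulation, not k6 itself: the runLifecycle driver is a hypothetical stand-in for the k6 engine so the snippet runs under Node, and the token value and the two-VU, two-iteration shape are illustrative assumptions. In a real script you would export setup, the default function, and teardown, and k6 would call them for you.

```javascript
// Records the order in which the lifecycle functions are called.
const log = [];

// Setup: called once; its return value is handed to every iteration and to teardown.
function setup() {
  log.push('setup');
  return { token: 'demo-token' }; // hypothetical value; a real script would fetch one over HTTP
}

// VU code: in k6 this is `export default function (data) { ... }`.
function vuCode(data, vu, iter) {
  log.push(`vu${vu}-iter${iter}-${data.token}`);
}

// Teardown: called once, after all VUs have finished.
function teardown(data) {
  log.push(`teardown-${data.token}`);
}

// Hypothetical driver mimicking the call order (sequential here; k6 runs VUs concurrently).
function runLifecycle(vus, iterationsPerVu) {
  const data = setup();
  for (let vu = 1; vu <= vus; vu++) {
    for (let iter = 0; iter < iterationsPerVu; iter++) {
      vuCode(data, vu, iter);
    }
  }
  teardown(data);
}

runLifecycle(2, 2);
console.log(log.join('\n'));
```

The point to draw out in an interview is that setup and teardown each appear exactly once in the log, however many VUs and iterations run in between.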
Checks vs Thresholds — The Critical Distinction
This is the k6 question that catches most candidates. Checks (check(res, { 'status is 200': (r) => r.status === 200 })) are per-request assertions — they verify that individual responses meet criteria. Checks don't fail the test; they report pass/fail rates. A test with 99% check pass rate still exits with code 0 (success). Thresholds are test-level pass/fail criteria — they define acceptable performance boundaries and cause the test to fail (exit code non-zero) when breached. Example: thresholds: { http_req_duration: ['p(95)<500'] } means "the 95th percentile response time must be under 500ms, or the test fails." The critical interview insight: checks validate functional correctness under load ("are my responses still 200?"), thresholds validate performance requirements ("are my responses still fast enough?"). Both are essential. Candidates who conflate them — or who only use checks without thresholds — demonstrate they've run k6 without understanding performance engineering. Bonus: mention that k6 supports custom metrics (new Trend('my_metric')) and that you can set thresholds on custom metrics, not just built-in ones.
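A minimal options sketch of the distinction, assuming the 500 ms and 1% figures used above and an illustrative 99% check pass rate. The last entry shows one way to make checks count towards pass/fail: a threshold on k6's built-in checks rate metric.

```javascript
// In a k6 script this object would be exported: `export const options = { ... }`.
const options = {
  thresholds: {
    // Performance requirement: 95% of requests complete in under 500 ms.
    http_req_duration: ['p(95)<500'],
    // Built-in failure rate: fewer than 1% of requests may fail.
    http_req_failed: ['rate<0.01'],
    // Checks alone never fail a run. A threshold on the built-in `checks`
    // rate metric turns the check pass rate into a pass/fail criterion.
    checks: ['rate>0.99'],
  },
};

console.log(JSON.stringify(options, null, 2));
```

Without the checks entry, a run in which 5% of checks fail would still exit with code 0 as long as the other two thresholds hold.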
Virtual Users, Iterations, and Ramping Patterns
Interviewers probe your understanding of k6's execution model: virtual users (VUs) execute iterations of the default function concurrently. You control the test shape through the options object. The three parameters: vus (fixed number of concurrent VUs), iterations (total number of script executions to perform), and duration (how long the test runs). You specify combinations: fixed VUs for a duration ({ vus: 100, duration: '5m' }), or a specific number of iterations shared across VUs ({ vus: 10, iterations: 1000 }). The advanced answer discusses stages, which are ramping patterns for realistic load profiles: stages: [{ duration: '2m', target: 100 }, { duration: '5m', target: 100 }, { duration: '2m', target: 0 }] ramps up to 100 VUs over 2 minutes, holds for 5 minutes, then ramps down. The candidate who can discuss stages and explain why a linear ramp-up (rather than an instant spike) produces more realistic load demonstrates understanding of production traffic patterns. Even better: mention the scenarios API for advanced executors: constant-arrival-rate (start iterations at a fixed rate, regardless of response time) vs ramping-arrival-rate (gradually change the iteration rate), which model real-world API consumption more accurately than the basic VU model.
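The two execution models can be put side by side as scenario definitions. This is a sketch under assumptions: the scenario names (browse, api), the rate of 50 iterations per second, and the VU pool sizes are illustrative. Both scenarios sit in one options object only to show the shape; a real test would usually pick the one that matches the traffic being modelled.

```javascript
// Closed model: the VU count follows the stages (ramp up, hold, ramp down).
const rampingVus = {
  executor: 'ramping-vus',
  startVUs: 0,
  stages: [
    { duration: '2m', target: 100 },
    { duration: '5m', target: 100 },
    { duration: '2m', target: 0 },
  ],
};

// Open model: k6 starts 50 iterations per second regardless of how quickly
// the system responds, drawing VUs from a pre-allocated pool.
const constantRate = {
  executor: 'constant-arrival-rate',
  rate: 50,
  timeUnit: '1s',
  duration: '5m',
  preAllocatedVUs: 100,
  maxVUs: 300,
};

// In a k6 script: `export const options = { scenarios: { ... } }`.
const options = { scenarios: { browse: rampingVus, api: constantRate } };

console.log(Object.keys(options.scenarios));
```

With the closed model, a slow system quietly lowers the request rate because each VU waits for its response; with the arrival-rate executor the rate holds and the slowdown shows up as rising response times and VU usage instead.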
Performance Test Types — Load, Stress, Soak, and Spike Testing in k6
Every performance testing interview probes your understanding of test types. The question comes in different forms: "What's the difference between a load test and a stress test?" or "When would you run a soak test?" or "Design a performance test strategy for a new feature launch." Here's what interviewers expect you to know — and how to implement each in k6:
Load Testing
What it is: Testing the system under expected normal and peak load conditions. The goal is to verify that the system meets performance requirements under realistic traffic. Interview question: "How would you design a load test for an e-commerce checkout API?" Strong answer: "First, I'd determine the expected load. Say the product team reports that peak traffic is 500 checkouts per minute. My k6 script would model a realistic checkout flow: add item to basket → apply discount code → calculate shipping → submit payment. I'd use the ramping-arrival-rate executor to simulate 500 iterations per minute, with stages to ramp up gradually (to avoid cold-start distortion) and hold at peak for 15 minutes. I'd set thresholds: p(95) response time under 2 seconds, error rate under 1%. I'd monitor server-side metrics (CPU, memory, database connections, connection pool utilisation) alongside k6 metrics. The key insight: a load test validates that the system performs under expected conditions. It doesn't find the breaking point; that's a stress test."
Stress Testing
What it is: Testing the system beyond normal capacity to find the breaking point and observe how it fails. The goal is to identify the maximum capacity and understand failure modes — does it degrade gracefully (slower responses) or catastrophically (crashes, data corruption)? Interview question: "How do you determine when to stop a stress test?" Strong answer: "I increase load in steps — 100% of expected peak, 150%, 200%, 250% — holding each level for 5-10 minutes. I stop when: (1) error rate exceeds 10%, (2) p(95) response time exceeds 5x the baseline, or (3) the application becomes unresponsive. But the real test is what happens after the stress is removed — does the system recover to normal performance, or does it require a restart? This recovery behaviour is what interviewers want to hear you discuss. In k6, I'd implement this with stages: ramp to expected load, then step up with hold periods. Thresholds would be set on error rate and response time to auto-fail when the system breaks. The operational insight: share the stress test schedule with the infrastructure team so they're not surprised by the load — and never run stress tests against production."
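The stepped profile described above can be generated rather than typed out by hand. A sketch, assuming a hypothetical expected peak of 200 VUs and a 500 ms baseline (so 5x baseline is 2,500 ms). steppedStages is an illustrative helper, not part of k6; abortOnFail and delayAbortEval are k6 threshold options that stop the run once a limit is crossed.

```javascript
// Builds stepped stages: ramp to each load level, then hold there.
function steppedStages(peakVus, multipliers, rampTime, holdTime) {
  const stages = [];
  for (const m of multipliers) {
    const target = Math.round(peakVus * m);
    stages.push({ duration: rampTime, target }); // ramp to this level
    stages.push({ duration: holdTime, target }); // hold at this level
  }
  stages.push({ duration: rampTime, target: 0 }); // ramp down to observe recovery
  return stages;
}

// In a k6 script: `export const options = { ... }`.
const options = {
  stages: steppedStages(200, [1, 1.5, 2, 2.5], '2m', '5m'),
  thresholds: {
    // abortOnFail stops the run once the threshold is crossed;
    // delayAbortEval gives the metric time to collect samples first.
    http_req_failed: [{ threshold: 'rate<0.10', abortOnFail: true, delayAbortEval: '1m' }],
    http_req_duration: [{ threshold: 'p(95)<2500', abortOnFail: true, delayAbortEval: '1m' }],
  },
};

console.log(options.stages.map((s) => `${s.duration}->${s.target}`).join(', '));
```

One design caveat: aborting on failure ends the test before the ramp-down, so if recovery behaviour is what you want to observe, drop abortOnFail and let the final stage run.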
Soak Testing (Endurance Testing)
What it is: Testing the system under sustained load over an extended period (hours to days) to detect memory leaks, resource exhaustion, and degradation over time. Interview question: "You've passed load and stress tests. Why run a soak test?" Strong answer: "Load tests prove the system can handle peak traffic for minutes. Soak tests prove it can handle sustained traffic for hours. Problems that only surface over time: memory leaks (objects stay referenced, so garbage collection can never reclaim them), connection pool exhaustion (connections aren't returned to the pool), log file growth (disk fills up), database connection leaks, thread pool starvation, and caches that grow or fragment until hit rates fall. In k6, I'd run a soak test at 70-80% of peak load, enough to exercise the system without triggering the failure modes a stress test would find. Duration: typically 4-24 hours, depending on how long the application runs between deployments or restarts. The operational insight: schedule soak tests during low-traffic periods, because they're resource-intensive and long-running. And monitor server-side metrics that trend over time (heap memory, thread count, file handles, database connections), because the k6 metrics might stay stable while the server slowly degrades."
Spike Testing
What it is: Testing the system's response to sudden, dramatic increases in load — simulating flash sales, breaking news traffic, or DDoS-like conditions. Interview question: "A marketing campaign goes viral. Your traffic jumps from 100 to 10,000 concurrent users in 30 seconds. How would you test for this?" Strong answer: "This is a spike test scenario. In k6, I'd use stages with a near-vertical ramp: hold at baseline (100 VUs) for 2 minutes to establish normal behaviour, then ramp to 10,000 VUs in 30 seconds, hold for 5 minutes, and ramp back down. The key metrics: can the auto-scaling infrastructure provision resources fast enough? Does the load balancer handle the connection surge? Do connection pools and thread pools expand quickly enough, or do requests queue and time out? The test also validates alerting — does the monitoring system detect the spike and alert within the defined SLA? The common mistake: running spike tests against a system without informing the infrastructure team — the auto-scaling can trigger cloud costs and alarm fatigue. The responsible approach: coordinate with operations, set spending limits on auto-scaling, and monitor the infrastructure's response alongside the application's response."
k6 Metrics and Thresholds — The Numbers Interviewers Probe
Performance testing without metrics is just load generation. Interviewers who've done real performance testing will probe your understanding of what to measure, what the numbers mean, and how to set thresholds that separate acceptable performance from unacceptable. Here's what matters:
Response Time Metrics — Beyond the Average
"What response time metrics do you monitor in k6?" The weak answer says "average response time." The strong answer covers percentiles: http_req_duration with p(50) (median), p(90), p(95), and p(99). The key insight: the average is misleading. If 99 users get 100ms responses and 1 user gets a 30-second timeout, the average looks fine (~400ms) while 1% of users have a terrible experience. p(95) under 500ms means "95% of users get sub-500ms responses" — a much more useful guarantee. The advanced candidate also mentions http_req_waiting (time spent waiting for the server to respond — isolates network latency from server processing time), http_req_connecting (TCP handshake time), and http_req_tls_handshaking (TLS negotiation time). These sub-metrics help diagnose whether the problem is network, TLS overhead, or server processing. A candidate who can discuss long-tail latency — the p(99) or even p(99.9) response times where the worst user experiences live — demonstrates exposure to production performance engineering, not just test execution.
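The arithmetic behind that example is easy to check. A small sketch in plain JavaScript, assuming a nearest-rank percentile definition for simplicity (k6's own percentile calculation may interpolate and give slightly different values on other data sets):

```javascript
// Nearest-rank percentile over a list of durations in milliseconds.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// 99 requests at 100 ms plus one 30-second timeout.
const durations = Array(99).fill(100).concat([30000]);

console.log('avg  :', mean(durations));
console.log('p(95):', percentile(durations, 95));
console.log('p(99):', percentile(durations, 99));
console.log('max  :', Math.max(...durations));
```

With this data the mean is 399 ms, p(95) and p(99) are both 100 ms, and only the max exposes the 30-second request. No single statistic tells the whole story, which is why the strong answer reports several percentiles and looks at the far tail as well.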
Threshold Configuration — The Performance Contract
"How do you decide what thresholds to set?" Thresholds are k6's pass/fail mechanism — they're the performance contract between the test and the deployment pipeline. The strong answer: "I start from the business's performance requirements, not from arbitrary numbers. If the product team says 'the checkout page must load in under 3 seconds,' that becomes a threshold: http_req_duration: ['p(95)<3000']. I set thresholds on: (1) response time — p(95) and p(99) for critical endpoints, (2) error rate — http_req_failed: ['rate<0.01'] meaning under 1% failures, (3) iteration duration for end-to-end flows, and (4) custom metrics for business-specific checks (e.g., 'payment processing time under 5 seconds'). I also set different thresholds per endpoint — the health-check endpoint might have a p(95)<50ms threshold, while a report generation endpoint might have p(95)<10s. The key: thresholds gate deployments. If a load test fails its thresholds in CI, the deployment is blocked. This makes thresholds a product decision, not just a testing decision — and I involve product owners in defining what 'fast enough' means."
Custom Metrics — Measuring What Matters
"How would you track a business-specific metric in k6 that isn't a built-in HTTP metric?" k6 supports four custom metric types: Counter (cumulative value, such as total orders placed), Gauge (latest value, such as active database connections), Rate (percentage, such as successful payment rate), and Trend (statistical distribution, such as payment processing time). The interview answer: "For an e-commerce site, I'd create a Trend metric for checkout completion time and a Counter for successful purchases. I'd set thresholds: 99% of checkouts must complete within 5 seconds (checkout_time: ['p(99)<5000']). The power of custom metrics: they connect performance testing to business outcomes. When you can say 'under peak load, we still processed 487 orders per minute with 99% success rate,' you're talking the language that product owners and VPs of Engineering understand." Bonus: mention that samples can be tagged when they are recorded (responseTime.add(value, { service: 'payment' })); the Trend constructor itself takes only a name and an optional isTime flag (new Trend('response_time', true)). k6 outputs can be sent to Grafana Cloud, Datadog, New Relic, or any statsd-compatible backend for real-time dashboarding.
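A sketch of how such metrics are declared and fed. The Trend and Counter classes below are simplified stand-ins so the snippet runs under Node; in k6 you would delete them and import the real ones from 'k6/metrics'. The metric names and the recordCheckout helper are illustrative assumptions. Note that tags are attached when a sample is added, not in the constructor.

```javascript
// Stand-ins so the sketch runs under Node. In k6, delete these two classes
// and use: import { Trend, Counter } from 'k6/metrics';
class Trend {
  constructor(name, isTime = false) {
    this.name = name;
    this.isTime = isTime;
    this.samples = [];
  }
  add(value, tags = {}) {
    this.samples.push({ value, tags });
  }
}
class Counter {
  constructor(name) {
    this.name = name;
    this.total = 0;
  }
  add(value) {
    this.total += value;
  }
}

// Custom metrics are created in init code (top level of the script).
const checkoutTime = new Trend('checkout_time', true); // true = values are durations in ms
const ordersPlaced = new Counter('orders_placed');

// Hypothetical helper, called from the VU function after each checkout attempt.
function recordCheckout(durationMs, succeeded) {
  checkoutTime.add(durationMs, { service: 'payment' }); // tags go on the sample
  if (succeeded) ordersPlaced.add(1);
}

// In a k6 script: `export const options = { ... }`.
const options = {
  thresholds: { checkout_time: ['p(99)<5000'] },
};

recordCheckout(1200, true);
recordCheckout(4800, false);
console.log(ordersPlaced.total, checkoutTime.samples.length);
```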
k6 in CI/CD — The Pipeline Integration Question Every Senior SDET Faces
If you're interviewing at mid-level or above, expect this: "How would you integrate k6 performance tests into your CI/CD pipeline?" This tests whether you think about performance testing as an engineering gate, not a separate manual activity. Here's what a strong answer covers:
Pipeline Placement — What Runs When
"Where do performance tests fit in your pipeline stages?" The strong answer layers performance tests by speed and scope, just like functional tests: (1) Smoke test on every PR — 1-2 VUs for 1 minute, verifying that the application doesn't break under minimal load. This is sub-3 minutes and catches performance regressions introduced by code changes (e.g., a developer accidentally adds an N+1 query). (2) Load test on merge to main — expected peak load for 10-15 minutes, verifying the deployment doesn't degrade performance. This runs against staging and gates deployment to production. (3) Stress and soak tests nightly — beyond-peak load and extended duration tests that are too resource-intensive for every merge. These run against a production-like staging environment and report results to a shared dashboard. (4) Synthetic monitoring in production — k6 can run as a continuous monitoring tool, executing scripts at regular intervals from multiple geographic locations to detect performance degradation in production. The key: performance tests get progressively more intensive and less frequent as they move through the pipeline. This mirrors the functional testing pyramid and demonstrates you think about performance testing as a system, not a one-off activity.
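One way to keep a single script serving several pipeline stages is to choose the load profile from an environment variable. A sketch under assumptions: the profile names, VU counts, durations, and the TEST_PROFILE variable name are all illustrative. In k6 the variable would be read from the built-in __ENV object and passed with the -e flag.

```javascript
// Thresholds shared by every profile (illustrative values).
const sharedThresholds = {
  http_req_duration: ['p(95)<500'],
  http_req_failed: ['rate<0.01'],
};

// One entry per pipeline stage; numbers are placeholders to tune per system.
const profiles = {
  smoke: { vus: 2, duration: '1m' },
  load: {
    stages: [
      { duration: '3m', target: 100 },
      { duration: '10m', target: 100 },
      { duration: '2m', target: 0 },
    ],
  },
  soak: { vus: 80, duration: '4h' },
};

function optionsFor(profileName) {
  const profile = profiles[profileName];
  if (!profile) throw new Error(`Unknown profile: ${profileName}`);
  return { ...profile, thresholds: sharedThresholds };
}

// In a k6 script: `export const options = optionsFor(__ENV.TEST_PROFILE || 'smoke');`
const options = optionsFor('smoke');
console.log(options);
```

The pipeline would then call something like k6 run -e TEST_PROFILE=load script.js on merge to main, while PR builds fall back to the smoke profile.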
Practical k6 CI/CD Configuration
"Walk me through a GitHub Actions workflow that runs k6 tests." The practical answer: (1) The workflow triggers on pull requests to main. (2) The job checks out code, sets up the application (docker-compose or direct deployment to staging), and waits for health checks to pass. (3) It runs k6: k6 run --out json=results.json script.js — the --out flag exports results for post-processing. (4) It checks the exit code — k6 exits non-zero when thresholds are breached, which fails the pipeline. (5) It uploads the results as a pipeline artifact and optionally posts a summary comment on the PR: "Load test: 100 VUs, p(95) response time 234ms, 0% errors." (6) For advanced pipelines: use k6's --out to stream results to Grafana Cloud or Datadog, set the K6_CLOUD_TOKEN as a GitHub Secret, and view real-time results during the pipeline run. The candidate who can discuss K6_INFLUXDB_* environment variables for pushing metrics to InfluxDB or the --summary-export flag for JSON summary output demonstrates genuine pipeline integration experience. The trap: running k6 against the same staging environment that other pipelines are hitting — contention from concurrent k6 runs distorts results. Use dedicated test environments or schedule performance pipeline stages to avoid overlap.
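For the PR summary comment, k6's handleSummary hook is one option: it receives the end-of-test data and returns a map of destinations to content. A minimal sketch, assuming the default p(95) statistic is present in the summary. The sample object is hand-made to mirror the shape of the data for illustration and is not real test output.

```javascript
// In a k6 script this would be `export function handleSummary(data) { ... }`.
function handleSummary(data) {
  const p95 = data.metrics.http_req_duration.values['p(95)'];
  const failRate = data.metrics.http_req_failed.values.rate;
  const line = `Load test: p(95) ${p95.toFixed(0)}ms, ${(failRate * 100).toFixed(2)}% errors`;
  return {
    stdout: line + '\n', // printed at the end of the run
    'summary.json': JSON.stringify(data, null, 2), // written to disk by k6
  };
}

// Hand-made sample with the same shape, for illustration only.
const sample = {
  metrics: {
    http_req_duration: { values: { 'p(95)': 234.4 } },
    http_req_failed: { values: { rate: 0 } },
  },
};

console.log(handleSummary(sample).stdout);
```

A later pipeline step can read summary.json or the captured line and post it to the pull request. Note that defining handleSummary replaces k6's default end-of-test summary output.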
Real k6 Interview Scenarios — What Panels Actually Ask
Drawing from panels Mitchell has conducted at HMRC, MoD, Nationwide, and consulting for Accenture, here are the k6 performance testing scenarios that appear in SDET interviews — and what a strong answer looks like for each.
"Write a k6 script that tests a login endpoint under load."
This is the practical exercise that appears in most k6 interview rounds. A complete answer covers: (1) Import http from k6 and the check and sleep helpers. (2) Define options — 50 VUs ramping up over 1 minute, holding for 5 minutes. (3) Use the setup function to create a test user once and return credentials. (4) In the default function, send a POST to /api/login with the test credentials, check that the response status is 200 and the response body contains a token, and sleep for a realistic think-time (1-3 seconds) between iterations. (5) Set thresholds: p(95) response time under 1 second, error rate under 1%. (6) Handle the response JSON — extract the token for subsequent requests. The interviewer evaluates: correct k6 API usage, realistic test design (sleep between iterations, unique data handling), appropriate thresholds, and the use of checks for functional validation alongside performance measurement. The candidate who also mentions using __VU and __ITER (k6's built-in execution context variables for unique data per VU/iteration) demonstrates deeper k6 knowledge.
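A compact sketch of that script follows. The http, check, and sleep definitions at the top are stand-ins so the example runs under Node; in k6 you would delete them and use the imports shown in the comment. The base URL, credentials, and response shape are hypothetical, and setup here only returns credentials, whereas a real one would create the user through the application's API.

```javascript
// --- Stand-ins so this sketch runs under Node. In k6, delete this section and use:
//   import http from 'k6/http';
//   import { check, sleep } from 'k6';
const http = {
  post: (url, body, params) => ({
    status: 200,
    json: () => ({ token: 'fake-token' }),
  }),
};
const check = (value, checks) => Object.values(checks).every((fn) => fn(value));
const sleep = (seconds) => {}; // no-op here; k6's sleep pauses the VU
// --- End of stand-ins.

const BASE_URL = 'https://test.example.com'; // hypothetical target

// `export const options` in k6: ramp to 50 VUs over 1 minute, hold for 5 minutes.
const options = {
  stages: [
    { duration: '1m', target: 50 },
    { duration: '5m', target: 50 },
  ],
  thresholds: {
    http_req_duration: ['p(95)<1000'], // p(95) under 1 second
    http_req_failed: ['rate<0.01'], // under 1% failed requests
  },
};

// `export function setup()` in k6: runs once, result is passed to each iteration.
function setup() {
  return { username: 'perf-user', password: 'not-a-real-password' };
}

// `export default function (data)` in k6: one login attempt per iteration.
function login(data) {
  const res = http.post(
    `${BASE_URL}/api/login`,
    JSON.stringify({ username: data.username, password: data.password }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  const ok = check(res, {
    'status is 200': (r) => r.status === 200,
    'body has token': (r) => Boolean(r.json().token),
  });
  sleep(Math.random() * 2 + 1); // 1-3 seconds of think time
  return ok ? res.json().token : null; // returned only so the sketch can print it
}

console.log(login(setup()));
```

k6 ignores the default function's return value; in a real script the extracted token would be used directly for the follow-up requests in the same iteration.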
"Your performance test shows p(95) response time is 800ms. The threshold is 500ms. What do you do?"
This tests your performance debugging methodology. The weak answer: "Increase the threshold." The strong answer walks through a structured investigation: (1) Check whether the response time increase is uniform (all endpoints slow) or specific (one slow endpoint). Use k6's sub-metrics (http_req_waiting, http_req_connecting) to isolate network vs server time. (2) Check server-side metrics — CPU, memory, database query times, connection pool utilisation. Is the database under-provisioned? Are there slow queries that only manifest under load? (3) Check for resource contention — is the test environment shared with other tests or pipelines? (4) Compare against baseline — is this a regression from a previous run, or a first-time benchmark? (5) If it's a specific endpoint: profile the code path, check for N+1 queries, serialisation overhead, or external API dependency latency. (6) If it's systemic: evaluate infrastructure — does the application need more CPU, memory, or database connections? The candidate who can walk through this debugging process demonstrates performance engineering thinking, not just test execution. Bonus: mention that k6's http_req_duration trend metric includes min, max, avg, and percentile values that help pinpoint whether the problem is consistent or spiky.
"How would you test the performance of a GraphQL API differently from a REST API?"
This tests whether you understand that API architecture affects performance testing strategy. Strong answer: "GraphQL performance testing has distinct challenges: (1) Query complexity — a single GraphQL query can fetch deeply nested data, making response time highly variable depending on the query. I'd create test scripts with different query depths — shallow (1 level), typical (3-4 levels), and deep (7+ levels) — to capture this variability. (2) N+1 problems — GraphQL resolvers often execute one query per nested field, creating N+1 database queries. A performance test might pass with 50 VUs querying 2-level nesting but fail with 50 VUs querying 5-level nesting because the database connection pool saturates. (3) Single endpoint — unlike REST's multiple endpoints, GraphQL typically has one /graphql endpoint. k6's metrics will show aggregate stats for all queries hitting that endpoint. I'd use k6's tagging (tags: { query: 'GetUserProfile' }) to segment metrics by query type. (4) Variable payload sizes — a GraphQL response can be 200 bytes or 2MB depending on the query. I'd set thresholds on response size as well as response time. The GraphQL-specific insight: monitor resolver-level performance on the server side alongside k6's HTTP-level metrics — the HTTP response might be fast, but a specific resolver might be slow and hidden behind parallel resolver execution."
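The tagging idea can be sketched as a small request builder plus per-query thresholds. The graphqlRequest helper, the operation names, the endpoint URL, and the limits are illustrative assumptions; the params object follows the shape k6's http.post accepts, and the threshold keys use k6's tag-filter syntax for sub-metrics.

```javascript
// Builds the body and params for http.post(url, body, params) in k6.
function graphqlRequest(operationName, query, variables = {}) {
  return {
    body: JSON.stringify({ operationName, query, variables }),
    params: {
      headers: { 'Content-Type': 'application/json' },
      tags: { query: operationName }, // custom tag used to split metrics per operation
    },
  };
}

// In a k6 script: `export const options = { ... }`.
// Each key filters http_req_duration down to requests carrying that tag value.
const options = {
  thresholds: {
    'http_req_duration{query:GetUserProfile}': ['p(95)<500'],
    'http_req_duration{query:GetOrderHistory}': ['p(95)<1500'],
  },
};

const req = graphqlRequest(
  'GetUserProfile',
  'query GetUserProfile($id: ID!) { user(id: $id) { name } }',
  { id: '42' }
);
// In k6: http.post('https://test.example.com/graphql', req.body, req.params);
console.log(req.params.tags);
```

Without the tag, every operation would be folded into one set of statistics for the single /graphql URL, and a slow deep query could hide behind many fast shallow ones.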
"k6 runs in a single JavaScript thread. How does it handle thousands of concurrent users?"
This is the architecture question that tests whether you understand k6's internals. The answer: "k6's scripting is single-threaded per VU: each virtual user's JavaScript executes in its own Goja (JavaScript VM) instance, running in its own goroutine. When a VU's JavaScript calls http.get(), that VU blocks until the response arrives while Go's scheduler keeps running the other VUs' goroutines, so the concurrency lives at the engine level rather than in your script. This means: (1) Thousands of VUs don't consume thousands of OS threads; each VU is a lightweight goroutine. (2) Basic HTTP test code is straightforward and synchronous, with no need for async/await or promises (k6 does support them where required, for example http.asyncRequest and the browser module), because the engine handles concurrency underneath. (3) The bottleneck usually isn't k6's architecture; it's typically the machine's network bandwidth, CPU for TLS, or the target system's capacity. The practical implication: k6 can usually drive significantly more load per machine than JMeter (which uses thread-per-VU), though the actual requests-per-second ceiling depends on script complexity and hardware, so measure it on your own load generator rather than quoting a number." The candidate who can discuss this architecture signals they've gone beyond the k6 API into understanding why k6 is performant, a distinction interviewers at senior level notice.
5 Common Performance Testing Mistakes That Cost SDET Candidates Offers
After watching hundreds of candidates navigate performance testing questions, Mitchell has identified the specific mistakes that cause interviewers to lean back and wait for the next candidate. These aren't gaps in knowledge — they're gaps in how you present that knowledge.
Mistake #1: Treating Performance Testing as a Specialist's Job
The single most common mistake SDET candidates make: saying "performance testing is for the performance engineering team" or "I'd hand off performance concerns to the platform team." In 2026, this answer signals you haven't kept up with DevTestOps. Performance testing has shifted left — it's now part of the SDET's responsibility, just like functional testing and security testing. The strong answer: "As an SDET, I own the automated performance testing gates — smoke tests on PRs, load tests on merge to main, threshold enforcement. I work with the performance engineering team on complex stress testing, capacity planning, and production monitoring. But the CI/CD performance gates that prevent a slow deployment from reaching production are my responsibility. k6 makes this possible because it's designed for CI/CD integration — it runs headless, it's fast, and its thresholds provide clear pass/fail signals." This demonstrates you understand the modern SDET-performance boundary, not an outdated siloed model.
Mistake #2: Focusing on Average Response Time
"The average response time was 200ms, so performance is great." This answer tells the interviewer you don't understand performance measurement. The average hides the worst experiences. A candidate who only discusses averages hasn't done production performance work, where the 1% of users experiencing 10-second response times are the ones writing negative reviews. The strong answer: "I monitor percentiles: p(50), p(90), p(95), and p(99). I set thresholds on p(95) and p(99), not on the average. The p(99) is particularly important because it captures the long-tail latency that indicates problems like garbage collection pauses, connection pool exhaustion, or contended locks, problems that the average completely masks." Bonus: mention the max value with caution. It's often a single outlier that distorts the picture, but it can also reveal catastrophic failures that a percentile threshold misses because they affect fewer requests than the percentile covers (e.g., if p(99) is under 500ms but max is 30 seconds, something failed badly for fewer than 1% of requests).
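To make the point concrete, here is a small sketch in plain JavaScript. The sample data (98 fast requests and 2 ten-second requests) and the threshold numbers are invented for illustration, and the nearest-rank percentile helper is an assumption of this example, not k6's internal calculation:

```javascript
// Nearest-rank percentile over a list of response times in milliseconds.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

function mean(samples) {
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

// Hypothetical run: 98 requests at 100 ms and 2 requests that took 10 s.
const durations = [...Array(98).fill(100), 10000, 10000];

const summary = {
  avg: mean(durations),
  p95: percentile(durations, 95),
  p99: percentile(durations, 99),
  max: Math.max(...durations),
};
console.log(summary);

// The equivalent gate expressed as k6 thresholds (in a k6 script this object
// would be exported as `options`). The limits are illustrative assumptions.
const options = {
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1500'],
    http_req_failed: ['rate<0.01'],
  },
};
```

With this data the average is 298 ms and p(95) is 100 ms, both of which look healthy, while p(99) and max are 10,000 ms. Only the p(99) threshold would catch the two users who waited ten seconds.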
Mistake #3: Testing Without Realistic Think Time
"My k6 script sends 1,000 requests as fast as possible." This tells the interviewer you're load-generating, not performance-testing. Real users don't fire requests back-to-back — they read pages, fill in forms, and think between actions. A script without sleep() calls generates unrealistic, abusive load that produces panic-inducing metrics but doesn't represent real-world behaviour. The strong answer: "I use sleep() with randomised durations — sleep(Math.random() * 3 + 1) for 1-4 second pauses between user actions. This simulates realistic user behaviour and produces metrics that reflect what actual users would experience. I also model different user behaviours — some users browse quickly (short pauses), some read thoroughly (long pauses). The exception: stress testing and spike testing deliberately remove or minimise think time to push the system to its limits. Know which type of test you're running and justify your think-time strategy."
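The randomised think time in that answer can be wrapped in a small helper. This is a sketch in plain JavaScript: the profile names and pause ranges are hypothetical, and in a k6 script the returned value would be passed to sleep() imported from the k6 module:

```javascript
// Returns a random pause in seconds within [minSeconds, maxSeconds).
function thinkTime(minSeconds, maxSeconds) {
  return minSeconds + Math.random() * (maxSeconds - minSeconds);
}

// Hypothetical behaviour profiles: quick browsers and thorough readers.
const profiles = {
  skimmer: { min: 0.5, max: 2 },
  reader: { min: 3, max: 8 },
};

// Pause length for a named profile.
function thinkTimeFor(profileName) {
  const { min, max } = profiles[profileName];
  return thinkTime(min, max);
}

// In a k6 script:
//   import { sleep } from 'k6';
//   sleep(thinkTime(1, 4));          // 1-4 s between user actions
//   sleep(thinkTimeFor('reader'));   // longer pause for a reading user
console.log(thinkTime(1, 4).toFixed(2), thinkTimeFor('reader').toFixed(2));
```

For a stress or spike test the same helper can be called with a much smaller range, which keeps the think-time decision explicit in the script instead of hidden in a missing sleep() call.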
Mistake #4: Not Understanding the Difference Between Open and Closed Models
This is the advanced concept that separates seniors. k6 supports both a closed model (a fixed number of VUs, where each VU waits for its previous iteration to complete before starting the next, so if the system slows down, the request rate drops) and an open model (a fixed arrival rate, where new iterations start on schedule regardless of whether previous ones have completed, so the iteration rate stays constant as long as enough VUs are available). The interview trap: "Your load test shows increasing response times, but the request rate is dropping. Why?" The answer: you're using a closed model. As the system slows down, each VU takes longer to complete an iteration, so fewer requests are sent per second and the test unintentionally eases off. The strong answer: "For load testing, I prefer the open model using k6's constant-arrival-rate executor. It starts iterations at a fixed rate regardless of response time, which means: (1) it more accurately models real-world traffic (new users keep arriving even when the server is slow), (2) it catches performance degradation sooner (the system can't hide by slowing down), and (3) it provides more consistent, reproducible results. I use the closed model (VU-based executors such as constant-vus or ramping-vus) for stress testing, where the goal is to see how many concurrent users the system can handle."
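The arithmetic behind the interview trap can be sketched in a few lines of plain JavaScript. The sketch assumes each iteration is one request followed by think time, and every number in it (100 VUs, 50 iterations per second, the response times) is a hypothetical example:

```javascript
// Closed model: the iteration rate is a consequence of VU count and
// iteration time, so it falls when responses get slower.
function closedModelRate(vus, responseMs, thinkMs) {
  return (vus * 1000) / (responseMs + thinkMs); // iterations per second
}

// Open model: the rate is fixed, so the number of VUs needed grows as
// responses slow (Little's law: concurrency = arrival rate x time in system).
function vusNeeded(ratePerSec, responseMs, thinkMs) {
  return Math.ceil((ratePerSec * (responseMs + thinkMs)) / 1000);
}

console.log(closedModelRate(100, 200, 1000));  // healthy system
console.log(closedModelRate(100, 2000, 1000)); // degraded system: rate drops
console.log(vusNeeded(50, 200, 1000));         // VUs to hold 50/s when healthy
console.log(vusNeeded(50, 2000, 1000));        // VUs to hold 50/s when degraded

// Open-model scenario using k6's constant-arrival-rate executor (in a k6
// script this object would be exported as `options`). Values are assumptions
// sized from the calculation above.
const options = {
  scenarios: {
    steady_arrivals: {
      executor: 'constant-arrival-rate',
      rate: 50,            // iterations started per timeUnit
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 60, // enough for the healthy case
      maxVUs: 150,         // headroom for the degraded case
    },
  },
};
```

With 100 closed-model VUs, a response time that grows from 200 ms to 2 s cuts the rate from roughly 83 to roughly 33 iterations per second, which is exactly the "test eases off" effect. Holding 50 iterations per second in the open model needs 60 VUs in the healthy case and 150 in the degraded one, which is why the maxVUs headroom matters.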
Mistake #5: Confusing Performance Testing with Load Generation
The candidate who says "I run k6 and check the dashboard" is describing load generation, not performance testing. Performance testing includes: (1) defining performance requirements before the test, (2) designing realistic test scenarios, (3) running the test, (4) analysing results against requirements, (5) diagnosing the root cause of violations, and (6) verifying improvements. Load generation is step 3. Interviewers probe steps 1, 4, and 5 because that's where the engineering happens. The strong answer: "Before I run any test, I define the performance acceptance criteria with the product owner and infrastructure team — specific thresholds for response time, error rate, and throughput. After the test, I don't just report the numbers — I analyse them against the criteria, identify the slowest endpoints and operations, correlate k6 metrics with server-side telemetry (CPU, memory, database queries), and produce a findings document that prioritises issues by impact. This turns performance testing from a checkbox activity into an engineering practice that prevents production incidents."
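Steps 1 and 4 of that list (define the criteria first, then analyse results against them) can be sketched as a small comparison. The metric names, limits, and measured values below are hypothetical, standing in for criteria agreed with the product owner and numbers taken from a test summary:

```javascript
// Hypothetical acceptance criteria, agreed before the test is run.
const criteria = {
  p95Ms: { max: 500 },
  errorRate: { max: 0.01 },
  throughputRps: { min: 200 },
};

// Hypothetical measured results, e.g. copied from an end-of-test summary.
const results = { p95Ms: 620, errorRate: 0.004, throughputRps: 240 };

// Compares each measured value with its agreed limit.
function evaluate(criteria, results) {
  return Object.entries(criteria).map(([metric, limit]) => {
    const actual = results[metric];
    const pass =
      (limit.max === undefined || actual <= limit.max) &&
      (limit.min === undefined || actual >= limit.min);
    return { metric, actual, limit, pass };
  });
}

const findings = evaluate(criteria, results);
const violations = findings.filter((finding) => !finding.pass);
console.log(violations.map((v) => v.metric));
```

Here only the p(95) criterion is violated. The diagnosis step (correlating with server-side telemetry) still has to be done by a person, but starting from an explicit list of violations keeps the findings document tied to the agreed requirements.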
What a Real k6 Performance SDET Interview Looks Like — Timed Breakdown
Drawing from panels Mitchell has conducted across government, defence, and enterprise, here's how k6 performance testing questions typically appear in a 60-minute SDET interview:
Experience Probe
"What performance testing have you done in your current role?" This opener tests whether you've genuinely practised performance testing or just listed it on your CV. Be honest about your level. If you've primarily run k6 scripts that others wrote: "I've executed k6 performance tests as part of our CI/CD pipeline — running load tests against staging, checking threshold violations, and investigating regressions. I've also written k6 scripts for specific API flows — multi-step user journeys with checks and custom metrics. I haven't designed the full performance testing strategy for an organisation, but I understand the test types, metrics, and pipeline integration patterns." This answer demonstrates performance testing competence while being honest about its scope.
k6 Scripting & Test Design
"Write a k6 script for a given scenario" or "Explain the k6 lifecycle." You may be asked to whiteboard a k6 script or discuss the four lifecycle stages. Focus on: correct API usage (http.get, check, sleep), realistic test design (ramp-up, think time, unique data), appropriate use of setup and teardown, and threshold configuration. Interviewers evaluate your ability to translate a performance requirement into a working k6 script — and they'll ask follow-ups on why you chose specific options (fixed VUs vs arrival-rate, why those specific thresholds).
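If you are asked about the lifecycle, it helps to be able to state the call order precisely. The sketch below is plain JavaScript that only mimics that order; it is not k6 code, the runLifecycle function is a stand-in for the k6 runner, and the token value is hypothetical. In a real script, setup and teardown are exported functions and the VU code is the default export:

```javascript
// Plain JavaScript stand-in for the k6 runner, written only to show the
// order of the four lifecycle stages. k6 itself makes these calls.
const log = [];

// 1. Init stage: top-level code (imports, options, loading data files).
//    k6 runs this once per VU; this sketch runs it once.
const options = { vus: 2, iterations: 4 }; // 4 iterations shared by 2 VUs
log.push('init');

// 2. setup(): runs once. Its return value is handed to every iteration
//    and to teardown().
function setup() {
  log.push('setup');
  return { token: 'hypothetical-auth-token' };
}

// 3. VU stage: the default function in k6, executed once per iteration.
function vuCode(data) {
  log.push('iteration using ' + data.token);
}

// 4. teardown(): runs once at the end and receives the same setup data.
function teardown(data) {
  log.push('teardown');
}

// Stand-in for the runner: setup once, iterate, teardown once.
function runLifecycle() {
  const data = setup();
  for (let i = 0; i < options.iterations; i++) vuCode(data);
  teardown(data);
}

runLifecycle();
console.log(log.join(' -> '));
```

The detail interviewers tend to probe is the data flow: whatever setup() returns is the only state shared with the VU code and teardown(), so values such as an auth token belong there and not in per-iteration code.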
Performance Engineering & Architecture
"Your load test passes, but the application degrades under production traffic. What did you miss?" This probes your understanding of the gap between testing and reality. Discuss: (1) Test data realism — were you testing with production-like data volumes? (2) User behaviour modelling — were you simulating realistic user journeys or just hitting endpoints? (3) Infrastructure parity — was your test environment equivalent to production? (4) External dependencies — did your test include third-party APIs, payment gateways, CDN behaviour? (5) Test isolation — were you the only load on the system, or were there background jobs, data pipelines, and other consumers hitting it simultaneously? The candidate who can discuss these gaps demonstrates that they think about performance testing as modelling reality, not just generating load.
Operational & Incident Response
"A performance test fails in CI — p(95) response time is 3x the threshold. What's your process?" This tests whether you treat performance test failures as operational events. Discuss: (1) Immediate triage — check if it's a transient issue (rerun once to confirm) or a genuine regression. (2) Isolate the cause — which endpoints are slow? Is it a specific change in the latest commit? (3) Correlate with server metrics — database CPU spike, connection pool saturation, or application thread pool exhaustion? (4) Block the deployment if the regression is confirmed — performance degradation is a production risk. (5) Post-mortem — why wasn't this caught earlier? Should performance smoke tests on PRs catch this before the full load test does? The candidate who treats a performance test failure with the same seriousness as a security vulnerability or a crash demonstrates operational maturity.
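Steps 1 and 2 of that process can be sketched as two small helpers in plain JavaScript. The endpoint names, p(95) values, 20% tolerance, and outcome labels are all hypothetical. Per-endpoint numbers like these are assumed to come from metrics broken down by request tag:

```javascript
// Hypothetical p(95) values in ms per endpoint: last good build vs failing build.
const baseline = { 'GET /products': 180, 'POST /basket': 240, 'POST /checkout': 400 };
const current = { 'GET /products': 190, 'POST /basket': 250, 'POST /checkout': 1300 };

// Returns endpoints whose p(95) grew by more than `tolerance`
// (0.2 means 20%), worst regression first.
function findRegressions(baseline, current, tolerance) {
  return Object.keys(current)
    .filter((endpoint) => baseline[endpoint] !== undefined)
    .map((endpoint) => ({
      endpoint,
      ratio: current[endpoint] / baseline[endpoint],
    }))
    .filter((entry) => entry.ratio > 1 + tolerance)
    .sort((a, b) => b.ratio - a.ratio);
}

// A breach only blocks the deployment if the single rerun breaches too.
function triage(firstRunP95, rerunP95, thresholdMs) {
  if (firstRunP95 <= thresholdMs) return 'pass';
  return rerunP95 > thresholdMs ? 'block-deployment' : 'transient-investigate';
}

console.log(findRegressions(baseline, current, 0.2));
console.log(triage(1500, 1450, 500));
```

In this example only POST /checkout regresses (3.25 times its baseline), and because the rerun also breaches the 500 ms threshold, the outcome is to block the deployment and move on to correlating with server metrics.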
Your Questions
Ask about their performance testing maturity: "What's your current performance testing setup — do you have automated k6 tests in CI/CD, or is performance testing mostly manual? How do you handle performance test failures — do they block deployments, or are they advisory? Do SDETs here own the performance testing scripts and thresholds, or is that a separate performance team activity? What was the last performance-related production incident, and how has it changed your testing approach?" Questions that probe their performance posture demonstrate you're thinking about how you'd contribute to their specific environment.
Why Performance Testing Competence Is Becoming a Career Accelerator for SDETs
After 20 years watching the UK testing market evolve — from HMRC to the MoD, from Nationwide to Accenture — Mitchell has observed a consistent pattern: SDETs who add performance testing to their skill set advance faster and command higher salaries than pure functional automation engineers. Here's why:
- Performance-competent SDETs are rare and the demand is growing. The pool of testers who can discuss Playwright locator strategies is deep. The pool who can also discuss k6 thresholds, percentile analysis, arrival-rate executors, and CI/CD performance gates is shallow. In every panel Mitchell has conducted where a candidate demonstrated genuine performance testing competence, the post-interview debrief included the phrase "they bring performance testing skills we weren't expecting." That surprise value translates to offers and higher salary bands — the same way security testing competence has become a premium differentiator.
- Performance testing makes your work visible to the people who control budgets. When your k6 thresholds block a deployment because p(95) response time degraded by 40%, the VP of Engineering sees the value — you prevented a performance incident that would have affected real users. When your nightly soak test catches a memory leak three days before it would have caused a production outage, the CTO hears about it. Performance testing — more than any other testing domain except security — has direct visibility to decision-makers. SDETs who can demonstrate performance impact get invited to architecture reviews and capacity planning conversations where careers advance.
- The DevTestOps trend is structural, not cyclical. The integration of performance testing into the development pipeline isn't a passing trend — it's a permanent shift driven by the economics of cloud infrastructure (performance directly impacts cost), user expectations (slow apps lose users in seconds), and the k6 tooling revolution (performance testing is now accessible to developers, not just performance specialists). SDETs who build performance testing skills now are positioning themselves for roles that are being defined as we speak: DevTestOps SDET, Performance Test Automation Engineer, Reliability SDET. These roles command premium compensation because they combine two scarce skill sets — test automation and performance engineering.
The candidates adding k6 to their repertoire now — not just running scripts, but understanding metrics analysis, threshold design, and CI/CD integration — are the ones who'll walk into 2027 interviews as senior performance-aware SDETs while their purely functional peers are still competing for the same mid-level roles they have today.
How to Prepare for Your Performance Testing Interview — Starting Tonight
You don't need to be a performance engineering specialist. You need to understand k6's scripting model, be able to discuss performance test types and their purposes, articulate how you'd set thresholds and interpret metrics, and — most importantly — demonstrate that you think about performance as a quality attribute that can be tested, automated, and gated in CI/CD, just like functional correctness. Here's the 3-step plan:
- Download SDET Interview Coach from the iOS App Store and complete the 2-minute onboarding assessment. Select your target stack and seniority level. The app's 800+ question bank includes performance testing topics — k6 scripting, load/stress/soak/spike testing, metrics and thresholds, k6 vs JMeter vs Gatling, CI/CD integration, and performance debugging — calibrated to all five seniority levels. Even if performance testing is a gap in your current role, the app surfaces questions at your level so you can build confidence before the interview exposes the gap.
- Run a performance testing mock interview today. Pick Performance Testing as your topic, set a 30-minute timer, and answer the questions out loud. The AI feedback scores you on technical accuracy, completeness, communication, and code quality — showing you exactly where your performance testing knowledge gaps are before the real panel finds them. The AI mock interviewer asks adaptive follow-ups on k6 architecture, threshold design, and operational scenarios, just like a real panel.
- Use Job Match for your target role. If the job description mentions "performance testing," "k6," "JMeter," "Gatling," "load testing," "stress testing," "soak testing," or "performance engineering," paste it into Job Match. You'll get 50 questions tailored to that exact role's performance testing expectations — no guessing whether they'll ask about arrival-rate executors, percentile analysis, or k6 CI/CD integration.
The candidates who prepare for performance testing questions now — who can articulate k6's lifecycle, who understand the difference between checks and thresholds, and who can discuss integrating k6 into a CI/CD pipeline with the same fluency they discuss Playwright — are the ones who'll walk into panels and surprise interviewers with a competency they weren't necessarily expecting to find. Performance testing isn't a specialist silo any more. It's a core SDET competency, and with SDET Interview Coach, available on the iOS App Store, you can build that performance testing confidence before you ever sit down with an interviewer.
If you're building your performance testing skills from a test automation background, start with our guide on API Testing Interview Questions 2026 — API performance testing is where functional API testing and performance testing most directly intersect. For the CI/CD pipeline integration where performance gates live, see our guide on CI/CD Pipeline Testing Interview Questions. For the security testing that performance issues can expose, see Security Testing QA Interview Questions. And for the framework design round where performance test architecture is planned, see Test Automation Framework Design Interview Questions.
By Mitchell Agoma, Senior SDET & AI Testing Specialist with 8+ years of experience