All examples for Acceptance Criteria Testability Checker

API Endpoint with Mixed Testability

Demonstrates handling criteria that are already partially testable (response time, error rate) alongside vague ones ('gracefully', 'scalable'), showing how the skill enhances existing metrics while rewriting subjective terms.

Input

What you provide to the skill

API endpoint. Criteria: response time under 200ms, error rate below 1%, should handle edge cases gracefully, be scalable

Output

What the skill generates for you

Acceptance Criteria Analysis: API Endpoint

Criterion 1: “Response time under 200ms”

Status: ✅ TESTABLE
Issues:

  • The 200ms metric itself is well defined
  • However, context is missing: under what load conditions? For what percentage of requests?

Testable Rewrites:
Option A (Percentile-Based):
“API endpoint responds within 200ms for 95% of requests under normal load (100 req/min)”

  • Measurable: Yes (200ms, 95% threshold, 100 req/min load)
  • Observable: Yes (can measure with load testing tools)
  • Specific: Yes (defines success threshold and conditions)

Option B (P99 Performance):
“API endpoint p99 response time is under 200ms, p50 response time under 100ms”

  • Measurable: Yes (specific percentiles and thresholds)
  • Observable: Yes (standard performance metrics)
  • Specific: Yes (defines both typical and worst-case performance)

Recommendation: Option A, since it ties the threshold to a defined load context
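Option A's p95 check can be sketched as a small verification helper over recorded latencies. This is a minimal sketch, not a full load test; the sample values are hypothetical measurements from a 100 req/min run:

```python
from statistics import quantiles

def p95_under_threshold(latencies_ms, threshold_ms=200.0):
    """Return True if the 95th-percentile latency is below the threshold."""
    # quantiles(n=20) yields 19 cut points; index 18 is the 95th percentile.
    p95 = quantiles(latencies_ms, n=20)[18]
    return p95 < threshold_ms

# Hypothetical latency samples (ms) collected at 100 req/min
samples = [120, 135, 150, 90, 180, 110, 95, 160, 140, 130,
           125, 145, 100, 170, 155, 115, 105, 165, 175, 85]
print(p95_under_threshold(samples))  # True (p95 here is 179.75 ms)
```

In practice the samples would come from a load-testing tool's report rather than a hard-coded list, but the pass/fail rule stays the same.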


Criterion 2: “Error rate below 1%”

Status: ✅ TESTABLE
Issues:

  • The 1% threshold itself is good
  • Missing: a measurement timeframe and a definition of what counts as an error (4xx vs 5xx)

Testable Rewrites:
Option A (Server Error Focus):
“Server errors (5xx responses) occur in less than 1% of all requests over any 5-minute window”

  • Measurable: Yes (1% threshold, 5-minute window)
  • Observable: Yes (server logs, monitoring)
  • Specific: Yes (defines error type and measurement period)

Option B (Comprehensive Error Rate):
“Total error rate (4xx and 5xx responses) is below 1% over any 24-hour period, excluding client validation errors (400 Bad Request)”

  • Measurable: Yes (1% threshold, 24-hour period, exclusions defined)
  • Observable: Yes (logging and monitoring)
  • Specific: Yes (defines all error types and exclusions)

Recommendation: Option A (focuses on server reliability; client errors are often not the API’s fault)
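Option A's rule reduces to a simple ratio over the status codes logged in one 5-minute window. A minimal sketch, assuming status codes are available as a list of integers:

```python
def server_error_rate(responses):
    """Fraction of responses that are server errors (5xx)."""
    if not responses:
        return 0.0
    errors = sum(1 for status in responses if 500 <= status < 600)
    return errors / len(responses)

# Hypothetical status codes from one 5-minute window:
# 3 server errors out of 500 requests (4xx responses don't count here)
window = [200] * 495 + [500] * 3 + [404] * 2
print(server_error_rate(window) < 0.01)  # True: 0.6% < 1%
```

Note that the 404s are ignored by design, matching the recommendation that client errors are often not the API's fault.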


Criterion 3: “Should handle edge cases gracefully”

Status: ❌ NOT TESTABLE
Issues:

  • “gracefully” is subjective
  • “edge cases” not defined (which ones?)
  • No specific behavior defined

Testable Rewrites:
Option A (Error Response Quality):
“API returns appropriate HTTP status codes (400 for invalid input, 404 for missing resources, 429 for rate limits) with descriptive JSON error messages including ‘error’ and ‘message’ fields”

  • Measurable: Yes (specific status codes and response structure)
  • Observable: Yes (can verify response format)
  • Specific: Yes (defines expected error handling behavior)

Option B (Common Edge Cases):
“API correctly handles: empty request body (400), malformed JSON (400), missing required fields (400), excessively large payload >1MB (413), special characters in strings (200 with sanitization)”

  • Measurable: Yes (specific edge cases with expected responses)
  • Observable: Yes (can test each scenario)
  • Specific: Yes (defines exact edge cases and responses)

Option C (Validation Behavior):
“API validates all inputs and returns detailed error messages within 50ms, identifying specific invalid fields in the response”

  • Measurable: Yes (50ms validation time, specific error details)
  • Observable: Yes (can verify validation speed and message quality)
  • Specific: Yes (defines validation performance and output)

Recommendation: Combine Options A and B for a comprehensive edge-case handling definition
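Option B's cases translate naturally into a table-driven test. The sketch below uses a hypothetical stub validator (`classify_request`) standing in for the real endpoint, just to show the table shape:

```python
import json

MAX_PAYLOAD_BYTES = 1_000_000  # Option B's 1MB limit

def classify_request(body, required_fields=("name",)):
    """Hypothetical validator returning the HTTP status Option B expects."""
    if len(body.encode()) > MAX_PAYLOAD_BYTES:
        return 413                  # excessively large payload
    if not body:
        return 400                  # empty request body
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400                  # malformed JSON
    if any(f not in payload for f in required_fields):
        return 400                  # missing required fields
    return 200                      # valid; sanitization happens downstream

# Table of edge cases from Option B: input body -> expected status
cases = {
    "": 400,
    "{not json": 400,
    '{"other": 1}': 400,
    '{"name": "<script>"}': 200,
    "x" * 1_000_001: 413,
}
print(all(classify_request(body) == status for body, status in cases.items()))
```

Against a real API, each table row would become an HTTP request assertion, but the point is that every "edge case" is now an explicit input with an explicit expected response.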


Criterion 4: “Be scalable”

Status: ❌ NOT TESTABLE
Issues:

  • “scalable” is vague (how much scale? in what dimension?)
  • No performance under load defined
  • No capacity limits specified

Testable Rewrites:
Option A (Throughput-Based):
“API handles 1,000 requests per minute with response times remaining under 200ms for 95% of requests”

  • Measurable: Yes (1,000 req/min, 200ms, 95% threshold)
  • Observable: Yes (load testing verification)
  • Specific: Yes (defines capacity and performance maintenance)

Option B (Linear Scaling):
“API response time degrades less than 10% when request volume doubles from baseline (100 to 200 req/min)”

  • Measurable: Yes (10% degradation threshold, specific volumes)
  • Observable: Yes (comparative load testing)
  • Specific: Yes (defines how performance scales with load)

Option C (Concurrent Connections):
“API supports at least 100 concurrent connections while maintaining error rate below 1%”

  • Measurable: Yes (100 connections, 1% error rate)
  • Observable: Yes (concurrent connection testing)
  • Specific: Yes (defines concurrency capacity)

Recommendation: Options A and C together, to define both throughput and concurrency scalability
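Option B's "degrades less than 10%" rule is the easiest to mechanize: compare mean latency at baseline load against mean latency at doubled load. A minimal sketch with hypothetical samples:

```python
def degradation_within_limit(baseline_ms, doubled_ms, limit=0.10):
    """True if mean latency worsens by less than `limit` when load doubles."""
    base = sum(baseline_ms) / len(baseline_ms)
    doubled = sum(doubled_ms) / len(doubled_ms)
    return (doubled - base) / base < limit

# Hypothetical mean-latency samples (ms) at 100 and 200 req/min
baseline = [100, 110, 105, 95]
doubled = [106, 112, 108, 104]
print(degradation_within_limit(baseline, doubled))  # True: ~4.9% < 10%
```

The same comparative structure works for Options A and C: run the load, collect the metric (throughput, error rate, or concurrent-connection count), and assert against the stated threshold.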


Summary

Original Criteria: 4 statements, 2 not testable (50% vague)
Testable Criteria Generated: 10 specific, measurable alternatives

Testability Standards Met:
✅ Clear pass/fail conditions for all criteria
✅ Measurable thresholds defined (milliseconds, percentages, counts)
✅ Observable behaviors specified (what QA can verify)
✅ No subjective terms remaining

Recommended Final Acceptance Criteria:

  1. API endpoint responds within 200ms for 95% of requests under normal load (100 req/min)
  2. Server errors (5xx responses) occur in less than 1% of all requests over any 5-minute window
  3. API returns appropriate HTTP status codes (400 for invalid input, 404 for missing resources, 429 for rate limits) with descriptive JSON error messages including ‘error’ and ‘message’ fields
  4. API correctly handles: empty request body (400), malformed JSON (400), missing required fields (400), excessively large payload >1MB (413), special characters in strings (200 with sanitization)
  5. API handles 1,000 requests per minute with response times remaining under 200ms for 95% of requests
  6. API supports at least 100 concurrent connections while maintaining error rate below 1%

Quality Improvement: From 50% testable → 100% testable