Trigger testing validates that skills activate correctly in response to user prompts. Skill Lab defines 4 trigger types based on OpenAI’s methodology for testing tool selection behavior.

Overview

From core/models.py:27-33 (with the Enum import added so the excerpt runs standalone):
from enum import Enum

class TriggerType(str, Enum):
    """Types of trigger tests based on OpenAI methodology."""
    
    EXPLICIT = "explicit"      # Skill named directly with $ prefix
    IMPLICIT = "implicit"      # Describes exact scenario without naming skill
    CONTEXTUAL = "contextual"  # Realistic noisy prompt with domain context
    NEGATIVE = "negative"      # Should NOT trigger (catches false positives)

These trigger types test different levels of prompt clarity and help ensure skills activate reliably across realistic usage patterns.
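
Because TriggerType subclasses str, its members compare equal to the plain strings used in test files, which is convenient when parsing YAML. A quick sketch (the import path is an assumption based on the core/models.py layout shown above):
from core.models import TriggerType  # assumed import path

trigger = TriggerType("negative")  # parse the raw string from YAML
assert trigger is TriggerType.NEGATIVE
assert trigger == "negative"       # str subclass: equal to the plain string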

1. Explicit Triggers

Purpose: Verify that directly naming the skill activates it. Explicit triggers use the $skill-name syntax to request a specific skill by name. This is the clearest form of activation and should always work.

Example Test Case

- id: explicit-1
  name: "Explicit trigger with $ prefix"
  skill_name: creating-reports
  trigger_type: explicit
  prompt: "$creating-reports for the sales data from last quarter"
  expected:
    skill_triggered: true

Characteristics

  • Clarity: Highest - skill name is explicit in the prompt
  • Real-world usage: Common when users know which skill they want
  • Expected behavior: Should always trigger the named skill

The $ prefix is a convention used by some agent runtimes (like Claude CLI) to explicitly invoke skills.
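
As a rough illustration of that convention, here is how a runtime might detect an explicit invocation. The pattern and helper are hypothetical, not part of Skill Lab's API:
import re

# Hypothetical detector for the $skill-name convention described above.
# Assumes skill names are lowercase words separated by hyphens, as in the
# examples on this page.
EXPLICIT_PATTERN = re.compile(r"\$([a-z][a-z0-9]*(?:-[a-z0-9]+)*)")

def extract_explicit_skill(prompt: str) -> str | None:
    """Return the explicitly named skill, or None if no $ prefix is present."""
    match = EXPLICIT_PATTERN.search(prompt)
    return match.group(1) if match else None

assert extract_explicit_skill("$creating-reports for the sales data") == "creating-reports"
assert extract_explicit_skill("Summarize this file for me") is None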

2. Implicit Triggers

Purpose: Test that the skill activates when the scenario matches perfectly without naming it. Implicit triggers describe exactly what the skill does without mentioning the skill name. The agent must match the prompt to the skill based on the description alone.

Example Test Case

- id: implicit-1
  name: "Implicit trigger describing exact scenario"
  skill_name: creating-reports
  trigger_type: implicit
  prompt: "I need to generate a comprehensive report from my sales data"
  expected:
    skill_triggered: true

Characteristics

  • Clarity: High - clear intent but no skill name
  • Real-world usage: Very common - users describe what they want
  • Expected behavior: Should trigger if description matches well

Best Practices for Implicit Tests

  1. Use phrasing that closely mirrors the skill’s description
  2. Include key terms and concepts from the skill documentation
  3. Avoid ambiguity - make the intent clear
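
A rough way to sanity-check practices 1 and 2 is a keyword-overlap score between a candidate prompt and the skill's description. This heuristic is illustrative only, not part of Skill Lab, and a real check would also stem words:
# Illustrative heuristic: fraction of the description's content words
# that also appear in the candidate implicit prompt.
STOPWORDS = {"a", "an", "the", "to", "from", "for", "my", "i", "need", "of"}

def keyword_overlap(prompt: str, description: str) -> float:
    def words(text: str) -> set[str]:
        return {w.strip(".,!?").lower() for w in text.split()} - STOPWORDS

    desc_words = words(description)
    if not desc_words:
        return 0.0
    return len(words(prompt) & desc_words) / len(desc_words)

# A prompt that mirrors the description scores high; an unrelated one scores low.
description = "Generate comprehensive reports from sales data"
print(keyword_overlap("I need to generate a comprehensive report from my sales data", description))  # 0.8
print(keyword_overlap("What's the weather like today?", description))  # 0.0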

3. Contextual Triggers

Purpose: Validate activation in realistic, noisy prompts with extra context. Contextual triggers simulate real-world scenarios where the user provides additional context, background information, or constraints alongside the core request. These test the skill’s robustness to noise.

Example Test Case

- id: contextual-1
  name: "Contextual trigger with domain context"
  skill_name: creating-reports
  trigger_type: contextual
  prompt: |
    I'm preparing for the Q4 board meeting next week. We have sales data
    from three regions (APAC, EMEA, Americas) and I need to present the
    results. Can you help me create a detailed report showing trends,
    comparisons, and key insights? The data is in CSV format.
  expected:
    skill_triggered: true

Characteristics

  • Clarity: Medium - intent is clear but buried in context
  • Real-world usage: Very common - users provide background and constraints
  • Expected behavior: Should still trigger despite extra context

Best Practices for Contextual Tests

  1. Include realistic domain-specific details
  2. Add constraints, preferences, or requirements
  3. Mix the core request with background information
  4. Test with different levels of “noise” to find activation thresholds

Contextual triggers are the most realistic but also the hardest to get right. If your skill fails contextual tests but passes implicit tests, consider improving the description or adding more examples.
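
To probe activation thresholds (practice 4), it helps to run the same core request at increasing noise levels. A minimal sketch with hypothetical filler sentences:
# Illustrative sketch: wrap one core request in increasing amounts of
# realistic filler to produce contextual prompts at several noise levels.
CORE_REQUEST = "Can you help me create a detailed report from this sales data?"

FILLER = [
    "I'm preparing for the Q4 board meeting next week.",
    "We have sales data from three regions (APAC, EMEA, Americas).",
    "The data is in CSV format and was exported from our CRM.",
    "Leadership wants trends, comparisons, and key insights highlighted.",
]

def contextual_prompts(core: str, filler: list[str]) -> list[str]:
    """Return prompts with 0..len(filler) sentences of surrounding context."""
    return [" ".join(filler[:n] + [core]) for n in range(len(filler) + 1)]

for level, prompt in enumerate(contextual_prompts(CORE_REQUEST, FILLER)):
    print(f"noise level {level}: {prompt}")
Each generated prompt can become its own contextual test case, making it visible at which noise level activation starts to fail.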

4. Negative Triggers

Purpose: Ensure the skill does NOT activate for unrelated tasks (catch false positives). Negative triggers test that the skill correctly stays inactive when the user’s request doesn’t match. This prevents over-eager activation and ensures skill specificity.

Example Test Case

- id: negative-1
  name: "Should not trigger for data analysis request"
  skill_name: creating-reports
  trigger_type: negative
  prompt: "Can you help me analyze the statistical significance of these results?"
  expected:
    skill_triggered: false

Characteristics

  • Clarity: Varies - prompt is clear but for a different task
  • Real-world usage: Common - users have diverse needs
  • Expected behavior: Skill should remain inactive

Best Practices for Negative Tests

  1. Test with tasks that are similar but distinct from the skill’s purpose
  2. Include prompts that might share keywords but have different intent
  3. Cover edge cases where activation would be incorrect
  4. Test with tasks that other skills should handle instead

Negative tests are crucial for preventing false positives in multi-skill environments where multiple skills might seem relevant.

Test Case Structure

Trigger tests are defined in .skill-lab/tests/triggers.yaml:
tests:
  - id: unique-test-id
    name: Human-readable test name
    skill_name: skill-directory-name
    trigger_type: explicit|implicit|contextual|negative
    prompt: |
      The user prompt to test
    expected:
      skill_triggered: true|false
      # Optional assertions:
      exit_code: 0
      commands_include:
        - "python scripts/generate.py"
      files_created:
        - "output/report.pdf"
      no_loops: true
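
As a sanity check on that structure, a short sketch that loads and validates a triggers.yaml with PyYAML. The validation here is illustrative; Skill Lab's own parsing lives in core/models.py:
import yaml  # pip install pyyaml

VALID_TYPES = {"explicit", "implicit", "contextual", "negative"}

def load_trigger_tests(path: str = ".skill-lab/tests/triggers.yaml") -> list[dict]:
    """Load trigger tests and check the required fields shown above."""
    with open(path) as f:
        data = yaml.safe_load(f)

    tests = data["tests"]
    for test in tests:
        assert test["trigger_type"] in VALID_TYPES, f"bad trigger_type in {test['id']}"
        assert isinstance(test["expected"]["skill_triggered"], bool), test["id"]
    return tests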

Expectation Fields

From core/models.py:156-163:
| Field | Type | Description |
| --- | --- | --- |
| skill_triggered | boolean | Whether the skill should activate |
| exit_code | int | Expected process exit code (optional) |
| commands_include | list[string] | Commands that should appear in trace (optional) |
| files_created | list[string] | Files that should be created (optional) |
| no_loops | boolean | Verify no retry loops occurred (optional) |
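
To make the optional assertions concrete, here is a hypothetical evaluator for these fields. The field names match the table above; the class and function are a sketch, not the actual models from core/models.py:
from dataclasses import dataclass, field

@dataclass
class Expected:
    skill_triggered: bool
    exit_code: int | None = None
    commands_include: list[str] = field(default_factory=list)
    files_created: list[str] = field(default_factory=list)
    no_loops: bool | None = None

def failures(exp: Expected, triggered: bool, exit_code: int,
             commands: list[str], files: list[str], looped: bool) -> list[str]:
    """Return one message per unmet expectation; an empty list means pass."""
    msgs = []
    if triggered != exp.skill_triggered:
        msgs.append(f"skill_triggered: expected {exp.skill_triggered}, got {triggered}")
    if exp.exit_code is not None and exit_code != exp.exit_code:
        msgs.append(f"exit_code: expected {exp.exit_code}, got {exit_code}")
    msgs += [f"command missing from trace: {c}"
             for c in exp.commands_include if not any(c in cmd for cmd in commands)]
    msgs += [f"file not created: {p}" for p in exp.files_created if p not in files]
    if exp.no_loops and looped:
        msgs.append("retry loop detected")
    return msgs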

Running Trigger Tests

To run trigger tests for a skill:
# Run all trigger tests
sklab trigger ./my-skill

# Run with a specific runtime
sklab trigger ./my-skill --runtime claude

Trigger testing requires the Claude CLI to be installed and configured. See the Trigger Testing guide for setup instructions.
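
If you want to script a run (for example in CI), a minimal subprocess sketch. It assumes sklab exits nonzero when any test fails; verify that behavior against your installed version:
import subprocess

# Assumption: sklab returns a nonzero exit code when any trigger test fails.
result = subprocess.run(
    ["sklab", "trigger", "./my-skill", "--runtime", "claude"],
    capture_output=True, text=True,
)
print(result.stdout)
if result.returncode != 0:
    raise SystemExit(f"trigger tests failed (exit code {result.returncode})")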

Test Generation

Skill Lab can automatically generate trigger test cases using an LLM:
# Generate tests for all 4 trigger types
sklab generate ./my-skill

# Specify the model to use
sklab generate ./my-skill --model claude-sonnet-4-5-20250929

# Generate only specific trigger types
sklab generate ./my-skill --types explicit,implicit

The generator analyzes the skill's description and body to create realistic test cases for each trigger type.
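
For intuition, here is the general shape of that idea as a standalone sketch using the Anthropic Python SDK. This is not how sklab generate is implemented; the prompt text and skill description are hypothetical:
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from the environment

skill_description = "Generate comprehensive reports from sales data"  # hypothetical

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5-20250929",  # model name from the example above
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Write one implicit and one negative trigger-test prompt "
                   f"for a skill described as: {skill_description}",
    }],
)
print(message.content[0].text)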

Trigger Test Reports

After running trigger tests, you’ll get a summary by type:
Trigger Test Results:
  explicit:    2/2 passed (100%)
  implicit:    3/3 passed (100%)
  contextual:  2/3 passed (67%)
  negative:    4/4 passed (100%)

Overall: 11/12 tests passed (91.7%)

From core/models.py:237-269, the TriggerReport includes:
  • Overall pass rate
  • Breakdown by trigger type
  • Individual test results with trace paths
  • Detailed failure messages
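
The per-type breakdown is straightforward to reproduce from individual results. An illustrative aggregation (not the actual TriggerReport code), using the numbers from the sample output above:
from collections import Counter

# (trigger_type, passed) pairs matching the sample summary above.
results = (
    [("explicit", True)] * 2
    + [("implicit", True)] * 3
    + [("contextual", True), ("contextual", False), ("contextual", True)]
    + [("negative", True)] * 4
)

total = Counter(t for t, _ in results)
passed = Counter(t for t, ok in results if ok)

for trigger_type in ("explicit", "implicit", "contextual", "negative"):
    p, n = passed[trigger_type], total[trigger_type]
    print(f"{trigger_type}: {p}/{n} passed ({p / n:.0%})")
print(f"Overall: {sum(passed.values())}/{sum(total.values())} passed")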

Best Practices

Coverage Guidelines

A well-tested skill should include:
| Trigger Type | Recommended Count | Why |
| --- | --- | --- |
| Explicit | 1-2 | Verify basic name-based activation |
| Implicit | 2-3 | Test core scenario matching |
| Contextual | 2-4 | Validate robustness to real-world noise |
| Negative | 2-4 | Prevent false positives |
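
These minimums are easy to enforce mechanically. A hypothetical coverage check (the thresholds come from the table above; the function is not part of Skill Lab):
RECOMMENDED_MIN = {"explicit": 1, "implicit": 2, "contextual": 2, "negative": 2}

def coverage_gaps(tests: list[dict]) -> list[str]:
    """Return a warning for each trigger type below its recommended minimum."""
    counts = {t: 0 for t in RECOMMENDED_MIN}
    for test in tests:
        counts[test["trigger_type"]] = counts.get(test["trigger_type"], 0) + 1
    return [f"{t}: have {counts[t]}, want at least {minimum}"
            for t, minimum in RECOMMENDED_MIN.items() if counts[t] < minimum]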

Writing Effective Tests

  1. Start with explicit - Ensure basic activation works
  2. Add implicit tests - Cover the main use cases from your description
  3. Include contextual tests - Simulate realistic user prompts
  4. Don’t skip negative tests - Prevent false positives early

Common Pitfalls

Over-specific contextual tests: Don’t make contextual tests so specific that they’re essentially implicit tests with extra words. Add genuine domain context and realistic noise.
Weak negative tests: Don’t test with completely unrelated prompts (e.g., “What’s the weather?” for a report generation skill). Test with prompts that are similar enough to be interesting.
