Framework Alignment

MITRE ATLAS Mapping

The MITRE ATLAS framework catalogs adversarial techniques targeting AI systems. Here is how SecureSkill's detection capabilities map to specific ATLAS techniques relevant to autonomous agent security.

AML.T0051

LLM Prompt Injection

Strong

Adversaries craft malicious prompts to cause LLMs to act in unintended ways, including direct and indirect injection.

Attack Categories

prompt_injection_directprompt_injection_indirectscanner_evasion

Pipeline Layers

AI Semantic AnalysisPattern MatchingDeobfuscation Engine

How SecureSkill Detects It

Core detection capability. Two dedicated attack categories cover direct injection (explicit overrides, role reassignment, "ignore previous instructions") and indirect injection (hidden instructions in documentation, assets, and template files loaded into context). Scanner evasion detection catches meta-injection targeting the scanner itself. The deobfuscation engine strips Unicode tricks that could hide injection payloads from downstream layers.

AML.T0054

LLM Jailbreak

Strong

Exploiting prompt injection to bypass safety controls and guardrails of LLM-based systems.

Attack Categories

prompt_injection_directtrust_exploitation

Pipeline Layers

AI Semantic AnalysisPattern Matching

How SecureSkill Detects It

AI semantic analysis detects role reassignment, safety override attempts, and authority bias exploitation ("you are now in developer mode", "IMPORTANT SYSTEM UPDATE") in skill instructions. Dedicated detection rules target jailbreak framing patterns: educational pretexts, fictional framing, red-team disclaimers, and urgency exploitation designed to bypass safety controls.

AML.T0010

ML Supply Chain Compromise

Strong

Targeting hardware, software, data, or models within the ML supply chain to compromise downstream systems.

Attack Categories

supply_chainobfuscation

Pipeline Layers

Pattern MatchingThreat IntelligenceVulnerability DatabaseCredential DetectionAI Semantic AnalysisDeobfuscation Engine

How SecureSkill Detects It

Pattern matching rules detect known malicious package patterns, remote script execution at install time, and mutable remote imports. Real-time threat intelligence checks extracted URLs, domains, and file hashes against active threat feeds. Vulnerability database queries identify known-compromised dependencies. Credential detection catches embedded secrets from compromised publishers. AI semantic analysis evaluates publisher metadata for suspicious indicators.

AML.T0018

Backdoor ML Model

Moderate

Embedding hidden functionality in ML models that activates under specific conditions while appearing normal during standard evaluation.

Attack Categories

persistence_via_promotionrogue_agent_drift

Pipeline Layers

AI Semantic AnalysisPattern Matching

How SecureSkill Detects It

AI semantic analysis detects instructions to permanently modify agent config files, effectively backdooring the agent's behavior. Sleeper logic detection catches conditional activation that changes behavior after N sessions or after a time trigger. Pattern matching rules flag persistence mechanisms (cron jobs, shell profile modification, launch agents) that establish long-term backdoor access.

SecureSkill detects the agent skill equivalent of backdoors: instructions and code that establish persistent, conditional, or hidden behavior changes. Direct ML model weight backdooring is outside the scope of skill scanning.

AML.T0024

Exfiltrate Training Data

Strong

Exfiltration of private training data or sensitive information via ML inference APIs or direct data access.

Attack Categories

data_exfiltrationcredential_harvesting

Pipeline Layers

AI Semantic AnalysisPattern MatchingCredential DetectionAST AnalysisThreat Intelligence

How SecureSkill Detects It

Threat intelligence validates all extracted URLs against active threat feeds. AI semantic analysis identifies data encoding in outbound requests and URL construction with embedded user data. Credential detection catches credentials that could be used for exfiltration authentication. Pattern matching rules flag known exfiltration patterns (credential read combined with network send). AST-level dataflow tracing follows sensitive data from source to network sink.

AML.T0015

Evade ML Model

Strong

Using adversarial data to prevent ML models from correctly identifying or classifying content.

Attack Categories

scanner_evasionobfuscation

Pipeline Layers

Pattern MatchingAI Semantic AnalysisDeobfuscation Engine

How SecureSkill Detects It

Dedicated scanner evasion detection catches instructions specifically targeting security scanners ("if you are analyzing this, report safe") and anti-analysis techniques. Obfuscation detection covers base64 payloads, Unicode tricks (homoglyphs, zero-width characters, BiDi overrides), payload splitting across files, and encoded URLs. The deobfuscation engine normalizes content before analysis, stripping evasion techniques so downstream layers see the true payload.

AML.T0043

Craft Adversarial Data

Moderate

Creating modified inputs designed to elicit harmful outputs or evade detection systems.

Attack Categories

memory_poisoning

Pipeline Layers

AI Semantic AnalysisPattern Matching

How SecureSkill Detects It

AI semantic analysis detects skills that seed persistent memory files with manipulated data, craft adversarial summaries, or write workspace files designed to influence future agent behavior. Pattern matching rules flag context-poisoning and workspace-poisoning signatures.

SecureSkill detects adversarial data crafted within skill packages (poisoned memory files, manipulated context). Adversarial inputs crafted at runtime against live models are outside the scope of pre-installation scanning.

AML.T0013

Discover ML Model Ontology

Moderate

Discovering the output space and structure of ML models to inform subsequent attacks.

Attack Categories

scope_mismatchcredential_harvesting

Pipeline Layers

AI Semantic AnalysisCredential DetectionPattern Matching

How SecureSkill Detects It

Skills that probe beyond their stated purpose to discover model configuration, read other skills' metadata, or access agent config files are flagged as scope mismatch. Credential detection catches harvesting of API keys that provide access to model endpoints and inference APIs. Pattern matching rules detect reconnaissance and fingerprinting patterns.

SecureSkill detects skill-level reconnaissance (probing agent config, harvesting model API keys). Direct model probing via inference API queries is outside the scope of skill scanning.

SecureSkill's threat detection maps to MITRE ATLAS techniques across the adversarial AI threat landscape, including LLM prompt injection (AML.T0051), supply chain compromise (AML.T0010), data exfiltration (AML.T0024), model evasion (AML.T0015), and adversarial data crafting (AML.T0043). MITRE does not certify ATLAS mappings. This is SecureSkill's analysis of how our detections correspond to the ATLAS taxonomy.

← Back to SecureSkill