OWASP LLM Top 10 Alignment
The OWASP Top 10 for LLM Applications is the most widely referenced security framework for large language model applications. Here is how SecureSkill's multi-layer scan pipeline aligns with each category.
Prompt Injection
Manipulating LLMs via crafted inputs can lead to unauthorized access, data breaches, and compromised decision-making.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Prompt injection detection is a core strength of SecureSkill. Two dedicated attack categories cover direct injection (explicit overrides, role reassignment, "ignore previous instructions") and indirect injection (hidden instructions in documentation, assets, and template files loaded into context). AI semantic analysis examines every file in the skill package for injection patterns. Scanner evasion detection specifically catches meta-injection that targets security tools. The deobfuscation engine strips Unicode tricks that could hide injection payloads.
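To make the deobfuscation and direct-injection ideas concrete, here is a minimal Python sketch. It is not SecureSkill's implementation: the function names, the invisible-character list, and the two signatures are assumptions chosen for illustration, and a production rule set would be far larger.

```python
import re
import unicodedata

# Zero-width and bidirectional control characters often used to hide text.
# Illustrative list, not exhaustive.
_INVISIBLE = re.compile(r"[\u200b-\u200f\u202a-\u202e\u2060-\u2064\ufeff]")

# Hypothetical direct-injection signatures for this sketch only.
_INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.I),
    re.compile(r"you\s+are\s+now\s+(?:a|an|the)\s+\w+", re.I),
]


def deobfuscate(text: str) -> str:
    """Drop invisible characters, then fold compatibility forms (e.g. fullwidth letters)."""
    stripped = _INVISIBLE.sub("", text)
    return unicodedata.normalize("NFKC", stripped)


def find_injections(text: str) -> list[str]:
    """Return every signature match found in the deobfuscated text."""
    clean = deobfuscate(text)
    return [m.group(0) for p in _INJECTION_PATTERNS for m in p.finditer(clean)]
```

Running the patterns only after normalization is the point of the sketch: a zero-width space inside "ignore" defeats a naive regex but not one applied to the cleaned text.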
Insecure Output Handling
Neglecting to validate LLM outputs may lead to downstream security exploits, including code execution that compromises systems and exposes data.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
SecureSkill scans skill input (what the skill instructs the agent to do), not runtime output (what the agent actually produces). However, scope mismatch and malicious script detection catch skills that instruct agents to produce dangerous outputs such as code execution, file writes to system paths, or network calls with embedded data.
Full LLM02 coverage requires runtime output validation. SecureSkill detects dangerous output instructions at the skill level, but cannot monitor what the agent actually produces during execution.
Training Data Poisoning
Tampered training data can impair LLMs, leading to responses that may compromise security, accuracy, or ethical behavior.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Memory poisoning detection catches skills that seed persistent memory files with manipulated data, effectively poisoning the agent's context across sessions. Promotion detection flags instructions that inject hidden content into persistent agent config files, altering future behavior. Pattern matching rules detect workspace and context poisoning signatures.
SecureSkill addresses agent-level context poisoning (persistent memory, workspace files). Base model training data poisoning is outside the scope of pre-installation skill scanning.
Model Denial of Service
Overloading LLMs with resource-heavy operations can cause service disruptions and increased costs.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Resource abuse detection flags skills that perform legitimate operations at disproportionate scale: loops making network requests, excessive subprocess spawning, and API call volumes far exceeding stated purpose. Scope mismatch catches skills that generate excessive context or recursive prompts. Obfuscation detection identifies encoded payloads that could expand to large sizes, consuming disproportionate processing resources.
Direct model-level DoS (e.g., adversarial inputs designed to maximize inference cost) is outside the scope of skill scanning. SecureSkill detects skill-level resource abuse patterns that could lead to service disruption.
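One of the resource abuse patterns described above, network requests made inside loops, can be sketched with Python's standard `ast` module. This is an illustrative approximation, not SecureSkill's detector: the `NETWORK_CALLS` set and the function names are assumptions, and the sketch only handles Python source.

```python
import ast

# Hypothetical set of call names treated as network requests in this sketch.
NETWORK_CALLS = {"requests.get", "requests.post", "urllib.request.urlopen"}


def _dotted_name(node: ast.AST) -> str:
    """Rebuild a dotted name such as 'requests.get' from a call target."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # target is not a plain dotted name (e.g. a call result)


def network_calls_in_loops(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, call name) pairs for network calls nested in for/while loops."""
    findings = set()  # a set avoids duplicates when loops are nested
    for loop in ast.walk(ast.parse(source)):
        if not isinstance(loop, (ast.For, ast.While)):
            continue
        for node in ast.walk(loop):
            if isinstance(node, ast.Call):
                name = _dotted_name(node.func)
                if name in NETWORK_CALLS:
                    findings.add((node.lineno, name))
    return sorted(findings)
```

A finding here is only a signal. Whether the volume is disproportionate still depends on comparing it with the skill's stated purpose.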
Supply Chain Vulnerabilities
Depending on compromised components, services, or datasets undermines system integrity, causing data breaches and system failures.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Dedicated supply chain detection covers known malicious package patterns, remote script execution at install time, and mutable remote imports. Real-time threat intelligence checks extracted URLs, domains, and file hashes against active threat feeds. Vulnerability database queries identify known-compromised npm and PyPI dependencies by cross-referencing CVE and advisory databases. Credential detection catches embedded secrets from compromised publishers. AI semantic analysis evaluates publisher metadata, dependency declarations, and version history for suspicious indicators.
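As an illustration of the "remote script execution at install time" signal, the sketch below flags lines where a download command is piped straight into a shell. The regular expression and function name are assumptions made for this example; they do not describe SecureSkill's actual rules, which would need to cover many more variants.

```python
import re

# Illustrative pattern: curl or wget output piped into sh, bash, zsh, or dash.
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z|da)?sh\b"
)


def find_remote_script_execution(text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line matching the pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if PIPE_TO_SHELL.search(line):
            findings.append((lineno, line.strip()))
    return findings
```

The trailing word boundary keeps the pattern from firing on commands such as `shasum`, which begin with "sh" but are not shells.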
Sensitive Information Disclosure
Failure to protect against disclosure of sensitive information in LLM outputs can result in legal consequences or a loss of competitive advantage.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Credential detection scans every file with detectors covering AWS, GCP, Azure, GitHub, and hundreds more services. AI semantic analysis detects instructions to read sensitive files and transmit data externally. Pattern matching rules flag data exfiltration patterns including URL construction with embedded user data, encoded payloads, and network calls with sensitive content. AST-level dataflow tracing follows credential reads from source to network sink. Threat intelligence validates destination URLs against active threat feeds.
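The per-file credential scan can be sketched as a table of named detectors applied line by line. The two patterns below follow the publicly known shapes of AWS access key IDs and classic GitHub personal access tokens. Everything else (the `DETECTORS` table, the function name, the finding format) is a hypothetical layout for illustration, not SecureSkill's detector set.

```python
import re

# Two illustrative detectors; a real scanner would carry many more.
DETECTORS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_for_credentials(files: dict[str, str]) -> list[dict]:
    """Scan a mapping of path -> file content and return redacted findings."""
    findings = []
    for path, content in files.items():
        for lineno, line in enumerate(content.splitlines(), start=1):
            for name, pattern in DETECTORS.items():
                for match in pattern.finditer(line):
                    secret = match.group(0)
                    findings.append({
                        "file": path,
                        "line": lineno,
                        "detector": name,
                        # Keep only the prefix so the report never repeats the secret.
                        "redacted": secret[:4] + "*" * (len(secret) - 4),
                    })
    return findings
```

Redacting the match in the finding is a deliberate design choice in this sketch: the scan report should not itself become a place where secrets leak.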
Insecure Plugin Design
LLM plugins processing untrusted inputs and having insufficient access control risk severe exploits like remote code execution.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Plugin security is SecureSkill's core function: the entire product is a security scanner for agent plugins (skills). All 20+ attack categories collectively evaluate plugin security, covering scope mismatch, excessive permissions, dangerous code execution, unauthorized data access, credential exposure, file system abuse, and more. Every layer of the pipeline contributes to a comprehensive pre-installation security assessment of the plugin before it can interact with the agent.
Excessive Agency
Granting LLMs unchecked autonomy to take action can lead to unintended consequences, jeopardizing reliability, privacy, and trust.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Scope mismatch detection directly addresses excessive agency by comparing a skill's declared purpose against its actual capabilities and flagging skills that request or exercise permissions far beyond what is needed. Tool scope manipulation catches skills that request tools beyond their stated purpose or attempt to override tool restrictions at runtime. Subagent abuse detection flags skills that spawn sub-agents to bypass parent restrictions or escalate privileges. The permission map output shows exactly what the skill reads, writes, executes, and contacts, making excessive agency visible.
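The comparison of declared purpose against actual capabilities reduces, in its simplest form, to a set difference over the four dimensions named above (reads, writes, executes, contacts). The sketch below assumes a hypothetical `PermissionMap` structure. SecureSkill's real permission map format is not described in this document, so the field layout and function name are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class PermissionMap:
    """Hypothetical permission map: what a skill reads, writes, executes, and contacts."""
    reads: set[str] = field(default_factory=set)
    writes: set[str] = field(default_factory=set)
    executes: set[str] = field(default_factory=set)
    contacts: set[str] = field(default_factory=set)


def scope_mismatch(declared: PermissionMap, observed: PermissionMap) -> dict[str, set[str]]:
    """Return capabilities observed in the skill but absent from its declaration."""
    excess = {}
    for kind in ("reads", "writes", "executes", "contacts"):
        extra = getattr(observed, kind) - getattr(declared, kind)
        if extra:
            excess[kind] = extra
    return excess
```

For example, a documentation helper that declares read access to `./docs` but is observed reading `~/.ssh/id_rsa` and contacting an external host would yield entries under both `reads` and `contacts`.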
Overreliance
Failing to critically assess LLM outputs can lead to compromised decision making, security vulnerabilities, and legal liabilities.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Trust exploitation detection catches fake approval workflows, authority bias patterns ("IMPORTANT SYSTEM UPDATE"), and social engineering that exploits overreliance on the agent's judgment. Dedicated detection rules target approval fatigue induction, technical jargon masking dangerous operations, and urgency exploitation. The trust signals output provides publisher reputation data to help users make informed trust decisions rather than blindly relying on the agent.
Overreliance is fundamentally a human behavioral risk. SecureSkill detects skills designed to exploit user trust, but cannot prevent users from over-trusting legitimate agent outputs.
Model Theft
Unauthorized access to proprietary large language models risks theft, loss of competitive advantage, and dissemination of sensitive information.
Attack Categories
Pipeline Layers
How SecureSkill Detects It
Credential detection specifically identifies API keys for model providers (Anthropic, OpenAI, Google, Azure, and hundreds more), which could enable unauthorized model access. AI semantic analysis detects instructions to harvest or exfiltrate model provider credentials. Pattern matching rules flag credential access patterns targeting provider-specific paths and environment variables.
SecureSkill detects credential theft that could enable unauthorized model access. Broader model theft vectors (weight exfiltration, model reverse engineering, side-channel attacks) are outside the scope of skill scanning.
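A pattern rule for credential access targeting provider-specific environment variables might look like the following sketch. It assumes Python 3.9+ (where `ast.Subscript.slice` is the index expression itself) and uses two conventional variable names as examples. The function name and the variable list are assumptions, not SecureSkill's rules.

```python
import ast

# Conventional environment variable names for two providers; illustrative, not exhaustive.
PROVIDER_ENV_VARS = {"OPENAI_API_KEY", "ANTHROPIC_API_KEY"}


def provider_key_reads(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, variable name) pairs where a provider key name is looked up.

    Deliberately broad: any subscript, or any call whose first positional
    argument is a provider variable name, counts as a lookup.
    """
    findings = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Subscript):
            key = node.slice           # e.g. os.environ["NAME"]
        elif isinstance(node, ast.Call) and node.args:
            key = node.args[0]         # e.g. os.getenv("NAME")
        else:
            continue
        if isinstance(key, ast.Constant) and key.value in PROVIDER_ENV_VARS:
            findings.add((node.lineno, key.value))
    return sorted(findings)
```

Reading a provider key is not malicious on its own. A rule like this is most useful when combined with evidence that the value later reaches a network call.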
SecureSkill aligns with the OWASP Top 10 for LLM Applications, providing full coverage for prompt injection, supply chain vulnerabilities, sensitive information disclosure, insecure plugin design, and excessive agency, with partial coverage across the remaining five categories. Coverage is delivered through a multi-layer analysis pipeline: AI semantic analysis across 20+ purpose-built attack categories, comprehensive pattern matching rules, AST-level code analysis, dependency vulnerability scanning, real-time threat intelligence, credential detection, and Unicode deobfuscation.
