SiteShadow

Technical proof

Detection Coverage

This page states what SiteShadow can currently claim, what evidence backs it, and where the boundaries are. It is written for security engineers evaluating detection quality, not as a marketing feature list.

Published by SiteShadow
Technical evidence reviewed before publication
Evidence baseline: May 20, 2026
Extension/CLI: v0.4.9
Hosted scanner: always current

Current Evidence Summary

SiteShadow combines rule-based checks, structural heuristics, dependency checks, cross-file project checks, and WASM-powered taint analysis. The numbers below are evidence-backed controlled benchmark and library counts, not estimates from public marketing copy.

2,011 security checks in public coverage
190 CWE mappings
31 heuristic checks
1,000 MultiLang taint benchmark cases
324 AI security benchmark cases
6,444 Juliet (Java) benchmark cases
Important: controlled benchmark FP/FN rates are not customer-code FP/FN rates. Benchmark suites are designed regression fixtures. Real-world false-positive and false-negative rates require a separate customer-like corpus methodology, which is listed as an open evidence item.

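Supported Languages

"Full" means public taint and rule coverage is benchmark-backed for the listed language family. "Rules" means shipped rule/pattern coverage exists, but public full-taint claims remain qualified. "Config" means pattern and structural configuration checks exist, but they are not taint-led and have no dedicated benchmark suite yet.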

Language / Surface | Status | Taint capability | Primary evidence | Evidence boundary
Python | Full | Source-to-sink taint + rules | OWASP, Juliet, MultiLang, heuristics, AI-security | Full for represented cases; framework-specific sanitizer coverage continues to expand.
JavaScript | Full | Source-to-sink taint + rules | MultiLang, heuristics, AI-security, React XSS cases | Full for represented cases; JSX and template-parser edge cases continue to expand.
TypeScript | Full | JavaScript-family analyzer + TS rules | AI-security and JS-family rule coverage | Full through the JavaScript-family analyzer; deeper type-aware framework modeling is ongoing.
Java | Full | Source-to-sink taint + rules | OWASP, Juliet, MultiLang | Full for represented cases; large-framework route graph coverage continues to expand.
C# | Full | Source-to-sink taint + rules | Juliet, MultiLang, C#/Razor checks | Full for represented cases; ASP.NET middleware and Razor fixture depth continue to expand.
Go | Full | Source-to-sink taint + rules | MultiLang and heuristic benchmark evidence | Full for represented cases; router and validation library models continue to expand.
Ruby | Full | Source-to-sink taint + rules | Language regression evidence and Ruby analyzer evidence | Full for represented cases; Rails/Sinatra and customer-like corpus coverage continue to expand.
PHP | Full | Source-to-sink taint + rules | Language regression evidence and PHP analyzer evidence | Full for represented cases; Laravel/Symfony/WordPress and customer-like corpus coverage continue to expand.
PowerShell | Full | Source-to-sink taint + rules | Language regression evidence and PowerShell analyzer evidence | Full for represented cases; enterprise module and shell-argument sanitizer coverage continue to expand.
Blazor | Rules | C#/Razor-oriented rules | Blazor rule family and C# analyzer evidence | Rules coverage today; dedicated Blazor benchmark pack is required before Full.
YAML / JSON / Dockerfile / Kubernetes / Terraform | Config | Pattern and structural checks, not taint-led | Rule benchmarks where represented | Config checks today; dedicated IaC/cloud benchmark suite is required before Full Config.

Vulnerability-Class Coverage

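The table below summarizes the claim level by vulnerability class. "Green" means the latest controlled evidence is passing for represented cases; it does not mean the class is exhaustively solved for every framework. "Partial" means detection exists, but a documented scanner gap or a missing benchmark suite keeps the claim qualified.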

Class | Claim level | Languages / surfaces | How SiteShadow detects it | Known gap
SQL injection | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP; PowerShell where represented | Source-to-sink taint, raw DB driver heuristics, injection rules | More ORM/framework corpus evidence.
XSS | Green | JS/TS, Python, Java, C#, Ruby, PHP; Blazor rules | DOM sinks, template sinks, sanitizer recognition, framework checks | More template-engine and context-specific escaping fixtures.
Command injection | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell | User-controlled command, shell, process, and eval sink tracking | Broader shell-argument sanitizer modeling across real projects.
Code injection (CWE-94) | Partial | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell | Taint into dynamic-code sinks: eval, new Function, and related dynamic-construction shapes | Some indirect-construction bypass variants; see Currently Known Gaps below.
SSRF | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell where represented | User-controlled URL to HTTP clients, metadata endpoint patterns, AI-output-to-request flows | Allowlist, DNS rebinding, parser, and network validation modeling.
Secrets and credentials | Green | All scanned code/config languages | CWE-798, provider token patterns, config rules, duplicate-secret cross-file detection | Provider pattern drift needs continuous updates and sampled FP review.
Path traversal and file access | Green | Python, JS/TS, Java, C#, Go, Ruby, PHP, PowerShell | User-controlled path to read/write/open sinks with sanitizer handling | Upload-storage, normalization, and platform-specific fixtures.
Auth, access control, IDOR | Green | Framework-dependent across Python, JS/TS, Java, C#, Go | Route/auth heuristics, cross-file auth consistency, object-access indicators | More real app route graphs and middleware corpora.
AI/LLM security flows | Green | Python, JS/TS, Java; MCP tool/config rules | LLM output as tainted source to tools, HTTP, browser automation, shell, storage, email, chat sinks, and unsafe MCP tool access | Sink-library expansion and public AI-agent risk methodology.
IaC, container, configuration | Partial | YAML, JSON, Dockerfile, Kubernetes, Terraform patterns | Pattern and structural config checks | Dedicated IaC/cloud-provider benchmark suite not yet complete.
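
To make "source-to-sink taint" and "represented cases" concrete, here is a minimal Python sketch of what a positive and a negative SQL injection fixture might look like. It is an illustration only, not a fixture from the SiteShadow suites; the function names, table, and payload are all hypothetical.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory database used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_vulnerable(conn, username):
    # Positive case: user-controlled `username` (the source) is concatenated
    # into the SQL text that reaches execute() (the sink).
    query = "SELECT id FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Negative case: the same source reaches the same sink, but only as a
    # bound parameter, so it cannot change the structure of the query.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print(len(find_user_vulnerable(conn, payload)))  # every row matches
    print(len(find_user_safe(conn, payload)))        # no row matches
```

A taint-based check would be expected to report the first function and stay silent on the second. Pairs like this are what allow a suite to measure both missed findings and false alarms.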

Currently Known Gaps

Active scanner gaps as of May 20, 2026. Each item is tracked as a defect with an owner and is published here so customers see the same picture the release team does. Closed defects are removed from this table once their fixtures pass a clean rerun.

Class | Language / surface | What the scanner misses today
Code injection (CWE-94) via Reflect.construct | JavaScript | Reflect.construct(Function, [tainted])(): taint is not flagged through this indirect-construction shape

Evidence and Benchmarks

SiteShadow uses public benchmark suites where strong scored benchmarks exist, and internal regression suites where public benchmarks are immature or unavailable. Internal regression suites are release gates, not independent third-party certifications.

Status definitions (Youden refers to Youden's J statistic: true-positive rate minus false-positive rate):
Release gate: suite is part of the scanner release decision and can block a release. Threshold: Youden ≥ 0.95.
Production baseline: suite is measured and tracked but not yet release-blocking. Threshold: 0.7 ≤ Youden < 0.95.
In calibration: suite is measured and being actively improved; below baseline confidence for public quality claims. Threshold: Youden < 0.7.
Evaluation candidate: external suite looks relevant, but SiteShadow has not yet integrated a reviewed runner, expected-result mapping, and public score.
Planned: the corpus is on the roadmap, but it is not evidence for a current claim yet.

Evidence layer | Status | What it supports | Current limitation
OWASP Benchmark for Java | Release gate | Java web-application vulnerability detection against a public scored benchmark. Current result: +1.000 Youden (1,698 cases, 873 TP, 0 FN, 0 FP). | Java-focused; it does not validate PHP, Ruby, PowerShell, or JavaScript framework claims.
NIST SARD / Juliet CWE suites (Java) | Release gate | CWE-style regression evidence for supported imported Java cases. Current result: +1.000 Youden (6,444 cases, 6,264 TP, 0 FP). | Synthetic corpus; controlled results are not real-world customer-code FP/FN rates.
OWASP Benchmark for Python | Release gate | Python web-application vulnerability detection against a public scored benchmark. Current result: +1.000 Youden (1,230 cases, 452 TP, 0 FP). | Python-focused; it does not validate JavaScript, Ruby, Go, or other-language framework claims.
NIST SARD / Juliet CWE suites (C#) | Release gate | C# CWE-style regression evidence across path traversal, LDAP injection, XPath injection, command injection, and SQL injection. Current result: +1.000 Youden (2,516 cases, 2,412 TP, 0 FP). | Synthetic corpus; controlled results are not real-world customer-code FP/FN rates.
SecBench.js | Evaluation candidate | Server-side JavaScript package vulnerability evidence with executable examples. | Requires an adapter and result mapping before it becomes a release gate.
External vulnerable-app corpus | Planned | Realistic JavaScript, PHP, and Ruby smoke tests using projects such as OWASP Juice Shop, OWASP NodeGoat, DVWA, and OWASP RailsGoat. | These are training/demo applications, not scored SAST benchmarks until expected-finding maps are built.
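
The Youden scores and thresholds above can be reproduced with a short calculation. The sketch below uses the standard definition of Youden's J (sensitivity plus specificity minus one) and maps a score to the three measured statuses. The class and function names are hypothetical. The true-negative count in the usage example is derived by assuming the published case count is the sum of TP, FN, FP, and TN, which this page does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # vulnerable fixtures that were flagged
    fn: int  # vulnerable fixtures that were missed
    fp: int  # safe fixtures that were flagged
    tn: int  # safe fixtures that were left alone


def youden_index(c: ConfusionCounts) -> float:
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    if positives == 0 or negatives == 0:
        # Without both vulnerable and safe fixtures the score is undefined.
        raise ValueError("suite needs both positive and negative cases")
    sensitivity = c.tp / positives
    specificity = c.tn / negatives
    return sensitivity + specificity - 1.0


def measured_status(j: float) -> str:
    # Thresholds as published above. Whether a suite actually blocks a
    # release is a process decision that a score alone does not capture.
    if j >= 0.95:
        return "Release gate"
    if j >= 0.7:
        return "Production baseline"
    return "In calibration"


if __name__ == "__main__":
    # OWASP Benchmark for Java row: 1,698 cases, 873 TP, 0 FN, 0 FP.
    # Assumed: TN = 1698 - 873 - 0 - 0 = 825.
    owasp_java = ConfusionCounts(tp=873, fn=0, fp=0, tn=825)
    j = youden_index(owasp_java)
    print(f"{j:+.3f}", measured_status(j))
```

Because the index rewards catching vulnerable cases and penalizes flagging safe ones, a scanner that flags everything scores 0, the same as one that flags nothing.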

Internal Release Gates

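The release gates below cover languages whose public quality is anchored in internal regression suites rather than external scored benchmarks. Java, Python, and C# are covered by the external release-gate suites listed in the table above. For JavaScript/TypeScript, the external suites listed there (SecBench.js and the vulnerable-app corpus) are an evaluation candidate and a planned corpus respectively, so they are not yet release gates.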

Area | What it protects | How to read it publicly
PHP, Ruby, and PowerShell language regression | Source-to-sink cases, safe negative cases, framework-style sources, command, SSRF, path, and sanitizer behavior. | Supports Full status for represented taint and rule families, with framework-depth limitations still listed above.
AI/LLM security regression | LLM output and AI-agent dataflow into tools, shell, HTTP, browser automation, storage, email, and chat sinks. | Supports current AI/LLM rule-family claims, not a universal claim over all agent frameworks.
Rule, heuristic, cross-file, and taint regression | Release blocking checks for noisy rules, missed findings, multi-file behavior, and explainability fields. | Shows release discipline. It is not a substitute for public benchmarks or customer-like corpus measurement.

This page summarizes SiteShadow's technical coverage sources: the Detection Credibility Matrix and the authoritative benchmark rollup maintained for release review. Controlled benchmark and regression results are not the same as statistically measured customer-code false-positive or false-negative rates.

Benchmark Methodology

Positive and negative cases

Suites include true-positive vulnerable fixtures and true-negative safe fixtures. A suite is not considered strong enough for a coverage claim when it only contains vulnerable examples.
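
The reason for this rule is arithmetic: with no safe fixtures there is no way to observe a false positive, so the false-positive rate cannot be measured. A minimal sketch of such a pre-check follows. The function name and the fixture representation (an id plus an expected-vulnerable flag) are assumptions for illustration, not SiteShadow's suite format.

```python
from collections import Counter
from typing import Iterable, Tuple


def suite_supports_coverage_claim(fixtures: Iterable[Tuple[str, bool]]) -> bool:
    """Return True only if the suite has both vulnerable and safe fixtures.

    Each fixture is (fixture_id, expected_vulnerable). This shape is
    hypothetical and chosen only to keep the example small.
    """
    counts = Counter(expected for _, expected in fixtures)
    return counts[True] > 0 and counts[False] > 0


if __name__ == "__main__":
    only_positives = [("sqli_001", True), ("sqli_002", True)]
    mixed = [("sqli_001", True), ("sqli_001_safe", False)]
    print(suite_supports_coverage_claim(only_positives))  # False
    print(suite_supports_coverage_claim(mixed))           # True
```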

Source-to-sink evidence

Taint findings are evaluated as paths from source to sink, including intermediate propagation steps when the analyzer can observe them. Serious findings now expose source, sink, data path, rule id, confidence, vulnerable pattern, remediation, and benchmark/example link when available.
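
As a rough picture of the fields listed above, a finding could be modeled along the following lines. This is a sketch under stated assumptions: the class names, field names, rule id, and file locations are hypothetical and do not describe SiteShadow's actual output schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Location:
    """One observed point on a taint path (hypothetical shape)."""
    file: str
    line: int
    note: str


@dataclass
class Finding:
    rule_id: str
    confidence: str
    source: Location
    sink: Location
    vulnerable_pattern: str
    remediation: str
    # Intermediate propagation steps, present only when the analyzer saw them.
    data_path: List[Location] = field(default_factory=list)
    # Benchmark or example link, present only when available.
    example_link: Optional[str] = None

    def full_path(self) -> List[Location]:
        # Source first, then any intermediate steps, then the sink.
        return [self.source, *self.data_path, self.sink]

    def render(self) -> str:
        return " -> ".join(f"{p.file}:{p.line} ({p.note})" for p in self.full_path())


# Illustrative values only; "EXAMPLE-SQLI-001" is not a real rule id.
example = Finding(
    rule_id="EXAMPLE-SQLI-001",
    confidence="high",
    source=Location("app.py", 12, "request parameter"),
    sink=Location("app.py", 16, "cursor.execute"),
    vulnerable_pattern="string-built SQL reaches execute()",
    remediation="use a parameterized query",
    data_path=[Location("app.py", 15, "string concatenation")],
)

if __name__ == "__main__":
    print(example.render())
```

Keeping the intermediate steps as an optional list mirrors the wording above: the path is reported when the analyzer can observe it, and the source and sink are reported regardless.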

Regression gate

Scanner and rule releases require evidence across the authoritative SiteShadow non-regression suites. Any regression requires severity classification, assigned remediation ownership, and release approval before shipping.
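
The gate described above can be read as a simple rule: a release may ship with a regression only when that regression has a severity, an owner, and an approval on record. The sketch below encodes that reading. The record shape and function name are assumptions for illustration, not SiteShadow's release tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Regression:
    fixture_id: str
    severity: Optional[str] = None  # classification, if one was assigned
    owner: Optional[str] = None     # who is responsible for remediation
    approved: bool = False          # explicit release approval


def release_blockers(regressions: List[Regression]) -> List[str]:
    """List what is still missing before a release could ship."""
    blockers = []
    for r in regressions:
        missing = []
        if not r.severity:
            missing.append("severity classification")
        if not r.owner:
            missing.append("remediation owner")
        if not r.approved:
            missing.append("release approval")
        if missing:
            blockers.append(f"{r.fixture_id}: missing {', '.join(missing)}")
    return blockers


if __name__ == "__main__":
    found = [
        Regression("xss_014", severity="medium", owner="team-a", approved=True),
        Regression("ssrf_003", severity="high"),
    ]
    for line in release_blockers(found):
        print(line)
```

An empty result means every regression has been triaged and approved; it does not mean the suite had no regressions.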

Claim boundary

Coverage wording cannot exceed benchmark evidence. Where a language has rule coverage but incomplete benchmark-backed taint evidence, this page labels it as rules coverage instead of full taint support.

Current Limitations

Area | Limitation | What it means | Evidence status
Real-world FP/FN rate | No published customer-like corpus methodology yet. | Controlled benchmark rates must not be used as real-world production rates. | Customer-like sampling methodology not yet published.
IaC/container/cloud | No dedicated IaC benchmark suite yet. | Coverage exists where represented by rules, but full benchmark-backed IaC claims stay qualified. | Dedicated IaC benchmark suite pending.
Ruby/PHP/PowerShell corpus depth | Full status is backed by controlled language regression evidence, but real-world framework and enterprise-script variety is broader than the controlled fixtures. | Public claims can say Full for represented taint and rule coverage, while still qualifying framework and customer-like corpus depth. | Full in release-gated language regression evidence; ongoing framework and corpus expansion.
Framework modeling | Middleware, route graphs, sanitizers, and ORM behavior vary by framework. | Supported languages still need continuous framework fixture expansion. | Ongoing framework fixture expansion.
AI/LLM risk classes | AI-agent sink libraries evolve quickly. | AI/LLM coverage is benchmark-backed for current suites and needs continued sink-family expansion. | Current suites covered; sink-family expansion ongoing.
Runtime-only issues | Some vulnerabilities depend on deployed configuration, live identity data, secrets, network reachability, or production authorization state. | Static analysis can flag risky code paths, but it cannot prove every runtime policy, tenancy boundary, or environment-specific behavior. | Requires runtime telemetry, integration tests, or customer environment validation.
Business logic | Deep workflow abuse, fraud logic, approval bypasses, and policy mistakes can require domain context the scanner does not have. | SiteShadow may identify risky authorization or data-flow patterns, but it is not a complete substitute for threat modeling and application-specific review. | Heuristic coverage exists; domain-specific proof remains customer/application dependent.
Dependency and supply chain | SiteShadow detects selected dependency and configuration risks, but it is not a full SCA, SBOM, malware, or exploitability platform. | Use dedicated dependency/SBOM tooling alongside SiteShadow for package vulnerability inventory and transitive dependency governance. | Partial rule coverage only.
Generated, minified, and highly dynamic code | Generated/minified bundles and heavy reflection/metaprogramming can hide intent and reduce path explainability. | Findings may be less precise, and some source-to-sink paths may need source maps, original source, or manual review. | Not claimed as fully measured across all generated-code styles.
How to read this page: green means the represented controlled cases are passing today. It does not mean every framework-specific variant, custom sanitizer, business workflow, runtime policy, or production coding style is fully covered.