[Artifacts] Harden Mol* fallback gene escaping and URL encoding

Task ID: ff90b5cf-54bf-45f6-a31d-36ad5bc87135 Priority: P50 Type: one_shot Status: Complete

Goal

Harden the Mol* dynamic fallback on /hypothesis pages: emit the target gene as a JS-safe literal and URL-encode it in the manual RCSB/AlphaFold search links, so the 3D artifact panel stays reliable for unusual target gene symbols.

Acceptance Criteria

☑ Fix JS-safe gene literal handling in Mol* dynamic fallback (line 29820)
☑ Gene name properly escaped to prevent </script> injection attacks
☑ Normal gene names (e.g., FOXP2) work unchanged
☑ Protein viewer kept on entity/hypothesis/challenge detail pages
☑ Protein viewer NOT featured on /showcase page

Changes Made

api.py line 29820

Before:

var gene = {json.dumps(target_gene_first)};

After:

var gene = {json.dumps(html.escape(target_gene_first))};

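Why: json.dumps() alone does not HTML-escape its output. If a gene name contains </script>, the HTML parser ends the <script> block at that point, regardless of the surrounding JavaScript string quotes, which enables XSS. HTML-escaping before JSON-encoding turns </script> into &lt;/script&gt; inside the JSON string, so the HTML parser never sees the closing tag sequence. Entities are not decoded inside a <script> element, so the JavaScript value carries the entity text literally; gene symbols without HTML-special characters are unaffected.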
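The difference between the two lines can be seen in a minimal Python sketch. The helper names render_unsafe and render_fixed are hypothetical and only mimic the f-string expression on line 29820; they are not taken from api.py.

```python
import html
import json


def render_unsafe(gene: str) -> str:
    """Hypothetical stand-in for the 'Before' line: JSON-encode only."""
    return f"var gene = {json.dumps(gene)};"


def render_fixed(gene: str) -> str:
    """Hypothetical stand-in for the 'After' line: HTML-escape, then JSON-encode."""
    return f"var gene = {json.dumps(html.escape(gene))};"


hostile = "TEST</script><script>alert(1)//"

# json.dumps leaves '<', '>' and '/' untouched, so the closing tag survives.
print(render_unsafe(hostile))

# html.escape rewrites '<' and '>' as entities before JSON encoding.
print(render_fixed(hostile))

# A symbol with no HTML-special characters is rendered identically by both.
print(render_fixed("FOXP2") == render_unsafe("FOXP2"))
```

Note that html.escape also rewrites & and quote characters by default, so a symbol containing any of those would reach the JavaScript variable in its entity form.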

Verification

Tested escaping logic:

  • Problematic gene TEST</script><script>alert(1)// becomes "TEST&lt;/script&gt;&lt;script&gt;alert(1)//" — safe
  • Normal gene FOXP2 becomes "FOXP2" — no change
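One way such a check could be automated is to count how many script blocks an HTML tokenizer sees in the rendered fragment. The sketch below uses Python's standard html.parser as a rough stand-in for a browser parser; ScriptCounter, count_script_blocks, and wrap are hypothetical names, and the one-line fragment is an assumption, not the real template.

```python
import html
import json
from html.parser import HTMLParser


class ScriptCounter(HTMLParser):
    """Counts <script> start tags as the HTML tokenizer encounters them."""

    def __init__(self) -> None:
        super().__init__()
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts += 1


def count_script_blocks(page: str) -> int:
    parser = ScriptCounter()
    parser.feed(page)
    parser.close()
    return parser.scripts


def wrap(literal: str) -> str:
    # Hypothetical page fragment; the real template in api.py is larger.
    return f"<script>var gene = {literal};</script>"


hostile = "TEST</script><script>alert(1)//"
before = wrap(json.dumps(hostile))
after = wrap(json.dumps(html.escape(hostile)))

# Unescaped: the payload closes the first block and opens a second one.
assert count_script_blocks(before) == 2
# Escaped: the payload stays inside the single intended block.
assert count_script_blocks(after) == 1
# Ordinary symbols pass through unchanged.
assert json.dumps(html.escape("FOXP2")) == '"FOXP2"'
```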

Showcase Restriction

Confirmed: /showcase page (line 55202+) does NOT include protein_viewer_html or molstar_html. The showcase highlights debate transcripts, hypothesis cards with scores, evidence tables, KG visualizations, and notebooks — not the 3D protein viewer.

Work Log

2026-04-16 — ff90b5cf-harden-mol-fallback

  • Investigated task: no prior commits found referencing this task ID
  • Reviewed api.py Mol* implementation (lines 29700-29870)
  • Identified XSS vulnerability: json.dumps(target_gene_first) on line 29820 doesn't HTML-escape, allowing </script> injection
  • Applied fix: json.dumps(html.escape(target_gene_first))
  • Verified: showcase page doesn't feature protein viewer (correct)
  • Verified: search links in fallback already use encodeURIComponent(gene) (correct)
  • Committed and pushed fix
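To illustrate what the encodeURIComponent(gene) call in the search links does to unusual symbols, here is a rough Python approximation. encode_component is a hypothetical helper, and SEARCH_BASE is a placeholder URL, not the actual RCSB or AlphaFold endpoint used in api.py.

```python
from urllib.parse import quote


def encode_component(value: str) -> str:
    """Approximate JavaScript's encodeURIComponent for illustration."""
    # encodeURIComponent leaves A-Z a-z 0-9 - _ . ! ~ * ' ( ) unescaped;
    # quote() already keeps letters, digits and "-_.~", so add the rest.
    return quote(value, safe="!*'()")


# Hypothetical base URL, not the real RCSB or AlphaFold endpoint.
SEARCH_BASE = "https://example.org/search?q="

for gene in ("FOXP2", "HLA-DRB1", "A&B c/d"):
    print(SEARCH_BASE + encode_component(gene))
```

Ordinary symbols such as FOXP2 and HLA-DRB1 pass through unchanged, while characters like &, space, and / are percent-encoded so they cannot alter the query string or path.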

File: ff90b5cf_54b_spec.md
Modified: 2026-04-25 22:00
Size: 2.3 KB