fix(search-algolia): add HTML sanitization to prevent XSS in search results #11726
hasseneafif wants to merge 1 commit into facebook:main
Hi @hasseneafif! Thank you for your pull request and welcome to our community.
Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly.
If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
I have signed the CLA.
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
I'm not sure if this is an actual attack vector, but I do think this is not as benign as other places where we use
Yes, I do have a proof of concept. I ran XSS POC tests locally that demonstrate real attack scenarios. If an attacker can inject malicious content into the Algolia index (either by compromising the index or through user-generated content that gets indexed), they can execute XSS attacks.
Attack scenarios tested:
etc. I could provide it if needed.
I would be interested to see your POC, indeed.
FYI, this only impacts our search page (barely used), which is implemented with custom code. But the official DocSearch modal (widely used) also uses

https://github.com/algolia/docsearch/blob/main/packages/docsearch-react/src/Snippet.tsx

    export function Snippet<TItem extends StoredDocSearchHit>({
      hit,
      attribute,
      tagName = 'span',
      ...rest
    }: SnippetProps<TItem>): JSX.Element {
      return createElement(tagName, {
        ...rest,
        dangerouslySetInnerHTML: {
          __html: getPropertyByPath(hit, `_snippetResult.${attribute}.value`) || getPropertyByPath(hit, attribute),
        },
      });
    }

https://github.com/algolia/docsearch/blob/main/packages/docsearch-react/src/Results.tsx

    <div className="DocSearch-Hit-content-wrapper">
      <Snippet className="DocSearch-Hit-title" hit={item} attribute={`hierarchy.${item.type}`} />
      <Snippet className="DocSearch-Hit-path" hit={item} attribute="hierarchy.lvl1" />
    </div>

What I mean is: if indeed the DocSearch/Algolia index can be compromised and doesn't sanitize inputs, then DocSearch in general is probably vulnerable, not just the Docusaurus integration. Note that we don't use exactly the same search hit attributes (we use title/summary/breadcrumb), so it's hard for me to know which ones are safe/unsafe to use with
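To make the point above concrete, here is a minimal sketch of how a snippet value could flow from a hit into `__html` unchanged. `lookupByPath`, `resolveSnippetHtml`, and the `hit` object are hypothetical stand-ins written for illustration; they are not DocSearch's actual `getPropertyByPath` implementation, and the record shape is assumed from the attributes read in the `Snippet` code quoted above.

```typescript
// Hypothetical stand-in for a dot-path lookup helper (NOT the real DocSearch code).
// It walks a dot-separated path through nested objects.
function lookupByPath(record: unknown, path: string): unknown {
  return path.split('.').reduce<unknown>((current, key) => {
    if (current !== null && typeof current === 'object') {
      return (current as Record<string, unknown>)[key];
    }
    return undefined;
  }, record);
}

// Same fallback order as the quoted Snippet component:
// the snippet value first, then the raw attribute.
function resolveSnippetHtml(record: unknown, attribute: string): unknown {
  return (
    lookupByPath(record, `_snippetResult.${attribute}.value`) ||
    lookupByPath(record, attribute)
  );
}

// Hypothetical hit, assuming an attacker managed to get markup into the index.
const hit = {
  hierarchy: {lvl1: 'Getting started'},
  _snippetResult: {
    hierarchy: {
      lvl1: {value: 'Getting <mark>started</mark><img src=x onerror="alert(1)">'},
    },
  },
};

// Nothing in this path escapes or filters the value before it would reach __html.
const html = resolveSnippetHtml(hit, 'hierarchy.lvl1');
```

Whether such a value can actually end up in an index depends on how the crawler and the Algolia API treat markup, which this sketch does not model.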
|
https://docusaurus.io/search?q=%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E
This doesn't trigger alerts for me?
The problem is not The question is how Algolia indexes things and how it sanitizes them before returning them through the API. I tried adding script tags through the interface, and from what I can see, this doesn't seem to work (the value is removed/reverted somehow), so I'm curious to understand how @hasseneafif's POC works exactly and what's inside the index.
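For reference, the `q` parameter in the URL above decodes to a plain script-tag payload. The sketch below uses only the standard `decodeURIComponent`; the variable names are illustrative.

```typescript
// The `q` parameter from the URL in the comment above, still percent-encoded.
const encodedQuery = '%3Cscript%3Ealert%28%22XSS%22%29%3C%2Fscript%3E';

// Decodes to the literal string: <script>alert("XSS")</script>
const decodedQuery = decodeURIComponent(encodedQuery);
```

This payload lives in the search query, not in the index, so it only tests how the query itself is rendered. Note also that browsers do not execute `<script>` elements inserted through `innerHTML`, so this particular payload would likely stay silent even if it were rendered unescaped; event-handler attributes such as `onerror` behave differently.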
Pre-flight checklist
Motivation
This PR addresses a security vulnerability and a code quality issue:
Security: The Algolia search theme was rendering HTML from search results using dangerouslySetInnerHTML without proper sanitization. While Algolia's search results are generally trusted, this creates a potential XSS vulnerability if:
This PR adds comprehensive HTML sanitization to prevent XSS attacks while preserving search highlighting functionality, and modernizes deprecated JavaScript methods.
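As an illustration of the general idea (not the code in this PR), an allowlist-based sanitizer could look roughly like the sketch below. It assumes highlight markup consists only of bare tags with no attributes, taken from the tag list in the Test Plan; `ALLOWED_TAGS`, `escapeHtml`, and `sanitizeHighlight` are hypothetical names.

```typescript
// Assumption: highlighting only ever uses these bare tags, with no attributes.
const ALLOWED_TAGS = ['em', 'mark', 'strong', 'b', 'i'];

// Escape every character that is significant in HTML.
function escapeHtml(input: string): string {
  return input
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Escape everything first, then restore only bare allowlisted tags.
// A tag carrying attributes (e.g. <em onclick="...">) never matches the
// restore pattern, so it stays escaped and is rendered as text.
function sanitizeHighlight(input: string): string {
  const escaped = escapeHtml(input);
  const pattern = new RegExp(`&lt;(/?)(${ALLOWED_TAGS.join('|')})&gt;`, 'gi');
  return escaped.replace(
    pattern,
    (_match: string, slash: string, tag: string) => `<${slash}${tag.toLowerCase()}>`,
  );
}
```

One trade-off of this escape-then-restore approach: any HTML entities already present in the input are escaped a second time, so it only suits values that arrive as raw text plus highlight tags. A DOM-based sanitizer library is the more common choice when the input can contain richer markup.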
Test Plan
Security Testing
Created a comprehensive test suite with 14 test cases covering:
<em>, <mark>, <strong>, <b>, <i>
<script>, <iframe>, <object>, <img>, etc.
onclick, onload, onerror, etc.
javascript:, data: URIs
Verification Steps