CVE-2024-0243
Published:February 24, 2024
Updated:May 25, 2026
With the following crawler configuration:
from bs4 import BeautifulSoup as Soup
url = "https://example.com"
loader = RecursiveUrlLoader(
url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()
An attacker in control of the contents of "https://example.com" could place a malicious HTML file in there with links like "https://example.completely.different/my_file.html" and the crawler would proceed to download that file as well even though "prevent_outside=True".
https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51
Resolved in https://github.com/langchain-ai/langchain/pull/15559
Affected Packages
langchain (PYTHON):
Affected version(s) >=0.0.1 <0.1.0Fix Suggestion:
Update to version 0.1.0Additional Notes
The description of this vulnerability differs from MITRE.
Related Resources (7)
Do you need more information?
Contact UsCVSS v4
Base Score:
1
Attack Vector
LOCAL
Attack Complexity
HIGH
Attack Requirements
NONE
Privileges Required
HIGH
User Interaction
PASSIVE
Vulnerable System Confidentiality
LOW
Vulnerable System Integrity
LOW
Vulnerable System Availability
NONE
Subsequent System Confidentiality
LOW
Subsequent System Integrity
LOW
Subsequent System Availability
NONE
CVSS v3
Base Score:
3.7
Attack Vector
LOCAL
Attack Complexity
HIGH
Privileges Required
HIGH
User Interaction
REQUIRED
Scope
CHANGED
Confidentiality
LOW
Integrity
LOW
Availability
NONE
Weakness Type (CWE)
Server-Side Request Forgery (SSRF)
EPSS
Base Score:
0.10