Hello Google Ads API Team,
Firstly, thanks so much for your great work!
While using and reviewing `googleads-python-lib`, I came across a potential XML parsing issue in the `ZeepSchemaHelper` class that I'd like to raise for discussion.
I understand that the library is designed to work with trusted WSDL endpoints provided by Google, and this issue is unlikely to be exploitable under normal use. However, for defense-in-depth and potential future-proofing, I wanted to share the finding.
# Affected Source Code: `googleads/common.py`
class ZeepSchemaHelper(GoogleSchemaHelper):
    def __init__(self, endpoint, timeout, proxy_config, namespace_override, cache):
        ...
        transport = _ZeepProxyTransport(timeout, proxy_config, cache)
        try:
            data = transport.load(endpoint)  # [Untrusted Input Source: XML from user-supplied endpoint]
        except requests.exceptions.HTTPError as e:
            raise googleads.errors.GoogleAdsSoapTransportError(str(e))
        self.schema = zeep.xsd.Schema(
            lxml.etree.fromstring(data)  # [VULNERABILITY SINK: unsafe XML parsing]
        )
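Because `lxml.etree.fromstring(data)` is called with the default parser, entity declarations in the fetched document are processed. If the endpoint ever served attacker-controlled XML, nested (recursive) entity declarations could be used to cause exponential memory usage, as in the Billion Laughs attack.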
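A possible fix is to create a hardened parser:
parser = lxml.etree.XMLParser(
    resolve_entities=False,
    load_dtd=False,
    no_network=True
)
and pass it as the second argument to `lxml.etree.fromstring`, so that entities are not expanded, external DTDs are not loaded, and no network lookups are made during parsing.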
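For illustration, here is a minimal sketch of how the hardened parser might be wired into the parsing step. The helper name `parse_xml_hardened` is hypothetical and not part of `googleads-python-lib`. It only assumes that the parser is passed as the second argument to `lxml.etree.fromstring`.

```python
import lxml.etree


def parse_xml_hardened(data):
    """Parse XML bytes without expanding entities or loading DTDs.

    Hypothetical helper for illustration only; not part of the library.
    """
    parser = lxml.etree.XMLParser(
        resolve_entities=False,  # leave entity references unexpanded
        load_dtd=False,          # do not load external DTDs
        no_network=True,         # forbid network access during parsing
    )
    return lxml.etree.fromstring(data, parser)
```

In `ZeepSchemaHelper.__init__`, the call `lxml.etree.fromstring(data)` could then presumably be replaced by `parse_xml_hardened(data)`, assuming nothing downstream depends on entities being expanded.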