Sending Huge Tree loses namespace #218
Comments
Thanks, @coolacid. I'll look into this.
Is this related to the huge_tree issue? If so, possible duplicate of #18.
Not a duplicate of #18. While related, this is specific to when huge_tree is disabled and the namespace no longer parses. Turning on huge_tree does work around the issue, but the package should be able to handle the input when huge_tree is set to false.
Here's a self-contained script to reproduce the issue.

from __future__ import print_function

import traceback

# Needed only if the set_xml_parser() line near the end is uncommented.
import libtaxii.common
import libtaxii.messages_11 as tm11


def load_xml(xml_bytes):
    """Parse a TAXII 1.1 message and print either the result or the traceback."""
    try:
        msg = tm11.get_message_from_xml(xml_bytes)
        print('Loaded %r' % (msg,))
    except Exception:
        print(traceback.format_exc())


if __name__ == '__main__':
    poll_request_bytes = tm11.PollRequest(
        message_id=tm11.generate_message_id(),
        collection_name='Test collection',
        subscription_id='Test subscription',
    ).to_xml()

    # Use some content for the text node that is known to trigger the
    # parser's
    #   XMLSyntaxError: xmlSAX2Characters: huge text node
    # exception by having ten million bytes in the text node *and* an entity
    # that needs expansion (the '<' below, serialized as &lt;).
    huge_text = '<' + ('x' * (10 * 1000 * 1000))
    inbox_msg_bytes = tm11.InboxMessage(
        message_id=tm11.generate_message_id(),
        content_blocks=[
            tm11.ContentBlock(
                content_binding='urn:example.com:huge_tree_issue:218',
                content=huge_text,
            )
        ]
    ).to_xml()

    print('First load of poll request - works fine')
    load_xml(poll_request_bytes)

    print('\nLoad inbox message - expected to error due to "huge text node"')
    load_xml(inbox_msg_bytes)

    print('\nSecond load of same poll request - should work fine too')
    # Uncomment next line to "fix" by resetting broken cached parser
    # libtaxii.common.set_xml_parser(None)
    load_xml(poll_request_bytes)

For a Python 2.7.15 install this gives:
Note how the second parse of the poll request now fails with
Arguably this is an lxml bug - it seems that the
There are several ways to work around the problem:
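One workaround that is visible in the reproduction script is resetting the cached parser after a failed parse. The following is a minimal sketch of that pattern, not the fix that was applied to the library. The helper name `load_with_reset` and its parameters are hypothetical; the parse and reset callables are passed in so the sketch does not depend on any particular library version.

```python
def load_with_reset(parse, reset_parser, xml_bytes):
    """Parse xml_bytes; if parsing raises, reset the cached parser and re-raise.

    `parse` and `reset_parser` are caller-supplied callables (hypothetical
    parameters for this sketch, not part of any library API).
    """
    try:
        return parse(xml_bytes)
    except Exception:
        # A failed parse may leave the cached parser in a bad state, so
        # discard it before the next message is handled.
        reset_parser()
        raise
```

With libtaxii, `parse` would presumably be `tm11.get_message_from_xml` and `reset_parser` would be `lambda: libtaxii.common.set_xml_parser(None)`, the same call the script above uses in its commented-out line.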
I don't think that is a good idea, because it can cause Python to crash with a trivially constructed, deeply nested input document. See my comment on #18.
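To illustrate how little effort such an input takes, here is a small sketch that builds a deeply nested XML string. The function name `build_nested_xml` and the depth value are assumptions chosen for illustration. The string is deliberately not passed to any parser here, and whether a given depth actually causes a crash would depend on the lxml and libxml2 versions and on the parser options in use.

```python
def build_nested_xml(depth, tag='a'):
    """Return an XML string with `depth` elements nested inside each other."""
    return ('<%s>' % tag) * depth + ('</%s>' % tag) * depth


# One short expression yields a document nested 100,000 levels deep.
doc = build_nested_xml(100000)
```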
I made a change to how we use the cached
Sending TAXII 1.1 messages with a huge tree causes future messages to lose the namespace.
Example output:
Simple Test Script:
Sample data would be too big to provide. However, start by creating a standard discovery packet, then a huge tree packet (Example: Large sample file), then another discovery packet. This will reproduce the issue.