Unexpected diff when writing XML #1184

Closed
cuixq opened this issue Aug 16, 2024 · 5 comments

Comments

@cuixq
Contributor

cuixq commented Aug 16, 2024

Tabs in text are escaped to `&#x9;` when we write XML in the Maven updater.
Go does escaping in EncodeToken: https://github.com/golang/go/blob/master/src/encoding/xml/marshal.go#L223
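
A minimal sketch of the behaviour described above, using only the standard `encoding/xml` package. The element name `name` and the helper `encodeText` are made up for illustration and are not taken from the Maven updater.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// encodeText is a hypothetical helper: it writes <name>text</name>
// token by token through xml.Encoder and returns the encoded result.
func encodeText(text string) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	start := xml.StartElement{Name: xml.Name{Local: "name"}}
	tokens := []xml.Token{start, xml.CharData(text), start.End()}
	for _, tok := range tokens {
		if err := enc.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	// The encoder buffers its output, so flush before reading the buffer.
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := encodeText("\tfoo")
	if err != nil {
		panic(err)
	}
	// The literal tab is written as the character reference &#x9;.
	fmt.Println(out)
}
```

Because a pom.xml that is indented with tabs contains tabs inside character data, every such tab shows up in the diff after a round trip.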

@cuixq cuixq added bug Something isn't working guided remediation Related to guided remediation / osv-scanner fix labels Aug 16, 2024
@cuixq cuixq self-assigned this Aug 16, 2024
cuixq added a commit that referenced this issue Aug 21, 2024
#1184

When encoding tokens to pom.xml, tabs are escaped and this is not what
we want.
In this PR, before writing to pom.xml, we replace all escaped tabs with
unescaped characters.
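
The workaround in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the function name `unescapeTabs` is hypothetical, and it assumes the encoded document is available as a byte slice before it is written to disk.

```go
package main

import (
	"bytes"
	"fmt"
)

// unescapeTabs is a hypothetical post-processing step: it replaces the
// character reference that encoding/xml emits for a tab with a literal tab.
func unescapeTabs(encoded []byte) []byte {
	return bytes.ReplaceAll(encoded, []byte("&#x9;"), []byte("\t"))
}

func main() {
	encoded := []byte("<name>&#x9;foo</name>")
	fmt.Printf("%q\n", unescapeTabs(encoded))
}
```

One limitation of this approach is that a file which deliberately spelled a tab as `&#x9;` would also be rewritten, since the replacement cannot tell the two cases apart.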
@cuixq cuixq closed this as completed Aug 21, 2024
@cuixq cuixq reopened this Aug 21, 2024
@cuixq cuixq changed the title Tabs are escaped when writing XML Texts are escaped when writing XML Aug 21, 2024
@cuixq
Contributor Author

cuixq commented Aug 21, 2024

Not only tabs but also other characters are escaped: https://github.com/golang/go/blob/master/src/encoding/xml/xml.go#L1916
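
To make the set of affected characters concrete, here is a small sketch that calls the exported `xml.EscapeText` on a few inputs. The helper name `escape` and the sample strings are assumptions for illustration. Note that `xml.EscapeText` also escapes newlines, which is slightly broader than what happens to character data passed through `EncodeToken`.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escape is a hypothetical wrapper around xml.EscapeText that
// returns the escaped form of s as a string.
func escape(s string) (string, error) {
	var buf bytes.Buffer
	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Quotes and apostrophes are common in pom.xml descriptions,
	// so they are a likely source of unexpected diffs.
	for _, s := range []string{`it's "ok"`, "a & b", "x < y", "\t", "\r"} {
		out, err := escape(s)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> %s\n", s, out)
	}
}
```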

@cuixq cuixq changed the title Texts are escaped when writing XML Unexpected diff when writing XML Aug 26, 2024
@cuixq
Contributor Author

cuixq commented Aug 26, 2024

Besides the escaped texts, due to issue golang/go#21399 self-closing tags are not encoded as expected.

We may consider refactoring how we write XML: instead of calling xml.Encode(), write the content directly to avoid unexpected behaviour.
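
The self-closing tag behaviour can be reproduced with a plain decode and re-encode loop over the standard library. The helper `roundTrip` and the sample input are hypothetical; the sketch only assumes that tokens are read with `Decoder.Token` and written with `Encoder.EncodeToken`.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// roundTrip is a hypothetical helper that decodes input token by token
// and re-encodes every token unchanged.
func roundTrip(input string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(input))
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		if err := enc.EncodeToken(tok); err != nil {
			return "", err
		}
	}
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := roundTrip("<project><relativePath/></project>")
	if err != nil {
		panic(err)
	}
	// The decoder reports <relativePath/> as a start and an end element,
	// so the encoder writes <relativePath></relativePath>.
	fmt.Println(out)
}
```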

cuixq added a commit that referenced this issue Sep 2, 2024
#1184

Currently, texts are escaped in `xml.EncodeToken()` and this generates
unexpected diff when writing updates to XML.

This PR adds `func encodeToken()` that writes `CharData` directly to
avoid escaping, and all buffered XML is flushed after encoding.

This PR also fixes #1215 by
adding two more spaces to the indentation.
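
The idea of writing `CharData` directly can be sketched as below. This is not the `encodeToken` from the PR. It is a guess at the general shape, assuming the encoder and the raw writes share one underlying `io.Writer`; the wrapper `writeElement` is a made-up name used only to exercise it.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
)

// encodeToken is an illustrative helper: CharData bypasses the encoder
// and is written verbatim, every other token goes through EncodeToken.
func encodeToken(enc *xml.Encoder, w io.Writer, tok xml.Token) error {
	if cd, ok := tok.(xml.CharData); ok {
		// Flush buffered output first so the raw text lands in order.
		if err := enc.Flush(); err != nil {
			return err
		}
		_, err := w.Write(cd)
		return err
	}
	return enc.EncodeToken(tok)
}

// writeElement is a hypothetical wrapper that writes <name>text</name>.
func writeElement(text string) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	start := xml.StartElement{Name: xml.Name{Local: "name"}}
	for _, tok := range []xml.Token{start, xml.CharData(text), start.End()} {
		if err := encodeToken(enc, &buf, tok); err != nil {
			return "", err
		}
	}
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := writeElement("\tfoo")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", out) // the tab is kept as a literal tab
}
```

A trade-off of this sketch is that text containing `&` or `<` is also written unescaped, so it would produce invalid XML unless such characters are handled separately.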
cuixq added a commit that referenced this issue Sep 5, 2024
#1184

We would like to modify Go's implementation of self-closing tags and
this PR is the first step:
- add `encoding/xml` to `internal/thirdparty` (also ignore this
directory for lint)
- make Maven manifest writing use the internal version of `xml`
- modify `xml` to not escape char data so we can call
`xml.EncodeToken()`
@oliverchang
Collaborator

Are there remaining issues to track here, or can we close this @cuixq ?

@cuixq
Contributor Author

cuixq commented Sep 18, 2024

I noticed one remaining issue: a multi-line self-closing tag is encoded onto a single line:

<a 
something
something />

is now

<a something something />

this seems low priority though.

@cuixq cuixq closed this as completed Jan 13, 2025
@cuixq
Contributor Author

cuixq commented Jan 13, 2025

Going to track this in separate issues.
