Invalid Encoding Error (Unescaped Line Break) #23

Open
nmichels opened this issue Mar 7, 2019 · 2 comments

nmichels commented Mar 7, 2019

Hi @sam-github,
First, I would like to thank you for the effort you put into this gem; it's really helpful!
Here is the problem I'm facing: I was testing a contact-import feature with my customer using vCard files, so I decided to use the vpim gem to read the files and then run my business logic on the decoded data. The client reported a problem importing one file, and I found that the generated file contains lines like this:

item2.TEL:777-888-9999
item2.X-ABLabel:Other
VOICE

The value contains an unescaped line break: between Other and VOICE there is just an LF character. According to RFC 6350 (https://tools.ietf.org/html/rfc6350), the delimiter between lines should be CRLF, and I was able to work around the problem with the following monkeypatch:

module Vpim
  # Enforce CRLF line breaks according to RFC 6350, Section 3.2: https://tools.ietf.org/html/rfc6350#section-3.2
  def Vpim.unfold(card) # :nodoc:
    unfolded = []
    card.each_line("\r\n") do |line|
      line.chomp!
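      # If it's a continuation line, add it to the last.
      # If it's an empty line, drop it from the input.
      # Use \A and empty? rather than ^ and /^$/: with a CRLF separator a line
      # may contain bare LFs, and ^ would also match right after any of them.
      if( line =~ /\A[ \t]/ )
        unfolded[-1] << line[1, line.size-1]
      elsif (unfolded.last && unfolded.last =~ /;ENCODING=QUOTED-PRINTABLE:.*?=$/)
        unfolded.last << line
      elsif( line.empty? )
      else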
        unfolded << line
      end
    end
    unfolded
  end
end

But I don't know how safe it is to use this solution. Do you have any thoughts on this?
Thanks a lot!
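
For illustration, here is a minimal sketch of what the separator choice does to input like the lines quoted above. The sample string is an assumption modeled on that excerpt, not the customer's actual file. It also shows one risk of the CRLF-only approach: a copy of the file whose line endings have all been converted to LF comes back as a single line.

```ruby
# Sample modeled on the lines quoted above: CRLF between content lines,
# and a bare LF inside the X-ABLabel value.
card = "item2.TEL:777-888-9999\r\nitem2.X-ABLabel:Other\nVOICE\r\n"

# The default each_line separator is "\n", so the stray LF produces a bogus
# third line ("VOICE") with no property name.
by_lf = card.each_line.map(&:chomp)

# Splitting on CRLF only keeps the value together, bare LF included.
by_crlf = card.each_line("\r\n").map(&:chomp)

# An LF-only copy of the same data never matches the CRLF separator,
# so the whole card comes back as one long line.
lf_only = card.gsub("\r\n", "\n")
collapsed = lf_only.each_line("\r\n").map(&:chomp)

p by_lf
p by_crlf
p collapsed.size
```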

@sam-github
Owner

https://tools.ietf.org/html/rfc6350#section-3.4

NEWLINE (U+000A) characters in values MUST be
encoded by two characters: a BACKSLASH followed by either an 'n'
(U+006E) or an 'N' (U+004E).

So the input is either invalid, or possibly it's not RFC 6350 encoded: VCF predates the IETF standardization, and the earlier formats were looser. Also, the RFC technically just describes the format passed via MIME (HTTP, email, etc.), but what people often work with is files saved to disk, and when files with CRLF endings are saved to disk, they usually get converted to the local system's line-ending convention.
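
The escaping rule quoted above can be sketched roughly as follows. The helper names are hypothetical and not part of vpim, and the sketch only handles newlines and backslashes; RFC 6350 also defines escapes for commas and semicolons, which are left out here.

```ruby
# Hypothetical helpers, not part of vpim. Only newline and backslash
# escaping is covered; comma and semicolon escapes are omitted.

# Encode a value so it can sit on a single vCard content line.
def escape_vcard_text(value)
  value
    .gsub("\\") { "\\\\" }            # escape existing backslashes first
    .gsub(/\r\n|\n|\r/) { "\\n" }     # any line break becomes backslash + "n"
end

# Reverse the encoding above; both \n and \N decode to a newline.
def unescape_vcard_text(value)
  value.gsub(/\\([nN\\])/) { $1.downcase == "n" ? "\n" : "\\" }
end

puts escape_vcard_text("Other\nVOICE")
```

The block form of gsub is used so that backslashes in the replacement are taken literally instead of being read as back-references.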

vpim tries to be useful (not just correct), so it does its best to support the various flavours of vcard, even slightly invalid ones, but I'm not sure what heuristic it could use to detect what you are seeing. I guess if it saw a param: value line that had no :, it could just decide to merge it with the previous line? Seems dicey. You could do this as a pre-processing step, though, before feeding the data to vpim.
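
A minimal sketch of such a pre-processing step, under the heuristic floated above, might look like this. The helper name is hypothetical and not part of vpim. It assumes every genuine content line contains a ":" and that folded continuation lines start with a space or tab, so it can misfire on input that breaks those assumptions (for example, quoted-printable soft line breaks whose continuation has no colon).

```ruby
# Hypothetical helper, not part of vpim. Assumption: a real content line
# always contains ":" and folded continuations start with space or tab.
def repair_stray_breaks(text)
  repaired = []
  text.each_line do |raw|
    line = raw.chomp                  # strips a trailing CRLF or LF
    next if line.empty?               # drop blank lines
    if repaired.empty? || line =~ /\A[ \t]/ || line.include?(":")
      # First line, folded continuation, or a normal name:value line.
      repaired << line
    else
      # No ":" and not a continuation: treat it as the tail of the previous
      # value and re-join it with an escaped newline.
      repaired.last << "\\n" << line
    end
  end
  repaired.join("\r\n") + "\r\n"
end

broken = "item2.TEL:777-888-9999\r\nitem2.X-ABLabel:Other\nVOICE\r\n"
puts repair_stray_breaks(broken)
```

The helper works on a string that is already in memory, so the file only needs to be read once before the repaired text is handed to vpim.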

nmichels commented Mar 7, 2019

Thanks for the quick and clear response! Initially I didn't consider preprocessing the file because of the performance impact of reading it twice, but I agree that this is a less "aggressive" approach than monkey patching the gem.
