16b Body Text and Sanitized HTML

Dave Strus edited this page Jul 18, 2015 · 1 revision

Body Text and Sanitized HTML

We included both body_html and body_text in our notes table, but we've only used body_html in our app so far. Let's update the Note model to automatically save body_text as a copy of body_html with all the HTML tags stripped out.

Furthermore, we're currently doing nothing to ensure that users don't submit malicious scripts in their HTML.

Check out Ryan Grove's Sanitize gem.

Sanitize is a whitelist-based HTML and CSS sanitizer. Given a list of acceptable elements, attributes, and CSS properties, Sanitize will remove all unacceptable HTML and/or CSS from a string.
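
To make the allowlist idea concrete, here is a toy sketch in plain Ruby. It is not how Sanitize works internally (Sanitize uses a real HTML parser and also vets attributes, URL protocols, and CSS), and the names ALLOWED_ELEMENTS and toy_allowlist_filter are made up for this illustration. Don't use it on real input; that's what the gem is for.

```ruby
# Conceptual toy only. A regex cannot sanitize real-world HTML safely.
# ALLOWED_ELEMENTS is a hypothetical allowlist chosen for this sketch.
ALLOWED_ELEMENTS = %w[b i p].freeze

def toy_allowlist_filter(html)
  # Match any opening or closing tag, capturing the slash and element name.
  html.gsub(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>}) do
    slash = Regexp.last_match(1)
    name  = Regexp.last_match(2).downcase
    # Re-emit allowed tags without their attributes; drop everything else.
    ALLOWED_ELEMENTS.include?(name) ? "<#{slash}#{name}>" : ""
  end
end

puts toy_allowlist_filter('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>')
# prints: <p>Hi <b>there</b>alert(1)</p>
```

The script tags and the onclick attribute are gone because nothing on the list permits them. The leftover alert(1) is now just text.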

Sanitize can also strip out all HTML tags. Let's add it to our Gemfile.

Gemfile

gem 'sanitize'

Let's add a private method to the Note model to take care of both body fields.

app/models/note.rb

  private

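  def sanitize_body
    # Keep only the elements and attributes allowed by the RELAXED config.
    self.body_html = Sanitize.fragment body_html, Sanitize::Config::RELAXED
    # Strip all remaining tags, then trim leading and trailing whitespace.
    self.body_text = Sanitize.clean(body_html).strip
  end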

The first assignment leaves only whitelisted elements intact in the body_html field. We call the fragment method because body_html contains only a fragment of an HTML document; that is, it doesn't have an <html> element. Besides its default settings, Sanitize comes with three built-in configurations: RESTRICTED, BASIC, and RELAXED. RELAXED fits our needs in this case.

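Called without a configuration, Sanitize.clean falls back to Sanitize's default settings, which strip out all tags. That makes it the perfect candidate for preparing body_text. (clean is the older name for this call. In Sanitize 3.0 and later, Sanitize.fragment with no configuration should behave the same way.)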

Now we want to run Note#sanitize_body every time we create or update a note. Just as we have before_action in controllers, we have before_save in models. Add the following line near the top of your Note model, alongside validations, associations, and scopes.

app/models/note.rb

  before_save :sanitize_body
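
If it helps to see what a before_save hook is doing, here's a rough sketch in plain Ruby. ToyRecord and ToyNote are hypothetical stand-ins, not ActiveRecord, and the regex in sanitize_body is a naive placeholder for the Sanitize calls so that the example runs without Rails or the gem.

```ruby
# Toy stand-in for ActiveRecord, for illustration only. It mimics just enough
# of the before_save hook to show when the callback runs.
class ToyRecord
  # Register a method name to be called before every save.
  def self.before_save(method_name)
    save_callbacks << method_name
  end

  # Each subclass keeps its own list of callback names.
  def self.save_callbacks
    @save_callbacks ||= []
  end

  # Run every registered callback, then pretend to write to the database.
  def save
    self.class.save_callbacks.each { |name| send(name) }
    @persisted = true
  end

  def persisted?
    @persisted == true
  end
end

class ToyNote < ToyRecord
  attr_accessor :body_html, :body_text

  before_save :sanitize_body

  private

  # Placeholder for the Sanitize calls. This regex is NOT safe for real input.
  def sanitize_body
    self.body_text = body_html.gsub(/<[^>]*>/, "").strip
  end
end

note = ToyNote.new
note.body_html = "<p>Hello, <b>world</b></p>"
puts note.body_text.inspect # prints: nil (the callback has not run yet)
note.save
puts note.body_text         # prints: Hello, world
```

The point is the ordering: body_text stays empty until save is called, and the callback fills it in just before the record is written. The real before_save does the same job on both create and update.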