Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comment on ensuring that uploaded files are images #35

Closed
ndrean opened this issue Dec 19, 2023 · 6 comments
Closed

Comment on ensuring that uploaded files are images #35

ndrean opened this issue Dec 19, 2023 · 6 comments

Comments

@ndrean
Copy link
Collaborator

ndrean commented Dec 19, 2023

About ensuring that uploaded files are images

The code below supposes that the user uploads isn't tricked in the sense that the declared MIME is not corrupted or disguised image files.

def upload_image_to_s3(file_path, mimetype) do
    extension = MIME.extensions(mimetype) |> Enum.at(0)
    ...
end

This maybe extra or unnecessary work at this stage to ensure that the uploaded file is indeed an image. The risks are to upload corrupted files into a bucket and to fail the Vix process.

However, in case of interest, I looked into this and believe we should/could use two consecutive strategies to ensure that the upload is a picture.

  • we firstly use libmagic via gen_magic. It works with magic numbers. It is added as a depency in the Dockerfile. We run this as a worker in order not to reload the C code on each call.

  • if this test is positive, we then run ExIamgeInfo to confirm by matching on the type of data. It does not use magic number but rather reads the content of the file. Note that this gives a Sobelow warning since we read external data.

This should assure that we receive a file of type "image" with the desired format: ["webp", "jpeg", ""jpg"].

Dependencies:

#Dockerfile
RUN apt-get update -y && apt-get install -y build-essential git libmagic-dev curl\
  && apt-get clean && rm -f /var/lib/apt/lists/*_*
#mix.exs
{:gen_magic, "~> 1.1.1"},
{:ex_image_info, "~> 0.2.4"},
#Appication.ex
children = [
  {GenMagic.Server, name: :gen_magic},
]
# example of usage:

@doc """
It takes a path and checks the file type via magic number. 
It uses a GenServer running the `C` lib "libmagic".
 """

  def gen_magic_eval(path) do
    case GenMagic.Server.perform(:gen_magic, path) do
      {:error, reason} ->
        {:error, inspect(reason)}

      {:ok,
       %GenMagic.Result{
         mime_type: mime,
         encoding: "binary",
         content: _content
       }} ->
        if Enum.member?(@accepted_mime, mime),
          do: {:ok, %{mime_type: mime}},
          else: {:error, "bad mime"}

      {:ok, %GenMagic.Result{} = res} ->
        Logger.warning(%{gen_magic_response: res})
        {:error, "not acceptable"}
    end
  end


@doc """
Counter-check with ExImage the findings of `gem_magic`. It reads the file.
It determines if the file is an acceptable image and matches `gen_magic`.

  """
  def ex_image_check(file, mime) when is_binary(file) do
    case ExImageInfo.info(File.read!(file)) do
      nil ->
        {:error, "Error reading the file"}

      {^mime, _w, _h, _} ->
        :ok

      {type, _, _, _} ->
        Logger.info(%{content_type: type})
        {:error, "Does not match"}
    end
  end

To get the idea on how to use this, the flow could be:

with {:ok, %{mime_type: mime}} <-
           gen_magic_eval(file),
         :ok <-
           ex_image_check(file, mime),
        {:ok, img} <-
           Image.new_from_file(file),
       ....
@LuchoTurtle
Copy link
Member

Thank you for the awesome feedback, as always!

Because we are using accept: ~w(image/*), in allow_upload/3, that's why we assume it's an image when we reach that line you mentioned.

We have tests for corrupted images and they seem to error out fine. If the user purposely tries to change the mimetype of their files and tries to upload a file and the LiveView fails, I wager that's intended behaviour.

Although I do agree that it's best to cover all possible edge cases and properly show feedback to the user, is adding extra dependencies and making the time-to-predict slower worth it? What do you think?

@ndrean
Copy link
Collaborator Author

ndrean commented Dec 19, 2023

You can still upload the image (with a false MIME type) I believe.

For simplicity (and quasi-completeness, fi possible),I would just use Libmagic. It adds only very little overhead (a UNIX dependency in the ockerfile). It is a well known Unix library and the C package can be run/loaded once at a time as a GenServer (as per the docs). It is not the ultimate response but its fast and works! It reads the first bytes (check this GIST). This extra check won't really slow down the upload process. I would just check the MIME type is equal to the GenMagic.Result.mime_type before uploading.

@LuchoTurtle
Copy link
Member

Yes, the person can still upload the image with false MIME type. What I'm saying is that if they are purposely tinkering with the mimetype and corrupting it on purpose, they deserve for the LiveView to fail :P.

But I agree with you. I'll implement your piece of code once I'm through with #31 , unless you want to create a PR yourself 👍

@ndrean
Copy link
Collaborator Author

ndrean commented Dec 19, 2023

Yes , the LV should send a warning that this task failed. I wait for your code to be ready and make a PR if you want once I digest your code :)

For the test, probably a negative one is enough. Load a text file, change its MIME (to check), simulate the upload and capture the :error or flash.

@LuchoTurtle
Copy link
Member

The code (as it stands) is ready. The PR I'm working on is just meant to compare Bumblebee models and it will have its own little folder with Elixir scripts and Jupyter Notebooks. :)

Regarding the test, we already have one that checks if an invalid image is uploaded ->

test "uploading an invalid image", %{conn: conn} do

Cheers for the feedback, btw, it's really appreciated!

@LuchoTurtle
Copy link
Member

Closing since #37 was merged 🆗

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants