I have a Rails site that accepts PDF files as uploads. Based on previous experience, I have found that validating on content type causes lots of problems, because many browsers send odd content types along with the file name. For PDF files, they should always send ‘application/pdf’, and most browsers do. However, this morning I got an email from someone who couldn’t upload his file in one browser. He told me he switched to a different browser and it then worked. I searched the log files and found a bunch of error messages, but the main bit was this:

INFO -- : [paperclip] Content Type Spoof: Filename Tom_Smith_CV.pdf (x-unknown/pdf from Headers,
...
 content type discovered from file command: application/pdf. See documentation to allow this combination.

If I understand this correctly (and there’s no guarantee of that), it’s saying that the file is being uploaded with a content-type of x-unknown/pdf, but the file command (file --mime-type -b) is returning application/pdf.
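To make that mismatch concrete, here is a minimal sketch of the comparison as I understand it. The helper name content_type_mismatch? is made up for illustration and this is not Paperclip’s actual implementation; it only shows why the two values from the log line disagree.

```ruby
# Hypothetical helper (not part of Paperclip): true when the type sent by the
# browser differs from the type detected from the file's contents.
def content_type_mismatch?(header_type, detected_type)
  header_type.to_s.strip.downcase != detected_type.to_s.strip.downcase
end

# Values taken from the log line above.
header_type   = "x-unknown/pdf"    # Content-Type header sent by the browser
detected_type = "application/pdf"  # what the file command reported

puts content_type_mismatch?(header_type, detected_type)
```

With a correctly configured browser both values would be application/pdf and the helper would return false.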

I believe that this mismatch is caused by the client browser being configured incorrectly. (And yes, it is. Check out this website.) Why? Because I was able to duplicate the problem by editing mimeTypes.rdf in my Firefox profile directory. I simply changed every occurrence of application/pdf to x-unknown/pdf and restarted the browser. I then got the same error.

This check is part of a new spoof detector in Paperclip version 4 and up. (I’m using 4.3.2.) Because of the many problems I’ve had before with validating content type, I put the do_not_validate_attachment_file_type :cv line in my model. I then wrote a custom validator that simply checks from the filename that the upload is a PDF file. This isn’t ideal, but it had worked for me until now.
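For illustration, a filename-only check of this kind might look roughly like the sketch below. The helper name pdf_filename? and the validation method in the comment are hypothetical, not my exact code; the cv_file_name attribute assumes Paperclip’s usual naming for an attachment called cv.

```ruby
# Hypothetical helper: true when the filename has a .pdf extension,
# ignoring case. It looks only at the name, never at the file contents.
def pdf_filename?(filename)
  File.extname(filename.to_s).downcase == ".pdf"
end

# In a Rails model this might be wired up roughly like so (assumed, not run here):
#   validate :cv_must_be_pdf
#   def cv_must_be_pdf
#     errors.add(:cv, "must be a PDF file") unless pdf_filename?(cv_file_name)
#   end

puts pdf_filename?("Tom_Smith_CV.pdf")
puts pdf_filename?("Tom_Smith_CV.docx")
```

A check like this trusts the extension completely, which is why it is not ideal: a renamed file of any other type would pass.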

This page explained that the spoof detector is not affected by the do_not_validate_attachment_file_type helper. That same page also gave me the workaround to turn off the spoof detector for now. I made a new initializer called paperclip.rb that contains:

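# config/initializers/paperclip.rb
# Reopen Paperclip's spoof detector and disable the check entirely.
require 'paperclip/media_type_spoof_detector'
module Paperclip
  class MediaTypeSpoofDetector
    # Always report "not spoofed", whatever the content types are.
    def spoofed?
      false
    end
  end
end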

This overrides spoofed? so that it always returns false, which means the spoof detector never rejects an upload.

I realize that none of this is ideal. However, I really don’t understand things well enough to know how to set up Paperclip to validate files properly while still accepting all the files that we should accept.