Sunday, December 12, 2010

[Rails] Re: PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xa0

I haven't been using attachments, but for files a blob column sounds
like the better fit: "files" don't have character sets, they're just
binary data, right?

This Postgres error did happen to me, though, because I was receiving
emails from Sendgrid in all sorts of different encodings. My code
reconstructed a Mail::Message object from the Sendgrid params and then
pulled specific fields off of it to put in the database. I wrongly
assumed that all incoming messages would have the same encoding as the
test messages I was sending, so I just forced 'charset=UTF-8;' on the
Message objects. I started getting errors very similar to yours when
Windows users began emailing my service from Outlook.
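
To illustrate why forcing the charset label does not help, here is a
minimal sketch. It assumes Ruby 1.9 or later (where strings carry an
encoding), not the 1.8.7 mentioned further down, and it assumes the
sender used Windows-1252, where the byte 0xA0 is a non-breaking space.
On its own that byte is not a valid UTF-8 sequence:

```ruby
# The byte from the error message, labeled with the charset it was
# (by assumption) actually written in.
raw = "\xA0".dup.force_encoding('Windows-1252')

# Relabeling the bytes as UTF-8 does not convert them.
mislabeled = raw.dup.force_encoding('UTF-8')
puts mislabeled.valid_encoding?   # false

# Transcoding rewrites the bytes into their UTF-8 form.
converted = raw.encode('UTF-8')
puts converted.valid_encoding?    # true
puts converted.bytes.map { |b| format('%02x', b) }.join(' ')  # c2 a0
```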

To remedy this, I used Iconv to convert the incoming data to a
standard charset, the same one used by the Postgres DB I was running
on Heroku. Sendgrid sends emails as POST parameters and includes a
JSON hash of the encodings of each of the other fields, so I was able
to use these to tell Iconv what to convert from. I also told it to
ignore invalid characters by appending "//IGNORE" to the "to"
argument, which is the first argument of Iconv.conv. The code looks
like this:

require 'iconv'
encodings = ActiveSupport::JSON.decode(params[:charsets])
# Sendgrid auto-decodes the headers into UTF8
mail = Mail.new(params[:headers])
if params[:text].present?
  # Iconv.conv takes (to, from, str); //IGNORE on the target charset
  # drops characters that cannot be converted
  text = Iconv.conv('UTF-8//IGNORE', encodings['text'], params[:text])
  mail.text_part = Mail::Part.new(:charset => 'UTF-8',
    :content_type => "text/plain;", :body => text)
end
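
Iconv was deprecated in Ruby 1.9.3 and removed from the standard
library in 2.0, so on newer Rubies the same idea can be sketched with
String#encode. This is only a sketch under assumptions, not the code
from the post: the helper name to_utf8 is hypothetical, and passing
:replace => '' is assumed to be a close stand-in for "//IGNORE":

```ruby
# Hypothetical helper: transcode +text+ from the charset the sender
# declared into UTF-8, dropping bytes that cannot be converted.
def to_utf8(text, declared_charset)
  text.dup
      .force_encoding(declared_charset) # label the bytes, no conversion yet
      .encode('UTF-8', :invalid => :replace, :undef => :replace, :replace => '')
end

body = to_utf8("Hello\xA0world", 'Windows-1252')
puts body.valid_encoding?
```

When the declared charset is already UTF-8, encode may return the
string unchanged, so invalid bytes in that case would need a separate
check.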

If you aren't using Sendgrid, make certain that the text you insert
into the database is in the encoding the database expects. I believe
something in the stack (ActiveRecord, the PgSQL adapter, or the Ruby
postgres bindings) assumed that the incoming string was properly
encoded and sent it to the database as it was, and the database
raised the error because it was not.
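
One way to catch this before the database does is to check the string
at the boundary. A minimal sketch, assuming Ruby 2.1 or later (for
String#scrub), a UTF8 database, and input whose bytes are claimed to
be UTF-8; utf8_for_db is a hypothetical name, not part of ActiveRecord
or the postgres bindings:

```ruby
# Hypothetical guard: return a copy of +text+ that is valid UTF-8.
# Assumes the bytes are supposed to be UTF-8 already; text in another
# charset should be transcoded first rather than scrubbed.
def utf8_for_db(text)
  utf8 = text.dup.force_encoding('UTF-8')
  return utf8 if utf8.valid_encoding?
  # Replace each invalid byte sequence with U+FFFD instead of letting
  # the INSERT fail with "invalid byte sequence".
  utf8.scrub("\uFFFD")
end

puts utf8_for_db("ok\xA0").inspect
```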

I'm on REE 1.8.7, so maybe this whole thing would go away if you used
1.9.2. I've read string encodings are a lot more magical there, and
wycats has a good blog post explaining how it all works if you are
curious.

On Dec 12, 12:47 am, CuriousNewbie <bhellm...@gmail.com> wrote:
> Hello, my app is reading emails with attachments and inserting the
> Email message into the database to be sent to delayed job for
> processing.
>
> When inserting an email with attachments into the database, I get the
> following error:
>
> PGError: ERROR:  invalid byte sequence for encoding "UTF8": 0xa0
>
> Has anyone seen this before? There doesn't seem to be much rails
> related via googling.
>
> Right now I'm saving the email with attachments to a text column, has
> anyone tried using a blob to resolve?
>
> Thanks

