[Rails] Re: Encoding
Thank you everyone for your responses. They helped me figure out
a solution. This seems to work for my problem:
s = s.gsub("\xe2\x80\x9c", '"')
s = s.gsub("\xe2\x80\x9d", '"')
s = s.gsub("\xe2\x80\x98", "'")
s = s.gsub("\xe2\x80\x99", "'")
s = s.gsub("\xe2\x80\x93", "-")
s = s.gsub("\xe2\x80\x94", "--")
s = s.gsub("\xe2\x80\xa6", "...")
s = Iconv.conv('UTF-8//IGNORE', 'UTF-8', s)
-Erica
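
A note for readers on newer Rubies: Iconv was deprecated in Ruby 1.9.3 and removed from the standard library in 2.0, so the last line of the snippet above will not load there without the separate iconv gem. The following is only a sketch of the same idea using String#scrub (Ruby 2.1+) and a single gsub with a lookup table. The names asciify_punctuation and SMART_PUNCTUATION are made up for illustration and do not come from this thread, and unlike the snippet above it drops the invalid bytes before doing the replacements.

```ruby
# Hypothetical lookup table (name is illustrative): the same seven
# Word-style characters replaced above, written as code points.
SMART_PUNCTUATION = {
  "\u201C" => '"',   # left double quotation mark  (bytes E2 80 9C)
  "\u201D" => '"',   # right double quotation mark (bytes E2 80 9D)
  "\u2018" => "'",   # left single quotation mark  (bytes E2 80 98)
  "\u2019" => "'",   # right single quotation mark (bytes E2 80 99)
  "\u2013" => "-",   # en dash                     (bytes E2 80 93)
  "\u2014" => "--",  # em dash                     (bytes E2 80 94)
  "\u2026" => "..."  # horizontal ellipsis         (bytes E2 80 A6)
}.freeze

# One pattern that matches any key of the table.
SMART_PUNCTUATION_RE = Regexp.union(SMART_PUNCTUATION.keys)

# Hypothetical helper: returns a valid UTF-8 copy of str with the
# characters above replaced by plain ASCII equivalents.
def asciify_punctuation(str)
  # Tag a copy as UTF-8, then drop byte sequences that are not valid
  # UTF-8 (the role Iconv's //IGNORE plays in the original snippet).
  clean = str.dup.force_encoding(Encoding::UTF_8).scrub("")
  # gsub with a Hash argument looks up each match in the table.
  clean.gsub(SMART_PUNCTUATION_RE, SMART_PUNCTUATION)
end
```

With this sketch, asciify_punctuation("foo\xAE bar") returns "foo bar", which matches what the Iconv call in Jeff's console session below produces for the same string.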
On Jun 21, 12:24 pm, Jeff Lewis <jeff.bu...@gmail.com> wrote:
> Maybe post an example of a string/char that's causing the problem, as
> it's logged in your app's log?
>
> Here's an example of a problem string/char that I was seeing in data
> posted to my app:
>
> $ ./script/rails console
> ...
> ruby-1.9.2-p136 :001 > s = "foo\xAE bar"
> => "foo\xAE bar"
>
> ruby-1.9.2-p136 :002 > s.is_utf8?
> => false
>
> ruby-1.9.2-p136 :003 > s.valid_encoding?
> => false
>
> ruby-1.9.2-p136 :004 > s.sub(/bar/, 'biz')
> ArgumentError: invalid byte sequence in UTF-8
> from (irb):4:in `sub'
> ...
>
> ruby-1.9.2-p136 :005 > s2 = Iconv.new('UTF-8//IGNORE',
> 'UTF-8').iconv("#{s} ")[0..-2]
> => "foo bar"
>
> ruby-1.9.2-p136 :006 > s2.gsub(/bar/, 'biz')
> => "foo biz"
>
> And if that's not doing the trick, then maybe try forcing the string
> to utf8 first?:
>
> ruby-1.9.2-p136 :007 > s3 = Iconv.new('UTF-8//IGNORE',
> 'UTF-8').iconv("#{s.force_encoding('UTF-8')} ")[0..-2]
> => "foo bar"
>
> Jeff
>
> On Jun 20, 4:33 pm, Erica <ericarhol...@gmail.com> wrote:
>
> > Thanks for your response. I tried this on a string that was causing
> > > the error and it didn't work. The problem is with Microsoft Word
> > > special characters. I can't find a way to replace these characters.
> > > Here is one website I found that describes the special characters,
> > > although it's not about Rails:
> > > http://www.toao.net/48-replacing-smart-quotes-and-em-dashes-in-mysql
>
> > Can anyone help me out?
>
> > Thanks,
>
> > > Erica
>
> > On Jun 17, 7:38 pm, Jeff Lewis <jeff.bu...@gmail.com> wrote:
>
> > > > Hi Erica,
>
> > > > I ran into a similar situation a while ago for a webservice app I was
> > > working on where I had to handle a lot of bad / untrusted non-utf8
> > > data, and found a fix that met the needs of the app using Iconv
> > > (http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/index.html)
> > > > following a strategy outlined by Paul Battley
> > > > (http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/):
>
> > > ...
> > > def AppUtil.force_utf8(str)
> > > ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
> > > return ic.iconv("#{str} ")[0..-2]
> > > end
> > > ...
>
> > > Jeff
>
> > > > On Jun 16, 5:27 pm, Erica <ericarhol...@gmail.com> wrote:
>
> > > > What's a good solution for fixing character encoding problems for
> > > > compatibility between ascii and utf-8? The database is postgres and
> > > > is encoded in utf-8.
>
> > > > > Once in a while there will be a compatibility error from strings
> > > > > submitted through a web form.
>
> > > > Is there a command to fix this besides using
> > > > a_string.force_encoding('utf-8')? Even this doesn't seem to always
> > > > work either.
>
> > > > Thanks,
>
> > > > > Erica
--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk@googlegroups.com.
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.