Friday, June 24, 2011

[Rails] Re: Encoding

Thank you everyone for your responses. They helped me figure out
a solution. This seems to work for my problem:

s = s.gsub("\xe2\x80\x9c", '"')
s = s.gsub("\xe2\x80\x9d", '"')
s = s.gsub("\xe2\x80\x98", "'")
s = s.gsub("\xe2\x80\x99", "'")
s = s.gsub("\xe2\x80\x93", "-")
s = s.gsub("\xe2\x80\x94", "--")
s = s.gsub("\xe2\x80\xa6", "...")
s = Iconv.conv('UTF-8//IGNORE', 'UTF-8', s)

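For anyone reading this on a newer Ruby: Iconv was removed from the standard library in Ruby 2.0, so the last line above will not load there. Below is a minimal sketch of the same idea using String#scrub (available from Ruby 2.1) and a single lookup table. The names SMART_PUNCTUATION, SMART_PUNCTUATION_RE and normalize_punctuation are made up for illustration and are not part of Rails or of the code posted in this thread. Note that it drops invalid bytes before substituting, since gsub with a regexp raises on invalid UTF-8.

```ruby
# Hypothetical lookup table: Word "smart" punctuation => plain ASCII.
SMART_PUNCTUATION = {
  "\u201C" => '"',   # left double quotation mark  (bytes e2 80 9c)
  "\u201D" => '"',   # right double quotation mark (bytes e2 80 9d)
  "\u2018" => "'",   # left single quotation mark  (bytes e2 80 98)
  "\u2019" => "'",   # right single quotation mark (bytes e2 80 99)
  "\u2013" => "-",   # en dash                     (bytes e2 80 93)
  "\u2014" => "--",  # em dash                     (bytes e2 80 94)
  "\u2026" => "..."  # horizontal ellipsis         (bytes e2 80 a6)
}.freeze

# One regexp that matches any key of the table.
SMART_PUNCTUATION_RE = Regexp.union(SMART_PUNCTUATION.keys)

# Hypothetical helper: returns a new, valid UTF-8 string.
def normalize_punctuation(str)
  # Tag the bytes as UTF-8, then drop any byte sequences that are not
  # valid UTF-8 (assumes Ruby 2.1+ for String#scrub).
  clean = str.dup.force_encoding(Encoding::UTF_8).scrub("")
  # gsub with a Hash replaces each match with the value stored under it.
  clean.gsub(SMART_PUNCTUATION_RE, SMART_PUNCTUATION)
end
```

Usage would look like s = normalize_punctuation(s). Whether silently dropping invalid bytes is acceptable depends on the data; scrub also accepts a replacement string such as "?" if a visible marker is preferred.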

-Erica

On Jun 21, 12:24 pm, Jeff Lewis <jeff.bu...@gmail.com> wrote:
> Maybe post an example of a string/char that's causing the problem, as
> it's logged in your app's log?
>
> Here's an example of a problem string/char that I was seeing in data
> posted to my app:
>
> $ ./script/rails console
> ...
> ruby-1.9.2-p136 :001 > s = "foo\xAE bar"
>  => "foo\xAE bar"
>
> ruby-1.9.2-p136 :002 > s.is_utf8?
>  => false
>
> ruby-1.9.2-p136 :003 > s.valid_encoding?
>  => false
>
> ruby-1.9.2-p136 :004 > s.sub(/bar/, 'biz')
> ArgumentError: invalid byte sequence in UTF-8
>         from (irb):4:in `sub'
> ...
>
> ruby-1.9.2-p136 :005 > s2 = Iconv.new('UTF-8//IGNORE', 'UTF-8').iconv("#{s} ")[0..-2]
>  => "foo bar"
>
> ruby-1.9.2-p136 :006 > s2.gsub(/bar/, 'biz')
>  => "foo biz"
>
> And if that's not doing the trick, then maybe try forcing the string
> to utf8 first?:
>
> ruby-1.9.2-p136 :007 > s3 = Iconv.new('UTF-8//IGNORE', 'UTF-8').iconv("#{s.force_encoding('UTF-8')} ")[0..-2]
>  => "foo bar"
>
> Jeff
>
> On Jun 20, 4:33 pm, Erica <ericarhol...@gmail.com> wrote:
>
> > Thanks for your response.  I tried this on a string that was causing
> > the error and it didn't work.  The problem is with Microsoft Word
> > special characters.  I can't find a way to replace these characters.
> > Here is one website I found that describes the special characters,
> > although it's not about Rails:
> > http://www.toao.net/48-replacing-smart-quotes-and-em-dashes-in-mysql
>
> > Can anyone help me out?
>
> > Thanks,
>
> > Erica
>
> > On Jun 17, 7:38 pm, Jeff Lewis <jeff.bu...@gmail.com> wrote:
>
> > > Hi Erica,
>
> > > I ran into a similar situation a while ago for a webservice app I was
> > > working on where I had to handle a lot of bad / untrusted non-utf8
> > > data, and found a fix that met the needs of the app using Iconv
> > > (http://www.ruby-doc.org/stdlib/libdoc/iconv/rdoc/index.html)
> > > following a strategy outlined by Paul Battley
> > > (http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/):
>
> > > ...
> > >   def AppUtil.force_utf8(str)
> > >     ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
> > >     return ic.iconv("#{str} ")[0..-2]
> > >   end
> > > ...
>
> > > Jeff
>
> > > On Jun 16, 5:27 pm, Erica <ericarhol...@gmail.com> wrote:
>
> > > > What's a good solution for fixing character encoding problems for
> > > > compatibility between ascii and utf-8?  The database is postgres and
> > > > is encoded in utf-8.
>
> > > > Once in a while there will be a compatibility error with strings
> > > > submitted from a web form.
>
> > > > Is there a command to fix this besides using
> > > > a_string.force_encoding('utf-8')?  Even this doesn't always seem
> > > > to work.
>
> > > > Thanks,
>
> > > > Erica
