I use Yahoo RSS news feeds on many of my sites. Recently, however, I started encountering weird binary character sequences in the titles and article summaries, and they appear to be confusing AdSense into serving up Japanese kanji ads, lol. I think I've fixed all the ones I've found so far, but Google hasn't come back yet to respider the front page; once it does, it will hopefully start serving up non-kanji ads. You can see an example of this on the front page of http://www.diabetesheadlines.com.
Some examples of these character sequences are \xe2\x80\x93 for a long dash (an en dash), \xe2\x80\x99 for a curly apostrophe, \xe2\x84\xa2 for the trademark symbol, \xc2\xae for the registered sign, and \xc3\xb1 for the Spanish "n" with a tilde.
Is there any table of all of these funky character sequences anywhere, or do I just have to fix them as I encounter them? Has anybody encountered any RSS handling code that translates these sequences into "web safe" characters?
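For what it's worth, the sequences above look like they may be UTF-8 encodings of Unicode characters. If that assumption holds, no lookup table should be needed. Here is a minimal sketch in Python of the kind of translation I have in mind: decode the bytes as UTF-8, then replace every non-ASCII character with a numeric HTML entity. The function name `to_web_safe` and the sample title are made up for illustration and are not from any feed or library.

```python
# Assumption: the feed bytes are UTF-8. The feed's XML declaration or
# HTTP headers should be checked to confirm this.

def to_web_safe(data: bytes) -> str:
    """Decode UTF-8 bytes and return ASCII text with numeric HTML entities."""
    # Bytes that are not valid UTF-8 become U+FFFD instead of raising.
    text = data.decode("utf-8", errors="replace")
    # Every non-ASCII character becomes a numeric reference such as &#8211;
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# Hypothetical title using the byte sequences listed above.
raw_title = b"Diabetes \xe2\x80\x93 what\xe2\x80\x99s new\xe2\x84\xa2 \xc2\xae ni\xc3\xb1o"
print(to_web_safe(raw_title))
```

If the feed turns out to use a different encoding, the `"utf-8"` argument would have to change to match it.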
Thanks!







