I'm using the SAX parser to read some RSS feeds and have run into a problem. Some feeds, for example CNN Money Top Stories, have raw special characters embedded in their content, e.g. the copyright symbol. As far as the parser is concerned that's not valid XML, and the SAXParser fails with an "invalid token" exception.

The only advice I have seen is to fix the XML at the source, and that's obviously not an option. So I can think of two options, and they both stink: (a) read the content first, scrub it, and then pass it to the parser; (b) use DOM instead of SAX.

What I *want* to do is make the parser a little more forgiving, so that it just accepts or discards/ignores the bad text. I'm not having any luck with setErrorHandler: my error handler never gets called. Can anyone offer some help on this? Thanks
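
For reference, option (a) could look roughly like the sketch below. It is only a sketch under assumptions: the class and method names (`LenientFeedParser`, `readAll`, `scrub`, `parseLeniently`) are made up for illustration, it assumes the feed is small enough to hold in memory, and it assumes the feed is meant to be UTF-8. Bytes that do not decode as UTF-8 are turned into U+FFFD by `InputStreamReader`, and characters that XML 1.0 does not allow are dropped before the text is handed to the parser as a character stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LenientFeedParser {

    /** Reads the whole stream as text. Assumption: the feed is UTF-8;
     *  undecodable bytes are replaced with U+FFFD by InputStreamReader. */
    static String readAll(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    /** Drops every code point outside the XML 1.0 "Char" ranges. */
    static String scrub(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean allowed = cp == 0x9 || cp == 0xA || cp == 0xD
                    || (cp >= 0x20 && cp <= 0xD7FF)
                    || (cp >= 0xE000 && cp <= 0xFFFD)
                    || (cp >= 0x10000 && cp <= 0x10FFFF);
            if (allowed) {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    /** Scrubs the feed, then parses it from a character stream so the
     *  parser never sees the original bytes. */
    static void parseLeniently(InputStream in, DefaultHandler handler) throws Exception {
        String cleaned = scrub(readAll(in));
        InputSource source = new InputSource(new StringReader(cleaned));
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
    }
}
```

Calling `LenientFeedParser.parseLeniently(stream, myHandler)` with the existing handler would replace the direct `parse` call. The trade-off is that the offending character shows up as U+FFFD instead of the copyright symbol; if the feed's real encoding is known (say ISO-8859-1), passing that charset to `InputStreamReader` instead would keep the original character.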