Hi,
One simple technique for obfuscating email addresses is to replace "@" characters by "%40" in mailto: URIs and by the numeric character reference "&#64;" in plain text.

It appears that most spammers aren't able to read something as simple as:

<a href="vdv%40dyomedea.com">vdv&#64;dyomedea.com</a>

even though that's strictly equivalent to a plainly exposed address for any minimally conformant piece of software (meaning that this technique has none of the drawbacks of other obfuscation techniques).

Escaping @s in URIs is easy enough in XSLT... Getting them replaced by numeric entity references in plain text is more challenging in OPS :-) ...

That could be achieved in "plain XSLT" using either disable-output-escaping attributes or XSLT 2.0 character maps, but neither would survive in an XML pipe: both are usable only in a transformation that is the last processor in the pipe.

Note that there might be other cases where applications need finer control over entity references.

How could we plug this behaviour into the existing converters?

I had a quick look at the code and noticed that the HTML and XML converters are still using the "old legacy" serializers and that these serializers are using Saxon identity transformations.

An option would be to add either an input or an element in the existing config input to provide an alternate XSLT transformation to use in place of the identity transformation. This transformation could then use either disable-output-escaping (which is a hack) or character maps, which are more elegant...

Another option would be, of course, to write a new output method.

What do you think?

Thanks,

Eric

--
Read me on XML.com.
http://www.xml.com/pub/au/74
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(ISO) RELAX NG ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------

--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws
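
To make the equivalence claimed in the message concrete, here is a minimal Python sketch. It is not OPS or XSLT code, and the helper name `obfuscate_mailto_link` is made up for illustration. It writes "%40" into a mailto: URI and "&#64;" into the link text, then checks that standard URI and HTML decoding recover the plain address, which is why conformant software should see no difference.

```python
from html import escape, unescape
from urllib.parse import unquote


def obfuscate_mailto_link(address: str) -> str:
    """Return an HTML anchor in which no literal "@" appears.

    Hypothetical helper for illustration only; it is not part of OPS.
    """
    # Percent-encode the "@" inside the URI.
    href = "mailto:" + address.replace("@", "%40")
    # Escape HTML-special characters first, then swap "@" for a
    # numeric character reference in the visible text.
    text = escape(address).replace("@", "&#64;")
    return f'<a href="{href}">{text}</a>'


link = obfuscate_mailto_link("vdv@dyomedea.com")
print(link)

# A conformant consumer decodes both forms back to the plain address.
assert unquote("mailto:vdv%40dyomedea.com") == "mailto:vdv@dyomedea.com"
assert unescape("vdv&#64;dyomedea.com") == "vdv@dyomedea.com"
```

The "&#64;" only survives if this string is written out as-is. Once it is parsed as XML, the reference is resolved back to a literal "@", which is the pipeline problem the message describes.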
Eric,
One question is: does XSLT serialization allow you to control this? If so, then we could allow all XSLT serialization options to be passed to the XML, XHTML or HTML converters. This would solve the problem neatly.

-Erik