HTML Area: Unescaped HTML code?

fl.schmitt
Hi,

After desperately trying to get the HTML Area to work with my own HTML content, I realized that in the html-area.xhtml file the "HTML code" is a mixture of escaped and unescaped characters: all the opening tag brackets are escaped, everything else isn't. It seems that the HTML Area control doesn't accept code that is completely unescaped. Is there a way to get it to accept unescaped HTML code, too? Or is it possible to transform unescaped HTML to escaped code "on the fly"?

I tried to use the OPS XPath extension function xxforms:serialize, as it works for xforms:output, but the textarea control doesn't accept the output because it's a string, not an XML node.


Thank you in advance
florian



--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws

Re: HTML Area: Unescaped HTML code?

Erik Bruchez
Administrator
Florian,

 > after desperately trying to get the HTML Area to work with my own HTML
 > content, I realized that in the html-area.xhtml file the "HTML
 > code" is a mixture of escaped and unescaped characters: all the
 > opening tag brackets are escaped, everything else isn't. It seems
 > that the HTML Area control doesn't accept code that is completely
 > unescaped. Is there a way to get it to accept unescaped HTML code,
 > too?

Not at this time. This is because in XForms, each control only updates
text within elements or attributes (with the funny exception of
xforms:copy). xforms:output/@mediatype="text/html" also works this way.
So we followed that pattern.
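
As a rough illustration of what "text within an element" means for HTML content (a sketch only, not Orbeon code; escapeForXmlText is a hypothetical helper name): the markup has to be stored in the instance as escaped character data. XML text content strictly requires escaping only "&" and "<", which is consistent with the mixture of escaped and unescaped characters seen in html-area.xhtml.

```javascript
// Hypothetical helper, for illustration only: turn an HTML fragment into
// character data that can be stored as the text content of an XML element.
function escapeForXmlText(html) {
  return html
    .replace(/&/g, "&amp;") // must run first so the "&" in "&lt;" is not re-escaped
    .replace(/</g, "&lt;"); // ">" may stay as-is in XML text content
}

const stored = escapeForXmlText("<p>Fish & <b>chips</b></p>");
console.log(stored); // &lt;p>Fish &amp; &lt;b>chips&lt;/b>&lt;/p>
```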

 > Or is it possible to transform unescaped HTML to escaped code
 > "on the fly"?
 >
 > I tried to use the OPS XPath extension function xxforms:serialize, as it
 > works for xforms:output, but the textarea control doesn't accept the
 > output because it's a string, not an XML node.

You need the opposite of xxforms:serialize in order to convert from
serialized XML to XML. What you need is something like the
saxon:parse() function. I don't think we map it to the xxforms
namespace yet, but the saxon namespace might work. However, the HTML
Area produces plain HTML, I think, which may not parse as XML. Please
try it and let us know if you get anywhere.

-Erik

--
Orbeon Forms - XForms Everywhere
http://www.orbeon.com/blog/




Re: Re: HTML Area: Unescaped HTML code?

fl.schmitt
Hi Erik, hi List,

> However the HTML Area produces plain HTML I think, which may not
> parse as XML. Please try it and let us know if you get anywhere.

Serializing the HTML parts of an XML instance and parsing the output of the HTML Area back to XML seems to work without problems so far. I used the saxon:serialize and saxon:parse functions in an XPL pipeline with an XSLT processor that passes all XML data through unchanged except the HTML content, which is identified by its enclosing XML element.

But I now face another problem with the HTML Area: I couldn't get it to accept special characters like the German 'ä' or other umlauts. Those characters are displayed, but the log shows an exception as soon as an umlaut is entered:

org.orbeon.oxf.common.ValidationException: file:/C:/Programme/Apache%20Software%20Foundation/Tomcat%205.5/work/Catalina/localhost/exist/cocoon-files/cache-dir/upload_00000633.tmp, line 6, column 195: Fatal error: The entity "auml" was referenced, but not declared.
file:/C:/Programme/Apache%20Software%20Foundation/Tomcat%205.5/work/Catalina/localhost/exist/cocoon-files/cache-dir/upload_00000633.tmp, line 6, column 195: Fatal error: The entity "auml" was referenced, but not declared.

After submission, the referenced XML node was empty; it seems that the 'illegal' content was not included in the submission. Entering the special character as an HTML entity (&auml; instead of 'ä') didn't help: the text was now included in the submitted content, but the Saxon XML parser responsible for transforming the serialized HTML back to XML complained:

Exception at oxf:/xsl/parse-html.xsl, line 1, column 121
org.xml.sax.SAXParseException: The entity "auml" was referenced, but not declared. (...)

One could implement another transformation before parsing the HTML content, but it would be easier if the HTML Area accepted special characters directly.
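
For illustration, such an extra transformation could look roughly like the sketch below. It rewrites named HTML entities as numeric character references, which an XML parser accepts without a DTD. This is only an assumption about how one might do it, not what was used in the pipeline: the function name and the small entity table are hypothetical, and a real table would have to cover the full HTML entity set.

```javascript
// Illustrative subset of HTML named entities mapped to their code points.
const NAMED_ENTITIES = new Map([
  ["auml", 228], ["ouml", 246], ["uuml", 252],
  ["Auml", 196], ["Ouml", 214], ["Uuml", 220],
  ["szlig", 223], ["nbsp", 160],
]);

// Entities predeclared by XML itself; these are left alone.
const XML_PREDEFINED = new Set(["amp", "lt", "gt", "quot", "apos"]);

// Hypothetical helper: rewrite known named entities as numeric character
// references (for example "&auml;" becomes "&#228;"). Names missing from
// the table are left untouched.
function namedToNumericEntities(html) {
  return html.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match, name) => {
    if (XML_PREDEFINED.has(name) || !NAMED_ENTITIES.has(name)) return match;
    return `&#${NAMED_ENTITIES.get(name)};`;
  });
}

console.log(namedToNumericEntities("<p>M&auml;dchen &amp; Stra&szlig;e</p>"));
```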

Florian



Re: HTML Area: Unescaped HTML code?

Erik Bruchez
Administrator
[hidden email] wrote:

> One might implement another transformation before parsing the HTML content, but it would be easier if the HTML area would accept special characters directly.

My opinion is that the component we use, FCKeditor, albeit named from
the initials of its author, deserves its name in other ways ;-)

But here I cannot really fault it, as auml is a valid HTML entity,
however it is not a valid XML entity by default. It would be great if we
could configure the editor to produce numeric character entities
instead, but I don't know that this is possible.
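
To make the HTML/XML difference concrete: XML predeclares only five entities (amp, lt, gt, quot, apos), so any other named reference fails to parse unless a DTD declares it. The sketch below lists the references in a string that would trip up such a parser. The function name is hypothetical and the regular expression is a simplification of the XML Name production.

```javascript
// The five entities that every XML parser knows without a DTD.
const XML_PREDEFINED_ENTITIES = new Set(["amp", "lt", "gt", "quot", "apos"]);

// Hypothetical helper: return the distinct named entity references in
// `markup` that XML does not predeclare. Numeric references such as
// "&#228;" start with "#" and therefore never match the pattern.
function findUndeclaredEntities(markup) {
  const found = new Set();
  const pattern = /&([A-Za-z_:][A-Za-z0-9_.:-]*);/g;
  for (const [, name] of markup.matchAll(pattern)) {
    if (!XML_PREDEFINED_ENTITIES.has(name)) found.add(name);
  }
  return [...found];
}

console.log(findUndeclaredEntities("<p>f&auml;r &amp; &#228;&nbsp;</p>"));
```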

-Erik

--
Orbeon Forms - XForms Everywhere
http://www.orbeon.com/blog/



Re: HTML Area: Unescaped HTML code?

fl.schmitt
Hi Erik, hi List,

> But here I cannot really fault it, as auml is a valid HTML entity,
> however it is not a valid XML entity by default.

That's true - but the problem is that the editor component doesn't accept it even during the HTML editing.

> It would be great if we
> could configure the editor to produce numeric character entities
> instead, but I don't know that this is possible.

On the FCKeditor wiki, I found some information about the handling of special characters: http://wiki.fckeditor.net/Developer%27s_Guide/Configuration/Configurations_Settings

It seems to be a configuration option; I will check this in detail as soon as possible. By the way, http://www.fckeditor.net/ states that the editor is able to output XHTML 1.0; it would be great if this could be supported in Orbeon Forms.

Regards,
Florian

--
http://www.florian-schmitt.net




Re: HTML Area: Unescaped HTML code?

Erik Bruchez
Administrator
Florian Schmitt wrote:
 > Hi Erik, hi List,
 >
 >> But here I cannot really fault it, as auml is a valid HTML entity,
 >> however it is not a valid XML entity by default.
 >
 > That's true - but the problem is that the editor component doesn't
 > accept it even during the HTML editing.
 >
 >> It would be great if we
 >> could configure the editor to produce numeric character entities
 >> instead, but I don't know that this is possible.
 >
 > On the FCKeditor wiki, I found some information about the handling
 > of special characters:
 >
http://wiki.fckeditor.net/Developer%27s_Guide/Configuration/Configurations_Settings

Ah, great. So we should probably change the following to false in
fckconfig.js:

FCKConfig.ProcessHTMLEntities = true ;
FCKConfig.IncludeLatinEntities = true ;
FCKConfig.IncludeGreekEntities = true ;

 > It seems to be a configuration option; I will check this in detail
 > as soon as possible. By the way, http://www.fckeditor.net/ states
 > that the editor is able to output XHTML 1.0; it would be great if
 > this could be supported in Orbeon Forms.

Great, if you try it successfully, let us know and we'll change the
defaults.

-Erik

--
Orbeon Forms - XForms Everywhere
http://www.orbeon.com/blog/




Re: HTML Area: Unescaped HTML code?

fl.schmitt
Hi Erik, hi List,

> Ah, great. So we should probably change the following to false in
> fckconfig.js:
>
> FCKConfig.ProcessHTMLEntities = true ;
> FCKConfig.IncludeLatinEntities = true ;
> FCKConfig.IncludeGreekEntities = true ;
>
> Great, if you try it successfully, let us know and we'll change the
> defaults.

It worked: I changed all three options to false, and now the special characters are accepted both when they are entered into the editor pane and when the HTML Area output is parsed to XML. Thank you very much!
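
For reference, the three settings with the values described above would read as follows in fckconfig.js. This is a sketch: the first statement is only a stand-in so the fragment can be run on its own and is not part of the real file, where FCKeditor already provides the FCKConfig object.

```javascript
// Stand-in so this fragment also runs outside the editor; inside
// fckconfig.js the FCKConfig object is already provided by FCKeditor.
var FCKConfig = typeof FCKConfig !== "undefined" ? FCKConfig : {};

// With all three set to false, characters such as "ä" were accepted in
// the editor pane and the output parsed as XML, as reported in this thread.
FCKConfig.ProcessHTMLEntities = false;
FCKConfig.IncludeLatinEntities = false;
FCKConfig.IncludeGreekEntities = false;
```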


Greetings,
Florian

--
http://www.florian-schmitt.net




Re: HTML Area: Unescaped HTML code?

Erik Bruchez
Administrator
Florian Schmitt wrote:

> Hi Erik, hi List,
>
>> Ah, great. So we should probably change the following to false in
>> fckconfig.js:
>>
>> FCKConfig.ProcessHTMLEntities = true ;
>> FCKConfig.IncludeLatinEntities = true ;
>> FCKConfig.IncludeGreekEntities = true ;
>>
>> Great, if you try it successfully, let us know and we'll change the
>> defaults.
>
> It worked: I changed all three options to false, and now the special characters are accepted both when they are entered into the editor pane and when the HTML Area output is parsed to XML. Thank you very much!

Great, I committed this change to CVS.

-Erik

--
Orbeon Forms - XForms Everywhere
http://www.orbeon.com/blog/


