Private space UTF-8 characters crashing eXist

Private space UTF-8 characters crashing eXist

Greg Andreou
We have developed an application using Orbeon with an eXist database backend (both at the latest versions).

TinyMCE is the WYSIWYG editor for the application. Because users paste in text from RTF documents, it seems that Unicode Private Use Area characters (the "private space" characters MS Word uses in its formats) sometimes make their way through to the eXist database, which corrupts the XML written to the database.
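To confirm which characters actually arrive from the editor, the submitted text can be scanned for Private Use code points before it is stored. The following is only a minimal sketch, assuming Java on the server side; the class and method names (`SuspectCharReport`, `findSuspects`) are hypothetical and not part of Orbeon or eXist.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports every Private Use code point found in a string.
final class SuspectCharReport {

    // Returns entries such as "U+F0B7 at index 2" for each Private Use character.
    static List<String> findSuspects(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            // Unicode general category Co covers U+E000-U+F8FF and planes 15 and 16.
            if (Character.getType(cp) == Character.PRIVATE_USE) {
                hits.add(String.format("U+%04X at index %d", cp, i));
            }
            // Advance by one or two UTF-16 units, depending on the code point.
            i += Character.charCount(cp);
        }
        return hits;
    }
}
```

Logging this output for a few failing submissions would show whether the offending characters really fall in the Private Use range (U+F0B7 is one that often appears for Word bullets) or whether something else is involved.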

We have enabled both the XML1252Emitter and the HTML1252Emitter as serializers in the configuration file, following the instructions here: http://wiki.orbeon.com/forms/doc/developer-guide/processors-converters, but this doesn't seem to solve the problem.

Has anyone else had a similar experience, and if so, how did you solve the problem?

If not, what would be the best way to tackle this issue? Since we know the range of characters that cause it, could we filter them out during serialization? If that is not an option, would filtering at the TinyMCE level work?
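As an illustration of server-side filtering, here is a minimal sketch, assuming Java and assuming the text can be intercepted as a string before it is written to the database. The class and method names are hypothetical. One caveat: Private Use code points are themselves legal XML 1.0 characters, so the sketch also drops code points outside the XML 1.0 `Char` production (such as most C0 control characters), in case those are the real cause of the corruption.

```java
// Hypothetical helper: removes Private Use and XML 1.0-illegal code points.
final class XmlTextCleaner {

    // True for code points in Unicode general category Co (Private Use).
    static boolean isPrivateUse(int cp) {
        return Character.getType(cp) == Character.PRIVATE_USE;
    }

    // True for code points allowed by the XML 1.0 Char production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Keeps only code points that are legal in XML 1.0 and not Private Use.
    static String clean(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(cp -> isXml10Char(cp) && !isPrivateUse(cp))
             .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```

Dropping the characters silently loses whatever Word meant by them (a bullet, for instance), so mapping known code points to ordinary equivalents before stripping the rest may be preferable.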

Any help would be greatly appreciated,
Thanks
--
Greg Andreou
Senior Business Consultant

ByteCrafts Ltd
www.bytecrafts.com

Tel +357-22-250032
Mob +357-99400954
Fax +357-22-327404
P.O.Box 24782 - 1303 Strovolos


--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
OW2 mailing lists service home page: http://www.ow2.org/wws
Re: Private space UTF-8 characters crashing eXist

Erik Bruchez
Administrator
Greg,

Possibly this could be cleaned here:

https://github.com/orbeon/orbeon-forms/blob/master/src/resources-packaged/ops/xforms/clean-html.xsl

This way, the data would be cleaned before it even reaches the XForms instance.

-Erik

On Tue, Dec 4, 2012 at 2:12 AM, Greg Andreou
<[hidden email]> wrote:

> We have developed an application using Orbeon with an eXist database backend
> (Both at the latest versions).
>
> TinyMCE is used as the WYSIWYG editor for the application and because users
> are pasting in text from RTF documents it seems that sometimes Private Space
> UTF-8 characters used by MS Word in its formats manage to make their way
> through to the eXist database, which results in the corruption of the XML
> written to the database.
>
> We have enabled in the configuration file both the XML1252Emitter and the
> HTML1252Emitter as serializers following the instructions here:
> http://wiki.orbeon.com/forms/doc/developer-guide/processors-converters but
> this doesn't seem to solve the problem.
>
> Has anyone else had similar experiences to this and if so how did you solve
> this problem?
>
> If not what would be the best way to tackle this issue? Since we know the
> range of characters that cause this issue could we filter them out during
> serialization? If this is not an option would maybe filtering at the TinyMCE
> level be an option?
>
> Any help would be greatly appreciated,
> Thanks

