Invalid byte 1 of 1-byte UTF-8 sequence

Invalid byte 1 of 1-byte UTF-8 sequence

nvdbleek
Hi,

I have a form and I get an 'Invalid byte 1 of 1-byte UTF-8 sequence' error when I give focus to an input field in a repeat. I traced the problem back to the parsing of the file containing the action that was performed on the client. The source-control-id attribute contains a byte that is not valid UTF-8.

The repeat separator character isn't UTF-8 encoded; it is written directly into the XML as the single byte 0xB7.

This is a 'bug' somewhere, but isn't there a property to change the character that separates the original id from the iteration count? I thought there was, but I can't find it. That would be a workaround for me.

--
Regards,

Nick Van den Bleeken

--
You received this message because you are subscribed to the Google Groups "Orbeon Forms" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
 
 

Attachment: upload_9aee614_13fc8ad6151__7ffb_00000015.tmp (504 bytes)

Re: Invalid byte 1 of 1-byte UTF-8 sequence

bruno.buzzi
CONTENTS DELETED
The author has deleted this message.

Re: Invalid byte 1 of 1-byte UTF-8 sequence

Alessandro  Vernet
Administrator
In reply to this post by nvdbleek
Hi Nick,

There is no property to change the separator. You could change it by changing server-side and client-side code, where those separators are declared as constants. However, you shouldn't have this problem in the first place, and shouldn't have to worry about what character is used for those separators. Are you sure that your XForms source is properly encoded (reiterating Bruno's question)? If it is, would there be a way for us to reproduce this?

Alex
--
Follow Orbeon on Twitter: @orbeon
Follow me on Twitter: @avernet

Re: Invalid byte 1 of 1-byte UTF-8 sequence

nvdbleek
Hi Alex,

The data doesn't contain any characters that are invalid in XML, and the error isn't complaining about the instance data either. The parsing error occurs in the data of the AJAX request when the control is in a repeat. I need to look into this further, but I'm currently working on something else; I'll pick it up next week or so. It isn't the cause of the problem, but as you can see in the attachment to the original e-mail, the XML of the AJAX update doesn't contain an XML declaration, which is not nice ;) .

Nick



Re: Invalid byte 1 of 1-byte UTF-8 sequence

pc3356
Hi Nick,

We had something similar a while back, but I can't remember the exact context.

While the encoding might be OK as UTF-8, is the first character being sent the BOM (Byte Order Mark)? As far as I recall, not all XML parsers are happy with this; some reject it outright, which might give the sort of result you're seeing. If you can ensure BOM output is suppressed, do you still see the issue?
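One way to check the BOM hypothesis is to look at the first bytes of the captured request. The following is only a minimal sketch in Python; the helper names `starts_with_utf8_bom` and `strip_utf8_bom` and the sample payloads are hypothetical and are not part of Orbeon.

```python
import codecs


def starts_with_utf8_bom(payload: bytes) -> bool:
    """Return True if the payload begins with the UTF-8 byte order mark (EF BB BF)."""
    return payload.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(payload: bytes) -> bytes:
    """Return the payload without a leading UTF-8 BOM, if one is present."""
    if starts_with_utf8_bom(payload):
        return payload[len(codecs.BOM_UTF8):]
    return payload


# Hypothetical payloads, for illustration only.
with_bom = codecs.BOM_UTF8 + b"<root/>"
without_bom = b"<root/>"
```

If the captured bytes do not start with EF BB BF, the BOM can be ruled out as the cause.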

Thanks,

Phil.

Re: Invalid byte 1 of 1-byte UTF-8 sequence

nvdbleek
Hi Phil,

The AJAX update request:
1) The XML doesn't contain an XML declaration => the encoding should be UTF-8 or UTF-16
2) With a hex editor I confirmed that the repeat separator character is represented by the single byte 0xB7, which is incorrect: because its value is greater than or equal to 0x80, it should have been encoded as a two-byte sequence rather than written as its raw Unicode value.
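The two points above can be illustrated with a small Python sketch. It assumes the separator is U+00B7 (consistent with the 0xB7 byte seen in the hex editor); the helper name `is_valid_utf8` is hypothetical.

```python
MIDDLE_DOT = "\u00b7"  # U+00B7, assumed here to be the repeat separator

# In UTF-8, code points from U+0080 to U+07FF take two bytes.
print(MIDDLE_DOT.encode("utf-8").hex())    # c2b7

# A single-byte encoding such as ISO-8859-1 writes the code point value as-is.
print(MIDDLE_DOT.encode("latin-1").hex())  # b7


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xB7 is 0b10110111, a continuation byte, so it cannot start a sequence.
raw_id = b"xf-12\xb71"          # what the hex editor showed
encoded_id = b"xf-12\xc2\xb71"  # what a UTF-8 encoder would have produced
```

Here `is_valid_utf8(raw_id)` is false and `is_valid_utf8(encoded_id)` is true, which matches a parser rejecting the first byte of what it expects to be a one-byte sequence.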

This is the XML that the server received (I replaced the byte representing 0xB7 with {0xB7} to make it human readable):

<!DOCTYPE xxf:event-request [<!ENTITY nbsp "&#160;">]>
<xxf:event-request xmlns:xxf="http://orbeon.org/oxf/xml/xforms">
    <xxf:uuid>9dcb89881b69a098483a00d83dbc1284a00929c2</xxf:uuid>
    <xxf:sequence>3</xxf:sequence>
    <xxf:action>
        <xxf:event name="xxforms-repeat-activate" source-control-id="xf-12{0xB7}1"></xxf:event>
    </xxf:action>
</xxf:event-request>
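To reproduce the parse failure outside the server, the request above can be fed to any conforming XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, not the Java parser Orbeon uses, so the error message will differ from the one in the subject line; the `{SEP}` placeholder and the function `source_control_id` are hypothetical names introduced for the illustration.

```python
import xml.etree.ElementTree as ET

XXF_NS = "http://orbeon.org/oxf/xml/xforms"

# The request from this post, with a placeholder where the separator byte goes.
TEMPLATE = (
    b'<!DOCTYPE xxf:event-request [<!ENTITY nbsp "&#160;">]>'
    b'<xxf:event-request xmlns:xxf="http://orbeon.org/oxf/xml/xforms">'
    b"<xxf:uuid>9dcb89881b69a098483a00d83dbc1284a00929c2</xxf:uuid>"
    b"<xxf:sequence>3</xxf:sequence>"
    b"<xxf:action>"
    b'<xxf:event name="xxforms-repeat-activate" '
    b'source-control-id="xf-12{SEP}1"></xxf:event>'
    b"</xxf:action>"
    b"</xxf:event-request>"
)


def source_control_id(request: bytes):
    """Return the source-control-id, or None if the request is not well-formed."""
    try:
        # Without an XML declaration or BOM, the parser assumes UTF-8.
        root = ET.fromstring(request)
    except ET.ParseError:
        return None
    event = root.find(f".//{{{XXF_NS}}}event")
    return event.get("source-control-id")


broken = TEMPLATE.replace(b"{SEP}", b"\xb7")     # raw byte, as received
fixed = TEMPLATE.replace(b"{SEP}", b"\xc2\xb7")  # proper UTF-8 encoding
```

With these inputs, `source_control_id(broken)` returns `None` because parsing fails, while `source_control_id(fixed)` returns the id with the separator intact.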

Nick





Re: Invalid byte 1 of 1-byte UTF-8 sequence

nvdbleek
Hi Phil, Alex, Erik,

I found the cause of the problem. It appears that repeats are broken in Chrome Canary (Version 30.0.1553.2). 

I can't reproduce the problem in the Chrome Dev channel build, nor in Firefox, nor in Safari.

It appears that Google broke something in the XML serialisation, or that Orbeon is using it in a way that Chrome Canary no longer supports...

Nick.

