Orbeon Forms 3.9.0.201105152046 CE
I've used the exact code from the HowTo to upload XML:

    saxon:parse(saxon:base64Binary-to-string(xs:base64Binary(instance('upload')), 'UTF-8'))

Whenever I upload UTF-8 encoded XML with a BOM, I get "no content allowed in prolog". Bug or feature? How can I avoid it? Fixing all input at workplaces around the world is not feasible.
Administrator
Hi ahenket,
saxon:parse() expects XML, and if what people upload isn't XML, it just won't work. Now, I am wondering if something else could be happening. Could you maybe add the following somewhere in your form to see what that value looks like?

    <xf:output value="saxon:base64Binary-to-string(xs:base64Binary(instance('upload')), 'UTF-8')"/>

Is it proper XML? If it looks to you like it is, but saxon:parse() fails, could you share with us a specific example of that XML, so we can reproduce the issue? Alex
--
Follow Orbeon on Twitter: @orbeon Follow me on Twitter: @avernet |
Hi. It's absolutely XML. OxygenXML is my tool of choice for editing XML/XQuery etc. and it is set to be very picky. I've validated the files before uploading using oxygen and it gave no errors. I removed the 3 UTF-8 BOM characters with a Hexeditor (I found out later that you can instruct Oxygen to remove the UTF-8 BOM upon save) and then uploaded without any problem. There's no question that the BOM was the only thing between me and a successful upload.
Administrator
Hi ahenket,
Indeed, this looks like a bug in saxon:base64Binary-to-string() to me. Since that function is UTF-8 aware, it should know how to interpret the BOM. Even if this is an issue with Saxon (at least the version we're using), I added an issue against Orbeon Forms: https://github.com/orbeon/orbeon-forms/issues/1093

In the meantime, you can manually strip the BOM in XForms if present, as done in this example: view.xhtml. I also copied the relevant part here:

    <xf:var name="dec" value="saxon:base64Binary-to-octets(xs:base64Binary(.))"/>
    <xf:var name="has-bom" value="$dec[1] = 239 and $dec[2] = 187 and $dec[3] = 191"/>
    <xf:bind ref="." type="xs:base64Binary"
             calculate="if ($has-bom) then saxon:octets-to-base64Binary($dec[position() > 3]) else ."/>

Alex
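For readers who want to see the byte-level logic of the workaround above outside XForms, here is a minimal sketch in Python. It is not part of Orbeon Forms or Saxon; the function name strip_bom_base64 and the sample values are hypothetical and only illustrate the same three steps: decode the base64 value to octets, check whether the first three octets are 239, 187, 191, and re-encode without them if so.

```python
import base64

# The UTF-8 BOM as raw octets: 0xEF 0xBB 0xBF (239, 187, 191 in decimal).
UTF8_BOM = bytes([239, 187, 191])


def strip_bom_base64(b64_value: str) -> str:
    """Return the base64 value with a leading UTF-8 BOM removed, if present.

    Hypothetical helper, mirroring the dec / has-bom / calculate steps
    of the XForms snippet.
    """
    octets = base64.b64decode(b64_value)
    if octets[:3] == UTF8_BOM:
        octets = octets[3:]
    return base64.b64encode(octets).decode("ascii")


# Sample upload values (assumed for illustration only).
with_bom = base64.b64encode(UTF8_BOM + b"<root/>").decode("ascii")
without_bom = base64.b64encode(b"<root/>").decode("ascii")
cleaned = strip_bom_base64(with_bom)
```

Values that do not start with the BOM pass through unchanged, so the check should be safe to apply to every upload.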
Hi, thanks for the workaround. I'll be on holiday for 3 weeks, so I'll most likely get back to it afterwards. Alexander

On 27 Jun 2013, at 03:12, Alessandro Vernet [via Orbeon Forms community mailing list] <[hidden email]> wrote:
Administrator
Hi Alexander,
Sure, there is of course no rush at all; just let us know when you get a chance to test this. Alex
This obviously fell off my radar. We decided to take a different but similar route and solve this in XQuery, since Saxon under eXist-db has the exact same issue, so we need a workaround deeper down.

    let $file-data := if (request:exists()) then request:get-data() else ()
    let $update :=
        if (not(empty($file-data))) then
            (: Hack alert: upload fails when content has a UTF-8 Byte Order Mark.
               The UTF-8 representation of the BOM is the byte sequence 0xEF,0xBB,0xBF;
               after decoding it shows up as the single character U+FEFF (65279). :)
            let $file-content := util:base64-decode($file-data/content)
            let $content-no-bom :=
                if (string-to-codepoints(substring($file-content, 1, 1)) = 65279)
                then substring($file-content, 2)
                else $file-content
            return xmldb:store($messageStoragePath, encode-for-uri($filename), $content-no-bom)
        else ()
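The same check at the string level, after decoding, can be sketched in Python as follows. This is only an illustration of the idea in the XQuery above, not eXist-db code; strip_leading_bom and the sample bytes are hypothetical. It relies on the fact that a plain UTF-8 decode turns the three BOM bytes into the single character U+FEFF (code point 65279), which an XML parser then sees as content before the prolog.

```python
# U+FEFF, the code point the UTF-8 BOM decodes to (65279 in decimal).
BOM_CODEPOINT = 65279


def strip_leading_bom(text: str) -> str:
    """Drop a leading U+FEFF character, if present (hypothetical helper)."""
    if text and ord(text[0]) == BOM_CODEPOINT:
        return text[1:]
    return text


# Sample upload: BOM bytes followed by a tiny XML document (assumed data).
raw = bytes([0xEF, 0xBB, 0xBF]) + "<root/>".encode("utf-8")
decoded = raw.decode("utf-8")  # plain UTF-8 decoding keeps the BOM as U+FEFF
content_no_bom = strip_leading_bom(decoded)
```

Only the first character is inspected, so a U+FEFF occurring later in the document is left alone.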
Administrator
Hi Alexander,
I'm glad doing this in eXist works for you. BTW, have you tried asking Mike Kay, the Saxon author, about this? (If you haven't already, the saxon-help mailing list would be a good place.) Alex