Some uses that I haven't expressed.
We're using schema-aware XSLT 2.0 (Saxon-SA) quite extensively for large XML document processing.

For the moment, we're using Java to glue together the different processing steps, each managed by a different XSLT stylesheet, but this is tedious and difficult to read. XPL would seem to be a good candidate for describing the chaining of the XSLT steps.

However, the pipe paradigm doesn't seem to be quite the right fit, because we regularly produce N documents from 1 (using the xsl:result-document instruction) or coalesce N documents into 1 (using the XSLT document() function).

In order to get this to work with XPL, we would need a way of describing the storage of documents produced with xsl:result-document and a way of describing where to find documents that are retrieved via document(). This certainly isn't impossible, but may require a paradigm shift.

For instance, this would mean that an XSLT processing step would produce N documents, which may need to chain to N new XSLT processing steps.

To give an example, we currently manage this kind of processing:

Receive 2 big documents, stock-A and stock-B, with stock-A containing N unit-A documents and stock-B containing M unit-B documents.

1. Process stock-A, splitting it into N unit-A documents that are indexed and stored in a database (using xsl:result-document).
2. Process stock-B, splitting it into M unit-B documents that are indexed and stored in a database (using xsl:result-document).
3. For each unit-A document from step 1, convert the unit-A document to a unit-Z document using XSLT.
   This step reads related unit-B documents from step 2, using document() and index information to retrieve them from the database.
   Output: N unit-Z documents.
4. Coalesce the N unit-Z documents into a few (say 3) stock-Z documents.
   The input document is an empty document.
   The step retrieves the unit-Z documents using document(), based upon index information.

Hope that this helps to motivate some design ideas for XPL.
-alan
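For readers unfamiliar with the splitting described in step 1, it might look roughly like the stylesheet below. This is only a minimal sketch: the element names `stock-a` and `unit-a`, the `id` attribute, and the output URI pattern are all assumptions made for illustration, and the indexing and database storage mentioned in the post are not shown.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the input has the shape
       <stock-a><unit-a id="...">...</unit-a>...</stock-a> -->
  <xsl:template match="/stock-a">
    <xsl:for-each select="unit-a">
      <!-- One secondary result document per unit.
           The href pattern is hypothetical; each @id must be unique,
           since writing twice to the same URI is an error. -->
      <xsl:result-document href="units/unit-a-{@id}.xml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

A transformation of this kind has N side-effect outputs and an empty principal result, which is the part that does not map directly onto a single pipeline output.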
Alan,
You should be able to do this with XPL today. Instead of using result-document(), first produce a document like this:

<result>
    <unit-a>
        ...
    </unit-a>
    <unit-a>
        ...
    </unit-a>
</result>

then use p:for-each to iterate over /result/unit-a, and process each "unit-a" document separately. p:for-each also provides a final aggregation facility.

I am not saying that introducing a concept like a "sequence of documents" in a pipeline output would not be good (in fact, it could be an interesting one to contemplate), just that there appears to be a workaround.

-Erik
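The wrapper document suggested here could be produced by a stylesheet along the following lines. It is a minimal sketch, assuming the stock-A input has a root element named `stock-a` with `unit-a` children; only the `result` and `unit-a` names come from the message above, the rest is hypothetical.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the input root element is <stock-a> -->
  <xsl:template match="/stock-a">
    <!-- A single principal result document wrapping every unit,
         in place of N xsl:result-document outputs -->
    <result>
      <xsl:copy-of select="unit-a"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Because everything stays in one document, the output can flow through an ordinary pipeline connection and be split later by the iteration step.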
In reply to this post by alan.painter
Thinking it over a bit, I do believe that p:for-each could well work for producing N output documents, effectively replacing xsl:result-document. It's probably not as convenient in some cases, and I'm not sure about the performance considerations, but from a theoretical standpoint it's probably going to be equivalent.

I don't see how this works for reading N documents. You mentioned an aggregation facility but I haven't found it.
Hi Alan,
The idea of the p:for-each "aggregation facility" is that each step of the iteration can produce a document, and that p:for-each can aggregate all those documents into a new document under a root element that you specify. See for instance the example at the URL below.

http://www.orbeon.com/ops/doc/reference-xpl-pipelines#for-each

Alex
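Put together with the earlier suggestion, the iteration and aggregation might be sketched as the pipeline below. Treat it as an assumption-laden illustration rather than a tested configuration: the attribute usage (`select`, `root`, `id`, `current()`) follows the for-each section of the XPL reference as recalled and should be checked against the URL above, the stylesheet name `unit-a-to-unit-z.xsl` is hypothetical, and `stock-z` is simply the name used earlier in the thread.

```xml
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

  <!-- Input: a <result> document wrapping the unit-a elements -->
  <p:param name="data" type="input"/>
  <!-- Output: the aggregated document -->
  <p:param name="data" type="output"/>

  <!-- Iterate over each unit-a; the documents produced by the
       iterations are collected under a <stock-z> root element -->
  <p:for-each href="#data" select="/result/unit-a"
              root="stock-z" id="stock-z">
    <p:processor name="oxf:xslt">
      <!-- Hypothetical stylesheet converting one unit-a to one unit-z -->
      <p:input name="config" href="unit-a-to-unit-z.xsl"/>
      <!-- current() refers to the element selected for this iteration -->
      <p:input name="data" href="current()"/>
      <p:output name="data" ref="stock-z"/>
    </p:processor>
  </p:for-each>

  <!-- Copy the aggregated document to the pipeline output -->
  <p:processor name="oxf:identity">
    <p:input name="data" href="#stock-z"/>
    <p:output name="data" ref="data"/>
  </p:processor>

</p:config>
```

This yields a single aggregated document; splitting the result into a few stock-Z documents, as in step 4 of the original use case, would need an additional grouping step that is not shown here.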