can't get 2 processor in 1 pipeline to work


can't get 2 processor in 1 pipeline to work

James Liang
Hi all,

My understanding of pipelines must be missing something obvious.  I
have a delete_all.xpl pipeline that clears the database.  I also have
a save.xpl pipeline that populates the database.  When I call these
pipelines individually, they work.

However, when I combine these two pipelines inside save_all.xpl,
save_all.xpl doesn't work.  When save_all.xpl is called, it seems to
execute only save.xpl (judging from the log).  Since delete_all.xpl
didn't execute to clean up the database first, save.xpl fails with a
database constraint violation.

Isn't each processor inside p:config called sequentially?


Content of save_all.xpl:

<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
xmlns:sql="http://orbeon.org/oxf/xml/sql"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:oxf="http://www.orbeon.com/oxf/processors"
xmlns:xi="http://www.w3.org/2001/XInclude">
  <p:param type="input" name="instance" />
  <p:param type="output" name="data" />

  <p:processor name="oxf:pipeline">
    <p:input name="config" href="delete_all.xpl" />
  </p:processor>

  <p:processor name="oxf:pipeline">
    <p:input name="config" href="save.xpl" />
    <p:input name="instance" href="#instance" />
    <p:output name="data" ref="data" />
  </p:processor>
</p:config>


--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
OW2 mailing lists service home page: http://www.ow2.org/wws

Re: can't get 2 processor in 1 pipeline to work

fl.schmitt(ops-users)
James,

> Isn't each processor inside p:config called sequentially?

AFAIK no, because of the lazy evaluation model. In your save_all.xpl, the
output is produced by save.xpl alone, so the engine sees no need to call
delete_all.xpl. If save.xpl depends on a successful execution of
delete_all.xpl, you will have to express that dependency explicitly.

To do so, you could connect both pipeline processors:

- add a p:output to the first processor (calling delete_all.xpl) to
expose the result of delete_all.xpl;
- add a p:input to the second processor (calling save.xpl) that
references the output of the first processor.

This way, save.xpl will be called only if delete_all.xpl has finished;
you may even handle different results of the delete_all "action", for
example if the delete wasn't successful for any reason.

HTH
florian





Re: Re: can't get 2 processor in 1 pipeline to work

James Liang
Hi,

Thank you for your help.

Unfortunately, connecting the two processors did not work.

To make the problem easy to reproduce, I simplified it by creating two
pipelines that use the directory-scanner processor:

dir.xpl - lists *.bat files
dir2.xpl - lists *.sh files

Each of these pipelines works independently.  Note that I specified
debug="" on the <p:output> tag, so I was able to verify that they work
both in the log files and from the browser.

I also created dir_all.xpl as follows:

<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:sql="http://orbeon.org/oxf/xml/sql"
          xmlns:xs="http://www.w3.org/2001/XMLSchema"
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          xmlns:oxf="http://www.orbeon.com/oxf/processors"
          xmlns:xi="http://www.w3.org/2001/XInclude">
  <p:param type="input" name="instance" />
  <p:param type="output" name="data" />

  <p:processor name="oxf:pipeline">
    <p:input name="config" href="dir.xpl" />
    <p:output name="data" id="result" />
  </p:processor>

  <p:processor name="oxf:pipeline">
    <p:input name="dir_result" href="#result" />
    <p:input name="config" href="dir2.xpl" />
    <p:input name="instance" href="#instance" />
    <p:output name="data" ref="data" />
  </p:processor>
</p:config>


I've connected the output of dir.xpl to an input of dir2.xpl (via the
"result" id).  When I use the browser to load dir_all.xpl, the log shows
that only dir2.xpl was ever called.

This is exactly the same behavior as before, when I used two SQL
processors.  In that case, I was able to verify from both the log and
MySQL that one of the SQL pipelines was never executed.

I've attached dir.xpl, dir2.xpl and dir_all.xpl.

Again, thank you for your help.


Thanks,
James







On Wed, May 11, 2011 at 1:27 AM, Florian Schmitt
<[hidden email]> wrote:

> [...]

dir.xpl (1K) Download Attachment
dir_all.xpl (1K) Download Attachment
dir2.xpl (1K) Download Attachment

Re: Re: can't get 2 processor in 1 pipeline to work

ncrofts
Hi,

The general rule with XML pipelines is to ensure that every input/output is terminated. As such your solution is just missing a couple of things.

Firstly, in "dir2.xpl" you haven't specified an input parameter with name "dir_result". Without this the "#result" won't be pulled through the pipeline.

Secondly, once you define that input you will need to make sure that you use it. There are several options, but in a case where you don't really care about that input I would connect it to an "oxf:null-serializer" processor. This will effectively 'sink' the input.

With this done, the "dir.xpl" processor will run first because "dir2.xpl" is then dependent on the output of "dir.xpl".

Hope this helps.

Regards,
Neil


Re: Re: Re: can't get 2 processor in 1 pipeline to work

fl.schmitt(ops-users)
In reply to this post by James Liang
James,

> Unfortunately, connecting the 2 processors did not work.

Hmm, OK. I think connecting the processors is only part of the
solution. In your example code, the output of the second processor is
still in no way dependent on the output of the first one, so "lazy
evaluation" may still omit the first processor when generating the
output of the second one.

If this is correct, one solution would be to make the output of both
processors relevant to the result of the XPL, for example by aggregating
them (using the identity processor [1]), or by evaluating the content of
the first processor's output. In pseudocode:

p:pipeline input=in output=out
    p:processor id=first, input=#in output=firstresult (true|false)
    p:choose href=#firstresult
        p:when test=true
            p:processor id=second output=finalresult ref=out
        p:otherwise
            (generate null document, throw exception or anything else)
    /p:choose
/p:pipeline
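
Translated into actual XPL, the p:choose pseudocode above might look roughly like the sketch below, applied to the original save_all.xpl. It rests on assumptions that are not confirmed anywhere in this thread: that delete_all.xpl declares an output parameter named "data", and that it writes a hypothetical <result>true</result> document on success. The id "delete-result" and the <error> element are likewise made up for illustration.

```xml
<!-- save_all.xpl (sketch): run save.xpl only if delete_all.xpl reports success -->
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

  <p:param type="input" name="instance"/>
  <p:param type="output" name="data"/>

  <!-- Assumption: delete_all.xpl exposes a "data" output holding
       <result>true</result> or <result>false</result> -->
  <p:processor name="oxf:pipeline">
    <p:input name="config" href="delete_all.xpl"/>
    <p:output name="data" id="delete-result"/>
  </p:processor>

  <!-- Evaluating the test requires reading #delete-result,
       so delete_all.xpl has to run before a branch is chosen -->
  <p:choose href="#delete-result">
    <p:when test="/result = 'true'">
      <p:processor name="oxf:pipeline">
        <p:input name="config" href="save.xpl"/>
        <p:input name="instance" href="#instance"/>
        <p:output name="data" ref="data"/>
      </p:processor>
    </p:when>
    <p:otherwise>
      <!-- Hypothetical error document returned when the delete failed -->
      <p:processor name="oxf:identity">
        <p:input name="data">
          <error>delete_all.xpl did not report success</error>
        </p:input>
        <p:output name="data" ref="data"/>
      </p:processor>
    </p:otherwise>
  </p:choose>

</p:config>
```

Both branches write to the same "data" output, so the caller receives either the result of save.xpl or the error document.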

Another (and easier) solution would be to let the "Null serializer" [2]
consume the output of the first processor. This way, the XPL engine has
to access the output of the first processor, but just to discard it.
Pseudocode again:

p:pipeline input=in output=out
    p:processor id=first, input=#in output=firstresult
    p:nullserializer input=#firstresult (no output, sort of /dev/null)

    p:processor id=second output=finalresult ref=out
/p:pipeline
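
As a rough XPL sketch of the null-serializer pseudocode above, again using the original save_all.xpl: the only names taken from the thread are the file names and the oxf:null-serializer processor with its "data" input. That delete_all.xpl exposes a "data" output, and the id "delete-result", are assumptions. Whether the sink is guaranteed to run before save.xpl in this exact layout is not confirmed in the thread; the variant reported as working places the sink inside the second pipeline instead.

```xml
<!-- save_all.xpl (sketch): sink the first pipeline's output so it gets read -->
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

  <p:param type="input" name="instance"/>
  <p:param type="output" name="data"/>

  <!-- Assumption: delete_all.xpl declares an output parameter named "data" -->
  <p:processor name="oxf:pipeline">
    <p:input name="config" href="delete_all.xpl"/>
    <p:output name="data" id="delete-result"/>
  </p:processor>

  <!-- Reads #delete-result and discards it (no output, like /dev/null) -->
  <p:processor name="oxf:null-serializer">
    <p:input name="data" href="#delete-result"/>
  </p:processor>

  <p:processor name="oxf:pipeline">
    <p:input name="config" href="save.xpl"/>
    <p:input name="instance" href="#instance"/>
    <p:output name="data" ref="data"/>
  </p:processor>

</p:config>
```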

If the second processor depends on the successful execution of the first
one, I would recommend using p:choose (as in my first example) to
express this dependency explicitly in the XPL code.
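
For completeness, the aggregation option mentioned earlier in this message could be sketched as follows for the dir.xpl / dir2.xpl test case. This is only an illustration: the ids "first" and "second" and the root element name "results" are made up, and the thread does not confirm that this variant was tried. The idea is that the pipeline's "data" output now contains both results, so the engine has to read both.

```xml
<!-- dir_all.xpl (sketch): make both results part of the final output -->
<p:config xmlns:p="http://www.orbeon.com/oxf/pipeline"
          xmlns:oxf="http://www.orbeon.com/oxf/processors">

  <p:param type="input" name="instance"/>
  <p:param type="output" name="data"/>

  <p:processor name="oxf:pipeline">
    <p:input name="config" href="dir.xpl"/>
    <p:output name="data" id="first"/>
  </p:processor>

  <p:processor name="oxf:pipeline">
    <p:input name="config" href="dir2.xpl"/>
    <p:input name="instance" href="#instance"/>
    <p:output name="data" id="second"/>
  </p:processor>

  <!-- Wraps both documents in a hypothetical <results> root element -->
  <p:processor name="oxf:identity">
    <p:input name="data" href="aggregate('results', #first, #second)"/>
    <p:output name="data" ref="data"/>
  </p:processor>

</p:config>
```

Note that this changes the shape of the returned document, so it only fits cases where the caller can handle the wrapper element.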

HTW (hope this works ;) )
florian


[1]
http://wiki.orbeon.com/forms/doc/developer-guide/processors-other#TOC-Identity-processor
[2]
http://wiki.orbeon.com/forms/doc/developer-guide/processors-other-serializers#TOC-Null-serializer





Re: Re: Re: Re: can't get 2 processor in 1 pipeline to work

James Liang
Thanks everyone for helping me sort this out!  I've got it working.

I am still a bit unclear on why the solution worked :)  One of the
changes I made to dir2.xpl was to include a null-serializer to sink
the input (see below).

Notice that the output of the null-serializer is not fed into the input
of the directory-scanner.  Yet both processors were executed in
sequence.  So why is it OK that sometimes the output of one processor
doesn't need to feed into another?

This issue isn't critical.  I just feel like there is still a gap in
my understanding of the pipeline evaluation strategy.

dir2.xpl:

<p:config
        xmlns:p="http://www.orbeon.com/oxf/pipeline"
        xmlns:sql="http://orbeon.org/oxf/xml/sql"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:oxf="http://www.orbeon.com/oxf/processors"
        xmlns:xi="http://www.w3.org/2001/XInclude">

  <p:param type="input" name="instance" />
  <p:param type="input" name="dir_result" />
  <p:param type="output" name="data" />
       
  <p:processor name="oxf:null-serializer">
      <p:input name="data" href="#dir_result"/>
  </p:processor>
       
  <p:processor name="oxf:directory-scanner">
    <!-- The configuration can often be inline -->
    <p:input name="config">
      <config>
        <base-directory>file:.</base-directory>
        <include>**/*.sh</include>
        <case-sensitive>false</case-sensitive>
      </config>
    </p:input>
    <p:output name="data" ref="data" debug="" />
  </p:processor>
       
</p:config>


-Thanks,
James



On Thu, May 12, 2011 at 12:55 AM, Florian Schmitt
<[hidden email]> wrote:

> [...]