controlling when resources are revalidated?

controlling when resources are revalidated?

Adrian Baker
In a production system, ideally we'd like to avoid any revalidation (i.e. checking the last-modified time) of XPL files, particularly when they are loaded from a remote HTTP URL, since each check involves opening a connection and so on. Profiling an HTTP-sourced dummy XPL which did nothing but return a document constructed in memory showed that more time was spent in getInputKeyValidity than in actually executing the pipeline (and the HTTP URL was on localhost).

I can see what looks like an option in the URLGenerator (/config/cache-control/always-revalidate) for controlling exactly this, but is there a way to specify this behavior more globally, for references like:

    <p:processor name="oxf:pipeline">
        <p:input name="config" href="epilogue-servlet.xpl"/>
    </p:processor>

or for an invocation like

    final ProcessorDefinition processorDefinition = new ProcessorDefinition();
    processorDefinition.setName(new QName("pipeline", XMLConstants.OXF_PROCESSORS_NAMESPACE));
    processorDefinition.addInput("config", "http://localhost/abc/dummy.xpl");
    final Processor processor = InitUtils.createProcessor(processorDefinition);
    InitUtils.runProcessor(processor, ...);

If not, I guess in the former case (href XPL reference) it would be possible to manually replace each of these URLs with an explicit call to the URLGenerator, wouldn't it? E.g.:

    <p:processor name="oxf:url-generator">
        <p:input name="config">
            <config>
                <url>epilogue-servlet.xpl</url>
                <content-type>application/xml</content-type>
                <cache-control>
                    ...
                </cache-control>
            </config>
        </p:input>
        <p:output name="data" id="epilogue-servlet"/>
    </p:processor>   

    <p:processor name="oxf:pipeline">
        <p:input name="config" href="#epilogue-servlet"/>
    </p:processor>

In the latter case (Java invocation) I suppose I could write a staging pipeline which would take the URL as an input, pass it to the url-generator to get the config document, then pass that document to the pipeline processor...

Adrian



--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws

Re: controlling when resources are revalidated?

Erik Bruchez
Administrator
Adrian,

The answer is that no, there is no global mechanism to handle this at
the moment.

You could imagine Orbeon Forms (at least the pipeline engine) having a
smarter and configurable HTTP client, which could even handle a local
cache, but currently this is not the case (although we already use the
Apache HTTP client instead of the built-in Java HttpURLConnection).
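
To make the "local cache" idea concrete, here is a minimal sketch of a cache that revalidates with the origin and reuses its local copy when nothing has changed. It is only an illustration under stated assumptions: RevalidatingCache, Origin and Response are hypothetical names, not Orbeon Forms or Apache HTTP client classes, and Origin merely stands in for an HTTP GET carrying an If-Modified-Since header.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a local cache with conditional revalidation.
// None of these types exist in Orbeon Forms; Origin stands in for an HTTP client.
final class RevalidatingCache {

    /** Outcome of a conditional fetch; body is null when notModified is true. */
    static final class Response {
        final boolean notModified;
        final long lastModified;
        final String body;

        Response(boolean notModified, long lastModified, String body) {
            this.notModified = notModified;
            this.lastModified = lastModified;
            this.body = body;
        }
    }

    /** Assumed origin API: behaves like a GET with an If-Modified-Since header. */
    interface Origin {
        Response get(String url, long ifModifiedSince);
    }

    private static final class Entry {
        final long lastModified;
        final String body;

        Entry(long lastModified, String body) {
            this.lastModified = lastModified;
            this.body = body;
        }
    }

    private final Origin origin;
    private final Map<String, Entry> entries = new HashMap<>();

    RevalidatingCache(Origin origin) {
        this.origin = origin;
    }

    /** Returns the body for url, reusing the cached copy when the origin reports "not modified". */
    synchronized String fetch(String url) {
        Entry cached = entries.get(url);
        // -1 means "no cached copy", so the origin has to send the full body.
        long since = (cached == null) ? -1L : cached.lastModified;
        Response response = origin.get(url, since);
        if (cached != null && response.notModified) {
            return cached.body;
        }
        entries.put(url, new Entry(response.lastModified, response.body));
        return response.body;
    }
}
```

Note that a cache like this still pays one round trip per fetch; it only avoids downloading the document again, so on its own it would not remove the connection cost described in the original question.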

Clearly, you have to have tight control over HTTP URLs. The XForms
engine processor (oxf:xforms-to-xhtml) is already smarter about this,
fetching resources during initialization only once, but this is a
"local" fix, and it doesn't help much in the case of multiple
invocations of a pipeline anyway.

-Erik

--
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/




Re: controlling when resources are revalidated?

Adrian Baker
Hi Erik,

I think the overzealous validation problem extends even to applications which have primarily file-based resources. Profiling a simple form just by hitting refresh 10-20 times in the browser showed ~6% of the time being spent in ResourceManager.lastModified(); this is using the class loader to load resources from local files.
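
One way to cut the cost of repeated last-modified checks, sketched here purely as an illustration, is to throttle them so each resource is checked at most once per interval. ThrottledLastModified and its parameters are hypothetical names, not part of the Orbeon ResourceManager API; the lookup function stands in for whatever the real check is (a file stat, a class loader lookup, an HTTP request).

```java
import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.ToLongFunction;

// Hypothetical sketch: answer last-modified queries from memory and only
// repeat the real (expensive) check once a minimum interval has elapsed.
final class ThrottledLastModified {

    private static final class Sample {
        final long checkedAt;     // clock value when the real check ran
        final long lastModified;  // value the real check returned

        Sample(long checkedAt, long lastModified) {
            this.checkedAt = checkedAt;
            this.lastModified = lastModified;
        }
    }

    private final ToLongFunction<String> lookup; // the expensive check
    private final LongSupplier clock;            // current time in milliseconds
    private final long minIntervalMillis;
    private final ConcurrentHashMap<String, Sample> samples = new ConcurrentHashMap<>();

    ThrottledLastModified(ToLongFunction<String> lookup, LongSupplier clock, long minIntervalMillis) {
        this.lookup = lookup;
        this.clock = clock;
        this.minIntervalMillis = minIntervalMillis;
    }

    /** Convenience factory that checks real files using the system clock. */
    static ThrottledLastModified forFiles(long minIntervalMillis) {
        return new ThrottledLastModified(
                path -> new File(path).lastModified(),
                System::currentTimeMillis,
                minIntervalMillis);
    }

    /** Returns the remembered value unless it is older than the minimum interval. */
    long lastModified(String key) {
        long now = clock.getAsLong();
        Sample sample = samples.get(key);
        if (sample == null || now - sample.checkedAt >= minIntervalMillis) {
            // Two threads may both run the lookup here; that is harmless, just redundant.
            sample = new Sample(now, lookup.applyAsLong(key));
            samples.put(key, sample);
        }
        return sample.lastModified;
    }
}
```

The trade-off is that a changed resource may go unnoticed for up to the chosen interval, which is usually acceptable in production but not during development.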

Also, would you have any comment on the feasibility of the workarounds I've suggested?

Adrian




Re: controlling when resources are revalidated?

Erik Bruchez
Administrator
Adrian,

Regarding the Java / addInput() case: if you follow the code, this
will end up creating a new DOMGenerator, which could (if it doesn't
yet) take a configuration parameter controlling caching.

But the cacheAlwaysRevalidate member of URLGenerator.Config is never
used! So obviously this would have to be implemented before this
works!
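
As a rough idea of what honoring such a flag could look like, here is a small sketch. It is not the URLGenerator implementation: ValidityCheck is a hypothetical class, the lookup stands in for the real last-modified check, and it assumes the flag means "ask the origin on every use" when true and "trust the first answer" when false.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of a validity check that honors an always-revalidate flag.
// Assumption: true = check the origin every time, false = check only once.
final class ValidityCheck {

    private final boolean alwaysRevalidate;
    private final LongSupplier lastModifiedLookup; // stand-in for the costly origin check
    private Long remembered;                       // null until the first check has run

    ValidityCheck(boolean alwaysRevalidate, LongSupplier lastModifiedLookup) {
        this.alwaysRevalidate = alwaysRevalidate;
        this.lastModifiedLookup = lastModifiedLookup;
    }

    /** Returns the value used to decide whether cached output is still valid. */
    synchronized long validity() {
        if (alwaysRevalidate || remembered == null) {
            remembered = lastModifiedLookup.getAsLong();
        }
        return remembered;
    }
}
```

With the flag off, the origin is consulted once and never again for the lifetime of the object, so picking up a changed resource would need a restart or some explicit invalidation.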

-Erik

