Hi
I hope these are not stupid questions and that someone can help me with caching. I have a few different questions. (I should say I am asking in the context of OPS 3.0.1, but I have looked through the documentation at orbeon.com.)

In the regular OPS cache, when I look at the statistics for the page I am testing now, the log shows a cache success rate of 88%. Sounds pretty good, but nothing changed between one request and the next, so I am wondering why it is not 100%?

If I look at the HTTP headers returned to the browser, both Last-Modified and Expires are set to the moment that the page was generated. Each time it is "generated".

This raises a number of points:

* If OPS could achieve 100% success from its own cache, it could look at the If-Modified-Since header in the HTTP request and send back a 304 Not Modified. That's probably hoping for too much.
* Obviously, if you are building a truly dynamic website, then it would be desirable to tell the browser that the content it has received has already expired. But many sites/pages are not that dynamic, not completely. And I don't see any way to influence the values of Last-Modified or Expires in the response. Hopefully I am missing something there.

What I think I need is some way to set a default expiration datetime (like "now + defaultvalue") and then some way to set a value for specific pages. So a mostly dynamic site would have a default of 0 and set values on the few not-so-dynamic pages, while a mostly almost-static site with a few truly dynamic pages would have a default of, say, a few hours and then apply an instant expire to the few dynamic pages.

I am fronting tomcat5/OPS with Apache2. There is an ExpiresDefault directive there, but it lacks the granularity I have just described. And I suspect that something in OPS is setting Expires anyway, so Apache2 wouldn't apply its default.

The reason I am looking at all this is that I am now going to Apache 2.2.3 (from Apache 2.0.52), so I can now use Apache caching. But it isn't going to do anything if it thinks the document has expired already.

This is all to improve performance for pages that are indeed built dynamically but that, in reality, once they have completed a revision cycle, are stable for quite a while.

If anyone else has looked at a similar problem and found a (different) solution, I would be happy to hear it.

Thanks in advance
Colin

--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws
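The "default plus per-page override" scheme described above can be sketched in a few lines of Java. This is only an illustration of the idea, not an OPS feature: the class `ExpiryPolicy`, its methods, and the `/contact` path are all hypothetical names chosen for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of OPS: picks an Expires value per request path. */
final class ExpiryPolicy {
    private final Duration defaultTtl;
    // Path prefix -> time to live; the first matching prefix (in insertion order) wins.
    private final Map<String, Duration> overrides = new LinkedHashMap<>();

    ExpiryPolicy(Duration defaultTtl) {
        this.defaultTtl = defaultTtl;
    }

    ExpiryPolicy override(String pathPrefix, Duration ttl) {
        overrides.put(pathPrefix, ttl);
        return this;
    }

    /** Time to live for a path: first matching override, else the site default. */
    Duration ttlFor(String path) {
        for (Map.Entry<String, Duration> e : overrides.entrySet()) {
            if (path.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        return defaultTtl;
    }

    /** Expires header value ("now + ttl") in the RFC 1123 date format used by HTTP. */
    String expiresHeader(String path, Instant now) {
        return DateTimeFormatter.RFC_1123_DATE_TIME
                .format(now.plus(ttlFor(path)).atOffset(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        // A mostly static site: three-hour default, instant expiry for one dynamic page.
        ExpiryPolicy mostlyStatic = new ExpiryPolicy(Duration.ofHours(3))
                .override("/contact", Duration.ZERO);
        Instant now = Instant.now();
        System.out.println("/about   -> " + mostlyStatic.expiresHeader("/about", now));
        System.out.println("/contact -> " + mostlyStatic.expiresHeader("/contact", now));
    }
}
```

A mostly dynamic site would simply invert the configuration: a default of `Duration.ZERO` and longer overrides for the few stable pages.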
Colin,

> In the regular OPS cache, when I look at the statistics, on the page I am testing now, the log shows a Cache success rate of 88%.
> Sounds pretty good, but nothing changed between one request and the next, so I am wondering why it is not 100%?

Good question. Without something like a "cache analyzer", it is difficult to provide an answer ;-) So much goes through the cache that there is probably no intuitive answer either. Does the number change if you make the cache larger or smaller?

> If I look at the HTTP header returned to the browser, both Last-Modified and Expires are set to the moment that the page was generated. Each time it is "generated".
> This raises a number of points
>
> * if OPS could achieve 100% success from its own cache, it could look at the If-Modified-Since in the http request, and send back a 304 not modified. That's probably hoping for too much

Yes and no. In the case where a view depends only on static files, this should actually happen, as we have this code in the HTTP serializer:

    // Check If-Modified-Since (conditional GET) and don't return content if condition is met
    if (!response.checkIfModifiedSince(lastModified, true)) {
        response.setStatus(ExternalContext.SC_NOT_MODIFIED);
        if (logger.isDebugEnabled())
            logger.debug("Sending SC_NOT_MODIFIED");
        return;
    }

But there may be cases where this code is not triggered, for some reason.

(There is also some "interesting" caching-related code in ServletExternalContext.setCaching() ;-)

> * obviously, if you are building a truly dynamic website, then it would be desirable to tell the browser the content it has received has already expired.
> But many sites/pages are not that dynamic, not completely.
> But I don't see any way to influence what values are in the response for last-modified nor expires.

Correct, this is meant to be handled automatically by the XPL caching mechanism and the HTTP serializer.

> Hopefully I am missing something there.
>
> What I am thinking I am needing is some way to set a default expiration datetime (like "now + defaultvalue") and then some way to set a value for specific pages.
> So a mostly dynamic site would have a default of 0 and set values on the few not so dynamic pages, while a mostly almost-static site with a few truly dynamic pages would have a default of a few hours say and then apply an instant expire to the few dynamic pages.

We have a hack that we have been using for 100% static sites. You can add these properties to properties.xml:

    <property as="xs:dateTime" name="oxf.http.force-last-modified" value="2005-06-17T00:00:00"/>
    <property as="xs:boolean" name="oxf.http.force-must-revalidate" value="false"/>

This will set a single last-modified date for all the pages of the web site that do not receive a "remote user". For those pages, you can also control whether you want the server to force the client to revalidate the cache when the page is needed again.

Conversely, pages that receive a remote user will not be cacheable, on the assumption that those may be dynamic pages.

Again, this is a hack of sorts.

> I am fronting tomcat5/OPS with Apache2.
> So there is an ExpiresDefault directive there, but it lacks the granularity I have just described.
> And I suspect that somewhere in OPS is setting the expires anyway, and so Apache2 wouldn't apply its default.
>
> And the reason I am looking at all this is that I am now going to Apache 2.2.3 (from Apache 2.0.52) and so I can now use Apache caching.
> But it isn't going to do anything if it thinks the document has expired already.
>
> This is all to improve performance for pages that are indeed built dynamically, but in reality, once they have completed a revision cycle, are actually stable for quite a while.
> If anyone else has looked at a similar problem and found a (different) solution, I would be happy to hear it.

There is no perfect solution at the moment, but the properties above could possibly help in your scenario. Let us know how this goes.

-Erik

--
Orbeon - XForms Everywhere: http://www.orbeon.com/blog/
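To make the conditional GET discussion concrete, here is a stand-alone Java sketch of the comparison such a check performs. It is not the OPS implementation (the behaviour of `checkIfModifiedSince` is not shown in the thread); the class `ConditionalGet` and its method are hypothetical. It also shows why a Last-Modified of "now" on every request can never yield a 304: the resource always looks newer than the client's copy.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

/** Hypothetical illustration of a conditional GET check; not the OPS code. */
final class ConditionalGet {
    static final int SC_OK = 200;
    static final int SC_NOT_MODIFIED = 304;

    /**
     * Returns 304 when the resource is not newer than the date the client sent in
     * If-Modified-Since, else 200. HTTP dates have one-second resolution, so the
     * resource time is truncated to seconds before comparing.
     */
    static int statusFor(String ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince == null) {
            return SC_OK; // unconditional request: always send the content
        }
        try {
            Instant clientCopy = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            Instant resource = lastModified.truncatedTo(ChronoUnit.SECONDS);
            return resource.isAfter(clientCopy) ? SC_OK : SC_NOT_MODIFIED;
        } catch (DateTimeParseException e) {
            return SC_OK; // unparseable header: ignore it and send the content
        }
    }

    public static void main(String[] args) {
        String header = "Mon, 16 Oct 2006 12:00:00 GMT";
        // Stable last-modified date: the client's copy is still good.
        System.out.println(statusFor(header, Instant.parse("2006-10-16T12:00:00Z")));
        // Last-modified set to "now" on each request: always newer than the client's copy.
        System.out.println(statusFor(header, Instant.now()));
    }
}
```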
Hi Erik
many thanks for the quick reply.

On Oct 16, 2006, at 8:52 AM, Erik Bruchez wrote:

> Colin,
>
> > In the regular OPS cache, when I look at the statistics, on the page I am testing now, the log shows a Cache success rate of 88%.
> > Sounds pretty good, but nothing changed between one request and the next, so I am wondering why it is not 100%?
>
> Good question. Without something like a "cache analyzer", it is difficult to provide an answer ;-) So much goes through the cache that there is probably no intuitive answer either. Does the number change if you make the cache larger or smaller?

Didn't know I could make it larger. In properties.xml, I see oxf.cache.size with a value of 384. Is this what you mean? I increased it to 512 but it did not seem to make any difference.

These are not pages coming from a straight view; they are generated through multiple steps, so it sounds like you are saying 100% is unlikely in this case, though as I said, none of the files (xml, xpl, xsl) involved have changed.

If I increase the cache size, should I increase the VM memory size?

> > If I look at the HTTP header returned to the browser, both Last-Modified and Expires are set to the moment that the page was generated. Each time it is "generated".
> > This raises a number of points
> >
> > * if OPS could achieve 100% success from its own cache, it could look at the If-Modified-Since in the http request, and send back a 304 not modified. That's probably hoping for too much
>
> Yes and no, in the case a view only depends on static files, this should actually happen, as we have this code in the HTTP serializer:
>
>     // Check If-Modified-Since (conditional GET) and don't return content if condition is met
>     if (!response.checkIfModifiedSince(lastModified, true)) {
>         response.setStatus(ExternalContext.SC_NOT_MODIFIED);
>         if (logger.isDebugEnabled())
>             logger.debug("Sending SC_NOT_MODIFIED");
>         return;
>     }
>
> But there may be cases where this code is not triggered, for some reason.

expires

> (There is also some "interesting" caching-related code in ServletExternalContext.setCaching() ;-)
>
> > * obviously, if you are building a truly dynamic website, then it would be desirable to tell the browser the content it has received has already expired.
> > But many sites/pages are not that dynamic, not completely.
> > But I don't see any way to influence what values are in the response for last-modified nor expires.
>
> Correct, this is meant to be handled automatically by the XPL caching mechanism and the HTTP serializer.
>
> > Hopefully I am missing something there.
> >
> > What I am thinking I am needing is some way to set a default expiration datetime (like "now + defaultvalue") and then some way to set a value for specific pages.
> > So a mostly dynamic site would have a default of 0 and set values on the few not so dynamic pages, while a mostly almost-static site with a few truly dynamic pages would have a default of a few hours say and then apply an instant expire to the few dynamic pages.
>
> We have a hack that we have been using for 100% static sites. You can add these properties to properties.xml:
>
>     <property as="xs:dateTime" name="oxf.http.force-last-modified" value="2005-06-17T00:00:00"/>
>     <property as="xs:boolean" name="oxf.http.force-must-revalidate" value="false"/>
>
> This will set a single last modified date for all the pages of the web site that do not receive a "remote user". For those pages, you can also control whether you want the server to force the client to revalidate the cache when the page is needed again.
>
> On the contrary pages that receive a remote user will not be cacheable, with the idea that those may be dynamic pages.
>
> Again, this is a hack of sorts.

When I had the Apache cache on:

* Alternating requests would get first the Last-Modified value from properties.xml and next the current time.
* The Expires was always a week ago, so the cache always went to OPS for the document.
* I tried setting ExpiresDefault in Apache and that didn't change the Expires, indicating that it is already being set by OPS.

When I had the Apache cache off:

* The response to the first request has my last-modified date from properties.xml and an Expires of a week ago. (At some point in the testing it did go back to having the Expires as the current hour. Is this in some way related to translating my time in properties.xml to UTC?)
* The response to the second request is now a 304 Not Modified! BUT the request still went to OPS, and OPS still had to do all its processing, so there was no performance advantage. (Presumably Apache looked at the info coming back from OPS and then said, oh ok, no change. But the performance issue is in OPS, hence looking at the Apache cache to prevent Apache coming to OPS at all. The Apache cache does have a way to clear the cache if the content has changed but not expired.)

When I took force-last-modified out of properties.xml, the Expires returned to the current time.

My initial feeling on this is that maybe Last-Modified isn't all that critical; what matters is Expires. If OPS wasn't setting Expires, I could set it with the default in Apache (which, as I said, might not be entirely in my interest). ExpiresDefault has a low-maintenance way of defining the time required, expressed as an addition to either the access or the last-modified time, and not as an absolute. (If I want to set force-last-modified to "tomorrow" then I would have to create a cron job to update it each night and restart Tomcat!)

Hope that all makes sense

Best regards
Colin
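On the UTC question raised above: a zone-less xs:dateTime such as the one in properties.xml has to be interpreted in some time zone before it can be written as an HTTP date, which is always expressed in GMT. The thread does not say which zone OPS assumes, so the following Java sketch only shows the effect under the assumption that the server's local zone is used; `ForcedLastModified` and the `America/New_York` zone are hypothetical choices for the example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical illustration: how a zone-less dateTime shifts when rendered as an HTTP date. */
final class ForcedLastModified {

    /** Interprets a zone-less xs:dateTime in the given zone and renders it as an HTTP (GMT) date. */
    static String toHttpDate(String xsDateTime, ZoneId assumedZone) {
        return LocalDateTime.parse(xsDateTime)       // e.g. 2005-06-17T00:00:00, no zone
                .atZone(assumedZone)                  // assumption: which zone the value is in
                .withZoneSameInstant(ZoneOffset.UTC)  // HTTP dates are always GMT
                .format(DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    public static void main(String[] args) {
        String configured = "2005-06-17T00:00:00";
        System.out.println(toHttpDate(configured, ZoneOffset.UTC));
        System.out.println(toHttpDate(configured, ZoneId.of("America/New_York")));
    }
}
```

If the header seen in the browser differs from the configured value by a whole number of hours, a conversion of this kind is a plausible explanation; it would not by itself explain an Expires a week in the past.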
Hi Erik
I have looked into the memory questions and even improved the performance a little. I am still interested in seeing what apache2 caching can do for this project.

If I am reading this apache2 document correctly
http://httpd.apache.org/docs/2.3/mod/mod_cache.html
then it is Expires and not Last-Modified that is important for this. It seems OPS is setting Expires, but not in a particularly consistent way, and never to a future date, and I don't seem to have any control over it. Similarly, OPS is setting Last-Modified to the current time, unless the hack is used to specify a particular datetime, but this datetime is not relative to the time the page is first served.

Am I reading all this correctly, and is there anything (simple) we can do to make OPS more usable with apache2 caching?

The optimum would seem to be for OPS not to set Expires, just Last-Modified, thus allowing the caching directives to set Expires, and for there to be a way for the pipeline to indicate when the page being served needs to be served as already expired. But then, I don't know why OPS is setting Expires, and there may be a very good, conflicting reason for this?

Thanks & regards
Colin
Link correction:
http://httpd.apache.org/docs/2.3/caching.html

On Oct 20, 2006, at 4:33 PM, Colin O'Brien wrote:

> If I am reading correctly this apache2 document
> http://httpd.apache.org/docs/2.3/mod/mod_cache.html
> then it is expires and not last-modified that is important for this.
> [...]
In reply to this post by Colin O'Brien
Colin,

> Didn't know I could make it larger.
> In properties.xml, I see oxf.cache.size with a value of 384
> Is this what you mean?

Yes.

> I increased it to 512 but it did not seem to make any difference.
> These are not pages coming from a straight view, they are being generated through multiple steps, so sounds like you are saying 100% is unlikely in this case, though as I said, none of the files (xml, xpl, xsl) involved have changed.
> If I increase the cache size, should I increase the VM memory size?

In general it makes sense to do that, especially if you get OutOfMemoryErrors, but only testing will show what the optimal values are. But since none of this seems to help in your case, there is no point.

> My initial feeling on this is that maybe last-modified isn't all that critical, what matters is expires.

Yes. And I think that if an expiration header is missing, then the browser is left to estimate an expiration date based on the Last-Modified header. But if an expiration header is present, I am not sure if or how the Last-Modified header is used. OPS sets both headers.

> If OPS wasn't setting expires, I could set it with the default in apache (which as I said, might not be entirely in my interest) - the ExpiresDefault has a low maintenance way of defining the time required, expressed as an addition to either access or last-modified times, and not as an absolute (if I want to set force-last-modified to "tomorrow" then I would have to create a cron to update it each night and restart tomcat!)
>
> Hope that all makes sense

Kind of, but I am not sure how to take this forward!

-Erik
In reply to this post by Colin O'Brien
Colin,

> If I am reading correctly this apache2 document http://httpd.apache.org/docs/2.3/mod/mod_cache.html then it is expires and not last-modified that is important for this. It seems OPS is setting expires, but not in a particular consistent way, and never for a future date, and I don't seem to have any control over it. Similarly, OPS is setting the last-modified, as the current time - unless the hack is used to specify a particular datetime, but this datetime is not relative to the time the page is first served.

This is how the HTTP serializer works. First it tries to find a last-modified date for the resource produced. If it can't find one, then it uses "now" (the current time). Then it sets the Last-Modified header to that value, sets Expires to the same value, and forces revalidation. It was designed this way to ensure that your browser doesn't show cached pages if they have changed on the server.

The Resource Server processor (used to serve static .js, .css, etc.) works a little differently: it does not force revalidation. It still uses the Last-Modified date if found, but it sets an Expires header with the same heuristic used by web browsers. This was done with the idea that static resources in general would be cacheable by the web browser.

In other words: static resources should be cacheable in a way similar to the case where they are served directly by Tomcat or by Apache. But dynamic resources, while they can be cached, always require revalidation by the browser. At least, that's how things were intended to work...

(If you use the oxf.http.force-last-modified hack, then we change things a little.)

> Am I reading all this correctly, and is there anything (simple) we can do to make OPS more usable with apache2 caching?

What are you trying to cache again? I fear that I have lost the exact use case.

> The optimum would seem to be for OPS not to set the expires, just the last-modified, thus allowing the caching directives to set the expires, and for there to be a way for the pipeline to indicate when the page being served needs to be served as already expired.

Now I am wondering if the Expires header should in fact be set when we force revalidation, or if it should be set with the usual heuristic instead. I am not sure about that part. But can't you override the Expires header with Apache to experiment?

-Erik
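The two header strategies described in this message can be put side by side in a small Java sketch. This is a reading of the description, not the OPS source: `SerializerHeaders` and its methods are hypothetical, and the 10% figure for the "usual heuristic" is an assumption taken from the fraction commonly suggested for HTTP caches; the thread does not state what OPS or any particular browser actually uses.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of the header choices described in the thread; not the OPS code. */
final class SerializerHeaders {

    /** Last-Modified: the known date of the resource, or "now" if none could be found. */
    static Instant lastModified(Instant known, Instant now) {
        return known != null ? known : now;
    }

    /** Dynamic pages: Expires equals Last-Modified, so the response is never fresh. */
    static Instant dynamicExpires(Instant lastModified) {
        return lastModified;
    }

    /**
     * Static resources: heuristic freshness, here assumed to be 10% of the
     * resource's age at the time it is served (clamped at zero).
     */
    static Instant heuristicExpires(Instant lastModified, Instant now) {
        Duration age = Duration.between(lastModified, now);
        if (age.isNegative()) {
            age = Duration.ZERO;
        }
        return now.plus(age.dividedBy(10));
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant tenDaysAgo = now.minus(Duration.ofDays(10));
        System.out.println("dynamic, date unknown: expires " + dynamicExpires(lastModified(null, now)));
        System.out.println("static, 10 days old:   expires " + heuristicExpires(tenDaysAgo, now));
    }
}
```

Under these assumptions a dynamic page whose date cannot be determined is served as expired at the moment it is generated, which matches what Colin observes in the response headers.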
Hi Erik
just to recap, what I see as my current issue...

I am using OPS to generate web content, and one reason is that OPS gives the site owner the ability to update some of the content, which then requires presentation to be dynamic. In reality, the content changes infrequently, but as currently implemented, a lot of server work occurs for each request.

As part of this dynamic page generation, various components might have to be assembled. I'm guessing this makes it difficult for OPS to work out the last-modified date.

(As an aside, which may or may not be relevant, in testing in Eclipse, I am seeing around 75/85% success in OPS caching. One time I edited an xslt file, and did a refresh of the page, and the cache % was unchanged.)

Anyway, I am looking for a way to reduce the server workload and improve the responsiveness of our OPS-generated sites to visitors. The way we are exploring here is to use the caching capabilities in Apache 2.2...

On Oct 30, 2006, at 9:25 AM, Erik Bruchez wrote:

> Colin,
>
> > If I am reading correctly this apache2 document http://httpd.apache.org/docs/2.3/caching.html then it is expires and not last-modified that is important for this. It seems OPS is setting expires, but not in a particular consistent way, and never for a future date, and I don't seem to have any control over it. Similarly, OPS is setting the last-modified, as the current time - unless the hack is used to specify a particular datetime, but this datetime is not relative to the time the page is first served.
>
> This is how the HTTP serializer works. First it tries to find a last-modified date for the resource produced. If it can't find one,

but I don't want you to spend all day writing emails. The next question would be: if OPS gets the date wrong, what about providing a way for the resource to tell OPS the right date to use? Every "object" in our sites has an element with date-last-modified.

> then it uses "now" (the current time). Then it sets the Last-Modified header with that value, Expires to the same value, and forces a revalidation. This was designed this way in order to ensure that your browser doesn't show cached pages in case they have changed on the server.

Sorry, my knowledge of this part is not detailed enough to understand fully what you mean by revalidation.

This is obviously over-simplistic, even ignorant on my part, but it seems to me that the world has developed a lot of ways to address this question of whether a document can be cached and/or has expired, and you are short-circuiting that with what may be an over-simplistic approach. I'm sure that you had a very real situation in the past where this was the right answer, and that you have seen many more situations than I have. Unfortunately it doesn't seem to apply in this case. The resource being returned by our server knows whether it can be cached or not.

> The Resource Server processor (used to serve static .js, .css, etc.) works a little differently: it does not force revalidation. It still uses the Last-Modified date if found, but it sets an Expires header with the same heuristic used by web browsers. This was done with the idea that static resources in general would be cacheable by the web browser.
>
> In other words: static resources should be cacheable in a way similar to the case where they are served directly by Tomcat or by Apache. But dynamic resources, while they can be cached, always necessitate revalidation from the browser. At least, that's how things were intended to work...
>
> (If you use the oxf.http.force-last-modified hack, then we change things a little.)
>
> > Am I reading all this correctly, and is there anything (simple) we can do to make OPS more usable with apache2 caching?
>
> What are you trying to cache again? I fear that I have lost the exact use case.
>
> > The optimum would seem to be for OPS not to set the expires, just the last-modified, thus allowing the caching directives to set the expires, and for there to be a way for the pipeline to indicate when the page being served needs to be served as already expired.
>
> Now I am wondering if the Expires header should in fact be set when we force revalidation, or if it should be set with the usual heuristic instead. I am not sure about that part. But can't you with Apache override the Expires header to experiment?

found it. Neither by setting values in httpd.conf nor by using http-equiv.

One simple solution might be to add a setting to properties.xml so that the expire behavior can be controlled: off, on/like-browser, on/always-expire, or on/resource-defines.

And/or

There is already a way for the resource to communicate to the serializer when a value is required. That is "http-equiv". If a value is present, the serializer could use it; if none, it carries on as now. This could be used for "expires" and/or for "last-modified".

In my research this past week, I've begun to feel that last-modified is actually what is more important in this process. Expires derives from last-modified, and it would still be relevant to be able to specify at the global, site and resource levels what the last-modified to expires offset is. But it is last-modified that the browser is going to use in its request, and so it is last-modified that the cache will use if it passes the request on to OPS.

Thanks & regards
Colin

PS just in case you are wondering - if there is a change that requires the site to expire, then there are means available in Apache to clear the cache, and it would be easy to repopulate it.
Hi Erik,
just wondering if you have had any more thoughts on the attached? A quick question (not sure how much it really helps, but might be quick way to get part way) you mentioned the addition of oxf.http.force-last-modified in properties.xml. And the example you gave was of a fixed/absolute time. Is it possible (or easy) to have it as a relative time - I would most likely set it to 12 or 24 hours, since as I said, most of the time the site content does not change (apart from use of a contact form) but things do come and go (for instance on a calendar/events page) so allowing for occasional updates would be a way to satisfy much of the requirement. Thanks again Colin On Nov 5, 2006, at 6:43 PM, Colin O'Brien wrote: Hi Erik just to recap, what I see as my current issue... I am using OPS to generate web content and one reason is that I am using OPS to give the site owner the ability to update some of the content which then requires presentation to be dynamic. In reality, the content changes infrequently, but as currently implemented, a lot of server work occurs for each request. As part of this dynamic page generation, various components might have to be assembled. I'm guessing this makes it difficult for OPS to work out the last- modified date. (As an aside, which may or not be relevant, in testing in Eclipse, I am seeing around 75/85% success in OPS caching. One time I edited an xslt file, and did a refresh of the page, and the cache % was unchanged.) Anyway, I am looking for a way to reduce the server workload and improve the responsiveness of our OPS-generated sites to visitors. The way we are exploring here is to use the caching capabilities in Apache 2.2... On Oct 30, 2006, at 9:25 AM, Erik Bruchez wrote: > Colin, > > > If I am reading correctly this apache2 document > > http://httpd.apache.org/docs/2.3/caching.html then it is > > expires and not last-modified that is important for this. 
It seems
> > OPS is setting expires, but not in a particular consistent way, and
> > never for a future date, and I don't seem to have any control over
> > it. Similarly, OPS is setting the last-modified, as the current
> > time - unless the hack is used to specify a particular datetime, but
> > this datetime is not relative to the time the page is first served.
>
> This is how the HTTP serializer works. First it tries to find a
> last-modified date for the resource produced. If it can't find one,

One obvious question would be "how does it do this?", but I don't want you to spend all day writing emails.
The next question would be: if OPS gets the date wrong, what about providing a way for the resource to tell OPS the right date to use?
Every "object" in our sites has an element with date-last-modified.

> then it uses "now" (the current time). Then it sets the Last-Modified
> header with that value, Expires to the same value, and forces a
> revalidation. This was designed this way in order to ensure that your
> browser doesn't show cached pages in case they have changed on the
> server.

Sorry, my knowledge of this part is not detailed enough to fully understand what you mean by revalidation.

This is obviously over-simplistic, even ignorant on my part, but it seems to me that the world has developed a lot of ways to address the question of whether a document can be cached and/or has expired, and you are short-circuiting that with what may be an over-simplistic approach.
I'm sure that you had a very real situation in the past where this was the right answer, and that you have seen many more situations than I have.
Unfortunately it doesn't seem to apply in this case.

The resource being returned by our server knows whether it can be cached or not.

> The Resource Server processor (used to serve static .js, .css, etc.)
> works a little differently: it does not force revalidation. It still
> uses the Last-Modified date if found, but it sets an Expires header
> with the same heuristic used by web browsers. This was done with the
> idea that static resources in general would be cacheable by the web
> browser.
>
> In other words: static resources should be cacheable in a way similar
> to the case where they are served directly by Tomcat or by Apache. But
> dynamic resources, while they can be cached, always necessitate
> revalidation from the browser. At least, that's how things were
> intended to work...
>
> (If you us the oxf.http.force-last-modified hack, then we change
> things a little.)
>
> > Am I reading all this correctly, and is there anything (simple) we
> > can do to make OPS more usable with apqache2 caching?
>
> What are you trying to cache again? I fear that I have lost the exact
> use case.
>
> > The optimum would seem to be for OPS not to set the expires, just
> > the last-modified, thus allowing the caching directives to set the
> > expires, and for there to be a way for the pipeline to indicate when
> > the page being served needs to be served as already expired.
>
> Now I am wondering if the Expires header should in fact be set when we
> force revalidation, or if it should be set with the usual heuristic
> instead. I am not sure about that part. But can't you with Apache
> override the Expires header to experiment?

As to overriding the Expires header, if there is a way, I haven't found it.
Neither by setting values in httpd.conf nor by using http-equiv.

One simple solution might be to add a setting to properties.xml so that the expire behavior can be controlled
- off or on/like-browser or on/always-expire or on/resource-defines

And/or

There is already a way for the resource to communicate to the serializer when a value is required.
That is "http-equiv".
If a value is present, the serializer could use it; if none, it carries on as now.
This could be used for "expires" and/or for "last-modified".

In my research this past week, I've begun to feel last-modified is actually what is more important in this process.
Expires derives from last-modified, and it would still be relevant to be able to specify at the global, site and resource levels what the last-modified to expires offset is.
But it is last-modified that the browser is going to use in its request, and so it is last-modified that the cache will use if it passes the request on to OPS.

Thanks & regards
Colin

PS just in case you are wondering - if there is a change that necessitates the site to expire, then there are means available in apache to clear the cache and it would be easy to repopulate it.
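For readers unsure what "revalidation" means in the exchange above: the browser (or an intermediate cache) sends back the Last-Modified value it was given, in an If-Modified-Since request header, and the server answers 304 Not Modified if nothing has changed since. A minimal Python sketch of that check follows. The function name `revalidate` is made up for illustration; this is not how OPS implements it, only the general HTTP mechanism.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def revalidate(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET, honouring If-Modified-Since.

    last_modified: timezone-aware datetime of the resource's last change.
    if_modified_since: raw request header value, or None if absent.
    """
    # HTTP dates are in GMT with one-second resolution.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        if since is not None and since.tzinfo is not None and last_modified <= since:
            return 304, headers  # cached copy is still good, no body needed
    return 200, headers
```

With a stable Last-Modified value the second request can be answered with a 304. If Last-Modified is always "now", the comparison never succeeds, which matches the behaviour described in this thread.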
Colin,
I was thinking that you could do this with Apache mod_expires. For
example, we have been using this kind of configuration:

  ExpiresActive on
  ExpiresByType application/x-javascript "access plus 2000 minutes"

Have you tried such an approach, and would it not solve your problem?

Best,

-Erik

Colin O'Brien wrote:
> Hi Erik,
>
> just wondering if you have had any more thoughts on the attached?
>
> A quick question (not sure how much it really helps, but might be quick
> way to get part way)
>
> you mentioned the addition of oxf.http.force-last-modified in
> properties.xml.
> And the example you gave was of a fixed/absolute time.
> Is it possible (or easy) to have it as a relative time - I would most
> likely set it to 12 or 24 hours, since as I said, most of the time the
> site content does not change (apart from use of a contact form) but
> things do come and go (for instance on a calendar/events page) so
> allowing for occasional updates would be a way to satisfy much of the
> requirement.
>
> Thanks again
> Colin
>
>
>
> On Nov 5, 2006, at 6:43 PM, Colin O'Brien wrote:
>
> Hi Erik
>
> just to recap, what I see as my current issue...
>
> I am using OPS to generate web content
> and one reason is that I am using OPS to give the site owner the ability
> to update some of the content
> which then requires presentation to be dynamic.
> In reality, the content changes infrequently,
> but as currently implemented, a lot of server work occurs for each request.
>
> As part of this dynamic page generation, various components might have
> to be assembled.
> I'm guessing this makes it difficult for OPS to work out the
> last-modified date.
>
> (As an aside, which may or not be relevant, in testing in Eclipse, I am
> seeing around 75/85% success in OPS caching.
> One time I edited an xslt file, and did a refresh of the page, and the
> cache % was unchanged.)
>
> Anyway, I am looking for a way to reduce the server workload and improve
> the responsiveness of our OPS-generated sites to visitors.
> The way we are exploring here is to use the caching capabilities in
> Apache 2.2...
>
> On Oct 30, 2006, at 9:25 AM, Erik Bruchez wrote:
>
>> Colin,
>>
>> > If I am reading correctly this apache2 document
>> > http://httpd.apache.org/docs/2.3/caching.html then it is
>> > expires and not last-modified that is important for this. It seems
>> > OPS is setting expires, but not in a particular consistent way, and
>> > never for a future date, and I don't seem to have any control over
>> > it. Similarly, OPS is setting the last-modified, as the current
>> > time - unless the hack is used to specify a particular datetime, but
>> > this datetime is not relative to the time the page is first served.
>>
>> This is how the HTTP serializer works. First it tries to find a
>> last-modified date for the resource produced. If it can't find one,
>
> one obvious question would be "how does it do this?"
> but I don't want you to spend all day writing emails.
> The next question would be, if OPS gets the date wrong,
> what about providing a way for the resource to tell OPS the right date
> to use.
> Every "object" in our sites has an element with date-last-modified.
>
>> then it uses "now" (the current time). Then it sets the Last-Modified
>> header with that value, Expires to the same value, and forces a
>> revalidation. This was designed this way in order to ensure that your
>> browser doesn't show cached pages in case they have changed on the
>> server.
>
> Sorry, my knowledge on this part is not detailed enough to under full
> what you mean by revalidation?
>
> This is obviously over-simplistic, even ignorant on my part, but it
> seems to me that the world developed a lot of ways to address this
> question of whether a document can be cached and/or has expired and you
> are short-circuiting that with what may be an over-simplistic approach.
> I'm sure that you had a very real situation in the past where this was
> the right answer, and that that you have seen many more situations than
> I have.
> Unfortunately it doesn't seem to apply in this case.
>
> The resource being returned by our server knows whether it can be cached
> or not.
>
>> The Resource Server processor (used to serve static .js, .css, etc.)
>> works a little differently: it does not force revalidation. It still
>> uses the Last-Modified date if found, but it sets an Expires header
>> with the same heuristic used by web browsers. This was done with the
>> idea that static resources in general would be cacheable by the web
>> browser.
>>
>> In other words: static resources should be cacheable in a way similar
>> to the case where they are served directly by Tomcat or by Apache. But
>> dynamic resources, while they can be cached, always necessitate
>> revalidation from the browser. At least, that's how things were
>> intended to work...
>>
>> (If you us the oxf.http.force-last-modified hack, then we change
>> things a little.)
>>
>> > Am I reading all this correctly, and is there anything (simple) we
>> > can do to make OPS more usable with apqache2 caching?
>>
>> What are you trying to cache again? I fear that I have lost the exact
>> use case.
>>
>> > The optimum would seem to be for OPS not to set the expires, just
>> > the last-modified, thus allowing the caching directives to set the
>> > expires, and for there to be a way for the pipeline to indicate when
>> > the page being served needs to be served as already expired.
>>
>> Now I am wondering if the Expires header should in fact be set when we
>> force revalidation, or if it should be set with the usual heuristic
>> instead. I am not sure about that part. But can't you with Apache
>> override the Expires header to experiment?
>
> As to overriding the Expires answer, if there is a way, I haven't found it.
> Neither by setting values in httpd.conf nor by using http-equiv.
>
> One simple solution might be add to properties.xml so that the expire
> behavior can be controlled
> - off or on/like-browser or on/always-expire or on/resource-defines
>
> And/or
>
> There is already a way to for the resource to communicate to the
> serializer when I value is required.
> That is "http-equiv".
> If a value is present, the serializer could use it, if none, carries on
> as now.
> This could be used for "expires" and/or for "last-modified".
>
> In my research this past week, I've begun to feel last-modified is
> actually what is more important in this process.
> Expires derives from last-modified, and it would still be relevant to be
> able to specify at the global, site and resource levels what the
> last-modified to expires offset is.
> But it is last-modified that the browser is going to use in its request,
> and so it is last-modified that the cache will use if it passes the
> request on to OPS.
>
> Thanks & regards
> Colin
>
> PS just in case you are wondering - if there is a change that
> necessitates the site to expire, then there are means available in
> apache to clear the cache and it would be easy to repopulate it.

--
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/
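As a rough illustration of what a mod_expires rule such as "access plus 2000 minutes" amounts to, the Python sketch below computes the resulting Expires value from a base time. With an "access" base the base time is the moment of the request; with a "modification" base it would be the resource's modification time instead. The function name `expires_header` is an assumption made for this example and is not part of Apache or OPS.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(base_time, minutes):
    """Expires value for a rule of the form "<base> plus <minutes> minutes".

    base_time: timezone-aware datetime (the access or modification time).
    """
    base_time = base_time.astimezone(timezone.utc).replace(microsecond=0)
    return format_datetime(base_time + timedelta(minutes=minutes), usegmt=True)


# A request at 06:18 GMT on Feb 6, 2007 with "access plus 2000 minutes":
access = datetime(2007, 2, 6, 6, 18, tzinfo=timezone.utc)
print(expires_header(access, 2000))
```

Because the rule is keyed on MIME type rather than on the individual document, every response of that type gets the same offset, which is the granularity limit raised later in the thread.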
Hi Erik,
thanks for the reply. I hadn't come across these settings, so I will give them some more thought. But I fear they are not granular enough, in that they take no account of what is in the document being returned.

The best for me might be if I could set something in the page being returned (such as last-modified) and have this get through to the browser in a way the browser acts upon. That way, I could set long expirations for "slowly" changing material, for instance making new material expire in a day so that daily updates appear each day (though not immediately), while more interactive material, such as a form, could be set to expire immediately, so the user always sees the very latest.

Best regards
Colin

On Feb 6, 2007, at 6:18 AM, Erik Bruchez wrote:
> Colin,
>
> I was thinking that you could do this with Apache mod_expires. For
> example, we have been using this kind of configuration:
>
> ExpiresActive on
> ExpiresByType application/x-javascript "access plus 2000 minutes"
>
> Have you tried such an approach, and would it not solve your problem?
>
> Best,
>
> -Erik
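The per-page policy described in this message (a site-wide default plus overrides for particular kinds of page) could be sketched in Python roughly as follows. Everything here is hypothetical: the page kinds, the `LIFETIMES` table and the `cache_headers` function are invented for illustration, and OPS offers no such hook as far as this thread establishes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Hypothetical page kinds and lifetimes in seconds; not an OPS feature.
LIFETIMES = {"events": 24 * 3600, "contact-form": 0}


def cache_headers(kind, served_at, default_lifetime=0):
    """Build Expires and Cache-Control for a page of the given kind.

    Kinds missing from LIFETIMES fall back to default_lifetime, so a mostly
    dynamic site can default to 0 and a mostly static one to a few hours.
    """
    lifetime = LIFETIMES.get(kind, default_lifetime)
    served_at = served_at.astimezone(timezone.utc).replace(microsecond=0)
    expires = served_at + timedelta(seconds=lifetime)
    return {
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"max-age={lifetime}",
    }
```

A lifetime of 0 yields an Expires value equal to the time of serving, which corresponds to the "already expired" behaviour wanted for forms.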
Colin,
> thanks for the reply. I hadn't come across these settings, so I will
> give them some more thought. But I fear they are not granular
> enough, that they are not taking account of what is in the document
> being returned. The best for me might be if I was setting something
> in the page being returned (such as last-modified) and then this got
> through to the browser in a way the browser acted upon. That way, I
> could set long expirations for "slowly" changing material, in
> effect, for instance, making new material expire in a day so that
> daily updates appeared each day (though not immediately) and more
> interactive material, such as a form, could be set to expire
> immediately, so the user always saw the very latest.

My comment was based on your proposal for adding a minimal change, which was a "relative" oxf.http.force-last-modified. I assumed this meant relative to the request or to an existing Last-Modified header returned by Orbeon Forms, so I assumed that this could also be done with the Apache module. Otherwise, the question is, to what should it be relative?

Orbeon Forms produces Last-Modified, Expires and Cache-Control headers based on either:

1. A last-modified date obtained from the pipeline.

2. The timestamp of the request (or rather, that of the serialization).

If #1 is available, which means that the pipeline is cacheable, then you will get a Last-Modified header which typically will be in the past (as opposed to the time of the request). In that case, I assume that expiration can be changed using an Apache module.

Otherwise, in case #2, the value of the Last-Modified header is the time of the request. Here again you can use an Apache module to change expiration.

But I understand that this does not help you much, which seems to mean that a relative oxf.http.force-last-modified wouldn't do anything for you.

As you point out, it seems that what you need is a way of specifying a Last-Modified date programmatically instead of relying on the pipeline to do this for you, in cases where the pipeline is not cacheable.

It seems to me that there are two ways of doing this:

1. One would be to create a sort of intermediary processor, which would look at its input (for example xhtml:html/xhtml:head/xhtml:meta/@http-equiv="last-modified"), and then change the pipeline's last-modified based on those headers. This would be an HTML/XHTML-specific fix.

2. Allow for externally controlling caching headers on oxf:http-serializer. One question is how the input would be passed. It could be attributes on the root element of the binary or text document passed, but then who would produce those attributes? (This is more difficult to control because the previous step is oxf:xml-converter.) With XProc, the new pipeline language, this could be made easier with the use of parameters. But we don't have this yet, so this option doesn't seem very realistic at this point.

Would #1 do the trick for you? Again, the only thing it would do would be to artificially change the pipeline's last-modified date, and that for HTML only (although variants could be used for other content). Still, I wouldn't call this a quick fix. Then you would still have to use an Apache module to change the Expires header if necessary.

-Erik
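The core of option #1 above, reading a last-modified value from the XHTML head and falling back to the current time when none is declared, can be sketched in Python as below. This only illustrates the lookup, under the assumption that the page carries an HTTP-date in a meta element's content attribute; it is not an OPS processor, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

XHTML = "{http://www.w3.org/1999/xhtml}"


def declared_last_modified(xhtml_text):
    """Return the date from <meta http-equiv="last-modified">, or None."""
    root = ET.fromstring(xhtml_text)  # root is expected to be xhtml:html
    for meta in root.findall(f"{XHTML}head/{XHTML}meta"):
        if (meta.get("http-equiv") or "").lower() == "last-modified":
            try:
                return parsedate_to_datetime(meta.get("content", ""))
            except (TypeError, ValueError):
                return None  # present but not a valid HTTP date
    return None


def effective_last_modified(xhtml_text, now):
    """Prefer the date the page declares; otherwise use "now" (case #2)."""
    return declared_last_modified(xhtml_text) or now
```

The fallback mirrors the two cases listed earlier in the message: a date known for the resource when one exists, and otherwise the time of serialization.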