patch to reduce excessive org.orbeon.oxf.xforms.mip.* object creation

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

patch to reduce excessive org.orbeon.oxf.xforms.mip.* object creation

Adrian Baker
As we've optimised other parts of our application we've been increasing the number of requests running through Orbeon.

In doing so garbage collection has become more & more of an overhead as objects have been created at a greater rate, to the point where our application will fill up the heap & then spend much of the time doing full garbage collects (even with a very small cache).

Looking at the heap shows *massive* numbers of org.orbeon.oxf.xforms.mip.* objects (eg over a million in a 384MB heap). Most of these are unnecessarily created & never used. Attached is a fairly simple patch to lazily create these, which eleminates most of them. In my simple test form 6,000-odd mip objects (!) dropped to less than a hundred in a single form invocation. Obviously it will vary by form, but in general only a small percentage of instance nodes will have binds, and
most of these will have only a subset of the required/relevant/readonly/constraint properties.

The TypeModelItemProperty wrapping a String is unnecessary, since
TypeModelItemProperty.isSet is never called - InstanceData can just have a String field. Readonly & RelevantModelItemProperty could be eliminated altogether too and just boolean primitives used instead, just by making BooleanModelItemProperty.hasChangedFromDefault a static method.

It works on our forms, however originally I wrote the patch against 3.5.1 source: I've updated it for the current HEAD, but haven't retested. Not too many differences however.

It could be improved: it relies on runtime checking to prevent calls to (mip-object).set(xxx) from outside InstanceData: would be better to make this a compile check by putting InstanceData into the .mip package and making it package visible only.

There are still a lot of unnecessary objects created. From the numbers in the heap, it seems *every node* in an instance document results in a creation of at least
- InstanceData
- HashMap entry, presumably idToNodeMap in XFormsUtils.setInitialDecorationWorker

I wonder if the same approach of lazy creation-until-referenced can be taken for InstanceData objects as well. Also does this note in the setInitialDecorationWorker method:

    // NOTE: ids are only used by the legacy XForms engine

indicate that perhaps the
idToNodeMap could be retired?

Also dom4j seems to generate a ton of objects - particularly ArrayLists. Not sure what options there are to tackle this.

Also can I make another plea for a Subversion repository... !

Adrian




--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws

org.orbeon.oxf.xforms.reduce-mip-object-patch.zip (51K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: patch to reduce excessive org.orbeon.oxf.xforms.mip.* object creation

Erik Bruchez
Administrator
Adrian,

(Resending since this reply doesn't seem to have made it to ops-users.)

Thanks a lot, that sounds great!

I think that you are right on the spot:

* InstanceData objects should only be created when needed, i.e. when
    there is some information different from the default to store.

* idToNodeMap is only used with XForms Classic, which we should
    retire. However, I thought that the Map was not even created with
    XForms NG, since XFormsUtils.setInitialDecoration() passes a null
    for the map. Maybe the Map is another one?

* We have found dom4j to be quite inefficient in general. There is one
    place where it is used unnecessarily, and that is to represent the
    "static state" document. Here, using TinyTree would be much
    better. In other places, we could in the future move maybe to XOM,
    assuming it is a better model than dom4j. But that's further away.

* I also agree with TypeModelItemProperty: I think that some code has
    changed there recently too, when we did some work with validation,
    and some stuff has become unneeded.

We will definitely want to integrate this. But we can't guarantee that
it will happen before we release 3.6 as that would be a last-minute
change. I have added an RFE for this:

http://forge.objectweb.org/tracker/index.php?func=detail&aid=307293&group_id=168&atid=350207

Regarding Subversion, we'll think about it ;-) The main objections
were, I think:

* It will be work to migrate our automatic build system and our
    synchronization with the ObjectWeb (now OW2) repository.

* CVS "just works" at the moment.

* My personal experience with SVN, which I use regularly for
    documents, is less than stellar: corrupted repositories, snv cleanup
    all the time, issues with encodings on the Mac, and more.

All in all though, that's great work, and I can't wait to see this
integrated in the codebase!

-Erik

Adrian Baker wrote:

> As we've optimised other parts of our application we've been increasing
> the number of requests running through Orbeon.
>
> In doing so garbage collection has become more & more of an overhead as
> objects have been created at a greater rate, to the point where our
> application will fill up the heap & then spend much of the time doing
> full garbage collects (even with a very small cache).
>
> Looking at the heap shows *massive* numbers of
> org.orbeon.oxf.xforms.mip.* objects (eg over a million in a 384MB heap).
> Most of these are unnecessarily created & never used. Attached is a
> fairly simple patch to lazily create these, which eleminates most of
> them. In my simple test form 6,000-odd mip objects (!) dropped to less
> than a hundred in a single form invocation. Obviously it will vary by
> form, but in general only a small percentage of instance nodes will have
> binds, and most of these will have only a subset of the
> required/relevant/readonly/constraint properties.
>
> The TypeModelItemProperty wrapping a String is unnecessary, since
> TypeModelItemProperty.isSet is never called - InstanceData can just have
> a String field. Readonly & RelevantModelItemProperty could be eliminated
> altogether too and just boolean primitives used instead, just by making
> BooleanModelItemProperty.hasChangedFromDefault a static method.
>
> It works on our forms, however originally I wrote the patch against
> 3.5.1 source: I've updated it for the current HEAD, but haven't
> retested. Not too many differences however.
>
> It could be improved: it relies on runtime checking to prevent calls to
> (mip-object).set(xxx) from outside InstanceData: would be better to make
> this a compile check by putting InstanceData into the .mip package and
> making it package visible only.
>
> There are still a lot of unnecessary objects created. From the numbers
> in the heap, it seems *every node* in an instance document results in a
> creation of at least
> - InstanceData
> - HashMap entry, presumably idToNodeMap in
> XFormsUtils.setInitialDecorationWorker
>
> I wonder if the same approach of lazy creation-until-referenced can be
> taken for InstanceData objects as well. Also does this note in the
> setInitialDecorationWorker method:
>
>     // NOTE: ids are only used by the legacy XForms engine
>
> indicate that perhaps the idToNodeMap could be retired?
>
> Also dom4j seems to generate a ton of objects - particularly ArrayLists.
> Not sure what options there are to tackle this.
>
> Also can I make another plea for a Subversion repository... !
>
> Adrian
>
>

--
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/



--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws
Reply | Threaded
Open this post in threaded view
|

Re: patch to reduce excessive org.orbeon.oxf.xforms.mip.* object creation

Erik Bruchez
Administrator
Adrian & all,

We have now integrated into Orbeon Forms changes to optimize memory
usage and object count related to Model Item Properties (MIP).

As a reminder, the whole idea of these changes is that by not
performing the annotations the way we do it now, there is potential
for saving on memory usage, object creation and garbage collection.

I have not directly integrated Adrian's patch, since as we discussed
there was potential for more savings by making InstanceData creation
lazy as well, and also to not use *ModelItemProperty objects at all.

So I jumped directly to such a solution and the result is an
InstanceData implementation where:

* InstanceData objects are created lazily on nodes when information
   that differs from the default is needed.

* We keep as much as possible information as simple boolean variables
   instead of using dependent objects. All the *ModelItemProperty are
   gone.

* We no longer annotate all element and attribute nodes with
   InstanceData except in the case of XForms Classic which needs to
   assign an id to every node. XForms Classic still gains from the fact
   that we don't create all the *ModelItemProperty objects. But please
   don't use XForms Classic anyway ;-)

As a corollary, I had to fix instance replacement annotations as
well. Previously, upon instance replacement, all element and attribute
nodes in the instance were marked as having changed value (why we do
this altogether is another topic). This was not an issue since nodes
has InstanceData associated anyway, so this was just setting a
flag. However, now that we are aiming as being as lazy as possible,
this is not optimal: in fact only the nodes to which value controls
are bound should be marked. So I made this change as well, and by the
way made a couple other optimizations related to MIP events sending.

Adrian, if you have a chance, is there a way you could try out these
changes and see if they improve things in your application as well as
or, hopefully, better than with just the lazy *ModelItemProperty
patch?

-Erik

Erik Bruchez wrote:

> Adrian,
>
> (Resending since this reply doesn't seem to have made it to ops-users.)
>
> Thanks a lot, that sounds great!
>
> I think that you are right on the spot:
>
> * InstanceData objects should only be created when needed, i.e. when
>    there is some information different from the default to store.
>
> * idToNodeMap is only used with XForms Classic, which we should
>    retire. However, I thought that the Map was not even created with
>    XForms NG, since XFormsUtils.setInitialDecoration() passes a null
>    for the map. Maybe the Map is another one?
>
> * We have found dom4j to be quite inefficient in general. There is one
>    place where it is used unnecessarily, and that is to represent the
>    "static state" document. Here, using TinyTree would be much
>    better. In other places, we could in the future move maybe to XOM,
>    assuming it is a better model than dom4j. But that's further away.
>
> * I also agree with TypeModelItemProperty: I think that some code has
>    changed there recently too, when we did some work with validation,
>    and some stuff has become unneeded.
>
> We will definitely want to integrate this. But we can't guarantee that
> it will happen before we release 3.6 as that would be a last-minute
> change. I have added an RFE for this:
>
>
http://forge.objectweb.org/tracker/index.php?func=detail&aid=307293&group_id=168&atid=350207 


>
>
> Regarding Subversion, we'll think about it ;-) The main objections
> were, I think:
>
> * It will be work to migrate our automatic build system and our
>    synchronization with the ObjectWeb (now OW2) repository.
>
> * CVS "just works" at the moment.
>
> * My personal experience with SVN, which I use regularly for
>    documents, is less than stellar: corrupted repositories, snv cleanup
>    all the time, issues with encodings on the Mac, and more.
>
> All in all though, that's great work, and I can't wait to see this
> integrated in the codebase!
>
> -Erik
>
> Adrian Baker wrote:
>> As we've optimised other parts of our application we've been
>> increasing the number of requests running through Orbeon.
>>
>> In doing so garbage collection has become more & more of an overhead
>> as objects have been created at a greater rate, to the point where our
>> application will fill up the heap & then spend much of the time doing
>> full garbage collects (even with a very small cache).
>>
>> Looking at the heap shows *massive* numbers of
>> org.orbeon.oxf.xforms.mip.* objects (eg over a million in a 384MB
>> heap). Most of these are unnecessarily created & never used. Attached
>> is a fairly simple patch to lazily create these, which eleminates most
>> of them. In my simple test form 6,000-odd mip objects (!) dropped to
>> less than a hundred in a single form invocation. Obviously it will
>> vary by form, but in general only a small percentage of instance nodes
>> will have binds, and most of these will have only a subset of the
>> required/relevant/readonly/constraint properties.
>>
>> The TypeModelItemProperty wrapping a String is unnecessary, since
>> TypeModelItemProperty.isSet is never called - InstanceData can just
>> have a String field. Readonly & RelevantModelItemProperty could be
>> eliminated altogether too and just boolean primitives used instead,
>> just by making BooleanModelItemProperty.hasChangedFromDefault a static
>> method.
>>
>> It works on our forms, however originally I wrote the patch against
>> 3.5.1 source: I've updated it for the current HEAD, but haven't
>> retested. Not too many differences however.
>>
>> It could be improved: it relies on runtime checking to prevent calls
>> to (mip-object).set(xxx) from outside InstanceData: would be better to
>> make this a compile check by putting InstanceData into the .mip
>> package and making it package visible only.
>>
>> There are still a lot of unnecessary objects created. From the numbers
>> in the heap, it seems *every node* in an instance document results in
>> a creation of at least
>> - InstanceData
>> - HashMap entry, presumably idToNodeMap in
>> XFormsUtils.setInitialDecorationWorker
>>
>> I wonder if the same approach of lazy creation-until-referenced can be
>> taken for InstanceData objects as well. Also does this note in the
>> setInitialDecorationWorker method:
>>
>>     // NOTE: ids are only used by the legacy XForms engine
>>
>> indicate that perhaps the idToNodeMap could be retired?
>>
>> Also dom4j seems to generate a ton of objects - particularly
>> ArrayLists. Not sure what options there are to tackle this.
>>
>> Also can I make another plea for a Subversion repository... !
>>
>> Adrian
>>
>>
>
>

--
Orbeon Forms - Web Forms for the Enterprise Done the Right Way
http://www.orbeon.com/



--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws