What would the architecture look like?
My guess is that a servlet would query the Lucene index and display the results as XML, for OPS to present.
Richard Braman
http://www.taxcodesoftware.org

--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
ObjectWeb mailing lists service home page: http://www.objectweb.org/wws
On 4/10/06, Richard Braman <[hidden email]> wrote:
> What would the architecture look like?
> My guess is that a servlet would query the lucene index and display the
> results as xml, for OPS to present.

Hi Richard,

We haven't done much with Lucene here, but I think Eric van der Vlist did, and maybe even wrote a processor for Lucene. Maybe Eric will comment directly on this.

Alex
--
Blog (XML, Web apps, Open Source): http://www.orbeon.com/blog/
Hi,
On Thursday, 13 April 2006 at 15:45 -0700, Alessandro Vernet wrote:
> On 4/10/06, Richard Braman <[hidden email]> wrote:
> > What would the architecture look like?
> > My guess is that a servlet would query the lucene index and display the
> > results as xml, for OPS to present.
>
> Hi Richard,
>
> We haven't done much with Lucene here, but I think Eric van der Vlist
> did, and maybe even wrote a processor for Lucene. Maybe Eric will
> comment directly on this.

I have done two different implementations integrating Lucene with OPS.

The first one is for XMLfr and you can play with it at http://beta.xmlfr.org/orbeon/lucene/cherche .

This one uses an OPS processor to query Lucene indexes. The input of this query processor contains the parameters of the query and its output is an RSS 1.0 document (see http://beta.xmlfr.org/orbeon/lucene/rss?query=orbeon+presentationserver for an example of such an output).

The indexing is done outside of OPS through a crontab process running a Java program.

This implementation could be used as a basis for other applications, but some features are currently rather specific to XMLfr. For instance, you can select between two sets of indexes, one of which computes relevance by taking the date and the document type into account (an article comes before a wire item, which comes before a mailing list message, and newer items come before older ones).

The second implementation is still a work in progress, although I have a proof of concept that works quite well.

The idea is to manage everything within OPS.

The query processor is derived from the one developed for XMLfr and adapted to be less advanced but more generic.

The indexing is done as a background task within OPS through the scheduler.

The indexing uses a modified MIME type XML database that associates OPS pipelines with media types; the algorithms for indexing the different file types are themselves defined as OPS pipelines.

I have started developing processors (called from these pipelines) to index Word, Excel, OpenOffice, PDF, XML and HTML documents.
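[Editor's sketch] The media-type lookup described above might look roughly like the following. This is only an illustration of the idea, not Eric's code: his implementation stores the mapping in an XML database, while this sketch uses an in-memory map. The class name `IndexingPipelineRegistry` and the `oxf:/indexing/...` pipeline paths are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical registry that associates media types with indexing pipelines. */
final class IndexingPipelineRegistry {
    private final Map<String, String> pipelines = new LinkedHashMap<>();

    /** Associates a media type with the pipeline that knows how to index it. */
    void register(String mediaType, String pipelinePath) {
        pipelines.put(normalize(mediaType), pipelinePath);
    }

    /** Returns the pipeline for a media type, or empty if the type is not indexed. */
    Optional<String> pipelineFor(String mediaType) {
        return Optional.ofNullable(pipelines.get(normalize(mediaType)));
    }

    // Drop parameters such as "; charset=UTF-8" and compare case-insensitively.
    private static String normalize(String mediaType) {
        int semicolon = mediaType.indexOf(';');
        String bare = semicolon >= 0 ? mediaType.substring(0, semicolon) : mediaType;
        return bare.trim().toLowerCase(Locale.ROOT);
    }

    /** Example content; the pipeline paths are made up for illustration. */
    static IndexingPipelineRegistry example() {
        IndexingPipelineRegistry registry = new IndexingPipelineRegistry();
        registry.register("application/pdf", "oxf:/indexing/pdf.xpl");
        registry.register("text/html", "oxf:/indexing/html.xpl");
        registry.register("application/xml", "oxf:/indexing/xml.xpl");
        return registry;
    }
}
```

An indexer built this way would detect each file's media type, look up the pipeline, and skip files for which `pipelineFor` returns empty.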

The configuration (directories to index, priority of the indexing process, ...) is held in an XML configuration file that can be updated through a configuration processor while the server is running.

This system could be used to develop a Java, OPS-based alternative to Beagle (http://beaglewiki.org/Main_Page), but I am sure many other applications could take advantage of these processors.

I would be happy to publish all of this under open source licences. Unfortunately, these developments are still very experimental and not well documented, and I don't expect to have enough time to work on them anytime soon (unless, of course, some funding could be raised that would allow me to postpone other paid activities).

To summarize, I am sure Lucene + OPS make a very interesting couple, but I don't have as much to share as I'd like to right now!

Eric
--
GPG-PGP: 2A528005
Did you know it? Python has now a Relax NG (partial) implementation.
http://advogato.org/proj/xvif/
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(ISO) RELAX NG   ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------
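
[Editor's sketch] On the query side, Eric's message says the processor takes the query parameters as input and returns an RSS 1.0 document. The Java below shows only the output half of that: turning a list of hits into RSS 1.0 with the JDK's `XMLStreamWriter`. The Lucene search itself is left out, and the class name `RssSearchResults`, the `Hit` type and the channel wording are assumptions made for illustration, not taken from the XMLfr code.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical serializer: search hits in, RSS 1.0 document out. */
final class RssSearchResults {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String RSS = "http://purl.org/rss/1.0/";

    /** One search hit; in a real processor these would come from Lucene. */
    static final class Hit {
        final String url;
        final String title;
        Hit(String url, String title) { this.url = url; this.title = title; }
    }

    static String toRss(String channelUrl, String query, List<Hit> hits) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartDocument();
            w.writeStartElement("rdf", "RDF", RDF);
            w.writeNamespace("rdf", RDF);
            w.writeDefaultNamespace(RSS);

            // The channel describes the result set and lists the items in order.
            w.writeStartElement("channel");
            w.writeAttribute("rdf", RDF, "about", channelUrl);
            text(w, "title", "Search results for: " + query);
            text(w, "link", channelUrl);
            text(w, "description", hits.size() + " hit(s)");
            w.writeStartElement("items");
            w.writeStartElement("rdf", "Seq", RDF);
            for (Hit hit : hits) {
                w.writeEmptyElement("rdf", "li", RDF);
                w.writeAttribute("rdf", RDF, "resource", hit.url);
            }
            w.writeEndElement(); // rdf:Seq
            w.writeEndElement(); // items
            w.writeEndElement(); // channel

            // One item element per hit, in the same order as the rdf:Seq.
            for (Hit hit : hits) {
                w.writeStartElement("item");
                w.writeAttribute("rdf", RDF, "about", hit.url);
                text(w, "title", hit.title);
                text(w, "link", hit.url);
                w.writeEndElement();
            }
            w.writeEndElement(); // rdf:RDF
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not serialize results", e);
        }
        return out.toString();
    }

    // Writes <name>value</name>; the writer escapes markup characters in value.
    private static void text(XMLStreamWriter w, String name, String value)
            throws XMLStreamException {
        w.writeStartElement(name);
        w.writeCharacters(value);
        w.writeEndElement();
    }
}
```

Producing a standard feed format rather than an ad hoc result format means the same output can feed both an OPS page and any RSS reader.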
This brings up an interesting question. In the event that we were to send an XPL file over HTTP, what media type should we be using? 'application/xpl+xml' springs immediately to mind, but I don't think that is registered yet. (Is it?) Should I just stick with the standard (text|application)/xml, or is there a need for a new media type for XPL?

Daniel E. Renfer (http://kronkltd.net/)

On 4/14/06, Eric van der Vlist <[hidden email]> wrote:
> I have done two different implementations integrating Lucene with OPS.
> [...]
> The indexing uses a modified mime type XML database that associates OPS
> pipelines to media types and the algorithms to index different file
> types are defined as OPS pipelines.
> [...]
On 4/14/06, Daniel E. Renfer <[hidden email]> wrote:
> This brings up an interesting question. In the event that we were to
> send an XPL file over HTTP, what media type should we be using?
> 'application/xpl+xml' springs immediately to mind, but I don't think
> that is registered yet. (Is it?) Should I just stick with the standard
> (text|application)/xml, or is there a need for a new media type for
> XPL?

Hi Daniel,

I would stick with the standard application/xml, unless you need the receiving side to take some special action when XPL is being received, i.e. an action different than the one taken for other XML files.

Alex
--
Blog (XML, Web apps, Open Source): http://www.orbeon.com/blog/
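
[Editor's sketch] Following Alex's advice, a client that posts an XPL document would simply label it `application/xml`. The snippet below builds such a request with the JDK's `java.net.http` API (Java 11 or later) without sending it. The class name `XplUpload` is hypothetical, and the target URL is supplied by the caller.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that prepares an HTTP POST carrying an XPL document. */
final class XplUpload {
    // Generic XML media type, as suggested in the reply above.
    static final String XPL_MEDIA_TYPE = "application/xml";

    /** Builds (but does not send) a POST request whose body is the XPL document. */
    static HttpRequest buildRequest(URI target, String xplDocument) {
        return HttpRequest.newBuilder(target)
                .header("Content-Type", XPL_MEDIA_TYPE)
                .POST(HttpRequest.BodyPublishers.ofString(xplDocument, StandardCharsets.UTF_8))
                .build();
    }
}
```

If the receiving side ever does need to treat XPL differently from other XML, only the `XPL_MEDIA_TYPE` constant would have to change.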