extracting data from excel (or csv)

Alistair Miles-2
Hi folks,

I'm interested in extracting data from excel (and csv) files and
converting them into XML so I can pass them on to some xforms.

I found the section [1] on excel processors on the web site, but it
mentions that the processors are now deprecated, pointing to a link on
the converters page [2]. However, [2] makes no mention of excel, and
the #xls-converters anchor isn't present.

Can I ask, what's the latest status of Excel- and CSV-related
processors/converters in Orbeon?

Thanks in advance,

Alistair

[1] http://www.orbeon.com/orbeon/doc/processors-charts-spreadsheets#excel
[2] http://www.orbeon.com/orbeon/doc/processors-converters#xls-converters
--
Alistair Miles
Centre for Genomics and Global Health <http://cggh.org>
The Wellcome Trust Centre for Human Genetics
Roosevelt Drive
Oxford
OX3 7BN
United Kingdom
Web: http://purl.org/net/aliman
Email: [hidden email]
Tel: +44 (0)1865 287669


--
You receive this message as a subscriber of the [hidden email] mailing list.
To unsubscribe: mailto:[hidden email]
For general help: mailto:[hidden email]?subject=help
OW2 mailing lists service home page: http://www.ow2.org/wws

Re: extracting data from excel (or csv)

Alessandro Vernet
Administrator
Alistair,

Those processors are indeed deprecated, as in "not maintained". You
are saying that you'd like to extract information from Excel files. Do
you know the exact format of those files (Excel 97, Microsoft Office
XML format, or Office Open XML)?

Alex



--
Orbeon Forms - Web forms, open-source, for the Enterprise -
http://www.orbeon.com/
My Twitter: http://twitter.com/avernet


--
Follow Orbeon on Twitter: @orbeon
Follow me on Twitter: @avernet

Re: Re: extracting data from excel (or csv)

Alistair Miles-2
Hi Alex,

Thanks for getting back to me. Excel 97 (and CSV) would be a great
start.

To give you a bit more information, in general we'd like to be able to
extract XML data from an excel file without relying on prior
annotation of the excel file.

In particular, we'd like to be able to generate an XML representation
of data from excel files where each sheet is laid out as a regular
table where the first row provides the column headings.

We don't want to have to annotate the Excel file first; we'd rather
just assume that each sheet is laid out as a regular
table, perform the extraction based on that assumption, and then
either detect any obvious errors (e.g., the sheet is laid out in a
very unusual way) or just present the results back to the user (via an
xform) and ask them to confirm that the data was extracted correctly.

(We may then use other xforms to work with the extracted data, e.g.,
to allow the user to edit it.)
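
A minimal sketch of the extraction described above, for the CSV case
only, assuming each table has its column headings in the first row. It
uses Python's standard csv and xml.etree.ElementTree modules rather
than an Orbeon processor or POI. The function names (to_element_name,
table_to_xml), the <table>/<row> element names, and the rule for
turning a heading into an element name are all hypothetical choices
made for illustration, not anything Orbeon provides.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def to_element_name(heading, index):
    """Turn a column heading into a usable XML element name.

    Hypothetical rule: runs of unsupported characters become "_",
    empty headings become "columnN", and a name that does not start
    with a letter or underscore gets a leading "_".
    """
    name = re.sub(r"[^A-Za-z0-9_.-]+", "_", heading.strip()).strip("_")
    if not name:
        return f"column{index + 1}"
    if not (name[0].isalpha() or name[0] == "_"):
        name = "_" + name
    return name


def table_to_xml(text, sheet_name="sheet"):
    """Convert CSV text into a <table> element plus a list of problems.

    Assumes the first row holds the headings. Rows whose cell count
    differs from the heading count are reported, not converted.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        raise ValueError("no header row found")
    headings = [to_element_name(h, i) for i, h in enumerate(rows[0])]
    root = ET.Element("table", {"name": sheet_name})
    problems = []
    for row_no, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            continue  # skip completely empty rows
        if len(row) != len(headings):
            problems.append(
                f"row {row_no}: expected {len(headings)} cells, found {len(row)}"
            )
            continue
        row_el = ET.SubElement(root, "row")
        for name, cell in zip(headings, row):
            ET.SubElement(row_el, name).text = cell
    return root, problems


# Made-up sample data; the second data row is deliberately too short.
sample = "Sample ID,Country,Age\nS1,Kenya,4\nS2,Mali\nS3,Ghana,7\n"
root, problems = table_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
print(problems)
```

The problems list is one way to surface the "obvious errors" case: the
XML and the list of rejected rows could both be handed to an xform so
the user can confirm or correct the result. An Excel 97 version would
presumably walk sheets, rows and cells through POI in the same way.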

Any pointers or suggestions for how we might achieve this with Orbeon
(+POI?) would be very much appreciated.

I have had a look at POI and can imagine it wouldn't be too difficult
to write an orbeon processor to do this, but was hoping for a leg up
:)

Thanks,

Alistair




Re: Re: Re: extracting data from excel (or csv)

Alessandro Vernet
Administrator
Alistair,

I see. Like you suggested, I think writing your own processor using
the POI library is a good way to go, especially if you need to support
the Excel 97 format.

Alex


