Humble Trader

Saturday, September 09, 2006

Extract HTML Data - Package STA.EXTRACT_HTML_DATA Specification

Package: extract_html_data
Description: Container for all procedures and functions relating to extracting data from semi-structured HTML.

Type:

Name: data_list
Description: Container for HTML table data.
Type: TABLE
Datatype: VARCHAR2(4000)

Function: extract_table

Description: Returns a PL/SQL table containing the data found in HTML table number p_table_no of the PARSED_HTML page. The PARSED_HTML page used is the most recently created block for the given page name.
Parameters:

p_table_no:

Datatype: NUMBER
Direction: IN
Description: The ordinal position of the table within the HTML.

p_web_page:

Datatype: html_pages.name%TYPE
Direction: IN
Description: The page name that the PARSED_HTML data block originated from.

Return:

Datatype: data_list
Description: A PL/SQL table containing the data. Each element holds the data for one table row, converted to character form, as: '[data_1]','[data_2]','[...

Action:

Get run_no keys (parse and raw) for parsed_html from the web page name.
Get the start component sequence for the table of interest.
Get the end component sequence for the table of interest.
For all the HTML components inside the range found above...

    If the start of a new table row...

        Create a new list object.
        Put an opening quote into the object.

    Else If an opening or closing Table Data tag...

        If an opening tag and not the first piece of data...

            Add a comma component separator.

        End; If an opening tag and not the first piece of data.
        Add a quote.

    Else if not a tag...

        Add the data to the object.

    Else do nothing.
    End; If the start of a new table row.

End; For all the HTML components in the range found above.
Return the list.
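
The steps above can be sketched in a few lines of Python. This is an illustration only, not the package body: the flat list of component strings is a hypothetical stand-in for the PARSED_HTML rows, table numbering is assumed to start at 1, and nested tables are not handled. Followed literally, the steps would appear to put two quotes before the first value (one at the row start and one at the first cell's opening tag), so the sketch assumes the row-start quote is not wanted and omits it to match the documented return form.

```python
import re

# Matches a single tag component such as "<tr>", "</td>" or "<td class='x'>".
# Group 1 is "/" for a closing tag, group 2 is the tag name.
TAG = re.compile(r"^<\s*(/?)\s*([a-zA-Z0-9]+)[^>]*>$")


def table_range(components, table_no):
    """Return (start, end) indexes of the table_no-th table (1-based, an assumption)."""
    seen = 0
    start = None
    for i, comp in enumerate(components):
        m = TAG.match(comp)
        if not m or m.group(2).lower() != "table":
            continue
        if m.group(1) == "":            # opening <table>
            seen += 1
            if seen == table_no:
                start = i
        elif start is not None:         # first </table> after the start
            return start, i
    raise ValueError(f"table {table_no} not found")


def extract_table(components, table_no):
    """Build one quoted, comma-separated string per table row."""
    start, end = table_range(components, table_no)
    rows = []
    first_cell = True
    for comp in components[start + 1:end]:
        m = TAG.match(comp)
        if m is None:                   # not a tag: add the data
            if rows:
                rows[-1] += comp
            continue
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if name == "tr" and not closing:
            rows.append("")             # start of a new table row
            first_cell = True
        elif name == "td" and rows:
            if not closing and not first_cell:
                rows[-1] += ","         # separator before every cell but the first
            rows[-1] += "'"             # quote on both opening and closing tags
            if not closing:
                first_cell = False
        # any other tag: do nothing
    return rows


page = ["<table>", "<tr>", "<td>", "a", "</td>", "<td>", "b", "</td>", "</tr>",
        "<tr>", "<td>", "c", "</td>", "</tr>", "</table>"]
print(extract_table(page, 1))
```

Keeping the range lookup separate from the row builder mirrors the split in the Action list between finding the start and end component sequences and walking the components between them.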


Building Oracle-based Data Warehouses on Linux for fun and profit. We will build and configure a Linux server, install Oracle software, design and build a data warehouse, perform ongoing maintenance on the data warehouse and build a toolkit to support this, and learn to speak eloquently on the subject of data warehousing so as to attract members of the opposite sex.

Name: Cat Play Crew; Location: Port Macquarie, New South Wales, Australia
