Class: Extractor

Extracts data from HTML pages based on a schema. A schema consists of the following:

  • Names of the fields to be fetched
  • A CSS rule for each field
  • A data extractor function for each field
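
As a rough illustration of the three parts listed above, a schema could be written as a plain object keyed by field name. This is only a sketch: the key names `selector` and `extract`, and the helper `applyExtractors`, are assumptions made for this example and are not taken from the library's documentation.

```javascript
// Hypothetical schema shape. The keys `selector` and `extract` are
// illustrative assumptions, not the library's documented format.
const articleSchema = {
  // Field name: "title"
  title: {
    selector: "h1.title",                 // CSS rule for the field
    extract: (text) => text.trim(),       // data extractor function
  },
  // Field name: "price"
  price: {
    selector: "span.price",
    extract: (text) => Number(text.replace(/[^0-9.]/g, "")),
  },
};

// Hypothetical helper: runs each field's extractor function on raw text
// that is assumed to have already been selected by the field's CSS rule.
function applyExtractors(schema, rawTextByField) {
  const result = {};
  for (const [field, rule] of Object.entries(schema)) {
    result[field] = rule.extract(rawTextByField[field]);
  }
  return result;
}

const sample = applyExtractors(articleSchema, {
  title: "  Hello  ",
  price: "$12.50",
});
console.log(sample);
```

Keeping the CSS rule and the extractor function side by side under each field name makes it easy to see how one field is located and how its raw text is converted.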

Constructor

new Extractor(schema) → {undefined}

Creates an Extractor for the given schema.

Parameters:
Name     Type     Description
schema   Object   Schema definition. A schema definition can contain nested schemas at any level.
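
To make the idea of nesting concrete, the sketch below walks a nested schema and lists every field path it contains. It assumes, purely for illustration, that a leaf field is an object with a string `selector` key and that any other object is a nested schema; the library's real schema format may differ, and `listFieldPaths` is a hypothetical helper.

```javascript
// Hypothetical nested schema. The `selector` key is an illustrative
// assumption, not the library's documented format.
const pageSchema = {
  title: { selector: "h1" },
  author: {
    name: { selector: ".author .name" },
    profile: {
      url: { selector: ".author a" },
    },
  },
};

// Recursively collect dotted paths of all leaf fields.
// A value with a string `selector` is treated as a leaf field;
// anything else is treated as a nested schema.
function listFieldPaths(schema, prefix = "") {
  const paths = [];
  for (const [key, value] of Object.entries(schema)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value.selector === "string") {
      paths.push(path);
    } else {
      paths.push(...listFieldPaths(value, path));
    }
  }
  return paths;
}

const fieldPaths = listFieldPaths(pageSchema);
console.log(fieldPaths);
```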

Returns:
Type
undefined

Methods

extract(html) → {Object}

Extracts data from an HTML string or page.

Parameters:
Name   Type     Description
html   String   HTML page.

Returns:

The extracted data, structured according to the schema definition.

Type
Object
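
The call pattern described here (construct with a schema, then call `extract` with an HTML string) can be sketched with a small stand-in class. `SketchExtractor` is not the library's Extractor: it is a toy written for this example, it assumes a hypothetical schema shape with `selector` and `extract` keys, and it only understands bare tag-name selectors matched with a regular expression.

```javascript
// Toy stand-in for illustration only. It is NOT the real Extractor class.
// Assumptions: each field has a `selector` that is a bare tag name and an
// `extract` function that converts the matched text.
class SketchExtractor {
  constructor(schema) {
    this.schema = schema;
  }

  // Returns an object with one key per schema field.
  extract(html) {
    const data = {};
    for (const [field, rule] of Object.entries(this.schema)) {
      // Match the first <tag ...>text</tag> occurrence for the tag name.
      const pattern = new RegExp(
        `<${rule.selector}[^>]*>([^<]*)</${rule.selector}>`,
        "i"
      );
      const match = html.match(pattern);
      data[field] = match ? rule.extract(match[1]) : null;
    }
    return data;
  }
}

const extractor = new SketchExtractor({
  heading: { selector: "h1", extract: (text) => text.trim() },
  year: { selector: "time", extract: (text) => Number(text) },
});

const extracted = extractor.extract(
  "<html><body><h1> Release notes </h1><time>2020</time></body></html>"
);
console.log(extracted);
```

A real implementation would rely on a proper HTML parser and full CSS selector support; the regular expression here exists only to keep the sketch self-contained.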