Indigo DQM's HTML To XML Data Web Scraper and Web Crawler
can be used to extract Data from HTML Web Pages.
Web Scraping is used for content scraping and data
extraction, and as a component of applications used
for web indexing, data mining, Email and Link extraction,
online price change monitoring and price comparison,
product review scraping (to watch the competition),
gathering real estate listings, weather data monitoring,
website change detection, research, tracking online
presence and reputation, web mashups, and web data integration.
Web Crawling is the process of iteratively finding
and fetching web links from a website. Using the web
page XML data processed by Indigo DQM, the Indigo DRS
Reporting Engine can create powerful reports
for business intelligence on competitors' websites.
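The crawling process described above can be sketched in
Python as a breadth-first loop. This is only an illustration
of the general technique, not Indigo DQM's implementation.
The names LinkParser, crawl, fetch and SITE are hypothetical,
and the three-page site is held in memory so the sketch runs
without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl that stays on the start URL's host.

    `fetch` is a caller-supplied function mapping a URL to HTML
    text, or None when the page cannot be retrieved.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urljoin(url, href).split("#")[0]
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical site used in place of real HTTP requests.
SITE = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.test/a.html": '<a href="/b.html">B</a> <a href="http://other.test/">X</a>',
    "http://example.test/b.html": '<a href="/">Home</a>',
}

pages = crawl("http://example.test/", SITE.get)
```

In a real crawl, fetch would wrap an HTTP client. The seen
set stops pages that link to each other from being fetched
twice, and max_pages bounds the crawl.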
Indigo DQM can harvest a single URL or an entire website's
content using the web crawler and then use XQuery to
mine data from it, for example products, prices, and
contacts, delivering real business intelligence.
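As a rough illustration of mining products and prices from
a page that has already been converted to XML, the sketch
below uses the limited XPath subset in Python's standard
xml.etree.ElementTree module as a stand-in for XQuery. The
page markup, the class names "product", "name" and "price",
and the function extract_products are all assumptions made
for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical page, already converted to well-formed XML.
PAGE = """
<html><body>
  <div class="product"><span class="name">Widget</span><span class="price">9.99</span></div>
  <div class="product"><span class="name">Gadget</span><span class="price">24.50</span></div>
  <div class="advert"><span class="name">Ignore me</span></div>
</body></html>
"""


def extract_products(xml_text):
    """Return (name, price) tuples for every product block."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    # Select only <div> elements whose class attribute is "product".
    for div in root.findall(".//div[@class='product']"):
        name = div.findtext("span[@class='name']")
        price = div.findtext("span[@class='price']")
        rows.append((name, float(price)))
    return rows


products = extract_products(PAGE)
```

The predicate [@class='product'] keeps the advert block out
of the result, which is the same kind of filtering an XQuery
expression would perform.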
The Indigo DRS Report Designer can be used to create
advanced reports and outputs from scraped data.
Web Scrape Designer
Indigo DQM Web Scrape Designer allows complex XQueries
and XPath statements to be executed to extract or scrape
elements from HTML Web Pages or Files.
XQuery is a query and functional programming language
that is designed to query and transform collections
of structured and unstructured Data, usually in the
form of XML (Extensible Markup Language).
The Web Scrape Designer is a visual aid
that allows web page elements to be clicked, selected
or highlighted to automatically generate XPath statements.
XQuery is based on the XQuery and
XPath Data Model (XDM), which uses a tree-structured
model of the information content of an XML document.
Web page data can be queried using XQuery,
for example by using an XQuery function to normalize
space and extract the plain text.
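The effect of normalizing space and extracting plain text
can be approximated in Python as shown below. This is a
rough equivalent of the XPath normalize-space() function
applied to an element's string value, not the XQuery engine
itself. The helper name normalize_space and the sample
fragment are hypothetical.

```python
import xml.etree.ElementTree as ET


def normalize_space(element):
    """Rough equivalent of XPath normalize-space(string(element)).

    Joins all descendant text, trims both ends and collapses each
    run of whitespace to a single space. Note that str.split()
    treats any Unicode whitespace as a separator, which is slightly
    broader than the XPath definition.
    """
    text = "".join(element.itertext())
    return " ".join(text.split())


fragment = ET.fromstring("<p>  Indigo\n   <b>DQM</b>   web \t scraper </p>")
plain = normalize_space(fragment)
```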
Data Tree View
Selecting a node in the Data Tree will update the current
XPath for that node.
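One way a tool could derive an XPath from a selected tree
node is to walk up to the root and record each element's
position among its same-named siblings. The sketch below
shows that idea with Python's standard library. It is an
assumption about the general approach, not a description of
how Indigo DQM computes its paths, and xpath_for is a
hypothetical helper.

```python
import xml.etree.ElementTree as ET


def xpath_for(root, target):
    """Return an absolute, positional XPath for `target` under `root`."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parents[node]
        same_tag = [c for c in parent if c.tag == node.tag]
        # XPath positions are 1-based.
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))


doc = ET.fromstring(
    "<html><body><ul><li>One</li><li>Two</li></ul></body></html>"
)
second_item = doc.find("body/ul")[1]   # the node the user "selected"
path = xpath_for(doc, second_item)
```

Positional paths like this are precise but brittle: they
break when the page layout changes, so hand-written
predicates on attributes are often more robust.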
Web Page Source
The Web Page Source view shows the HTML converted to XML.
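To show why HTML has to be converted before XQuery or XPath
can be applied, the sketch below turns loosely written HTML
into a well-formed XML tree using only Python's standard
library. It is a simplified illustration under stated
assumptions: a single root element, void elements such as
br and img closed automatically, and unclosed tags closed
when an ancestor closes. HtmlToXml and html_to_xml are
hypothetical names, and Indigo DQM's own conversion may
differ.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HtmlToXml(HTMLParser):
    """Builds an ElementTree from HTML, repairing unclosed tags."""

    def __init__(self):
        super().__init__()
        self.builder = ET.TreeBuilder()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") become empty strings.
        self.builder.start(tag, {k: v or "" for k, v in attrs})
        if tag in VOID:
            self.builder.end(tag)
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close any unclosed descendants first, then the tag itself.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.builder.end(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # skip text outside the root element
            self.builder.data(data)

    def close(self):
        super().close()
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def html_to_xml(html_text):
    parser = HtmlToXml()
    parser.feed(html_text)
    return parser.close()


# <b> and <br> are never closed in this input.
root = html_to_xml(
    '<html><body><p>Price: <b>9.99<br>incl. tax</p>'
    '<img src="w.png"></body></html>'
)
xml_text = ET.tostring(root, encoding="unicode")
```

The resulting xml_text can be parsed by any XML tool, which
is what makes XPath and XQuery usable on the page.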
XSD Diagrams allow a visual representation of an XPath
expression in the Data Schema. Click the Diagram tab
and expand out the Diagram elements to show the structure
of the Data Schema.