Extension

Webtable

SQLite extension

Extract structured data from webpages and URLs

$35 Commercial Pro Version

  • Version 0.1.0
  • SQLite 3.45.1 or newer
  • Operating system Linux & Mac (ARM)

This SQLite extension lets you extract structured data from web pages, all locally and without relying on external online services. It allows you to access HTML tables, embedded JSON, meta tags, and other useful information directly from web pages.

Key points

  • Requests are made directly from your local computer
  • Persists responses in a local SQLite table
  • Prevents overuse by throttling live requests to one per second
  • Reuses cached responses, which expire after five minutes
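
The caching and throttling behaviour listed above can be pictured with a small Python sketch. It is not the extension's implementation: the `CachedFetcher` class, the `fetch` callable, the `web_cache` table name and its columns are all assumptions made for illustration (the table name simply combines the default prefix and cache table name described under Configuration).

```python
import sqlite3
import time

CACHE_TTL_SECONDS = 5 * 60   # five minute expiry, as stated in the key points
MIN_INTERVAL_SECONDS = 1.0   # at most one live request per second


class CachedFetcher:
    """Illustrative cache-then-throttle wrapper; not the extension's code."""

    def __init__(self, conn, fetch, clock=time.time, sleep=time.sleep):
        self.conn = conn
        self.fetch = fetch          # hypothetical callable: url -> response body
        self.clock = clock
        self.sleep = sleep
        self.last_request = None    # time of the last live request, if any
        # Hypothetical schema; the real table layout is not documented here.
        conn.execute(
            "create table if not exists web_cache ("
            "url text primary key, body text, fetched_at real)"
        )

    def get(self, url):
        now = self.clock()
        row = self.conn.execute(
            "select body, fetched_at from web_cache where url = ?", (url,)
        ).fetchone()
        if row is not None and now - row[1] < CACHE_TTL_SECONDS:
            return row[0]  # fresh cache hit, no live request needed

        # Throttle: wait until one second has passed since the last live request.
        if self.last_request is not None:
            wait = MIN_INTERVAL_SECONDS - (now - self.last_request)
            if wait > 0:
                self.sleep(wait)

        body = self.fetch(url)
        self.last_request = self.clock()
        self.conn.execute(
            "insert or replace into web_cache (url, body, fetched_at) "
            "values (?, ?, ?)",
            (url, body, self.last_request),
        )
        return body
```

In this sketch a repeated call for the same URL within five minutes is answered from the table, while a call for a different URL made too soon after a live request is delayed until a full second has passed.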

Example

select
  value->>'code' as code,
  value->>'country_name' as name
from web_list( 'https://en.wikipedia.org/wiki/ISO_3166-1_numeric' );
┌──────┬──────────────────────────────┐
│ code │ name                         │
├──────┼──────────────────────────────┤
│ 834  │ Tanzania, United Republic of │
│ 840  │ United States of America     │
│ 850  │ Virgin Islands (U.S.)        │
└──────┴──────────────────────────────┘

Functions

web_list( string url )

Extract best structured list from URL
Fetches the given URL and returns the best structured data found in it. Data is returned as JSON rows, each with normalized and typed fields.
-- From URL
select * from web_list( 'https://en.wikipedia.org/wiki/ISO_3166-1_numeric' );
┌───────────────────────────────────────────────────┐
│ value                                             │
├───────────────────────────────────────────────────┤
│ {                                                 │
│   "code":834,                                     │
│   "country_name":"Tanzania, United Republic of",  │
│   "notes":""                                      │
│ }                                                 │
└───────────────────────────────────────────────────┘
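
Because each row is a JSON object, a common next step is to copy the fields you need into a regular table. The Python sketch below shows that step on stock SQLite only: the `scraped` table is a hypothetical stand-in for `web_list(...)`, filled with rows shaped like the output above, so nothing is fetched from the network. It uses `json_extract`, which plays the same role here as the `->>` operator in the earlier example.

```python
import sqlite3

# Stand-in rows shaped like the web_list output above; no network access.
SAMPLE_ROWS = [
    '{"code":834,"country_name":"Tanzania, United Republic of","notes":""}',
    '{"code":840,"country_name":"United States of America"}',
]


def materialize(conn):
    """Copy JSON rows into a regular table with one column per field."""
    # Hypothetical stand-in for: select value from web_list('...')
    conn.execute("create table scraped (value text)")
    conn.executemany(
        "insert into scraped values (?)", [(row,) for row in SAMPLE_ROWS]
    )
    conn.execute(
        "create table countries as "
        "select json_extract(value, '$.code') as code, "
        "       json_extract(value, '$.country_name') as name "
        "from scraped"
    )
    # typeof() shows that the JSON number arrives as an SQLite integer.
    return conn.execute(
        "select code, name, typeof(code) from countries order by code"
    ).fetchall()
```

With the extension loaded, the `scraped` stand-in would presumably be replaced by a direct `web_list` call in the `from` clause; that substitution is an assumption and is not tested here.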

web_meta( string url )

Extract meta information from URL
Fetches the given URL and parses key-value pairs from it, most notably HTML meta tag names and values.
-- From URL
select * from web_meta( 'https://en.wikipedia.org/wiki/ISO_3166-1_numeric' );
┌──────────────────┬────────────────────────────────┐
│ key              │ value                          │
├──────────────────┼────────────────────────────────┤
│ generator        │ MediaWiki 1.46.0-wmf.7         │
│ format-detection │ telephone=no                   │
│ viewport         │ width=1120                     │
│ og:title         │ ISO 3166-1 numeric - Wikipedia │
│ og:type          │ website                        │
└──────────────────┴────────────────────────────────┘
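
Key-value output like this is easy to filter with ordinary SQL. As a rough Python sketch, the example below selects only the Open Graph entries and returns them as a dictionary. The `meta` table is a hypothetical stand-in for `web_meta(...)`, loaded with the sample rows shown above, and the function name `open_graph_tags` is made up for this illustration.

```python
import sqlite3

# Sample rows copied from the web_meta output above; no network access.
SAMPLE_META = [
    ("generator", "MediaWiki 1.46.0-wmf.7"),
    ("format-detection", "telephone=no"),
    ("viewport", "width=1120"),
    ("og:title", "ISO 3166-1 numeric - Wikipedia"),
    ("og:type", "website"),
]


def open_graph_tags(conn):
    """Return the og:* entries as a dict keyed by the name after 'og:'."""
    # Hypothetical stand-in for: select key, value from web_meta('...')
    conn.execute("create table meta (key text, value text)")
    conn.executemany("insert into meta values (?, ?)", SAMPLE_META)
    rows = conn.execute(
        "select substr(key, 4) as name, value "
        "from meta where key like 'og:%' order by key"
    )
    return dict(rows.fetchall())
```

For the sample rows this yields a dictionary with the keys `title` and `type`.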

Configuration

Each new SQLite connection should call the web_config function with setup details. This information is not saved to disk; it is kept in memory for the duration of the connection.

select web_config('');

  • cache_table
    Name of local table to create and use for response data, prepended with prefix. Defaults to cache.
  • user_agent
    User agent string to use when making HTTP requests during the connection. Defaults to a randomly selected browser user agent.
  • prefix
    Prefix for names. Defaults to web_.

FAQ

For any questions, support, or feedback regarding this extension, please let us know.

What information is parsed?

The following information is parsed from HTML pages:
  • Meta tags
  • Some HTML role tag values
  • Inline HTML tables
  • Inline JSON arrays or objects

Can you parse dynamic web page content?

No. This extension does not use a headless browser, so it cannot extract content that JavaScript generates in the page.

How do I parse a URL behind a login?

This is not currently supported. The extension can only access URLs that are publicly available.

Can I try this extension for free?

Try this extension for free from our Desktop application.

What can I do with the Pro version?

With the Pro version of this extension, you can use it commercially, for example to run your SaaS with it or to share it inside your organization.

What if the extension does not work?

Please contact our support for assistance. You can also get a refund in accordance with our Terms of Service.

What other software do I need?

You will need a 64-bit Linux or Mac (ARM) operating system. Windows is currently not supported.

In addition, you should have SQLite version 3.45.1 or newer installed, with support for extensions.
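
If you plan to use the extension from Python, a quick check along these lines can tell you whether the bundled SQLite meets both requirements. This is only a sketch: `parse_version` and `check_environment` are names invented here, and the check says nothing about whether the extension file itself is compatible with your platform.

```python
import sqlite3

MINIMUM_VERSION = (3, 45, 1)  # minimum SQLite version stated above


def parse_version(text):
    """Turn a dotted version string such as '3.45.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_environment():
    """Report whether this Python's SQLite meets the stated requirements."""
    conn = sqlite3.connect(":memory:")
    try:
        return {
            "sqlite_version": sqlite3.sqlite_version,
            "version_ok": parse_version(sqlite3.sqlite_version) >= MINIMUM_VERSION,
            # Some Python builds are compiled without extension loading.
            "can_load_extensions": hasattr(conn, "enable_load_extension"),
        }
    finally:
        conn.close()
```

When both checks pass, Python's `sqlite3` module can load an extension with `conn.enable_load_extension(True)` followed by `conn.load_extension(path)`, where `path` is wherever you saved the downloaded extension file.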

How do I receive updates?

We will email you when an update to the same extension version is released. The update will be available to you free of charge.