
Databases

sixcious edited this page Jul 9, 2023 · 48 revisions

After you install Infy, it should automatically download the AutoPagerize (AP) and InfyScroll (IS) Databases, allowing it to work out of the box on thousands of websites.

What are the databases?

The databases are user-curated lists of URLs, each with the settings needed to find the next link and the page element to append. A database is what allows AutoPagerize extensions and Infy to work automatically on lots of websites. Under Infy's Database Options section, you can view the database items themselves, check their stats, and see the last time they were updated/downloaded. You can also manually update/download them at any time, or even delete them from your storage space if you prefer not to use them.

The Most Important Database Setting: Activation Mode

The single most important setting in Infy Scroll is the Activation Mode setting (near the bottom), which is set to Blacklist Mode by default. This controls whether Infy should activate on all those thousands of Database URLs by default, and how you filter the URLs. Depending on how it is set, you will have one of two types of filters: a blacklist or a whitelist. If you want Infy to work on as many websites as possible automatically, you'll probably want to keep it on Blacklist Mode. If you only want Infy to automatically work on just a few websites of your own choosing, you'll want to use Whitelist Mode.

Database Blacklist

When you have the Activation Mode setting set to Blacklist Mode, you'll have a Database Blacklist. This lets you allow all Database URLs by default, but specify a small "blacklist" of URLs that Infy should never auto-activate on. This mode is useful for entering in a few websites that you simply don't want to enable infinite scrolling for, or where the website no longer works right.

Database Whitelist

When you have the Activation Mode setting set to Whitelist Mode, you'll instead have a Database Whitelist. This lets you disallow all Database URLs by default, but specify a small "whitelist" of URLs that Infy is allowed to auto-activate on. This is useful if you only need Infy to auto-activate on just a few URLs. For example, you could just enter www.google.com/search in your Database Whitelist, and Infy will only activate on Google Search. You could also have no URLs in the whitelist, and Infy simply won't activate on any Database URL whatsoever.
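The two modes can be summed up in a few lines of code. This is only a minimal sketch and not Infy's actual implementation: the function name `should_auto_activate` and its parameters are hypothetical, it assumes the URL has already matched a Database URL, and it only handles plain substring entries.

```python
def should_auto_activate(url: str, mode: str, user_list: list[str]) -> bool:
    """Decide whether a URL that matched the database may auto-activate.

    Hypothetical helper: entries in user_list are treated as plain substrings.
    """
    listed = any(entry in url for entry in user_list)
    if mode == "blacklist":
        # Blacklist Mode: everything is allowed unless it is listed.
        return not listed
    if mode == "whitelist":
        # Whitelist Mode: nothing is allowed unless it is listed.
        return listed
    raise ValueError(f"unknown activation mode: {mode}")
```

With an empty list, Blacklist Mode allows every Database URL and Whitelist Mode allows none, which mirrors the behavior described above.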

1-Click Database Blacklist/Whitelist Button

When you're on a Database URL and click on Infy's toolbar icon to enter the UI Window, you should see a "Blacklist" or "Check" button next to the "Power" button, allowing you to blacklist or whitelist this URL. This will automatically add the website's domain to your blacklist/whitelist in one click, so you don't have to manually go into the Options screen and type it in. This button is a toggle, so you can click it again to un-blacklist/un-whitelist the URL as well.

Example Blacklist/Whitelist URLs

You can manually enter the URLs to blacklist or whitelist in five different ways: Substring Patterns, Wildcard Patterns, Regular Expressions, Exact URLs, and Database URLs. Substring Patterns are the easiest to use, as they only check whether the URL contains the pattern/text you enter anywhere in it.

The table below provides an example of each type:

Type | Example | Description
--- | --- | ---
Substring Pattern | www.google.com/search | This will match any URL that has the substring www.google.com/search in it
Wildcard Pattern | *google.com/* | As indicated by the * characters in it, this wildcard pattern will match any URL that has google.com/ in it
Regular Expression | /^https?://www\.google\.com/search/ | This is a regular expression, as indicated by the surrounding / characters, with each . escaped by a \
Exact URL | "https://www.google.com/search?q=this-exact-url" | As indicated by the surrounding " characters, this is an Exact URL that will only match this one (single) URL
Database URL | (^https?://.) | This Database URL, as indicated by the surrounding ( ) characters, will only match the generic database URL ^https?://. so you can disallow it specifically from ever being used while still keeping all the other Database URL rules

Generic Database URL Patterns

Inside the AP Database are what's known as generic (short) URL patterns that cover a wide range of URLs: essentially, any URL that starts with http. You may wish to exclude these generic http Database URLs in your Blacklist, as they sometimes get the wrong page element or prevent you from getting to the bottom of the page and seeing content you care more about (such as the comments below a blog post, rather than the next page's blog post).

Here's a selected list of the Generic Database URL Patterns you may want to outright blacklist. Parentheses have been added for your convenience so you can copy and paste them as they are into your Blacklist:

  1. (^https?://.)
  2. (^https?://..)
  3. (^https?://...)
  4. (^https?://.+)

Generic Database URLs - Why are the parentheses needed?

When targeting generic database URLs in your Blacklist/Whitelist, you'll want to surround them in ( and ). So, for example, ^https?://. would become (^https?://.). This lets you exclude just the one database URL ^https?://. specifically, without excluding every other database URL that matches the generic regular expression.

Database Update Schedule

Infy lets you specify how often it should auto-update the databases, from every 1 to 7 days. It's recommended to keep this at 1 or 2 days, as websites can change their settings at any time. You can set it to 0 to disable auto-updating.
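As a rough illustration of the schedule setting, the check below decides whether an update is due. It is a sketch under assumptions, not Infy's logic: `is_update_due` is a hypothetical name, and it assumes the interval is counted from the time of the last successful download.

```python
from datetime import datetime, timedelta


def is_update_due(last_updated: datetime, interval_days: int, now: datetime) -> bool:
    """Return True when the database is at least interval_days old."""
    if interval_days == 0:
        # 0 disables auto-updating entirely.
        return False
    if not 1 <= interval_days <= 7:
        raise ValueError("interval must be 0 (off) or between 1 and 7 days")
    return now - last_updated >= timedelta(days=interval_days)
```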

Database Download Locations

Infy uses the following locations to download the databases. If your databases appear to be empty, please check that you can access these locations. It may be that your ISP has blocked access to them, or even your browser (for example, Firefox is now blocking requests to non-HTTPS URLs, such as wedata).

Contributing

If you'd like to add a new URL to the databases (or update one that is no longer working), you can contribute to them on the Wedata.net website.

The Databases are located at:

Important: The AutoPagerize Database is currently only compatible with the Next Link action and the Element append mode (not including Element Iframe), and it only uses XPath and Regular Expressions. An AP database item typically uses only three keys: url, nextLink, and pageElement (and, optionally, insertBefore). Contribute to the AP Database when your settings are compatible with it, so that it benefits other apps that use it.
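Before submitting an item, it can help to sanity-check its keys. The sketch below is hypothetical and not part of Infy or Wedata: the `check_item` helper and the example values are made up for illustration, and it assumes url holds a regular expression while nextLink and pageElement hold XPath expressions (which it does not try to validate).

```python
import re

REQUIRED_KEYS = ("url", "nextLink", "pageElement")
OPTIONAL_KEYS = ("insertBefore",)


def check_item(item: dict) -> list[str]:
    """Return a list of problems found in an AP-style database item."""
    problems = []
    for key in REQUIRED_KEYS:
        if not item.get(key):
            problems.append(f"missing {key}")
    for key in item:
        if key not in REQUIRED_KEYS + OPTIONAL_KEYS:
            problems.append(f"unexpected key {key}")
    if item.get("url"):
        try:
            re.compile(item["url"])
        except re.error as exc:
            problems.append(f"url is not a valid regular expression: {exc}")
    return problems


# Made-up example item; the XPath values are placeholders.
example_item = {
    "url": r"^https?://www\.example\.com/articles",
    "nextLink": '//a[@rel="next"]',
    "pageElement": '//div[@id="content"]',
}
```

Calling `check_item(example_item)` returns an empty list, while an item that is missing nextLink or pageElement gets one message per missing key.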

Registering

The Login API that Wedata uses is OpenID, which is unfortunately slowly being phased out. However, there are still a couple of OpenID providers available that you can use to create an account. Although not intended for this purpose, the provider I last used and recommend is this one: https://openid.dbcls.jp/

If you know a little XPath and would like to help out, I hope you'll consider contributing! 💜
