Using the URL generator

This article picks up from the end of Part 3: Creating a Listings Extractor.

The listings extractor you created covers a single search page, but the search returns more than one page of results. To cover the remaining pages, you can add more URLs to the extractor with the URL Generator.

Tip: Read more about how you can add sources to your extractor here.

Accessing Extractor Settings

To change how your extractor behaves, edit the extractor's settings. To access them, navigate to the extractor in the dashboard and click Settings to switch to the settings tab.

From here you can modify sources, schedule your extractor, and set webhooks. For this example, we'll be modifying the sources.

Remove all URLs

For this example, the extractor needs to start with no URLs. In the extractor settings, click Clear All to remove all URLs from the extractor.

Using the URL Generator

The URL generator is the quickest way to create multiple URLs from the patterns in an existing URL. The following examples show how URL parameters might vary for items like categories, search terms, and page numbers. To access the URL generator, click Generate URLs in the extractor settings.

Generating Pages

In the URL generator, you can create parameters, which allow you to use a list of values when generating URLs. To start, highlight and then click the 0 in the URL, which controls the pagination. This creates a parameter called PARAMETER-1. Because the pagination is index based, it starts at 0 and then increases in steps of 10: 10, 20, 30, and so on. You can generate the first 5 pages by setting PARAMETER-1 to the range 0 to 40 with a skip of 10.
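
The range above can be reproduced outside the dashboard to check which URLs it should produce. The following is a minimal sketch, not the generator's own implementation: the base URL and the `start` parameter name are hypothetical placeholders, and the range is assumed to include its upper bound, as "0 to 40 skip 10" yielding five pages implies.

```python
from urllib.parse import urlencode

# Hypothetical search URL; replace with the URL your extractor was trained on.
BASE_URL = "https://example.com/search"


def page_urls(first: int, last: int, skip: int) -> list[str]:
    """Expand an inclusive 'first to last skip N' range into one URL per value."""
    # range() excludes its stop value, so add 1 to keep `last` in the result.
    offsets = range(first, last + 1, skip)
    # "start" is an assumed name for the pagination parameter.
    return [f"{BASE_URL}?{urlencode({'start': offset})}" for offset in offsets]


urls = page_urls(0, 40, 10)
for url in urls:
    print(url)
```

Running this prints five URLs, one for each of the offsets 0, 10, 20, 30, and 40.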

Generating Locations

Besides a range of numbers, a parameter can take a list of comma-separated values, which lets you run the same search for other locations. To do this, select the find desc value and then add your list of locations. After adding the locations, click Add to list to insert the newly generated URLs.
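
With two parameters in play, the number of URLs grows quickly. The sketch below assumes that the generator pairs every location with every page offset (a Cartesian product), which the steps above suggest but do not state outright. The base URL, the `location` and `start` parameter names, and the sample locations are all hypothetical.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical search URL; replace with your own.
BASE_URL = "https://example.com/search"


def expand(locations: list[str], offsets: range) -> list[str]:
    """Build one URL for every (location, offset) combination."""
    return [
        # urlencode escapes spaces and other special characters in the values.
        f"{BASE_URL}?{urlencode({'location': location, 'start': offset})}"
        for location, offset in product(locations, offsets)
    ]


# A comma-separated list, as it might be typed into the parameter field.
locations = [name.strip() for name in "London, New York, Paris".split(",")]
offsets = range(0, 41, 10)  # 0 to 40 skip 10

urls = expand(locations, offsets)
print(len(urls))
```

Three locations and five offsets give 15 URLs under this assumption, so it is worth checking the generated list before adding it to the extractor.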

Cleaning Extractor Sources

After adding the URLs, you can use various options in the extractor settings to clean up the URLs. Remove duplicate rows will remove any duplicate sources for your extractor. Clean-up will remove any empty rows and invalid URLs.
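
If you prepare a URL list outside the dashboard, the same two clean-up steps can be approximated before pasting it in. This is a rough sketch under stated assumptions: the function names are illustrative, and "invalid" is taken here to mean anything that is not an http or https URL with a host, which may differ from the rule the extractor settings apply.

```python
from urllib.parse import urlparse


def remove_duplicate_rows(rows: list[str]) -> list[str]:
    """Drop repeated rows, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(rows))


def is_valid_url(row: str) -> bool:
    """Assumed validity rule: an http(s) scheme plus a non-empty host."""
    try:
        parts = urlparse(row)
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 hosts
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def clean_up(rows: list[str]) -> list[str]:
    """Drop empty rows and rows that fail the assumed validity rule."""
    stripped = (row.strip() for row in rows)
    return [row for row in stripped if row and is_valid_url(row)]


rows = [
    "https://example.com/search?start=0",
    "",
    "https://example.com/search?start=0",
    "not a url",
    "https://example.com/search?start=10",
]
cleaned = clean_up(remove_duplicate_rows(rows))
print(cleaned)
```

Here the empty row, the repeated row, and the malformed row are dropped, leaving the two distinct search URLs.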

With the setup nearly complete, you can now chain the two extractors together.