
Creating a listings extractor

With a details extractor created, the next step is to create a second extractor that collects the URLs (links) of multiple business pages. This listings extractor pulls each business URL from the Yelp search results.

Creating a Listings Extractor

Create a new extractor as you did in Part 1 for the following URL:

https://www.yelp.com/search?find_desc=Coffee&find_loc=Los+Gatos,+CA&start=0
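If it helps to see what the search URL above is made of, here is a minimal Python sketch that splits it into its query parameters and rebuilds it. The parameter names (find_desc, find_loc, start) are read directly from the URL; the helper name build_search_url is hypothetical and is not part of Import.io or Yelp.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The search URL used in this tutorial.
SEARCH_URL = "https://www.yelp.com/search?find_desc=Coffee&find_loc=Los+Gatos,+CA&start=0"

# Split the URL and decode its query string ("+" is decoded to a space).
parts = urlsplit(SEARCH_URL)
params = parse_qs(parts.query)


def build_search_url(term, location, start=0):
    """Hypothetical helper: rebuild a search URL of the same shape."""
    # safe="," keeps the comma unescaped so the result matches the URL above.
    query = urlencode(
        {"find_desc": term, "find_loc": location, "start": start},
        safe=",",
    )
    return f"{parts.scheme}://{parts.netloc}{parts.path}?{query}"


if __name__ == "__main__":
    for name, values in params.items():
        print(name, "=", values[0])
    print(build_search_url("Coffee", "Los Gatos, CA"))
```

Changing the term or location arguments gives a search URL of the same shape for a different query, which you could then paste into a new extractor.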

Clearing an Extractor's Training

Import.io will load the page and automatically detect the list of businesses on it. If you would rather build this list yourself, click the trash icon in the data panel to clear the detected training.

Training a Column

Once you've cleared your training, name the New Column and then select the first result. If the training detects a table, do not extract the table, because table data may include sponsored results. If the training detects a link, make sure you include the link URLs. Selecting more results will train the extractor to grab the entire list.

When you select the data points, you will see that your selection is shown both in the floating column and on the page with green overlays. With some pages, you might have to scroll through a list and click a few more data points to capture the whole list.

Capturing Links

When you select the links, blue text indicates that the column is capturing the URL of each business's individual page. This can be turned on and off in the column settings under Capture this link's URL, which shows a checkmark when enabled.
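To illustrate the difference between capturing a link's text and capturing its URL, here is a rough Python sketch using only the standard library. It is not how Import.io works internally; the sample HTML, the /biz/... paths, and the LinkCollector class are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []      # list of (text, href) tuples
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical listing markup, not real Yelp HTML.
SAMPLE_HTML = """
<ul>
  <li><a href="/biz/example-coffee-one">Example Coffee One</a></li>
  <li><a href="/biz/example-coffee-two">Example Coffee Two</a></li>
</ul>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)

link_texts = [text for text, _ in collector.links]  # what you see on the page
link_urls = [href for _, href in collector.links]   # what the URL option captures

if __name__ == "__main__":
    print(link_texts)
    print(link_urls)
```

With the URL option enabled, the column holds values like those in link_urls; with it disabled, it holds the visible text, as in link_texts.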

Once you've saved your Listings Extractor, you can start generating URLs.