Creating a details extractor

With a Listing Extractor created, the next step is to create a second extractor, a Details Extractor, that captures the finer details of each owl.

In the previous tutorial, we created a Listing Extractor that captured the URLs of all the owls. The output of that Listing Extractor should look like the example below.

First Step

Create a new extractor as you did in Part 1 for the following URL:

http://owlkingdom.com/pointy.html

This URL shows the details of one owl, Pointy. We want to get the details of every owl shown on the home page. The information on each owl's page is structured in the same format, so we can use Pointy's page as a template for all the other owl pages. By creating columns and clicking on data, you show Import.io where the data you want to extract is located on the page.
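To see why one page can serve as a template, here is a minimal sketch in Python. It is not how Import.io works internally, and it does not use any Import.io API. The HTML snippets, the class names `plumage` and `vision`, the second owl, and all the field values are made up for illustration; the real owl pages may be marked up quite differently. The point is only that a set of columns defined once can be applied unchanged to any page that shares the same structure.

```python
from html.parser import HTMLParser


class FieldParser(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted column."""

    def __init__(self, fields):
        super().__init__()
        self.fields = set(fields)
        self.current = None  # column currently being read, if any
        self.values = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.fields:
            self.current = cls

    def handle_data(self, data):
        if self.current and data.strip():
            self.values[self.current] = data.strip()

    def handle_endtag(self, tag):
        self.current = None


def extract(html, fields):
    """Apply the same column definitions to one page and return one record."""
    parser = FieldParser(fields)
    parser.feed(html)
    return parser.values


# Hypothetical markup and values: two pages that share one structure.
POINTY = '<div><span class="plumage">Brown</span><span class="vision">Sharp</span></div>'
SNOWY = '<div><span class="plumage">White</span><span class="vision">Keen</span></div>'

COLUMNS = ["plumage", "vision"]  # "trained" once, reused for every page

pointy_row = extract(POINTY, COLUMNS)
snowy_row = extract(SNOWY, COLUMNS)
```

Both calls return a record with the same keys, because the columns describe a position in the page structure rather than a specific value.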

Training a Column

The page loads, showing a single owl with extra details about it, such as its Plumage and Vision. We want to create columns to capture these additional details.

When you select the data points, you will see that your selection is shown both in the floating column and on the page with green overlays. With some pages, you might have to scroll through a list and click a few more data points to capture the details you want.

Since this is a Details Extractor, you might want the selected data to be returned as one row per page extracted, rather than as a list of rows. To do this, reveal the Advanced option, select Rows, and check that it is set to Single Row.
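The difference between the two row settings is easiest to see as a difference in output shape. The sketch below is a plain Python illustration, not Import.io code. The function names, the column names, and the sample values are hypothetical, and the way several matches in one column are kept together in the single-row case is an assumption made for the example, not a description of Import.io's behaviour.

```python
def as_rows(matches):
    """List shape: one row per matched position, padded with None."""
    length = max(len(values) for values in matches.values())
    return [
        {col: values[i] if i < len(values) else None for col, values in matches.items()}
        for i in range(length)
    ]


def as_single_row(matches):
    """Single-row shape: exactly one row for the page.

    Assumption for illustration: a column with several matches keeps them
    together as a list inside that one row.
    """
    return [
        {col: values[0] if len(values) == 1 else values for col, values in matches.items()}
    ]


# Hypothetical matches collected from one details page.
page_matches = {"name": ["Pointy"], "features": ["Plumage", "Vision"]}

list_output = as_rows(page_matches)          # two rows for one page
single_output = as_single_row(page_matches)  # one row for one page
```

For a details page, the single-row shape is usually the more convenient one, because each extracted page then corresponds to exactly one record.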

tip

Anything that is available on the loaded page can also be captured by adding more columns. Try adding the Name and an Image of each owl to your Details and Listing Extractors too!

Once you've named and saved your Details Extractor, you can Run the extractor to see how the data is extracted.

What's Next?

We now have a Details Extractor and a Listing Extractor. But how can these work together? And how can we expand our Listing Extractor to get more than the first page of results?

Think about these questions as we continue to our next guide.