Adding a List of URLs

Managing an Extractor's URL List

From the Settings tab of an Extractor, you can manage the list of URLs that are extracted when the Extractor starts a crawl run. You can add URLs manually, import them from a file, extract them from other pages with Chained Extractors, or add similar URLs using URL Discovery.

Elements of the Inputs View

  1. Input source: Dropdown to set whether the Extractor uses URLs from an explicit list of URLs provided or URLs extracted by another Extractor.
  2. Clear All: Removes all the URLs from the list to start over.
  3. Remove Duplicate Rows: Removes any duplicate URLs from the list.
  4. Cleanup URLs: Removes invalid URLs and empty rows from the list.
  5. Download Inputs: Downloads the list of URLs in CSV, Excel, JSON, or NDJSON format.
  6. Import Inputs: Imports a list of URLs from a CSV or Excel (XLSX) file.
  7. Add input row: Adds a blank row to the list of inputs.
  8. Reset to saved inputs: Resets the URL list to the saved inputs.
  9. List view: Shows all of the URLs currently added.
  10. Save: Saves any changes made to the URL list. When you add, remove, or update URLs in the list, the changes are not saved until you click Save.
  11. Run Inputs: Starts a new crawl run. If you have unsaved changes, this button is disabled until you save them.
  12. Total Inputs: Displays a count of the URLs in the list. This is also the number of queries a crawl run will use with that list of URLs (if screen capture is enabled, the total number of queries is doubled).
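
To make the cleanup and counting rules above concrete, here is a minimal Python sketch that you could run locally on a URL list before adding it to an Extractor. It is an illustration only, not the product's implementation: the function names are hypothetical, and the rule used for a "valid" URL (an absolute http or https address) is an assumption, since the exact validation performed by Cleanup URLs is not documented here.

```python
from urllib.parse import urlparse


def cleanup_urls(rows):
    """Drop empty rows and rows that are not absolute http(s) URLs.

    Assumption: "invalid" means anything without an http/https scheme
    and a host. The real Cleanup URLs check may differ.
    """
    cleaned = []
    for row in rows:
        url = row.strip()
        if not url:
            continue  # empty row
        try:
            parts = urlparse(url)
        except ValueError:
            continue  # malformed enough that parsing fails
        if parts.scheme in ("http", "https") and parts.netloc:
            cleaned.append(url)
    return cleaned


def remove_duplicate_rows(urls):
    """Remove duplicate URLs, keeping the first occurrence of each."""
    # dict preserves insertion order, so the list order is kept.
    return list(dict.fromkeys(urls))


def estimate_queries(urls, screen_capture=False):
    """One query per URL, doubled when screen capture is enabled."""
    return len(urls) * (2 if screen_capture else 1)


rows = [
    "https://example.com/a",
    "",
    "not a url",
    "https://example.com/a",
    " https://example.com/b ",
]
urls = remove_duplicate_rows(cleanup_urls(rows))
print(urls)
print(estimate_queries(urls), estimate_queries(urls, screen_capture=True))
```

The last line mirrors the Total Inputs note: the same list costs twice as many queries when screen capture is turned on.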

Importing URLs from a File

Clicking Import URLs reveals the Import URLs view, which lets you add URLs from a CSV or Excel file. The imported URLs can either replace your current list of URLs or be added to it.

Elements of the Import URLs View

  1. Browse: Opens a file browser to select the file to import URLs from.
  2. Include column headers: Set whether the file includes column headers. If selected, then the first row will not be imported.
  3. Select header: Selects the column that contains the URLs.
  4. Append/Replace: Choose whether the URLs from the file are added to the current list of URLs or replace (overwrite) the current list.
  5. Preview list: Shows a preview of the URLs from the column selected.
  6. Cancel: Closes the Import URLs view and returns to the Extractor's settings.
  7. Upload URL list: Adds the selected URLs for import to the Extractor's URL list.
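
The import options above (header row, column selection, and Append/Replace) can be sketched in a few lines of Python, which may help when preparing or checking a CSV file before uploading it. This is a rough illustration under stated assumptions, not how the Import URLs view is implemented: `read_url_column` and `merge_urls` are hypothetical helper names, and the sample column layout is invented for the example.

```python
import csv
import io


def read_url_column(csv_text, column=0, has_header=True):
    """Return the non-empty values from one column of CSV text.

    has_header mirrors "Include column headers": when True, the first
    row is treated as headers and is not imported.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if has_header and rows:
        rows = rows[1:]
    return [
        row[column].strip()
        for row in rows
        if len(row) > column and row[column].strip()
    ]


def merge_urls(current, imported, mode="append"):
    """Combine lists the way the Append/Replace choice is described."""
    if mode == "replace":
        return list(imported)
    if mode == "append":
        return list(current) + list(imported)
    raise ValueError("mode must be 'append' or 'replace'")


sample = "name,url\nA,https://example.com/a\nB,https://example.com/b\n"
imported = read_url_column(sample, column=1, has_header=True)
current = ["https://example.com/start"]

print(merge_urls(current, imported, mode="append"))
print(merge_urls(current, imported, mode="replace"))
```

Appending keeps the existing URLs and adds the imported ones after them, while replacing discards the existing list. Neither mode removes duplicates in this sketch.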