As of Extractor Studio CLI version 2, Source Engineers can:
Deploy extractors directly to a source
Start crawl runs for a source (creates a snapshot)
In order to deploy your source, you will need to complete the following prerequisites:
Before running any of the CLI commands beginning with source:deploy, your DOC User Token must be configured locally.
Once you have your token, configure it by running
All of the import-io source:[action] commands deploy to existing sources in DOC. They will not create a source for you: a command fails if the --source slug you provide does not exist.
Creating a source in DOC does not require an Extractor ID to be provided. If you are deploying your extractor to a source for the first time, an extractor will be created, and its ID will be added to the specified source.
The import-io source:deploy command will:
Tag and push to the git repo (must be “origin” remote):
Update or create the runtime configuration
Update or create the policy
Update or create the extractor
Update the sample inputs linked to the extractor
import-io source:deploy --help
Deploy a source and update the sample inputs

USAGE
  $ import-io source:deploy

OPTIONS
  -c, --collection=collection  (required) collection to deploy to
  -e, --path=path              (required) path to extractor directory
  -h, --help                   show CLI help
  -o, --org=org                (required) org slug
  -p, --project=project        (required) project to deploy to
  -s, --source=source          (required) source slug
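As an illustration of the options above, the following sketch assembles a deploy command from the five required flags. The slugs and the extractor path are hypothetical placeholders, and the `run` helper with its `DRY_RUN` switch is an illustrative addition, not part of the CLI: by default it only prints the command so it can be reviewed first.

```shell
# Hypothetical placeholder values; replace with your own slugs and path.
ORG="my-org"
PROJECT="my-project"
COLLECTION="my-collection"
SOURCE="my-source"
EXTRACTOR_PATH="./extractors/my-extractor"

# Illustrative helper (not part of the CLI): prints the command unless
# DRY_RUN=0 is set, in which case the command is executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# All five options below are marked (required) in the help output.
run import-io source:deploy \
  --org="$ORG" \
  --project="$PROJECT" \
  --collection="$COLLECTION" \
  --source="$SOURCE" \
  --path="$EXTRACTOR_PATH"
```

Setting `DRY_RUN=0` in the environment executes the deploy instead of printing it. The source named by `--source` must already exist in DOC.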
This command will fail if:
To run an extractor, you run a source, which in turn runs the underlying extractor. If --deploy is set to true (the default), the extractor is updated before a run is started with the sample inputs. Once a run has started successfully, a link to the newly created snapshot is printed in your terminal.
import-io source:run --help
Run an extractor (creates a snapshot)

USAGE
  $ import-io source:run

OPTIONS
  -c, --collection=collection  (required) collection to deploy to
  -d, --deploy                 deploy before running, if false will run the currently saved extractor
  -e, --path=path              (required) path to extractor directory
  -h, --help                   show CLI help
  -o, --org=org                (required) org slug
  -p, --project=project        (required) project to deploy to
  -s, --source=source          (required) source slug
  -w, --wait                   whether or not to wait until the crawl run completes
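A run that waits for the crawl to complete might be started as in the sketch below. It uses only flags listed in the help output above. The slugs and path are hypothetical placeholders, and the `run` helper with its `DRY_RUN` switch is an illustrative addition, not part of the CLI. Since deploying before the run is described as the default, `--deploy` is not passed explicitly.

```shell
# Hypothetical placeholder values; replace with your own slugs and path.
ORG="my-org"
PROJECT="my-project"
COLLECTION="my-collection"
SOURCE="my-source"
EXTRACTOR_PATH="./extractors/my-extractor"

# Illustrative helper (not part of the CLI): prints the command unless
# DRY_RUN=0 is set, in which case the command is executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# --wait is listed as a flag without a value in the help output.
run import-io source:run \
  --org="$ORG" \
  --project="$PROJECT" \
  --collection="$COLLECTION" \
  --source="$SOURCE" \
  --path="$EXTRACTOR_PATH" \
  --wait
```

With `DRY_RUN=0`, the command is executed and, once the run has started, the CLI prints the link to the new snapshot.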