A simple web scraper.
Format: node scrappy.js <command>
There are 3 commands:
url
json
help
The url command is the main command for scraping. It needs the JSON object created
with the json command.
The first argument is the URL to be scraped.
--root-element: the selector of the parent element that contains the elements to be scraped.
--json-object: the JSON object describing the elements to be scraped. Create it with the json command.
--time-interval: repeats the scrape with the same configuration in a loop at the given time interval. If omitted, the scrape runs only once.
--first: returns only the first n elements, where n is the number given with this option.
--help: prints the help text of the url command.
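As a rough illustration of how these options fit together, the sketch below builds an argument list for the url command. The helper name buildUrlArgs, the example URL and the selectors are hypothetical. It also assumes that each option takes its value as the next argument and that the JSON object is passed as one serialized string containing a list of field definitions; the text above does not confirm either, so check the help output of the url command first.

```javascript
// Hypothetical helper: builds the argument list for `node scrappy.js url ...`.
// Assumption: each option is followed by its value as a separate argument.
function buildUrlArgs(url, options = {}) {
  const args = ['scrappy.js', 'url', url];
  if (options.rootElement !== undefined) {
    args.push('--root-element', options.rootElement);
  }
  if (options.jsonObject !== undefined) {
    // Assumption: the JSON object is passed as a single serialized string.
    args.push('--json-object', JSON.stringify(options.jsonObject));
  }
  if (options.timeInterval !== undefined) {
    // The unit of the interval is not stated above, so the value is passed through unchanged.
    args.push('--time-interval', String(options.timeInterval));
  }
  if (options.first !== undefined) {
    args.push('--first', String(options.first));
  }
  return args;
}

// Example values only; none of them come from the original text.
const args = buildUrlArgs('https://example.com/news', {
  rootElement: '.article',
  jsonObject: [{ field: 'title', selector: 'h2', type: 'text' }],
  first: 5,
});
console.log(['node', ...args].join(' '));
```

The resulting array could be handed to child_process.spawn('node', args), which avoids shell quoting problems with the serialized JSON.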
The json command helps you create the JSON object interactively. It prompts for the following fields:
{
field: 'string',
selector: 'string',
type: 'string',
attr_selector: 'string'
}
field: the name of the scraped tag.
selector: the jQuery selector of the scraped tag.
type: whether to get the text value or an attribute value of the selected tag. Can only be text or attr.
attr_selector: if type is attr, this value must be given to select the attribute to scrape.
You can create as many field definitions as you want.
If q is entered at any point during the prompt, the command quits and prints the created JSON object. Use that JSON output with the url command.
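The rules for a field definition can be checked before the JSON is passed to the url command. The following is a minimal sketch, not the tool's own code: validateFieldDefinition is a hypothetical name, and it only encodes the constraints stated above (type is text or attr, and attr_selector is required when type is attr).

```javascript
// Checks one field definition against the rules described above.
// Returns a list of problems; an empty list means the definition looks valid.
function validateFieldDefinition(def) {
  const problems = [];
  // field and selector are expected to be non-empty strings.
  for (const key of ['field', 'selector']) {
    if (typeof def[key] !== 'string' || def[key].length === 0) {
      problems.push(`${key} must be a non-empty string`);
    }
  }
  // type can only be 'text' or 'attr'.
  if (def.type !== 'text' && def.type !== 'attr') {
    problems.push("type must be 'text' or 'attr'");
  }
  // attr_selector is mandatory only for attribute scraping.
  const hasAttrSelector =
    typeof def.attr_selector === 'string' && def.attr_selector.length > 0;
  if (def.type === 'attr' && !hasAttrSelector) {
    problems.push("attr_selector is required when type is 'attr'");
  }
  return problems;
}

// Example definitions; the selectors are illustrative only.
const titleField = { field: 'title', selector: 'h2.title', type: 'text' };
const linkField = { field: 'link', selector: 'a', type: 'attr' };

console.log(validateFieldDefinition(titleField)); // no problems reported
console.log(validateFieldDefinition(linkField)); // reports the missing attr_selector
```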
The help command prints the help text.