Purpose
The /scrape endpoint fetches the HTML content of a specified URL and extracts elements that match a provided CSS selector or regular expression. It is useful for web scraping tasks such as collecting specific data from web pages.
HTTP Method
POST
URL
https://node.nodetrigger.com/scrape
Request Body Parameters
url (string, required): The URL of the web page to scrape.
selector (string, optional): A CSS selector identifying the elements to extract from the HTML. Either selector or regex must be provided.
regex (string, optional): A regular expression to match against the text of the HTML content. Either selector or regex must be provided.
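The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: the helper name build_scrape_payload is hypothetical, and because the documentation does not say what happens when both selector and regex are supplied, the sketch simply passes both through.

```python
from typing import Optional


def build_scrape_payload(url: str,
                         selector: Optional[str] = None,
                         regex: Optional[str] = None) -> dict:
    """Build the JSON body for POST /scrape (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if selector is None and regex is None:
        raise ValueError("either selector or regex must be provided")

    payload = {"url": url}
    # Include only the optional fields that were actually supplied.
    if selector is not None:
        payload["selector"] = selector
    if regex is not None:
        payload["regex"] = regex
    return payload


print(build_scrape_payload("https://example.com", selector=".article-title"))
```

Validating locally avoids a round trip for requests that the endpoint would presumably reject anyway.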
Response
result (array): An array of strings containing the elements extracted by the provided selector or regex.
error (string): Error message if the request fails.
message (string): Additional information about the error.
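A client will usually want to separate the success case from the error case. The sketch below assumes that both kinds of response are JSON objects and that a failed request carries an error field, optionally with message. The names ScrapeError and parse_scrape_response are hypothetical, and the error body shown is made up for illustration.

```python
import json


class ScrapeError(Exception):
    """Raised when the response body carries an error field."""


def parse_scrape_response(body: str) -> list:
    """Return the result array, or raise ScrapeError on an error response."""
    data = json.loads(body)
    if "error" in data:
        detail = data.get("message", "")
        text = f"{data['error']}: {detail}" if detail else data["error"]
        raise ScrapeError(text)
    # Assumption: a successful response always has a result array.
    return data.get("result", [])


print(parse_scrape_response('{"result": ["Article Title 1", "Article Title 2"]}'))

try:
    # Illustrative error body only; the real error text is not documented here.
    parse_scrape_response('{"error": "Request failed", "message": "timeout"}')
except ScrapeError as exc:
    print("scrape failed:", exc)
```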
Example Usage
Example 1: Extracting Elements with a CSS Selector
Request:
curl -X POST https://node.nodetrigger.com/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "selector": ".article-title"
  }'
Response:
{
  "result": [
    "Article Title 1",
    "Article Title 2",
    "Article Title 3"
  ]
}
Example 2: Extracting Text with a Regular Expression
Request:
curl -X POST https://node.nodetrigger.com/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "regex": "\\b\\w{5}\\b"
  }'
Response:
{
  "result": [
    "words",
    "found",
    "match"
  ]
}
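The doubled backslashes in the request above come from JSON string escaping: the pattern itself is \b\w{5}\b, which matches five-character words. The sketch below shows a JSON encoder producing that escaping automatically, then tries the pattern locally. The sample sentence is made up, and Python's re module is used only for illustration, since the regex engine used by the endpoint is not documented here and may differ in detail.

```python
import json
import re

# The pattern from Example 2, written as a raw Python string.
pattern = r"\b\w{5}\b"

# json.dumps escapes each backslash, giving the doubled form seen in the curl body.
body = json.dumps({"url": "https://example.com", "regex": pattern})
print(body)
# {"url": "https://example.com", "regex": "\\b\\w{5}\\b"}

# Local dry run against a made-up sentence.
sample = "Five words were found to match here"
print(re.findall(pattern, sample))
# ['words', 'found', 'match']
```

Building the body with a JSON encoder, rather than escaping by hand, avoids sending a pattern with missing or extra backslashes.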
Best Practices
Ensure that the POST method is used for this endpoint so that the parameters are sent in the request body rather than in the URL. If you are currently using GET, change the method to POST.
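As one way to make the method explicit in client code, the sketch below builds a POST request with Python's standard urllib.request module. The helper name make_scrape_request is hypothetical, and the network call is left commented out so the example runs offline.

```python
import json
import urllib.request

SCRAPE_URL = "https://node.nodetrigger.com/scrape"


def make_scrape_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",  # stated explicitly rather than relying on defaults
    )


request = make_scrape_request(
    {"url": "https://example.com", "selector": ".article-title"}
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read().decode("utf-8")))
```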