Scrape
Extract data from URLs with basic or AI-enhanced scraping capabilities
POST
The scrape endpoint allows you to extract data from a single URL at a time. You can choose between two scraping modes:
- Basic Scraping: Extracts data from the provided URL without AI assistance.
- AI-Enhanced Scraping: Uses AI to process the scraped content with either:
  - A custom prompt to guide the extraction
  - A prompt combined with a JSON schema for structured and consistent output
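As a rough illustration of how the modes above differ, the following sketch builds a request body for basic scraping, for a prompt-only request, and for a prompt plus schema. The key names "url", "prompt", and "schema" and the helper build_scrape_body are assumptions made for this example; this page does not list the actual field names.

```python
import json
from typing import Optional


def build_scrape_body(
    url: str,
    prompt: Optional[str] = None,
    schema: Optional[dict] = None,
) -> dict:
    """Build a JSON-serializable body for basic or AI-enhanced scraping.

    The keys "url", "prompt", and "schema" are assumed names for illustration.
    """
    if schema is not None and prompt is None:
        # The modes above only describe a schema used together with a prompt.
        raise ValueError("schema requires a prompt")
    body: dict = {"url": url}  # one URL per request
    if prompt is not None:
        body["prompt"] = prompt
    if schema is not None:
        body["schema"] = schema
    return body


# Basic scraping: only the URL is supplied.
basic = build_scrape_body("https://example.com/product/1")
# AI-enhanced scraping guided by a custom prompt.
guided = build_scrape_body(
    "https://example.com/product/1",
    prompt="Extract the product name and price.",
)
print(json.dumps(basic))
print(json.dumps(guided))
```

Rejecting a schema without a prompt mirrors the wording of the list above; the real endpoint may be more permissive.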
Use Cases
- Extract specific content from web pages
- Transform unstructured web content into structured data
- Ensure consistent data format using JSON schema validation
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
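A minimal sketch of attaching the token to a POST request, using only the Python standard library. Bearer tokens are conventionally sent in the HTTP Authorization header; the endpoint URL, the SCRAPE_API_TOKEN environment variable, and the "url" body key are placeholders assumed for this example, since the page does not state them. The request is built but not sent.

```python
import json
import os
import urllib.request


def auth_headers(token: str) -> dict:
    """Return headers carrying the token in the form `Bearer <token>`."""
    if not token:
        raise ValueError("an auth token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",  # the request body is JSON
    }


# SCRAPE_API_TOKEN and the URL below are placeholders, not documented values.
token = os.environ.get("SCRAPE_API_TOKEN", "example-token")
request = urllib.request.Request(
    "https://api.example.com/scrape",  # hypothetical endpoint URL
    data=json.dumps({"url": "https://example.com"}).encode("utf-8"),  # "url" key is assumed
    headers=auth_headers(token),
    method="POST",
)
# The request is only constructed here; urllib.request.urlopen(request) would send it.
print(request.get_method(), request.get_header("Authorization"))
```

Reading the token from the environment keeps it out of source files.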
Body
application/json
Optional schema definition for structured data extraction. The format should follow OpenAI's function-calling schema format (see https://platform.openai.com/docs/guides/structured-outputs).
Example types:
- string: "type": "string"
- integer: "type": "integer"
- number: "type": "number"
- boolean: "type": "boolean"
- array: "type": "array", "items": {"type": "string"}
- object: "type": "object", "properties": {...}
Example:
{
  "description": "Schema for capturing product information",
  "name": "Product Schema",
  "schema": {
    "properties": {
      "product_url": {
        "description": "The URL of the specific product",
        "type": "string"
      },
      "product_name": {
        "description": "The name of the specific product",
        "type": "string"
      },
      "price": {
        "description": "The price of the product",
        "type": "number"
      },
      "product_images": {
        "description": "List of product image URLs",
        "items": {
          "properties": {
            "url": {
              "description": "URL of the product image",
              "type": "string"
            }
          },
          "required": ["url"],
          "type": "object"
        },
        "type": "array"
      }
    },
    "required": [
      "product_url",
      "product_name",
      "price",
      "product_images"
    ],
    "type": "object"
  }
}
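To show what the schema above constrains, here is a small sketch that checks a record against the inner "schema" object (descriptions omitted for brevity). It handles only the type names listed earlier plus "required", "properties", and "items"; it is not a full JSON Schema implementation, and a dedicated library such as jsonschema would be more thorough. The validate helper and the sample record are made up for illustration and say nothing about how the service itself validates output.

```python
# Checks for the schema type names used in the example above.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    expected = schema.get("type")
    if expected in TYPE_CHECKS and not TYPE_CHECKS[expected](value):
        return [f"{path}: expected {expected}"]
    errors = []
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required property")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif expected == "array" and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# The "schema" object from the example, without descriptions.
product_schema = {
    "type": "object",
    "properties": {
        "product_url": {"type": "string"},
        "product_name": {"type": "string"},
        "price": {"type": "number"},
        "product_images": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
    "required": ["product_url", "product_name", "price", "product_images"],
}

# A made-up record of the kind a structured extraction might return.
record = {
    "product_url": "https://example.com/p/1",
    "product_name": "Example Widget",
    "price": 19.99,
    "product_images": [{"url": "https://example.com/p/1.jpg"}],
}
print(validate(record, product_schema))
print(validate(dict(record, price="19.99"), product_schema))
```

A record whose price is a string rather than a number is reported as an error, which is the kind of inconsistency a schema is meant to catch.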
Response
200
application/json
Successful Response
The identifier for the scraping job
Example:
"f47ac10b-58cc-4372-a567-0e02b2c3d479"
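The example identifier has the shape of a UUID. Assuming job identifiers always take that form (the page only shows one example and does not say so), a client could sanity-check the value before storing it. The function name parse_job_id is hypothetical.

```python
import uuid


def parse_job_id(raw: str) -> uuid.UUID:
    """Parse a job identifier, raising ValueError if it is not a valid UUID string."""
    return uuid.UUID(raw)


job_id = parse_job_id("f47ac10b-58cc-4372-a567-0e02b2c3d479")
print(job_id)  # canonical lowercase form of the identifier
```

If identifiers turn out not to be UUIDs, treating them as opaque strings is the safer choice.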