Crawl

curl --request POST \
  --url https://api.datafuel.dev/crawl \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "url": "<string>",
  "ai_prompt": "<string>",
  "json_schema": {
    "description": "Schema for capturing product information",
    "name": "Product Schema",
    "schema": {
      "properties": {
        "product_url": {
          "description": "The URL of the specific product",
          "type": "string"
        },
        "product_name": {
          "description": "The name of the specific product",
          "type": "string"
        },
        "price": {
          "description": "The price of the product",
          "type": "number"
        },
        "product_images": {
          "description": "List of product image URLs",
          "items": {
            "properties": {
              "url": {
                "description": "URL of the product image",
                "type": "string"
              }
            },
            "required": [
              "url"
            ],
            "type": "object"
          },
          "type": "array"
        }
      },
      "required": [
        "product_url",
        "product_name",
        "price",
        "product_images"
      ],
      "type": "object"
    }
  },
  "javascript_scenario": [
    {}
  ],
  "depth": 1,
  "limit": 1,
  "exclusion_pattern": "https://.*datafuel\\.dev/blog/.*",
  "excluded_links": "https://www.datafuel.dev/pricing,https://www.datafuel.dev/blog"
}'

{
  "job_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}
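
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_crawl_request` and `submit_crawl` are hypothetical, and the sketch assumes that fields other than `url` may be omitted from the body, which this page does not confirm.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
CRAWL_URL = "https://api.datafuel.dev/crawl"


def build_crawl_request(token, url, ai_prompt=None, depth=1, limit=1):
    """Build (but do not send) a POST request for the crawl endpoint.

    Hypothetical helper. Assumption: optional fields can be left out.
    """
    payload = {"url": url, "depth": depth, "limit": limit}
    if ai_prompt is not None:
        payload["ai_prompt"] = ai_prompt
    return urllib.request.Request(
        CRAWL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_crawl(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the headers and body before any network call is made.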

POST /crawl

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

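application/json

The request body is a JSON object. The example request above uses the fields url, ai_prompt, json_schema, javascript_scenario, depth, limit, exclusion_pattern, and excluded_links.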
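
To preview which URLs the `exclusion_pattern` and `excluded_links` values from the example request would cover, the check can be approximated locally. This is a sketch under assumptions: this page does not say how the service applies the pattern, so full-string regex matching and exact comparison against the comma-separated list are guesses, and `is_excluded` is a hypothetical helper.

```python
import re

# Values from the example request. In the JSON body the backslash is
# written as "\\.", which decodes to the regex escape "\." used here.
EXCLUSION_PATTERN = r"https://.*datafuel\.dev/blog/.*"
EXCLUDED_LINKS = "https://www.datafuel.dev/pricing,https://www.datafuel.dev/blog"


def is_excluded(url, pattern=EXCLUSION_PATTERN, excluded_links=EXCLUDED_LINKS):
    """Return True if url is listed explicitly or matches the pattern.

    Assumption: the whole URL must match the pattern (re.fullmatch) and
    excluded_links is a comma-separated list of exact URLs.
    """
    explicit = {link.strip() for link in excluded_links.split(",") if link.strip()}
    if url in explicit:
        return True
    return re.fullmatch(pattern, url) is not None
```

Under these assumptions, the pattern alone does not cover `https://www.datafuel.dev/blog` because it requires a trailing `/blog/`; that URL is caught only by the explicit list.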

Response

200 (application/json)

Successful Response

The response is a JSON object. In the example above it contains a single job_id field.
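
A client will usually want to pull `job_id` out of the response body and fail clearly if it is absent. The following is a minimal sketch; `extract_job_id` is a hypothetical helper, and the only thing it assumes about the response is the `job_id` string field shown in the example.

```python
import json


def extract_job_id(body):
    """Parse a crawl response body (JSON text) and return its job_id.

    Raises ValueError if the body is not a JSON object carrying a
    string job_id, the only field shown in the example response.
    """
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    job_id = data.get("job_id")
    if not isinstance(job_id, str) or not job_id:
        raise ValueError("response has no job_id")
    return job_id
```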
