Create an extraction

POST /v1/extractor/extractions
curl --request POST \
--url http://localhost:8080/v1/extractor/extractions \
--header 'Content-Type: application/json' \
--header 'x-api-key: <x-api-key>' \
--data '{ "type": "article", "format": "html", "content": { "additionalProperty": "example" }, "url": "https://example.com", "domain": "example", "webhookUrl": "https://example.com", "webhookEventTypes": [ "done" ], "forceRefresh": true, "options": { "useLlm": true, "llmModel": "example", "llmPromptOverride": "example", "llmMode": "fill-missing" } }'

Sanitizes the content, computes a cacheKey, then looks up an existing extraction or inserts a new one. Codes: 200 cache_hit on a cache hit (the sibling field cache.hit is true); 202 queued when a cache miss is enqueued.
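As a rough illustration of how a client might branch on the two success codes described above, here is a minimal Python sketch. The names `CreateOutcome` and `classify_create_response` are hypothetical helpers, not part of the API; the sketch assumes only that the caller already has the HTTP status and the decoded JSON body.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreateOutcome:
    outcome: str                    # "cache_hit" or "queued"
    extraction_id: Optional[str]    # data.id, if present in the body
    cache_hit_flag: Optional[bool]  # sibling cache.hit, if present in the body


def classify_create_response(http_status: int, body: dict) -> CreateOutcome:
    """Map a successful create-extraction response to an outcome label."""
    if http_status == 200:
        outcome = "cache_hit"   # existing result found for the cacheKey
    elif http_status == 202:
        outcome = "queued"      # cache miss, extraction was enqueued
    else:
        raise ValueError(f"not a success status for this endpoint: {http_status}")
    data = body.get("data") or {}
    cache = body.get("cache") or {}   # 'cache' is an optional sibling of 'data'
    return CreateOutcome(outcome, data.get("id"), cache.get("hit"))
```

The status code drives the decision here; `cache.hit` is only surfaced as a cross-check because the `cache` object is not marked as required.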

Media type application/json
object
  type (required): string. Allowed values: article, product
  format: string, default: html. Allowed values: html, markdown, text, json
  content (required): any of: string
  url: string, format: uri
  domain: string
  webhookUrl: string, format: uri
  webhookEventTypes: Array<string>. Allowed values: done, failed, dead
  forceRefresh: boolean
  options: object
    useLlm: boolean
    llmModel: string
    llmPromptOverride: string, <= 8000 characters
    llmMode: string. Allowed values: fill-missing, correct
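The constraints in the request schema above can be checked on the client before sending. The following Python sketch is one possible pre-flight check; `validate_extraction_request` is a hypothetical helper, and it only covers the constraints listed in this schema (it does not reproduce any server-side sanitization or URL safety checks).

```python
from typing import List

ALLOWED_TYPES = {"article", "product"}
ALLOWED_FORMATS = {"html", "markdown", "text", "json"}
ALLOWED_EVENTS = {"done", "failed", "dead"}
ALLOWED_LLM_MODES = {"fill-missing", "correct"}
MAX_PROMPT_CHARS = 8000


def validate_extraction_request(payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    # 'type' and 'content' are the only required fields.
    if payload.get("type") not in ALLOWED_TYPES:
        problems.append("type must be one of: article, product")
    if "content" not in payload:
        problems.append("content is required")
    # 'format' defaults to html when omitted.
    if payload.get("format", "html") not in ALLOWED_FORMATS:
        problems.append("format must be one of: html, markdown, text, json")
    bad_events = [e for e in payload.get("webhookEventTypes", [])
                  if e not in ALLOWED_EVENTS]
    if bad_events:
        problems.append(f"unknown webhookEventTypes: {bad_events}")
    options = payload.get("options") or {}
    mode = options.get("llmMode")
    if mode is not None and mode not in ALLOWED_LLM_MODES:
        problems.append("options.llmMode must be fill-missing or correct")
    prompt = options.get("llmPromptOverride")
    if isinstance(prompt, str) and len(prompt) > MAX_PROMPT_CHARS:
        problems.append("options.llmPromptOverride exceeds 8000 characters")
    return problems
```

For example, `validate_extraction_request({"type": "article", "content": "<p>hi</p>"})` returns an empty list, while omitting `content` adds a "content is required" entry.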

Cache hit.

Media type application/json
object
  status (required): string. Allowed value: success
  code (required): string
  data (required): object
    id (required): string
    type (required): string. Allowed values: article, product
    status (required): string. Allowed values: queued, claimed, done, failed, dead, cancelled
    url (required): string | null
    domainId (required): string | null
    tenantId (required): string
    inputFormat (required): string. Allowed values: html, markdown, text, json
    data: null
    metadata: null
    errorMessage: string | null
    errorClass: string | null. Allowed values: parse_error, sanitize_fail, llm_request, llm_validation, llm_timeout, llm_unavailable, invalid_input, input_stash_lost, cancelled_by_admin, unknown
    attempts: integer (0)
    cacheHits: integer (0)
    recycleCount: integer (0)
    createdAt (required): string, format: date-time
    finishedAt (required): string | null, format: date-time
  cache: object
    hit (required): boolean
    key: string
    ageSeconds: integer
    expiresAt: string, format: date-time
  timing: object
    totalMs (required): integer
    dbMs: integer
    externalMs: integer
  deprecation: object
    sunset (required): string, format: date-time
    successor: string
    note: string
Example
{
  "status": "success",
  "data": {
    "type": "article",
    "status": "queued",
    "inputFormat": "html",
    "errorClass": "parse_error",
    "attempts": 0,
    "cacheHits": 0,
    "recycleCount": 0
  }
}

Cache miss enqueued.

Media type application/json
object
  status (required): string. Allowed value: success
  code (required): string
  data (required): object
    id (required): string
    type (required): string. Allowed values: article, product
    status (required): string. Allowed values: queued, claimed, done, failed, dead, cancelled
    url (required): string | null
    domainId (required): string | null
    tenantId (required): string
    inputFormat (required): string. Allowed values: html, markdown, text, json
    data: null
    metadata: null
    errorMessage: string | null
    errorClass: string | null. Allowed values: parse_error, sanitize_fail, llm_request, llm_validation, llm_timeout, llm_unavailable, invalid_input, input_stash_lost, cancelled_by_admin, unknown
    attempts: integer (0)
    cacheHits: integer (0)
    recycleCount: integer (0)
    createdAt (required): string, format: date-time
    finishedAt (required): string | null, format: date-time
  cache: object
    hit (required): boolean
    key: string
    ageSeconds: integer
    expiresAt: string, format: date-time
  timing: object
    totalMs (required): integer
    dbMs: integer
    externalMs: integer
  deprecation: object
    sunset (required): string, format: date-time
    successor: string
    note: string
Example
{
  "status": "success",
  "data": {
    "type": "article",
    "status": "queued",
    "inputFormat": "html",
    "errorClass": "parse_error",
    "attempts": 0,
    "cacheHits": 0,
    "recycleCount": 0
  }
}
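Both success responses share the same envelope, so a client could reduce either one to the few fields it usually needs. The Python sketch below is illustrative only: `summarize_extraction` is a hypothetical helper, and treating a non-null `finishedAt` as "finished" is an assumption, not something this page states.

```python
def summarize_extraction(envelope: dict) -> dict:
    """Pick the commonly needed fields out of a success envelope."""
    data = envelope["data"]                       # required object
    cache = envelope.get("cache") or {}           # optional sibling of data
    timing = envelope.get("timing") or {}         # optional sibling of data
    deprecation = envelope.get("deprecation") or {}
    return {
        "id": data["id"],                         # required by the schema
        "status": data["status"],                 # queued, claimed, done, ...
        # Assumption: a non-null finishedAt means the job has finished.
        "finished": data.get("finishedAt") is not None,
        "cache_hit": bool(cache.get("hit", False)),
        "total_ms": timing.get("totalMs"),
        "sunset": deprecation.get("sunset"),      # set only if deprecated
    }
```

Optional objects are read with `.get(...) or {}` so that a response without `cache`, `timing`, or `deprecation` does not raise.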

Malformed request (validation_error, invalid_idempotency_key, invalid_sort_field, invalid_filter).

Media type application/json
object
status
required
string
Allowed value: error
code
required
string
error
required
object
message
string
requestId
required
string
details
Array<object>
object
path
string
code
string
message
string
key
additional properties
Example
{
"status": "error"
}
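To show how the error envelope might be turned into readable messages, here is a small Python sketch. `format_error` is a hypothetical helper. The flattened schema above does not make the nesting of `details` unambiguous, so the sketch assumes it sits inside `error` and falls back to the top level.

```python
from typing import List


def format_error(envelope: dict) -> List[str]:
    """Render an error envelope as a header line plus one line per detail."""
    error = envelope.get("error") or {}
    code = envelope.get("code", "unknown")
    message = error.get("message")
    lines = [f"{code}: {message}" if message else code]
    # Assumption: details is nested under 'error'; fall back to the top level.
    details = error.get("details") or envelope.get("details") or []
    for item in details:
        path = item.get("path", "?")
        lines.append(f"  {path}: {item.get('message', '')} [{item.get('code', '')}]")
    return lines
```

Every field except `status`, `code`, `error`, and `requestId` is optional in the schema, which is why each lookup uses a default.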

Missing or invalid authentication.

Media type application/json
object
status
required
string
Allowed value: error
code
required
string
error
required
object
message
string
requestId
required
string
details
Array<object>
object
path
string
code
string
message
string
key
additional properties
Example
{
"status": "error"
}

Insufficient scope (forbidden, no_active_plan, service_disabled_on_plan).

Media type application/json
object
status
required
string
Allowed value: error
code
required
string
error
required
object
message
string
requestId
required
string
details
Array<object>
object
path
string
code
string
message
string
key
additional properties
Example
{
"status": "error"
}

Payload too large.

Media type application/json
object
status
required
string
Allowed value: error
code
required
string
error
required
object
message
string
requestId
required
string
details
Array<object>
object
path
string
code
string
message
string
key
additional properties
Example
{
"status": "error"
}

Business validation failed (unsafe_url, invalid_bulk_body).

Media type application/json
object
status
required
string
Allowed value: error
code
required
string
error
required
object
message
string
requestId
required
string
details
Array<object>
object
path
string
code
string
message
string
key
additional properties
Example
{
"status": "error"
}

Rate limit exceeded. A Retry-After header is returned.

Media type application/json
object
status
required
string
Allowed value: error
code
required
string
error
required
object
message
string
requestId
required
string
details
Array<object>
object
path
string
code
string
message
string
key
additional properties
Example
{
"status": "error"
}
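Since the rate-limit response carries a Retry-After header, a client can derive its wait time from it. The sketch below is a generic Python parser for that header, which per the HTTP specification may hold either a number of seconds or an HTTP date; this page does not say which form the service sends, so both are handled. `retry_after_seconds` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return how long to wait, in seconds, or None if the header is unusable."""
    if not header_value:
        return None
    value = header_value.strip()
    if value.isdecimal():
        # Delta-seconds form, e.g. "120".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the caller may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

A caller would typically sleep for the returned value before retrying, and fall back to its own backoff policy when the function returns None.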