Creates a vector dataset from a set of files

Description

Create a new dataset by uploading one or more source files.
During ingestion, AutoDrive DDF does the following:

Parses the file(s) (Markdown, TXT, PDF, CSV, Excel, JSON, …).
Generates embeddings so the dataset becomes immediately searchable with natural-language questions.

The file upload itself is synchronous, but content processing is asynchronous.
You receive a dataset_id (UUID). Poll GET /dataset/{dataset_id} until status = "success".

HTTP Request

POST /upload

There are no path or query parameters.

Authentication

In your REST client (Postman/Insomnia), select Basic Auth and enter your Username and Password.
The client will automatically generate the Authorization: Basic … header for you. Do not add that header manually.
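
When calling the API from a script rather than a REST client, nothing builds the header for you. The Basic scheme is the Base64 encoding of `username:password`, so a minimal Python sketch (standard library only) could look like this. The helper name `basic_auth_header` is hypothetical, and the credentials are the placeholder values used in the cURL example on this page.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Return the Authorization header for HTTP Basic authentication."""
    # Basic auth is "Basic " + base64("username:password").
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, not real ones.
headers = basic_auth_header("admin", "mypassword")
```

Base64 is an encoding, not encryption, so the credentials are only protected when the request goes over HTTPS.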

Request Headers

Header	Example	Required	Notes
Content-Type	`multipart/form-data`	✓	Each file is a separate `files` part.
Accept	`application/json`	—	Optional – guarantees JSON response.

Request Body (multipart/form-data)

Field	Type	Required	Description
files	file	✓ (≥1)	One or more documents. Use multiple `files` parts for multiple uploads.
name	text	✓	Human-readable dataset name (must be unique).

Tip: keep individual files ≤ 10 MB for faster indexing.
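
A client can check file sizes before uploading. The sketch below is one way to do that in Python. It assumes "10 MB" means 10 × 1024 × 1024 bytes, which this page does not specify, and `oversized` is a hypothetical helper name.

```python
import os

# Assumption: 10 MB is read as 10 MiB; adjust if the server counts differently.
MAX_RECOMMENDED_BYTES = 10 * 1024 * 1024


def oversized(paths, limit=MAX_RECOMMENDED_BYTES):
    """Return the paths whose size on disk exceeds the recommended limit."""
    return [path for path in paths if os.path.getsize(path) > limit]
```

Files returned by `oversized` are not necessarily rejected by the server; the tip only concerns indexing speed.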

Example (cURL)

curl -u admin:mypassword \
     -F "files=@User_Guide.md" \
     -F "files=@<second_file>" \
     -F "name=platform_knowledge_base" \
     https://api.autodriveddf.example/upload
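
The same request can be built without cURL. The following Python sketch uses only the standard library and encodes the form by hand, which makes the "one `files` part per document" rule visible. The helper names are hypothetical, the `application/octet-stream` part type is an assumption (this page does not say which part content types the server expects), and filenames are assumed not to contain double quotes.

```python
import base64
import urllib.request
import uuid


def encode_multipart(name, files):
    """Encode the `name` field plus one `files` part per document.

    `files` maps a filename to its raw bytes. Returns (body, content_type).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for filename, content in files.items():
        lines += [
            f"--{boundary}".encode("ascii"),
            f'Content-Disposition: form-data; name="files"; filename="{filename}"'.encode("utf-8"),
            b"Content-Type: application/octet-stream",  # assumption, see lead-in
            b"",
            content,
        ]
    lines += [
        f"--{boundary}".encode("ascii"),
        b'Content-Disposition: form-data; name="name"',
        b"",
        name.encode("utf-8"),
        f"--{boundary}--".encode("ascii"),  # closing boundary
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url, name, files, username, password):
    """Build (but do not send) the POST /upload request."""
    body, content_type = encode_multipart(name, files)
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/upload",
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder host, credentials, and file content.
request = build_upload_request(
    "https://api.autodriveddf.example",
    "platform_knowledge_base",
    {"User_Guide.md": b"# User Guide\n"},
    "admin",
    "mypassword",
)
```

Constructing the `Request` object does not contact the server. Passing it to `urllib.request.urlopen()` would perform the upload.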

Success Response `201 Created`

{
  "dataset_id": "c8004e22-87f7-441f-8302-80c934841196",
  "name": "platform_knowledge_base",
  "status": "processing",
  "status_reason": null,
  "id": 31,
  "customer_name": "dadosferaDemo",
  "instance_id": "78b9e8c0-a123-4e56-9f01-abcdef123456",
  "created_at": "2025-06-12T11:15:02.123Z",
  "updated_at": "2025-06-12T11:15:02.123Z"
}

Field	Description
`dataset_id`	Canonical UUID – use this in all later calls.
`name`	Dataset name you supplied.
`status`	`processing` right after upload; changes to `success` once indexing finishes, or to `failed` if ingestion fails.
`status_reason`	Error details if `status = "failed"`; otherwise `null`.
`id`	Internal numeric identifier (rarely needed by API consumers).
`created_at`	Upload timestamp (ISO-8601).
`updated_at`	Equals `created_at` until further updates are made.
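
On the client side, most code only needs a few of these fields. The Python sketch below parses the example response shown above into a small record. The `DatasetInfo` class and `parse_dataset` function are hypothetical names, and `customer_name` and `instance_id` are left out because this page does not describe them.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetInfo:
    dataset_id: str               # canonical UUID for later calls
    name: str
    status: str                   # "processing", "success", or "failed"
    status_reason: Optional[str]  # error details when status is "failed"


def parse_dataset(payload: str) -> DatasetInfo:
    """Extract the commonly used fields from an upload response body."""
    data = json.loads(payload)
    return DatasetInfo(
        dataset_id=data["dataset_id"],
        name=data["name"],
        status=data["status"],
        status_reason=data.get("status_reason"),
    )


# Trimmed copy of the example response above.
example = """
{
  "dataset_id": "c8004e22-87f7-441f-8302-80c934841196",
  "name": "platform_knowledge_base",
  "status": "processing",
  "status_reason": null,
  "id": 31
}
"""
info = parse_dataset(example)
```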

Error Responses

Code	Reason	When it happens	Example Body
400	Bad Request	Missing `files` or `name`, unsupported type, malformed form	`{ "detail": "No files provided." }`
401	Unauthorized	Wrong / missing Basic Auth credentials	—
409	Conflict	Another dataset already uses the given `name`	`{ "detail": "Dataset name already exists." }`
413	Payload Too Large	Combined upload exceeds server limit	`{ "detail": "Upload exceeds size limit." }`
500	Server Error	Unexpected failure during file ingestion	`{ "detail": "Unexpected error." }`
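
A client can turn these responses into readable messages by combining the status code with the `detail` field when one is present. The following Python sketch assumes error bodies follow the `{ "detail": ... }` shape shown in the table and tolerates a missing or non-JSON body, as with 401. `describe_error` is a hypothetical helper name.

```python
import json

# Codes and reasons copied from the table above.
KNOWN_ERRORS = {
    400: "Bad Request",
    401: "Unauthorized",
    409: "Conflict",
    413: "Payload Too Large",
    500: "Server Error",
}


def describe_error(status_code: int, body: str = "") -> str:
    """Build a one-line description from a status code and response body."""
    reason = KNOWN_ERRORS.get(status_code, "Unexpected status")
    detail = ""
    try:
        parsed = json.loads(body) if body else {}
        if isinstance(parsed, dict):
            detail = str(parsed.get("detail", ""))
    except json.JSONDecodeError:
        pass  # body was not JSON; fall back to the reason alone
    return f"{status_code} {reason}: {detail}" if detail else f"{status_code} {reason}"
```

For example, `describe_error(409, '{ "detail": "Dataset name already exists." }')` yields `409 Conflict: Dataset name already exists.`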

Follow-up Poll

Continue polling GET /dataset/{dataset_id}.
When status switches to "success", the dataset is ready for questions (/question or /ai_question).
If status switches to "failed", stop polling and check status_reason for the error details.
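
The polling step can be sketched as a small loop. In the Python example below, `fetch_status` is a hypothetical callable supplied by the caller that performs GET /dataset/{dataset_id} and returns the decoded JSON as a dict. The 2-second interval and 300-second timeout are illustrative assumptions, not values documented here.

```python
import time


def wait_until_ready(fetch_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until status is "success"; raise on "failed" or on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        dataset = fetch_status()
        status = dataset.get("status")
        if status == "success":
            return dataset
        if status == "failed":
            # status_reason carries the error details for failed ingestions.
            raise RuntimeError(f"Ingestion failed: {dataset.get('status_reason')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Dataset is still processing after the timeout")
        sleep(interval)
```

Passing the fetch logic and the `sleep` function in as arguments keeps the loop independent of any HTTP library and makes it easy to exercise with canned responses.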