Creates a vector dataset from a set of files

Description

Creates a new dataset from one or more uploaded source files.
During ingestion, AutoDrive DDF:

  1. Parses the file(s) (Markdown, TXT, PDF, CSV, Excel, JSON, …).
  2. Generates embeddings so the dataset becomes immediately searchable with natural-language questions.

The file upload itself is synchronous, but content processing is asynchronous.
The response includes a dataset_id (UUID). Poll GET /dataset/{dataset_id} until status is "success". If status becomes "failed", status_reason holds the error details.


HTTP Request

POST /upload

There are no path or query parameters.




Authentication

  • In your REST client (Postman/Insomnia), select Basic Auth and enter your Username and Password.
  • The client will automatically generate the Authorization: Basic … header for you. Do not add that header manually.
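
For clients that do not build the header for you (for example a script), the sketch below shows what the REST client does behind the scenes. It follows the standard HTTP Basic scheme; the helper name basic_auth_header is hypothetical and the credentials are the placeholders from the cURL example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic Auth."""
    # Basic Auth is the word "Basic" followed by base64("username:password").
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Placeholder credentials; replace with your own Username and Password.
header_value = basic_auth_header("admin", "mypassword")
print("Authorization:", header_value)
```

Base64 is an encoding, not encryption, so the header should only be sent over HTTPS.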

Request Headers

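Header       | Example             | Required | Notes
-------------|---------------------|----------|-------------------------------------
Content-Type | multipart/form-data |          | Each file is a separate files part.
Accept       | application/json    |          | Optional – guarantees JSON response.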

Request Body (multipart/form-data)

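Field | Type | Required | Description
------|------|----------|----------------------------------------------------------------------
files | file | ✓ (≥1)   | One or more documents. Use multiple files parts for multiple uploads.
name  | text |          | Human-readable dataset name (must be unique).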

Tip: keep individual files ≤ 10 MB for faster indexing.

Example (cURL)

curl -u admin:mypassword \
     -F "files=@User_Guide.md" \
     -F "[email protected]" \
     -F "name=platform_knowledge_base" \
     https://api.autodriveddf.example/upload
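
The same request could be assembled in Python. The sketch below uses only the standard library and builds the request without sending it. The helper name build_upload_request is hypothetical, the per-part content type application/octet-stream is an assumption, and the host is the placeholder from the cURL example.

```python
import base64
import io
import urllib.request
import uuid


def build_upload_request(url, name, files, username, password):
    """Build (but do not send) a multipart/form-data POST for /upload.

    files is a list of (filename, content_bytes) pairs; each pair becomes
    its own "files" part, mirroring the repeated -F "files=@..." flags.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    for filename, content in files:
        body.write(f"--{boundary}\r\n".encode())
        body.write(
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'.encode()
        )
        # Assumption: a generic content type is acceptable for every file part.
        body.write(b"Content-Type: application/octet-stream\r\n\r\n")
        body.write(content)
        body.write(b"\r\n")
    # The dataset name travels as an ordinary text field.
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="name"\r\n\r\n')
    body.write(name.encode("utf-8"))
    body.write(b"\r\n")
    body.write(f"--{boundary}--\r\n".encode())

    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body.getvalue(),
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Example with in-memory content; real code would read the files from disk.
request = build_upload_request(
    "https://api.autodriveddf.example/upload",
    "platform_knowledge_base",
    [("User_Guide.md", b"# User Guide\n")],
    "admin",
    "mypassword",
)
print(request.get_method(), request.full_url, len(request.data), "bytes")
```

Passing the result to urllib.request.urlopen would send it. A third-party HTTP library would shorten this considerably; the manual version is shown only to make the multipart layout visible.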

Success Response 201 Created

{
  "dataset_id": "c8004e22-87f7-441f-8302-80c934841196",
  "name": "platform_knowledge_base",
  "status": "processing",
  "status_reason": null,
  "id": 31,
  "customer_name": "dadosferaDemo",
  "instance_id": "78b9e8c0-a123-4e56-9f01-abcdef123456",
  "created_at": "2025-06-12T11:15:02.123Z",
  "updated_at": "2025-06-12T11:15:02.123Z"
}

Field         | Description
--------------|--------------------------------------------------------------
dataset_id    | Canonical UUID – use this in all later calls.
name          | Dataset name you supplied.
status        | processing → will change to success after indexing finishes.
status_reason | Error details if status = "failed"; otherwise null.
id            | Internal numeric identifier (rarely needed by API consumers).
created_at    | Upload timestamp (ISO-8601).
updated_at    | Equals created_at until further updates are made.
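
A client will usually keep only a few of these fields. The following is a minimal sketch of parsing the response body; UploadResult and parse_upload_response are hypothetical names, and the sample text is a shortened copy of the example response above.

```python
import json
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class UploadResult:
    dataset_id: uuid.UUID
    name: str
    status: str
    status_reason: Optional[str]
    created_at: datetime


def parse_upload_response(text: str) -> UploadResult:
    """Extract the fields most callers need from the 201 response body."""
    data = json.loads(text)
    return UploadResult(
        dataset_id=uuid.UUID(data["dataset_id"]),  # raises ValueError if malformed
        name=data["name"],
        status=data["status"],
        status_reason=data.get("status_reason"),
        # Replace the trailing "Z" so older Python versions can parse the timestamp.
        created_at=datetime.fromisoformat(data["created_at"].replace("Z", "+00:00")),
    )


sample = """{
  "dataset_id": "c8004e22-87f7-441f-8302-80c934841196",
  "name": "platform_knowledge_base",
  "status": "processing",
  "status_reason": null,
  "created_at": "2025-06-12T11:15:02.123Z"
}"""
result = parse_upload_response(sample)
print(result.dataset_id, result.status)
```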

Error Responses

Code | Reason            | When it happens                                         | Example Body
-----|-------------------|---------------------------------------------------------|---------------------------------------------
400  | Bad Request       | Missing files or name, unsupported type, malformed form | { "detail": "No files provided." }
401  | Unauthorized      | Wrong / missing Basic Auth credentials                  |
409  | Conflict          | Another dataset already uses the given name             | { "detail": "Dataset name already exists." }
413  | Payload Too Large | Combined upload exceeds server limit                    | { "detail": "Upload exceeds size limit." }
500  | Server Error      | Unexpected failure during file ingestion                | { "detail": "Unexpected error." }
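
One way a client might turn these responses into actionable messages is sketched below. The status codes and the detail field come from the table above; the suggested actions and the name explain_error are assumptions made for illustration, not documented behaviour.

```python
import json

# Suggested actions are assumptions; adjust them to your own workflow.
ACTIONS = {
    400: "fix the request (check the files and name fields and the file types)",
    401: "check the Basic Auth credentials",
    409: "choose a different dataset name",
    413: "upload fewer or smaller files",
    500: "retry later or contact support",
}


def explain_error(status: int, body: str) -> str:
    """Combine the server's detail message, if any, with a suggested action."""
    detail = None
    try:
        parsed = json.loads(body) if body else None
        if isinstance(parsed, dict):
            detail = parsed.get("detail")
    except json.JSONDecodeError:
        pass  # Non-JSON body: fall back to the status code alone.
    action = ACTIONS.get(status, "unexpected status; inspect the response")
    prefix = f"{status}: {detail}" if detail else str(status)
    return f"{prefix} -> {action}"


print(explain_error(409, '{"detail": "Dataset name already exists."}'))
print(explain_error(401, ""))
```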

Follow-up Poll

Continue polling GET /dataset/{dataset_id}.
When status switches to "success", the dataset is ready for questions (/question or /ai_question).
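
The polling loop can be sketched as follows. It assumes that GET /dataset/{dataset_id} returns a JSON object with the same status and status_reason fields as the upload response; the function name wait_until_ready, the 2-second interval, and the 5-minute timeout are illustrative choices. The HTTP call is passed in as fetch_status so the loop stays independent of any particular HTTP client.

```python
import time


def wait_until_ready(fetch_status, dataset_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the dataset reaches "success"; raise on "failed" or timeout.

    fetch_status(dataset_id) must return the decoded JSON body of
    GET /dataset/{dataset_id} as a dict (assumed shape, see above).
    """
    deadline = time.monotonic() + timeout
    while True:
        info = fetch_status(dataset_id)
        status = info.get("status")
        if status == "success":
            return info
        if status == "failed":
            raise RuntimeError(f"Indexing failed: {info.get('status_reason')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Dataset {dataset_id} still {status!r} after {timeout}s")
        sleep(interval)


# Simulated server: two "processing" answers, then "success".
_answers = iter([{"status": "processing"}, {"status": "processing"}, {"status": "success"}])
ready = wait_until_ready(lambda _id: next(_answers), "example-id", sleep=lambda s: None)
print(ready["status"])
```

A fixed interval keeps the example short; a production client might prefer a growing delay between polls to reduce load while large files are indexed.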
