Ask a question of an existing vector dataset. Returns the dataset_id and a question_id, which can be used to track the question through the endpoint GET /dataset/{dataset_id}/question/{question_id}.
Base URL
Use the URL of the Autodrive instance that is being used, usually following the pattern: https://app-intelligence-{deployment_customers}.dadosfera.ai/service-auto-drive-api-{instance_id}
How to Find the Base URL
Method 1: GitHub Access
- Access the GitHub repository:
  - Open the deploy configuration file in the project's GitHub repository: https://github.com/dadosfera/demo-generative-document-analyzer-deep-lake/blob/main/prd.azure.deploy_config.yaml
- Look for the desired Autodrive instance and customer:
  - Check that deploy_api is true (if it is false, the instance cannot be accessed through the API).
  - Get the instance_id and the deployment_customers values.

data_app-9999-9999-9999-9999-9999-99999999999:
  instance_id: 9999-9999-9999-9999-9999-9999999999
  deployment_customers:
    - dadosferademo
  app_tier: professional
  language: pt
  temp_cluster: false
  tags:
    - "deployment_env: prd"
    - "version: 1.0.0"
    - "lifecycle: live"
    - "app_tier: professional"
    - "dataapp: autodrive"
    - "language: pt-br"
  logger_level: INFO
  deploy_api: 'true'

The base URL will be https://app-intelligence-{deployment_customers}.dadosfera.ai/service-auto-drive-api-{instance_id}
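The substitution into the URL pattern can be sketched in Python as below. The function name build_base_url is hypothetical and not part of any Autodrive library; the values used are the placeholders from the sample configuration entry.

```python
def build_base_url(deployment_customer: str, instance_id: str) -> str:
    """Fill the documented base URL pattern with values from the deploy config.

    Hypothetical helper for illustration only.
    """
    customer = deployment_customer.strip()
    instance = instance_id.strip()
    if not customer or not instance:
        raise ValueError("deployment_customer and instance_id must be non-empty")
    return (
        f"https://app-intelligence-{customer}.dadosfera.ai"
        f"/service-auto-drive-api-{instance}"
    )


# Placeholder values copied from the sample configuration entry
print(build_base_url("dadosferademo", "9999-9999-9999-9999-9999-9999999999"))
```

If deployment_customers lists more than one customer, each one presumably yields its own base URL; call the helper once per entry.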
Method 2: Access to an Autodrive Instance
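- Check the browser's address: when the Autodrive instance is open, the browser address looks like https://app-intelligence-{deployment_customers}.dadosfera.ai/service-auto-drive-{instance_id}. Take deployment_customers and instance_id from it and insert them into the base URL pattern above.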
Not every Autodrive instance has the API functionality available. Please contact [email protected] to confirm whether your Autodrive instance includes this feature.
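As a rough illustration of Method 2, the API base URL can be derived from the browser address by swapping the service-auto-drive- path prefix for service-auto-drive-api-. This is a minimal sketch that assumes both URL patterns hold exactly as documented; api_base_url_from_browser is a hypothetical name.

```python
from urllib.parse import urlsplit

APP_PREFIX = "service-auto-drive-"      # prefix seen in the browser address
API_PREFIX = "service-auto-drive-api-"  # prefix used by the API base URL


def api_base_url_from_browser(address: str) -> str:
    """Derive the API base URL from an Autodrive browser address (hypothetical helper)."""
    parts = urlsplit(address)
    segments = [segment for segment in parts.path.split("/") if segment]
    if not segments or not segments[0].startswith(APP_PREFIX):
        raise ValueError("address does not look like an Autodrive instance URL")
    first = segments[0]
    # Accept an address that already points at the API and leave its id untouched
    prefix = API_PREFIX if first.startswith(API_PREFIX) else APP_PREFIX
    instance_id = first[len(prefix):]
    return f"{parts.scheme}://{parts.netloc}/{API_PREFIX}{instance_id}"


print(api_base_url_from_browser(
    "https://app-intelligence-dadosferademo.dadosfera.ai"
    "/service-auto-drive-9999-9999-9999-9999-9999-9999999999"
))
```

A derived URL only helps if the API is enabled for that instance, so the confirmation step above still applies.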
Endpoint
The endpoint is POST <base url>/dataset/{dataset_id}/question
Headers
- Authorization: Basic <base64_encoded_credentials>
Parameters
- dataset_id (str, required): Unique identifier of the dataset to which the question will be directed.
- question (str, required): The string containing the question.
- metadata_filter (dict, optional): A dictionary containing filters to adjust the search based on the dataset's metadata. Example:
{"category": "finance"}
- distance_metric (str, optional): Defines the distance metric used to calculate relevance. Accepted values:
- "cos" (default): Cosine similarity.
- "L2": Euclidean distance.
- "L1": Manhattan distance.
- "max": Maximum distance.
- "dot": Dot product.
- maximize_marginal_relevance (bool, optional): If True, uses marginal relevance maximization to optimize the diversity of the results. The default value is True.
- fetch_k (int, optional): The number of possible answers to be retrieved from the dataset. The default is 10.
- k (int, optional): The number of final answers that will be returned to the user. The default is 3.
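The parameters above can be collected into a request body with a small helper such as the one below. This is a sketch rather than an official client: build_question_payload is a hypothetical name, the validation is done on the client side only, and the rule that k must not exceed fetch_k is an assumption based on the parameter descriptions, not a documented constraint. dataset_id is left out because it travels in the URL path, not in the body.

```python
from typing import Any, Dict, Optional

# Accepted values of distance_metric, as listed in the parameter description
ACCEPTED_METRICS = {"cos", "L2", "L1", "max", "dot"}


def build_question_payload(
    question: str,
    metadata_filter: Optional[Dict[str, Any]] = None,
    distance_metric: str = "cos",
    maximize_marginal_relevance: bool = True,
    fetch_k: int = 10,
    k: int = 3,
) -> Dict[str, Any]:
    """Build the JSON body for the question endpoint using the documented defaults."""
    if not question.strip():
        raise ValueError("question must be a non-empty string")
    if distance_metric not in ACCEPTED_METRICS:
        raise ValueError(f"distance_metric must be one of {sorted(ACCEPTED_METRICS)}")
    # Assumption: the k final answers are chosen among the fetch_k candidates
    if k < 1 or fetch_k < k:
        raise ValueError("expected 1 <= k <= fetch_k")
    payload: Dict[str, Any] = {
        "question": question,
        "distance_metric": distance_metric,
        "maximize_marginal_relevance": maximize_marginal_relevance,
        "fetch_k": fetch_k,
        "k": k,
    }
    # metadata_filter is optional, so it is only sent when provided
    if metadata_filter is not None:
        payload["metadata_filter"] = metadata_filter
    return payload


print(build_question_payload("What is the revenue for Q1 2024?",
                             metadata_filter={"category": "finance"}))
```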
API key
To obtain the api_key, you need the instance_id of Autodrive, which can be found directly in the URL, as described in the "Base URL" section.
To generate the API key, we perform Base64 encoding:
Authorization: Basic <Base64_encode(admin:instance_id)>
Code example to get the api_key
import base64
username = 'admin'
instance_id = '9999-9999-9999-9999-9999-9999999999'
credentials = f"{username}:{instance_id}"
# Encode the credentials in Base64
credentials_bytes = credentials.encode('utf-8')
base64_bytes = base64.b64encode(credentials_bytes)
# Convert back to a string
base64_credentials = base64_bytes.decode('utf-8')
api_key = base64_credentials
print(api_key)
Example Request
import requests
import json

# Define the parameters
base_url = "https://api.example.com"  # Base URL of the API
dataset_id = "abc123"  # Unique identifier returned when the vector dataset was created
question = "What is the revenue for Q1 2024?"  # The question being asked
metadata_filter = {"category": "finance"}  # Optional filter on the dataset's metadata
api_key = "<base64_encoded_credentials>"  # Base64 of "admin:<instance_id>" (see "API key")

# Headers including authentication
headers = {
    'Authorization': f"Basic {api_key}",  # Basic authentication with the encoded credentials
    'Content-Type': 'application/json'  # Specify that the content is in JSON format
}

# Request body
data = {
    "question": question,
    "metadata_filter": metadata_filter,
    "distance_metric": "cos",  # Cosine similarity metric
    "maximize_marginal_relevance": True,  # Optimize for diversity in results
    "fetch_k": 10,  # Number of possible answers to retrieve
    "k": 3  # Number of final answers to return
}

# Make the POST request
response = requests.post(
    url=f"{base_url}/dataset/{dataset_id}/question",  # API endpoint for submitting a question
    headers=headers,
    data=json.dumps(data)  # Convert Python dictionary to JSON
)
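Since the call returns the dataset_id and a question_id, a follow-up request to GET /dataset/{dataset_id}/question/{question_id} can track the question. The sketch below uses only the Python standard library so that it runs without extra dependencies; the same call can be made with requests. The helper names are hypothetical, and it is an assumption that the POST response body is a JSON object with top-level dataset_id and question_id keys and that the tracking endpoint also answers with JSON.

```python
import json
import urllib.request


def question_status_url(base_url: str, dataset_id: str, question_id: str) -> str:
    """Build the tracking URL GET /dataset/{dataset_id}/question/{question_id}."""
    return f"{base_url.rstrip('/')}/dataset/{dataset_id}/question/{question_id}"


def ids_from_post_body(body: dict) -> tuple:
    """Read the identifiers from the POST response body.

    Assumption: both are top-level keys of the JSON object.
    """
    return body["dataset_id"], body["question_id"]


def fetch_question(base_url: str, dataset_id: str, question_id: str,
                   api_key: str, timeout: float = 30.0) -> dict:
    """Call the tracking endpoint and return its body, assumed to be JSON."""
    request = urllib.request.Request(
        question_status_url(base_url, dataset_id, question_id),
        headers={"Authorization": f"Basic {api_key}"},
        method="GET",
    )
    # urlopen raises urllib.error.HTTPError on 4xx and 5xx responses
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example with a made-up response body; no network call is made here
dataset_id, question_id = ids_from_post_body({"dataset_id": "abc123", "question_id": "q-1"})
print(question_status_url("https://api.example.com", dataset_id, question_id))
```

With the response of the POST request in hand, the identifiers could be taken from response.json() and passed to fetch_question together with the same base URL and api_key.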