GET /v1/datasets/{dataset_id}
Get Dataset
curl --request GET \
  --url https://api.duplik.cn/v1/datasets/{dataset_id} \
  --header 'Authorization: Bearer <token>'
{
  "created_at": 123,
  "updated_at": 123,
  "type": 1,
  "name": "<string>",
  "description": "",
  "advanced_enabled": 2,
  "score_threshold": 0.5,
  "top_k": 5,
  "generate_metadata_enabled": true,
  "generate_metadata_prompt": "<string>",
  "overall_summarize_enabled": true,
  "overall_summarize_prompt": "<string>",
  "smart_indexing_enabled": true,
  "smart_indexing_prompt": "<string>",
  "retrieval_filter_entity_enabled": true,
  "retrieval_filter_period_enabled": true,
  "retrieval_recall_strategy_ids": [],
  "retrieval_context_mode": 1,
  "split_media_enabled": true,
  "chunk_size": 512,
  "chunk_overlap": 102,
  "qa_pairs_return_direct": false,
  "qa_pairs_score_threshold": 123,
  "generate_qa_pairs_enabled": true,
  "generate_qa_pairs_prompt": "<string>",
  "refine_user_question_enabled": false,
  "refine_user_question_prompt": "<string>",
  "pdf_parsing_mode": 1,
  "audio_parsing_mode": 1,
  "status": 1,
  "dataset_id": "<string>",
  "filter_attributes": [],
  "retrieval_segment_expansion": {
    "pre_count": 123,
    "next_count": 123
  },
  "retrieval_chunk_expansion": {
    "pre_count": 123,
    "next_count": 123
  },
  "document_count": 0
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
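The same request can be assembled from Python. The following is a minimal sketch using only the standard library; it builds the request without sending it. The helper name `build_get_dataset_request` and the placeholder ID and token are assumptions for illustration, and the base URL is copied from the curl example.

```python
import urllib.request

# Host and version prefix copied from the curl example in this page.
BASE_URL = "https://api.duplik.cn/v1"


def build_get_dataset_request(dataset_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET /v1/datasets/{dataset_id} request.

    Hypothetical helper: the name and signature are not part of the API.
    The path parameter is documented as an integer, so it is typed as int here.
    """
    url = f"{BASE_URL}/datasets/{dataset_id}"
    return urllib.request.Request(
        url,
        method="GET",
        # Bearer authentication header of the form "Bearer <token>".
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder values; substitute a real dataset ID and auth token.
request = build_get_dataset_request(42, "my-token")
print(request.get_method(), request.full_url)
```

To actually call the endpoint, the request object can be passed to `urllib.request.urlopen` and the response body decoded with `json.load`.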

Path Parameters

dataset_id
integer
required

Response

Successful Response

name
string
required
Maximum length: 128
dataset_id
string
required
created_at
integer
updated_at
integer
type
integer
default:1

1: Passage, 2: FAQ

description
string
default:""
Maximum length: 256
advanced_enabled
integer
default:2
score_threshold
number
default:0.5
top_k
integer
default:5
generate_metadata_enabled
boolean
default:true

generate metadata, true: enabled, false: disabled

generate_metadata_prompt
string

prompt used to generate metadata

Maximum length: 4096
overall_summarize_enabled
boolean
default:true

generate overall summary, true: enabled, false: disabled

overall_summarize_prompt
string

prompt used to generate the overall summary

Maximum length: 4096
smart_indexing_enabled
boolean
default:true

generate intelligent indexing, true: enabled, false: disabled

smart_indexing_prompt
string

prompt used for intelligent indexing

Maximum length: 4096
retrieval_filter_entity_enabled
boolean
default:true

filter retrieval by entity, true: enabled, false: disabled

retrieval_filter_period_enabled
boolean
default:true

filter retrieval by period, true: enabled, false: disabled

retrieval_recall_strategy_ids
integer[]

list of retrieval recall strategy IDs, e.g. [1, 2]. 1: semantic related context, 2: smart indexing context, 3: big context windows
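As a small illustration, the strategy IDs listed for `retrieval_recall_strategy_ids` can be mapped back to their documented labels on the client side. The function name `describe_recall_strategies` is hypothetical, and how unknown IDs should be handled is an assumption.

```python
# Labels copied from the field description above.
RECALL_STRATEGIES = {
    1: "semantic related context",
    2: "smart indexing context",
    3: "big context windows",
}


def describe_recall_strategies(ids: list[int]) -> list[str]:
    """Translate strategy IDs to labels; IDs not listed in the docs are flagged."""
    return [RECALL_STRATEGIES.get(i, f"unknown ({i})") for i in ids]


print(describe_recall_strategies([1, 2]))
```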

retrieval_context_mode
integer
default:1

retrieval context mode. 1: only content, 2: content & metadata

split_media_enabled
boolean
default:true

generate media segments, true: enabled, false: disabled

chunk_size
integer
default:512

document chunk size

Required range: 256 <= x <= 2048
chunk_overlap
integer
default:102

document chunk overlap

Required range: 0 <= x <= 512
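The two ranges above are easy to check before relying on a configuration. This is a minimal sketch; `validate_chunking` is a hypothetical helper, and the defaults mirror the documented ones (512 and 102). The reference does not say whether `chunk_overlap` must be smaller than `chunk_size`, so that relationship is not checked here.

```python
def validate_chunking(chunk_size: int = 512, chunk_overlap: int = 102) -> list[str]:
    """Return a list of range violations; an empty list means both values are in range."""
    errors = []
    # Documented range for chunk_size: 256 <= x <= 2048.
    if not 256 <= chunk_size <= 2048:
        errors.append(f"chunk_size {chunk_size} is outside 256..2048")
    # Documented range for chunk_overlap: 0 <= x <= 512.
    if not 0 <= chunk_overlap <= 512:
        errors.append(f"chunk_overlap {chunk_overlap} is outside 0..512")
    return errors


print(validate_chunking())          # documented defaults
print(validate_chunking(100, 600))  # both values out of range
```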
qa_pairs_return_direct
boolean
default:false

whether to return QA pairs matched by similarity search directly, true: yes, false: no

qa_pairs_score_threshold
number

score threshold for QA pairs matched by similarity search

generate_qa_pairs_enabled
boolean
default:true

generate QA pairs, true: enabled, false: disabled

generate_qa_pairs_prompt
string

prompt used to generate QA pairs

Maximum length: 4096
refine_user_question_enabled
boolean
default:false

refine user question, true: enabled, false: disabled

refine_user_question_prompt
string

prompt used to refine the user question

Maximum length: 4096
pdf_parsing_mode
integer
default:1

pdf parsing mode, 1: page, 2: section

audio_parsing_mode
integer
default:1

audio parsing mode, 1: transcript, 2: speaker diarization
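Several integer fields in the response are enumerations. A client might translate them into readable labels, as in this sketch. The labels are taken from the field descriptions on this page; `label_enum_fields` is a hypothetical helper, and `advanced_enabled` and `status` are left out because their values are not documented here.

```python
# Integer codes and labels as documented for each field.
ENUM_LABELS = {
    "type": {1: "Passage", 2: "FAQ"},
    "retrieval_context_mode": {1: "only content", 2: "content & metadata"},
    "pdf_parsing_mode": {1: "page", 2: "section"},
    "audio_parsing_mode": {1: "transcript", 2: "speaker diarization"},
}


def label_enum_fields(dataset: dict) -> dict:
    """Return {field: label} for every documented enum field present in the response."""
    labels = {}
    for field, mapping in ENUM_LABELS.items():
        if field in dataset:
            value = dataset[field]
            labels[field] = mapping.get(value, f"unknown ({value})")
    return labels


print(label_enum_fields({"type": 2, "pdf_parsing_mode": 1}))
```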

status
integer
default:1
filter_attributes
FilterAttribute · object[]
retrieval_segment_expansion
object
retrieval_chunk_expansion
object
document_count
integer
default:0
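The response schema above marks `name` and `dataset_id` as required and gives defaults for many other fields. The sketch below checks the required fields and fills in the documented defaults for anything absent. Whether the server ever omits defaulted fields is an assumption, and `normalize_dataset` is a hypothetical helper, not part of the API.

```python
# Defaults copied from the response field list above.
DOCUMENTED_DEFAULTS = {
    "type": 1,
    "description": "",
    "advanced_enabled": 2,
    "score_threshold": 0.5,
    "top_k": 5,
    "generate_metadata_enabled": True,
    "overall_summarize_enabled": True,
    "smart_indexing_enabled": True,
    "retrieval_filter_entity_enabled": True,
    "retrieval_filter_period_enabled": True,
    "retrieval_context_mode": 1,
    "split_media_enabled": True,
    "chunk_size": 512,
    "chunk_overlap": 102,
    "qa_pairs_return_direct": False,
    "generate_qa_pairs_enabled": True,
    "refine_user_question_enabled": False,
    "pdf_parsing_mode": 1,
    "audio_parsing_mode": 1,
    "status": 1,
    "document_count": 0,
}

# Fields marked "required" in the response schema.
REQUIRED_FIELDS = ("name", "dataset_id")


def normalize_dataset(payload: dict) -> dict:
    """Check required fields, then overlay the payload on the documented defaults."""
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    # Values present in the payload always win over the defaults.
    return {**DOCUMENTED_DEFAULTS, **payload}


dataset = normalize_dataset({"name": "docs", "dataset_id": "abc", "top_k": 8})
print(dataset["top_k"], dataset["chunk_size"])
```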