VDF AI Data FAQ | VDF AI Documentation

Uploads and file handling

What file types can I upload?

The most common formats:

Documents — PDF, DOCX, Markdown, plain text
Spreadsheets — XLSX, CSV
Slides — PPTX
Transcripts — TXT, VTT, SRT

Some workspaces also support scanned PDFs (with OCR), images, and audio. Check your workspace for the full list.

Is there a file size limit?

Most workspaces set a per-file size limit, often in the 100–250 MB range. Very large files take longer to process and can produce uneven results. If you’re hitting the limit, you have two options:

Split the file into smaller logical sections.
Export a smaller portion (a chapter, a sheet, a date range).
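
For a large CSV, the split can be scripted. The following is a minimal sketch, not a VDF AI feature: the function name split_csv_rows and the row-based cutoff are illustrative assumptions, and it uses only Python’s standard csv module. Each piece repeats the header row so it can be uploaded on its own.

```python
import csv
import io


def split_csv_rows(text, max_rows):
    """Split CSV text into pieces of at most max_rows data rows each.

    Hypothetical helper for illustration; the header row is repeated
    at the top of every piece so each piece is a valid CSV by itself.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)

    # Group the data rows into batches of max_rows.
    batches, current = [], []
    for row in reader:
        current.append(row)
        if len(current) == max_rows:
            batches.append(current)
            current = []
    if current:
        batches.append(current)

    # Serialize each batch back to CSV text with the header on top.
    pieces = []
    for rows in batches:
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
        pieces.append(buffer.getvalue())
    return pieces
```

Splitting by row count is the simplest choice; splitting by a logical key such as a date range usually gives pieces that are easier to reference later.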

What happens to my file after upload?

It’s processed so VDF AI can read and search it, then stored in your Data area. The file becomes referenceable in any conversation, agent, or network in your workspace (within whatever visibility you set).

Can I edit a file after uploading?

You can update it by uploading a new version with the same name (most workspaces will offer to replace the existing file) or by deleting and re-uploading. Note: if a saved network or conversation already references the file, the reference points to “the file with that name,” so replacing the file updates those downstream references.

How do I delete an uploaded file?

Open the Data area, find the file, and delete it. Past conversations that referenced it keep the record of what was asked but no longer have access to the file content for future runs.

Connected apps

What apps can I connect?

Common ones include:

Google (Drive, Docs, Sheets, Slides)
Microsoft (OneDrive, SharePoint, Outlook, Teams)
Confluence (spaces and pages)
Jira (projects, boards, tickets)
GitHub (repos, issues, PRs)
Slack (channels)
Zoom (recordings and transcripts)
GitBook (spaces and docs)
Box (folders and files)

Your workspace may have additional connectors. Check the Connections area for the full list.

How do I add a new connection?

Go to the Connections area, pick the app, and click Connect. You’ll be redirected to the app to sign in and approve the scope of access. After approval, the connection appears as active.

My connection says “needs reauthorization.” Why?

A few possibilities:

The app’s auth token expired (some apps require periodic re-authentication).
The account that authorized the connection lost access to the scoped content.
The app side disconnected VDF AI (admin policy change, etc.).

Open the Connections area and reauthorize. The fix is usually a one-click re-login.

Can I connect the same app twice with different scopes?

In most workspaces, yes. You can connect a personal scope and a team scope separately. Naming them clearly helps avoid confusion (“My Drive — Personal” vs. “Team Drive — Engineering”).

How do I disconnect an app?

In the Connections area, find the connection and click Disconnect. The connected content stops being referenceable, but past conversations that used it keep their record.

Search and answer quality

My search returns generic answers. Why?

Usually one of three causes:

Scope is too broad. Narrow to a specific source, folder, or space.
The question is vague. “Tell me about X” is open-ended; “What does our Q3 playbook say about X?” is sharper.
The sources don’t contain the answer. No amount of prompting fixes a missing source.

See Searching your knowledge for sharper question patterns.

The answer cited a source — but the quote isn’t in it. What’s happening?

Sometimes the AI paraphrases content rather than quoting it directly, and the paraphrase can drift. Two ways to verify:

Ask: “Show me the exact quote from the source for that claim.”
Open the cited source and search for the key idea (the concept, not the exact words).

If you can’t find the supporting content, the citation is unreliable. Flag it and use a different source.
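
When the AI does return what it calls an exact quote, a literal check is one way to confirm it. The sketch below assumes you have copied the source text out of the document yourself; the function names are illustrative. It ignores case, spacing, and curly versus straight quotation marks, so a true quote still matches after copy and paste. A paraphrase will fail this check and still needs the read-through described above.

```python
import re

# Map curly quotation marks to their straight equivalents.
_QUOTE_MAP = str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"})


def normalize(text):
    """Lowercase, straighten quotes, and collapse runs of whitespace."""
    text = text.translate(_QUOTE_MAP).lower()
    return re.sub(r"\s+", " ", text).strip()


def quote_in_source(quote, source_text):
    """Return True if the quote appears literally in the source text."""
    return normalize(quote) in normalize(source_text)
```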

How do I make the AI only use my sources?

Be explicit at the start of your prompt:

“Only use the attached files and connected sources for this answer. If something can’t be confirmed from those sources, say ‘not in the available sources.’”

This dramatically reduces drift toward general knowledge.

Why does search by meaning miss the exact word I’m looking for?

Search by meaning prioritizes concept over keyword. Usually that’s what you want. When you specifically need an exact term, mention it directly:

“Find every reference to the exact phrase ‘data sovereignty’ in our connected Confluence space.”

This pushes the search to honor the literal term.

Freshness and refresh

How often do connected sources refresh?

It depends on the workspace and the app. Common patterns:

Periodic auto-refresh — connections check for new content every few hours.
On-demand refresh — you can force an immediate refresh from the Connections area.
Real-time for some apps — chat platforms (Slack, Teams) may surface new messages quickly.

If you need a specific source to be up to date right now, do a manual refresh before your query.

I just edited a doc. How fast can VDF AI see the change?

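For connected sources with auto-refresh, the change appears after the next scheduled refresh, which can take anywhere from minutes to a few hours depending on the app and the workspace. For immediate access, trigger a manual refresh on the connection.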

For uploaded files (not connected), the file is a snapshot — re-upload if it’s changed.

A connection has gone stale and is producing old answers. What do I check?

Connection status (active vs. needs attention)
Auto-refresh schedule
The source itself (was content moved, renamed, archived?)
Permissions (did your access change?)

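If the status or your permissions are the problem, reauthorize the connection. If the content was moved, renamed, or archived, rescope the connection to point at its new location.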

Privacy and visibility

What stays private to me, and what’s shared with the team?

Visibility usually has three levels:

Personal — only you
Team-shared — your team
Workspace-shared — the whole workspace

Defaults vary by workspace. Always check visibility after uploading or connecting if you’re not sure.

Can VDF AI see content I shouldn’t see?

No. Connections honor the access permissions of the person who authorized them. If the authorizing account can’t see a folder, the connection can’t see it either.

Can my teammates see what I’ve uploaded?

Only if you set the visibility to team or workspace. Personal-visibility uploads stay private to you.

How is sensitive data handled?

See Privacy & Security for the full picture. The short version: VDF AI honors your workspace’s data-handling policies, uploaded content is encrypted at rest, and access is bounded by the visibility you set on each source.

Can I delete my data?

Yes. You can delete individual files and conversations, and you can disconnect apps. For workspace-wide deletions, your workspace admin has additional tools. For requests covered by regulations (GDPR, CCPA), see Privacy & Security or contact support.

Databases, EDA, indexes, and fine-tuning

Which databases can I connect?

The most common ones — PostgreSQL, MySQL, Microsoft SQL Server, Oracle, SAP HANA, Presto, Exasol — plus Jira as a structured source. Anything with a JDBC driver also connects through the generic JDBC option, which covers Snowflake, BigQuery, Redshift, Trino, Vertica, and many more. See Connecting databases for the full picture and what to have ready.
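
If you use the generic JDBC option, you will typically need a connection URL. The sketch below builds URLs in the standard formats used by the PostgreSQL, MySQL, and Microsoft SQL Server JDBC drivers. The helper name jdbc_url and the example host are illustrative, and the exact fields VDF AI asks for are not specified here, so treat this as a way to prepare the value rather than a description of the connection form.

```python
# Standard JDBC URL shapes for three common drivers.
JDBC_TEMPLATES = {
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}",
    "mysql": "jdbc:mysql://{host}:{port}/{database}",
    "sqlserver": "jdbc:sqlserver://{host}:{port};databaseName={database}",
}


def jdbc_url(engine, host, port, database):
    """Build a JDBC URL for a known engine (hypothetical helper)."""
    if engine not in JDBC_TEMPLATES:
        raise ValueError(f"no URL template for engine: {engine}")
    return JDBC_TEMPLATES[engine].format(host=host, port=port, database=database)


if __name__ == "__main__":
    print(jdbc_url("postgresql", "db.example.com", 5432, "sales"))
```

Other engines reachable through JDBC use their own URL shapes, so check the driver’s documentation before connecting.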

What does Exploratory Data Analysis (EDA) actually tell me?

In plain language: it tells you whether you can trust a dataset. You get missing-value rates, duplicate counts, outlier signals, column-level summaries, and a drift signal that flags when a column’s shape has shifted. The whole report runs in one click and is meant for people who don’t write SQL. See Exploring your data for how to read the report.
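
To make those report terms concrete, here is a small sketch that computes three of them by hand on a table held as a list of dictionaries. It is not how VDF AI computes its report: the function name, the use of None for missing values, and the 1.5 × IQR outlier rule are all assumptions chosen for illustration. The drift signal is left out because it needs two snapshots to compare.

```python
from statistics import quantiles


def profile(rows, numeric_column):
    """Summarize a small table given as a list of dicts (None = missing)."""
    columns = list(rows[0].keys())
    total = len(rows)

    # Share of missing values in each column.
    missing_rate = {
        c: sum(1 for r in rows if r[c] is None) / total for c in columns
    }

    # Count rows that exactly repeat an earlier row.
    seen = set()
    duplicate_rows = 0
    for r in rows:
        key = tuple(r[c] for c in columns)
        if key in seen:
            duplicate_rows += 1
        else:
            seen.add(key)

    # Flag outliers in one numeric column with the 1.5 * IQR rule.
    values = sorted(r[numeric_column] for r in rows if r[numeric_column] is not None)
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    outliers = [v for v in values if v < low or v > high]

    return {
        "missing_rate": missing_rate,
        "duplicate_rows": duplicate_rows,
        "outliers": outliers,
    }
```

A high missing-value rate or an unexpected duplicate count is usually the first sign that a dataset needs cleaning before you rely on answers drawn from it.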

What’s a vector index for, in plain language?

A vector index is a custom search surface you build from one of your sources. You decide what goes in, how the content is split into searchable pieces, and which embedding model powers the search. Once built, every other product — Chat, Agents, Networks — can search it. See Vector indexes and semantic search.
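
The three decisions (what goes in, how it is split, and which embedding powers the search) can be seen in a toy version. This sketch is a stand-in, not the product’s implementation: it splits text into fixed-size word chunks and uses simple word counts in place of a real embedding model, so it matches shared words rather than meaning. All function names are illustrative.

```python
import math
import re
from collections import Counter


def chunk(text, max_words):
    """Split text into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: word counts. A real model returns a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(text, max_words=8):
    """Index = list of (chunk text, chunk vector) pairs."""
    return [(piece, embed(piece)) for piece in chunk(text, max_words)]


def search(index, query, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [piece for piece, _ in ranked[:top_k]]
```

Chunk size is the main trade-off: smaller pieces give more precise hits, while larger pieces keep more surrounding context in each result.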

Can I fine-tune a model on my data?

Yes — the fine-tuning workflow lives in VDF AI Data. You pick a source, pick a mapping template (how rows become prompt/completion pairs), preview the dataset, and export in JSONL or CSV. The exported file flows into a training step run inside your own ML platform or by a partner. See Fine-tuning datasets.
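
A mapping template can be pictured as a small function that turns each row into one prompt/completion pair. The sketch below is an assumption about the general shape, not the exported schema: the field names "prompt" and "completion", the template string, and the function name rows_to_jsonl are illustrative. It writes JSONL, meaning one JSON object per line.

```python
import json


def rows_to_jsonl(rows, prompt_template, completion_column):
    """Map rows to prompt/completion pairs and serialize them as JSONL.

    prompt_template uses column names in braces, e.g. "Q: {question}".
    """
    lines = []
    for row in rows:
        pair = {
            "prompt": prompt_template.format(**row),
            "completion": str(row[completion_column]),
        }
        lines.append(json.dumps(pair, ensure_ascii=False))
    return "\n".join(lines) + "\n"
```

Previewing a few generated pairs before exporting, as the workflow suggests, is the quickest way to catch a template that pulls from the wrong column.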

Should I build a vector index, fine-tune a model, or both?

Indexes are best when you want current facts the AI can look up. Fine-tuning is best when you want a shift in behavior or voice. Most teams need indexes first; fine-tuning is the next step once a workflow is producing real value but the voice still feels generic.

What is a “feature list”?

A feature list is a saved, named selection of columns from an asset — the handful that matter for one use case. The same customers table can have a marketing list, a risk list, and a support list. See Features and relationships.
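
In data terms, a feature list is a name attached to a set of column names. The sketch below shows the idea on a customers table held as a list of dictionaries; the list names, column names, and the function apply_feature_list are hypothetical examples, not the product’s storage format.

```python
# Hypothetical feature lists for one customers table.
FEATURE_LISTS = {
    "marketing": ["customer_id", "segment", "last_campaign"],
    "risk": ["customer_id", "credit_score"],
}


def apply_feature_list(rows, list_name, feature_lists=FEATURE_LISTS):
    """Keep only the columns named in the chosen feature list."""
    columns = feature_lists[list_name]
    return [{c: row[c] for c in columns} for row in rows]
```

The underlying table is unchanged; each list is just a different view of the same rows.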

Choosing between products

When should I use Data vs. just attaching a file in Chat?

Attach in Chat if you’ll only use the file once.
Add to Data if it’ll be used more than once, by you or anyone on the team.

Adding to Data unlocks reuse, search, and citation across every product surface.

When should I use Data with an Agent vs. with a Network?

Agent + Data for a single deliverable that needs grounded sources (a checklist, a brief, a summary).
Network + Data for multi-stage work where Data is the source layer for several stages (research → draft → critique → final, with sources fueling each stage).

Can I use Data without using Agents or Networks?

Absolutely. Chat alone works well with Data — uploaded files and connected sources are referenceable from any conversation. Data on its own is already a step-change improvement over scattered files.

Still stuck?

Getting started with VDF AI Data — the first-day walkthrough.
Connecting sources — connection details and freshness habits.
Connecting databases — bring structured sources into the same surface.
Exploring your data — quality, completeness, and drift, in one click.
Searching your knowledge — sharper questions, better answers.
Governance and admin — for workspace admins.
Talk to us — if your question isn’t here, send it.