Tabular Data & Querying

Query CSV, Excel, and Parquet files using natural language. Upload your data to a workspace, then ask questions in plain English.

Natural Language SQL

No SQL knowledge required. ParseSphere interprets your questions, generates optimized queries, and returns results with natural language explanations.

Workspaces Also Support Documents

In addition to tabular data, workspaces can hold PDF, DOCX, PPTX, TXT, and MD files for semantic search and RAG. See Core Concepts for details on document capabilities.

Creating a Workspace

Workspaces organize related files that you want to query together. Create a workspace before uploading files:

POST /v1/workspaces

Create a container for organizing related files

bash

curl -X POST https://api.parsesphere.com/v1/workspaces \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "name": "Sales Analysis",
  "description": "Q4 2024 sales data and customer metrics"
}'
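For readers calling the API from Python rather than curl, the same request can be sketched with the standard library. This is a minimal sketch, not an official client: the helper name `build_create_workspace_request` is hypothetical, and the block only builds the request without sending it, because the shape of the response is not described here.

```python
import json
import urllib.request

API_BASE = "https://api.parsesphere.com/v1"


def build_create_workspace_request(api_key: str, name: str, description: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/workspaces request shown above."""
    body = {"name": name, "description": description}
    return urllib.request.Request(
        url=f"{API_BASE}/workspaces",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_workspace_request(
    "sk_your_api_key",
    "Sales Analysis",
    "Q4 2024 sales data and customer metrics",
)
# To send it: urllib.request.urlopen(request, timeout=30)
```

Building the request separately from sending it makes the headers and body easy to inspect before any network call is made.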

Managing Workspaces

GET /v1/workspaces

List all workspaces you have access to

bash

curl https://api.parsesphere.com/v1/workspaces \
-H "Authorization: Bearer sk_your_api_key"

DELETE /v1/workspaces/{workspace_id}

Delete a workspace and all its files

bash

curl -X DELETE https://api.parsesphere.com/v1/workspaces/550e8400-e29b-41d4-a716-446655440000 \
-H "Authorization: Bearer sk_your_api_key"

Important

Deleting a workspace permanently removes all files and conversation history. This action cannot be undone.

Uploading Tabular Files

Upload CSV, Excel, or Parquet files to your workspace for natural language querying:

POST /v1/workspaces/{workspace_id}/files

Upload a tabular file for processing

bash

curl -X POST https://api.parsesphere.com/v1/workspaces/550e8400-e29b-41d4-a716-446655440000/files \
-H "Authorization: Bearer sk_your_api_key" \
-F "file=@sales_data.csv"

Supported File Types

Upload CSV (.csv), Excel (.xlsx, .xls), or Parquet (.parquet) files up to 200 MB. ParseSphere automatically infers column types and optimizes data for fast queries.
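A client-side check before uploading can catch unsupported files early. The sketch below applies the extensions and the 200 MB limit stated above to tabular uploads only. The helper name `check_upload` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the exact byte limit is not specified.

```python
from pathlib import Path

# Extensions and size limit taken from the text above.
# Assumption: 1 MB is counted as 1024 * 1024 bytes.
SUPPORTED_EXTENSIONS = {".csv", ".xlsx", ".xls", ".parquet"}
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_UPLOAD_BYTES}")
    return problems
```

In practice, `size_bytes` would come from `Path(filename).stat().st_size` for a file on disk. The server remains the authority on what it accepts, so this check only avoids obviously wasted uploads.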

File Processing

File uploads are processed asynchronously. ParseSphere analyzes your data structure, infers types, and optimizes for analytical queries.

Processing Lifecycle

Queued: Waiting for processing
Processing: Analyzing and optimizing
Completed: Ready for queries
Failed: Processing error

Check Processing Status

GET /v1/workspaces/{workspace_id}/files/{file_id}/status

Monitor file processing progress

bash

curl https://api.parsesphere.com/v1/workspaces/550e8400-e29b-41d4-a716-446655440000/files/880e8400-e29b-41d4-a716-446655440000/status \
-H "Authorization: Bearer sk_your_api_key"

Tip

Poll the status endpoint every 5 seconds until status is completed or failed. Processing time varies by file size and complexity.
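The polling advice above can be sketched as a small helper with an overall timeout, so a stuck file does not block a script forever. This is a sketch under stated assumptions: `wait_for_file` and its parameters are hypothetical names, the 600-second timeout is an arbitrary choice, and only the `completed` and `failed` status values are taken from this page. The status fetch is passed in as a function so the loop can be demonstrated without network access.

```python
import time
from typing import Callable

# Only these two values are documented as terminal; any other value is
# treated as "still in progress".
TERMINAL_STATES = {"completed", "failed"}


def wait_for_file(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the file reaches a terminal state or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() + interval_seconds > deadline:
            raise TimeoutError("file was still processing when the timeout elapsed")
        sleep(interval_seconds)


# Offline demonstration with canned payloads instead of HTTP calls.
# The "queued" and "processing" strings are assumed spellings.
canned = iter([{"status": "queued"}, {"status": "processing"}, {"status": "completed"}])
pauses = []
final = wait_for_file(lambda: next(canned), sleep=pauses.append)
```

In real use, `fetch_status` would issue the GET request to the status endpoint and return the parsed JSON body. The caller still needs to check whether the returned status is `completed` or `failed`.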

Managing Files

GET /v1/workspaces/{workspace_id}/files

List all files in a workspace

bash

curl https://api.parsesphere.com/v1/workspaces/550e8400-e29b-41d4-a716-446655440000/files \
-H "Authorization: Bearer sk_your_api_key"

Chatting with Your Data

Once your files are processed, start a conversation and ask questions in natural language:

POST /v1/workspaces/{workspace_id}/chat

Send a conversational message to query your data

bash

curl -X POST https://api.parsesphere.com/v1/workspaces/550e8400-e29b-41d4-a716-446655440000/chat \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "message": "What are the top 5 products by revenue?",
  "stream": false
}'

Chat Parameters

message (required, string)

Your question about the data in natural language (e.g., 'What are the top 5 products by revenue?').

conversation_id (optional, UUID)

Continue an existing conversation with full context. Omit to start a new conversation.

stream (optional, boolean, default: true)

Enable Server-Sent Events (SSE) streaming for real-time responses.

dataset_ids (optional, array)

Limit the query to specific tabular files. Omit to query all files in the workspace.

max_iterations (optional, integer, default: 15)

Maximum agent iterations for complex queries, from 1 to 20. Higher values enable deeper analysis.

model (optional, string)

Override the default model (e.g., 'claude-sonnet-4', 'gpt-4o').

include_execution_details (optional, boolean, default: false)

Include detailed SQL execution metadata in the response for transparency and debugging.
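The parameters above can be assembled with a small builder that validates the documented constraints and leaves out any optional field that was not set, so the server-side defaults apply. This is a minimal sketch; the function name `build_chat_payload` is hypothetical and the validation is limited to what this page states (a required message and a 1-20 range for max_iterations).

```python
from typing import Optional


def build_chat_payload(
    message: str,
    conversation_id: Optional[str] = None,
    stream: Optional[bool] = None,
    dataset_ids: Optional[list] = None,
    max_iterations: Optional[int] = None,
    model: Optional[str] = None,
    include_execution_details: Optional[bool] = None,
) -> dict:
    """Build a chat request body, omitting optional fields that were not set."""
    if not message.strip():
        raise ValueError("message is required")
    if max_iterations is not None and not 1 <= max_iterations <= 20:
        raise ValueError("max_iterations must be between 1 and 20")
    optional = {
        "conversation_id": conversation_id,
        "stream": stream,
        "dataset_ids": dataset_ids,
        "max_iterations": max_iterations,
        "model": model,
        "include_execution_details": include_execution_details,
    }
    payload = {"message": message}
    # Keep False values (e.g. stream=False); drop only fields left as None.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_chat_payload(
    "What are the top 5 products by revenue?",
    stream=False,
    max_iterations=10,
)
```

Because stream defaults to true, a client that expects a single JSON response should pass `stream=False` explicitly, as the example does.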

Example Questions

Be Specific

More specific questions produce better results. Reference actual column names when possible.

Simple Aggregations:

"What are the top 5 products by revenue?"
"How many customers made purchases last month?"
"What's the average order value?"

Comparisons:

"Compare sales between Q1 and Q2"
"Which products have above-average revenue?"
"Show revenue growth month over month"

Filtering:

"Show me all customers who made more than 10 purchases"
"What products were sold in December?"
"List orders over $1000"

Follow-up Questions:

"Now show me just the Electronics category"
"What about for Q4 instead?"
"Break that down by region"

Multi-File Queries:

"What's the profit margin on our top-selling products?" (requires sales + products files)
"Which customers bought Product A and Product B?" (requires orders + customers files)

Conversational Context

The chat endpoint maintains conversation history, allowing natural follow-up questions:

bash

curl -X POST https://api.parsesphere.com/v1/workspaces/550e8400/chat \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "message": "What are the top products by revenue?"
}'

# Response includes conversation_id for follow-ups
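A follow-up request reuses the conversation_id from the previous response. The sketch below shows how the second request body could be derived from the first response. The helper name `follow_up_payload` is hypothetical, and the sample response is a trimmed illustration that carries only the fields this page documents.

```python
def follow_up_payload(previous_response: dict, message: str) -> dict:
    """Build a chat body that continues the conversation from a prior response."""
    return {
        "message": message,
        "conversation_id": previous_response["conversation_id"],
        # stream defaults to true; disable it to get one JSON response.
        "stream": False,
    }


# Trimmed sample of a first response (illustrative values only).
first_response = {
    "conversation_id": "880e8400-e29b-41d4-a716-446655440000",
    "role": "assistant",
    "content": "The top 3 products are Widget A ($125K), Widget B ($98K), and Widget C ($87K).",
}

follow_up = follow_up_payload(first_response, "Now show me just the Electronics category")
```

The follow-up question does not restate "products by revenue"; the conversation_id is what ties it to the earlier context.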

Conversation benefits:

Natural follow-up questions without repeating context
Agent remembers previous queries and results
Build complex analyses iteratively
All conversation history saved automatically

Response Structure

Chat responses include natural language content and optional execution details:

json

{
"message_id": "990e8400-e29b-41d4-a716-446655440000",
"conversation_id": "880e8400-e29b-41d4-a716-446655440000",
"role": "assistant",
"content": "The top 3 products are Widget A ($125K), Widget B ($98K), and Widget C ($87K).",
"created_at": "2025-01-03T12:00:00Z"
}

Response fields:

content: Natural language answer to your question
conversation_id: Use this to continue the conversation
execution_details: Optional SQL execution metadata (when include_execution_details: true)
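Because execution_details is present only when requested, client code should read it defensively. The following sketch parses the sample response shown above. The function name `summarize_chat_response` and the keys of the returned dictionary are hypothetical.

```python
import json


def summarize_chat_response(raw_json: str) -> dict:
    """Extract the documented fields from a non-streaming chat response."""
    response = json.loads(raw_json)
    return {
        "answer": response["content"],
        "conversation_id": response["conversation_id"],
        # None unless the request set include_execution_details to true.
        "execution_details": response.get("execution_details"),
    }


raw = """{
  "message_id": "990e8400-e29b-41d4-a716-446655440000",
  "conversation_id": "880e8400-e29b-41d4-a716-446655440000",
  "role": "assistant",
  "content": "The top 3 products are Widget A ($125K), Widget B ($98K), and Widget C ($87K).",
  "created_at": "2025-01-03T12:00:00Z"
}"""

summary = summarize_chat_response(raw)
```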

File Scoping

By default, chat queries all tabular files in a workspace. To limit the scope, add dataset_ids to the request body. The request below uses the default:

bash

# Query all tabular files (default)
curl -X POST https://api.parsesphere.com/v1/workspaces/550e8400/chat \
-H "Authorization: Bearer sk_your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "message": "What are total sales across all products?",
  "stream": false
}'
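For comparison, a scoped request adds dataset_ids to the same body. This is a sketch under one assumption that is not confirmed on this page: that dataset_ids takes the file IDs returned by the file endpoints. The constant `SALES_FILE_ID` and the helper `scoped_chat_body` are hypothetical names.

```python
import json

# Hypothetical file ID; substitute an ID from your own workspace.
# Assumption: dataset_ids accepts the IDs of uploaded tabular files.
SALES_FILE_ID = "880e8400-e29b-41d4-a716-446655440000"


def scoped_chat_body(message: str, dataset_ids: list[str]) -> str:
    """Serialize a chat body restricted to the given tabular files."""
    if not dataset_ids:
        raise ValueError("pass at least one ID, or omit dataset_ids to query all files")
    return json.dumps(
        {"message": message, "stream": False, "dataset_ids": dataset_ids}
    )


body = scoped_chat_body("What are total sales across all products?", [SALES_FILE_ID])
```

The resulting string can be sent as the request body in place of the JSON passed to `-d` in the curl example.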

When to specify files:

Improve query speed by limiting scope
Prevent unintended joins when files share column names
Query specific subsets of your data

When to use all files:

Enable cross-file joins and correlations
Let the AI discover relationships automatically
Ask questions that span multiple data sources

Conversation History

View and manage your chat conversations:

GET /v1/workspaces/{workspace_id}/conversations

List all conversations in a workspace

bash

curl "https://api.parsesphere.com/v1/workspaces/550e8400/conversations?limit=20&offset=0" \
-H "Authorization: Bearer sk_your_api_key"
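The limit and offset query parameters suggest offset-based paging. A minimal sketch of building page URLs in Python follows. The helper name `conversations_url` is hypothetical, and how the API signals the last page is not described here, so the example only constructs URLs for a fixed number of pages.

```python
from urllib.parse import urlencode

API_BASE = "https://api.parsesphere.com/v1"


def conversations_url(workspace_id: str, limit: int = 20, offset: int = 0) -> str:
    """Build the list-conversations URL with an encoded query string."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be positive and offset must not be negative")
    query = urlencode({"limit": limit, "offset": offset})
    return f"{API_BASE}/workspaces/{workspace_id}/conversations?{query}"


# URLs for the first three pages of 20 conversations each.
workspace_id = "550e8400-e29b-41d4-a716-446655440000"
first_pages = [
    conversations_url(workspace_id, limit=20, offset=page * 20) for page in range(3)
]
```

Building the query string with `urlencode` also avoids the shell-quoting problem that an unquoted `&` causes on the command line.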

Get Conversation Messages

Retrieve full message history with optional execution details:

GET /v1/workspaces/{workspace_id}/conversations/{conversation_id}

Get conversation details with message history

bash

curl "https://api.parsesphere.com/v1/workspaces/550e8400/conversations/880e8400?include_tool_details=true" \
-H "Authorization: Bearer sk_your_api_key"

Conversation history includes:

Full message thread (user and assistant messages)
SQL execution details when requested
Token usage and performance metrics
Conversation metadata and status

Best Practices

Chat Like a Pro

Follow these tips to get the most accurate results from conversational queries.

1. Organize Workspaces Logically

Group related files that you'll query together:

✓ "Q4 Sales Analysis" → sales.csv, products.csv, customers.csv
✓ "Financial Reporting" → revenue.csv, expenses.csv, budgets.csv
✗ "All Company Data" → too broad, unrelated files

2. Use Descriptive Column Names

The AI relies on column names to understand your data:

✓ customer_name, order_date, total_revenue
✗ col1, col2, value

3. Wait for Processing

Always verify file status is completed before chatting:

python

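import time

import requests

# Placeholders: replace with your own values
api_key = "sk_your_api_key"
workspace_id = "550e8400-e29b-41d4-a716-446655440000"
file_id = "880e8400-e29b-41d4-a716-446655440000"

base_url = "https://api.parsesphere.com/v1"
headers = {"Authorization": f"Bearer {api_key}"}

# Wait for the file to be ready
while True:
    response = requests.get(
        f"{base_url}/workspaces/{workspace_id}/files/{file_id}/status",
        headers=headers,
        timeout=30,
    )
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    status = response.json()

    if status["status"] == "completed":
        break
    if status["status"] == "failed":
        raise RuntimeError(f"Processing failed: {status.get('error_message')}")

    time.sleep(5)  # Poll every 5 seconds

# Now chat (requests sets the JSON Content-Type header when json= is used)
response = requests.post(
    f"{base_url}/workspaces/{workspace_id}/chat",
    headers=headers,
    json={"message": "What are the top products?", "stream": False},
    timeout=120,
)
response.raise_for_status()
print(response.json()["content"])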

4. Start Simple, Build Complexity

Start with straightforward questions, then ask follow-ups:

"How many rows are in this file?"
"What columns are available?"
"Show me the first 5 rows"
"What are the top products by revenue?" (builds on context)
"Now show me just Electronics" (uses conversation history)

5. Use Conversation Context

Ask follow-up questions naturally:

"What about Q4?" (after asking about Q3)
"Break that down by region" (after seeing totals)
"Show me the bottom 5 instead" (refining previous query)

6. Review SQL Execution

Set include_execution_details: true to:

Understand how your question was interpreted
Debug unexpected results
Learn which columns and tables were used
Copy SQL for use in other tools
Monitor token usage and performance

7. Control Iterations

Adjust max_iterations based on query complexity:

Simple queries: 5-10 iterations (faster)
Complex analysis: 15 iterations (default, thorough)

What's Next?

Continue learning:

Quick Start - Your first workspace and query
Core Concepts - Understand workspaces and files
Rate Limits - Query quotas and limits
Dashboard - Manage workspaces in the UI