docs/03-type-driven-financial-extraction.md for the canonical type-driven extraction approach and future user-in-the-loop direction.
dwata-api) and shared types
The dwata project is organized as a Cargo workspace plus frontend and desktop app packages:
The root Cargo.toml defines the workspace members:
members = [
"dwata-agents",
"dwata-api",
"shared-types",
]
exclude = [
"gui"
]
dwata-agents - KG Extraction Agents
Location: /dwata-agents/
Main email KG extractor: dwata-agents/src/kg_email_extractor/
See docs/06-knowledge-graph-extraction.md for the pass architecture, gating, and persistence/search flow.
dwata-api - Backend API Server
Location: /dwata-api/
The API server is built with Actix-web and uses SQLite for data storage.
From dwata-api/Cargo.toml:
actix-web.workspace = true
rusqlite = { version = "0.31", features = ["bundled"] }
shared-types = { path = "../shared-types" }
config = { version = "0.14", default-features = false, features = ["toml"] }
dirs = "5.0"
src/main.rs - Entry point, HTTP server setup
src/config.rs - Configuration management
src/database/ - Database models, queries, and migrations
  mod.rs - Database connection and session management
  credentials.rs - Credential storage
  downloads.rs - Download job management
  emails.rs - Email storage
  migrations.rs - Database schema migrations
src/handlers/ - HTTP request handlers
  credentials.rs - Credential CRUD endpoints
  downloads.rs - Download job endpoints
  oauth.rs - OAuth flow handlers
  settings.rs - Settings endpoints
src/helpers/ - Utility functions
  database.rs - Database path and initialization
  google_oauth.rs - Google OAuth client
  oauth_state.rs - OAuth state management
  token_cache.rs - Token caching
src/integrations/ - External service integrations
src/jobs/ - Background job management
  download_manager.rs - Manages download jobs
shared-types - Type Definitions
Location: /shared-types/
This crate contains all the shared type definitions used by both the API server and the GUI.
From shared-types/Cargo.toml:
serde.workspace = true
ts-rs = "8.0"
src/lib.rs - Main module that re-exports all types
src/credential.rs - Credential-related types
src/download.rs - Download job types
src/email.rs - Email types
src/event.rs - Event types
src/project.rs - Project types
src/session.rs - Agent session types
src/settings.rs - Settings types
src/task.rs - Task types
src/extraction.rs - Data extraction types
The crate includes a binary at src/bin/generate_api_types.rs that uses ts-rs to generate TypeScript type definitions:
let output_dir = Path::new("../gui/src/api-types");
fs::create_dir_all(output_dir)?;
let output_path = output_dir.join("types.ts");
To generate types:
cargo run --bin generate_api_types
gui - Frontend Application
Location: /gui/
The GUI is built with SolidJS and Vite.
From gui/package.json:
{
"dependencies": {
"@solidjs/router": "^0.15.1",
"solid-js": "^1.9.5",
"daisyui": "^5.5.14"
}
}
src/index.tsx - Application entry point
src/App.tsx - Root component
src/api-types/ - Generated TypeScript types from shared-types
src/components/ - Reusable UI components
src/config/ - Frontend configuration
src/pages/ - Page components
  settings/ - Settings page
The API server reads its configuration from a dwata subdirectory of the OS user’s config directory.
From dwata-api/src/config.rs:
pub fn get_config_path() -> PathBuf {
if let Some(config_dir) = dirs::config_dir() {
config_dir.join("dwata").join("config.toml")
} else {
PathBuf::from("config.toml")
}
}
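The fallback behaviour above can be exercised without the dirs crate. The following is a minimal sketch, not the project's code: `resolve_config_path` is a hypothetical helper that takes the platform config directory as a parameter instead of calling `dirs::config_dir()`.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for `get_config_path()`: the platform config
/// directory is passed in rather than looked up through the `dirs` crate.
fn resolve_config_path(config_dir: Option<PathBuf>) -> PathBuf {
    match config_dir {
        // Normal case: <config_dir>/dwata/config.toml
        Some(dir) => dir.join("dwata").join("config.toml"),
        // No config directory known: fall back to a relative path,
        // which resolves against the current working directory.
        None => PathBuf::from("config.toml"),
    }
}
```

Calling `resolve_config_path(None)` yields the relative path `config.toml`, which is why a server started from an unusual working directory may appear to ignore an existing config file.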
Platform-specific config paths:
macOS: ~/Library/Application Support/dwata/config.toml
Linux: ~/.config/dwata/config.toml
Windows: %APPDATA%\dwata\config.toml
The configuration is loaded in src/main.rs:
// Load config
let (config, _) = config::ApiConfig::load().expect("Failed to load config");
The ApiConfig::load() method:
get_config_path()
Default configuration structure (from config.rs):
[api_keys]
# gemini_api_key = "your-gemini-key"
[cors]
allowed_origins = ["http://localhost:3030"]
[server]
host = "127.0.0.1"
port = 8080
[google_oauth]
# client_id = "YOUR_CLIENT_ID.apps.googleusercontent.com"
# client_secret = "YOUR_CLIENT_SECRET"
# redirect_uri = "http://localhost:8080/api/oauth/google/callback"
[downloads]
# When false, the API will not auto-start download jobs on startup.
auto_start = false
Desktop OAuth note: Google Desktop OAuth is sensitive to the exact host in the redirect URI. Use server.host = "localhost" (not 127.0.0.1) to avoid token exchange failures. We support bring-your-own Google OAuth apps; if you set client_id/client_secret in the config, those are always used.
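To illustrate why the host matters, here is a small sketch that assumes the redirect URI is derived from server.host and server.port and compared as an exact string. Both `redirect_uri` and `redirect_matches` are hypothetical helpers written for this example; only the callback path is taken from the commented-out redirect_uri in the sample config.

```rust
/// Hypothetical helper: builds the redirect URI from the server settings.
/// The path matches the commented-out `redirect_uri` in the sample config.
fn redirect_uri(host: &str, port: u16) -> String {
    format!("http://{host}:{port}/api/oauth/google/callback")
}

/// Hypothetical check: true only when the URI built from the server
/// settings is identical to the one registered for the OAuth client.
fn redirect_matches(registered: &str, host: &str, port: u16) -> bool {
    registered == redirect_uri(host, port)
}
```

Under this assumption, a client registered with `http://localhost:8080/api/oauth/google/callback` matches host `localhost` but not host `127.0.0.1`, even though both refer to the same machine.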
Release defaults: Release builds can embed a default Google OAuth client_id/client_secret at compile time. scripts/build-production.sh will read them from your local config.toml (or from DWATA_DEFAULT_GOOGLE_CLIENT_ID / DWATA_DEFAULT_GOOGLE_CLIENT_SECRET) and compile them in. The runtime config still overrides these defaults when set.
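The precedence described here (runtime config first, compiled-in default second) could look roughly like the sketch below. It assumes the default is embedded with Rust's `option_env!` macro; `EMBEDDED_CLIENT_ID`, `effective_client_id`, and `client_id_for` are hypothetical names, and the real build script may wire this differently.

```rust
/// Compile-time default. `None` unless the variable was set in the
/// environment when this code was compiled (assumed mechanism).
const EMBEDDED_CLIENT_ID: Option<&str> = option_env!("DWATA_DEFAULT_GOOGLE_CLIENT_ID");

/// Hypothetical helper: a value from the runtime config always wins;
/// the embedded default is used only when the config has none.
fn effective_client_id(runtime: Option<&str>, embedded: Option<&str>) -> Option<String> {
    runtime.or(embedded).map(str::to_owned)
}

/// Convenience wrapper that applies the compiled-in default.
fn client_id_for(runtime: Option<&str>) -> Option<String> {
    effective_client_id(runtime, EMBEDDED_CLIENT_ID)
}
```

The same resolution would apply to client_secret. Keeping the embedded value as an explicit parameter makes the precedence easy to check without rebuilding.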
The API server uses SQLite for storage. The database file lives in a dwata subdirectory of the platform's local data directory, as returned by dirs::data_local_dir().
From dwata-api/src/helpers/database.rs:
/// Platform-specific paths
///
/// - **macOS**: `~/Library/Application Support/dwata/db.sqlite`
/// - **Linux**: `~/.local/share/dwata/db.sqlite`
/// - **Windows**: `%LOCALAPPDATA%\dwata\db.sqlite`
pub fn get_db_path() -> anyhow::Result<PathBuf> {
let data_dir = dirs::data_local_dir()
.ok_or_else(|| anyhow::anyhow!("Could not determine local data directory"))?;
let db_path = data_dir.join("dwata").join("db.sqlite");
Ok(db_path)
}
From dwata-api/src/database/mod.rs:
pub fn new(db_path: &PathBuf) -> anyhow::Result<Self> {
// Ensure directory exists
if let Some(parent) = db_path.parent() {
std::fs::create_dir_all(parent)?;
}
// Create sync connection first and run migrations
let sync_conn = Connection::open(db_path)?;
let sync_mutex = Arc::new(Mutex::new(sync_conn));
// Run migrations on sync connection before opening async connection
{
let conn = sync_mutex.lock().unwrap();
migrations::run_migrations(&conn)?;
}
// Now open async connection
let async_conn = Connection::open(db_path)?;
let database = Database {
connection: sync_mutex,
async_connection: Arc::new(TokioMutex::new(async_conn)),
};
Ok(database)
}
The database is initialized in src/main.rs:
// Initialize database
let db = helpers::database::initialize_database().expect("Failed to initialize database");
println!(
"Database initialized at: {:?}",
helpers::database::get_db_path().unwrap()
);
dwata uses the OS native keychain for secure credential storage:
Credentials are stored in the SQLite database as metadata only (without passwords). Passwords and sensitive tokens are stored separately in the OS keychain using the keyring crate.
dwata uses “master credentials mode” to minimize OS keychain prompts. Instead of storing each credential as a separate keychain entry, all credentials are stored together in a single master entry as encrypted JSON.
Benefits:
How it works:
"dwata-master"
Storage format (internal):
{
"version": 1,
"credentials": [
{
"type": "imap",
"identifier": "gmail",
"username": "user@example.com",
"password": "encrypted_by_os_keychain"
}
]
}
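As a rough in-memory model of this format, the sketch below keeps every credential in one structure and looks entries up by type, identifier, and username. It is an illustration under assumptions: `StoredCredential` and `MasterStore` are hypothetical types, JSON serialization and the keychain call are left out, and the field `kind` stands in for the JSON key "type".

```rust
/// One entry of the master blob. `kind` corresponds to the JSON key "type".
#[derive(Debug, Clone, PartialEq)]
struct StoredCredential {
    kind: String,
    identifier: String,
    username: String,
    password: String,
}

impl StoredCredential {
    fn new(kind: &str, identifier: &str, username: &str, password: &str) -> Self {
        Self {
            kind: kind.to_owned(),
            identifier: identifier.to_owned(),
            username: username.to_owned(),
            password: password.to_owned(),
        }
    }

    /// Entries are identified by the (type, identifier, username) triple.
    fn matches(&self, kind: &str, identifier: &str, username: &str) -> bool {
        self.kind == kind && self.identifier == identifier && self.username == username
    }
}

/// Hypothetical model of the single master entry.
#[derive(Debug)]
struct MasterStore {
    version: u32,
    credentials: Vec<StoredCredential>,
}

impl MasterStore {
    fn new() -> Self {
        Self { version: 1, credentials: Vec::new() }
    }

    /// Replace the password of an existing entry, or append a new entry.
    fn upsert(&mut self, cred: StoredCredential) {
        let pos = self
            .credentials
            .iter()
            .position(|c| c.matches(&cred.kind, &cred.identifier, &cred.username));
        match pos {
            Some(i) => self.credentials[i].password = cred.password,
            None => self.credentials.push(cred),
        }
    }

    /// Look up a password without touching any other entry.
    fn password(&self, kind: &str, identifier: &str, username: &str) -> Option<&str> {
        self.credentials
            .iter()
            .find(|c| c.matches(kind, identifier, username))
            .map(|c| c.password.as_str())
    }
}
```

Because all entries live in one structure, reading or updating any credential needs only a single read or write of the master entry, which is what keeps the number of keychain prompts low.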
In addition to master mode, dwata implements an in-memory password cache:
KeyringService::with_ttl())
Arc<RwLock<HashMap>> for concurrent access
From dwata-api/src/helpers/keyring_service.rs:
// Initialize with default 1 hour TTL
let keyring_service = KeyringService::new();
// Or customize the TTL
let keyring_service = KeyringService::with_ttl(Duration::from_secs(7200)); // 2 hours
On macOS, the first time dwata starts, you’ll see one system prompt:
"dwata-api" wants to access the keychain item "dwata-master"
[ Deny ] [ Allow ] [ Always Allow ]
Important: Select “Always Allow” to grant permanent access. You’ll never see this prompt again.
If you accidentally selected “Allow” (temporary access), you can fix this:
dwata-api to the “Always allow access” list
The KeyringService provides methods for cache management:
// Invalidate a specific credential
keyring_service.invalidate(&credential_type, &identifier, &username).await;
// Clear entire cache (useful after password changes)
keyring_service.clear_cache().await;
// Get cache statistics
let (total, expired) = keyring_service.cache_stats().await;
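A TTL cache with this method surface can be sketched using only the standard library. The version below is an approximation, not the KeyringService implementation: it is synchronous (the calls above are awaited, so the real service is async), `PasswordCache`, `put`, and `get` are hypothetical names, and the meaning of the `cache_stats` tuple as (total entries, expired entries) is inferred from the variable names above.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

/// Cache key: (credential type, identifier, username).
type Key = (String, String, String);

/// Hypothetical synchronous sketch of a TTL password cache.
#[derive(Clone)]
struct PasswordCache {
    ttl: Duration,
    entries: Arc<RwLock<HashMap<Key, (String, Instant)>>>,
}

impl PasswordCache {
    fn with_ttl(ttl: Duration) -> Self {
        Self { ttl, entries: Arc::new(RwLock::new(HashMap::new())) }
    }

    fn key(kind: &str, identifier: &str, username: &str) -> Key {
        (kind.to_owned(), identifier.to_owned(), username.to_owned())
    }

    /// Store a password together with the time it was cached.
    fn put(&self, kind: &str, identifier: &str, username: &str, password: &str) {
        let mut map = self.entries.write().unwrap();
        map.insert(Self::key(kind, identifier, username), (password.to_owned(), Instant::now()));
    }

    /// Return the password only while the entry is younger than the TTL.
    fn get(&self, kind: &str, identifier: &str, username: &str) -> Option<String> {
        let map = self.entries.read().unwrap();
        map.get(&Self::key(kind, identifier, username))
            .filter(|(_, stored_at)| stored_at.elapsed() < self.ttl)
            .map(|(password, _)| password.clone())
    }

    /// Drop one entry, e.g. after that credential was updated.
    fn invalidate(&self, kind: &str, identifier: &str, username: &str) {
        self.entries.write().unwrap().remove(&Self::key(kind, identifier, username));
    }

    /// Drop every entry.
    fn clear_cache(&self) {
        self.entries.write().unwrap().clear();
    }

    /// Returns (total entries, entries older than the TTL).
    fn cache_stats(&self) -> (usize, usize) {
        let map = self.entries.read().unwrap();
        let expired = map.values().filter(|(_, t)| t.elapsed() >= self.ttl).count();
        (map.len(), expired)
    }
}
```

In this sketch expired entries are not evicted eagerly. They simply stop being returned by `get` and are counted by `cache_stats` until they are invalidated, cleared, or overwritten.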
tauri - Desktop App Shell
Location: /tauri/
The Tauri app wraps the SolidJS GUI and starts dwata-api as a sidecar. It is the primary desktop build target.
cd dwata-api
cargo run
With logging to a file:
cargo run -- --log-file-path /path/to/log/file.log
The server will:
~/Library/Application Support/dwata/config.toml (on macOS)
127.0.0.1:8080 (or as configured)
The build-release workflow builds the Tauri desktop app (and bundles the dwata-api sidecar). It can embed default Google OAuth credentials at build time. Set these repository secrets:
DWATA_GOOGLE_CLIENT_ID
DWATA_GOOGLE_CLIENT_SECRET
Release automation (scripts/release.sh and scripts/build-production.sh) targets the Tauri desktop app bundle, not a standalone dwata-api + GUI release.
cd gui
npm install
npm run dev
This starts the development server, typically on http://localhost:3030.
cd tauri
npm install
npm run dev
After modifying types in shared-types:
cd shared-types
cargo run --bin generate_api_types
This generates gui/src/api-types/types.ts with TypeScript definitions.
shared-types/src/
cargo run --bin generate_api_types
shared-types
dwata-api/src/handlers/
dwata-api/src/main.rs
dwata-api/src/database/migrations.rs
If you have the SQLite CLI installed, you can query the database directly:
# On macOS
sqlite3 ~/Library/Application\ Support/dwata/db.sqlite
-- Example queries
SELECT * FROM credentials_metadata;
SELECT * FROM download_jobs;
SELECT * FROM emails;
-- List all tables
.tables
-- Show the schema of one table
.schema credentials_metadata