<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Documentation | VaultSafe | Chat with Your Files — AI Document Assistant</title><link>https://www.vaultsafe.ai/en/docs/</link><atom:link href="https://www.vaultsafe.ai/en/docs/index.xml" rel="self" type="application/rss+xml"/><description>Documentation</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 18 Feb 2026 00:00:00 +0000</lastBuildDate><image><url>https://www.vaultsafe.ai/media/logo.svg</url><title>Documentation</title><link>https://www.vaultsafe.ai/en/docs/</link></image><item><title>How internal document parsing works</title><link>https://www.vaultsafe.ai/en/docs/document-parsing-pipeline/</link><pubDate>Sat, 28 Feb 2026 00:00:00 +0000</pubDate><guid>https://www.vaultsafe.ai/en/docs/document-parsing-pipeline/</guid><description>&lt;h1 id="how-internal-document-parsing-works"&gt;How internal document parsing works&lt;/h1&gt;
&lt;p&gt;This page describes how VaultSafe turns uploaded documents (receipts, invoices, IDs) into structured data. We use an &lt;strong&gt;OCR-first pipeline&lt;/strong&gt; with deterministic extraction and optional LLM cleanup—no vision model in the hot path—to keep cost low and accuracy high (in the high 90s in our evaluations).&lt;/p&gt;
&lt;h2 id="pipeline-overview"&gt;Pipeline overview&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Upload&lt;/strong&gt; — User uploads an image or PDF (page rendered as image).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Geo + languages&lt;/strong&gt; — We use geo-specific tagging (locale/region/preference) to select which languages to run. Users can add more languages and re-run parsing later.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OCR&lt;/strong&gt; — PaddleOCR runs for the selected languages only and returns plain text (and optionally line-level boxes/scores).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extraction&lt;/strong&gt; — Regex and pattern matching extract candidate fields (e.g. receipt number, amount, date).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Normalization&lt;/strong&gt; — Python rules normalize values (number formatting, date formats).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Optional cleanup&lt;/strong&gt; — Low-confidence or parse-fail cases can be sent to a small text-only LLM for correction.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Output&lt;/strong&gt; — Structured JSON is stored and exposed to apps and APIs.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Design principle:&lt;/strong&gt; We avoid sending every image through a vision LLM. OCR gives us text that preserves semantic structure; rules do most of the work, and a small LLM handles the remaining noise.&lt;/p&gt;
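A minimal sketch of the control flow above, assuming injected stages; every function name here is illustrative, not VaultSafe's actual internal API, and the OCR, extraction, and cleanup steps are passed in as callables so the sketch stands alone:

```python
# Illustrative control flow for the parsing pipeline. All names are
# hypothetical; the OCR/extraction/cleanup stages are injected as callables.

def parse_document(image, langs, ocr_fn, extract_fn, normalize_fn,
                   llm_cleanup_fn=None):
    text = ocr_fn(image, langs)        # PaddleOCR on the selected languages only
    fields = extract_fn(text)          # regex / pattern matching
    record = normalize_fn(fields)      # deterministic Python rules
    if llm_cleanup_fn is not None and record.pop("needs_cleanup", False):
        record = llm_cleanup_fn(record)  # optional small text-only LLM
    return record                      # structured, JSON-able dict
```

In production the `ocr_fn` slot would wrap PaddleOCR; it is left abstract here so the sketch does not depend on the library.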
&lt;h2 id="ocr-engine-selection"&gt;OCR engine selection&lt;/h2&gt;
&lt;p&gt;We need one OCR path that supports &lt;strong&gt;nine languages&lt;/strong&gt;: English, Chinese (Simplified), Spanish, Hindi, Arabic, Portuguese, French, Japanese, and German. We evaluated three engines on the same &lt;strong&gt;100+ image&lt;/strong&gt; set (receipts, invoices, multi-language), measuring &lt;strong&gt;time&lt;/strong&gt;, &lt;strong&gt;memory&lt;/strong&gt;, and &lt;strong&gt;accuracy&lt;/strong&gt; (character/word-level vs. ground truth):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Engine&lt;/th&gt;
&lt;th&gt;Time (s)&lt;/th&gt;
&lt;th&gt;Memory (MB)&lt;/th&gt;
&lt;th&gt;Accuracy (our eval)&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PaddleOCR&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;1570&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Best&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9 languages × 1 pass each, results merged by position&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EasyOCR&lt;/td&gt;
&lt;td&gt;840&lt;/td&gt;
&lt;td&gt;570&lt;/td&gt;
&lt;td&gt;Lower&lt;/td&gt;
&lt;td&gt;5 readers (script restrictions), results merged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tesseract&lt;/td&gt;
&lt;td&gt;2.4&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;td&gt;1 pass, 9 languages in a single call&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Production choice: PaddleOCR.&lt;/strong&gt; It delivered the best accuracy on our receipt/invoice set (consistent with reported document-OCR benchmarks). Tesseract was far faster and lighter, and EasyOCR used less memory, but both trailed PaddleOCR on accuracy in our eval. We use &lt;strong&gt;PaddleOCR&lt;/strong&gt; in a &lt;strong&gt;hybrid, geo-aware&lt;/strong&gt; way so we don&amp;rsquo;t pay the full cost of running all nine languages on every image.&lt;/p&gt;
&lt;h3 id="hybrid-approach-geo-specific-languages-and-re-run"&gt;Hybrid approach: geo-specific languages and re-run&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Geo-specific tagging&lt;/strong&gt; — We use account locale, upload region, or user preference to choose a &lt;strong&gt;small set of languages per document&lt;/strong&gt; (e.g. en+hi for India, en+ch for China, en+es+pt for Latin America). Only those language models are loaded and run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Re-run with more languages&lt;/strong&gt; — Users can add more languages in settings and &lt;strong&gt;re-run parsing&lt;/strong&gt; on existing documents. We run OCR only for the selected languages, so PaddleOCR&amp;rsquo;s accuracy is preserved without running all nine every time.&lt;/li&gt;
&lt;/ul&gt;
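The geo-aware selection and re-run can be sketched as follows; the region keys and default sets are illustrative examples taken from the text, not VaultSafe's real configuration:

```python
# Hypothetical geo-to-language defaults; region keys and sets are examples
# from the docs, not the production configuration.
REGION_LANGS = {
    "IN": ["en", "hi"],           # India: English + Hindi
    "CN": ["en", "ch"],           # China: English + Chinese ("ch" in PaddleOCR)
    "LATAM": ["en", "es", "pt"],  # Latin America
}

def languages_for(region, user_langs=()):
    """Small per-document language set; users can add more and re-run parsing."""
    base = REGION_LANGS.get(region, ["en"])   # fall back to English only
    return sorted(set(base).union(user_langs))
```

Because only the selected languages are loaded and run, a re-run with an enlarged set costs the same as a fresh parse with that set.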
&lt;p&gt;PaddleOCR output is line- or word-level text with bounding boxes; we use it as the single source of truth for the extraction step.&lt;/p&gt;
&lt;h2 id="why-ocr-output-is-good-enough"&gt;Why OCR output is good enough&lt;/h2&gt;
&lt;p&gt;OCR text often contains minor noise (e.g. &lt;code&gt;Date z&lt;/code&gt;, &lt;code&gt;Amount 3 %2841927&lt;/code&gt;). What matters for our pipeline is that &lt;strong&gt;semantic structure is preserved&lt;/strong&gt;: labels like &amp;ldquo;Receipt No&amp;rdquo;, &amp;ldquo;Amount&amp;rdquo;, &amp;ldquo;Date&amp;rdquo; appear in the text with values nearby. Under that condition:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Regex and pattern matching&lt;/strong&gt; reliably isolate fields.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A small text LLM&lt;/strong&gt; can correct or normalize the remaining noisy cases without ever seeing the image.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No vision model&lt;/strong&gt; is required in the default path, which keeps cost at ~$0.0004 per image (vs. ~$0.004 per image with a vision LLM).&lt;/li&gt;
&lt;/ul&gt;
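To make the &ldquo;semantic structure is preserved&rdquo; point concrete, here is a sketch of label-anchored extraction surviving the kind of noise quoted above; the patterns are illustrative, not the production rules:

```python
import re

# Noisy OCR text of the kind described above (labels intact, values nearby).
NOISY = "Receipt No 7741\nAmount 3 %2841927\nDate z 04-Jul-2025"

def find_after_label(text, label, value_pattern):
    """Return the first value matching value_pattern on the label's line."""
    for line in text.splitlines():
        if label.lower() in line.lower():
            m = re.search(value_pattern, line)
            if m:
                return m.group(0)
    return None

amount = find_after_label(NOISY, "Amount", r"\d[\d,]{3,}")        # long digit run
date = find_after_label(NOISY, "Date", r"\d{2}-[A-Za-z]{3}-\d{4}")  # dd-Mon-yyyy
```

The stray `3 %` before the amount does not matter: the regex anchors on the label's line and skips short digit fragments.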
&lt;h2 id="extraction-layer"&gt;Extraction layer&lt;/h2&gt;
&lt;p&gt;The extraction layer is &lt;strong&gt;rule-based&lt;/strong&gt; on the OCR text:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Patterns&lt;/strong&gt; — We use regex and simple heuristics to find receipt number, amount, date, merchant, etc., depending on document type.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Normalization&lt;/strong&gt; — Number formats (e.g. &lt;code&gt;28,41,927&lt;/code&gt; → &lt;code&gt;2841927&lt;/code&gt;), dates (e.g. &lt;code&gt;04-Jul-2025&lt;/code&gt; → ISO), and units are normalized in Python.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Confidence&lt;/strong&gt; — When a field is missing or low-confidence, we can route that document (or field) to an optional &lt;strong&gt;text-only LLM cleanup&lt;/strong&gt; step.&lt;/li&gt;
&lt;/ul&gt;
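A sketch of the normalization rules, using the two conversions named above; the real rules cover more locales and formats:

```python
from datetime import datetime

def normalize_amount(raw):
    """Drop grouping separators, e.g. Indian-style '28,41,927' to 2841927."""
    return int(raw.replace(",", ""))

def normalize_date(raw):
    """Convert '04-Jul-2025' to ISO 8601 '2025-07-04'."""
    return datetime.strptime(raw, "%d-%b-%Y").date().isoformat()
```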
&lt;p&gt;Most documents are fully extracted with rules only; the remainder use the small LLM for cleanup. End-to-end accuracy in our experiments is in the high 90s, comparable to a vision-LLM–based pipeline.&lt;/p&gt;
&lt;h2 id="cost-and-scale"&gt;Cost and scale&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Baseline (vision LLM per image):&lt;/strong&gt; ~$0.004 per image (averaged over 1,000+ images).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Our pipeline (OCR + rules + optional text LLM):&lt;/strong&gt; ~$0.0004 per image.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The order-of-magnitude cost reduction comes from removing the vision model from the hot path and using OCR + deterministic extraction + optional small LLM cleanup instead.&lt;/p&gt;
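A worked example of the order-of-magnitude claim, using the approximate per-image figures quoted above; the batch size is an arbitrary illustration:

```python
# Approximate per-image costs from the docs; 10,000 images is an example batch.
VISION_LLM_USD = 0.004    # vision-LLM baseline, per image
PIPELINE_USD = 0.0004     # OCR + rules + optional text LLM, per image

def batch_cost(n_images, per_image_usd):
    return n_images * per_image_usd

baseline = batch_cost(10_000, VISION_LLM_USD)  # about 40 USD
ours = batch_cost(10_000, PIPELINE_USD)        # about 4 USD
```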
&lt;h2 id="summary"&gt;Summary&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OCR engine&lt;/td&gt;
&lt;td&gt;PaddleOCR (geo-specific languages; re-run when user adds more)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extraction&lt;/td&gt;
&lt;td&gt;Regex and pattern matching on OCR text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cleanup&lt;/td&gt;
&lt;td&gt;Optional small text LLM for low-confidence cases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vision model&lt;/td&gt;
&lt;td&gt;Not used in default path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost (approx.)&lt;/td&gt;
&lt;td&gt;~$0.0004 per image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy&lt;/td&gt;
&lt;td&gt;Best in our eval (PaddleOCR); aligned with vision-LLM baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For more context and benchmarks, see the accompanying blog post.&lt;/p&gt;</description></item><item><title>Apps</title><link>https://www.vaultsafe.ai/en/docs/apps/</link><pubDate>Wed, 18 Feb 2026 00:00:00 +0000</pubDate><guid>https://www.vaultsafe.ai/en/docs/apps/</guid><description>&lt;h1 id="apps"&gt;Apps&lt;/h1&gt;
&lt;p&gt;VaultSafe&amp;rsquo;s app ecosystem represents a breakthrough in secure, AI-powered personal data processing. Our platform enables sophisticated applications that transform document metadata into structured, actionable intelligence—all while maintaining the highest standards of privacy and security.&lt;/p&gt;
&lt;h2 id="architecture-overview"&gt;Architecture overview&lt;/h2&gt;
&lt;p&gt;VaultSafe Apps operate on a sophisticated pipeline that combines advanced AI models with zero-trust security principles:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Document metadata extraction&lt;/strong&gt; — When files are uploaded, our AI analysis engine (powered by state-of-the-art computer vision and NLP models) extracts rich metadata: document classification, entity recognition (people, dates, locations), structured key-value pairs, and semantic descriptions. This metadata layer powers both search/chat and app processing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Agent prompt execution&lt;/strong&gt; — Each App defines an &lt;em&gt;agent prompt&lt;/em&gt;: sophisticated instructions that guide our AI models to extract app-specific structured data from the metadata. These prompts leverage advanced few-shot learning and structured output capabilities, enabling precise extraction without exposing raw file content.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Secure data storage&lt;/strong&gt; — Extracted data is stored in a per-user, per-app namespace within our zero-trust architecture. Each app&amp;rsquo;s data is isolated, encrypted, and accessible only to the user who owns it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Intelligent widget rendering&lt;/strong&gt; — Apps define &lt;em&gt;widgets&lt;/em&gt; that present extracted data through configurable UI components (e.g., sorted lists, timelines, dashboards). Widgets are dynamically rendered based on app-defined schemas and display preferences.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Pipeline:&lt;/strong&gt; &lt;code&gt;Document → AI Metadata Extraction → Agent Prompt → Structured Data → Secure Storage → Widget UI&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Security guarantee:&lt;/strong&gt; Apps never access raw files or user credentials. They operate exclusively on pre-extracted metadata within isolated execution environments.&lt;/p&gt;
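A sketch of this execution model, assuming an injected `run_llm` stand-in for the structured-output model call; all names are hypothetical, not the production API:

```python
import json

def run_app(app, user_id, metadata, run_llm):
    """Apps receive pre-extracted metadata only, never the raw file."""
    prompt = app["agent_prompt"] + "\n\nInput:\n" + json.dumps(metadata)
    extracted = run_llm(prompt)             # structured (JSON) model output
    return {                                # per-user, per-app namespace
        "user_id": user_id,
        "app_id": app["id"],
        "data": extracted,
    }
```

Note that the raw file never appears in the function signature: isolation is enforced by what the app is given, not by what it promises not to read.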
&lt;h2 id="app-types"&gt;App types&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Default apps&lt;/strong&gt; — Pre-installed applications (e.g., Birthdays) that run automatically on all processed documents. These apps are developed by VaultSafe&amp;rsquo;s research team and represent best practices in secure personal data processing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optional apps&lt;/strong&gt; — Additional apps available in our catalog that users can enable. Once enabled, they process new documents and provide dedicated widgets for their extracted data.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Users can add or edit app entries (e.g. birthdays, reminders) from the Apps page using simple forms—no technical setup or JSON required.&lt;/p&gt;
&lt;h2 id="data-schema-and-extensibility"&gt;Data schema and extensibility&lt;/h2&gt;
&lt;p&gt;Each App defines a &lt;strong&gt;schema&lt;/strong&gt; that specifies the structure of extracted data (e.g., &lt;code&gt;{ person_name: string, date: ISO8601, source: string }&lt;/code&gt;). Schemas enable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Type-safe data storage and retrieval&lt;/li&gt;
&lt;li&gt;Validation of extraction outputs&lt;/li&gt;
&lt;li&gt;Future extensibility for custom user-defined fields&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Our schema system supports complex nested structures and is designed to evolve with our platform&amp;rsquo;s capabilities.&lt;/p&gt;
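A minimal sketch of validating extraction output against a flat schema like the one above; the type names and checker are illustrative, and the real schema system also handles nested structures:

```python
from datetime import date

def _is_iso(v):
    """True when v parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(v)
        return True
    except ValueError:
        return False

# Hypothetical type names matching the example schema in the docs.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "ISO8601": lambda v: isinstance(v, str) and _is_iso(v),
}

def validate(record, schema):
    """True when every schema field is present and passes its type check."""
    return all(CHECKS[t](record.get(k)) for k, t in schema.items())
```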
&lt;h2 id="marketplace-and-third-party-development"&gt;Marketplace and third-party development&lt;/h2&gt;
&lt;p&gt;VaultSafe is building an &lt;strong&gt;open ecosystem&lt;/strong&gt; for personal data applications. Third-party developers can create apps that integrate seamlessly with our platform:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;App packages&lt;/strong&gt; — Self-contained bundles (folder or zip) containing app metadata, agent prompts, schemas, and widget definitions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Unified deployment&lt;/strong&gt; — Single-package format ensures consistent installation and execution across our infrastructure&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Secure execution&lt;/strong&gt; — Apps run in isolated environments with strict access controls, ensuring user data privacy&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For technical specifications, development guidelines, and marketplace submission details, see &lt;a href="https://www.vaultsafe.ai/en/developers/"&gt;Marketplace apps&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="technical-capabilities"&gt;Technical capabilities&lt;/h2&gt;
&lt;p&gt;Our app platform leverages:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Advanced AI models&lt;/strong&gt; for document understanding and structured extraction&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Zero-trust security&lt;/strong&gt; with end-to-end encryption and granular permissions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scalable infrastructure&lt;/strong&gt; supporting millions of documents and thousands of concurrent app executions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Research-driven design&lt;/strong&gt; informed by our team&amp;rsquo;s experience building AI systems for Asia&amp;rsquo;s largest consumer applications&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For technical support or partnership inquiries, contact &lt;a href="mailto:support@vaultsafe.ai"&gt;support@vaultsafe.ai&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Getting started</title><link>https://www.vaultsafe.ai/en/docs/getting-started/</link><pubDate>Wed, 18 Feb 2026 00:00:00 +0000</pubDate><guid>https://www.vaultsafe.ai/en/docs/getting-started/</guid><description>&lt;h1 id="getting-started"&gt;Getting started&lt;/h1&gt;
&lt;p&gt;VaultSafe is your AI document vault: upload PDFs and images; &lt;strong&gt;OCR and parsing&lt;/strong&gt; make everything searchable. &lt;strong&gt;Chat&lt;/strong&gt; in plain English, &lt;strong&gt;fill PDF forms&lt;/strong&gt;, merge or compress PDFs, and use &lt;strong&gt;smart apps&lt;/strong&gt; that auto-extract birthdays, reminders, and relationships. Zero-trust security.&lt;/p&gt;
&lt;h2 id="quick-start"&gt;Quick start&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Sign up&lt;/strong&gt; at &lt;a href="https://app.vaultsafe.ai" target="_blank" rel="noopener"&gt;app.vaultsafe.ai&lt;/a&gt; (Google or email).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Upload files&lt;/strong&gt; (PDFs, images, receipts, IDs). OCR runs in 9 languages; documents are indexed for search and chat. Attach to a chat or add to My Files.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;My Files&lt;/strong&gt; — Browse, filter by person and type, and see processing status.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chats &amp;amp; QnA&lt;/strong&gt; — Ask in plain English, attach files, get answers and download links; use PDF tools from the conversation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Enable Apps&lt;/strong&gt; from the catalog (e.g. Birthdays, Reminders). See &lt;a href="apps/"&gt;Apps&lt;/a&gt; and &lt;a href="document-parsing-pipeline/"&gt;Document parsing pipeline&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="what-you-get"&gt;What you get&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;OCR &amp;amp; parsing&lt;/strong&gt; — 9 languages; receipts, invoices, IDs → searchable metadata. &lt;a href="document-parsing-pipeline/"&gt;How it works&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Chat&lt;/strong&gt; — Ask e.g. &amp;ldquo;when does my insurance expire?&amp;rdquo; or &amp;ldquo;fill this PDF with Peter&amp;rsquo;s details.&amp;rdquo; Download links for filled/merged PDFs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PDF tools&lt;/strong&gt; — Fill forms, merge, compress, place on A4—from chat.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smart apps&lt;/strong&gt; — Birthdays, relationships, reminders auto-extracted; filter by person and type.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Privacy&lt;/strong&gt; — Your data private, encrypted, under your control.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For technical support or integration assistance, contact &lt;a href="mailto:support@vaultsafe.ai"&gt;support@vaultsafe.ai&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>App Developer Guide</title><link>https://www.vaultsafe.ai/en/docs/app-developer-guide/</link><pubDate>Sat, 08 Mar 2025 00:00:00 +0000</pubDate><guid>https://www.vaultsafe.ai/en/docs/app-developer-guide/</guid><description>&lt;h1 id="app-developer-guide"&gt;App Developer Guide&lt;/h1&gt;
&lt;p&gt;Build apps that extract structured data from documents and present it through widgets. All app definitions live in the Apps table (Postgres)—no code in the repo.&lt;/p&gt;
&lt;h2 id="how-it-works"&gt;How It Works&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;File upload&lt;/strong&gt; → VaultSafe analyzes it (document type, description, entities, key-value content).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Your app&amp;rsquo;s agent prompt runs&lt;/strong&gt; on that metadata → extracts structured data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data is stored&lt;/strong&gt; in &lt;code&gt;user_app&lt;/code&gt; (user_id + app_id + data).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Widget renders&lt;/strong&gt; the data (list, table, or card view).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Security:&lt;/strong&gt; Apps never see raw files. They only receive pre-extracted metadata.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="creating-an-app"&gt;Creating an App&lt;/h2&gt;
&lt;h3 id="1-define-your-schema"&gt;1. Define Your Schema&lt;/h3&gt;
&lt;p&gt;Example (Birthdays):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;extraction&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;birthdays&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nt"&gt;&amp;#34;person_name&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nt"&gt;&amp;#34;date&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nt"&gt;&amp;#34;source&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nt"&gt;&amp;#34;file_id&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;string&amp;#34;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="2-write-the-agent-prompt"&gt;2. Write the Agent Prompt&lt;/h3&gt;
&lt;p&gt;Input JSON: &lt;code&gt;suggested_file_name&lt;/code&gt;, &lt;code&gt;type_of_file&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;main_person&lt;/code&gt;, &lt;code&gt;other_persons&lt;/code&gt;, &lt;code&gt;full_content&lt;/code&gt;, &lt;code&gt;file_id&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Birthday example:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;You extract birthday-related information from document metadata. Only use information explicitly present in the input; do not infer or guess.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Input is a JSON object with: suggested_file_name, type_of_file, description, main_person, other_persons, full_content (key-value from document).
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Return a JSON object with:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- &amp;#34;birthdays&amp;#34;: list of { &amp;#34;person_name&amp;#34;: string, &amp;#34;date&amp;#34;: string (YYYY-MM-DD or partial like &amp;#34;15 March&amp;#34;), &amp;#34;source&amp;#34;: string, &amp;#34;file_id&amp;#34;: string (pass through from input) }
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;If no birthday/date of birth is explicitly present, return { &amp;#34;birthdays&amp;#34;: [] }. Do not fabricate dates.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="3-configure-the-widget"&gt;3. Configure the Widget&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;type&lt;/td&gt;
&lt;td&gt;&lt;code&gt;list_by_date&lt;/code&gt;, &lt;code&gt;table&lt;/code&gt;, or &lt;code&gt;card&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;list_key&lt;/td&gt;
&lt;td&gt;Key in extracted JSON (e.g. &lt;code&gt;birthdays&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sort_field&lt;/td&gt;
&lt;td&gt;Field to sort by (e.g. &lt;code&gt;date&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;display_fields&lt;/td&gt;
&lt;td&gt;Fields to show&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;title&lt;/td&gt;
&lt;td&gt;Widget title&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;empty_message&lt;/td&gt;
&lt;td&gt;Message when no data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
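How these fields could drive rendering, as a sketch; the config values match the Birthdays example and the renderer is illustrative, not the production widget engine:

```python
# Example widget config using the fields from the table above.
config = {
    "type": "list_by_date",
    "list_key": "birthdays",
    "sort_field": "date",
    "display_fields": ["person_name", "date"],
    "title": "Birthdays",
    "empty_message": "No birthdays found yet.",
}

def render(config, extracted):
    """Return display rows for a list_by_date widget (illustrative)."""
    items = extracted.get(config["list_key"], [])
    if not items:
        return [config["empty_message"]]
    items = sorted(items, key=lambda it: it[config["sort_field"]])
    return [" | ".join(str(it[f]) for f in config["display_fields"])
            for it in items]
```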
&lt;h3 id="4-publish-via-admin"&gt;4. Publish via Admin&lt;/h3&gt;
&lt;p&gt;Use the VaultSafe Admin app (local only). Create or edit the app, set &lt;strong&gt;status&lt;/strong&gt; (&lt;code&gt;enable_for_all&lt;/code&gt; or &lt;code&gt;enabled_by_user&lt;/code&gt;), and save.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="widget-types"&gt;Widget Types&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;type&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;list_by_date&lt;/td&gt;
&lt;td&gt;Sorted list, ideal for dates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;table&lt;/td&gt;
&lt;td&gt;Tabular view&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;card&lt;/td&gt;
&lt;td&gt;Card layout&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;h2 id="app-status"&gt;App Status&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;enable_for_all&lt;/strong&gt;: Auto-enabled for all users. Backfill runs on publish.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;enabled_by_user&lt;/strong&gt;: Marketplace only. Users enable explicitly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Users can disable any app at any time.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="user-data"&gt;User Data&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;user_app.data&lt;/strong&gt;: List of &lt;code&gt;{ file_id, extracted, updated_at }&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Users can add or edit items in the Apps UI via simple forms (e.g. birthdays, reminders)—no JSON or technical setup required.&lt;/li&gt;
&lt;li&gt;Updates sync to chat context for AI-assisted Q&amp;amp;A.&lt;/li&gt;
&lt;/ul&gt;
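A sketch of upserting one file's extraction into that list; the record shape follows the `{ file_id, extracted, updated_at }` structure described above, and the function name is hypothetical:

```python
from datetime import datetime, timezone

def upsert_extraction(data, file_id, extracted):
    """Replace the entry for file_id if present, else append a new one."""
    now = datetime.now(timezone.utc).isoformat()
    entry = {"file_id": file_id, "extracted": extracted, "updated_at": now}
    rest = [e for e in data if e["file_id"] != file_id]
    return rest + [entry]
```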
&lt;hr&gt;
&lt;h2 id="full-example"&gt;Full Example&lt;/h2&gt;
&lt;p&gt;The schema, agent prompt, and widget configuration above together make up a complete Birthdays app example.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="support"&gt;Support&lt;/h2&gt;
&lt;p&gt;For questions about building or publishing apps, contact &lt;a href="mailto:support@vaultsafe.ai"&gt;support@vaultsafe.ai&lt;/a&gt;.&lt;/p&gt;</description></item></channel></rss>