How are my documents processed?
This page explains how uploaded PDF documents are processed by the “Smart Document Analyzer” service. The service automates invoice analysis and extracts specific information such as the invoice number, date, or total amount.
1. Upload of documents
You can upload PDF documents directly via the web interface. The transmission is encrypted using HTTPS (TLS).
The uploaded files are used exclusively for performing the automated analysis.
2. Automated processing
After upload, the content of the PDF files is automatically processed to extract text information from the document.
This process is fully automated and does not involve manual review by staff.
3. Text extraction from PDF files
If a document does not contain directly readable text (e.g. scanned invoices), automated text recognition (OCR – Optical Character Recognition) may be applied.
In such cases, OCR is performed using Microsoft Azure AI Document Intelligence.
Text extraction is performed exclusively for preparing the subsequent structured AI analysis.
OCR is applied only to image-based or scanned PDFs, and only to convert image content into machine-readable text.
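The rule that OCR is applied only when needed can be illustrated with a minimal sketch. It assumes the text already embedded in each PDF page is available as a list of strings. The function name `needs_ocr` and the threshold `MIN_CHARS_PER_PAGE` are hypothetical and are not taken from the service itself.

```python
from typing import Sequence

# Hypothetical threshold: a page with fewer visible characters than this
# is treated as having no usable embedded text layer.
MIN_CHARS_PER_PAGE = 20


def needs_ocr(page_texts: Sequence[str], min_chars: int = MIN_CHARS_PER_PAGE) -> bool:
    """Return True if at least one page lacks a usable embedded text layer."""
    if not page_texts:
        # No page text at all: treat the document as image-only.
        return True
    for text in page_texts:
        # Ignore whitespace so that blank pages do not count as readable text.
        visible = "".join(text.split())
        if len(visible) < min_chars:
            return True
    return False


digital_invoice = ["Invoice No. 2024-0815\nDate: 2024-03-01\nTotal: 119.00 EUR"]
scanned_invoice = ["", "  \n"]

print(needs_ocr(digital_invoice))  # False: readable text is present
print(needs_ocr(scanned_invoice))  # True: pages contain no readable text
```

Under this assumption, only documents for which the check returns True would be sent to the OCR step.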
The processing settings in Microsoft Azure are configured so that submitted OCR data is not used for general model training. The applicable Azure terms and privacy policy remain authoritative.
4. AI-based analysis
The extracted text is then automatically analyzed to identify structured information from the document.
Typical extracted data may include:
- Invoice number
- Invoice date
- Total amount
- Company name
- VAT ID
- Address
- Additional selected fields
A language model (AI-based system) may be used for this analysis.
The following provider is currently used:
Microsoft Azure OpenAI Service (GPT-5 mini)
Microsoft Ireland Operations Limited (EU operating entity)
Parts of the extracted text content may be transmitted to this service to enable structured analysis of document data.
The transmitted data is not used for training purposes.
Depending on the selected Azure deployment type (e.g. Global Standard), processing may also occur in data centers outside the European Union. Where required, appropriate safeguards pursuant to Art. 46 GDPR apply, in particular Standard Contractual Clauses (SCC).
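The structured result described in this section could be represented roughly as follows. This is a sketch only: it assumes the language model returns a JSON object, and the class `InvoiceFields`, its field names, and the helper `parse_model_output` are hypothetical illustrations, not the service's actual data model.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class InvoiceFields:
    """Hypothetical container for the typical extracted fields."""
    invoice_number: Optional[str] = None
    invoice_date: Optional[str] = None
    total_amount: Optional[str] = None
    company_name: Optional[str] = None
    vat_id: Optional[str] = None
    address: Optional[str] = None
    # "Additional selected fields" end up here, keyed by name.
    additional: Dict[str, Any] = field(default_factory=dict)


def parse_model_output(raw: str) -> InvoiceFields:
    """Map a JSON object (assumed model output) onto InvoiceFields."""
    data = json.loads(raw)
    known = {f.name for f in fields(InvoiceFields)} - {"additional"}
    result = InvoiceFields(**{k: data[k] for k in known if k in data})
    # Keys that are not standard fields are kept as additional fields.
    result.additional = {k: v for k, v in data.items() if k not in known}
    return result


sample = '{"invoice_number": "2024-0815", "total_amount": "119.00", "currency": "EUR"}'
parsed = parse_model_output(sample)
print(parsed.invoice_number)  # 2024-0815
print(parsed.additional)      # {'currency': 'EUR'}
```

Fields the model does not return stay `None`, which makes missing values easy to spot in the results.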
5. Temporary storage
Uploaded documents and extracted data may be stored temporarily for the purpose of performing the analysis. Temporary storage is managed using Azure Blob Storage.
This storage is strictly limited to the following purposes:
- Performing document analysis
- Providing analysis results
- Generating exportable result files (e.g. Excel)
Uploaded documents and extracted data are automatically deleted after a maximum of 24 hours.
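The 24-hour retention rule can be expressed as a simple time comparison. The sketch below only illustrates the rule. How the service actually triggers deletion is not described here, and the names `RETENTION`, `is_expired`, and `select_expired` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Maximum storage duration stated above.
RETENTION = timedelta(hours=24)


def is_expired(uploaded_at: datetime, now: datetime) -> bool:
    """Return True once a stored item has reached the retention limit."""
    return now - uploaded_at >= RETENTION


def select_expired(items: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of all stored items that are due for deletion."""
    return sorted(name for name, ts in items.items() if is_expired(ts, now))


now = datetime(2024, 3, 2, 12, 0, tzinfo=timezone.utc)
stored = {
    "invoice-a.pdf": datetime(2024, 3, 1, 11, 0, tzinfo=timezone.utc),  # 25 hours old
    "invoice-b.pdf": datetime(2024, 3, 2, 9, 0, tzinfo=timezone.utc),   # 3 hours old
}
print(select_expired(stored, now))  # ['invoice-a.pdf']
```

Using timezone-aware timestamps avoids off-by-hours errors when upload time and cleanup time are recorded in different time zones.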
6. Provision of results
The extracted data is provided in structured form.
Users may download the results, for example as an Excel file (.xlsx).
The downloaded file is stored exclusively on the user's device.
7. User responsibility
Users are responsible for ensuring that they upload only documents whose processing is legally permitted.
In particular, users must not upload documents containing sensitive personal data, such as:
- Health data
- Biometric data
- Information on criminal convictions
- Other sensitive personal data under Art. 9 GDPR
By activating the checkbox during the upload process, you confirm that the uploaded documents do not contain such sensitive data.
8. Accuracy disclaimer
The automated analysis is based on OCR and AI technologies.
Despite careful implementation, it cannot be guaranteed that all extracted data is complete, accurate, or error-free.
The results are provided for assistance purposes only and should be reviewed by the user before further use (e.g. accounting or tax purposes).
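A user who processes the exported results further could support the manual review with a few plausibility checks. The following is a minimal sketch under stated assumptions: field names, the ISO date format (YYYY-MM-DD), and the amount format with a dot as decimal separator are all hypothetical. Such checks can highlight suspicious values but do not replace the review itself.

```python
import re
from datetime import datetime
from typing import Dict, List, Optional


def review_flags(fields: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of fields that should be checked by hand."""
    flags: List[str] = []

    if not fields.get("invoice_number"):
        flags.append("invoice_number: missing")

    try:
        # Assumed format: year-month-day, e.g. 2024-03-01.
        datetime.strptime(fields.get("invoice_date") or "", "%Y-%m-%d")
    except ValueError:
        flags.append("invoice_date: missing or not a valid year-month-day date")

    # Assumed format: digits with an optional two-digit decimal part.
    amount = fields.get("total_amount") or ""
    if not re.fullmatch(r"\d+(\.\d{2})?", amount):
        flags.append("total_amount: missing or not a plain decimal number")

    return flags


plausible = {"invoice_number": "2024-0815", "invoice_date": "2024-03-01", "total_amount": "119.00"}
doubtful = {"invoice_number": "", "invoice_date": "01.03.2024", "total_amount": "119,00 EUR"}

print(review_flags(plausible))      # []
print(len(review_flags(doubtful)))  # 3
```

An empty list only means that no formal problem was found. The values can still be wrong, so they should be compared with the original invoice.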
9. Further information
Further information on the processing of personal data can be found in our Privacy Policy.