From documents to defensible findings. Sources traceable, doubts flagged.

Input document comp_file.pdf

CloudSync's Enterprise segment added $42.6 million in ARR, Mid-Market $18.3 million. Small Business reached $7,450 thousand. Public Sector was roughly 12.4 million, but this blends recurring and one-time fees.

Data instructions

Report reading…
  • Enterprise $42.6M
  • Mid-Market $18.3M
  • Small Business $7.45M
  • Public Sector $12.4M
source: comp_file.pdf
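
The unit conversion in the example above (for instance "$7,450 thousand" reported as $7.45M) can be sketched as follows. This is a minimal illustration, not ReliaParse's actual parser: the function name `to_millions`, the regular expression, and the unit table are all assumptions made for this example.

```python
import re
from decimal import Decimal

# Hypothetical unit table: multipliers that convert each unit to millions.
_UNITS = {
    "thousand": Decimal("0.001"),
    "million": Decimal("1"),
    "billion": Decimal("1000"),
}

# Matches an optional dollar sign, a number with optional thousands
# separators and decimals, and a unit word.
_AMOUNT = re.compile(
    r"\$?\s*([\d,]+(?:\.\d+)?)\s*(thousand|million|billion)",
    re.IGNORECASE,
)


def to_millions(text: str) -> Decimal:
    """Return the first amount found in `text`, expressed in millions."""
    match = _AMOUNT.search(text)
    if match is None:
        raise ValueError(f"no amount found in {text!r}")
    number = Decimal(match.group(1).replace(",", ""))
    return number * _UNITS[match.group(2).lower()]


small_business = to_millions("Small Business reached $7,450 thousand")
enterprise = to_millions("added $42.6 million in ARR")
```

Decimal arithmetic is used here so that reported figures are not distorted by binary floating-point rounding.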

A research commercialisation project. Funded by Business Finland.

Problem

Knowledge work runs on documents.

Underwriting, diligence, compliance, research: so much of it comes down to finding the right facts in messy, scattered sources. Chat-based AI tools offer help, but they work like black boxes, giving confident answers with no honest signal of what they got wrong.

Today's tools

ChatGPT or Copilot

Loses the thread across long document sets. Citations inconsistent, often hallucinated. Confident output regardless of underlying certainty.

With ReliaParse

Auditable, calibrated, abstaining

Reads documents exhaustively and works across many documents. Backs claims with auditable source citations. The model abstains and asks for clarification when it is uncertain or when the sources contradict one another.

How it works

From raw documents to traceable, structured answers.

Step 01

Select sources

Drop in the source documents: filings, reports, transcripts, spreadsheets. Any format, any scale.

Step 02

Define

Spell out what the reports or structured outputs should contain: the fields to extract, the shape of the output, the constraints it must satisfy.
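
As a rough illustration of what such a definition could look like, the sketch below declares fields, types, and constraints and checks a record against them. It is an assumption for illustration only: `FieldSpec`, `validate`, and `ARR_SPEC` are hypothetical names, not ReliaParse's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FieldSpec:
    """One field the output must contain (hypothetical structure)."""
    name: str
    dtype: type
    required: bool = True
    constraint: Optional[Callable[[object], bool]] = None


def validate(record: dict, spec: list) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field in spec:
        value = record.get(field.name)
        if value is None:
            if field.required:
                problems.append(f"{field.name}: missing")
            continue
        if not isinstance(value, field.dtype):
            problems.append(f"{field.name}: expected {field.dtype.__name__}")
        elif field.constraint is not None and not field.constraint(value):
            problems.append(f"{field.name}: constraint failed")
    return problems


# Hypothetical spec for the ARR example: the shape of one output row.
ARR_SPEC = [
    FieldSpec("segment", str),
    FieldSpec("arr_usd_millions", float, constraint=lambda v: v >= 0),
    FieldSpec("note", str, required=False),
]

ok = validate({"segment": "Enterprise", "arr_usd_millions": 42.6}, ARR_SPEC)
bad = validate({"segment": "Enterprise", "arr_usd_millions": -1.0}, ARR_SPEC)
```

Here `ok` is an empty list and `bad` names the field whose constraint failed, so a non-conforming output can be rejected before it reaches a report.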

Step 03

Check and send off

Get the report or structured output back with a full audit bundle: every field linked to the source span it came from.
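
One way to picture an audit bundle is as a set of fields that each carry character offsets into the source text. The sketch below is a simplified assumption, not the actual bundle format: `SourceSpan`, `AuditedField`, `locate`, and `evidence` are hypothetical names, and real documents would need page or layout coordinates rather than plain string offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSpan:
    document: str  # file name of the source
    start: int     # character offset, inclusive
    end: int       # character offset, exclusive


@dataclass(frozen=True)
class AuditedField:
    name: str
    value: float
    span: SourceSpan


def locate(document: str, text: str, quote: str) -> SourceSpan:
    """Find the first occurrence of `quote` in `text` and return its span."""
    start = text.find(quote)
    if start < 0:
        raise ValueError(f"quote not found in {document}: {quote!r}")
    return SourceSpan(document, start, start + len(quote))


def evidence(field: AuditedField, corpus: dict) -> str:
    """Return the exact source text that a field points to."""
    text = corpus[field.span.document]
    return text[field.span.start:field.span.end]


corpus = {
    "comp_file.pdf": (
        "CloudSync's Enterprise segment added $42.6 million in ARR, "
        "Mid-Market $18.3 million."
    )
}
enterprise = AuditedField(
    name="Enterprise ARR (USD millions)",
    value=42.6,
    span=locate("comp_file.pdf", corpus["comp_file.pdf"], "$42.6 million"),
)
quoted = evidence(enterprise, corpus)
```

Because the span is stored with the value, a reviewer can re-read the quoted text without searching the document again.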

Approach

A different architecture for trustworthy answers.

Pillar I

New model architectures

We pair generative LLMs with discriminative zero-shot models that identify the relevant chunks and spans before generation. This anchors every answer to a specific location in the source and produces calibrated confidence, something a single LLM cannot reliably do on extraction tasks.
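
The select-then-generate idea can be sketched with a toy scorer. This is an assumption-laden illustration only: `overlap_score` is a crude token-overlap stand-in for a discriminative model, its scores are not calibrated probabilities, and the names `Selection` and `select_chunk` and the threshold value are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Selection:
    chunk_index: Optional[int]  # None means the system abstains
    score: float


def overlap_score(question: str, chunk: str) -> float:
    """Toy relevance score: fraction of question tokens present in the chunk."""
    q_tokens = set(question.lower().split())
    c_tokens = set(chunk.lower().split())
    return len(q_tokens & c_tokens) / len(q_tokens) if q_tokens else 0.0


def select_chunk(question: str, chunks: list, threshold: float = 0.5) -> Selection:
    """Pick the best-scoring chunk, or abstain if no chunk clears the threshold."""
    if not chunks:
        return Selection(None, 0.0)
    scores = [overlap_score(question, chunk) for chunk in chunks]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return Selection(None, scores[best])  # abstain: evidence too weak
    return Selection(best, scores[best])


chunks = [
    "Enterprise segment added $42.6 million in ARR",
    "The office moved to a new building",
]
answerable = select_chunk("enterprise segment arr", chunks)
unanswerable = select_chunk("headcount in germany", chunks)
```

Only a chunk that clears the threshold would be passed on to generation; for the second question the sketch abstains rather than producing an answer from weak evidence.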

Pillar II

Reconciliation across documents

Workflows operate over whole collections, not single files. When several passages are relevant or sources contradict one another, the system reconciles them explicitly and surfaces the conflict, rather than silently picking one and moving on.
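
A minimal sketch of explicit reconciliation is shown below, assuming numeric claims and a simple tolerance check. The names `Claim`, `Reconciled`, and `reconcile`, the second file `other_report.pdf`, and its figures are invented for illustration and do not describe the actual system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    field: str
    value: float
    source: str


@dataclass(frozen=True)
class Reconciled:
    field: str
    value: Optional[float]  # None when the sources conflict
    conflict: bool
    claims: tuple           # every claim is kept for display


def reconcile(claims: list, tolerance: float = 0.01) -> list:
    """Group claims by field and flag any field whose sources disagree."""
    by_field = defaultdict(list)
    for claim in claims:
        by_field[claim.field].append(claim)
    results = []
    for field, group in by_field.items():
        values = [claim.value for claim in group]
        conflict = max(values) - min(values) > tolerance
        # On conflict, no value is chosen; the disagreement is surfaced instead.
        value = None if conflict else values[0]
        results.append(Reconciled(field, value, conflict, tuple(group)))
    return results


# Hypothetical claims; the second document and its numbers are made up.
claims = [
    Claim("Enterprise ARR", 42.6, "comp_file.pdf"),
    Claim("Enterprise ARR", 42.6, "other_report.pdf"),
    Claim("Public Sector ARR", 12.4, "comp_file.pdf"),
    Claim("Public Sector ARR", 9.8, "other_report.pdf"),
]
results = reconcile(claims)
```

The agreeing field keeps its value, while the conflicting field is returned with no value and both claims attached, so the disagreement is shown rather than resolved silently.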

Pillar III

UI built for verification

The interface is the proof. Every answer traces back to a highlighted span in its source document, with reconciliation steps and conflicting information shown in full, turning verification from a re-investigation into a glance.

Our team

ReliaParse is created by a team of academics and commercial experts based at the University of Helsinki.

Dr. Aleksi Knuutila
Project Lead

15+ years of research in computational social science, including at Oxford and University College London. Leads the research work.

Dr. Roman Kyrychenko
Analytical Workflows

Widely published in statistical methods and LLM applications. Industry experience as a data scientist.

Antti Nikander
Industry & Pilots

Runs pilot relationships, scopes engagements with partner organisations, and maps the route out of the research phase.

Tommi Laivuori
LLM Verification

20+ years as an entrepreneur in tech startups and corporate venturing. Coordinates go-to-market and AI risk mitigation.

Dr. MBA Vinod Vadakital
LLM Verification

Background in basic research in signal processing, and expertise in commercialisation in the digital domain.

FAQ

Questions we hear from potential users.

Is ReliaParse a product I can buy today?

Not yet. ReliaParse is a research project actively developing new approaches to document AI, and we’re working with a small number of design partners to evaluate the system on real, high-stakes work. Early users get hands-on access to the current system and direct involvement in shaping where it goes next. We expect a productised offering to follow, but design partners come first.

Which document types and formats do you support?

PDFs, scans, Word documents, spreadsheets, and emails: anything that arrives in a typical case file. We handle native digital documents and scanned ones, and we work with documents in European languages such as Finnish. If your workflow involves a format we don’t yet support, tell us, and we will include it in our roadmap.

What happens to users' data?

The pilot version of our system runs on CSC infrastructure in a data centre in Kajaani, Finland, not on the public cloud. Data shared during a pilot is not used for training models or for any other purpose. We work within whatever data handling arrangement users require, and are open to on-premise deployment for sensitive workflows where needed. We’re happy to sign a DPA before any documents are shared.

Who's behind the project?

ReliaParse is a research project based at the University of Helsinki. We are funded by Business Finland as a Research to Business project. Our team’s background is in applied machine learning in the field of computational social sciences, especially NLP and the reliable processing of heterogeneous data.

Latest writing

From the research notebook.

Our approach is to "build in public": roughly one note a month, covering methods, field reports, and the occasional position piece.

Get in touch

We are looking for early adopters.

Could your work benefit from new approaches to AI? Get in touch to discuss how we could collaborate. We are looking for early users to help us shape the project around real user needs, and we can offer them the system free of charge.