SHARE:

AI applied to contracts: how Doczy.ai turns complex documents into strategic data using AWS

Artificial intelligence has finally landed where it makes the biggest difference: that mountain of documents nobody wants to read but everybody needs to understand.

If you have ever worked at a company that deals with contracts, you know the drill. Stacks of files, PDFs all over the place with no standard format, tables that seem designed to confuse, and at the end of the day someone still has to manually pull every relevant piece of information out of every clause.

It is slow, it is expensive, and inevitably, it is full of mistakes.

Well, that is exactly the problem that global consulting firm AArete, which specializes in healthcare and financial services, decided to solve. The result has a name: Doczy.ai. The solution runs on top of AWS infrastructure and uses generative AI to transform complex contracts into structured data ready to fuel real business decisions. And the numbers this platform has posted over the past 22 months leave very little room for skepticism.

We are talking about 2.5 million documents processed, the equivalent of roughly 50 million pages, more than $330 million in cumulative savings for clients, and a 97% reduction in manual processing time. But what really stands out in this story is not just the scale. It is the journey to get here, because Doczy.ai was not born ready, and understanding how it evolved helps explain why it works so well today. 🚀

The real problem: valuable data locked inside documents

For healthcare and financial services organizations, managing and interpreting contracts is a massive operational bottleneck. Think about the sheer amount of critical information trapped inside legal agreements, vendor arrangements, invoices, and provider contracts. All of it in unstructured formats, scattered across physical and digital folders, with zero standardization that would allow a quick lookup.

The traditional review process requires entire teams to be mobilized to extract data from thousands of documents. This approach does not scale, it is not sustainable, and it is highly error-prone. To make matters worse, many organizations rely on institutional knowledge, meaning the most important information about contracts lives in the heads of a handful of professionals. That creates knowledge silos and serious continuity risks when those people leave the company.

Traditional contract lifecycle management systems, known as CLMs, do not fully solve the problem either. They can set up predefined fields, but they miss the rich detail and contextual information that makes each contract unique. In practice, that means clauses with important nuances end up being treated generically, and the real value of the document gets lost along the way.

In healthcare, this impact is even more visible. Reimbursement terms need to be manually translated into claims systems, a slow process riddled with opportunities for errors. Likewise, verifying vendor invoices against contract terms requires constant manual effort, creating payment delays and missed savings opportunities. These are inefficiencies that, added up, leave a lot of money on the table.

From manual spreadsheets to a generative AI platform

AArete started building Doczy.ai from a real pain point it experienced firsthand in its own consulting operations. Before the platform existed in its current form, the company relied on heavily manual processes to review and extract data from contracts, depending on entire teams to catalog information that could perfectly well be automated. The problem was not a lack of human talent. It was the repetitive, high-volume nature of the work, which consumed precious hours from professionals who could have been focused on more strategic analysis.

Receive the best innovation content in your email.

All the news, tips, trends, and resources you're looking for, delivered to your inbox.

By subscribing to the newsletter, you agree to receive communications from Método Viral. We are committed to always protecting and respecting your privacy.

The evolution of Doczy.ai mirrors the rapid advancement of artificial intelligence itself. Before 2020, document processing was essentially manual, with professionals able to handle roughly 100 documents per week. Between 2020 and 2023, the company implemented rule-based processing, reaching about 55% accuracy. The big leap came in 2024, when generative AI-based processing built on AWS hit 99% accuracy, a dramatic improvement over the 55% from earlier rule-based systems.

Over time, the consultancy realized that the artificial intelligence models available on the market had matured enough to handle legal and corporate language reliably. That is when the partnership with AWS took on a central role in the project. Amazon Web Services did not enter this story merely as a cloud infrastructure provider. It brought along a complete ecosystem of machine learning services, natural language processing, and generative AI tools that allowed Doczy.ai to scale with security and speed without compromising the accuracy that legal documents demand.

How document processing works in practice

When a new contract enters Doczy.ai, it goes through a document processing pipeline that starts with file normalization. Regardless of the original format, the system converts the content into a structure that artificial intelligence models can process consistently. This step might sound simple, but it is where many similar solutions stumble, especially when the document has embedded tables, headers in unusual positions, or text in images that needs OCR before any analysis can happen.

The architecture behind the magic

Doczy.ai is built on a comprehensive AWS architecture designed to handle the entire document processing lifecycle, from the moment a file enters the system to the moment it generates actionable business intelligence.

External users access the platform through a secure Next.js frontend, with Amazon Cognito managing authentication and authorization behind the scenes. After authentication, users upload documents directly to Amazon S3, where durable and scalable object storage ensures nothing gets lost and everything remains accessible at scale. From there, the real intelligence kicks in.

An AWS Lambda function triggers Amazon Textract to extract text and metadata from documents in various formats. What sets Doczy.ai apart at this stage is its patented intelligent fragmentation algorithm, called smart chunking. This proprietary approach goes far beyond simply pulling words off a page.

Smart chunking: intelligent document fragmentation

Instead of treating a document as a flat sequence of text, smart chunking preserves the hierarchical structure and one-to-many relationships within documents. It uses a combination of semantic and keyword-based search to break the text into meaningful, context-aware fragments, applying dynamic parameters to maintain logical relationships throughout the entire process. Sequential identifiers and metadata-based groupings organize these fragments into field groups, detecting overlaps and removing duplications while keeping the natural flow of the document intact.

The dual clustering engine

After fragmentation, the document enters Doczy.ai’s dual clustering engine. This two-perspective methodology analyzes each contract simultaneously from both a semantic and a structural standpoint.

On the semantic side, the extracted text is converted into embeddings, numerical representations of meaning, and similar ideas are grouped together even when expressed with different words. On the structural side, pattern recognition algorithms identify clause types, formatting conventions, table layouts, and hierarchical organization. For example, the system understands that an appendix with three levels of nesting has fundamentally different implications than a simple attached schedule.

These two analyses do not operate in isolation. Projection algorithms compare the semantic and structural clusters side by side, synthesizing them into a unified, enriched document model that captures both meaning and context. It is this convergence that drives Doczy.ai’s 99% accuracy rate. The system does not just read the words — it understands the contract.

Large language models, the now-famous LLMs, then generate structured outputs grounded in this dual clustering intelligence. Before the output is finalized, the system determines the file class of each document and generates prompts tailored to the extracted text, cluster classification, and domain context. Through few-shot and multi-shot prompting techniques, the platform continuously refines prompts based on domain-specific examples and real-world results, creating a feedback loop that progressively improves accuracy over time. ⚙️

Storage and monitoring

The resulting structured data flows into Snowflake, forming a centralized repository that feeds intelligent dashboards with actionable insights and visualizations. Throughout the entire pipeline, Amazon CloudWatch monitors performance in real time and proactively identifies issues before they escalate, while AWS Secrets Manager protects sensitive information, ensuring security is not an afterthought but a foundational layer integrated into every stage of the system.

The results that put Doczy.ai on the market’s radar

When a contract automation platform says it has generated more than $330 million in savings for its clients, the natural first reaction is to question how that number was calculated. In the case of Doczy.ai, the methodology accounts for both direct operational cost reductions, such as eliminated hours of manual labor, and indirect gains like contract renegotiations that were only possible because the platform identified unfavorable clauses that would have gone unnoticed in a conventional human review. That second type of gain tends to be the most surprising for clients, because it represents value that was hidden inside their own contracts all along.

In terms of operational scale, the past 22 months demonstrate the platform’s maturity and production readiness. Doczy.ai has made 137 million API calls to Amazon Bedrock and processed 442 billion tokens, a level of automation and precision previously unattainable through manual or traditional document processing approaches.

The 99% accuracy rate represents a significant improvement over the roughly 55% accuracy of rule-based systems and far exceeds manual processing, which is typically affected by fatigue and human error. The 97% reduction in manual processing time translates directly into cost savings and allows organizations to reallocate human resources to higher-value activities that require judgment and strategic thinking.

For industries like healthcare, retail, and financial services, where contract management is constant and strategic, this speed gain has a direct impact on business responsiveness. Renegotiations that used to depend on time-consuming analyses can now happen on schedule without the legal team having to put everything else on hold to make them possible.

Beyond the financial and operational results, Doczy.ai has also stood out for significantly reducing the interpretation errors that occur in manual reviews. Artificial intelligence does not get tired, does not lose focus after reading the tenth contract of the day, and does not interpret a clause differently depending on who the reviewer is. That level of consistency is especially valuable in audits and due diligence processes, where an inconsistency in contract reading can have serious consequences. 📊

Use case in action: process automation for health plans

For health plans, Doczy.ai offers a powerful solution to automate and improve contract management across the entire lifecycle. The platform ingests existing contracts in both physical and digital formats, integrates with contract management systems like Coupa and Icertis, and processes new contracts and amendments as they are executed. It then creates a centralized metadata repository that feeds directly into downstream systems, enabling end-to-end business process automation.

This automation unlocks critical capabilities:

Tools we use daily

  • Continuous analysis of contract terms: organizations can consistently identify opportunities to improve financial performance and operational efficiency.
  • Automatic feeding of claims systems: the architecture sends accurate, up-to-date contract data directly to claims systems, automating the configuration process that previously required manual translation of reimbursement terms, eliminating manual data entry, configuration errors, and delays.
  • Payment verification: the platform helps maintain claims payment accuracy by evaluating payments against contract terms, identifying discrepancies, and flagging potential overpayments or underpayments before they occur.

By automating manual processes, health plans can quickly adapt to new contract terms and regulatory requirements. The intelligent dashboards and actionable insights provided by Doczy.ai enable decision-makers to understand contract performance, identify trends, and take proactive steps to optimize financial outcomes.

The strategic role of AWS in this equation

The choice of AWS as the technological foundation for Doczy.ai was no accident. Amazon Web Services offers a suite of services that align very directly with the needs of a large-scale document processing platform. Services like Amazon Textract for text and data extraction, Amazon Cognito for authentication, Amazon S3 for storage, AWS Lambda for serverless processing, Amazon ECS for containerization, and Amazon Bedrock for access to generative AI models were all fundamental to building the current architecture.

More than the individual services, what the AWS partnership brought was the ability to combine these tools into a cohesive pipeline with data governance, security, and regulatory compliance baked in. For enterprise clients, especially in the financial and healthcare sectors, this is not a nice-to-have — it is a prerequisite. No large organization is going to put its contracts on a platform that cannot clearly demonstrate how data is handled, stored, and protected. AWS infrastructure covers exactly that requirement, and AArete knew how to leverage it in building the product.

Another key aspect of the partnership is access to continuous updates to the artificial intelligence models available through AWS. The generative AI field evolves at a breakneck pace, and a platform that relies on static models quickly falls behind. By running on Amazon Bedrock, Doczy.ai can incorporate newer, more accurate models without having to rebuild the entire architecture from scratch. This ensures the platform stays competitive without the costs and risks associated with a complete technology migration every time the industry goes through an innovation cycle. 🔄

The SaaS model and platform accessibility

AArete offers Doczy.ai as a Software as a Service solution, which means interested organizations can adopt the platform without significant infrastructure investments. This distribution model allows for rapid deployment, and AArete’s team of specialists configures the solution according to the client’s specific document types, domain terminology, and business processes. The goal is to ensure the platform delivers maximum value from day one of operation.

This approach makes the technology accessible not only to large corporations with robust IT budgets but also to mid-sized organizations that face the same contract management challenges with more limited resources for investing in custom solutions. The ability to process up to 250,000 contract documents per week at 99% accuracy shows the platform is ready to meet demand at any scale.

What this means for people who deal with contracts every day

For legal, procurement, and vendor management teams, Doczy.ai represents a paradigm shift in how contract work gets done. Instead of spending hours manually hunting for information in lengthy documents, these professionals get to work with data that is already organized and searchable, freeing them to focus their energy on analysis, strategy, and decision-making. This does not mean replacing people — it means the same people can do far more, with far greater confidence in the data they are using.

Contract automation also changes how companies view their own contract portfolio. When data is structured and accessible, it becomes much easier to spot patterns, like vendors with systematically unfavorable clauses, contracts nearing expiration that need attention, or obligations being overlooked due to lack of visibility. This kind of strategic intelligence used to be the privilege of large enterprises with teams dedicated exclusively to contract management. With platforms like Doczy.ai, it is starting to become accessible to organizations of all sizes.

What the trajectory of Doczy.ai shows, above all, is that artificial intelligence applied to real, specific problems tends to deliver much more concrete results than generic solutions. The platform’s success did not come from trying to solve everything at once but from going deep on a well-defined problem — turning contracts into a strategic data asset — with robust technology and the right partners. By embracing document intelligence on AWS, organizations can tackle this longstanding operational challenge and unlock a new frontier of strategic advantage, turning their data into their most valuable asset. 💡

Picture of Rafael

Rafael

Operations

I transform internal processes into delivery machines — ensuring that every Viral Method client receives premium service and real results.

Fill out the form and our team will contact you within 24 hours.

Related publications

AI SDR Agent on WhatsApp: How SMBs Can Cut Costs and Scale Sales

Respond 21x faster your leads and scale your sales operation with a fraction of the cost of expanding your sales

Robot Detects Unusual Browser Activity Using JavaScript and Cookies

Learn why sites require JavaScript and cookies for unusual activity and how to fix blocks with quick, simple steps

Productivity with Agentic Artificial Intelligence in execution and workflows.

Agentic AI: how to operationalize AI agents to improve workflows, metrics, and governance, turning pilots into real productivity gains.

Receive the best innovation content in your email.

All the news, tips, trends, and resources you're looking for, delivered to your inbox.

By subscribing to the newsletter, you agree to receive communications from Método Viral. We are committed to always protecting and respecting your privacy.

Rafael

Online

Atendimento

Website Pricing Calculator

Find out how much the ideal website for your business costs

Website Pages

How many pages do you need?

Drag to select from 1 to 20 pages

In just 2 minutes, automatically find out how much a custom website for your business costs

More than 0+ companies have already calculated their quote

Fale com um consultor

Preencha o formulário e nossa equipe entrará em contato.