8 Documentation is data
8.1 The operating system analogy
Maritime rules and regulations are, in a meaningful sense, the software that runs the shipping industry. Just as a computer's operating system defines how hardware resources are managed, processes are scheduled, and applications communicate, the ISM Code defines how safety management systems must work. It specifies the interfaces (Designated Person Ashore as the link between ship and shore), the processes (internal audits, non-conformity reporting, management review), and the requirements that every "application" (every company procedure) must satisfy.
Extending this analogy, SOLAS chapters are like operating system modules. Chapter III handles life-saving appliances. Chapter II-2 handles fire protection. Chapter V handles navigation safety. Each module has its own interface, its own requirements, and its own update cycle. Company procedures are the application code, implementing the operating system's requirements for a specific context. Vessel-specific documents are the runtime configuration, the parameters that customize the application for a particular execution environment.
When documentation remains in flat files (Word documents and PDFs), it is as if this entire software stack were being run by manually typing instructions into a terminal, one line at a time, with no compiler to catch errors, no version control to track changes, and no automated tests to verify correctness. Treating documentation as data means giving this critical system the same rigorous infrastructure that actual software depends on.
8.2 What becomes possible when documents are structured data
Programmatic gap analysis. When every regulatory requirement is a node in a tree, and every company procedure is linked to the requirements it addresses, a computer can traverse both trees and identify gaps: requirements with no corresponding procedure, procedures that reference superseded regulations, and required sections missing from a manual. This is not hypothetical; it is the operational principle behind AI-powered compliance tools already emerging in adjacent industries.
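As a rough illustration of the traversal described above, the following minimal sketch walks a requirement tree and compares it with the links declared by each procedure. The class names, the `find_gaps` function, and identifiers such as `CH-III/R1` are hypothetical placeholders chosen for this example; they are not real regulation numbers or part of any existing tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Requirement:
    """One node in a (hypothetical) regulatory tree."""
    ref: str
    superseded: bool = False
    children: list[Requirement] = field(default_factory=list)


@dataclass
class Procedure:
    """A company procedure and the requirement refs it claims to address."""
    doc_id: str
    addresses: list[str]


def walk(node):
    """Yield every requirement in the tree, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_gaps(root, procedures):
    """Return (uncovered leaf refs, ids of procedures citing superseded refs)."""
    nodes = {n.ref: n for n in walk(root)}
    covered = {ref for p in procedures for ref in p.addresses}
    # A gap is a current leaf requirement that no procedure addresses.
    uncovered = sorted(
        ref for ref, n in nodes.items()
        if not n.children and not n.superseded and ref not in covered
    )
    # A stale procedure links to at least one superseded requirement.
    stale = sorted(
        p.doc_id for p in procedures
        if any(ref in nodes and nodes[ref].superseded for ref in p.addresses)
    )
    return uncovered, stale


# Placeholder data, for illustration only.
root = Requirement("CH-III", children=[
    Requirement("CH-III/R1"),
    Requirement("CH-III/R2"),
    Requirement("CH-III/R3-old", superseded=True),
])
procedures = [
    Procedure("PROC-001", ["CH-III/R1"]),
    Procedure("PROC-002", ["CH-III/R3-old"]),
]
uncovered, stale = find_gaps(root, procedures)
```

With this placeholder data, `uncovered` contains the one requirement nobody addresses and `stale` contains the one procedure that still cites a superseded node. A production system would also need to check required manual sections, which this sketch leaves out.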
Automated change propagation. When IMO amends a SOLAS regulation, the change can be mapped to a specific node in the regulatory tree. Every document that references or implements that node is automatically flagged. The compliance manager does not discover the impact through manual review; the system presents a precise list of affected documents, sections, and cross-references.
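One way to picture this flagging step is a reverse index from requirement nodes to the document sections linked to them. The sketch below assumes links are stored as simple tuples and that node identifiers are slash-separated paths; both are assumptions made for the example, and the identifiers are placeholders rather than real regulation numbers.

```python
from collections import defaultdict

# Hypothetical link records: (document id, section id, requirement ref).
LINKS = [
    ("SMS-MANUAL", "4.2", "CH-V/R19"),
    ("BRIDGE-PROC", "1.1", "CH-V/R19"),
    ("BRIDGE-PROC", "3.4", "CH-V/R34"),
    ("FIRE-PLAN", "2.0", "CH-II-2/R10"),
]


def build_reverse_index(links):
    """Map each requirement ref to the (document, section) pairs linked to it."""
    index = defaultdict(list)
    for doc_id, section, ref in links:
        index[ref].append((doc_id, section))
    return index


def affected_by(amended_ref, index):
    """List (document, section) pairs linked to the amended node or any node below it."""
    prefix = amended_ref + "/"
    hits = set()
    for ref, targets in index.items():
        if ref == amended_ref or ref.startswith(prefix):
            hits.update(targets)
    return sorted(hits)


index = build_reverse_index(LINKS)
single_node = affected_by("CH-V/R19", index)   # amendment to one node
whole_chapter = affected_by("CH-V", index)     # amendment to a whole subtree
```

Matching on the path prefix means an amendment to a parent node also flags everything linked to its children, which is usually the conservative choice for impact review.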
Content reuse without drift. The assembly pattern ensures that shared content (standard safety warnings, common procedures, regulatory boilerplate) is maintained in one place and included by reference everywhere it appears. When the source changes, every assembly that includes it reflects the change, with version pinning providing controlled rollout. Across a fleet of fifty vessels, a single procedure update propagates consistently and traceably.
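The include-by-reference idea with version pinning can be sketched as follows. The block store layout, the `resolve` function, and the block identifiers are all hypothetical; the actual assembly format is not specified here, so treat this only as one possible shape.

```python
# Hypothetical block store: block id -> {version number: text}.
BLOCKS = {
    "intro": {1: "This procedure applies to all crew."},
    "warn-enclosed-space": {
        1: "Test the atmosphere before entry.",
        2: "Test the atmosphere before entry and keep testing during work.",
    },
}


def resolve(assembly, blocks):
    """Render an assembly of (block_id, pinned_version) entries.

    A pinned_version of None means "use the latest version".
    Returns the rendered text and a manifest of the versions actually used.
    """
    parts, manifest = [], []
    for block_id, pinned in assembly:
        versions = blocks[block_id]
        version = pinned if pinned is not None else max(versions)
        parts.append(versions[version])
        manifest.append((block_id, version))
    return "\n".join(parts), manifest


# One vessel tracks the latest warning text; another stays pinned to version 1
# until the update has been reviewed for it.
text_a, manifest_a = resolve([("intro", None), ("warn-enclosed-space", None)], BLOCKS)
text_b, manifest_b = resolve([("intro", None), ("warn-enclosed-space", 1)], BLOCKS)
```

The manifest is what makes the rollout traceable: it records which version of each shared block went into each rendered document.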
AI-powered search and analysis. Large language models and retrieval-augmented generation (RAG) systems work better on structured content, because retrieval can target a specific section instead of an arbitrary slice of a long file. Rather than searching through monolithic PDFs and hoping the relevant passage surfaces, an AI system can query at the section level, filter by document type or regulatory basis, and retrieve precisely the content needed. The natural chunking provided by AST-based content, where every section is an individually addressable node with metadata, is exactly what RAG architectures require.
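The sketch below shows section-level retrieval with metadata filtering in its simplest form. It is an illustration under stated assumptions: the `Section` fields and sample data are hypothetical, and plain keyword overlap stands in for the embedding similarity a real RAG system would use.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One addressable section node with its metadata (hypothetical schema)."""
    node_id: str
    doc_type: str
    regulatory_basis: str
    text: str


SECTIONS = [
    Section("fire/drills", "procedure", "SOLAS II-2",
            "fire drills are held monthly on every vessel"),
    Section("lsa/drills", "procedure", "SOLAS III",
            "abandon ship drills are held monthly"),
    Section("fire/checklist", "checklist", "SOLAS II-2",
            "check fire extinguishers before departure"),
]


def retrieve(query, sections, doc_type=None, regulatory_basis=None, top_k=2):
    """Filter sections by metadata, then rank by keyword overlap with the query.

    Keyword overlap is a stand-in for vector similarity in this sketch.
    """
    query_words = set(query.lower().split())
    candidates = [
        s for s in sections
        if (doc_type is None or s.doc_type == doc_type)
        and (regulatory_basis is None or s.regulatory_basis == regulatory_basis)
    ]
    scored = [(len(query_words & set(s.text.lower().split())), s) for s in candidates]
    scored = [(score, s) for score, s in scored if score > 0]
    # Highest overlap first; ties broken by node id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1].node_id))
    return [s.node_id for _, s in scored[:top_k]]


procedures_only = retrieve("fire drills", SECTIONS, doc_type="procedure")
fire_chapter_only = retrieve("fire drills", SECTIONS, regulatory_basis="SOLAS II-2")
```

Because each chunk is a whole section with its own identifier, the result can be cited back to an exact place in the source document rather than to a page range.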
Automatic audit trails. Every content block has a version history. Every assembly records which versions of which blocks were included at each edition. Every edit is timestamped and attributed. When an auditor asks for the history of a specific procedure, the system provides it instantly, not as a folder of superseded PDFs but as a structured timeline of changes with full context.
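A structured timeline of this kind could be modeled as an append-only log of revisions. The following is a minimal in-memory sketch; the `Revision` fields, the `AuditLog` class, and the sample names are hypothetical, and a real system would persist the log and protect it from modification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    """One immutable, attributed, timestamped change to a content block."""
    block_id: str
    version: int
    author: str
    timestamp: datetime
    summary: str


class AuditLog:
    """Append-only record of revisions, kept in memory for illustration."""

    def __init__(self):
        self._revisions = []

    def record(self, block_id, author, summary, timestamp=None):
        """Append a revision; version numbers count up per block from 1."""
        version = 1 + sum(1 for r in self._revisions if r.block_id == block_id)
        when = timestamp or datetime.now(timezone.utc)
        revision = Revision(block_id, version, author, when, summary)
        self._revisions.append(revision)
        return revision

    def history(self, block_id):
        """Return the revisions of one block in version order."""
        return sorted(
            (r for r in self._revisions if r.block_id == block_id),
            key=lambda r: r.version,
        )


log = AuditLog()
log.record("mooring-proc", "a.editor", "Initial issue")
log.record("bunkering-proc", "a.editor", "Initial issue")
log.record("mooring-proc", "b.reviewer", "Added snap-back zone warning")
timeline = [(r.version, r.author, r.summary) for r in log.history("mooring-proc")]
```

Answering an auditor's question then becomes a single `history` call per block, instead of a search through folders of superseded files.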
8.3 The vision ahead
The structured documentation approach represents a foundation, not a ceiling. When documents are data, they become inputs to increasingly sophisticated systems. A gap analyzer can compare an organization's documentation against a regulatory framework and quantify compliance completeness. A fleet-wide search system can answer questions like "Which vessels have procedures that reference the pre-2024 version of MARPOL Annex VI?" in seconds. A training system can extract procedural steps from operational documents and generate competency assessments. A change management system can model the impact of a proposed regulatory amendment across an entire fleet's documentation before the amendment even takes effect.
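Once citations are stored as data, the fleet-wide question quoted above reduces to a filter over citation records. The sketch below assumes each procedure records which instrument and which edition year it cites; the vessel names, procedure ids, and edition years are invented for illustration and do not reflect actual amendment dates.

```python
# Hypothetical citation records:
# (vessel, procedure id, instrument cited, edition year cited).
CITATIONS = [
    ("MV Alpha", "BUNKER-01", "MARPOL Annex VI", 2021),
    ("MV Alpha", "GARBAGE-02", "MARPOL Annex V", 2018),
    ("MV Bravo", "BUNKER-01", "MARPOL Annex VI", 2024),
    ("MV Charlie", "EMISSIONS-03", "MARPOL Annex VI", 2022),
]


def vessels_citing_older_than(instrument, year, citations):
    """Map each vessel to its procedures that cite an edition earlier than `year`."""
    hits = {}
    for vessel, procedure_id, cited, edition in citations:
        if cited == instrument and edition < year:
            hits.setdefault(vessel, []).append(procedure_id)
    return {vessel: sorted(ids) for vessel, ids in sorted(hits.items())}


outdated = vessels_citing_older_than("MARPOL Annex VI", 2024, CITATIONS)
```

With this sample data, `outdated` lists the two vessels whose procedures still cite an edition earlier than 2024, along with the procedure ids involved.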
Other high-regulation industries have already traveled this path. Aviation uses S1000D, an XML standard for technical publications where content is stored as individually identified "data modules" in a common source database, reused across aircraft types, maintenance organizations, and regulatory jurisdictions. The British Type 45 destroyer's documentation (equivalent to 120,000 pages) is managed entirely through S1000D. The maritime industry has its own adaptation, Shipdex, built on S1000D principles, with participants including MAN Energy Solutions, Rolls-Royce Maritime, and Yanmar. But Shipdex requires specialized XML tooling and significant implementation investment.
Markdoc and the AST-based approach offer a lighter-weight path to the same destination: structured, validatable, reusable, machine-readable content, authored in a format that technical writers and subject matter experts can learn in an afternoon rather than a year. This approach provides much of the structural power of enterprise XML standards with a fraction of the complexity, making it practical for maritime organizations of all sizes.