Our Call to Open Knowledge Format

The Google Cloud team proposed the Open Knowledge Format (OKF) as an open specification that formalizes how knowledge should be structured so that AI agents can actually use it. Not just search it. Not just retrieve it. Actually reason over it, navigate it, and update it over time.

The spec itself is elegant in its simplicity. A directory of markdown files. YAML frontmatter with a small set of agreed-upon fields: type, title, description, resource, tags, timestamp. Concepts linked to each other through normal markdown links, forming a traversable graph. No proprietary SDK. No vendor lock-in. No required runtime.
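To make that concrete, here is a minimal sketch of what a consumer could do with one such file: split the frontmatter from the body and collect the outgoing links that form the graph. The sample document, its field values, and the function names are our own illustration, not taken from the spec, and the parser only handles flat `key: value` lines rather than full YAML.

```python
import re

# Hypothetical OKF-style document; the values are invented for illustration.
SAMPLE_DOC = """---
type: table
title: Orders
description: One row per customer order.
tags: sales, orders
timestamp: 2025-01-01
---
Joins to [Customers](customers.md) and feeds the [Revenue](metrics/revenue.md) metric.
"""

# Matches inline markdown links of the form [label](target).
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def split_frontmatter(text):
    """Split a document into (fields, body).

    Assumption: frontmatter is a flat list of `key: value` lines.
    A real consumer would use a proper YAML parser instead.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    fields = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return fields, body
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    # No closing delimiter found: treat the whole text as body.
    return {}, text


def outgoing_links(body):
    """Return the link targets found in the markdown body, in document order."""
    return [target for _label, target in LINK_PATTERN.findall(body)]


fields, body = split_frontmatter(SAMPLE_DOC)
links = outgoing_links(body)
```

Running the same two functions over every file in a directory yields the nodes (frontmatter) and edges (links) of the knowledge graph, with no SDK or runtime involved.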

Read the full blog post here: Introducing the Open Knowledge Format

We read it carefully. Here is our take.

Why OKF Matters

The problem Google is naming is real and severe. It is also part of the problem we are trying to solve: we are building the Malimgraph plugin to experiment with and standardize our own output.

Most organisations have knowledge scattered across wikis, metadata catalogs, KMS, library systems, site pages, code comments, and shared drives, sometimes with no engineers available to maintain any of it. When an AI agent needs to answer a question, it has to assemble context from these mutually incompatible surfaces. Every agent builder solves the same context-assembly problem from scratch. Every catalog vendor reinvents the same data models. Nothing is portable.

OKF proposes a format as the solution, not another service. That is exactly the right instinct. A shared format is how ecosystems are built. It is how HTML unified the web. It is how JSON unified APIs. A format that anyone can produce and anyone can consume, without an integration, without an account, without a vendor in the middle.

The three principles behind the design are worth calling out:

Minimally opinionated. OKF only requires a type field. Everything else is left to the producer. The spec defines the interoperability surface, not the content model. This is how you get adoption.

Producer and consumer independence. A bundle written by a human can be consumed by an AI agent. A bundle generated by a pipeline can be browsed in a visualizer. The format is the contract. Tooling at each end is independently swappable.

Format, not platform. OKF will never require a proprietary account to read or write. The value of a knowledge format comes from how many parties speak it, not from who owns it.

These are good principles. We agree with all three.

OKF is designed for structured organisational knowledge: database schemas, metric definitions, API runbooks, join paths between tables. It is built for the context that BigQuery agents need. That is a valuable and legitimate use case.

But it assumes the knowledge already exists in a legible form somewhere. It is a format for knowledge that has already been captured.

The harder problem, the one we are working on at Malim AI Labs, is what happens when the knowledge has never been properly captured at all.

Malaysian public knowledge is still fragmented, not just across systems but across languages, institutions, legal jurisdictions, and decades of inconsistent documentation practices. For example, government legislation exists in PDFs that were never meant to be machine-readable. Court judgments sit in formats that vary by year and by issuing body. Policy documents reference each other without hyperlinks. Census data uses classification codes that have been revised multiple times without cross-referencing the old codes.

An OKF producer agent that crawls a BigQuery dataset and drafts structured markdown files is impressive. But it assumes clean, accessible upstream data. In the Malaysian public data context, you have to solve the upstream problem first.

That is the problem our Knowledge Bank (name subject to change) is meant to solve.

Our Take on the Future of Knowledge Format

OKF v0.1 is a starting point, as the Google team themselves acknowledge. We think the format will evolve in a few important directions.

Provenance will become a first-class field.

Right now, OKF tracks a timestamp. That is useful. But agents operating over institutional knowledge need to know more than when a document was last updated. They need to know who verified it, what the source document was, which version of a regulation it reflects, and whether a human reviewed it before it was published. Citation and provenance will need to be built into the format, not left to the markdown body.

This is something we have been thinking about deeply with the Knowledge Bank. Every entry in our knowledge infrastructure is tied to a source citation, reviewed by a human, and version-tracked. That discipline is what makes knowledge trustworthy enough to ground high-stakes queries.
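As a rough sketch of what first-class provenance could look like, the record below carries the four things named above and reports which ones are still missing. Every field name here is our own assumption; none of them exists in OKF v0.1, and this is not a description of how the Knowledge Bank is implemented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record; these fields are not part of OKF v0.1."""
    source_document: str
    source_version: Optional[str] = None
    verified_by: Optional[str] = None
    human_reviewed: bool = False


def provenance_gaps(record: Provenance) -> List[str]:
    """List the provenance fields that are empty or unconfirmed."""
    gaps = []
    if not record.source_document:
        gaps.append("source_document")
    if not record.source_version:
        gaps.append("source_version")
    if not record.verified_by:
        gaps.append("verified_by")
    if not record.human_reviewed:
        gaps.append("human_reviewed")
    return gaps


# Invented example values, for illustration only.
complete = Provenance("example-act.pdf", "2024 revision", "reviewer@example.org", True)
draft = Provenance("example-act.pdf")
```

An agent could refuse to ground a high-stakes answer on any entry where `provenance_gaps` returns a non-empty list, and fall back to saying what is unverified instead.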

Language and locale will need to be explicit.

OKF currently treats knowledge as if it exists in a single language. For global adoption, including in multilingual contexts like Malaysia where institutional knowledge exists in Bahasa Malaysia, English, and sometimes Mandarin or Tamil, the format will need to accommodate language metadata, translation linkage, and locale-specific variants of the same concept.
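One way this could work, sketched under our own assumptions: each file declares a language tag and a shared concept identifier, and a consumer picks the best variant for a reader's language preferences. The `concept` and `lang` keys, the paths, and the function name are hypothetical and not part of the current spec.

```python
# Hypothetical locale variants of one concept; paths and keys are invented.
VARIANTS = [
    {"concept": "example-act", "lang": "ms", "path": "ms/example-act.md"},
    {"concept": "example-act", "lang": "en", "path": "en/example-act.md"},
]


def pick_variant(variants, concept, preferred_langs):
    """Return the first variant of `concept` matching the language preferences.

    Languages are tried in the order given; returns None when nothing matches.
    """
    by_lang = {v["lang"]: v for v in variants if v["concept"] == concept}
    for lang in preferred_langs:
        if lang in by_lang:
            return by_lang[lang]
    return None


# A Tamil-first reader falls back to English when no Tamil variant exists.
chosen = pick_variant(VARIANTS, "example-act", ["ta", "en"])
```

The shared concept identifier is what provides the translation linkage: variants stay separate files, but an agent can tell they describe the same thing.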

The graph structure needs to be richer.

OKF uses markdown links to connect concepts. That is simple and human-readable. But as agent ecosystems mature, the relationship between concepts will matter as much as the concepts themselves. Is this link a dependency, a reference, a supersession, a contradiction? A richer but still lightweight relationship vocabulary will likely emerge as agents try to do more than retrieval.
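A small sketch of why typed links matter, assuming a four-word relationship vocabulary that we made up for this example: once an agent knows which links are supersessions, it can walk from an outdated document to the one currently in force. The file names and relation names are hypothetical.

```python
from enum import Enum
from typing import NamedTuple


class Relation(Enum):
    """Hypothetical relationship vocabulary; not defined by OKF today."""
    DEPENDS_ON = "depends_on"
    REFERENCES = "references"
    SUPERSEDES = "supersedes"
    CONTRADICTS = "contradicts"


class Edge(NamedTuple):
    source: str
    relation: Relation
    target: str


# Invented documents, for illustration only.
EDGES = [
    Edge("act-2020.md", Relation.SUPERSEDES, "act-2010.md"),
    Edge("act-2024.md", Relation.SUPERSEDES, "act-2020.md"),
    Edge("guideline.md", Relation.REFERENCES, "act-2010.md"),
]


def current_version(doc, edges):
    """Follow SUPERSEDES edges from `doc` to the newest document in the chain."""
    superseded_by = {
        edge.target: edge.source
        for edge in edges
        if edge.relation is Relation.SUPERSEDES
    }
    seen = {doc}
    while doc in superseded_by:
        doc = superseded_by[doc]
        if doc in seen:
            raise ValueError("supersession cycle detected")
        seen.add(doc)
    return doc


latest = current_version("act-2010.md", EDGES)
```

With untyped links, the guideline's reference to the 2010 act and the 2020 act's replacement of it look identical, and the agent has no way to know the guideline may be citing stale law.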

Trust tiers will matter.

Not all knowledge is equal. A note left by a junior analyst is not the same as a regulation published by a government ministry. As OKF gets adopted by more producers, the format will need a way to signal the confidence level, verification status, or institutional authority of a document. Otherwise agents will treat a speculative internal comment with the same weight as a gazetted law.
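A trust signal could be as simple as an ordered tier that consumers sort or filter on. The sketch below assumes three tiers of our own invention; the names, the ordering, and the sample documents are illustrative and not a proposal for the exact vocabulary.

```python
from enum import IntEnum


class TrustTier(IntEnum):
    """Hypothetical trust tiers; a higher value means more authority."""
    UNVERIFIED_NOTE = 1
    INTERNAL_REVIEWED = 2
    OFFICIAL_PUBLICATION = 3


def rank_by_trust(candidates):
    """Sort (title, tier) pairs so the most authoritative come first.

    The sort is stable, so documents in the same tier keep their input order.
    """
    return sorted(candidates, key=lambda item: item[1], reverse=True)


def at_least(candidates, minimum):
    """Keep only the candidates at or above the given tier."""
    return [item for item in candidates if item[1] >= minimum]


# Invented retrieval results, for illustration only.
RESULTS = [
    ("Analyst comment", TrustTier.UNVERIFIED_NOTE),
    ("Gazetted law", TrustTier.OFFICIAL_PUBLICATION),
    ("Reviewed runbook", TrustTier.INTERNAL_REVIEWED),
]

ranked = rank_by_trust(RESULTS)
```

For a legal query, an agent might call `at_least(RESULTS, TrustTier.OFFICIAL_PUBLICATION)` and ignore everything else; for an internal troubleshooting question, the reviewed runbook may be exactly the right source.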

What We Are Building

Malim AI Labs will first need to build a Knowledge Bank Model: a structured, verified, publicly accessible layer of institutional knowledge.

Our architecture may differ from OKF in implementation, but it is aligned in philosophy. The difference is context. We are building knowledge infrastructure in the digital domain for everyone. Malim untuk Semua (Malim for All).

Closing Thought

Google releasing OKF is a signal, not just a product launch.

It is a signal that the AI ecosystem is maturing past raw retrieval toward structured, navigable, verifiable knowledge. That the context problem is being taken seriously as a first-class engineering challenge. That interoperability, not vendor lock-in, is the right foundation for knowledge infrastructure.

We think they are on the right path. And we believe the most valuable work to be done is not only at the format layer but at the knowledge layer itself: the hard, slow, human-intensive work of taking messy, fragmented, poorly documented institutional knowledge and turning it into something an agent can actually trust.

That is what Malim AI Labs is here to do.