Connect to the AI Grid

Connect to the AI Grid

How to Draw an Excel Table for AI

Why spreadsheets built for humans confuse machines — and how to make your data machine-readable

Daniel Antal's avatar
Daniel Antal
Nov 17, 2025
∙ Paid

For decades, the spreadsheet — or its programmatic cousin, the data frame — has been the universal medium of analysis. But that world has changed.

Today, most tables are no longer consumed by humans.
They are consumed by machines: APIs, data pipelines, LLM agents, validation systems, and reasoning engines that need to know what the numbers mean, not just how they are arranged.

And yet our tools — and our habits — haven’t caught up.

Whether you work in Excel, Python, R, SQL, PowerBI, or OpenOffice, the systems you use can calculate regressions, visualise trends, and clean data brilliantly.
But they cannot, on their own, tell another system:

  • “this column is GDP, measured in millions of euros”,

  • “this code ‘1’ means wholesale, and ‘2’ means retail”,

  • “this timestamp is in UTC”,

  • “this variable follows an official statistical concept defined by Eurostat”.

Without this semantic layer, AI systems cannot reason across datasets — they can only guess. This is one reason LLMs hallucinate: most data tables carry values, not meaning.

To build data that is future-proof and AI-ready, we need to rethink the table itself. We implemented a solution to this problem in the R statistical environment and language, but our approach works in PowerBI or Python, too.

Why Spreadsheets Fail AI

A spreadsheet or tidy data frame is easy for a human: rows, columns, headers.
But trained (and expensive) data scientists typically spend hours — sometimes days — trying to transform a human-friendly spreadsheet into something machines can interpret.

A notorious StackOverflow, example shows just how much manual work is needed to “fix” a spreadsheet that was never meant for machines.

For AI, that structure is meaningless unless it also knows:

  • what each variable represents,

  • the units of measurement,

  • the ontology or vocabulary it belongs to,

  • when and how it was created.

Most tabular data is a cognitive artefact — built for human readability, not machine understanding. As we move toward autonomous agents and LLM-integrated pipelines, this becomes a critical bottleneck.

The solution: a table must become a semantic contract.

Rethinking the Table

In modern data governance, a dataset is not just a table. It is a described object: it has meaning, provenance, units, definitions, contributors, and context that must travel with the data itself.

International standards such as ISO/IEC 20546, Dublin Core, and the W3C Data Cube Vocabulary all define a dataset not as a file, but as a semantically described resource. Traditional tables — whether in Excel, CSV, pandas, R data.frames, or SQL exports — don’t contain any of this information. They are brilliant containers for numbers and strings, but they are not designed for semantics.

Leave a comment

That is why we developed a software library for R, but the principles behind it apply everywhere: attach meaning at the moment of dataset creation, not afterwards.

From Values to Meaning

Keep reading with a 7-day free trial

Subscribe to Connect to the AI Grid to keep reading this post and get 7 days of free access to the full post archives.

Already a paid subscriber? Sign in
© 2025 Daniel Antal
Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture