Malloy - Semantic Modeling for SQL

Malloy is worth a look if you keep rewriting the same SQL logic across dashboards, notebooks, scripts, and BI tools.

Instead of treating every query as a one-off string, Malloy gives you a language for modeling sources, relationships, measures, dimensions, transformations, and result views.

It still runs on SQL engines, but the reusable semantic layer lives in Malloy.

Malloy is a semantic modeling and query language for relational data. You describe the business logic once, and Malloy compiles queries into SQL for engines such as DuckDB, BigQuery, PostgreSQL, MySQL, Snowflake, Trino, and Presto.

Links: Malloy GitHub source code | Malloy documentation | Malloy website. License: MIT.

What is Malloy?

Malloy is a modern open-source language for describing data relationships and transformations.

The simplest way to think about it:

Malloy model + query
  -> compiler/runtime
  -> SQL for a database engine
  -> result data and visualization metadata

That means Malloy is not trying to replace your database. It uses existing SQL engines. The repository says Malloy can connect to BigQuery, Snowflake, PostgreSQL, MySQL, Trino, or Presto, and it has native DuckDB support for local files and embedded analytical workflows.

Here is the kind of query shape Malloy is built around:

run: bigquery.table('malloydata-org.faa.flights') -> {
  where: origin = 'SFO'
  group_by: carrier
  aggregate:
    flight_count is count()
    average_flight_time is flight_time.avg()
}

The same idea in SQL is possible, of course. The practical difference is that Malloy can preserve more of the model, metric, join, and result-shape intent outside a single throwaway SQL statement.
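To make the "query description in, SQL text out" idea concrete, here is a toy sketch in JavaScript. It is not Malloy's compiler and does not reproduce Malloy's real SQL output. The `compileToSql` function, the shape of the spec object, and the layout of the generated SQL are all assumptions made for illustration.

```javascript
// Toy illustration only: NOT Malloy's compiler or its real output.
// `compileToSql` and the spec shape are hypothetical names for this sketch.
function compileToSql(spec) {
  // Each aggregate is stored as { name, sql } so the alias stays attached
  // to its expression instead of being retyped in every query.
  const selectList = [
    ...spec.groupBy,
    ...spec.aggregates.map((a) => `${a.sql} AS ${a.name}`),
  ];
  const lines = [`SELECT ${selectList.join(", ")}`, `FROM ${spec.table}`];
  if (spec.where) lines.push(`WHERE ${spec.where}`);
  if (spec.groupBy.length > 0) {
    lines.push(`GROUP BY ${spec.groupBy.join(", ")}`);
  }
  return lines.join("\n");
}

// The SFO flights query, restated as plain data.
const flightsFromSfo = {
  table: "`malloydata-org.faa.flights`",
  where: "origin = 'SFO'",
  groupBy: ["carrier"],
  aggregates: [
    { name: "flight_count", sql: "COUNT(*)" },
    { name: "average_flight_time", sql: "AVG(flight_time)" },
  ],
};

const sql = compileToSql(flightsFromSfo);
console.log(sql);
```

The point of the sketch is the direction of travel: the query lives as structured intent first, and the SQL string is a derived artifact.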

Repository Snapshot

I cloned the repository into:

tmp/malloy

Source state:

Item	Value
Repository	`malloydata/malloy`
Commit	`89ee9ef773362db163c16be1c1f90c662d7a32e0`
Commit date	`2026-06-04T18:33:58+00:00`
Commit message	`Version 0.0.406-dev [skip ci]`
Latest tag observed	`v0.0.405`
Workspace package version	`0.0.406`
Runtime	Node.js `>=20`
Package manager	npm workspaces
License	MIT

This is a TypeScript monorepo, not a single binary project.

Tech Overview of Malloy

The important repo split is:

packages/malloy
  -> parser, translator, IR, compiler, runtime API
packages/malloy-db-*
  -> database connection packages
packages/malloy-render
  -> browser renderer for result tables, charts, dashboards
packages/malloy-syntax-highlight
  -> editor grammar assets
demo/
  -> historical demos, which now point to standalone demo repositories
test/
  -> multi-dialect test infrastructure

The language architecture has two main phases:

Malloy source
  -> ANTLR parser
  -> AST builder
  -> database-independent IR
  -> dialect-specific SQL compiler

That separation matters. The translator can parse and understand the Malloy model without committing immediately to one SQL dialect. Then the compiler uses dialect implementations for DuckDB, BigQuery, PostgreSQL, MySQL, Snowflake, Trino, Presto, and Databricks.
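The two-phase split can be sketched in a few lines of JavaScript. This is a simplified model under stated assumptions, not Malloy's internal IR or dialect API: the `ir` object, the `dialects` table, and the `compile` function are hypothetical, and only identifier quoting is modeled (PostgreSQL and DuckDB quote identifiers with double quotes, BigQuery and MySQL with backticks).

```javascript
// Hypothetical sketch: these names are not Malloy's internal API.
// Phase 1 output: a dialect-neutral description of the query.
const ir = {
  table: "flights",
  groupBy: ["carrier"],
  aggregates: [{ name: "flight_count", fn: "count" }],
};

// Phase 2 input: per-dialect rules. Only identifier quoting is modeled here.
const doubleQuoted = (id) => '"' + id + '"';
const backticked = (id) => "`" + id + "`";
const dialects = {
  duckdb: { quote: doubleQuoted },
  postgres: { quote: doubleQuoted },
  bigquery: { quote: backticked },
  mysql: { quote: backticked },
};

function renderAggregate(agg, dialect) {
  if (agg.fn !== "count") throw new Error(`unsupported aggregate: ${agg.fn}`);
  return `COUNT(*) AS ${dialect.quote(agg.name)}`;
}

// The same IR goes in; only the dialect argument changes the SQL text.
function compile(irNode, dialectName) {
  const dialect = dialects[dialectName];
  if (!dialect) throw new Error(`unknown dialect: ${dialectName}`);
  const dims = irNode.groupBy.map((id) => dialect.quote(id));
  const aggs = irNode.aggregates.map((a) => renderAggregate(a, dialect));
  return [
    `SELECT ${[...dims, ...aggs].join(", ")}`,
    `FROM ${dialect.quote(irNode.table)}`,
    `GROUP BY ${dims.join(", ")}`,
  ].join("\n");
}

console.log(compile(ir, "postgres"));
console.log(compile(ir, "bigquery"));
```

Adding an engine in this toy model means adding one entry to `dialects`; nothing in the IR changes. That is the property the real separation is meant to buy.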

The connection system is also modular. Connector packages register database types when imported, and host applications can use @malloydata/malloy-connections when they want the standard backend set available.
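A registry of this kind usually looks something like the sketch below. This is a generic illustration of the register-on-import pattern, not the actual Malloy connection API: `registerConnectionType`, `createConnection`, and the config fields are hypothetical names.

```javascript
// Hypothetical registry sketch; function and field names are illustrative,
// not the real @malloydata API.
const connectionTypes = new Map();

function registerConnectionType(typeName, factory) {
  if (connectionTypes.has(typeName)) {
    throw new Error(`connection type already registered: ${typeName}`);
  }
  connectionTypes.set(typeName, factory);
}

function createConnection(typeName, config) {
  const factory = connectionTypes.get(typeName);
  if (!factory) {
    throw new Error(`no connector registered for type: ${typeName}`);
  }
  return factory(config);
}

// In a real package layout, each of these calls would sit at the top level
// of a connector module, so importing the module is what registers the type.
registerConnectionType("duckdb", (config) => ({ type: "duckdb", ...config }));
registerConnectionType("postgres", (config) => ({ type: "postgres", ...config }));

const conn = createConnection("duckdb", { name: "local" });
console.log(conn.type, [...connectionTypes.keys()]);
```

The host application only pays for the connectors it imports, and asking for an unregistered type fails loudly instead of silently falling back.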

Key Packages

Package	Purpose
`@malloydata/malloy`	Core language, compiler, runtime APIs
`@malloydata/db-duckdb`	DuckDB connector for local and embedded analytics
`@malloydata/db-bigquery`	BigQuery connector
`@malloydata/db-postgres`	PostgreSQL connector
`@malloydata/db-snowflake`	Snowflake connector
`@malloydata/db-trino`	Trino and Presto-style connector support
`@malloydata/render`	Solid.js/Vega renderer for Malloy result output
`@malloydata/syntax-highlight`	Syntax highlighting assets for Malloy file formats

Is Malloy Self-Hosted?

Not in the usual “copy this Docker Compose file and open a web dashboard” way.

Malloy is better framed as:

a language for semantic modeling and query authoring
a compiler/runtime for turning Malloy into SQL
a VS Code workflow for authoring models and running queries
an npm library set for embedding Malloy into data apps

So I did not create a home-lab Docker Compose snippet for this post. The upstream repository does not ship an official first-party Compose stack for a persistent Malloy web app, and pretending it does would set the wrong expectation for readers.

Trying Malloy Locally

For most people, the practical first path is the VS Code extension or the browser/devcontainer-style quickstart from the official docs.

If you are developing against the repo itself, the documented source path is:

git clone https://github.com/malloydata/malloy
cd malloy
npm install
npm run build

The development guide notes that building the project requires Node.js and a Java Runtime Environment. It also documents Nix as an optional way to install the dependencies used by CI.

Field Note: Source Build Was Not Attempted Here

I inspected the repository locally, but I did not run the full workspace install/build on this machine.

Environment:

Node.js: v22.22.0
npm: 10.9.4
Java: not installed
Root disk: 7.9G free, 96% used
RAM: about 4.3GiB available
Swap: 0B

That matters because this monorepo has many TypeScript workspace packages and the documented build path requires Java. The safer conclusion is: the codebase is analyzable here, but this machine is not a clean source-build environment for Malloy today.
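A preflight check for those two requirements is easy to sketch in JavaScript. The thresholds come from the notes above (Node.js `>=20` and a Java runtime); the `checkBuildPrereqs` function and its input shape are hypothetical, and Java detection itself is left to the caller so the logic stays a pure function.

```javascript
// Sketch of a preflight check. The thresholds come from the notes above
// (Node.js >= 20, a Java runtime); the function name is hypothetical.
function checkBuildPrereqs({ nodeVersion, hasJava }) {
  const problems = [];
  // Accept "v22.22.0" or "22.22.0" and compare the major version only.
  const major = Number.parseInt(nodeVersion.replace(/^v/, "").split(".")[0], 10);
  if (Number.isNaN(major) || major < 20) {
    problems.push(`Node.js >= 20 required, found ${nodeVersion}`);
  }
  if (!hasJava) {
    problems.push("Java runtime not found");
  }
  return problems;
}

// Values from the environment listed above.
const problems = checkBuildPrereqs({ nodeVersion: "v22.22.0", hasJava: false });
console.log(problems);
```

For the machine described here, the Node.js check passes and the only reported problem is the missing Java runtime. Disk and memory headroom are a separate judgment call that this sketch does not try to encode.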

Cleanup note: the temporary clone at tmp/malloy was removed after review to free disk space.

Working with DuckDB and Local Files

DuckDB is the most approachable backend for local experimentation because it avoids cloud credentials and external database services.

The Malloy docs and repository examples show patterns like modeling a local CSV or Parquet file, then running Malloy queries against it through DuckDB. A model might describe a source once, add measures and views, then reuse those views in multiple queries.

For example, the demo docs show this style of model:

source: airports is duckdb.table('airports.parquet') extend {
  measure:
    airport_count is count()
    avg_elevation is elevation.avg()

  view: by_state is {
    group_by: state
    aggregate: airport_count
  }
}

That is the value proposition in one small block: the source, measures, and views become reusable model code rather than being buried across multiple SQL files.
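The reuse idea can be mimicked in plain JavaScript over an in-memory array. This is an analogy for how the model is organized, not how Malloy executes anything: the sample rows are made up, and `measures` and `runView` are hypothetical names.

```javascript
// Illustrative in-memory stand-in for the airports model; the rows are made up.
const airports = [
  { code: "AAA", state: "CA", elevation: 100 },
  { code: "BBB", state: "CA", elevation: 300 },
  { code: "CCC", state: "NY", elevation: 50 },
];

// Measures are defined once, as functions over a set of rows.
const measures = {
  airport_count: (rows) => rows.length,
  avg_elevation: (rows) =>
    rows.reduce((sum, r) => sum + r.elevation, 0) / rows.length,
};

// A "view" is a reusable grouping plus a list of measure names.
function runView(rows, { groupBy, aggregate }) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[groupBy];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return [...groups].map(([key, members]) => {
    const out = { [groupBy]: key };
    for (const name of aggregate) out[name] = measures[name](members);
    return out;
  });
}

// Two views reuse the same measure definitions without restating them.
const byState = runView(airports, { groupBy: "state", aggregate: ["airport_count"] });
const elevationByState = runView(airports, {
  groupBy: "state",
  aggregate: ["airport_count", "avg_elevation"],
});
console.log(byState);
console.log(elevationByState);
```

If the definition of `avg_elevation` changes, it changes in one place and every view that names it picks up the new logic, which is the same property the Malloy model gives you at the SQL level.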

Where Malloy Fits

Malloy is worth a look when you want:

reusable metrics and dimensions without committing to a heavy BI platform
a code-first semantic layer for SQL-backed data
local DuckDB analysis over CSV or Parquet files
generated SQL that still runs on existing engines
a language/runtime you can embed in another TypeScript application

It is probably not the right fit if you only need a quick SQL client, a drag-and-drop dashboard, or a ready-made self-hosted analytics web app.

Developer Notes

The core package exposes runtime classes and helper APIs such as Runtime, SingleConnectionRuntime, connection registration helpers, model loading, query preparation, result writers, and lower-level translator/compiler types.

The package README shows this kind of JavaScript integration shape:

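// Core language/runtime package plus the BigQuery connector package.
const malloy = require("@malloydata/malloy");
const bigquery = require("@malloydata/db-bigquery");

// One named connection, wrapped in a runtime bound to that connection.
const connection = new bigquery.BigQueryConnection("bigquery");
const runtime = new malloy.SingleConnectionRuntime(connection);

// Load a model from Malloy source text, then a query against that model.
const model = runtime.loadModel("source: airports is bigquery.table('malloytest.airports')");
const runner = model.loadQuery("run: airports->{aggregate: airport_count is count()}");

// run() returns a promise; log the rows, and surface failures instead of dropping them.
runner
  .run()
  .then((result) => {
    console.log(result.data.value);
  })
  .catch((err) => {
    console.error(err);
  });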

The APIs are still evolving, so treat Malloy as an active language/runtime project rather than a frozen enterprise standard.

Conclusion

Malloy is one of those projects that makes more sense when you stop asking “what server do I deploy?” and start asking “where should my data semantics live?”

If your team already has SQL, DuckDB, BigQuery, PostgreSQL, Snowflake, or Trino in the stack, Malloy offers a code-first way to make models and queries more reusable.

The strongest local path is DuckDB plus the VS Code workflow.

The strongest developer path is the npm package ecosystem around @malloydata/malloy.