Semantic MediaWiki (SMW) extends MediaWiki with structured annotation, storage, and querying of semantic data, which makes wiki content machine-readable. This page describes the major subsystems, how data flows through the system, and the development policies that govern all contributions.

Main subsystems

Parser

Intercepts MediaWiki page parsing to extract [[Property::Value]] annotations and build a SemanticData object for each page.
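As a rough illustration of what the extraction step involves, the Python sketch below pulls [[Property::Value]] pairs out of wikitext with a regular expression. It is a simplified model and not SMW's PHP parser: the function name `extract_annotations` and the pattern are assumptions made for this example, and cases such as nested links are deliberately ignored.

```python
import re

# Matches [[Property::Value]] and [[Property::Value|caption]].
# Simplified on purpose: nested links and other syntax variants are not handled.
ANNOTATION = re.compile(r"\[\[([^\[\]|:]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext: str) -> list[tuple[str, str]]:
    """Return (property, value) pairs in the order they appear in the text."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in ANNOTATION.finditer(wikitext)
    ]


text = (
    "The city has [[Population::517,052]] inhabitants and lies in "
    "[[Located in::Sample country|the country]]. See also [[Category:City]]."
)
print(extract_annotations(text))
```

Plain links and category links contain no `::` separator, so the pattern skips them and only the two annotations are returned.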

Store

Abstract layer (SMW\Store) that persists and retrieves semantic data. Concrete implementations exist for SQL, SPARQL, and Elasticsearch backends.

Query engine

Translates SMW query language (#ask) into backend-specific queries and returns QueryResult objects consumed by result printers.
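Before any translation can happen, an #ask call has to be broken into its parts. The Python sketch below shows one plausible way to separate conditions, printout requests (`?Property`), and parameters (`key=value`). The function `split_ask` is hypothetical and much simpler than SMW's query parser; for instance, it assumes that the conditions themselves contain no `|` character.

```python
def split_ask(query: str) -> dict:
    """Split the body of an #ask call into conditions, printouts, and parameters."""
    parts = [part.strip() for part in query.split("|")]
    result = {"conditions": parts[0], "printouts": [], "parameters": {}}
    for part in parts[1:]:
        if part.startswith("?"):
            # Printout request such as ?Population
            result["printouts"].append(part[1:].strip())
        elif "=" in part:
            # Query parameter such as format=table
            key, value = part.split("=", 1)
            result["parameters"][key.strip()] = value.strip()
    return result


parsed = split_ask(
    "[[Category:City]] [[Population::>100000]] |?Population |format=table |limit=10"
)
print(parsed)
```

A backend would then turn the `conditions` string into its own query form, while `format` selects the result printer.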

Export

Serializes semantic data to RDF/OWL and other formats for linked-data consumers and the SPARQL store.
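To make the idea of serialization concrete, the Python sketch below writes property-value pairs for one page as N-Triples lines. It is only an illustration: the base URI `http://example.org/wiki/`, the `Property:` prefix in predicate URIs, and the function `to_ntriples` are assumptions for this example and do not reflect the URIs or vocabulary that SMW's exporter actually emits.

```python
from urllib.parse import quote


def to_ntriples(subject: str, pairs: list[tuple[str, str]],
                base: str = "http://example.org/wiki/") -> str:
    """Serialize property-value pairs for one subject as N-Triples with plain literals."""

    def iri(name: str) -> str:
        # Wiki-style title: spaces become underscores, the rest is percent-encoded.
        return "<" + base + quote(name.replace(" ", "_"), safe=":") + ">"

    def literal(value: str) -> str:
        # Escape only backslash, double quote, and newline (a simplification).
        escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return '"' + escaped + '"'

    lines = [
        f"{iri(subject)} {iri('Property:' + prop)} {literal(val)} ."
        for prop, val in pairs
    ]
    return "\n".join(lines)


print(to_ntriples("Sample city", [("Population", "517052")]))
```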

Storage backends

SMW ships three interchangeable store implementations, all of which derive from the abstract SMW\Store class.
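The value of a shared abstract store is that calling code never needs to know which backend is active. The Python sketch below models that idea with an abstract base class and two toy backends. The class and method names (`Store`, `DictStore`, `TripleListStore`, `update_data`, `get_property_values`) are stand-ins chosen for this example; they loosely echo the role of SMW\Store but are not its PHP API.

```python
from abc import ABC, abstractmethod


class Store(ABC):
    """Simplified stand-in for an abstract store layer."""

    @abstractmethod
    def update_data(self, subject: str, data: dict[str, list[str]]) -> None:
        """Replace all semantic data stored for the subject."""

    @abstractmethod
    def get_property_values(self, subject: str, prop: str) -> list[str]:
        """Return the values of one property for the subject."""


class DictStore(Store):
    """Toy backend that keeps one nested dict per subject."""

    def __init__(self) -> None:
        self._data: dict[str, dict[str, list[str]]] = {}

    def update_data(self, subject, data):
        self._data[subject] = {prop: list(values) for prop, values in data.items()}

    def get_property_values(self, subject, prop):
        return list(self._data.get(subject, {}).get(prop, []))


class TripleListStore(Store):
    """Toy backend that keeps a flat list of (subject, property, value) triples."""

    def __init__(self) -> None:
        self._triples: list[tuple[str, str, str]] = []

    def update_data(self, subject, data):
        # Drop the subject's old triples first, as re-saving a page would.
        self._triples = [t for t in self._triples if t[0] != subject]
        for prop, values in data.items():
            self._triples.extend((subject, prop, value) for value in values)

    def get_property_values(self, subject, prop):
        return [o for s, p, o in self._triples if s == subject and p == prop]


def population_of(store: Store, page: str) -> list[str]:
    """Caller code depends only on the abstract interface."""
    return store.get_property_values(page, "Population")


for backend in (DictStore(), TripleListStore()):
    backend.update_data("Sample city", {"Population": ["517052"]})
    print(type(backend).__name__, population_of(backend, "Sample city"))
```

Both backends return the same answer to `population_of`, which is the property that makes the implementations interchangeable.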

SQLStore

The default backend, suitable for small and mid-size wikis. Stores semantic data in dedicated MySQL/PostgreSQL/SQLite tables using a subject-predicate-object (SPO) schema.
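The subject-predicate-object idea can be tried out with Python's built-in `sqlite3` module. The sketch below uses a single hypothetical `spo` table with text columns; SMW's real schema is spread over several tables and is not reproduced here. The helper names `build_demo_db` and `subjects_above` are likewise invented for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical SPO table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spo ("
        "subject TEXT NOT NULL, predicate TEXT NOT NULL, object TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX spo_predicate ON spo (predicate, subject)")
    conn.executemany(
        "INSERT INTO spo VALUES (?, ?, ?)",
        [
            ("Sample city", "Population", "517052"),
            ("Sample city", "Located in", "Sample country"),
            ("Other city", "Population", "98000"),
        ],
    )
    return conn


def subjects_above(conn: sqlite3.Connection, predicate: str, minimum: int) -> list[str]:
    """Roughly what a condition like [[Population::>100000]] asks for."""
    rows = conn.execute(
        "SELECT subject FROM spo "
        "WHERE predicate = ? AND CAST(object AS INTEGER) > ? "
        "ORDER BY subject",
        (predicate, minimum),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(subjects_above(conn, "Population", 100000))  # ['Sample city']
```

Every fact is one row, so adding a new property requires no schema change; the trade-off is that typed comparisons need a cast or a typed column.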

SPARQLStore

For advanced users who need a triple store and linked-data integration. Semantic data is mirrored to an external SPARQL endpoint (e.g. Blazegraph, Fuseki).

ElasticStore

Recommended for large wiki farms that need to scale or combine structured and full-text search. Stores data in Elasticsearch alongside the SQL tables.

How a page annotation flows through the system

1. Wikitext parsing

A user saves a page containing [[Population::517,052]]. MediaWiki invokes the SMW parser hook, which reads the annotation and creates a DataValue from the user input string.

2. DataItem creation

The DataValue validates and normalizes the input, then produces an immutable DataItem (e.g. DINumber) that represents the canonical, language-independent value.

3. SemanticData assembly

The DataItem is added to a SemanticData container via addPropertyObjectValue(property, dataItem). The container accumulates all property-value pairs for the current page.

4. Store update

After parsing completes, Store::updateData(SemanticData) is called. The store fires the SMW::Store::BeforeDataUpdateComplete hook, delegates to doDataUpdate() in the active backend, and then fires the SMW::Store::AfterDataUpdateComplete hook.

5. Querying

An #ask query reaches Store::getQueryResult(SMWQuery). The backend translates the query conditions into SQL (or into SPARQL or an Elasticsearch query), executes the translated query, and returns a QueryResult.

6. Result printing

A result printer consumes the QueryResult and renders the output as a table, list, chart, or other format on the wiki page.
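The six steps above can be modelled end to end in a few dozen lines of Python. This is a teaching sketch, not SMW code: every class here is a simplified stand-in that borrows the name of the concept it illustrates, the hook calls are reduced to log entries, and the order shown (before-hook, write, after-hook) is an assumption based on the hook names. The thousands-separator handling is also an assumption; real input parsing depends on the wiki's language settings.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DINumber:
    """Immutable canonical value, standing in for a DataItem."""
    number: float


class NumberValue:
    """Stands in for a DataValue: turns user input into a DataItem."""

    def __init__(self, user_input: str) -> None:
        self.user_input = user_input

    def get_data_item(self) -> DINumber:
        # Assumes "," is a thousands separator.
        return DINumber(float(self.user_input.replace(",", "")))


@dataclass
class SemanticData:
    """Collects all property-value pairs for one page."""
    subject: str
    properties: dict = field(default_factory=dict)

    def add_property_object_value(self, prop: str, data_item: DINumber) -> None:
        self.properties.setdefault(prop, []).append(data_item)


class Store:
    """In-memory stand-in for a store backend."""

    def __init__(self) -> None:
        self.pages: dict[str, SemanticData] = {}
        self.log: list[str] = []  # records the order of hooks and writes

    def update_data(self, data: SemanticData) -> None:
        self.log.append("before-hook")
        self.do_data_update(data)
        self.log.append("after-hook")

    def do_data_update(self, data: SemanticData) -> None:
        self.pages[data.subject] = data
        self.log.append("write")

    def get_query_result(self, prop: str, minimum: float) -> list[tuple[str, float]]:
        """Return (subject, value) rows whose value exceeds the minimum."""
        rows = []
        for subject, data in sorted(self.pages.items()):
            for item in data.properties.get(prop, []):
                if item.number > minimum:
                    rows.append((subject, item.number))
        return rows


def print_table(rows: list[tuple[str, float]]) -> str:
    """Minimal result printer: one 'subject | value' line per row."""
    return "\n".join(f"{subject} | {number:.0f}" for subject, number in rows)


store = Store()
data = SemanticData("Sample city")                                   # step 3
item = NumberValue("517,052").get_data_item()                        # steps 1-2
data.add_property_object_value("Population", item)                   # step 3
store.update_data(data)                                              # step 4
print(print_table(store.get_query_result("Population", 100000)))    # steps 5-6
```

Keeping the DataItem frozen mirrors the text's point that the canonical value is immutable once created, while the DataValue remains the place where user-facing parsing and formatting live.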

Development policies

Never modify MediaWiki core tables. Any data that SMW persists must go into SMW’s own database schema.
Policy | Detail
No MediaWiki table modifications | SMW creates and manages its own tables exclusively. Writing to the cache is the only exception.
No MediaWiki class patching | SMW never modifies, patches, or monkey-patches MediaWiki classes.
Hooks and public API only | Integration with MediaWiki is done exclusively through publicly available hooks and API interfaces.
@private classes are internal | Classes and public methods annotated @private are not part of the public API and may change or be removed without notice.
No direct table access | Extensions should use the SMW public API rather than querying SMW tables directly.

Architecture sub-pages

Data model internals

DataItem hierarchy, SemanticData container, DataValue and DataType layers.

Database schema

Table definitions, SPO pattern, fixed vs. variable property tables.

Coding conventions

PSR-4 namespacing, method ordering, dependency injection, testing requirements.
