Future of IFC5

I would like to discuss some minor topics related to this paper:

"Future of the Industry Foundation Classes: towards IFC 5 "

This paper suggests that IFC5 should be modelled as a UML class diagram.
The EXPRESS schema language is also criticized.

I would like to express the following opinions:

  1. It is important to use a machine readable non-proprietary openly available schema definition language that can be parsed using non proprietary tools.

EXPRESS is machine readable. UML is mainly a visual language. Some text-based file-format versions of UML exist, but as far as I know, they are all proprietary file formats. It makes no sense to publish IFC using a schema definition stored in a proprietary UML file format.

Summary: Let’s keep it possible to work with IFC using only openly available non-proprietary tools and formats.

  2. It is important to keep the IFC schema definition machine readable.

The visual UML class diagrams could be helpful for people who are designing data models, such as IFC. But the most important aspect of IFC is not to design the IFC.

The most important aspect of IFC is to use it in applications. To use IFC in any application, you first have to convert the IFC schema definitions into class definitions in an object-oriented programming language (OOPL). Nobody uses EXPRESS directly. Nobody can convert EXPRESS to OOPL class definitions by hand: the schema is too complex, it would take too much time, and there would be a spelling error somewhere.

If the IFC data model is not presented in a machine-readable format, then we can all forget about using IFC in real-life applications. Since UML is not machine readable (at least not if the rule is to implement the full UML standard), there are no algorithms that can fully and correctly convert visual UML to class definitions.

Summary: Let’s keep the IFC data model machine readable. It is more important for the schema to be correctly machine readable than to be user friendly, because no user can convert IFC to class definitions manually.

  3. A method for round-tripping IFC data in graph databases is important.

IFC data is persisted as text files. I think STEP is OK, but it is still a text file. To read or analyze data in a STEP file, it first has to be loaded from the hard drive and parsed into the internal memory of the computer. This is not efficient if the STEP file is huge, or if you need to search through many different STEP files to answer a query and retrieve just a tiny bit of information (about a window, say). It is even less efficient if this has to be done repeatedly across many different STEP files.

Nowadays we prefer to store big data in databases instead of text files. This makes it easier to analyze and query loads of complex data compared to using text files. Text files were used for data storage and retrieval back in the days of COBOL. Today we use databases.

IFC data consists of a complex network of relationships and entities that inherit from many classes. Such object-oriented data is not well suited to storage or analysis in a regular relational database using SQL, because queries soon need too many JOINs: one JOIN for each relationship traversed. For more information on this topic, search the internet for articles on the “object-relational impedance mismatch”.

Data with a network-like structure and lots of relationships is in general well suited to storage in a graph database.
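To make the JOIN argument concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`entity`, `rel_contained`, `rel_aggregates`) are hypothetical and only loosely mimic how IFC objectifies its relationships; the Cypher string is shown for comparison only and is not executed, and its labels and relationship types are likewise assumptions.

```python
import sqlite3

# Hypothetical, heavily simplified relational layout for a few IFC-like objects.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, ifc_type TEXT, name TEXT);
CREATE TABLE rel_contained (structure_id INTEGER, element_id INTEGER);
CREATE TABLE rel_aggregates (parent_id INTEGER, child_id INTEGER);
INSERT INTO entity VALUES (1, 'IfcBuilding', 'Office'),
                          (2, 'IfcBuildingStorey', 'Level 1'),
                          (3, 'IfcWindow', 'W-01');
INSERT INTO rel_aggregates VALUES (1, 2);
INSERT INTO rel_contained VALUES (2, 3);
""")

# Question: which building does window W-01 belong to?
# Two hops are needed: window -> storey -> building.
query = """
SELECT building.name
FROM entity AS win
JOIN rel_contained  AS rc       ON rc.element_id = win.id
JOIN entity         AS storey   ON storey.id = rc.structure_id
JOIN rel_aggregates AS ra       ON ra.child_id = storey.id
JOIN entity         AS building ON building.id = ra.parent_id
WHERE win.name = 'W-01';
"""
row = conn.execute(query).fetchone()
join_count = query.count("JOIN")
print(row[0], join_count)

# The same traversal as a single path pattern (for comparison, not executed).
CYPHER_EQUIVALENT = (
    "MATCH (w:IfcWindow {name: 'W-01'})<-[:CONTAINS]-(:IfcBuildingStorey)"
    "<-[:AGGREGATES]-(b:IfcBuilding) RETURN b.name"
)
```

In this layout each hop costs two JOINs (one for the link table and one for the target row), so the number of JOINs grows with the length of the path, while the graph pattern stays a single expression.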

The Graph Query Language was just recently adopted as an international ISO standard: GQL. GQL is the only ISO standard database language besides SQL. This means that it is now possible to use graph databases using an open international standardized query language, the GQL.

You will find lots of information about the new GQL ISO standard on the internet (links are not allowed in posts here).

GQL is based on openCypher, which is already used by several open-source and proprietary graph databases, such as Memgraph.

An algorithm for storing IFC data in graph databases using Cypher was published in a scientific article called:

“IFC-graph for facilitating building information access and query”.

You will find this article on the internet.

However, the algorithm in that article only covers storage and should be complemented with an algorithm for “round-tripping”, that is, for writing the graph back out as an equivalent IFC file. I also made an implementation of this algorithm using a newer driver for Python. You will find it on GitHub if you search for “IFC graph Github” or similar.
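As a rough illustration of what round-tripping requires, the following sketch splits STEP-style instance lines into nodes and edges and then rebuilds the original lines. It is not the algorithm from the article, nor the GitHub implementation mentioned above. The two instance lines use made-up, shortened attribute lists, and the function names `to_graph` and `to_step` are hypothetical.

```python
import re

# Two STEP-style instances with simplified, invented attribute lists.
STEP_LINES = [
    "#1=IFCBUILDINGSTOREY('Level 1',$);",
    "#2=IFCWINDOW('W-01',#1);",
]

LINE_RE = re.compile(r"#(\d+)=(\w+)\((.*)\);")


def to_graph(lines):
    """Turn each instance into a node (scalar values) plus edges (#references)."""
    nodes, edges = {}, []
    for line in lines:
        match = LINE_RE.fullmatch(line)
        node_id, ifc_type, arg_text = int(match.group(1)), match.group(2), match.group(3)
        # Naive split: assumes no commas inside strings and no nested lists.
        args = arg_text.split(",")
        props = {}
        for position, arg in enumerate(args):
            if arg.startswith("#"):
                # Keep the attribute position on the edge so it can be restored.
                edges.append((node_id, position, int(arg[1:])))
            else:
                props[position] = arg
        nodes[node_id] = {"type": ifc_type, "arity": len(args), "props": props}
    return nodes, edges


def to_step(nodes, edges):
    """Rebuild the instance lines from nodes and edges."""
    lines = []
    for node_id in sorted(nodes):
        node = nodes[node_id]
        args = [node["props"].get(i) for i in range(node["arity"])]
        for source, position, target in edges:
            if source == node_id:
                args[position] = f"#{target}"
        lines.append(f"#{node_id}={node['type']}({','.join(args)});")
    return lines


nodes, edges = to_graph(STEP_LINES)
round_tripped = to_step(nodes, edges)
print(round_tripped == STEP_LINES)
```

The point of the sketch is that the graph must keep the instance id and the attribute position of every reference; if either is dropped when storing, the original file cannot be reconstructed.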

Summary: I suggest using graph databases and ISO GQL as a new and more efficient way of persisting IFC data (besides STEP).

  4. Algorithms for converting EXPRESS schemas to class definitions in new OOPLs would be great.

C++ has often been used for IFC. The EXPRESS schema has been developed in, or converted into, data models in C++. Algorithms exist for converting EXPRESS schema definitions into C++ class definitions.

Most developers want to use their favourite OOPL. EXPRESS is just a way to communicate the data model in a machine readable way. Nobody uses EXPRESS directly to get an application running.

Today there are many new object-oriented languages besides C++. But to get started using IFC in any OOPL, there first has to be a way to convert the IFC EXPRESS schemas to class definitions in that OOPL.

Such algorithms are lacking for most modern OOPLs. Hence IFC cannot be used in most modern OOPLs.

Few organisations have the interest or the resources to implement an EXPRESS-to-OOPL converter before starting the real project in the OOPL of their choice.

Why not publish standardized methods to convert the IFC EXPRESS schema definitions into class definitions in the most popular modern object oriented languages?

Or why not just publish IFC as class definitions in some object oriented languages, directly?

Summary: buildingSMART could facilitate the conversion of EXPRESS schemas into class definitions in many newer OOPLs, such as Python, Java, TypeScript, C#, Ruby, Go and Rust.
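A minimal sketch of what such a conversion could look like, assuming a single simplified entity declaration in the style of the IFC EXPRESS schema. The function name `express_to_python` is hypothetical, and the regular expressions only handle plain and `OPTIONAL` attributes with a single supertype. WHERE rules, INVERSE and DERIVE attributes, SELECT types and aggregates are ignored, so a real converter would need a proper EXPRESS parser.

```python
import re

# A simplified entity declaration written in EXPRESS syntax.
EXPRESS_SNIPPET = """
ENTITY IfcWindow
  SUBTYPE OF (IfcBuildingElement);
    OverallHeight : OPTIONAL IfcPositiveLengthMeasure;
    OverallWidth : OPTIONAL IfcPositiveLengthMeasure;
END_ENTITY;
"""

ENTITY_RE = re.compile(r"\bENTITY\s+(\w+)(.*?)END_ENTITY\s*;", re.DOTALL)
SUPERTYPE_RE = re.compile(r"SUBTYPE\s+OF\s*\(\s*(\w+)\s*\)")
ATTRIBUTE_RE = re.compile(r"^\s*(\w+)\s*:\s*(OPTIONAL\s+)?(\w+)\s*;", re.MULTILINE)


def express_to_python(source: str) -> str:
    """Emit Python dataclass source text for each ENTITY found in `source`."""
    classes = []
    for name, body in ENTITY_RE.findall(source):
        parent = SUPERTYPE_RE.search(body)
        base = parent.group(1) if parent else "object"
        lines = ["@dataclass", f"class {name}({base}):"]
        attributes = ATTRIBUTE_RE.findall(body)
        for attr, optional, type_name in attributes:
            if optional:
                lines.append(f"    {attr}: Optional[{type_name}] = None")
            else:
                lines.append(f"    {attr}: {type_name}")
        if not attributes:
            lines.append("    pass")
        classes.append("\n".join(lines))
    return "\n\n".join(classes)


generated = express_to_python(EXPRESS_SNIPPET)
print(generated)
```

The output is source text, not an importable module: the referenced base class and measure types would have to be generated from the rest of the schema as well, which is where most of the real effort lies.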

Hi Martin,

That paper introduced some fundamental ideas but does not reflect the current state of IFC5. The group has made further progress since it was published three years ago.

  1. It is important to use a machine readable non-proprietary openly available schema definition language that can be parsed using non proprietary tools


Of course, the IFC5 schema will be in a machine-readable form. The intentions were to make it technology-independent: Towards a technology independent IFC · buildingSMART/NextGen-IFC Wiki · GitHub.

In the way IFC4.3 is managed today, the source definitions are kept in UML/XML and markdown files, which can be serialized into EXPRESS (GitHub - buildingSMART/IFC4.3.x-development: Repository to collect updates to the IFC4.3 Specification). And no, UML isn’t proprietary.

Yes, we are well aware of the Junxiang Zhu paper and are in direct contact. The graph representation of the data is being considered, along with Linked Data, an ECS architecture (as opposed to OOP) and Universal Scene Description (USD) for coupling the geometry representation. This presentation from @gschleusner answers some of your questions: https://www.youtube.com/watch?v=GgN1he00dpc

cc: @berlotti, @aothms

I understand that UML is not proprietary. UML is an ISO standard.

However, I have always thought of ISO UML as a visual language. And hence ISO UML is not machine readable.

Of course there is image recognition, but this technology is too complicated for most people.

I know there is UML modelling software that saves UML diagrams as text files.

But I do not know of any open and free and well established international standard for serializing visual UML diagrams into text files.

I would thus like to ask:

  • In which open and free text-based file format is UML used and serialized, before it is converted to EXPRESS?

  • Or in which open and free text-based file format would UML be used in the future?

I would say the opposite: UML diagrams can be visualized, but they are primarily text-based. I’m not sure about the details, but I think 4x3 uses XMI (XML Metadata Interchange) for that.
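To illustrate the text-based side of UML, here is a small sketch that reads class names, superclasses and attributes from a hand-written XMI fragment using Python's standard `xml.etree.ElementTree`. The fragment is an assumption: it is not taken from the IFC4.3 repository, the namespace URIs follow the OMG XMI/UML 2.5 convention as far as I recall, and real exports from modelling tools are typically much larger and may be structured differently.

```python
import xml.etree.ElementTree as ET

XMI_NS = "http://www.omg.org/spec/XMI/20131001"

# Hand-written, simplified XMI fragment (illustrative only).
XMI_FRAGMENT = f"""
<xmi:XMI xmlns:xmi="{XMI_NS}" xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model xmi:id="m1" name="Example">
    <packagedElement xmi:type="uml:Class" xmi:id="c1" name="IfcElement"/>
    <packagedElement xmi:type="uml:Class" xmi:id="c2" name="IfcWindow">
      <generalization xmi:id="g1" general="c1"/>
      <ownedAttribute xmi:id="a1" name="OverallHeight"/>
      <ownedAttribute xmi:id="a2" name="OverallWidth"/>
    </packagedElement>
  </uml:Model>
</xmi:XMI>
"""


def read_classes(xmi_text):
    """Return {class name: {"parents": [...], "attributes": [...]}}."""
    root = ET.fromstring(xmi_text.strip())
    type_key = f"{{{XMI_NS}}}type"  # ElementTree spells xmi:type as {namespace}type
    id_key = f"{{{XMI_NS}}}id"
    elements = [
        e for e in root.iter("packagedElement") if e.get(type_key) == "uml:Class"
    ]
    names_by_id = {e.get(id_key): e.get("name") for e in elements}
    classes = {}
    for element in elements:
        parents = [
            names_by_id[g.get("general")] for g in element.findall("generalization")
        ]
        attributes = [a.get("name") for a in element.findall("ownedAttribute")]
        classes[element.get("name")] = {"parents": parents, "attributes": attributes}
    return classes


classes = read_classes(XMI_FRAGMENT)
print(classes["IfcWindow"])
```

Nothing here depends on a diagram or on a particular modelling tool; the class structure is read directly from the XML text.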

Nevertheless, it’s not decided how IFC5 will be expressed.

To my knowledge, UML (as defined in the ISO standard) is only a visual language (of diagrams, symbols, drawn tables, arrows, etc.).

Yes, UML can be represented by text/code. XMI is a text-based language and might be used to describe UML diagrams. But on Wikipedia, I can read the following about using XMI to represent UML:

“For diagrams, the Diagram Interchange (DI, XMI[DI]) standard is used. There are currently several incompatibilities between different modeling tool vendor implementations of XMI, even between interchange of abstract model data. The usage of Diagram Interchange is almost nonexistent. This means exchanging files between UML modeling tools using XMI is rarely possible.”

See the page “XML Metadata Interchange” on English Wikipedia.

This means that, in practice, XMI cannot be used as an open, text-based, machine-readable format for sharing information about data models or UML diagrams.

Each tool uses its own flavour of XMI, which cannot be opened by other tools that claim to support XMI.

And if this software is a proprietary software, then I guess that you will have to use that particular proprietary software, to open that XMI file, with that flavour of XMI.

XMI is supposed to be an open standard, and yes it is published as a standard - but in reality (according to Wikipedia) it is not implemented to work in that way. Exchanging XMI data between modelling tools, is rarely possible.

My concerns about UML not being machine readable thus remain. I also have concerns about XMI not being open and free in practical use, and instead being tied to a specific modelling tool product.