Proper way to access translations of IFC entities?

When you browse the HTML version of the IFC documentation, you get access to the explanations and translations.

https://standards.buildingsmart.org/IFC/RELEASE/IFC4/ADD2_TC1/HTML/link/ifcairterminal.htm

However, the XSD and EXPRESS schemas only contain the main English entities and attributes. Property sets are in external XML files and the translation terms are… somewhere else?

I understand that the documentation is generated from IfcDoc which is used to organise the schema and edit the descriptions. I was under the impression that the bSDD would help us to access terms and their translations. But I cannot retrieve them anymore.

Is there support for translations of terms in the current version of bSDD?

https://search.bsdd.buildingsmart.org


Yes, @stefkeB. There is an ongoing project within bSI on translating entities into several languages using an efficient, digital method. Our chapter is also collaborating on it. I suggest you ask @jwouellette; maybe he can give you an update on its current progress.

The official translations are developed and maintained on https://translations.buildingsmart.org/

Approved translations are published on GitHub, in bSDD and through an API.

Translations can be provided by everyone (feel free to help!).
Chapters are organising proofreading.


@berlotti , @artur_tomczak

Where and how are the translations deployed?

  • IfcAirTerminal (bSDD) shows only “English” in the “Select language” box.
  • Our investigation of JSON, RDF and GraphQL (see Semantic bSDD sec improve-multilingual-support) shows that there are no fields that can carry multilingual names/definitions.
  • Our GraphQL platform (Semantic Objects) has various capabilities for selecting by lang tag, and for lang fallback (e.g. the list de,en-GB,en,~ means “first give me de; if not available then en-GB, else any en variant, finally any lang at all”). See the GraphQL Query Tutorial in the Semantic Objects 4.0.0 documentation, sec filtering-literal-values
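
To make the fallback list concrete, here is a minimal Python sketch of how such a list could be evaluated. It follows the reading given above (a bare “en” matches any en variant, “~” matches any language); the function name pick_literal and the sample data are made up for illustration, and the real Semantic Objects implementation may well differ.

```python
from typing import Dict, Optional


def pick_literal(values: Dict[str, str], preference: str) -> Optional[str]:
    """Return the first string that matches a fallback list like "de,en-GB,en,~".

    `values` maps a language tag to a string. A bare tag such as "en" also
    matches regional variants ("en-GB", "en-US"); "~" matches any language.
    """
    for wanted in (part.strip().lower() for part in preference.split(",")):
        for tag, text in values.items():
            tag_lower = tag.lower()
            if (
                wanted == "~"
                or tag_lower == wanted
                or tag_lower.startswith(wanted + "-")
            ):
                return text
    return None  # nothing matched and the list had no "~" catch-all


# Illustrative data only: names of the demo "apple" class.
names = {"en-US": "Apple", "pl": "Jabłko"}
print(pick_literal(names, "de,en-GB,en,~"))
```

With the sample data there is no de or en-GB string, so the “en” entry picks up the en-US value.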

cc @Nata_Ke


No IFC translations have been approved yet, which is why you don’t see them in bSDD. The intention is to have them in bSDD as soon as they are approved.

You can check other classifications that have translations, such as our demo “Fruits and Vegetables”. Check apple for example: Apple (bSDD)

But is the multilingual data returned by the APIs?
If I try this, it returns English, same as if Accept-Language is not specified:

curl -L -HAccept-Language:pl https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple|jq .

Are there fields in GraphQL, JSON and RDF to accommodate multilingual strings?

You make separate requests for each language. For example, to access the Polish version of the apple class, you could use:

https://test.bsdd.buildingsmart.org/api/Classification/v4?namespaceUri=https%3A%2F%2Fidentifier.buildingsmart.org%2Furi%2Fbs-agri%2Ffruitvegs-1.0%2Fclass%2Fapple&languageCode=pl-PL

This is similar to how the web page only shows you one language at a time.

Edit: that was for REST-JSON. You could also ask for TTL using Curl:

curl -X 'GET' \
 'https://test.bsdd.buildingsmart.org/api/Classification/v4?namespaceUri=https%3A%2F%2Fidentifier.buildingsmart.org%2Furi%2Fbs-agri%2Ffruitvegs-1.0%2Fclass%2Fapple&languageCode=pl-PL' \
 -H 'accept: text/turtle'

  • So it is not yet available in GraphQL, or in the JSON and RDF entity APIs (i.e. resolving the semantic URL)?
  • Which of the following fields are in which record? Eg is dataType repeated between the EN and PL records, or recorded only once? Does description come from the EN record as fallback, or is it repeated in the PL record?
name: "Objętość",
description: "The volume of an apple",
dataType: "Real",
dimension: "3 0 0 0 0 0 0",
  • The strings in JSON don’t indicate their language. Eg "Objętość"@pl, "The volume of an apple"@en
  • The strings in RDF indicate a language but assume it’s the same for the whole record, which is not true: bsdd:Name "Objętość"@pl-pl; bsdd:PhysicalQuantity "Volume"@pl-pl
  • (Technically speaking, @pl-pl is better written @pl since in Poland they don’t speak any special dialect of Polish)

Please refer to the list we have available here: https://api.bsdd.buildingsmart.org/api/Language/v1

One part is the data model, another is how authors define their content. In that case, the description was uploaded as Polish but was not translated properly. (It was my fault :wink: )

What do you mean? It is possible with REST-JSON and REST-TTL (RDF) APIs. Not sure about GraphQL, can you help @Erik.bSDD?

Check our documentation to see which attributes can be translated (marked in “Translatable?” column): https://github.com/buildingSMART/bSDD/blob/master/Documentation/bSDD%20JSON%20import%20model.md. The rest of the content is taken from the initial upload.

@pl-pl is better written @pl

en-GB is OK, but pl-PL is not.

In that case, the description was uploaded as Polish but was not translated properly.

You assume EVERYTHING will be translated uniformly. From my experience with large thesauri, I think that assumption is unrealistic, so you need to think about “language fallback” (see link above) and strings in the payload should indicate their language (be self-describing).

It is possible with REST-JSON and REST-TTL

I mean what’s called “entity API”, i.e. following TimBL’s basic web principles (URLs should resolve, and return useful data). Can you get Polish from the semantic URL https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple? Using standard content negotiation (Language and MIME type)?

The rest of the content is taken from the initial upload.

Does that mean duplicated? So instead of multilingual entities (eg Domains, Classifications), you have independent entities, where one has been translated from the other?

To give a turtle example:

# THIS IS GOOD
<https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple> a :Classification;
  :name "Apple"@en, "Jabłko"@pl;
  :dataType "Real";
  :dimension "3 0 0 0 0 0 0".

# THIS IS BAD
<https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple/en> a :Classification;
  :name "Apple"@en;
  :dataType "Real";
  :dimension "3 0 0 0 0 0 0".
<https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple/pl> a :Classification;
  :name "Jabłko"@pl;
  :dataType "Real";
  :dimension "3 0 0 0 0 0 0".

Some of the questions are answered by how bSDD stores the data and by the structure of the input data: https://github.com/buildingSMART/bSDD/blob/master/Documentation/bSDD%20JSON%20import%20model.md
Good improvement suggestions are welcome, though, and I will go through your list in due time!

Those are standard country-language codes from ISO. Can you explain what the problem with them is?

Ah, I see now. No, it’s not possible, but nice idea for improvement. How would you suggest to have it?

https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple?languageCode=pl-PL&mimetype=ttl

or even:

https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple/ttl/pl

?

standard country-language codes from ISO, can you explain what is the problem with them?

Citing RFC 5646: Tags for Identifying Languages

Region subtags are used to indicate linguistic variations associated
with or appropriate to a specific country, territory, or region.
Typically, a region subtag is used to indicate variations such as
regional dialects or usage, or region-specific spelling conventions.

“Polish as spoken in Poland” is just Polish, so adding the region subtag is not right.
(Similarly, I tried to convince VIAF that tagging person authority names with @it-VA just because they came from Biblioteca Vaticana is wrong, because pure @it is spoken in the Vatican).

How would you suggest to have it?

Please read “Cool URIs for the Semantic Web” and “Hypertext Style: Cool URIs don't change”.
URIs (or their better brother resolvable URLs) are identifiers of things. They shouldn’t reflect implementation or representation details.

Then read the Wikipedia article on content negotiation. HTTP uses standard headers (see Wikipedia’s list of HTTP header fields), so you don’t need to mangle the URL.

  • the client uses Accept-Language: and Accept:
  • the server uses Content-Language: and Content-Type:

Eg this asks for the classification as Turtle; in Polish or as fallback in English or German:

curl -L -Haccept-language:pl,en,de -Haccept:text/turtle \
 https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple

The “fallback” should be used for every translatable field individually.
I.e. you should not duplicate strings in entity language variants.

More importantly, you should not duplicate non-translatable data.
There must be only one semantic URL https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple that knows its name and description in a variety of languages. If you duplicate data, that makes it very hard to maintain.

Most importantly, the semantic URL should not depend on language. If you have https:.../apple/en vs https:.../apple/pl, that means that an English AECO company cannot interoperate easily with a Polish company because they use different URLs (identifiers) for the same thing.
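
To illustrate the three points above (fallback per field, no duplication, strings that state their own language), here is a rough Python sketch. The record layout and the resolve function are hypothetical and are not how bSDD actually stores or serves data; the values are the ones quoted earlier in this thread. It ignores q-values and only matches tags exactly.

```python
from typing import Any, Dict

# Hypothetical record layout (NOT the actual bSDD storage model):
# translatable fields map language tag -> text, everything else is stored once.
VOLUME: Dict[str, Any] = {
    "name": {"pl": "Objętość"},
    "description": {"en": "The volume of an apple"},
    "dataType": "Real",
    "dimension": "3 0 0 0 0 0 0",
}


def resolve(record: Dict[str, Any], accept_language: str) -> Dict[str, Any]:
    """Pick each translatable field separately and keep its language tag."""
    wanted = [tag.strip().lower() for tag in accept_language.split(",")]
    resolved: Dict[str, Any] = {}
    for field, value in record.items():
        if not isinstance(value, dict):
            resolved[field] = value  # non-translatable: copied as-is
            continue
        for lang in wanted:
            if lang in value:
                # Self-describing string: the text plus the language it is in.
                resolved[field] = {"value": value[lang], "lang": lang}
                break
    return resolved


print(resolve(VOLUME, "pl,en,de"))
```

For the list pl,en,de this gives the name in Polish and the description in English, each tagged with its own language, while dataType and dimension appear once.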


Thanks for the sources. At first, I didn’t fully understand you, but now I think I do. What you described with one URL is how bSDD currently works, so glad you agree with it. The MIME type can be specified in REST API with the header, but the languageCode must be a parameter. For example you can request:

curl -X 'GET' \
 'https://test.bsdd.buildingsmart.org/api/Classification/v4?namespaceUri=https%3A%2F%2Fidentifier.buildingsmart.org%2Furi%2Fbs-agri%2Ffruitvegs-1.0%2Fclass%2Fapple&languageCode=pl-PL' \
 -H 'accept: text/turtle'

Does that solution meet your expectations?

Regarding how the data is stored, we only keep translations on top of the core data in its original language, without duplicating non-translatable parts.

Hi @artur_tomczak! This still doesn’t follow the Web Architecture principles about resource (entity/semantic) URLs.
Eg to fetch the classification “apple” in Polish (or English, or German) in Turtle, one should be able to use:

curl -L -Haccept-language:pl,en,de -Haccept:text/turtle \
  https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple

See 4.6.3. Use Language Content Negotiation, and more details in its parent section 4.6. Improve Multilingual Support

Thank you for the explanation @VladimirAlexiev, it makes sense. We have added this improvement to our backlog.

As discussed in the meeting today, the language header is working, but there are two problems with the above query:

  • “pl” is not recognized, because we use “pl-PL” in bSDD. We will add support for “pl” as well.
  • if something is not on the Production server, it doesn’t have the “/identifier.buildingsmart” URL assigned. You could use “test.bsdd” instead, for example “https://test.bsdd.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple”, but this is just for testing.

So a query that works now and gives you content in Polish could be:

curl -L -Haccept-language:pl-PL https://test.bsdd.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple
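
For what it’s worth, accepting a bare “pl” when the stored tag is “pl-PL” could be as simple as comparing primary language subtags. The Python sketch below is only an assumption about one possible approach, not the bSDD implementation; match_supported and the supported list are illustrative names and data.

```python
from typing import List, Optional


def match_supported(requested: str, supported: List[str]) -> Optional[str]:
    """Map a requested language tag to a supported one.

    An exact (case-insensitive) match wins; otherwise the first supported tag
    with the same primary subtag is used, so "pl" finds "pl-PL".
    """
    req = requested.strip().lower()
    by_lower = {tag.lower(): tag for tag in supported}
    if req in by_lower:
        return by_lower[req]
    primary = req.split("-")[0]
    for lower, original in by_lower.items():
        if lower.split("-")[0] == primary:
            return original
    return None  # no supported tag shares the primary subtag


# Illustrative list; the real one is served by the bSDD Language API.
supported = ["en-GB", "pl-PL"]
print(match_supported("pl", supported))
```

The same comparison also works the other way round, so a request for “pl-PL” would still find content stored under plain “pl”.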