standard country-language codes from ISO, can you explain what is the problem with them?
Citing RFC 5646: Tags for Identifying Languages
Region subtags are used to indicate linguistic variations associated
with or appropriate to a specific country, territory, or region.
Typically, a region subtag is used to indicate variations such as
regional dialects or usage, or region-specific spelling conventions.
“Polish as spoken in Poland” is just Polish, so adding the region subtag is not right.
(Similarly, I tried to convince VIAF that tagging person authority names with @it-VA just because they came from Biblioteca Vaticana is wrong, because pure @it is spoken in the Vatican).
How would you suggest to have it?
Please read Cool URIs for the Semantic Web and Hypertext Style: Cool URIs don't change.
URIs (or their better brothers, resolvable URLs) are identifiers of things. They shouldn’t reflect implementation or representation details.
Then read Content negotiation - Wikipedia. HTTP uses standard headers (List of HTTP header fields - Wikipedia), so you don’t need to mangle the URL.
- the client uses Accept-Language: and Accept:
- the server uses Content-Language: and Content-Type:
E.g. this asks for the classification as Turtle, in Polish or, as a fallback, in English or German:
curl -L -Haccept-language:pl,en,de -Haccept:text/turtle \
https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple
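On the server side, the Accept-Language value has to be turned into an ordered list of preferences before any fallback can happen. Here is a minimal sketch in Python of how that could look. The helper name parse_accept_language is hypothetical and not taken from any bSDD code. Note also that HTTP only defines the q weights; using header order as the tie-break among equally weighted languages is an assumption, though a common one.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return language tags ordered by descending q weight.

    Assumption: tags with equal weight keep the order they had in the header.
    """
    prefs = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if q > 0:  # q=0 means the client explicitly rejects this language
            prefs.append((-q, position, tag.lower()))
    return [tag for _, _, tag in sorted(prefs)]
```

With the header from the curl call above, parse_accept_language("pl,en,de") gives Polish first, then English, then German.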
The “fallback” should be used for every translatable field individually.
I.e. you should not duplicate strings in entity language variants.
More importantly, you should not duplicate non-translatable data.
There must be only one semantic URL https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple that knows its name and description in a variety of languages. If you duplicate data, that makes it very hard to maintain.
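To make the per-field fallback concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the record layout, the field names, the sample strings, and the helpers pick and localize are hypothetical and do not reflect actual bSDD data or code. The point is only that there is one record per semantic URL, non-translatable values are stored once, and each translatable field falls back on its own.

```python
# Hypothetical record: one entry per semantic URL, sample values are made up.
APPLE = {
    "uri": "https://identifier.buildingsmart.org/uri/bs-agri/fruitvegs-1.0/class/apple",
    "code": "apple",  # non-translatable, stored exactly once
    "name": {"en": "Apple", "pl": "Jabłko", "de": "Apfel"},
    "definition": {"en": "Edible fruit of the apple tree."},  # no Polish text yet
}


def pick(translations, preferred):
    """Return (language, text) for the first preferred language that has a value."""
    for lang in preferred:
        if lang in translations:
            return lang, translations[lang]
    return None  # nothing available in any requested language


def localize(record, preferred):
    """Resolve every translatable field individually; copy the rest unchanged."""
    out = {}
    for key, value in record.items():
        if isinstance(value, dict):  # assumption: dict means "translatable field"
            out[key] = pick(value, preferred)
        else:
            out[key] = value
    return out
```

Calling localize(APPLE, ["pl", "en", "de"]) yields the Polish name but the English definition, because the fallback is applied field by field, and the URL is the same whichever language the client asked for.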
Most importantly, the semantic URL should not depend on language. If you have https:.../apple/en vs https:.../apple/pl, that means that an English AECO company cannot interoperate easily with a Polish company because they use different URLs (identifiers) for the same thing.