During some tests on around 2000 IFC files I stumbled upon the following: around 250 files use the same project ID, 344O7vICcwH8qAEnwJDjSU. But they belong to around 80 different projects from various architects. Some files are new, some are around 5 years old. All of them were exported by ArchiCAD.
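For anyone who wants to repeat this kind of check, here is a minimal sketch of how the files could be grouped by project ID. It is not the script used for the test above. It assumes plain-text IFC-SPF (.ifc) files, reads the IfcProject GlobalId with a regular expression instead of a real IFC parser, and the function names are made up for this example.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches the first attribute (GlobalId) of an IFCPROJECT instance in STEP text.
PROJECT_GUID = re.compile(r"=\s*IFCPROJECT\s*\(\s*'([^']{22})'", re.IGNORECASE)


def project_guid(step_text):
    """Return the IfcProject GlobalId found in STEP text, or None."""
    match = PROJECT_GUID.search(step_text)
    return match.group(1) if match else None


def group_by_project_guid(folder):
    """Map each IfcProject GlobalId to the names of the .ifc files that use it."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*.ifc")):
        # Reads the whole file into memory; acceptable for a quick survey.
        guid = project_guid(path.read_text(errors="ignore"))
        if guid is not None:
            groups[guid].append(path.name)
    return dict(groups)


def shared_guids(groups):
    """Keep only the GlobalIds that occur in more than one file."""
    return {guid: files for guid, files in groups.items() if len(files) > 1}
```

Calling `shared_guids(group_by_project_guid("some/folder"))` would list every project ID that appears in more than one file, together with the files that share it.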
One potentially related issue is that the UUID version isn’t specified by buildingSMART, resulting in less random GUIDs. I’ve previously reported this problem here and here.
I am surprised at the frequency and the specificity to the IfcProject entity, though. That suggests it may be a different issue.
I took a look at that IFC file and sorted the GUIDs ascending - there are no internal collisions, and I do not see clearly wrong ones (e.g. sequential GUIDs) within that file. My new theory for this scenario is that perhaps ArchiCAD is generating GUIDs correctly, but perhaps if you load a particular template which includes a project GUID, ArchiCAD will preserve it in subsequent exports? I’m not convinced, though, that 250 files are using the same template …
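A rough sketch of the check described above, for anyone who wants to run it on their own files: it looks for internal collisions and for "sequential" GlobalIds. It assumes the GlobalIds have already been extracted into a list of 22-character strings, and it decodes them with the 64-character alphabet of the compressed IFC GUID. The function names and the `max_gap` parameter are my own.

```python
from collections import Counter

# Character set used by the 22-character compressed IFC GlobalId.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$"


def guid_to_int(guid):
    """Decode a compressed GlobalId into its integer value."""
    if len(guid) != 22:
        raise ValueError(f"expected 22 characters, got {len(guid)}")
    value = 0
    for char in guid:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 64 + ALPHABET.index(char)
    return value


def internal_collisions(guids):
    """Return the GlobalIds that occur more than once in the list."""
    return sorted(g for g, count in Counter(guids).items() if count > 1)


def sequential_pairs(guids, max_gap=1):
    """Return neighbouring distinct GlobalIds whose values differ by at most max_gap."""
    ordered = sorted(set(guids), key=guid_to_int)
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if guid_to_int(b) - guid_to_int(a) <= max_gap
    ]
```

An empty result from both functions only says that nothing obviously wrong was found within one file; it says nothing about collisions across files.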
ArchiCAD seems to create the project’s globally unique ID (and those of a few other spatial entities) from a set of user-defined properties (see Graphisoft Help Center). This way they give users control over uniqueness and allow them to intentionally generate equal IDs when two or more of their files do indeed cover the same project, i.e. when a project is split across multiple files.
This makes a lot of sense IMHO, even though I am not convinced if and how this goes well with guaranteed randomness; imagine, for example, that multiple users follow the ID scheme proposed in the Help Center. I guess at least when all those fields are empty, they had better generate a random ID.
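To illustrate the idea (this is not ArchiCAD’s actual algorithm, which I do not know), here is a small sketch of an ID derived from user-entered fields, with a random fallback when all fields are empty. The use of a name-based UUID5, the placeholder namespace and the example field values are assumptions made purely for illustration.

```python
import uuid

# Placeholder namespace for the illustration; a real tool would define its own.
NAMESPACE = uuid.NAMESPACE_DNS


def derived_project_id(*fields):
    """Name-based UUID from user-entered fields; random UUID4 if all are empty."""
    cleaned = [field.strip() for field in fields]
    if not any(cleaned):
        # Nothing entered: do not assume identity, generate a random ID instead.
        return uuid.uuid4()
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    return uuid.uuid5(NAMESPACE, "\x1f".join(cleaned))


# Two files of one split project with the same (hypothetical) project info agree:
part_1 = derived_project_id("Harbour Tower", "Plot 7")
part_2 = derived_project_id("Harbour Tower", "Plot 7")
```

Note that such a scheme still produces equal IDs for unrelated users who enter the same values, including any non-empty default value that nobody bothered to change.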
In my opinion, a better solution would be for ArchiCAD to just randomly generate it, but then allow a field to override it in the UI if you want it to match a specific value … the use of such a seed decreases randomness.
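As a sketch of what "random by default, override on request" could look like in code (the function name and the `override` parameter are hypothetical, and this says nothing about how any vendor implements it):

```python
import uuid


def project_id(override=None):
    """Return the user-supplied UUID if given, otherwise a fresh random UUID4."""
    if override is not None:
        # Raises ValueError if the override is not a well-formed UUID string.
        return uuid.UUID(str(override))
    return uuid.uuid4()
```

With this design two files only share an ID when a user deliberately copies the value from one to the other.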
@igor.sokolov which other applications exhibit this problem?
GlobalId may be the same for semantically different projects
From my understanding, this should not happen, and vendors should not design software in a way that makes this occur.
GlobalId may be different for 2 IFC files created from one design.
From my understanding, this is not an issue: this is the desired behaviour to denote that two IFC files actually refer to the same data. When IFC 5 comes with IFC referencing, that’ll clean this up further, but for now, GUID matching is all we’ve got.
Edit: crossed out my statements - I had misread Igor’s message. Thanks @tauscher!
Exactly, that is why it is an issue to have different global IDs for semantically equivalent projects, buildings, storeys etc. when they are in different files. Maybe you misread @igor.sokolov 's second case?
I guess we all agree that GlobalIds should be equal if and only if the identified products are semantically equivalent. Neither of the two violations should happen, but they do happen even if implemented properly because determining semantic equivalence is rarely possible without an educated user.
@tauscher whoops! You’re absolutely right. I’ve edited the post. Yes, I believe we are all in agreement here.
One additional detail, though: I’d like to propose that strongly seeded approaches such as the one seen in ArchiCAD may reduce randomness. Therefore, I’d refer to my original proposal for the buildingSMART spec to specify UUID4, and then allow users to override it if they know what they’re doing.
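To see which UUID version (if any) a given GlobalId carries, the 22-character form can be expanded back into a standard UUID. A minimal sketch, assuming the usual compressed IFC GUID encoding (a base-64 number over the alphabet below); the function names are my own:

```python
import uuid

# Character set used by the 22-character compressed IFC GlobalId.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$"


def expand(guid):
    """Convert a 22-character IFC GlobalId into a uuid.UUID."""
    if len(guid) != 22:
        raise ValueError(f"expected 22 characters, got {len(guid)}")
    value = 0
    for char in guid:
        value = value * 64 + ALPHABET.index(char)
    # uuid.UUID rejects values that do not fit into 128 bits.
    return uuid.UUID(int=value)


def compress(value):
    """Convert a uuid.UUID into its 22-character IFC GlobalId."""
    number = value.int
    chars = []
    for _ in range(22):
        number, digit = divmod(number, 64)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))
```

`expand(some_global_id).version` then gives the version number, or `None` when the bits do not follow the RFC 4122 layout, which would itself be a hint that the ID was not produced by a standard UUID generator.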
Does that mean we all agree that two totally different projects should never ever have the same UUID? Then I consider this a bug in ArchiCAD.
Just as a reminder: all the files are from different architects and from different projects. They have just one thing in common: they were all exported with ArchiCAD.
How would one define “(totally) different projects” though? As @tauscher said:
Neither of the two violations should happen, but they do happen even if implemented properly because determining semantic equivalence is rarely possible without an educated user.
I am not sure if I understand this semantic equivalence … I am a practical user who does some Python … I will try to explain it as simply as possible.
Assume two building projects. Everything is exactly equal, in reality and in the CAD system, except the geolocation: one project is just 100 m beside the other. Even for these projects I would expect different UUIDs, because they are different projects IMHO.
In my case the projects are totally different. I mean the project itself. They are randomly taken from the web, from my hard disk, from any resource I have.
I do not know about the object classes in the CAD system. I do not care about them since I do not even have access to them. I only see the projects, not the digital class data of the CAD system.
You cannot define equality just by looking at the object graph and the values of the attributes. The data can vary (for example after planning has progressed and things have changed, details were added etc.) and yet it is the same building or site or project. Only the author of the files can tell whether it is equal or not.
You don’t have to as a CAD application user - you just have to provide metadata for project, building, site. With this metadata a user gives a hint of whether project, building, site are to be regarded as equal or not. Check the Graphisoft Help Center, they don’t mention object classes. Yet as an author you would be familiar with the concepts of project, site and building.
This is when users did not fill in any metadata. It can be argued that this is a bug and, as Moult suggested, ArchiCAD should not assume identity but randomly generate the UID in this case, which happens to be the default.
Which metadata? I could check the metadata for these dozens of IFC files with the same project UUID and report back whether the metadata really is different.
I am posting the Graphisoft Help Center link again here: How to Control Global ID (IFC Attribute) Based on ARCHICAD Project Info. At least two of your examples have “Project” as project name, which is the ArchiCAD default and a hint that they did not enter specific project information. Scheependomlan seems to have a specific project name though. One possible reason off the top of my head could be that the ArchiCAD project is configured to retain GUIDs (which is good!) and the project name was changed later, after the first save.
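A quick way to check the project name across many files could look like the sketch below. It assumes plain-text IFC-SPF files and relies on GlobalId, OwnerHistory and Name being the first three attributes of IFCPROJECT. It uses a regular expression rather than a real IFC parser, does not decode STEP escapes such as \X2\ sequences, and treats “Project” as the default name only because that is what was said above. The function names are made up.

```python
import re

# GlobalId, OwnerHistory and Name are the first three attributes of IFCPROJECT.
PROJECT_HEAD = re.compile(
    r"=\s*IFCPROJECT\s*\(\s*'([^']{22})'\s*,"  # GlobalId
    r"\s*(?:#\d+|\$)\s*,"                       # OwnerHistory reference or $
    r"\s*(?:'((?:[^']|'')*)'|\$)",              # Name string or $
    re.IGNORECASE,
)


def project_guid_and_name(step_text):
    """Return (GlobalId, Name) of the IfcProject, or None if there is none."""
    match = PROJECT_HEAD.search(step_text)
    if match is None:
        return None
    guid, name = match.groups()
    if name is not None:
        # STEP escapes a literal apostrophe by doubling it.
        name = name.replace("''", "'")
    return guid, name


def looks_like_default_name(name):
    """True if the name is missing, empty, or the assumed default 'Project'."""
    return name is None or name.strip() in ("", "Project")
```

Running this over the files that share a GlobalId would show how many of them still carry the default name and how many, like the Scheependomlan example, do not.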
Disclaimer: I am not an ArchiCAD user and don’t even have ArchiCAD installed. It would be good if someone with ArchiCAD experience and access could confirm the behavior or do a quick test.