Why aren't faces inlined in IfcPolygonalFaceSet?

Consider the following snippet: why can’t the IfcIndexedPolygonalFace be inlined into IfcPolygonalFaceSet (similar to IfcCartesianPointList3D)? Wouldn’t that be much, much more efficient?

#15=IFCINDEXEDPOLYGONALFACE((2,1,4,5));
#16=IFCINDEXEDPOLYGONALFACE((1,2,8,3));
#17=IFCINDEXEDPOLYGONALFACE((1,3,6,4));
#18=IFCINDEXEDPOLYGONALFACE((4,6,7,5));
#19=IFCINDEXEDPOLYGONALFACE((3,8,7,6));
#20=IFCINDEXEDPOLYGONALFACE((8,2,5,7));
#21=IFCCARTESIANPOINTLIST3D(((265.,4.,0.),(265.,6.,0.),(277.,4.,0.),(265.,4.,2.),(265.,6.,2.),(277.,4.,2.),(277.,6.,2.),(277.,6.,0.)));
#22=IFCPOLYGONALFACESET(#21,$,(#15,#16,#17,#18,#19,#20),$);
#23=IFCSHAPEREPRESENTATION(#10,'Body','Tessellation',(#22));

The proposal:

#21=IFCCARTESIANPOINTLIST3D(((265.,4.,0.),(265.,6.,0.),(277.,4.,0.),(265.,4.,2.),(265.,6.,2.),(277.,4.,2.),(277.,6.,2.),(277.,6.,0.)));
#22=IFCPOLYGONALFACESET(#21,$,((2,1,4,5),(1,2,8,3),(1,3,6,4),(4,6,7,5),(3,8,7,6),(8,2,5,7)),$);
#23=IFCSHAPEREPRESENTATION(#10,'Body','Tessellation',(#22));

The proposal is 58% of the original size in bytes. And that’s just for a cube - for more complex shapes the savings would be even greater. If this was done in IFC5, we could have much, much smaller files!
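For anyone who wants to check the size comparison, here is a minimal sketch that rebuilds both snippets from the same face list and compares their encoded sizes. The helper names (`fmt`, `current_form`, `proposed_form`) are made up for this illustration, and the exact percentage depends on line endings and whitespace, so treat the printed figure as approximate.

```python
# Rebuild the two snippets from the post and compare their sizes in bytes.
# Assumption: one instance per line, "\n" line endings, no extra whitespace.
faces = [(2, 1, 4, 5), (1, 2, 8, 3), (1, 3, 6, 4),
         (4, 6, 7, 5), (3, 8, 7, 6), (8, 2, 5, 7)]
points = ("#21=IFCCARTESIANPOINTLIST3D(((265.,4.,0.),(265.,6.,0.),(277.,4.,0.),"
          "(265.,4.,2.),(265.,6.,2.),(277.,4.,2.),(277.,6.,2.),(277.,6.,0.)));")
shape = "#23=IFCSHAPEREPRESENTATION(#10,'Body','Tessellation',(#22));"


def fmt(face):
    """Format one index tuple the way it appears in the SPF text."""
    return "(" + ",".join(str(i) for i in face) + ")"


def current_form():
    """One IFCINDEXEDPOLYGONALFACE instance per face, referenced by id."""
    lines = [f"#{15 + n}=IFCINDEXEDPOLYGONALFACE({fmt(f)});"
             for n, f in enumerate(faces)]
    refs = ",".join(f"#{15 + n}" for n in range(len(faces)))
    lines += [points, f"#22=IFCPOLYGONALFACESET(#21,$,({refs}),$);", shape]
    return "\n".join(lines) + "\n"


def proposed_form():
    """Face indices written inline inside IFCPOLYGONALFACESET."""
    inline = ",".join(fmt(f) for f in faces)
    lines = [points, f"#22=IFCPOLYGONALFACESET(#21,$,({inline}),$);", shape]
    return "\n".join(lines) + "\n"


current_size = len(current_form().encode("ascii"))
proposed_size = len(proposed_form().encode("ascii"))
ratio = proposed_size / current_size
print(f"{proposed_size} / {current_size} bytes = {ratio:.1%}")
```

The saving per face is roughly the entity name, the instance id, and the reference to it, so it grows linearly with the number of faces.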

Edit: filed boog https://github.com/buildingSMART/NextGen-IFC/issues/71

Edit 2: the boog also proposes how to deal with voids.

Would this have an impact on wanting to differentiate color/render for each of the different faces? Just asking/thinking out loud. Trying to think of a case where you would actually do that for a PolygonalFaceSet…

No, colour is based on representation item - it works regardless of this change.

… speaking of colour, I need to return to the colour threads which have been forgotten about.


In my understanding the reason that this is not done yet is because IFCINDEXEDPOLYGONALFACE as an entity has a child, i.e. IFCINDEXEDPOLYGONALFACEWITHVOIDS.

Technically it would be possible to somehow integrate both IFCINDEXEDPOLYGONALFACE and IFCINDEXEDPOLYGONALFACEWITHVOIDS into IFCPOLYGONALFACESET. However this would reduce readability of IFCPOLYGONALFACESET. I see your point in memory reduction but my feeling would be keeping the semantics (i.e. discrimination between IFCINDEXEDPOLYGONALFACE and IFCINDEXEDPOLYGONALFACEWITHVOIDS) would prevail over reduced file size in this specific case.

@PeterBonsma I have addressed that in my initial post - click on the link to the Github issue where I describe the details of it.

I think we agree on the possibility of having such a solution and the memory use benefit.

Nevertheless, I see a difference: in the current solution, the semantics of indexed polygonal faces with and without voids live in the schema entities, whereas an optimized solution would cover them in documentation. The choice therefore boils down to modelling style and to the basic requirements underlying schema development, i.e. what do we want to have described by entities and what do we want to have described in documentation.

Whatever choice is made for IFC5, I think it should be consistent throughout the schema. For example, if we choose this adjustment, should we also embed points as a set of coordinates within an IFCBSPLINESURFACE? (https://standards.buildingsmart.org/IFC/RELEASE/IFC4/ADD2_TC1/HTML/link/ifcbsplinesurface.htm) More generically, when do we embed content and when do we reference content?

Agreed - if this change is made here, there are other places to make the same change. The generic question of “when do we embed, and when do we reference” seems like a relatively simple objective question (since we’re dealing with non-rooted elements, semantic arguments are much more simplified): you embed when the probability of repeat references is low, and you reference when the probability of repeat references is high. Unless I’m missing some other factor to consider…?

In this scenario, the probability of repeat references is low. I can think of very few scenarios where you might want to reference an indexed polygonal face (with or without voids) twice. Perhaps when sharing faces in a space boundary? Hmm, a bit of a stretch of the imagination. Hence my recommendation.
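The "how often is it referenced" question can be checked on real files. Below is a rough sketch, not a proper STEP parser: it assumes one instance per line and uses a hypothetical helper `reference_counts` to count how many times each instance id is referenced by other instances, here on the cube snippet from the first post.

```python
import re
from collections import Counter

# The cube snippet from the first post.
SPF = """\
#15=IFCINDEXEDPOLYGONALFACE((2,1,4,5));
#16=IFCINDEXEDPOLYGONALFACE((1,2,8,3));
#17=IFCINDEXEDPOLYGONALFACE((1,3,6,4));
#18=IFCINDEXEDPOLYGONALFACE((4,6,7,5));
#19=IFCINDEXEDPOLYGONALFACE((3,8,7,6));
#20=IFCINDEXEDPOLYGONALFACE((8,2,5,7));
#21=IFCCARTESIANPOINTLIST3D(((265.,4.,0.),(265.,6.,0.),(277.,4.,0.),(265.,4.,2.),(265.,6.,2.),(277.,4.,2.),(277.,6.,2.),(277.,6.,0.)));
#22=IFCPOLYGONALFACESET(#21,$,(#15,#16,#17,#18,#19,#20),$);
#23=IFCSHAPEREPRESENTATION(#10,'Body','Tessellation',(#22));
"""


def reference_counts(spf_text):
    """Return ({id: type name}, Counter of how often each id is referenced).

    Simplification: assumes each instance sits on a single line.
    """
    types = {}
    counts = Counter()
    for line in spf_text.splitlines():
        match = re.match(r"#(\d+)=([A-Z0-9_]+)\((.*)\);$", line.strip())
        if not match:
            continue
        instance_id, type_name, args = match.groups()
        types[int(instance_id)] = type_name
        # Drop quoted strings so a '#' inside text is not counted as a reference.
        args = re.sub(r"'[^']*'", "", args)
        counts.update(int(ref) for ref in re.findall(r"#(\d+)", args))
    return types, counts


types, counts = reference_counts(SPF)
face_refs = {i: counts[i] for i, t in types.items()
             if t == "IFCINDEXEDPOLYGONALFACE"}
shared = [i for i, n in face_refs.items() if n > 1]
print("references per face:", face_refs)
print("faces referenced more than once:", shared)
```

Running something like this over a corpus of files would give an empirical answer to how often faces are actually shared, rather than relying on what any of us can imagine.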

I agree! I also think your proposal is what will end up in IFC5, hopefully in a consistent manner, so that other entities follow the same or a similar underlying concept for how entities are built up.

Just on a side-note it would not be my preference. In my perception the schema should be as self-explanatory and semantically rich as possible, especially on the geometry description part. Removing entities and therefore semantics from the schema itself (even though covered in documentation and technically without loss of knowledge) is the wrong direction. If the consequence is unwanted memory growth of serialized data this is something that should be solved in IFC5 by rethinking the serialization itself and it should not be solved by the impoverishment of the schema itself.


I think the semantics do not suffer, as my proposal retains two classes: FaceSet, and FaceSetWithVoids. Also, the indexed semantic simply gets transferred from a class name into an attribute name - simply because IFC-SPF does not include attribute names does not mean the semantics are lost. Or am I misunderstanding which semantics you are referring to?

Okay, maybe this is a different discussion, but in my perception a lot of semantics is lost at schema level: the fact that outer + inner polygons inherit from outer polygons, many of the attribute values from IFCINDEXEDPOLYGONALFACE / IFCINDEXEDPOLYGONALFACEWITHVOIDS, the fact that they are in themselves tessellated items, etc…

But again, this is not up to me and is more of a theoretical discussion. I am perfectly okay if IFC5 moves towards such a solution and of course we will implement whatever is chosen.

I want to quickly expand on this “inlining”. The mechanism at work here is the differentiation between a TYPE in EXPRESS and an ENTITY. So for example the RefLatitude of an IfcSite is of type IfcCompoundPlaneAngleMeasure, which is a TYPE: LIST [3:4] OF INTEGER, so it gets inlined. IfcIndexedPolygonalFace, however, is an entity with one attribute LIST [3:?] OF IfcPositiveInteger, so it gets instantiated.

TYPEs can reference each other (IfcPositiveInteger is an IfcInteger), but you can only add WHERE constraints, not additional attributes (e.g. the inner bounds). ENTITYs can have (multiple…) supertypes and, sometimes importantly, inverse attributes.

Also, you are limited in what you can define as a type: you can define aggregates (like IfcCompoundPlaneAngleMeasure), but you cannot define pairs/tuples of heterogeneous types (e.g. a pair of <string, integer>). You can emulate that as list [2:2] of select(string|integer) and further restrict it with where rules so that there is exactly one of each, but you wouldn’t be making friends that way.
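To make the TYPE versus ENTITY mechanism concrete, here is a toy sketch of a writer that mimics the behaviour described above: plain aggregates (standing in for TYPE values) are written inline, while dataclass instances (standing in for ENTITY instances) each get their own `#id` line and are referenced. `SpfWriter`, `DemoSite` and `DemoFaceSet` are hypothetical names for this illustration and are not real IFC entities or a real library API; the actual IfcSite and IfcPolygonalFaceSet have more attributes.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class IfcIndexedPolygonalFace:      # stands in for an ENTITY
    CoordIndex: list


@dataclass
class DemoFaceSet:                  # hypothetical, heavily simplified entity
    Faces: list


@dataclass
class DemoSite:                     # hypothetical, heavily simplified entity
    RefLatitude: list               # stands in for IfcCompoundPlaneAngleMeasure (a TYPE)


class SpfWriter:
    """Toy writer: aggregates are inlined, dataclass instances get an id."""

    def __init__(self, first_id=1):
        self.next_id = first_id
        self.lines = []
        self.written = {}  # id(obj) -> (obj, instance number); keeps obj alive

    def value(self, v):
        if is_dataclass(v):
            return f"#{self.instance(v)}"   # ENTITY: written separately, referenced
        if isinstance(v, (list, tuple)):
            return "(" + ",".join(self.value(x) for x in v) + ")"  # inlined
        if v is None:
            return "$"
        if isinstance(v, str):
            return f"'{v}'"
        return str(v)

    def instance(self, entity):
        key = id(entity)
        if key not in self.written:
            args = ",".join(self.value(getattr(entity, f.name))
                            for f in fields(entity))
            number = self.next_id
            self.next_id += 1
            self.written[key] = (entity, number)
            name = type(entity).__name__.upper()
            self.lines.append(f"#{number}={name}({args});")
        return self.written[key][1]


writer = SpfWriter()
writer.instance(DemoSite(RefLatitude=[52, 22, 12]))
writer.instance(DemoFaceSet(Faces=[IfcIndexedPolygonalFace([2, 1, 4, 5]),
                                   IfcIndexedPolygonalFace([1, 2, 8, 3])]))
print("\n".join(writer.lines))
```

The latitude list ends up inside the `DEMOSITE` line, while each face becomes its own line that the face set references by id. Nothing in the data itself forces that difference; it follows only from whether the value was modelled as a type or as an entity.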

I tend to agree with Peter: the fact that TYPEs are more efficient is only an artefact of the SPF serialization, and I don’t think it has a measurable impact on memory usage (in native code, class names are not stored on a per-instance basis after all).

Consistency and expressiveness should be our aim. And in the past we did deviate from this: for example, an IfcCartesianPointList is actually not a list of IfcCartesianPoint at all, but rather a nested list of length measures. So this is indeed a worthwhile discussion.


Hold up, but how exactly does TesselatedFaceSet.HasColours container play with that whole color applicability chain you listed here? IFC Textures and Colors: Current Situation - #2 by Moult

@claimred good spot! I need to add that to the colour applicability chain. Also, it’s ambiguous as to which takes precedence! Need buildingSMART to clarify here. Now I’m curious as to who has implemented it… :)

But back to the topic of inlining: HasColours works equally with inlined and non-inlined faces.

I definitely agree that the “inlining” and its effect on file size is a matter of serialization. You can imagine a serialization where you spell out every type and attribute name, one where you leave out unambiguous type names, or one where you encode semantics into position. Beyond indexed structures, you can optimize serialization in various other ways, from shortened type symbols to binary encoding. The EXPRESS-to-SPF mapping prescribes inline values for simple and derived types, and explicit spelling of the type name with parameters for entity types.

To expand on aothm’s explanation, defined types are similar to unstructured datatypes in UML (with the additional ability to extend other defined or simple types), while entity types are similar to classes. Thus I see the question rather as “when do we model something as a class versus a datatype”. The objectives for this decision cannot be reduced to single or multiple references, and even that could be debated for faces. With that in mind, it seems hardly possible to replace entity types with defined types without losing semantics in the schema. I think it is easier to serialize a semantically rich schema into an indexed data structure relying on order and positions than to restore those semantics from a meager schema catering to efficient serialization.
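As a rough illustration of that last point, the sketch below keeps two distinct classes in the model and only flattens them into a positional, indexed structure at serialization time. The compact layout (each face is a list of loops, first loop is the outer bound, remaining loops are voids) and the names `Face`, `FaceWithVoids`, `to_compact` and `from_compact` are assumptions made up for this example; this is not the layout from the GitHub issue and not an existing IFC serialization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    outer: tuple            # indices into the point list


@dataclass(frozen=True)
class FaceWithVoids(Face):
    inner: tuple            # one or more inner loops, each a tuple of indices


def to_compact(faces):
    """Flatten the model into nested lists: [outer, inner1, inner2, ...] per face."""
    compact = []
    for face in faces:
        loops = [list(face.outer)]
        if isinstance(face, FaceWithVoids):
            loops += [list(loop) for loop in face.inner]
        compact.append(loops)
    return compact


def from_compact(compact):
    """Restore the class distinction from position alone."""
    faces = []
    for loops in compact:
        outer, *inner = loops
        if inner:
            faces.append(FaceWithVoids(tuple(outer),
                                       tuple(tuple(loop) for loop in inner)))
        else:
            faces.append(Face(tuple(outer)))
    return faces


model = [
    Face((1, 2, 3, 4)),
    FaceWithVoids((5, 6, 7, 8), ((9, 10, 11),)),
]
compact = to_compact(model)
print(compact)
print(from_compact(compact) == model)
```

The round trip only works because a face with voids always has at least one inner loop, so position is enough to tell the two classes apart. Going the other way, from a schema that only knows nested lists back to distinct classes, would need that rule written down somewhere outside the schema.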

There are other schemas that heavily rely on indexed structures, e.g. CityJSON, but I think they don’t distinguish between schema and serialization. Although it would be possible to apply this modelling style consistently to the schema, not only at serialization level, it wouldn’t be my preference either. For example, I would also prefer the faces to be modelled as a set and to assign style/color by reference, not by index. Since serialization inevitably imposes an order on the set, it could then establish an indexed representation.

Regarding the relevance of voids: For polygonal face sets, faces with voids are essential, not a rare side case. I assume that those tools that don’t use them are using triangles only anyway. As soon as you combine all connected coplanar triangles into one face, there are only limited cases where you can do without void polygons.