Thursday, June 30, 2011

HL7 attempts to get things clear about its own use of the word 'concept'

HL7 v3 has been under development since 1996. Now, at last, the HL7 organization is beginning to attempt to provide definitions for its basic terms. I provide the relevant document (in full) below, in the version of June 29, 2011. My comments are interspersed in red.

HL7 Nomenclature for Concept Representations
For many years, a number of words have been used to describe various bits and pieces of terminologies and their components. In an effort to precisely define these words, and use them consistently in the Core Principles document currently under ballot, this paper documents the current position of the Vocabulary WorkGroup with regards to these words, as used in the HL7 datatypes and the HL7 vocabulary model.
The fundamental unit of meaning in HL7 with respect to vocabulary is CONCEPT.
As defined in HL7, “a concept is a unitary mental representation of a real or abstract thing – an atomic unit of thought.” 
So:
  1. A concept is a fundamental unit of meaning. 
  2. A concept is a unitary mental representation.
  3. A concept is an atomic unit of thought.
Are 1., 2. and 3. equivalent? 
However, there is not yet technology able to directly transmit or manipulate abstract thought. 
What is 'abstract thought'? Is it the same as: thought about what are referred to above as abstract things? Is this still thought of the sort that takes place in human brains? If yes, what sort of technology does the author of this sentence have in mind? If no, what sort of thought is it, that does not take place in human brains?
Furthermore, for useful computational interoperability, the set of concepts that can be captured, shared and manipulated must be standardized.  
If concepts are mental representations, as according to 2., or units of thought, as according to 3., how -- leaving aside complex brain surgery -- could concepts be captured or shared or manipulated? How could concepts thus conceived (i.e. as creatures of human mental activity) play a role in computational interoperability? 
Code Systems perform this function of concept standardization 
Does this mean that the concepts which are the atomic units of thought -- the concepts, presumably, with which human thinkers operate when they have thoughts -- are to be replaced by other, standardized, concepts when Code Systems are introduced? If so, where do these new standardized concepts live? Are they still 'units of thought'? And if so, who or what is the thinker who is having the relevant thoughts? 
and may perform other functions such as defining relationships between concepts.  Most importantly, they provide representations for the concepts they standardize. 
Our thoughts, or mental representations, are to be standardized by Code Systems. What does this mean? What is the end-result of such standardization? Concepts themselves, we are told, are representations of real or abstract things, and Code Systems provide representations of concepts. From this we can infer:
4. Code Systems provide representations of representations of real or abstract things.
But then a further problem arises; for if Code Systems provide representations of the very concepts they themselves standardize, does this mean that they provide representations of these concepts as they existed before standardization; or that they provide representations of the very concepts which they themselves have standardized? How can something be a representation of X and a standardization of X at one and the same time? 
A Concept Representation is some form of symbol that, when interpreted in the context of a given Code System, is understood to stand in for the meaning associated with the Concept. 
5. Meanings are associated with concepts.
According to 1. a concept is a unit of meaning. According to 5. there are meanings associated with concepts. Which is correct?  
6. Symbols can stand in for meanings 
What could this mean? Surely the symbols are the sorts of things that have meanings. 
While these representations may be graphical, or other media (e.g. barcodes, video clips, etc.), in the context of HL7 they are generally limited to character strings.  Code systems may provide multiple concept representations of different types for a single concept. Some concept representations might be intended for internal use by the code system, others for exchange by computer systems representing the concept, others for display to human users, and still others to "formally define" the underlying concept.
In many code systems, the same symbol might serve multiple purposes.  For example, the symbol "F" in the HL7 code system AdministrativeGender can be used internally in the definition of a Value Set, can be sent over the wire between computer systems and can be displayed to users.  In other code systems, a given concept representation might be intended for only a single use, or even variants of a given use.  For example, a code system might define both short and long descriptive name concept representations for human display in different types of user interfaces.  Depending on their intended use, types of concept representation may have varying characteristics in terms of whether they are unique within the code system and whether there can be multiple representations of that type for a given code system.
7. There are descriptive names which are concept representations.
This means, I think, that there are descriptive names such as 'person' or 'arm' which represent concepts. Such descriptive names, presumably, also have meanings. Are these meanings also concepts? If so, are they the same concepts as the concepts which the descriptive names represent? If not, are there some meanings of descriptive names which are concepts, and other meanings which are not concepts? And if so, how do we then tell the difference between the two? 
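To make the quoted description concrete, here is a minimal Python sketch of a code system that holds several representations, with different intended uses, for a single entry. It is only an illustration under stated assumptions: the class names, the `use` labels, the lookup key "female" and the long display name "Female" are all hypothetical and are not taken from the HL7 document. Only the symbol "F" and the code system name AdministrativeGender come from the quoted text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ConceptRepresentation:
    symbol: str  # the character string itself, e.g. "F"
    use: str     # hypothetical label for the intended use of the symbol


@dataclass
class CodeSystem:
    name: str
    # Maps a hypothetical lookup key to every representation stored for it.
    entries: Dict[str, List[ConceptRepresentation]] = field(default_factory=dict)

    def add(self, key: str, symbol: str, use: str) -> None:
        self.entries.setdefault(key, []).append(ConceptRepresentation(symbol, use))

    def symbols_for(self, key: str, use: str) -> List[str]:
        """All symbols stored under `key` that are intended for `use`."""
        return [r.symbol for r in self.entries.get(key, []) if r.use == use]

    def uses_of(self, key: str, symbol: str) -> List[str]:
        """All uses that one and the same symbol serves under `key`."""
        return [r.use for r in self.entries.get(key, []) if r.symbol == symbol]


gender = CodeSystem("AdministrativeGender")
# The same symbol "F" serves several purposes, as in the quoted passage.
for use in ("internal", "exchange", "display-short"):
    gender.add("female", "F", use)
# A separate long descriptive name for display (an assumption for illustration).
gender.add("female", "Female", "display-long")

print(gender.uses_of("female", "F"))
print(gender.symbols_for("female", "display-long"))
```

Note what the sketch does and does not contain: strings, labels and a lookup table. Nothing in it is a mental representation or a unit of thought, which is exactly why the questions raised above about where the 'concepts' live remain open.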
In other code systems, it may be possible to construct a representation by combining other representations according to a code-system-specific grammar.  This is called post-coordination.  For example, the symbol for "arm" might be combined in some way with the symbol for "left" to construct a concept representation for "left arm". 
There is an issue, here, of what is called the 'use-mention confusion', illustrated by the sentence "Swimming is healthy and has two vowels." The use-mention confusion occurs when it is not realized that "swimming" refers to swimming (and not to "swimming"), that "London" refers to London (and not to "London"), and so on.
We are told that we can put together the symbol for "arm" with the symbol for "left" in order to construct a concept representation for "left arm". Not so, however. The symbol for "arm" is '"arm"'. ("Arm" is the symbol, not for "arm", but for arm.) 
[See the second comment by Spero melior below on a further problem raised by this paragraph -- which is that it suggests that the combination of the concepts arm and left would itself constitute a concept, which conflicts with HL7's own definition of concept as an atomic unit of thought.]
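The post-coordination step described in the quoted paragraph can be sketched in a few lines of Python. This is a rough illustration only: the "+" separator, the function name `post_coordinate` and the set of known symbols are assumptions made for the example, since the HL7 text says only that symbols are combined "in some way" according to a code-system-specific grammar.

```python
from typing import Iterable, Set

SEPARATOR = "+"  # hypothetical grammar: symbols are joined with "+"


def post_coordinate(symbols: Iterable[str], known: Set[str]) -> str:
    """Build a composite representation from symbols the code system defines."""
    parts = list(symbols)
    if len(parts) < 2:
        raise ValueError("post-coordination needs at least two symbols")
    unknown = [s for s in parts if s not in known]
    if unknown:
        raise ValueError(f"symbols not defined in the code system: {unknown}")
    return SEPARATOR.join(parts)


known_symbols = {"left", "right", "arm", "leg"}  # illustrative vocabulary
left_arm = post_coordinate(["left", "arm"], known_symbols)

print(left_arm)     # left+arm
print(repr("arm"))  # 'arm'  (a name for the symbol, not for the limb)
```

The function takes symbols in and gives a symbol out. The last line echoes the use-mention point: the string "arm" is used to refer to arms, while the quoted form printed by `repr` mentions the string itself.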
The [accompanying table] of three different names for concept representations attempts to disambiguate the nomenclature we commonly use in HL7.  It also includes examples for these types of concept representations from some common healthcare-related code systems, including some post-coordinated examples.

Note that these terms [‘Code’, ‘ConceptID’, ‘Designation’] all refer to the use of a particular Concept Representation. In some code systems, such as the AdministrativeGender example given earlier, the same representation (e.g. "F") can be a considered a Code, a Concept Id and a Designation.
So that's clear, then.

Notes
1. The above continues my commentary here.
2. For an example of a set of definitions which, I believe, comes closer to the level of precision that is required in this difficult field, see here, and especially the distinction between what are there called "levels 1, 2 and 3." For further discussion of the problems we face in understanding the meaning of the term 'concept' in computer science circles -- including arguments on behalf of the view that we should abandon this term entirely -- see here.
3. For a discussion of the historical source of some of the confusions conveyed in the above document, including the delightful passage concerning "technology able to directly transmit or manipulate abstract thought", see here.
4. Update June 30, 2011: See now the discussion by Grahame Grieve and Thomas Beale here.

3 comments:

Spero melior said...

I would add that if concepts are mental representations or thoughts, how on earth are we supposed to standardize them? Does that require that we wire everyone's neuronal connections the same way? That assumes, of course, that mental representations and thoughts can be reduced to such connections. And if we take a non-reductionist approach to mental representations and thoughts, the problem of standardizing them remains, and is probably more difficult.

Another problem is that the structure of mental representations and thoughts is a fuzzy thing, even to those who have them. What in fact is an atom or unit of thought? How do I know if I'm down to something atomic when it comes to thought?

If I think of a book on my desk, is that thought an atom? What if I'm thinking about my set of Winston Churchill's six books "The Second World War". One thought or six? Or what if I'm thinking about a meeting yesterday and think of an exchange between two people, each utterance in succession. If it's more than one thought (maybe one thought per utterance, or perhaps those utterances I'm remembering involve multiple thoughts themselves), where does one thought begin and another end?

What are the identity criteria for thoughts and mental representations? How do they come into existence? How and when do they go out of existence?

And if no one is thinking of the element Samarium at a particular moment in time, does that mean the chemical symbol of Sm does not refer to any thought or mental representation at that moment?

Spero melior said...

Another problem with the document as it currently stands is that it says that the combination of symbols for arm and for left constitutes a concept representation.

Well, if a concept representation represents a concept, and if a concept is an atomic unit of thought, then a combination of two concept representations cannot represent a concept, because what the combination represents is no longer atomic or unitary.

That is, if left and arm are concepts, and "left" and "arm" represent those concepts; then combining the two symbols into "left arm" means that left arm is not a concept. The reason is that left arm is not an atomic unit of thought. It is more of a "molecule" (scare quotes) of thought. Thus the representation of left arm cannot be a concept representation because what it represents--left arm--is not atomic and therefore not a concept.

So, the process of combining representations must create some other kind of representations besides concept representations. Regardless, according to the definitions of "concept" and "concept representation", combinations of concept representations by definition do not represent concepts and thus are not concept representations.

gemstest said...

A fundamental problem here is that concepts are not the fundamental unit of meaning, which emerges in a complex way in language. So even if it were possible to do so, standardising concepts does not enable system interoperability.