2006-04-24

Syntax, Semantics, and RDF


We have been taught that syntax and semantics are two issues that must be kept clearly apart. Semantics refers to meaning, whereas syntax refers to representation. Several syntactic constructs might have the same meaning, e.g. "3+4" and "5+2" both mean "7". The excellent textbook "Structure and Interpretation of Computer Programs" highlights this duality even in its title: structure is syntax and interpretation is semantics. We use different formalisms for defining each: BNF and similar formalisms for formally defining syntax, and rewrite rules or denotational methods for formally defining semantics. This is so for programming languages, and it is good the way it is. Now one can ask oneself several questions. How does this extend to systems modelling in general and to everyday descriptions? Leaving aside pragmatics, which refers to the use of constructs, is this always a two-level issue? Or can there be more levels? Are the two levels fixed, or is the difference a matter of point of view?

Let us illustrate the point with a simple example:
  • I could say: I am "Carlos"
  • Somebody could reply that I am a person, and that "Carlos" is my name in Spanish. My name in English would be "Charles" and in German "Karl". So we can conclude that "Carlos" is a name, not a person. For the same object, we can have several representations that describe it. "Carlos", "Charles", "Karl", etc. are just concrete (syntactic) representations that refer to the same thing and have the same meaning (we mean the same person, we just call him differently).
  • But it could be further argued that "Carlos" is just a textual representation of the word "Carlos". The Spanish word "Carlos" may have other representations as well, such as an audio representation. It might also be written in a different alphabet, such as Morse code. The alphabetic text, the Morse, and the audio representations all mean the same word.
  • But if we take the text "Carlos", we could also have different representations by using different fonts and sizes. The text (the meaning) is the same, and "Carlos" is just one of the possible images that can be used for this text.
  • So, is "Carlos" then an image? Or is it what we see of the image? We could continue like this forever.



We have shown that "Carlos" can be considered a person, a word (a name), a text, an image, etc. So there are not just two levels (syntax and semantics), but as many as we like. What is syntax from one point of view is semantics from another. Or, stated differently, there is nothing mystical or magical about a semantic construct, because it can also be viewed as syntax from a different perspective. Some people's metadata might be other people's data. For instance, the data a librarian works with might be the metadata a reader uses to find a book.

With each step down we take a representation step (syntax). Going up in the figure, we take an interpretation or abstraction step (semantics). We normally like to think of interpretation as a function, i.e. a mapping from several syntactic constructs to one semantic construct (not the other way around). The abstraction goes from the syntax to the semantics. A semantic construct might correspond to several syntactic ones. Of course, one might want to have several semantic values for the same syntactic construct, but then we talk of several interpretations (several interpretation functions). The abstraction function induces an equivalence relation over the syntactic domain: two constructs are equivalent when they are mapped to the same meaning. In the world of arithmetic expressions, "3+4", "5+2", "8-1" and many others are put into the same equivalence class, for which we might use a representative from the class itself, the (syntactic) expression "7".
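The equivalence classes induced by an interpretation function can be made concrete with a small sketch. It assumes a toy language of integer literals combined with + and - only; the names `interpret` and `equivalence_classes` are made up for this illustration.

```python
import ast
import operator
from collections import defaultdict

# Assumption: the toy language only knows integer literals, + and -.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub}


def interpret(expr: str) -> int:
    """Abstraction step: map a syntactic expression to its meaning (a number)."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported syntax in {expr!r}")

    return walk(ast.parse(expr, mode="eval").body)


def equivalence_classes(exprs):
    """Group the expressions that the interpretation maps to the same value."""
    classes = defaultdict(list)
    for e in exprs:
        classes[interpret(e)].append(e)
    return dict(classes)


classes = equivalence_classes(["3+4", "5+2", "8-1", "7", "1+1"])
print(classes[7])  # the class whose representative is the expression "7"
```

Note that "7" appears both as a member of its class (a piece of syntax) and, read as a number, as the key of that class (the meaning), which is exactly the double role described above.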

In a way, using a syntactic construct to mean something is taking an indirection step. I am giving you "3+4" instead of giving you its value 7, its meaning. Neither the existence nor the number of indirection steps is predetermined. We might introduce them as we wish. Moreover, we remove them just as easily. In the following traffic sign:



is the lane turning left (this is the abstraction) or the cars on it? We have eliminated an indirection step. When we talk, we might introduce indirection steps or eliminate them to simplify. Try speaking with maximum precision. You will find yourself using a lot of redundant words that just represent (sometimes useless) indirection steps. Do you get email or email messages? Or text? Or bits? Or signals? Or electric currents? Or electromagnetic fields? Or is it your computer, not you, that gets them? Or its modem? Or... forget it! Abstractions are indeed important.



What is important is the ability to make the indirection step. And this is the essence of RDF. Let us give another example, this one taken from Lewis Carroll's "Through the Looking-Glass":
  • `The name of the song is called "Haddocks' Eyes".'
  • `Oh, that's the name of the song, is it?' Alice said, trying to feel interested.
  • `No, you don't understand,' the Knight said, looking a little vexed. `That's what the name is called. The name really is "The Aged Aged Man".'
  • `Then I ought to have said "That's what the song is called"?' Alice corrected herself.
  • `No, you oughtn't: that's quite another thing! The song is called "Ways and Means": but that's only what it's called, you know!'
  • `Well, what is the song, then?' said Alice, who was by this time completely bewildered.
  • `I was coming to that,' the Knight said. `The song really is "A-sitting On a Gate": and the tune's my own invention.'
It is very easy to specify this in RDF:
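As a rough sketch of what such a specification could look like, the Knight's four levels can be written as subject-predicate-object triples, the data model underlying RDF. Plain Python tuples stand in here for a real RDF serialization, and the `ex:` resource and property names are hypothetical, not taken from any published vocabulary.

```python
# Hypothetical vocabulary: every "ex:" name below is made up for this sketch.
SONG = "ex:song"       # the song itself
NAME = "ex:songName"   # the name of the song, a resource in its own right

triples = [
    (SONG, "ex:is", "A-sitting On a Gate"),    # what the song really is
    (SONG, "ex:isCalled", "Ways and Means"),   # what the song is called
    (SONG, "ex:hasName", NAME),                # indirection: the song has a name
    (NAME, "ex:is", "The Aged Aged Man"),      # what the name really is
    (NAME, "ex:isCalled", "Haddocks' Eyes"),   # what the name is called
]


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is a triple."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def what_the_name_is_called(thing):
    """Follow two indirection steps: thing -> its name -> what that name is called."""
    return [
        label
        for name in objects(thing, "ex:hasName")
        for label in objects(name, "ex:isCalled")
    ]


print(what_the_name_is_called(SONG))
```

Because the name is a resource and not just a string, statements can be made about it in turn, which is what lets Alice's confusion be untangled one indirection step at a time.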



So, summing up, the point I want to make is that syntax and semantics are not absolute concepts; they are relative. Syntax might have a syntax and semantics might have a semantics. You might climb up or down the syntax/semantics ladder as much as you want. You might even go up or down several steps at once (what you then do mathematically is to compose the abstraction functions). Further, a syntax/semantics step is often dropped or made implicit. The art of modelling lies in deciding which abstraction/representation steps are needed to describe a system, given the use we want to make of that description.
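Going up several steps at once can be sketched as ordinary function composition, reusing the "Carlos" example. Everything in the sketch is an assumption made for illustration: the three abstraction steps, the tiny Morse table (only the letters of "Carlos") and the identifier "person#1".

```python
from functools import reduce

# Step 1: Morse signal -> text. The table covers only the letters needed here.
MORSE = {"-.-.": "C", ".-": "A", ".-.": "R", ".-..": "L", "---": "O", "...": "S"}


def morse_to_text(signal: str) -> str:
    return "".join(MORSE[code] for code in signal.split())


# Step 2: text -> word. Differences in letter case are abstracted away.
def text_to_word(text: str) -> str:
    return text.capitalize()


# Step 3: word (a name) -> person. Several names denote the same person.
PERSONS = {"Carlos": "person#1", "Charles": "person#1", "Karl": "person#1"}


def name_to_person(name: str) -> str:
    return PERSONS[name]


def compose(*steps):
    """Chain abstraction steps, applied left to right, into one function."""
    return lambda x: reduce(lambda acc, step: step(acc), steps, x)


# Three rungs of the ladder climbed in a single call.
morse_to_person = compose(morse_to_text, text_to_word, name_to_person)
print(morse_to_person("-.-. .- .-. .-.. --- ..."))
```

Dropping a step amounts to leaving a function out of the chain: `compose(text_to_word, name_to_person)` starts one rung higher and treats the text as the given syntax.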

When writing this, my first blog post, I wanted to convey some ideas, some meaning. In order to express them, I had to use words, words such as abstraction, representation, etc., that might evoke different associations and meanings for different readers. I hope I have been clear enough in the selection of the arguments, comparisons, and examples (or should I say of the words and figures?) to convey my ideas as well as possible.

1 comment:

Simon Grant said...

Great post, and I'm a little late leaving a comment :-) -- and that's only because Carlos has commented this recent one of mine. I'd be interested in really getting to grips with this, as I clearly don't think it's quite so much a lost cause...

Thanks anyway!