Semantic links are one of the keys to success in digital asset management and content management. Computer-based search operations have historically been based on character matching: a search for “building” finds content tagged with “building.”
One limitation of this approach is that the search engine has no real understanding of the user’s intentions or expectations. For example, was the user thinking of building in the verb or the noun sense? Was she looking for content related to physical structures or was she looking for content showing a child constructing a fort?
In order to provide meaningful search results, the search engine must have a semantic understanding of the user’s meaning. There are a few ways in which this can happen:
- The user can provide additional terms in the search
- The search engine can consider previous searches by the user to get a sense of intention
- The search engine can consider public trends or popularity to favor one interpretation over another
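As a rough illustration, the three signals above could be combined into a simple score for each possible meaning of “building.” This is only a sketch: the cue words, the weights and the popularity figures are invented for the example and do not describe how any real search engine works.

```python
from collections import Counter

# Hypothetical cue words for each sense of the ambiguous term "building".
SENSES = {
    "noun": {"architecture", "skyscraper", "office", "structure"},
    "verb": {"fort", "constructing", "child", "making"},
}

# Hypothetical popularity prior: how often searchers mean each sense.
POPULARITY = {"noun": 0.7, "verb": 0.3}


def guess_sense(query_terms, previous_searches=()):
    """Pick the most likely sense from extra terms, history and popularity."""
    scores = Counter()
    for sense, cues in SENSES.items():
        # 1. Additional terms in the current search carry the most weight.
        scores[sense] += 2.0 * len(cues & set(query_terms))
        # 2. Terms from the user's earlier searches carry less weight.
        for past in previous_searches:
            scores[sense] += 1.0 * len(cues & set(past))
        # 3. Popularity decides when no other signal is available.
        scores[sense] += POPULARITY[sense]
    return max(scores, key=scores.get)


print(guess_sense(["building"]))
print(guess_sense(["building", "fort"]))
print(guess_sense(["building"], previous_searches=[["child", "fort"]]))
```

With no other information, the sketch falls back on popularity and picks the noun sense. Adding “fort” to the search, or having searched for “child fort” earlier, tips it toward the verb sense.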
Google uses additional factors, such as the searching user’s location and local time of search, to establish semantic understanding. For example, if you type pizza into Google around lunchtime, the engine’s assumption is that you want one, not that you want to find out how many are sold each year, or learn to make your own. Based on your location at the time of the search, Google prioritizes results near you.
You can influence Google’s semantic understanding of your search by providing additional terms, such as “how to make pizza” or “origin of pizza.” With these few additional terms, you “educate” Google about your intentions and expectations.
Some content systems enable users to make semantic connections between content, between content and terms (tags), or between terms themselves. Search engines can consider these links in making search suggestions, “more articles like this” suggestions or similar guidance.
There are two factors involved with making this possible in a content system:
- What options are there for users to make semantic connections?
- How can the system’s search engine use these relationships to guide users?
Say, for example, you had photos that were taken at an event in Berlin. Unique to those photos might be the event name and date. But when it comes to location, perhaps you assign the tag Berlin from a list of cities that has already been established.
Now, assume that the Berlin tag has its own tag descriptors, such as Germany (the country) and German (the prevailing local language, chosen from a list of known languages). By extension, say that the Germany tag has a tag descriptor for Europe.
Without specifically assigning German, Germany or Europe to the event photo, those terms are still accessible to search operations. For example, “events in Europe” might find the Berlin event photos; likewise, “events where German is spoken” might also find them.
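The Berlin example can be sketched in a few lines of code. The following assumes a content system stores tag descriptors as a simple lookup table. The names `TAG_DESCRIPTORS`, `expand_tags` and `search`, along with the photo file name, are hypothetical and not taken from any particular product.

```python
# Hypothetical tag descriptors mirroring the Berlin example.
TAG_DESCRIPTORS = {
    "Berlin": {"Germany", "German"},
    "Germany": {"Europe"},
}

# Hypothetical asset: only the Berlin tag was assigned by the user.
PHOTOS = {"berlin_event_001.jpg": {"Berlin"}}


def expand_tags(assigned):
    """Return the assigned tags plus every tag reachable through descriptors."""
    seen = set()
    stack = list(assigned)
    while stack:
        tag = stack.pop()
        if tag in seen:
            continue  # already visited; also guards against circular links
        seen.add(tag)
        stack.extend(TAG_DESCRIPTORS.get(tag, ()))
    return seen


def search(tag):
    """Find photos whose assigned or inherited tags include the given tag."""
    return sorted(
        name for name, tags in PHOTOS.items() if tag in expand_tags(tags)
    )


print(search("Europe"))
print(search("France"))
```

A search for Europe finds the photo even though only Berlin was assigned, because the lookup follows Berlin to Germany and Germany to Europe. A search for France finds nothing.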
Again, what is important to remember about the value of semantics in content systems is that, in addition to making it possible for users to define relationships, the search engine must be “smart” enough to consider the user’s search criteria in the context of all available terms.
“Events where German is spoken” requires that the search engine understand that, in this context, German would be a language. But German is also a nationality and a descriptor of culture, among other things. German Architecture, for example, doesn’t describe buildings that speak German or carry German passports.
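One way a system might keep these meanings apart is to label each link with the kind of relationship it represents, so that German as a language is never confused with Germany as a place. The sketch below assumes such labeled links exist. The relation names `located_in` and `language_spoken` and the event names are made up for illustration.

```python
# Hypothetical typed links: (tag, relation) -> related tags.
LINKS = {
    ("Berlin", "located_in"): {"Germany"},
    ("Berlin", "language_spoken"): {"German"},
    ("Germany", "located_in"): {"Europe"},
}

# Hypothetical events, each tagged with the city where it took place.
EVENTS = {"Berlin launch": "Berlin", "Paris show": "Paris"}


def related(tag, relation):
    """Return the tags linked to `tag` by one specific kind of relationship."""
    return LINKS.get((tag, relation), set())


def events_where_spoken(language):
    """Find events whose city has `language` as a spoken-language link."""
    return sorted(
        name
        for name, city in EVENTS.items()
        if language in related(city, "language_spoken")
    )


print(events_where_spoken("German"))
print(related("Berlin", "located_in"))
```

Because the query only follows `language_spoken` links, “events where German is spoken” matches the Berlin event through its language link and ignores the location links entirely.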
Semantics provides the machine with context. When you ask a child, “what did you do in school today?” you are not looking to hear, “I walked and listened and looked and played and learned.” You have a clear idea of the kind of response you are expecting and—sarcasm and misbehavior aside—so does your child.
As humans, we are natively aware of semantics in communication. When semantic understanding is missing in a conversation, we ask for clarification. Machines must derive clarification in other ways.
This excerpt from Picturepark’s Routing Digital Content through the Enterprise is part of a multi-part blog series that features sections of the complete document.
Automated metadata tagging is becoming more common in content management. The premise of such a technology is alluring: Send an image or video to a service and have it send back a selection of tags that describe the content.