Simple AMG Interface specification (SAmgI)

The full description of the Simple AMG Interface can be found here.

Explanation of some of the used terminology

MetadatasourceId and ContextBasedGenerator

A MetadatasourceId represents a source of metadata. More specifically, it uniquely identifies a learning object in a certain context. Such a context can be, for example, a course in a learning management system or a file in a file system.

Each MetadatasourceId has a corresponding ContextBasedGenerator. The MetadatasourceId class is used to create and initialize the matching ContextBasedGenerator (selected by the name of the MetadatasourceId), which can then create metadata for the learning object as it is used in that context.
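As a rough illustration of this pairing, the sketch below shows a MetadatasourceId that selects and initializes its generator by name. It is written in Python for brevity and is not taken from the specification: the class layout, the GENERATORS registry, the names "File" and "Course", the location field, and the returned metadata fields are all assumptions made for the example.

```python
from dataclasses import dataclass


class ContextBasedGenerator:
    """Hypothetical base class: creates metadata for an object in one context."""

    def __init__(self, location: str) -> None:
        self.location = location

    def generate(self) -> dict:
        raise NotImplementedError


class FileGenerator(ContextBasedGenerator):
    def generate(self) -> dict:
        # Illustrative only: derive a title from the last path segment.
        name = self.location.rsplit("/", 1)[-1]
        return {"title": name, "context": "file system"}


class CourseGenerator(ContextBasedGenerator):
    def generate(self) -> dict:
        return {"title": self.location, "context": "learning management system"}


# Assumed registry that maps MetadatasourceId names to generator classes.
GENERATORS = {"File": FileGenerator, "Course": CourseGenerator}


@dataclass(frozen=True)
class MetadatasourceId:
    name: str      # selects the generator, e.g. "File"
    location: str  # identifies the learning object within that context

    def create_generator(self) -> ContextBasedGenerator:
        """Create and initialize the generator that matches this id's name."""
        try:
            generator_class = GENERATORS[self.name]
        except KeyError:
            raise ValueError(f"unsupported MetadatasourceId: {self.name}") from None
        return generator_class(self.location)


source = MetadatasourceId("File", "/courses/algebra/intro.pdf")
metadata = source.create_generator().generate()
print(metadata)  # {'title': 'intro.pdf', 'context': 'file system'}
```

Keeping the name-to-generator mapping in one registry means a server can report its supported names simply by listing the registry keys.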

The MetadatasourceIds and ContextBasedGenerators vary from server to server: server A may support different MetadatasourceIds than server B. We therefore provide an operation, getSupportedMetadatasourceIds, that retrieves all supported MetadatasourceIds.


Conflict handling and merging information

The idea behind this specification for automatic metadata generation is that the final metadata instance is a collection of parts generated by different metadata generators. These generators can be, for example, different classes in one application. They can also be several web applications, residing on different servers, that each return part of the metadata, which is then combined, e.g. by the client or by another web application.

Because each generator produces only part of the metadata, these subsets have to be combined into one resulting metadata record for the learning object. Because the subsets can overlap, conflicts may arise between the generators, and these have to be resolved. There are several strategies for resolving conflicts; depending on the element, one strategy might work better than another.

  1. One option is to include all the values in the resulting set. This is the easiest to implement and might be feasible for some metadata elements. For example, a list of concepts could contain all the keywords extracted by several generators. In some systems, however, the metadata set is strictly defined, so this cannot serve as an overall strategy for all elements.
  2. A second option is to ask the user how the merge should happen. This can work in a small system with only a few new entries per week or month. In larger systems, however, we would lose the benefits of automatic metadata generation, as the user has to spend time checking all the values and deciding which one to use.
  3. A third option is to find out which generator is most likely to be correct and use its value in the result. In this case, every generated value gets an associated value that expresses the generator's degree of certainty about it. In our framework we call this the confidence value. Every generator determines such a value for the metadata elements it generates. In case of conflict, this strategy prefers the value with the higher confidence value.
  4. A fourth option is to apply heuristics to decide on the value. This option applies only when heuristics are known for the metadata element concerned; the heuristic then resolves the conflict. An example of such an element is the document language. Many language families exist, and within a family the differences between languages can be very small. For example, Italian and Catalan are closely related but are different languages; the same applies to Afrikaans and Dutch. If one metadata generator decides the language is Catalan and another decides it is Italian, the heuristic might say to use Italian. Whether the document is used in an Italian or a Catalan environment, the users will understand the contents and thus be able to use the object. Catalan could be the more precise value for the document language, but Italian is not wrong.
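The automatic options (1, 3 and 4) can be sketched as per-element strategies applied while merging the generators' output. This is a minimal Python sketch and not the framework's implementation: the GeneratedValue record, the function names, the 0-to-1 confidence scale, the generator names, and the RELATED_LANGUAGES table are all assumptions introduced for illustration. Option 2 (asking the user) is interactive and is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GeneratedValue:
    element: str       # metadata element name, e.g. "language"
    value: str
    confidence: float  # generator's certainty, assumed here to lie in [0, 1]
    generator: str     # which generator produced the value


Strategy = Callable[[List[GeneratedValue]], List[str]]


def keep_all(candidates: List[GeneratedValue]) -> List[str]:
    """Option 1: include every distinct value, in first-seen order."""
    seen: List[str] = []
    for candidate in candidates:
        if candidate.value not in seen:
            seen.append(candidate.value)
    return seen


def highest_confidence(candidates: List[GeneratedValue]) -> List[str]:
    """Option 3: keep only the value with the highest confidence."""
    return [max(candidates, key=lambda c: c.confidence).value]


# Hypothetical heuristic table: when the proposed values are exactly one of
# these groups of related languages, use the designated value.
RELATED_LANGUAGES = {frozenset({"ca", "it"}): "it", frozenset({"af", "nl"}): "nl"}


def language_heuristic(candidates: List[GeneratedValue]) -> List[str]:
    """Option 4: resolve a disagreement between related languages by rule."""
    proposed = frozenset(c.value for c in candidates)
    if proposed in RELATED_LANGUAGES:
        return [RELATED_LANGUAGES[proposed]]
    return highest_confidence(candidates)  # no heuristic known: fall back


def merge(values: List[GeneratedValue],
          strategies: Dict[str, Strategy],
          default: Strategy = highest_confidence) -> Dict[str, List[str]]:
    """Group values by element and resolve each group with its strategy."""
    grouped: Dict[str, List[GeneratedValue]] = {}
    for value in values:
        grouped.setdefault(value.element, []).append(value)
    return {element: strategies.get(element, default)(candidates)
            for element, candidates in grouped.items()}


values = [
    GeneratedValue("keyword", "algebra", 0.6, "text-miner"),
    GeneratedValue("keyword", "matrices", 0.5, "title-parser"),
    GeneratedValue("keyword", "algebra", 0.4, "title-parser"),
    GeneratedValue("title", "Intro to Algebra", 0.9, "title-parser"),
    GeneratedValue("title", "intro.pdf", 0.3, "file-generator"),
    GeneratedValue("language", "ca", 0.7, "text-miner"),
    GeneratedValue("language", "it", 0.6, "course-context"),
]
record = merge(values, {"keyword": keep_all, "language": language_heuristic})
print(record)
```

In this example the keywords are pooled, the title with the higher confidence wins, and the language rule picks "it" even though "ca" has the higher confidence, which mirrors the Catalan/Italian case described above.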

To accommodate these options, we introduced what we call "conflict handling methods" (or strategies).



The operations

Authentication and Session Management

createSession

Creates a session

createAnonymousSession

Creates an anonymous session (without requiring an account at the system where the session will be created)

destroySession

Destroys a session

Simple AMG Interface

setMetadataFormat

Sets the metadata format that will be used for the generated metadata

getMetadataFormat

Gets the current metadata format

setConflictHandlingMethod

Sets the method that is used to solve conflicts in the generated metadata

getConflictHandlingMethod

Gets the current conflict handling method

getSupportedMetadataFormats

Retrieves all supported metadata formats

getSupportedConflictHandlingMethods

Retrieves all supported conflict handling methods

getSupportedMetadatasourceIds

Retrieves a list with the names of the supported MetadatasourceIds. Based on these names, the concrete implementations determine how each MetadatasourceId is handled.

getCheckRelatedMetadatasourceIds

Returns true if related MetadatasourceIds have to be retrieved in the metadata generation process; otherwise false.

setCheckRelatedMetadatasourceIds

Sets whether related MetadatasourceIds are looked up during metadata generation (true or false).

getMetadata

Generates and returns metadata for a given learning object

getMetadataWithMergingInformation

Generates and returns metadata together with merging information for a given learning object (together these make up what we call AmgMetadata)

convertMetadata

Converts an AmgMetadata instance to another metadata format
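
To show how the operations might fit together in one session, the sketch below runs a typical call sequence against a toy in-memory stand-in. Only the operation names come from the list above. The Python signatures, the session token, the format names "LOM" and "DublinCore", the conflict handling method names, and the shape of the returned metadata are all assumptions for illustration; the full description of the interface remains the authority on parameters and return types.

```python
import uuid


class InMemorySamgi:
    """Toy stand-in for a SAmgI server; every signature here is an assumption."""

    FORMATS = ["LOM", "DublinCore"]                # placeholder format names
    METHODS = ["highest-confidence", "keep-all"]   # placeholder method names

    def __init__(self) -> None:
        self._sessions = {}

    def _settings(self, session_id: str) -> dict:
        try:
            return self._sessions[session_id]
        except KeyError:
            raise ValueError("unknown or destroyed session") from None

    def createAnonymousSession(self) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {"format": self.FORMATS[0],
                                      "conflicts": self.METHODS[0]}
        return session_id

    def destroySession(self, session_id: str) -> None:
        self._settings(session_id)  # reject unknown sessions
        del self._sessions[session_id]

    def getSupportedMetadataFormats(self, session_id: str) -> list:
        self._settings(session_id)
        return list(self.FORMATS)

    def setMetadataFormat(self, session_id: str, metadata_format: str) -> None:
        if metadata_format not in self.FORMATS:
            raise ValueError(f"unsupported format: {metadata_format}")
        self._settings(session_id)["format"] = metadata_format

    def getMetadataFormat(self, session_id: str) -> str:
        return self._settings(session_id)["format"]

    def setConflictHandlingMethod(self, session_id: str, method: str) -> None:
        if method not in self.METHODS:
            raise ValueError(f"unsupported conflict handling method: {method}")
        self._settings(session_id)["conflicts"] = method

    def getMetadata(self, session_id: str, source_name: str, location: str) -> dict:
        settings = self._settings(session_id)
        # A real server would run its generators here; this only echoes settings.
        return {"format": settings["format"],
                "conflictHandling": settings["conflicts"],
                "source": source_name,
                "title": location.rsplit("/", 1)[-1]}


server = InMemorySamgi()
session = server.createAnonymousSession()
try:
    # Check what the server supports before configuring the session.
    if "DublinCore" in server.getSupportedMetadataFormats(session):
        server.setMetadataFormat(session, "DublinCore")
    server.setConflictHandlingMethod(session, "keep-all")
    result = server.getMetadata(session, "File", "/courses/algebra/intro.pdf")
finally:
    server.destroySession(session)  # always release the session
print(result)
```

The method names keep the operations' camelCase spelling so that each call can be traced back to the list above, even though that is not the usual Python naming style.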