March 29, 2016

Multiple Representations in KeLP

KeLP natively supports a multiple-representation formalism. This is useful, for example, when the same data can be described through different observable properties. In NLP, one can derive features of a sentence at different syntactic levels (e.g., part-of-speech, chunk, dependency) and treat each of them in a learning algorithm with a different kernel function.

As an example, consider the following representation:
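A line in KeLP's dataset format matching the description below might look like the following sketch (the feature names, values, and separators are purely illustrative; the identifiers after the colons, such as bow and dense, are assumed names that let a kernel later refer to a specific representation):

```
service |BV:bow| want:1.0 to:1.0 activate:1.0 the:1.0 service:1.0 |EV| |BDV:dense| 0.3,0.7,0.1 |EDV| |BS:comment1| a first comment |ES| |BS:comment2| a second comment |ES|
```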

It is composed of:

  1. a label (i.e., the class to be learned; here, service).
  2. a Sparse vector, whose boundaries are delimited by the special tokens |BV| |EV|; in this example, a bag-of-words is used. Note that features can be strings!
  3. a Dense vector, whose boundaries are delimited by the special tokens |BDV| |EDV|.
  4. two String representations, delimited by |BS| |ES|; in this case they are used for comments.

A multiple kernel learning algorithm can be applied to this kind of example. Let's look at the code (the full class can be found on GitHub):

The first part loads a dataset, prints some statistics, and defines the basic objects for our learning procedure.
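A minimal sketch of this step, assuming KeLP's SimpleDataset API; the file paths are illustrative, and the snippet is meant to live in a main method that declares throws Exception:

```java
import it.uniroma2.sag.kelp.data.dataset.SimpleDataset;
import it.uniroma2.sag.kelp.data.label.StringLabel;

// Load the training and test data (file paths are illustrative).
SimpleDataset trainingSet = new SimpleDataset();
trainingSet.populate("train.klp");
SimpleDataset testSet = new SimpleDataset();
testSet.populate("test.klp");

// The class to be learned, as it appears in the dataset.
StringLabel positiveClass = new StringLabel("service");

// Print some statistics.
System.out.println("Training examples: " + trainingSet.getNumberOfExamples());
System.out.println("Positive examples: "
		+ trainingSet.getNumberOfPositiveExamples(positiveClass));
System.out.println("Negative examples: "
		+ trainingSet.getNumberOfNegativeExamples(positiveClass));
```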

The kernel function is the only component that has knowledge of the representation it operates on. To use multiple representations, each with a specific kernel function, we must specify for each kernel which representation to use. Note that, to obtain comparable scores from different kernels, we normalize each kernel by applying a NormalizationKernel.
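A sketch of this step; the representation names bow and dense are the assumed identifiers from the example line above:

```java
import it.uniroma2.sag.kelp.kernel.Kernel;
import it.uniroma2.sag.kelp.kernel.standard.NormalizationKernel;
import it.uniroma2.sag.kelp.kernel.vector.LinearKernel;

// A linear kernel operating on the sparse bag-of-words representation...
Kernel bowKernel = new LinearKernel("bow");
// ...and another one operating on the dense vector representation.
Kernel denseKernel = new LinearKernel("dense");

// Normalize each kernel so that their scores are comparable.
Kernel normalizedBowKernel = new NormalizationKernel(bowKernel);
Kernel normalizedDenseKernel = new NormalizationKernel(denseKernel);
```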

A weighted linear combination of the kernel contributions is obtained by instantiating a LinearKernelCombination and adding each kernel to it, together with its weight. Finally, we set the resulting kernel on the Passive Aggressive algorithm.
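A sketch of this step, assuming the combination is built through addKernel(weight, kernel) and that the kernelized Passive Aggressive implementation exposes setKernel and setLabel; the uniform weights are illustrative:

```java
import it.uniroma2.sag.kelp.kernel.standard.LinearKernelCombination;
import it.uniroma2.sag.kelp.learningalgorithm.classification.passiveaggressive.KernelizedPassiveAggressiveClassification;

// Weighted linear combination of the two normalized kernels.
LinearKernelCombination combination = new LinearKernelCombination();
combination.addKernel(1f, normalizedBowKernel);
combination.addKernel(1f, normalizedDenseKernel);

// Set the combined kernel on the Passive Aggressive learning algorithm.
KernelizedPassiveAggressiveClassification learner =
		new KernelizedPassiveAggressiveClassification();
learner.setKernel(combination);
learner.setLabel(positiveClass);
```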

Then we learn a prediction function and apply it to the test data.
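A sketch of this final step; the accuracy computation is a hand-rolled illustration (assuming a positive score means the positive class is predicted), rather than KeLP's evaluation utilities:

```java
import it.uniroma2.sag.kelp.data.example.Example;
import it.uniroma2.sag.kelp.predictionfunction.classifier.ClassificationOutput;
import it.uniroma2.sag.kelp.predictionfunction.classifier.Classifier;

// Learn on the training set and retrieve the prediction function.
learner.learn(trainingSet);
Classifier classifier = learner.getPredictionFunction();

// Classify each test example and compute the accuracy.
int correct = 0;
for (Example e : testSet.getExamples()) {
	ClassificationOutput prediction = classifier.predict(e);
	// Assumption: a non-negative score indicates the positive class.
	boolean predictedPositive = prediction.getScore(positiveClass) >= 0;
	if (predictedPositive == e.isExampleOf(positiveClass)) {
		correct++;
	}
}
System.out.println("Accuracy: " + ((float) correct) / testSet.getNumberOfExamples());
```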