| Constructor | Description |
|---|---|
| `SimpleDataset()` | Initializes an empty dataset. |

| Modifier and Type | Method | Description |
|---|---|---|
| `void` | `addExample(Example example)` | Adds an example to the dataset. |
| `void` | `addExamples(Dataset datasetToBeAdded)` | Adds all the examples contained in `datasetToBeAdded`. |
| `static Dataset` | `extractExamplesOfClasses(Dataset dataset, List<Label> labels)` | Extracts the examples of the given `labels` from `dataset`. |
| `List<Label>` | `getClassificationLabels()` | Returns all the classification labels in the dataset. |
| `Example` | `getExample(int exampleIndex)` | Returns the example stored in the `exampleIndex` position. |
| `List<Example>` | `getExamples()` | Returns a list containing all the stored examples. |
| `Example` | `getNextExample()` | Returns the next `Example` stored in the Dataset. |
| `List<Example>` | `getNextExamples(int n)` | Returns the next `n` `Example`s stored in the Dataset, or fewer if `n` examples are not available. |
| `int` | `getNumberOfExamples()` | Returns the number of `Example`s in the dataset. |
| `int` | `getNumberOfNegativeExamples(Label positiveClass)` | Returns the number of negative `Example`s of a given class. |
| `int` | `getNumberOfPositiveExamples(Label positiveClass)` | Returns the number of positive `Example`s of a given class. |
| `Example` | `getRandExample()` | |
| `List<Example>` | `getRandExamples(int k)` | |
| `List<Label>` | `getRegressionProperties()` | Returns all the regression properties in the dataset. |
| `SimpleDataset` | `getShuffledDataset()` | |
| `Vector` | `getZeroVector(String representationIdentifier)` | Returns a zero vector compliant with the representation identified by `representationIdentifier`. |
| `boolean` | `hasNextExample()` | Returns a boolean indicating whether there are more `Example`s in the dataset. |
| `Dataset[]` | `nFolding(int n)` | Returns `n` datasets, each one a fold of this dataset. |
| `Dataset[]` | `nFoldingClassDistributionInvariant(int n)` | Returns `n` datasets, each one a fold that maintains the original class distribution. |
| `void` | `normalizeExamples()` | Forces every representation of every example to be a unit vector in its explicit feature space. |
| `void` | `populate(String filename)` | Populates the dataset by reading it from a platform-compliant file. |
| `void` | `reset()` | Resets the reading pointer. |
| `void` | `setSeed(long seed)` | Sets the seed of the random generator used for shuffling examples and for getting random examples. |
| `void` | `shuffleExamples(Random randomGenerator)` | Shuffles the examples in the dataset. |
| `Dataset[]` | `split(float percentage)` | Returns two datasets created by splitting this dataset according to `percentage`. |
| `Dataset[]` | `splitClassDistributionInvariant(float percentage)` | Returns two datasets created by splitting this dataset according to `percentage`, maintaining the original class distribution. |
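
The difference between split and splitClassDistributionInvariant can be illustrated with plain Java collections. The sketch below is not the library's implementation: the names SplitSketch, Item, and count are hypothetical, and the rounding rule used for the cut point is an assumption. It only mirrors the documented behaviour that examples are split according to their order and that percentage is a number in [0,1].

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: not the SimpleDataset implementation.
class SplitSketch {

    // Hypothetical stand-in for an Example: a class label plus an id.
    static final class Item {
        final String label;
        final int id;

        Item(String label, int id) {
            this.label = label;
            this.id = id;
        }
    }

    // Order-based split: the first `percentage` fraction of the items goes to
    // the first part, the rest to the second. The rounding rule is an assumption.
    static List<List<Item>> split(List<Item> items, float percentage) {
        if (percentage < 0f || percentage > 1f) {
            throw new IllegalArgumentException("percentage must be in [0,1]");
        }
        int cut = Math.round(items.size() * percentage);
        List<List<Item>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(items.subList(0, cut)));
        parts.add(new ArrayList<>(items.subList(cut, items.size())));
        return parts;
    }

    // Class-distribution-invariant split: group the items by label, then apply
    // the same order-based cut inside every class.
    static List<List<Item>> splitClassDistributionInvariant(List<Item> items, float percentage) {
        Map<String, List<Item>> byLabel = new LinkedHashMap<>();
        for (Item item : items) {
            byLabel.computeIfAbsent(item.label, k -> new ArrayList<>()).add(item);
        }
        List<Item> first = new ArrayList<>();
        List<Item> second = new ArrayList<>();
        for (List<Item> group : byLabel.values()) {
            List<List<Item>> parts = split(group, percentage);
            first.addAll(parts.get(0));
            second.addAll(parts.get(1));
        }
        List<List<Item>> result = new ArrayList<>();
        result.add(first);
        result.add(second);
        return result;
    }

    // Counts the items carrying the given label.
    static int count(List<Item> items, String label) {
        int total = 0;
        for (Item item : items) {
            if (item.label.equals(label)) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Eight "pos" items followed by two "neg" items.
        List<Item> items = new ArrayList<>();
        for (int i = 0; i < 8; i++) items.add(new Item("pos", i));
        for (int i = 8; i < 10; i++) items.add(new Item("neg", i));

        List<Item> plainFirst = split(items, 0.5f).get(0);
        List<Item> invariantFirst = splitClassDistributionInvariant(items, 0.5f).get(0);

        System.out.println(count(plainFirst, "neg"));     // prints 0
        System.out.println(count(invariantFirst, "neg")); // prints 1
    }
}
```

With the examples sorted by class, the plain split puts no "neg" item in the first part, while the class-distribution-invariant variant keeps the 4:1 ratio in both parts. This is why the order of the examples matters when choosing between the two methods.
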

public void addExample(Example example)
Adds an example to the dataset.
Specified by: addExample in interface Dataset
Parameters: example - the example to be added

public void addExamples(Dataset datasetToBeAdded)
Adds all the examples contained in datasetToBeAdded.
Parameters: datasetToBeAdded - the dataset containing all the examples to be added

public Example getExample(int exampleIndex)
Returns the example stored in the exampleIndex position.
Parameters: exampleIndex - the index of the example to return
Returns: the example stored in the exampleIndex position

public boolean hasNextExample()
Returns a boolean indicating whether there are more Examples in the dataset.
Specified by: hasNextExample in interface Dataset
Returns: true if and only if there is at least another Example in the dataset

public Example getNextExample()
Returns the next Example stored in the Dataset.
Specified by: getNextExample in interface Dataset
Returns: the next Example

public List<Example> getNextExamples(int n)
Returns the next n Examples stored in the Dataset, or fewer if n examples are not available.
Specified by: getNextExamples in interface Dataset
Parameters: n - the number of examples to be returned
Returns: the next n Examples

public void reset()
Resets the reading pointer.

public int getNumberOfPositiveExamples(Label positiveClass)
Returns the number of positive Examples of a given class.
Specified by: getNumberOfPositiveExamples in interface Dataset
Parameters: positiveClass - the class whose number of positive Examples is required
Returns: the number of positive Examples of positiveClass

public int getNumberOfNegativeExamples(Label positiveClass)
Returns the number of negative Examples of a given class.
Specified by: getNumberOfNegativeExamples in interface Dataset
Parameters: positiveClass - the class whose number of negative Examples is required
Returns: the number of negative Examples of positiveClass

public int getNumberOfExamples()
Returns the number of Examples in the dataset.
Specified by: getNumberOfExamples in interface Dataset
Returns: the number of Examples in the dataset

public List<Label> getClassificationLabels()
Returns all the classification labels in the dataset.
Specified by: getClassificationLabels in interface Dataset

public List<Label> getRegressionProperties()
Returns all the regression properties in the dataset.
Specified by: getRegressionProperties in interface Dataset

public void shuffleExamples(Random randomGenerator)
Shuffles the examples in the dataset.
Parameters: randomGenerator - a random number generator

public Dataset[] splitClassDistributionInvariant(float percentage)
Returns two datasets created by splitting this dataset according to percentage. The original distribution of the examples among the classes is maintained in the two datasets. The examples are split according to their order: the first dataset consists of the first percentage fraction of the examples of each class, while the second dataset consists of all the remaining examples.
Parameters: percentage - should be a number in [0,1]

public Dataset[] split(float percentage)
Returns two datasets created by splitting this dataset according to percentage. The examples are split according to their order, without maintaining the original data distribution among the classes: the first dataset consists of the first percentage fraction of the examples, while the second dataset consists of all the remaining examples.
Parameters: percentage - should be a number in [0,1]

public Dataset[] nFoldingClassDistributionInvariant(int n)
Returns n datasets. Each dataset is a fold storing 1/n of the total examples. The folds do not overlap and maintain the original distribution of the examples among the classes. The examples in this dataset are split into n folds according to their order, so that, for instance, the first fold has all the first examples of each class.
Parameters: n - the number of folds to create
Returns: n datasets, each one consisting of 1/n of the examples

public Dataset[] nFolding(int n)
Returns n datasets. Each dataset is a fold storing 1/n of the total examples. The folds do not overlap and do not maintain the original distribution of the examples among the classes. The examples in this dataset are split into n folds according to their order, so that, for instance, the first fold has all the first examples of each class. Once they have been split into n folds, the examples in each fold are shuffled.
Parameters: n - the number of folds to create
Returns: n datasets, each one consisting of 1/n of the examples

public List<Example> getExamples()
Returns a list containing all the stored examples.
Specified by: getExamples in interface Dataset

public void normalizeExamples()
Forces every representation of every example to be a unit vector in its explicit feature space.
Note: some representations cannot be normalized (for instance, a TreeRepresentation).
Specified by: normalizeExamples in interface Dataset

public static Dataset extractExamplesOfClasses(Dataset dataset, List<Label> labels) throws InstantiationException, IllegalAccessException
Extracts the examples of the given labels from dataset.
Parameters: dataset - original dataset; labels - labels of interest
Throws: IllegalAccessException, InstantiationException

public void populate(String filename) throws Exception
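Populates the dataset by reading it from a platform-compliant file.
Parameters: filename - the path of the file to be read
Throws: Exception

public Example getRandExample()
Specified by: getRandExample in interface Dataset

public List<Example> getRandExamples(int k)
Specified by: getRandExamples in interface Dataset
Parameters: k - the number of examples to be returned
Returns: k random examples

public SimpleDataset getShuffledDataset()
Specified by: getShuffledDataset in interface Dataset

public void setSeed(long seed)
Sets the seed of the random generator used for shuffling examples and for getting random examples.

public Vector getZeroVector(String representationIdentifier)
Returns a zero vector compliant with the representation identified by representationIdentifier.
Specified by: getZeroVector in interface Dataset
Parameters: representationIdentifier - the identifier of the representation
Returns: a zero vector compliant with the representation identified by representationIdentifier

Copyright © 2014 Semantic Analytics Group @ Uniroma2. All rights reserved.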