GENDIS

class gendis.genetic.GeneticExtractor(population_size=50, iterations=25, verbose=False, normed=False, add_noise_prob=0.4, add_shapelet_prob=0.4, wait=10, plot=None, remove_shapelet_prob=0.4, crossover_prob=0.66, n_jobs=4)

Feature selection with genetic algorithm.

population_size : int: The number of individuals in the population. Increasing this parameter increases both the runtime per generation and the probability of finding a good solution.
iterations : int: The maximum number of generations the algorithm may run.
wait : int: If no improvement has been found for wait consecutive iterations, the algorithm stops early.
add_noise_prob : float: The chance that Gaussian noise is added to a random shapelet from a random individual every generation.
add_shapelet_prob : float: The chance that a shapelet is added to a random shapelet set every generation.
remove_shapelet_prob : float: The chance that a shapelet is deleted from a random shapelet set every generation.
crossover_prob : float: The chance of crossing over two shapelet sets every generation.
normed : boolean: Whether we first have to normalize before calculating distances
n_jobs : int: The number of threads to use
verbose : boolean: Whether to print some statistics in every generation
plot : object: Whether to plot the individuals every generation (if the population size is smaller than or equal to 20), or to plot the fittest individual
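
The interplay of the stopping parameters and the operator probabilities can be sketched in plain Python. This is a toy illustration under stated assumptions, not the GENDIS implementation: the helper names `choose_operators` and `run_generations` are hypothetical, each operator is assumed to be gated by an independent random draw, and "improvement" is assumed to mean a strictly higher best fitness.

```python
import random


def choose_operators(rng, add_noise_prob=0.4, add_shapelet_prob=0.4,
                     remove_shapelet_prob=0.4, crossover_prob=0.66):
    """Return the names of the operators that fire in one generation.

    Hypothetical helper: each operator is gated by its own independent
    draw from [0, 1), which is an assumption about how the probabilities
    are applied.
    """
    probabilities = {
        "add_noise": add_noise_prob,
        "add_shapelet": add_shapelet_prob,
        "remove_shapelet": remove_shapelet_prob,
        "crossover": crossover_prob,
    }
    return [name for name, p in probabilities.items() if rng.random() < p]


def run_generations(fitness_per_generation, iterations=25, wait=10):
    """Toy early-stopping loop over a precomputed list of best fitness values.

    Stops after `iterations` generations, or earlier once the best fitness
    has not improved for `wait` consecutive generations.
    """
    best = float("-inf")
    stale = 0  # generations since the last improvement
    generations_run = 0
    for fitness in fitness_per_generation[:iterations]:
        generations_run += 1
        if fitness > best:
            best, stale = fitness, 0
        else:
            stale += 1
            if stale >= wait:
                break
    return best, generations_run


rng = random.Random(1337)
print(choose_operators(rng))
print(run_generations([0.5, 0.6] + [0.6] * 30, iterations=25, wait=10))
```

With a small `wait`, a run that plateaus early ends well before `iterations` is reached; with a large `wait`, `iterations` becomes the effective limit.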

shapelets : array-like: The fittest shapelet set after evolution
label_mapping : dict: A dictionary that maps the labels to the range [0, …, C-1]
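
As a rough illustration of what such a mapping looks like, the sketch below builds a dictionary from arbitrary class labels to consecutive integers. The helper name `build_label_mapping` is hypothetical, and assigning indices in sorted label order is an assumption; the order GENDIS actually uses is not stated here.

```python
def build_label_mapping(y):
    """Map each distinct label in y to an integer in [0, ..., C-1].

    Assumption: indices follow the sorted order of the distinct labels.
    """
    return {label: index for index, label in enumerate(sorted(set(y)))}


y = ["walk", "run", "walk", "sit"]
mapping = build_label_mapping(y)
encoded = [mapping[label] for label in y]  # integer-encoded labels
print(mapping)
print(encoded)
```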

An example showing genetic shapelet extraction on a simple dataset:

>>> from tslearn.generators import random_walk_blobs
>>> from gendis.genetic import GeneticExtractor
>>> from sklearn.linear_model import LogisticRegression
>>> import numpy as np
>>> np.random.seed(1337)
>>> X, y = random_walk_blobs(n_ts_per_blob=20, sz=64, noise_level=0.1)
>>> X = np.reshape(X, (X.shape[0], X.shape[1]))
>>> extractor = GeneticExtractor(iterations=5, n_jobs=1, population_size=10)
>>> distances = extractor.fit_transform(X, y)
>>> lr = LogisticRegression()
>>> _ = lr.fit(distances, y)
>>> lr.score(distances, y)
1.0

Methods

`__init__`([population_size, iterations, …])	Initialize self.
`fit`(X, y)	Extract shapelets from the provided timeseries and labels.
`transform`(X)	After fitting the extractor, transform collections of timeseries into matrices with distances to each of the shapelets in the evolved shapelet set.
`fit_transform`(X, y)	Combine both the fit and transform method in one.
`save`(path)	Write all hyper-parameters and discovered shapelets to disk.
`load`(path)	Instantiate a saved GeneticExtractor
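
To make the output of `transform` more concrete, the following sketch computes a timeseries-by-shapelet distance matrix in plain Python. It is an illustration only: the helper names `shapelet_distance` and `distance_matrix` are hypothetical, and the distance is assumed to be the minimum Euclidean distance between the shapelet and any equally long window of the series, without the normalization that `normed=True` would imply. The measure GENDIS uses internally may differ.

```python
import math


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any window of
    the same length in the series (assumed distance measure)."""
    length = len(shapelet)
    best = math.inf
    for start in range(len(series) - length + 1):
        window = series[start:start + length]
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, shapelet)))
        best = min(best, distance)
    return best


def distance_matrix(X, shapelets):
    """One row per timeseries, one column per shapelet."""
    return [[shapelet_distance(series, s) for s in shapelets] for series in X]


X = [[0, 1, 2, 3], [5, 5, 5, 5]]
shapelets = [[1, 2], [5, 5]]
distances = distance_matrix(X, shapelets)
print(distances)
```

Each row of the resulting matrix is a fixed-length feature vector, which is why it can be passed directly to a classifier such as `LogisticRegression` in the example above.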