International Conference on Multivariate Approximation
September 24-27, 2011

# Penalized frame potential for clustering short time series

Tobias Springer
The clustering of short time series is a statistical problem that arises in the analysis of microarray data. In this talk, we give a short geometric interpretation of the commonly used dissimilarity measure $$\mathrm{dis}(p,q) = 1 - \rho(p,q),$$ where $\rho$ is the Bravais-Pearson correlation coefficient between the time series $p,q \in \mathbb{R}^{d+1}$. Based on an approach by Benedetto, Czaja and Ehler [1] for sparse representations of retinal data, we propose to minimize the Penalized Frame Potential $$\min_{\Theta \subset \mathcal{S}^{d-1}} \mathrm{PFP}_{\lambda}(\Theta,Y) = \min_{\Theta \subset \mathcal{S}^{d-1}} \, \frac{\lambda \, d}{m^2} \, \mathrm{TFP}(\Theta) + (1-\lambda) \, m \, \mathrm{P}(\Theta,Y), \quad \lambda \in [0,1],$$ which combines the (Total) Frame Potential $\mathrm{TFP}$ with a data-dependent penalty term $\mathrm{P}$ for given time series data $Y$. We present different penalty terms $\mathrm{P}$, together with results on simulated and real data and a comparison with the Short Time Series Expression Miner (STEM) [2].

[1] J. J. Benedetto, W. Czaja, M. Ehler: Frame potential classification algorithm for retinal data, Springer Proc. Series: IFMBE, 26th Southern Biological Engineering Conference (2010).

[2] J. Ernst, G. J. Nau, Z. Bar-Joseph: Clustering short time series gene expression data, Bioinformatics, Vol. 21 (2005).
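The geometric reading of $\mathrm{dis}$ mentioned in the abstract can be illustrated numerically. The sketch below is not the speaker's implementation; it rests on the following assumptions. Centering a series $p \in \mathbb{R}^{d+1}$ and scaling it to unit norm places it on a unit sphere inside the $d$-dimensional hyperplane of zero-mean vectors, which is one plausible way to arrive at $\mathcal{S}^{d-1}$. On that sphere, $1 - \rho(p,q)$ equals half the squared Euclidean distance between the two normalized series. The frame potential is taken to be the standard Benedetto-Fickus sum of squared inner products, which the abstract does not spell out. The function names `to_sphere`, `dis` and `total_frame_potential` are hypothetical, and the penalty term $\mathrm{P}$ is left out because the abstract does not define it.

```python
import numpy as np


def to_sphere(p):
    """Center a series and scale it to unit Euclidean norm."""
    p = np.asarray(p, dtype=float)
    centered = p - p.mean()
    norm = np.linalg.norm(centered)
    if norm == 0.0:
        # A constant series has zero variance, so its correlation is undefined.
        raise ValueError("constant series has no defined correlation")
    return centered / norm


def dis(p, q):
    """Dissimilarity 1 - rho(p, q), with rho the Pearson correlation."""
    # For centered unit vectors the inner product is the correlation coefficient.
    return 1.0 - float(np.dot(to_sphere(p), to_sphere(q)))


def total_frame_potential(theta):
    """Sum of squared inner products over all ordered pairs of rows of theta.

    Assumption: this is the standard frame potential; the abstract only names TFP.
    """
    theta = np.asarray(theta, dtype=float)
    gram = theta @ theta.T
    return float(np.sum(gram ** 2))


if __name__ == "__main__":
    p = [1.0, 2.0, 3.0, 4.0]
    q = [1.0, 3.0, 2.0, 5.0]
    u, v = to_sphere(p), to_sphere(q)
    # Both printed values agree: dis is half the squared chord length on the sphere.
    print(dis(p, q), 0.5 * np.linalg.norm(u - v) ** 2)
```

Because $\lVert u - v \rVert^2 = 2 - 2\langle u, v \rangle$ for unit vectors $u, v$, clustering by $\mathrm{dis}$ amounts to clustering points on a sphere by Euclidean distance, which is what makes a frame-based description of cluster prototypes $\Theta$ natural.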