Implicit and Explicit Representation of Approximated Motifs

Detecting repeated 3D protein substructures has become a new crucial frontier in motifs inference. In \cite{cpm} we have suggested a possible solution to this problem by means of a new framework in which the repeated pattern is required to be conserved also in terms of relations between its position pairs. In our application these relations are the distances between $\alpha$-carbons of amino acids in 3D proteins structures, thus leading to a \emph{structural consensus} as well. In this paper we motivate some complexity issues claimed (and assumed, but not proved) in \cite{cpm} concerning inclusion tests between occurrences of repeated motifs. These inclusion tests are performed during the motifs inference in \emph{KMRoverlapR} (presented in \cite{cpm}), but also within other motifs inference tools such as \emph{KMRC} (\cite{kmrc}). These involve alternative representations of motifs, for which we also prove here some interesting properties concerning pattern matching issues. We conclude this contribution with a few tests on cytochrome P450 protein structures.