Alessandra Merlotti, Università di Bologna

# Statistical modelling of CG interdistance across multiple organisms

We considered DNA as a symbolic sequence composed of four letters, A, C, G and T, corresponding to the four nitrogenous bases, and we analyzed the interdistance probability distributions of all possible pairs of letters inside the sequence. We first studied a set of 18 different organisms, finding that the CG distribution differs greatly from all the others, especially in mammals. We therefore decided to characterize it by fitting the data to four model functions: an exponential distribution, which is typical of stochastic processes with a characteristic scale; a double exponential distribution, which involves two characteristic scales; a stretched exponential distribution, which is related to relaxation phenomena in complex condensed-matter systems; and a gamma distribution, which is typical of critical phenomena in the presence of a finite-size effect and of stochastic processes with multiplicative noise. According to residual analysis and a goodness-of-fit test based on a resampling technique and Kuiper's statistic, we found that the CG interdistance distribution is well described by a gamma distribution. To understand which process could give rise to such a difference between the CG and non-CG distributions in mammals, we developed a null model showing that, given a non-CG distribution, it is always possible to obtain a new distribution similar to that of CG by randomly changing a specific number of non-CG dinucleotides into another dinucleotide. Finally, we extended this analysis by fitting the CG distributions of 4425 different organisms, belonging to 7 different categories, to a gamma distribution. We found that, plotting the scale parameter as a function of the shape parameter, it is possible to distinguish 7 different clusters, which correspond to the 7 categories considered.
Therefore, we believe that this method can provide new tools for the characterization and classification of organisms, for example in taxonomy.
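
As a rough illustration of the quantities discussed above, the following Python sketch computes the interdistances of a dinucleotide in a sequence and estimates the shape and scale parameters of a gamma distribution. It rests on several assumptions that are not taken from the abstract: the interdistance is defined here as the difference between the start positions of consecutive occurrences, the parameters are estimated by the method of moments (a simple stand-in, since the fitting procedure is not specified), and the input is a uniformly random toy sequence rather than genomic data. The function names `interdistances` and `gamma_moments` are hypothetical.

```python
import random
from statistics import mean, pvariance


def interdistances(sequence: str, dinucleotide: str = "CG") -> list[int]:
    """Distances, in bases, between consecutive occurrences of a dinucleotide.

    Assumption: distance = difference between start positions.
    """
    positions = []
    start = sequence.find(dinucleotide)
    while start != -1:
        positions.append(start)
        start = sequence.find(dinucleotide, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def gamma_moments(distances: list[int]) -> tuple[float, float]:
    """Method-of-moments estimate of the gamma (shape, scale) parameters.

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2, hence the two ratios below.
    """
    m = mean(distances)
    v = pvariance(distances)
    return m * m / v, v / m


# Toy input: a uniformly random sequence, NOT real genomic data.
rng = random.Random(0)
toy_sequence = "".join(rng.choice("ACGT") for _ in range(200_000))

distances = interdistances(toy_sequence)
shape, scale = gamma_moments(distances)
print(f"{len(distances)} distances, shape={shape:.2f}, scale={scale:.2f}")
```

Repeating this for many organisms and plotting each (shape, scale) pair as one point would give a scatter plot of the kind described in the abstract; whether the clusters appear depends on the real sequences and on the fitting method actually used.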