A parallel tool for the identification of differentially methylated regions in genomic analyses
[Abstract] Parallel and High Performance Computing (HPC) has gained attention in recent years as a means to accelerate several kinds of computationally expensive applications. Bioinformatics is one of the fields that benefits from this acceleration, since it demands high computational power to analyse the biological data obtained from experiments. As the cost of obtaining biological data falls, more and more tools that visualize, analyse and extract conclusions from these data are emerging, but they come with high execution times and computational requirements. In particular, methylation analysis is one of the bioinformatics fields that fits this description, since methylation is associated with different biological functions, and abnormal methylation levels can indicate the presence of certain diseases. For instance, the existence of regions with different methylation levels is a common characteristic of several types of cancer. Therefore, discovering differentially methylated regions is an important research field in genomics, as it can help to anticipate the risk of suffering from some diseases. Nevertheless, the high computational cost associated with the discovery of differentially methylated regions prevents its application to large-scale datasets. Hence, a much faster application is required to make further progress in this research field. During this bachelor’s thesis an optimized version of RADMeth, a tool for the identification of differentially methylated regions based on beta-binomial regression, has been developed and adapted to take advantage of the features of HPC systems.
The different optimization techniques were implemented by distributing the workload among the processing elements using domain decomposition, keeping in mind the typical architecture of HPC systems composed of several nodes (each node being a multicore system), so that the novel tool takes advantage of both levels by a hybrid MPI/OpenMP ...
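As an illustration of two-level domain decomposition, the following minimal Python sketch splits a range of work items first among processes (as MPI ranks would) and then among the threads of each process (as OpenMP threads would). It is not taken from the thesis: the static block partition, the function names `block_range` and `two_level_decomposition`, and the example sizes are all assumptions made for illustration, and the actual tool may distribute work differently.

```python
def block_range(n_items: int, n_parts: int, part: int) -> tuple[int, int]:
    """Return the half-open range [start, stop) of items owned by `part`.

    Hypothetical helper: a static block partition in which the first
    `n_items % n_parts` parts receive one extra item each.
    """
    if n_parts <= 0 or not 0 <= part < n_parts:
        raise ValueError("part must satisfy 0 <= part < n_parts")
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def two_level_decomposition(n_items: int, n_processes: int, threads_per_process: int):
    """Map each (process, thread) pair to its half-open range of item indices.

    The outer split stands in for the distribution among nodes or processes,
    the inner split for the distribution among the cores of one node.
    """
    plan = {}
    for rank in range(n_processes):
        p_start, p_stop = block_range(n_items, n_processes, rank)
        for tid in range(threads_per_process):
            t_start, t_stop = block_range(p_stop - p_start, threads_per_process, tid)
            # Thread ranges are local to the process block, so shift by p_start.
            plan[(rank, tid)] = (p_start + t_start, p_start + t_stop)
    return plan


if __name__ == "__main__":
    # Example sizes are arbitrary: 10 items, 2 processes, 2 threads each.
    for owner, (start, stop) in sorted(two_level_decomposition(10, 2, 2).items()):
        print(owner, start, stop)
```

In this sketch every item is assigned to exactly one (process, thread) pair and block sizes differ by at most one item at each level, which is the property a static decomposition relies on to balance equal-cost work.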