Guillaume Devailly authored

PigHeaT RRBS analysis

Analysis of PigHeaT RRBS data.

Whole-blood samples were taken from back-cross pigs, Large White x (Large White x Créole), that were raised in either a temperate or a tropical facility.

Reads were processed by Sonia using Remi's RRBS pipeline: private link to pipeline

01. Metadata cleaning

Link to script

Of the 479 initial RRBS samples, one file was not successfully processed (over-covered, not enough allocated compute resources). Two further libraries had 0 CpG sites covered with at least 10 reads: GC45472 and URZ3149. Metadata was gathered from the PigHeaT sharepoint for the remaining 476 samples, leading to this metadata table.

02. BED import into a BSseq object

Link to script

BED import into a BSseq object was performed on the GenoBioinfo cluster as follows:

sbatch 02_data_import_genobioinfo.R

03. BSseq filtering

Link to script

BSseq filtering was performed on the GenoBioinfo cluster as follows:

sbatch 03_bsseq_cleaning.R

The initial object was a mostly empty matrix of 477 RRBS samples over 3.155.227 strand-merged CpG sites. Only CpG sites with at least 10 reads in more than 80% of the samples (more than 381 samples) were kept, resulting in 221.580 CpG sites. Only RRBS samples with at least 10 reads in more than 30% of the 221.580 CpG sites (more than 66.474 CpG sites) were kept, resulting in 435 RRBS samples. CpG sites overlapping SNPs identified by biscuit were then removed, filtering out an additional 479 CpG sites.

The filtered object consisted of 221.357 CpG sites and 435 RRBS samples.
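The two coverage filters described above can be sketched on a toy matrix. This is only an illustration in plain Python, not the code of 03_bsseq_cleaning.R (which works on a BSseq object in R); the names `min_count`, `filter_coverage` and the toy `coverage` matrix are hypothetical, and the SNP-overlap step is not shown.

```python
# Toy illustration of the two-step coverage filter (hypothetical names and data).
MIN_READS = 10       # a site counts as covered in a sample with >= 10 reads
SITE_PERCENT = 80    # keep sites covered in more than 80% of the samples
SAMPLE_PERCENT = 30  # keep samples covering more than 30% of the kept sites


def min_count(total, percent):
    """Smallest integer count that is strictly more than `percent`% of `total`."""
    return total * percent // 100 + 1


def filter_coverage(cov):
    """cov[i][j] = read depth of CpG site i in sample j.

    Returns the indices of the kept sites and of the kept samples.
    """
    n_sites, n_samples = len(cov), len(cov[0])

    # Step 1: keep sites covered in more than SITE_PERCENT% of the samples.
    site_min = min_count(n_samples, SITE_PERCENT)
    kept_sites = [
        i for i in range(n_sites)
        if sum(depth >= MIN_READS for depth in cov[i]) >= site_min
    ]

    # Step 2: keep samples covering more than SAMPLE_PERCENT% of the kept sites.
    sample_min = min_count(len(kept_sites), SAMPLE_PERCENT)
    kept_samples = [
        j for j in range(n_samples)
        if sum(cov[i][j] >= MIN_READS for i in kept_sites) >= sample_min
    ]
    return kept_sites, kept_samples


# 5 CpG sites x 6 samples; the last sample is poorly covered everywhere.
coverage = [
    [12, 15, 30, 11, 20, 0],
    [12, 15, 30, 11, 10, 3],
    [0, 0, 25, 3, 0, 0],
    [40, 10, 10, 22, 18, 2],
    [10, 9, 14, 12, 3, 0],
]

kept_sites, kept_samples = filter_coverage(coverage)
print(kept_sites, kept_samples)
print(min_count(477, 80), min_count(221580, 30))
```

With the numbers reported above, "more than 80% of 477 samples" means at least 382 samples, and "more than 30% of 221.580 CpG sites" means at least 66.475 sites, which matches the "more than 381" and "more than 66.474" thresholds.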

04. Differential analysis using DSS

Link to script

Average DNA methylation was observed on the 221.357 CpG sites.

Histogram of % DNA methylation at CpG sites

We observed a bimodal distribution of DNA methylation.

The link between the average and the Standard Deviation of the DNA methylation levels was also investigated:

Average versus Standard Deviation of DNA methylation at CpG sites

Only variable CpG sites (SD > 0.05) were kept for the differential analysis, keeping 69.854 CpG sites.
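As a minimal sketch of this variability filter, assuming methylation is expressed as a proportion between 0 and 1, that the SD is the sample standard deviation (as returned by R's sd), and that samples without enough coverage are simply ignored. None of these points is confirmed by the text above, and the names `variable_sites` and `methylation` are hypothetical.

```python
from statistics import stdev

SD_THRESHOLD = 0.05  # keep sites whose methylation SD is above this value


def variable_sites(meth):
    """meth[i][j] = methylation proportion of site i in sample j, or None if missing.

    Returns the indices of sites with a sample SD above SD_THRESHOLD.
    """
    kept = []
    for i, row in enumerate(meth):
        values = [v for v in row if v is not None]  # assumption: missing values are skipped
        if len(values) >= 2 and stdev(values) > SD_THRESHOLD:
            kept.append(i)
    return kept


# Toy data: 4 sites x 4 samples (hypothetical values).
methylation = [
    [0.98, 0.97, 0.99, 0.98],  # almost fully methylated everywhere, low SD
    [0.10, 0.45, 0.80, None],  # highly variable, one missing value
    [0.02, 0.03, 0.02, 0.01],  # almost unmethylated everywhere, low SD
    [0.50, 0.60, 0.40, 0.55],  # intermediate and variable
]

kept = variable_sites(methylation)
print(kept)
```

Sites near 0% or 100% methylation tend to have low SD, so a filter of this kind mostly removes the two modes of the bimodal distribution.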

Differential analysis was performed using DSS v2.54.0 on unsmoothed DNA methylation data, with DMLfit.multiFactor on ~ environment + sex and FDR multiple-testing correction. This resulted in 179 (0.26%) differentially methylated CpG sites (DMC) between the two environments, and 2110 (3.0%) DMC between the sexes. 23.6% (498) of the sex DMC were on the X chromosome.
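To make the FDR step concrete, here is a small sketch of the Benjamini-Hochberg adjustment, assuming this is the correction behind the FDR values (the text above only says "FDR"). The function name `benjamini_hochberg` and the p-values are hypothetical; in the actual analysis the adjusted values come from DSS, not from this code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by increasing p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five CpG sites.
pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(pvals)
n_dmc = sum(q < 0.05 for q in fdr)
print(fdr, n_dmc)

# Proportions reported above, recomputed from the counts.
print(round(100 * 179 / 69854, 2), round(100 * 2110 / 69854, 1), round(100 * 498 / 2110, 1))
```

In this toy example the third and fourth sites have raw p-values below 0.05 but adjusted values just above it, so only two sites would be called at FDR < 0.05.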

DMC were exported as .bed files: env DMC, sex DMC.
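For readers unfamiliar with the format, the sketch below shows one way single CpG positions can be written as BED intervals. BED uses 0-based, half-open coordinates, so a site at 1-based position pos becomes start = pos - 1, end = pos. Whether the exported files use one-base intervals or the two-base strand-merged CpG, and which extra columns they carry, is not stated above; the function `write_bed` and the records are hypothetical.

```python
import io


def write_bed(dmcs, handle):
    """Write (chromosome, 1-based position) pairs as 3-column BED lines."""
    for chrom, pos in dmcs:
        # BED is 0-based and half-open: [pos - 1, pos) covers the single base `pos`.
        handle.write(f"{chrom}\t{pos - 1}\t{pos}\n")


# Hypothetical DMC positions, written to an in-memory buffer instead of a file.
example_dmcs = [("1", 1000), ("X", 25000)]
buffer = io.StringIO()
write_bed(example_dmcs, buffer)
bed_text = buffer.getvalue()
print(bed_text, end="")
```

Files in this form can be loaded directly as tracks in a genome browser.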

IGV tracks

Here are two env DMC examples.

env DMC 1 env DMC 2

Here is one sex DMC example.

sex DMC

The heatmap of sex DMC is ok-ish:

sex heatmap

The heatmap of env DMC is much less clear, due to high inter-individual variability:

env heatmap

(Grey = not enough coverage to compute % meCpG)

Manhattan plots of DMC were also generated, where the red line marks the genome-wide FDR = 0.05 threshold.

env manhattan

sex manhattan