PigHeaT RRBS analysis
Analysis of PigHeaT RRBS data.
Whole-blood samples were taken from back-cross pigs, Large White x (Large White x Créole), that were raised in either a temperate or a tropical facility.
Reads were processed by Sonia using Remi's RRBS pipeline: private link to pipeline
01. metadata cleaning
Of the 479 initial RRBS samples, one file was not successfully processed (over-covered, not enough compute resources allocated). Two further libraries had 0 CpG sites covered with at least 10 reads: GC45472 and URZ3149. Metadata was gathered from the PigHeaT sharepoint for the remaining 476 samples, leading to this metadata table.
02. BED import into a BSseq object
BED import into a BSseq object was performed on the GenoBioinfo cluster as follows:
sbatch 02_data_import_genobioinfo.R
03. BSseq filtering
BSseq filtering was performed on the GenoBioinfo cluster as follows:
sbatch 03_bsseq_cleaning.R
The initial object was a mostly empty matrix of 477 RRBS samples over 3,155,227 strand-merged CpG sites. Only CpG sites with at least 10 reads in more than 80% of the samples (more than 381 samples) were kept, resulting in 221,580 CpG sites. Only RRBS samples with at least 10 reads in more than 30% of the 221,580 CpG sites (more than 66,474 CpG sites) were kept, resulting in 435 RRBS samples. CpG sites overlapping SNPs identified by biscuit were then removed, filtering out an additional 479 CpG.
The filtered object consisted of 221,357 CpG sites and 435 RRBS samples.
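The two coverage filters above can be illustrated on a toy matrix. This is a minimal Python sketch, not the code in 03_bsseq_cleaning.R (which runs in R on the BSseq object); the function name `filter_coverage`, its parameters and the toy counts are hypothetical. It assumes that sites are filtered first and that samples are then evaluated only on the retained sites.

```python
def filter_coverage(cov, min_reads=10, site_frac=0.8, sample_frac=0.3):
    """cov: list of rows (one per CpG site), each a list of read counts per sample.

    Returns (kept_site_indices, kept_sample_indices).
    Hypothetical helper for illustration only.
    """
    n_samples = len(cov[0])
    # Step 1: keep sites with >= min_reads reads in MORE than site_frac of the samples.
    kept_sites = [
        i for i, row in enumerate(cov)
        if sum(c >= min_reads for c in row) > site_frac * n_samples
    ]
    # Step 2: on the retained sites only, keep samples with >= min_reads reads
    # in MORE than sample_frac of those sites.
    kept_samples = [
        j for j in range(n_samples)
        if sum(cov[i][j] >= min_reads for i in kept_sites)
        > sample_frac * len(kept_sites)
    ]
    return kept_sites, kept_samples


# Toy data: 4 CpG sites x 5 samples (made-up read counts).
toy_cov = [
    [12, 15, 10, 30, 11],
    [12, 0, 10, 30, 11],
    [20, 3, 20, 20, 20],
    [0, 0, 0, 0, 50],
]
# Looser toy thresholds than in the analysis, so that both steps remove something.
toy_sites, toy_samples = filter_coverage(toy_cov, site_frac=0.6, sample_frac=0.5)
```

With the thresholds used in the analysis, "more than 80% of 477 samples" means more than 381.6 samples, i.e. at least 382 samples.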
04. Differential analysis using DSS
The average DNA methylation level was computed for each of the 221,357 CpG sites.
We observed a bimodal distribution of DNA methylation.
The link between the average and the Standard Deviation of the DNA methylation levels was also investigated:
Only variable CpG sites (SD > 0.05) were kept for the differential analysis, keeping 69,854 CpG sites.
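As an illustration of the variability filter, here is a minimal Python sketch on toy values. It assumes methylation levels are expressed as proportions between 0 and 1, that the SD is the sample standard deviation (n - 1 denominator, as R's `sd`), and that samples without enough coverage are simply ignored; none of these choices are confirmed by the analysis scripts. The helper names are hypothetical.

```python
from statistics import mean, stdev


def site_mean_sd(meth_row):
    """Mean and sample SD of one CpG site, ignoring missing values (None).

    Returns None when fewer than two values are available.
    """
    vals = [m for m in meth_row if m is not None]
    if len(vals) < 2:
        return None
    return mean(vals), stdev(vals)


def variable_sites(meth, min_sd=0.05):
    """Indices of CpG sites whose SD is strictly greater than min_sd."""
    kept = []
    for i, row in enumerate(meth):
        stats = site_mean_sd(row)
        if stats is not None and stats[1] > min_sd:
            kept.append(i)
    return kept


# Toy data: 3 CpG sites x 4 samples (made-up methylation proportions).
toy_meth = [
    [0.95, 0.96, 0.94, 0.95],  # almost constant, high methylation
    [0.10, 0.40, 0.25, None],  # variable, one sample without coverage
    [0.00, 0.00, 0.00, 0.00],  # constant, unmethylated
]
toy_variable = variable_sites(toy_meth)
```

Only the second toy site passes the filter: the near-constant sites at both ends of the bimodal distribution are the ones that get removed.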
Differential analysis was performed using DSS v2.54.0 on unsmoothed DNA methylation data, with DMLfit.multiFactor on ~ environment + sex, and FDR multiple-testing correction.
This resulted in 179 (0.26%) CpG sites differentially methylated between the two environments,
and 2110 (3.0%) CpG sites differentially methylated between the sexes.
23.6% (498) of the sex-DMC were on the X chromosome.
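For readers unfamiliar with the FDR correction, the sketch below shows the Benjamini-Hochberg adjustment on toy p-values. It assumes that "FDR" here means Benjamini-Hochberg; the function is a hypothetical standalone illustration in Python and not the DSS code, which performs the test and the correction in R.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (made up), one per CpG site.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_fdr = benjamini_hochberg(toy_p)
n_dmc = sum(q <= 0.05 for q in toy_fdr)
```

In this toy example, two of the five sites stay below FDR = 0.05, even though four have a raw p-value below 0.05.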
DMC were exported as .bed files: env DMC, sex DMC.
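As a reminder of the BED coordinate convention (0-based start, end exclusive), here is a minimal Python sketch that turns CpG positions into BED lines. It assumes 1-based, single-base CpG positions and a three-column BED; the actual export may use different columns or intervals, and the function name and toy positions are hypothetical.

```python
def dmc_to_bed_lines(dmcs):
    """Convert (chrom, pos) pairs with 1-based positions into 3-column BED lines.

    BED intervals are 0-based and half-open, so position pos becomes [pos - 1, pos).
    Lines are sorted by chromosome name (as text) and then by position.
    """
    lines = []
    for chrom, pos in sorted(dmcs):
        lines.append(f"{chrom}\t{pos - 1}\t{pos}")
    return lines


# Toy DMC positions (made up).
toy_dmcs = [("X", 1200), ("1", 500), ("1", 100)]
toy_bed = dmc_to_bed_lines(toy_dmcs)
```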
Here are two env DMC examples.
Here is one sex DMC example.
The heatmap of sex DMC is ok-ish:
The heatmap of env DMC, in contrast, is noisy due to high individual variability:
(Grey = not enough coverage to compute % meCpG)
Manhattan plots of DMC were also generated, where the genome-wide threshold (red line) corresponds to FDR = 0.05.
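One way to place an FDR threshold on a -log10(p) axis is to take the largest raw p-value that is still significant after correction. The Python sketch below illustrates this on toy values; it is an assumption about how such a line can be derived, not a description of how the plots above were produced, and the function name is hypothetical.

```python
import math


def fdr_threshold_line(pvalues, fdrs, alpha=0.05):
    """Height of the threshold line on a -log10(p) axis.

    Uses the largest raw p-value whose FDR is <= alpha.
    Returns None when no site is significant.
    """
    significant = [p for p, q in zip(pvalues, fdrs) if q <= alpha]
    if not significant:
        return None
    return -math.log10(max(significant))


# Toy p-values and their FDR-adjusted values (made up).
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_fdrs = [0.005, 0.02, 0.05125, 0.05125, 0.6]
toy_line = fdr_threshold_line(toy_pvalues, toy_fdrs)
```

Every point above this line has an FDR at or below 0.05, which is what makes a single horizontal line a valid summary of the corrected threshold.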