Repository for scripts used in Gene- and transcript-level analyses reveal sex-specific trancriptional alterations in prefrontal cortex in Major Depressive Disorder.
We use Miniconda (or Anaconda) for environment management (See environment.yml
file). To create the environment:
conda env create -f environment.yml
We use R version 4.1.2 and Bioconductor version 3.14.
Here, we use multiple R packages. We use renv
to track R package versions (see renv.lock
file).
Clone this repository to your local machine and open the project through mdd.Rproj
to install required packages.
This project is organized in the following directories:
-
data/
: Holds the RNA-seq processed data, study metadata, and genome references;-
genome/
: auxiliary files for kallisto quantification.-
Homo_sapiens.GRCh38.97.gtf.gz
: GTF file from human gencode/Ensembl version 97. -
Homo_sapiens.GRCh38.cdna.all.fa.gz
: Human transcriptome gencode/Ensembl version 97.
-
-
kallisto/
: output from kallisto quantification mode (abundance.tsv
). Quantification files are separated by brain region.-
aINS/
: -
Cg25/
: -
dlPFC/
: -
OFC/
: -
Nac/
: -
Sub/
:
-
-
meta/
: metadata for BioProject SRP115956.-
SraRunTable.txt
: study metadata; -
Metadata.csv
: Information on individual samples. This information was previously available here: http://neuroscience.mssm.edu/nestler/contecenter/listchromatingenedatabase.html.
-
-
vcfs
: VCF files filtered by the variants showed in Table 1 from the paper.
-
-
scripts/
: Holds all the scripts used in the analyses; -
results/
: Holds results from each step analysis.
- Quantification
Script for download, processing and quantification by kallisto;
-
Exploratory analysis:
- Metadata:
-
metadata
: Organizes metadata for further steps.
-
- Transcript and gene estimates:
-
tx_gene
: Usestximport
to summarise gene counts and prepare data for further steps. -
tx_tx
: Usestximport
to prepare data to further steps.
-
- Outlier identification:
-
robust_pca
: Performs robust PCA analysis of samples usingrrcov
; -
remove_outliers_samples
: Removes the outliers samples chosen by Robust PCA analysis.
-
- Covariates selection:
-
impute_meta
: Imputes data for missing values found in some metadata covariates. -
rank_variables
: Performs covariate analysis.
-
- Metadata:
-
TAG:
- Feature-wise outlier detection:
-
outliers_edge_ppcseq_gene
: Performs identification of outlier genes byppcseq
. Outlier genes were removed from DGE analysis. -
outliers_edge_ppcseq_tx
: Performs identification of outlier transcripts byppcseq
. Outlier transcripts were removed from DTE and DTU analyses.
-
- Differential gene expression:
-
edger_diff_gene
: Differential gene expression withedgeR
.
-
- Differential transcript expression:
-
edger_diff_tx
: Differential transcript expression withedgeR
. -
diff_tx_correct
: Performs multiple hypothesis correction withstageR
.
-
- Differential transcript usage:
-
ISA/
: scripts for differential transcript usage usingIsoformSwitchAnalyzeR
are stored in this directory.
-
- Gather results from three methods:
-
organize_dge_dte_after_filtering
: Filters the outlier genes and transcripts identified byppcseq
from DGE and DTE results. -
summarise_results_dge_dte_dtu
: Removes outlier transcripts from DTU analysis and gathers results from three methods.
-
- Feature-wise outlier detection:
-
Functional analyses:
-
network
: Network inference using stringDB. Visualization byRedeR
andggraph
. -
enrichment
: Enrichment analysis of transcriptionally altered genes usingclusterProfiler
. -
gwas_intersections
: Get genes with genomic regions related to depression usinggwasrapidd
. -
intersection_analysis
: intersection analysis by sex, brain region, and method used.
-
-
Additional scripts:
-
plots
andplot_dtu
: Description of figures produced to the paper. -
supp_fig_variants_by_donors
: Description of Supplementary Figures 9 and 10, which represent the presence of depression-associated SNPs on the samples considered.
-