
Software

Regensburger GEM Plattform

The Genetic Epidemiology Unit

Downloads

Prof. Dr. Iris Heid, Dr. Thomas Winkler, Mathias Gorski, Dr. Matthias Olden


EasyStrata



Description

EasyStrata is an R-package that provides advanced functionality

(i) for the evaluation of stratified GWAS;

(ii) for plotting GWAS results with a specific focus on stratification;

(iii) to simplify data-handling of large-scale GWA data-sets

Download

Version 8.6: EasyStrata_8.6.tar.gz

Command Reference / Manual: EasyStrata_8.6_Commands_140615.pdf

Alternatively, you can access the package via the CRAN R package repository: http://cran.r-project.org/web/packages/EasyStrata/

Download – Example scripts and data

The following scripts have been developed and can be used for the evaluation of stratified GWAMA results from the Genetic Investigation of ANthropometric Traits (GIANT) consortium.

Scripts:

  Plotting scripts:

     easystrata_figure1_miami.ecf (Miami-Plot for contrasting two strata)

     easystrata_supplfigure3_qqplot.ecf (QQ-Plot of multiple strata)

     easystrata_supplfigure4_scatter.ecf (Scatter-Plot of strata-specific effect sizes)

     easystrata_supplfigure5_qq_omitreported.ecf (QQ-Plot excluding known loci)

     easystrata_supplfigure6_plotspeed.ecf (Increasing plot speed)

     easystrata_supplfigure7_break_yaxis.ecf (Breaking up y-axis of Manhattan-plot)

     easystrata_supplfigure8_panel.ecf (Panel of QQ and scatter plots)

  Evaluation scripts:

     easystrata_supplpipe2A_sexdiff.ecf (Difference between 2 strata)

     easystrata_supplpipe2B_sexdiff_filt.ecf (Difference between 2 strata + overall filter)

     easystrata_supplpipe2C_joint.ecf (Joint main+interaction effect)

Data:

  Example mapping file:

     hapmap36.map (Hapmap b36 mapping file: SNPID, Chromosome, Position)

  Example locus annotation file:

     WAIST_2009_2010_14_reported.txt (Known waist-hip ratio loci, published by Lindgren et al 2009, Heid et al 2010)
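
The "difference between 2 strata" evaluation named in the script list above can be illustrated with a standard two-sample Z test on stratum-specific effect estimates. The following is a minimal sketch in Python, not the package's R/ECF syntax and not EasyStrata's own implementation. It assumes the two strata are independent (no sample overlap), and the function name and example values are hypothetical.

```python
import math

def strata_difference_test(beta1, se1, beta2, se2):
    """Z test for a difference between two stratum-specific effects.

    Assumes the two strata are independent, so the variance of the
    difference is the sum of the two squared standard errors.
    """
    z = (beta1 - beta2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical effect estimates for one SNP in two strata (e.g. women, men).
z, p = strata_difference_test(beta1=0.05, se1=0.01, beta2=0.01, se2=0.01)
print(f"Z = {z:.3f}, P = {p:.3g}")
```

If the strata share samples, the independence assumption no longer holds and the variance term would need a covariance correction.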


Requirements

R 2.13 or higher. R packages 'Cairo' and 'plotrix'.


Citation

If you use EasyStrata please cite

"Winkler et al.: EasyStrata: evaluation and visualization of stratified genome-wide
association meta-analysis data. Bioinformatics 2014"

and (if possible) reference our webpage "www.genepi-regensburg.de/easystrata".

Thank you.


License

EasyStrata is licensed under the GNU General Public License, version 3.

Copyright © 2012 by Thomas Winkler.

Although we hope that EasyStrata will be very useful, it is published WITHOUT ANY WARRANTY.


Contact

If you require support for a different platform or have any further questions, please e-mail Thomas Winkler.


date of last update: 2014-04-14


EasyQC

Description

EasyQC is an R-package that provides advanced functionality

(i) to perform file-level QC of single genome-wide association (GWA) data-sets;

(ii) to conduct quality control across several GWA data-sets (meta-level QC);

(iii) to simplify data-handling of large-scale GWA data-sets

Put differently, it can serve as a nonsense detector for study-specific GWA data-sets.

Download

Version 9.2: EasyQC_9.2.tar.gz

Manual: EasyQC_9.0_Commands_140918_2.pdf

ChangeLog: EASYQC_CHANGE.log

Download – 1000 Genomes cleaning material

The following material can be used for quality control of 1000 Genomes imputed GWAS result data sets.

Scripts:

     fileqc_1000G.ecf

This script can be used with the 1000G reference files below and incorporates QC steps such as sanity checks, filtering, allele coding harmonization, marker harmonization, allele frequency checks, and QQ plots. In particular, allele coding and marker harmonization are essential steps prior to meta-analysis.
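
To make the allele coding harmonization and allele frequency check concrete, here is a minimal sketch in Python. It is not the ECF script or EasyQC's implementation. The function names, the record layout, and the 0.2 frequency threshold are assumptions chosen for illustration, and strand flips are deliberately left out.

```python
def harmonize_to_reference(effect_allele, other_allele, eaf, beta,
                           ref_allele, alt_allele):
    """Align one study record to the reference allele coding.

    Returns (effect_allele, other_allele, eaf, beta) expressed for the
    reference's alt_allele, or None if the allele pair does not match.
    Strand flips are not handled in this sketch.
    """
    if (effect_allele, other_allele) == (alt_allele, ref_allele):
        return effect_allele, other_allele, eaf, beta
    if (effect_allele, other_allele) == (ref_allele, alt_allele):
        # Swapping alleles turns the frequency into 1 - eaf and
        # reverses the sign of the effect estimate.
        return other_allele, effect_allele, 1.0 - eaf, -beta
    return None

def frequency_outlier(eaf, ref_freq, max_diff=0.2):
    """Flag a marker whose allele frequency deviates from the reference.

    The 0.2 threshold is an arbitrary illustrative choice.
    """
    return abs(eaf - ref_freq) > max_diff

# Hypothetical study record coded for the opposite allele.
record = harmonize_to_reference("A", "G", eaf=0.30, beta=0.12,
                                ref_allele="A", alt_allele="G")
print(record)
print(frequency_outlier(record[2], ref_freq=0.72))
```

Markers that return None or are flagged as frequency outliers would typically be inspected or excluded before meta-analysis.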

Reference data:

  Allele frequency reference data (ancestry-specific):

      Without X-Chr markers:

      allelefreq.1000G_EUR_p1v3.impute_legends.noDup.noX.gz

      allelefreq.1000G_AFR_p1v3.impute_legends.noDup.noX.gz

      allelefreq.1000G_AMR_p1v3.impute_legends.noDup.noX.gz

      allelefreq.1000G_ASN_p1v3.impute_legends.noDup.noX.gz

      Without X-Chr markers and excluding monomorphic SNPs:               

      allelefreq.1000G_EUR_p1v3.impute_legends.noMono.noDup.noX.v2.gz

      allelefreq.1000G_AFR_p1v3.impute_legends.noMono.noDup.noX.v2.gz

      allelefreq.1000G_AMR_p1v3.impute_legends.noMono.noDup.noX.v2.gz

      allelefreq.1000G_ASN_p1v3.impute_legends.noMono.noDup.noX.v2.gz

The allele frequency reference files are based on allele frequencies given in the "legend" files from the IMPUTE website. Their marker identifier column uses the cptid format as introduced by the EasyQC CREATECPTID function. Please see the EasyQC manual for more detailed information on the format.

  Mapping files:

   For 1KGP phase1 version3 imputed GWAS:

      rsmid_map.1000G_ALL_p1v3.merged_mach_impute.v1.txt.gz

   For 1KGP phase3 version5 imputed GWAS:

      rsmid_machsvs_mapb37.1000G_p3v5.merged_mach_impute.v1.txt.gz

The mapping files contain chromosome and position information for all 1000 Genomes markers whose names do not themselves encode chromosome and position (e.g., rs-IDs, as opposed to names such as "chr1:123:AT_A"). The files are based on imputation reference files from the MACH and IMPUTE websites. They can be used with the EasyQC function CREATECPTID, which harmonizes marker names across studies by compiling unique cptids. Please see the EasyQC manual for more detailed information on the cptid format.
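
The idea of harmonizing marker names through a mapping file can be sketched as follows. This is a Python illustration, not the CREATECPTID function. It assumes a simplified identifier of the form "chr:pos" and drops any allele suffix, which may differ from the actual cptid definition in the EasyQC manual. The mapping dictionary, regular expression, and function name are hypothetical.

```python
import re

# Hypothetical mapping from rs-ID to (chromosome, position), standing in
# for a lookup in one of the mapping files.
RSID_MAP = {"rs123": ("1", 1000), "rs456": ("2", 2500)}

# Matches names that already carry chromosome and position,
# e.g. "chr1:123:AT_A" or "2:2500".
POSITIONAL = re.compile(r"^(?:chr)?([0-9XYMT]+):(\d+)")

def to_chr_pos_id(marker, rsid_map=RSID_MAP):
    """Return a 'chr:pos' identifier, or None if the marker cannot be mapped."""
    match = POSITIONAL.match(marker)
    if match:
        return f"{match.group(1)}:{match.group(2)}"
    if marker in rsid_map:
        chrom, pos = rsid_map[marker]
        return f"{chrom}:{pos}"
    return None

for name in ["rs123", "chr1:123:AT_A", "2:2500", "rs999"]:
    print(name, "->", to_chr_pos_id(name))
```

Markers that cannot be mapped come back as None, so they can be reported instead of being silently merged under a wrong identifier.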

Download – GIANT QC paper (Winkler et al) material

The following material has been used for quality control and for several projects of the Genetic Investigation of ANthropometric Traits (GIANT) consortium.

Scripts:

  File-level QC scripts:

     1_filelevel_qc.gwa.ecf (for HapMap imputed data)

     1_filelevel_qc.metabochip.ecf (for genotyped Metabochip data)

  Meta-level QC script:

     2_metalevel_qc.ecf

  Meta-Analysis script (to be used with metal):

     3_metal_metaanalysis.txt

  Meta-Analysis QC scripts:

     4_metaanalysis_qc.compare.ecf

     4_metaanalysis_qc.compare_logfiles.r (R-script)

     4_metaanalysis_qc.studymeta.ecf

Reference data:

  Allele frequency reference data:

     AlleleFreq_HapMap_CEU.v2.txt.gz (for CEU HapMap imputed data)

     AlleleFreq_1000G_EUR_Metabochip.v1.txt.gz (for CEU genotyped Metabochip data)

  Marker harmonization reference data:

     SNPID_to_ChrPosID.b36_v2.txt.gz

  QT interval SNPs reference data:

     QTSNPs_AEL_TW.txt

Please see our QC paper "Winkler et al.: Quality control and conduct of genome-wide association meta-analyses. Nature Protocols 2014" for further details regarding these scripts and material.

Download – Exomechip cleaning material

Scripts:

  Cleaning scripts for Rvtests output:

     clean_rvtests.ecf (for Rvtests association output)

     clean_rvtests_cov.ecf (for Rvtests *Cov* output)

  Cleaning scripts for Raremetalworker output:

     clean_raremetalworker.ecf (for Raremetalworker association output)

     clean_raremetalworker_cov.ecf (for Raremetalworker *cov* output)

Reference data:

  Exomechip Allele frequency reference data:

       AFR.frequencies

       AMR.frequencies

       EUR.frequencies

       ASN.frequencies


Requirements

R 2.13 or higher.

Only UNIX/LINUX systems are supported.


Citation

If you use EasyQC please cite

"Winkler et al.: Quality control and conduct of genome-wide association meta-analyses. Nature Protocols 2014"

and (if possible) reference our webpage "www.genepi-regensburg.de/easyqc".

Thank you.


License

EasyQC is licensed under the GNU General Public License, version 3.

Copyright © 2012 by Thomas Winkler.

Although we hope that EasyQC will be very useful, it is published WITHOUT ANY WARRANTY.


Contact

If you require support for a different platform or have any further questions, please e-mail Thomas Winkler.


date of last update: 2015-11-04


R-Scripts



Data Preparation

1 – R-Code to extract SNPs out of MACH .mldose and .mlinfo files:

2 – R-Code to switch alleles, allele-frequencies and dosages in Info- and Dose-files according to given set of reference alleles:


SNPxSNP Interaction Analyses

3 – R-Code to conduct SNPxSNP interaction with a binary trait on a certain number of SNPs


Detailed descriptions of the R code are given as comments in the R files.
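
The allele switch described in item 2 can be illustrated with a small sketch. It is written in Python and is not the R code offered here. It assumes that dosages count copies of the first allele on a 0 to 2 scale, as for diploid autosomal markers, and the function name and example values are hypothetical.

```python
def switch_alleles(allele1, allele2, freq1, dosages):
    """Re-express a marker in terms of the other allele.

    Assumes dosages count copies of allele1 on a 0-2 scale, so the
    dosage of allele2 is 2 minus the dosage of allele1.
    """
    flipped = [2.0 - d for d in dosages]
    return allele2, allele1, 1.0 - freq1, flipped

# Hypothetical marker with three individuals.
a1, a2, freq, dose = switch_alleles("A", "G", 0.25, [0.0, 1.2, 2.0])
print(a1, a2, freq, dose)
```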

Only UNIX/LINUX systems are supported.

If you require support for a different platform or have any further questions, please e-mail Thomas Winkler or Mathias Gorski.


date of last update: Mon, September 26th 2011


idGenerator

idGenerator is an automated tool for generating identifiers (IDs) with multiple features, particularly for modern epidemiological or clinical studies. The software generates structured IDs to facilitate study organization, layered IDs to enhance data protection, and check digits to detect entry errors. It is easy to use thanks to a graphical user interface, and practical in that it provides IDs both as standard text and as 128B barcodes. idGenerator is aimed at small to medium-sized epidemiological or clinical studies that need a simple yet secure concept and tool for ID creation and management. The software can be used by study personnel without programming training on a standard Windows computer.
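
How a check digit detects entry errors can be shown with a short sketch. The check-digit scheme used by idGenerator is not described on this page, so the example uses the widely known Luhn algorithm as a stand-in. It is written in Python, and the function names and the example ID are hypothetical.

```python
def luhn_check_digit(digits: str) -> str:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left, doubling every second digit, starting with
    # the rightmost digit of the payload.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)

def is_valid(id_with_check: str) -> bool:
    """Check that the last digit matches the Luhn digit of the rest."""
    return luhn_check_digit(id_with_check[:-1]) == id_with_check[-1]

study_id = "7992739871"
full_id = study_id + luhn_check_digit(study_id)
print(full_id, is_valid(full_id), is_valid("79927398711"))
```

A single mistyped digit changes the checksum, so the altered ID fails validation before it reaches the study database.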

Download:


If you require support for a different platform or have any further questions please e-mail Thomas Winkler, Mathias Gorski or Matthias Olden.

Faculty of Medicine

Chair of Genetic Epidemiology

Institute of Epidemiology and Preventive Medicine


Universitätsklinikum Regensburg
Franz-Josef-Strauß-Allee 11
93053 Regensburg