Code repository for the book Modern Python Cookbook, published by Packt. Lollipop-style mutation diagrams for annotating genetic variations. Python 3 teaching materials for a two-day basic introduction to Python. Single-cell current best practices tutorial case study for the paper: Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial". Software for the automatic analysis and data visualization of International Mouse Phenotyping Consortium data.

This repository is for developing an MCMC over admixture phylogenies. Stochastic model for admixture in a geographically structured population. A wrapper for liftOver for converting plink genotype data between different genome reference builds. Provides helper scripts for inferring local ancestry, performing ancestry-specific PCA, etc. Methods to integrate data from multiple genome sequencing datasets and form consensus variant calls. This repo provides tools to convert ClinVar data into a tab-delimited flat file, and also provides the resulting tab-delimited flat file.

R package to predict whether mutations can elicit nonsense-mediated decay. Apache 2 licensed. Open-source foundation of the user-sponsored PyMOL molecular visualization system. Seeking information such as heteroplasmy, structural variants, etc.

Aggregate results from bioinformatics analyses across many samples into a single report. A collection of tutorials and examples for solving and understanding machine learning and pattern classification tasks.

Validated, scalable, community-developed variant calling and RNA-seq analysis. A hidden Markov model for detecting segments of shared ancestry (identity by descent). This method estimates the probability of sharing alleles identical by descent (IBD) across the genome and can also be used for mapping disease loci using distantly related individuals. Penetrance estimates; frequency and distribution of secondary findings for the ACMG gene panel. Course material in notebook format for learning about single-cell bioinformatics methods.

Annotate models of genetic inheritance patterns in variant files (VCF files). A very portable scheduler. Create timecourse "fish plots" that show changes in the clonal architecture of tumors.
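To make the idea of annotating inheritance patterns concrete, here is a minimal sketch, not the method of any tool listed above. It assumes a single trio with an affected child and unaffected parents, diploid genotypes given as VCF-style GT strings, and the hypothetical helper names `alt_count` and `trio_pattern`; the pattern labels are illustrative only.

```python
def alt_count(gt):
    """Count non-reference alleles in a diploid GT string such as '0/1' or '1|1'.

    Returns None when any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(allele != "0" for allele in alleles)


def trio_pattern(child, father, mother):
    """Label one variant for a trio (assumed: affected child, unaffected parents)."""
    c, f, m = alt_count(child), alt_count(father), alt_count(mother)
    if None in (c, f, m):
        return "unknown"
    if c >= 1 and f == 0 and m == 0:
        # Child carries an allele that neither parent carries.
        return "de_novo"
    if c == 2 and f == 1 and m == 1:
        # Homozygous child, both parents heterozygous carriers.
        return "autosomal_recessive"
    if c == 1 and (f >= 1 or m >= 1):
        return "inherited_heterozygous"
    return "other"


if __name__ == "__main__":
    print(trio_pattern("0/1", "0/0", "0/0"))
    print(trio_pattern("1/1", "0/1", "0|1"))
```

A real annotator would also need to handle multi-allelic sites, sex chromosomes, larger pedigrees, and compound heterozygotes, all of which this sketch ignores.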

Inferring and visualizing clonal evolution in multi-sample cancer sequencing. Application for inferring subclonal composition and evolution from whole-genome sequencing data. Analyze exome data for Mendelian disorders.

Still in alpha-testing. Informatics for RNA-seq: A web resource for analysis on the cloud. Educational tutorials and working pipelines for RNA-seq analysis including an introduction to: cloud computing, critical file formats, reference genomes, gene annotation, expression, differential expression, alternative splicing, data visualization, and interpretation.

Bayesian analysis of contingency tables as the ratio of two binomially distributed random variables. An algorithm for clonal tree reconstruction from multi-sample cancer sequencing data. R package for extracting mutation signatures from a list of somatic mutations.
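As an illustration of the first item above, the following is a rough sketch of one way to treat a 2x2 contingency table as a ratio of two binomial proportions. It is an assumption about the general approach, not the listed package's implementation: it places independent uniform Beta(1, 1) priors on both proportions and approximates the posterior of their ratio by Monte Carlo sampling. The function name `ratio_posterior` and its parameters are hypothetical.

```python
import random


def ratio_posterior(k1, n1, k2, n2, draws=20000, seed=0):
    """Approximate the posterior of p1 / p2 for two binomial samples.

    k1 of n1 successes in group 1, k2 of n2 in group 2.
    Assumes independent Beta(1, 1) priors, so each posterior is
    Beta(k + 1, n - k + 1). Returns (2.5%, median, 97.5%) quantiles.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(draws):
        p1 = rng.betavariate(k1 + 1, n1 - k1 + 1)
        p2 = rng.betavariate(k2 + 1, n2 - k2 + 1)
        if p2 > 0.0:  # guard against a degenerate zero draw
            ratios.append(p1 / p2)
    ratios.sort()
    size = len(ratios)
    return ratios[int(0.025 * size)], ratios[size // 2], ratios[int(0.975 * size)]


if __name__ == "__main__":
    low, median, high = ratio_posterior(30, 100, 15, 100)
    print(f"ratio median {median:.2f}, 95% interval ({low:.2f}, {high:.2f})")
```

Sampling is used here only because it needs nothing beyond the standard library; a closed-form or numerical-integration treatment of the ratio distribution would be an equally valid choice.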

HotNet2 is an algorithm for finding significantly altered subnetworks in a large gene interaction network. Reproducible machine learning analysis of gene expression and alternative splicing data.

Vim plug-ins which offer support for various programming languages. Scripts to unite the Strelka somatic variant caller and the SnpEff annotation tool. Bioconductor annotation packages for microRNAs (mirbase, targetscan). Pysam is a Python module for reading and manipulating SAM files. It's a lightweight wrapper of the samtools C-API. Pysam also includes an interface for tabix. Deconvolving tumor purity and ploidy by integrating copy number alterations and loss of heterozygosity.

AbsCN-seq: a statistical method to estimate tumor purity, ploidy and absolute copy numbers from next-generation sequencing data. Meta-pipeline to identify transposable element insertions using next-generation sequencing data. Analysis pipeline for cross-species exome next-generation sequencing data. Genotyping and variant annotation pipelines for exome sequencing data. Source code, data and documentation to demonstrate automation in NGS data analysis.

Bioinformatics scripts, mainly in Perl, dealing primarily with RNA-Seq data and statistics on. This is a collection of small scripts built by the Lenz lab to make RNA sequencing tasks more efficient.

A novice attempt at a GATK pipeline for running on non-model organisms. QC3, a quality control tool for DNA sequencing data at the raw data, alignment, and variant calling stages. Tools written in C using htslib for manipulating next-generation sequencing data. A false-positive filter for variants called from massively parallel sequencing. Scripts dealing with various aspects of next-gen sequence data: QC, CNVs, etc. Jie Yin.


