
Analysis Pipeline

Whole-Genome-Sequencing
  • Pipeline ID: BX02-20210831-0034
  • Category: Whole-Genome-Sequencing
  • Author: ninesoft_adm
  • Created: 2024-11-14
  • Last modified: 2024-12-16
  • Version: 1.0
#Whole Genome Sequencing #WGS #Genomics #Next Generation Sequencing #Precision Medicine #Clinical Genomics #noncoding genome #GATK #fastp #Cutadapt #BWA #SortSam #MarkDuplicates #CountBase #BaseRecalibrator #ApplyBQSR #HaplotypeCaller #somalier

The whole-genome-sequencing pipeline is a modular pipeline for processing WGS data. It takes a FASTQ file as input and produces haplotype-based variant calls, annotations, and visualizations following the GATK pipeline. First, raw reads in FASTQ format, with well-calibrated base error estimates, are mapped to the reference genome. BWA maps the reads to the human reference genome, allowing two mismatches in 30-base seeds, and writes the alignments in the technology-independent SAM/BAM format. Next, duplicate fragments are marked and removed with Picard (http://picard.sourceforge.net), mapping quality is assessed and low-quality mapped reads are filtered out, and paired-read information is checked so that mate information is consistent between the two reads of each pair. The initial alignments are then refined by local realignment, and suspicious regions are identified. Using this information as a covariate, along with other technical covariates and known sites of variation, GATK base quality score recalibration (BQSR) is carried out. Germline SNPs and indels are then called via local re-assembly of haplotypes from the recalibrated and realigned BAM files. Finally, the pipeline provides somalier, a tool that quickly assesses sample relatedness from sequencing data in BAM, CRAM, or VCF format.
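
The post-alignment stages described above can be sketched as a chain of GATK4 command lines, where each step consumes the previous step's output. This is only an illustrative sketch: the page does not list the pipeline's actual arguments, so the function name `build_postalignment_commands`, all file names, the `results` output directory, and the choice to pass `--REMOVE_DUPLICATES true` are assumptions made for the example. The code only builds and prints the commands; it does not run GATK.

```python
import shlex


def build_postalignment_commands(sample, aligned_bam, reference, known_sites,
                                 out_dir="results"):
    """Return GATK4 command lines (as argument lists) in execution order.

    All paths are hypothetical placeholders, not the pipeline's real layout.
    """
    sorted_bam = f"{out_dir}/{sample}.sorted.bam"
    dedup_bam = f"{out_dir}/{sample}.dedup.bam"
    dup_metrics = f"{out_dir}/{sample}.dup_metrics.txt"
    recal_table = f"{out_dir}/{sample}.recal.table"
    recal_bam = f"{out_dir}/{sample}.recal.bam"
    vcf = f"{out_dir}/{sample}.vcf.gz"

    return [
        # Coordinate-sort the aligned reads.
        ["gatk", "SortSam", "-I", aligned_bam, "-O", sorted_bam,
         "--SORT_ORDER", "coordinate"],
        # Mark duplicates; removing them is an assumption based on
        # "marked and removed" in the description.
        ["gatk", "MarkDuplicates", "-I", sorted_bam, "-O", dedup_bam,
         "-M", dup_metrics, "--REMOVE_DUPLICATES", "true"],
        # Build the recalibration table from known variant sites.
        ["gatk", "BaseRecalibrator", "-I", dedup_bam, "-R", reference,
         "--known-sites", known_sites, "-O", recal_table],
        # Apply the recalibration to produce the analysis-ready BAM.
        ["gatk", "ApplyBQSR", "-I", dedup_bam, "-R", reference,
         "--bqsr-recal-file", recal_table, "-O", recal_bam],
        # Call germline SNPs and indels by local haplotype re-assembly.
        ["gatk", "HaplotypeCaller", "-I", recal_bam, "-R", reference,
         "-O", vcf],
    ]


if __name__ == "__main__":
    commands = build_postalignment_commands(
        sample="sample1",                 # hypothetical sample name
        aligned_bam="sample1.bam",        # hypothetical BWA output
        reference="reference.fasta",      # hypothetical reference path
        known_sites="known_sites.vcf.gz"  # hypothetical known-sites file
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Keeping each command as an argument list makes it straightforward to hand the steps to `subprocess.run` one at a time and stop at the first failure, which is one reasonable way to drive a modular pipeline of this kind.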
