
Gene structure annotation files: file formats for genome annotation

Date: 2023-05-05 11:37:59 Views: 222360 Author: 2986

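Annotating genomic regions with genes using bedtools
The first step is to extract the chromosome coordinates of every human gene from a gene annotation file in GTF format.
The release used here is the earliest GENCODE release built on GRCh38 (2014-08).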

v20 is the earliest hg38 release. The corresponding file is:
ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_20/gencode.v20.annotation.gtf.gz

zcat gencode.v20.annotation.gtf.gz | grep protein_coding | perl -alne '{next unless $F[2] eq "gene" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >protein_coding.hg38.position

mkdir -p ~/reference/gtf/gencode
cd ~/reference/gtf/gencode
## https://www.gencodegenes.org/releases/current.html
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.2wayconspseudos.gtf.gz
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.long_noncoding_RNAs.gtf.gz
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.polyAs.gtf.gz
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/gencode.v25.annotation.gtf.gz
## https://www.gencodegenes.org/releases/25lift37.html
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.annotation.gtf.gz
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.metadata.HGNC.gz
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.metadata.EntrezGene.gz
wget ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.metadata.RefSeq.gz

zcat gencode.v25.long_noncoding_RNAs.gtf.gz | perl -alne '{next unless $F[2] eq "gene" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >lncRNA.hg38.position
zcat gencode.v25.2wayconspseudos.gtf.gz | perl -alne '{next unless $F[2] eq "transcript" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >pseudos.hg38.position
zcat gencode.v25.annotation.gtf.gz | grep protein_coding | perl -alne '{next unless $F[2] eq "gene" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >protein_coding.hg38.position
zcat gencode.v25.annotation.gtf.gz | perl -alne '{next unless $F[2] eq "gene" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >allGene.hg38.position

zcat gencode.v25lift37.annotation.gtf.gz | grep protein_coding | perl -alne '{next unless $F[2] eq "gene" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >protein_coding.hg19.position
zcat gencode.v25lift37.annotation.gtf.gz | perl -alne '{next unless $F[2] eq "gene" ;/gene_name "(.*?)";/; print "$F[0]\t$F[3]\t$F[4]\t$1" }' >allGene.hg19.position
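The .position files above keep the 1-based start coordinate of the GTF, whereas bedtools reads a plain four-column file as BED, which is 0-based and half-open. The following is a minimal sketch, not part of the original workflow, of one way to emit BED-style coordinates instead. The function name gtf_genes_to_bed and the toy GTF records (GENE_A, GENE_B) are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Convert GTF "gene" records on stdin to BED4 on stdout:
# chrom, 0-based start, end, gene_name.
# GTF is 1-based inclusive and BED is 0-based half-open,
# so only the start coordinate is shifted by one.
gtf_genes_to_bed() {
  awk -F'\t' -v OFS='\t' '
    $0 !~ /^#/ && $3 == "gene" {
      name = "."                       # placeholder when gene_name is absent
      if (match($9, /gene_name "[^"]+"/)) {
        # skip the 11 characters of: gene_name "   and drop the closing quote
        name = substr($9, RSTART + 11, RLENGTH - 12)
      }
      print $1, $4 - 1, $5, name
    }'
}

# Toy input made up for illustration (not real GENCODE records).
toy_gtf=$(mktemp)
{
  printf 'chr1\tTOY\tgene\t1000\t2000\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding"; gene_name "GENE_A";\n'
  printf 'chr1\tTOY\ttranscript\t1000\t2000\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENE_A";\n'
  printf 'chr1\tTOY\tgene\t5000\t6000\t.\t-\t.\tgene_id "G2"; gene_type "lincRNA"; gene_name "GENE_B";\n'
} > "$toy_gtf"

genes_bed=$(mktemp)
gtf_genes_to_bed < "$toy_gtf" > "$genes_bed"
cat "$genes_bed"
```

With a real annotation, the same function could be fed from zcat gencode.v25.annotation.gtf.gz, and the resulting file could then be intersected with a set of regions, for example bedtools intersect -a regions.bed -b genes.bed -wa -wb, where regions.bed is a hypothetical BED file of the intervals to annotate.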
