-
Notifications
You must be signed in to change notification settings - Fork 0
Genomicrangesexample
walkerhound edited this page Feb 26, 2013
·
7 revisions
This is is an example of using Genomic Ranges and Genomic Features to match transcript ID's produced with cufflinks to transcript ID's from other sources.
First define a few things.
chromRename is required because the ensembl transcript database from bioMart uses, for example, "1" instead of "chr1"
newSeqLevelsToKeep and oldSeqLevelsToKeep are used to limit the chromosomes saved in the transcript databases.
rn5RefGeneUCSC <- makeTranscriptDbFromUCSC(genome="rn5",tablename="refGene") refGeneExonsByTranscript <- exonsBy(rn5RefGeneUCSC,by = "tx") refGeneExonsByTranscript <- keepSeqlevels(refGeneExonsByTranscript,newSeqLevelsToKeep)
bioMartEnsemblRn5 <- makeTranscriptDbFromBiomart(biomart="ENSEMBL_MART_ENSEMBL", dataset='rnorvegicus_gene_ensembl', host="uswest.ensembl.org") bioMartExonsByTranscript <- exonsBy(bioMartEnsemblRn5,by="tx") bioMartExonsByTranscript <- keepSeqlevels(bioMartExonsByTranscript,oldSeqLevelsToKeep) bioMartExonsByTranscript <- renameSeqlevels(bioMartExonsByTranscript,chromRename)
This part uses the R program cuffToDB.R. The cufflinks file is a gtf file called combined.polyA.17Jan13.gtf.
source('cuffToDB.R')
gtfFile <- 'combined.polyA.17Jan13.gtf'
cuffLinksTxdb <- cuff2db(gtfFile)
cufflinksExonsByTranscript <- exonsBy(cuffLinksTxdb,by="tx")
refGeneTranscript_IDs <- names(refGeneExonsByTranscript) refGeneTranscriptNames <- select(rn5RefGeneUCSC,keys=refGeneTranscript_IDs, cols="TXNAME",keytype="TXID")
ensemblTranscript_IDs <- names(bioMartExonsByTranscript) ensemblTranscriptNames <- select(bioMartEnsemblRn5,keys=ensemblTranscript_IDs, cols="TXNAME",keytype="TXID")
cufflinksTranscript_IDs <- names(cufflinksExonsByTranscript) cufflinksTranscriptNames <- select(cuffLinksTxdb,keys=cufflinksTranscript_IDs, cols="TXNAME",keytype="TXID")