Source: BMC Bioinformatics

Authors: Steven L Salzberg; Wei Zhang; Xiangyu Deng; et al.

Background

Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome.

Results

This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage.

Conclusion

PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on a single microarray chip. These unique pan-genome tiling arrays provide maximum flexibility for the analysis of both known and uncharacterized strains.

    Source: BMC Bioinformatics

    Authors: Michael K Udvardi; Yuhong Tang; Patrick X Zhao; et al.

    Background

    Legumes (Leguminosae or Fabaceae) play a major role in agriculture. Transcriptomics studies in the model legume species, Medicago truncatula, are instrumental in helping to formulate hypotheses about the role of legume genes. With the rapid growth of publicly available Affymetrix GeneChip Medicago Genome Array data from a great range of tissues, cell types, growth conditions, and stress treatments, the legume research community desires an effective bioinformatics system to aid efforts to interpret the Medicago genome through functional genomics. We developed the Medicago truncatula Gene Expression Atlas (MtGEA) web server for this purpose.

    Description

    The Medicago truncatula Gene Expression Atlas (MtGEA) web server is a centralized platform for analyzing the Medicago transcriptome. Currently, the web server hosts gene expression data from 156 Affymetrix GeneChip® Medicago genome arrays in 64 different experiments, covering a broad range of developmental and environmental conditions. The server enables flexible, multifaceted analyses of transcript data and provides a range of additional information about genes, including different types of annotation and links to the genome sequence, which help users formulate hypotheses about gene function. Transcript data can be accessed using Affymetrix probe identification number, DNA sequence, gene name, functional description in natural language, GO and KEGG annotation terms, and InterPro domain number. Transcripts can also be discovered through co-expression or differential expression analysis. Flexible tools to select a subset of experiments and to visualize and compare expression profiles of multiple genes have been implemented. Data can be downloaded, in part or full, in a tabular form compatible with common analytical and visualization software. The web server will be updated on a regular basis to incorporate new gene expression data and genome annotation, and is accessible at: http://bioinfo.noble.org/gene-atlas/.

    Conclusions

    The MtGEA web server hosts a well-managed, rich data set and offers data retrieval and analysis tools within the web platform. It has proven to be a powerful resource that lets plant biologists effectively and efficiently identify Medicago transcripts of interest from many angles, formulate hypotheses about gene function, and interpret the Medicago genome from a systematic point of view.

      Source: BMC Bioinformatics

      Authors: Gordon K Smyth; Steve Gerondakis; Matthew E Ritchie; et al.

      Background

      Illumina Sentrix-6 Whole-Genome Expression BeadChips are relatively new microarray platforms which have been used in many microarray studies in the past few years. These Chips have a unique design in which each Chip contains six microarrays and each microarray consists of two separate physical strips, posing special challenges for precise between-array normalization of expression values.

      Results

      None of the normalization strategies proposed so far for this microarray platform allow for the possibility of systematic variation between the two strips comprising each array. That this variation can be substantial is illustrated by a data example. We demonstrate that normalizing at the strip-level rather than at the array-level can effectively remove this between-strip variation, improve the precision of gene expression measurements and discover more differentially expressed genes. The gain is substantial, yielding a 20% increase in statistical information and doubling the number of genes detected at a 5% false discovery rate. Functional analysis reveals that the extra genes found tend to have interesting biological meanings, dramatically strengthening the biological conclusions from the experiment. Strip-level normalization still outperforms array-level normalization when non-expressed probes are filtered out.
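To make the strip-level idea concrete, here is a minimal sketch that normalizes each strip separately across arrays instead of normalizing whole arrays. It rests on assumptions the abstract does not state: quantile normalization is used as the method, and a given strip carries the same probes in the same order on every array. The function names are hypothetical. The paper supplies its own R code, which this Python sketch does not reproduce.

```python
def quantile_normalize(columns):
    """Map each column (one list of intensities per array) onto a shared distribution."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the r-th smallest value across columns.
    reference = [sum(col[r] for col in sorted_cols) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity (ties broken by index).
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


def normalize_by_strip(arrays):
    """arrays: one (strip1, strip2) pair of intensity lists per array.

    Each strip is normalized only against the same strip on the other arrays,
    so a systematic offset between the two strips of one array cannot leak
    into the reference distribution of the other strip.
    """
    strip1 = quantile_normalize([a[0] for a in arrays])
    strip2 = quantile_normalize([a[1] for a in arrays])
    return list(zip(strip1, strip2))
```

Array-level normalization would correspond, under the same assumptions, to calling `quantile_normalize` on the concatenation of both strips of each array.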

      Conclusion

      Plots are proposed which demonstrate how the need for strip-level normalization relates to inconsistent intensity range variation between the strips. Strip-level normalization is recommended for the preprocessing of Illumina Sentrix-6 BeadChips whenever the intensity range is seen to be inconsistent between the strips. R code is provided to implement the recommended plots and normalization algorithms.

        Source: BMC Bioinformatics

        Authors: Byungkook Lee, Chin-Hsien Tai, Changhoon Kim

        Background

        Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment.

        Results

        RSE uses SE (Seed Extension) at its core, an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the fractions of correctly aligned residues before and after refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis, and the NCBI's CDD root node set was used as the source of reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements were seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs.
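The evaluation metric mentioned above, the fraction of correctly aligned residues with or without an allowed shift error, can be sketched as follows. The exact definition used in the paper is not given in the abstract. This version assumes an alignment is a mapping from residue indices in one sequence to residue indices in the other, and that a pair counts as correct when the test alignment places the partner within `max_shift` positions of the reference. The name `aligned_fraction` is hypothetical.

```python
def aligned_fraction(test, reference, max_shift=0):
    """Fraction of reference-aligned residues that the test alignment reproduces.

    test, reference: dicts mapping residue index in sequence A to the aligned
    residue index in sequence B. max_shift=0 means no shift error is allowed.
    """
    if not reference:
        raise ValueError("reference alignment is empty")
    correct = sum(
        1
        for a, b in reference.items()
        if a in test and abs(test[a] - b) <= max_shift
    )
    return correct / len(reference)
```

Comparing the value before and after refinement, at `max_shift=0` and at a looser tolerance, shows whether a refinement step fixes exact placements or only near misses.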

        Conclusion

        RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment, or it can replace the traditional iterative refinement procedures based on a residue-level dynamic programming algorithm in many structure alignment programs.

          Source: BMC Bioinformatics

          Authors: Yuh-Show Tsai; Chung-Yung Chen; Jun-Jie Hong; et al.

          Background

          Internal ribosomal entry sites (IRESs) provide alternative, cap-independent translation initiation sites in eukaryotic cells. IRES elements are important factors in viral genomes and are also useful tools for bi-cistronic expression vectors. Most existing RNA structure prediction programs are unable to deal with IRES elements.

          Results

          We designed an IRES search system, named IRSS, to obtain better results for IRES prediction. IRSS uses a two-stage strategy built on RNA secondary structure prediction and comparison software. Two programs form its backbone: RNALfold, which predicts local RNA secondary structures by the minimum free energy method, and RNA Align, which compares the predicted structures. In a search of the complete viral genome database, IRSS had a low error rate and, with appropriate parameters, a sensitivity of up to 72.3%.

          Conclusion

          IRSS is freely available at http://140.135.61.9/ires/. In addition, all source code, precompiled binaries, examples, and documentation can be downloaded for local execution. This new search approach for IRES elements will provide a useful research tool for IRES-related studies.

            Source: BMC Bioinformatics

            Authors: Ritsert C Jansen; Rainer Breitling; Jingyuan Fu; et al.

            Background

            High-dimensional biomolecular profiling of genetically different individuals in one or more environmental conditions is an increasingly popular strategy for exploring the functioning of complex biological systems. The optimal design of such genetical genomics experiments in a cost-efficient and effective way is not trivial.

            Results

            This paper presents designGG, an R package for designing optimal genetical genomics experiments. A web implementation for designGG is available at http://gbic.biol.rug.nl/designGG. All software, including source code and documentation, is freely available.

            Conclusion

            DesignGG allows users to intelligently select and allocate individuals to experimental units and conditions such as drug treatment. The user can maximize the power and resolution of detecting genetic, environmental and interaction effects in a genome-wide or local mode by giving more weight to genome regions of special interest, such as previously detected phenotypic quantitative trait loci. This will help to achieve high power and more accurate estimates of the effects of interesting factors, and thus yield a more reliable biological interpretation of data. DesignGG is applicable to linkage analysis of experimental crosses, e.g. recombinant inbred lines, as well as to association analysis of natural populations.