eagle-i Dartmouth College
Discovery: Dartmouth Initiative for SuperCOmputing Ventures in Education and Research


Discovery is a 2600+ core Linux HPC cluster available to the Dartmouth community, Geisel School of Medicine and DHMC.

The cluster comprises both AMD and Intel processors. Nodes range from 8 processor cores and 32GB of memory to 48 processor cores and 192GB of memory. In total, the cluster has 9.4TB of memory and 200TB of disk space delivered by an Isilon storage cluster. Many of the compute nodes are interconnected via a high-speed InfiniBand network. For more information on the node specs, please visit the cluster details page.
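When sizing a job, it can help to translate the node figures above into memory per core. The short Python sketch below does that arithmetic for the smallest and largest node shapes quoted above. The cluster-wide figure is only a rough estimate: it assumes 1TB = 1000GB and treats "2600+" as exactly 2600 cores, so the real average is somewhat lower.

```python
def gb_per_core(memory_gb, cores):
    """Return the memory available per core, in GB."""
    return memory_gb / cores


# Node shapes quoted in the cluster description.
smallest_node = gb_per_core(32, 8)     # 8 cores, 32GB
largest_node = gb_per_core(192, 48)    # 48 cores, 192GB

# Rough cluster-wide average (assumes 9.4TB = 9400GB and exactly 2600 cores).
cluster_average = gb_per_core(9400, 2600)

if __name__ == "__main__":
    print(f"smallest node: {smallest_node:.1f} GB/core")
    print(f"largest node:  {largest_node:.1f} GB/core")
    print(f"cluster-wide:  about {cluster_average:.1f} GB/core")
```

Both node shapes work out to 4GB per core, so a job that needs substantially more memory per process would have to request more cores than it actually uses.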

Discovery is built on CentOS 6 and provides C and FORTRAN compilers as well as third-party applications.

Jobs on Discovery are submitted to a Moab/TORQUE scheduling system. This system allows for more equitable allocation of resources and optimizes CPU usage.
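As an illustration of what a submission to a TORQUE-style scheduler can look like, the Python sketch below renders a minimal batch script. The helper name `render_pbs_script`, the job name, the resource request, the walltime, and the program path are all placeholders chosen for this example, not Discovery defaults. Queue names and site-specific limits are not covered here and would need to come from the cluster documentation.

```python
def render_pbs_script(job_name, nodes, ppn, walltime, command):
    """Return the text of a minimal TORQUE/PBS batch script.

    All argument values are supplied by the caller; nothing here
    reflects Discovery's actual queues or limits.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name shown by the scheduler
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # maximum run time, HH:MM:SS
        "#PBS -j oe",                        # merge stderr into stdout
        "cd $PBS_O_WORKDIR",                 # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: one node, 8 cores, one hour.
    print(render_pbs_script("example_job", 1, 8, "01:00:00", "./my_program"))
```

The resulting text can be saved to a file and handed to TORQUE's `qsub` command. The scheduler then decides when and where the job runs.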

Discovery accepts connections over SSH and SFTP, or from clients that use these protocols.
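For readers who have not used these protocols from a terminal, the Python sketch below assembles the standard OpenSSH command lines for a login session and a file-transfer session. The user name and the host `discovery.example.edu` are hypothetical placeholders; the real host name is not given on this page and comes with an approved account.

```python
import shlex


def connection_commands(user, host):
    """Build ssh and sftp argument lists for the given user and host."""
    target = f"{user}@{host}"
    return {
        "ssh": ["ssh", target],    # interactive login shell
        "sftp": ["sftp", target],  # interactive file transfer
    }


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    commands = connection_commands("your_netid", "discovery.example.edu")
    for name, argv in commands.items():
        print(f"{name}: {shlex.join(argv)}")
```

Graphical SFTP clients take the same user and host values in their connection dialogs.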

  • To get started using Discovery, request an account via Discovery’s account request form.

  • We encourage research groups who have compute-intensive applications to reach out and see what Discovery can offer. For questions please email research.computing@dartmouth.edu.




      • ADF ( Algorithmic software suite )

        ADF (Amsterdam Density Functional) is a Fortran program for calculations on atoms and molecules (in gas phase or solution). It can be used for the study of such diverse fields as molecular spectroscopy, organic and inorganic chemistry, crystallography and pharmacochemistry.

      • afni ( Algorithmic software suite )

        AFNI (which might be an acronym for Analysis of Functional NeuroImages) is a set of C programs for processing, analyzing, and displaying functional MRI (FMRI) data - a technique for mapping human brain activity. It runs on Unix+X11+Motif systems, including SGI, Solaris, Linux, and Mac OS X. It is available free (in C source code format, and some precompiled binaries) for research purposes.

      • AutoDock ( Software )

        AutoDock is a suite of automated docking tools. It is designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure. AutoDock actually consists of two main programs: AutoDock performs the docking of the ligand to a set of grids describing the target protein; AutoGrid pre-calculates these grids. In addition to using them for docking, the atomic affinity grids can be visualised. This can help, for example, to guide organic synthetic chemists in designing better binders.

      • BLAST+ ( Software )

        Sequence similarity searching is one of the more important bioinformatics activities and often provides the first evidence for the function of a newly sequenced gene or piece of sequence. Basic Local Alignment Search Tool (BLAST) is probably the most popular similarity search tool.

      • Bowtie ( Software )

        Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. It aligns 35-base-pair reads to the human genome at a rate of 25 million reads per hour on a typical workstation. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: for the human genome, the index is typically about 2.2 GB (for unpaired alignment) or 2.9 GB (for paired-end or colorspace alignment). Multiple processors can be used simultaneously to achieve greater alignment speed. Bowtie can also output alignments in the standard SAM format, allowing Bowtie to interoperate with other tools supporting SAM, including the SAMtools consensus, SNP, and indel callers.

      • Cufflinks ( Software )

        Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.

      • Gaussian ( Software )

        Gaussian provides state-of-the-art capabilities for electronic structure modeling. Gaussian is licensed for a wide variety of computer systems. All versions of Gaussian contain every scientific/modeling feature, and none imposes any artificial limitations on calculations other than your computing resources and patience.

      • HDF5 ( Algorithmic software suite )

        The HDF5 technology suite is designed to organize, store, discover, access, analyze, share, and preserve diverse, complex data in continuously evolving heterogeneous computing and storage environments. HDF5 supports all types of data stored digitally, regardless of origin or size. Petabytes of remote sensing data collected by satellites, terabytes of computational results from nuclear testing models, and megabytes of high-resolution MRI brain scans are stored in HDF5 files, together with metadata necessary for efficient data sharing, processing, visualization, and archiving.

      • IDL ( Algorithmic software component )

        IDL, the Interactive Data Language, is the ideal software for data analysis, visualization, and cross-platform application development. IDL integrates a powerful, array-oriented language with numerous mathematical analysis and graphical display techniques, thus giving you incredible flexibility.

      • Jaguar ( Algorithmic software component )

        Jaguar is a high-performance ab initio package for both gas and solution phase simulations, with particular strength in treating metal containing systems, making it the most practical quantum mechanical tool for solving real-world problems.

      • Maple ( Software )

        Maple is an interactive computer algebra system. Maple can algebraically manipulate unbounded integers, exact rational numbers, real numbers with arbitrary precision, symbolic formulae, polynomials, sets, lists, equations, arrays, vectors, and matrices. It can solve systems of equations and differentiate and integrate expressions.

      • Mathematica ( Algorithmic software suite )

        At the core of Mathematica is its highly developed symbolic language, which unifies a broad range of programming paradigms, and uses its unique concept of symbolic programming to add a new level of flexibility to the very concept of programming.

      • MATLAB ( Algorithmic software suite )

        MATLAB® is a high-level language and interactive environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java®. You can use MATLAB for a range of applications, including signal processing and communications, image and video processing, control systems, test and measurement, computational finance, and computational biology.

      • ParaView ( Algorithmic software suite )

        ParaView is a scalable, open-source visualization application. When run in a parallel environment, ParaView can process very large data in a wide variety of data formats (structured and unstructured, time varying and static). It provides a comprehensive suite of visualization algorithms and supports many different file formats for both loading and exporting datasets. This application is extensible, so new algorithms can be easily added.

      • R ( Software )

        R is a language which bears a passing resemblance to the S language developed at AT&T Bell Laboratories. It provides support for a variety of statistical and graphical analyses. R is a true computer language which contains a number of control-flow constructions for iteration and alternation. It allows users to add additional functionality by defining new functions.

      • structure ( Algorithmic software component )

        The program structure implements a model-based clustering method for inferring population structure using genotype data consisting of unlinked markers.

        The program uses multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs.

      • TopHat ( Algorithmic software component )

        TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

        TopHat runs on Linux and OS X.

      • VMD ( Algorithmic software component )

        VMD is a molecular graphics program designed for the interactive visualization and analysis of biopolymers such as proteins, nucleic acids, lipids, and membranes.

      Last updated: 2015-06-08T22:06:57.823-04:00

      Copyright © 2016 by the President and Fellows of Harvard College
      The eagle-i Consortium is supported by NIH Grant #5U24RR029825-02 / Copyright 2016