New GPU Solutions for Grid Computing and other Supercomputing Needs

We’ve just launched two new GPU products, along with our new GPU supercomputing solutions section! GPU computing is the use of a graphics processing unit (GPU) for computing purposes, from general-purpose to supercomputing tasks. With demand for greater computing performance constantly increasing across the board, supercomputing increasingly draws upon a hybrid model that integrates the roles of GPUs and CPUs.

If you are looking for a performance boost in your cluster or grid computing, ICC’s GPU solutions may be the perfect fit. The scalability requirements of grid computing in particular make GPU solutions an ideal IT asset.

At ICC, we integrate our top technology with the NVIDIA® CUDA™ GPU architecture to give you the simplest way to purchase, use, and manage a GPU-based cluster. GPU supercomputing has never been easier than with ICC NovaServ™ solutions, which provide optimal value by minimizing the time you spend dealing with the technology and allowing you to focus on what you do best.

Grid Computing and other GPU Supercomputing Solutions from ICC

Advanced Manufacturing Partnership (AMP) to spur innovation

Photo of sun behind a factory

On June 24, President Obama announced the Advanced Manufacturing Partnership (AMP) between the federal government, academia, and businesses to help stimulate the manufacturing sector of the U.S. economy. We have been following the so-called “Missing Middle” of small- to medium-sized manufacturers (SMMs) on this blog, and I’d like to describe some of the recent initiatives to engage this high-potential segment of our economy.

Speaking at Carnegie Mellon University, Obama said that AMP would allocate $500 million of federal money to help make U.S. manufacturing more competitive around the world.

Inspired by a report drafted by the President’s Council of Advisors on Science and Technology (PCAST), which found that there are market failures in the advanced manufacturing space that need to be overcome by government intervention, AMP will focus on five initiatives:

  1. Manufacturing for national security
  2. Materials science
  3. Robotics
  4. Energy efficiency
  5. Developing partnerships and consortia between government, universities, and industry

Cluster computing for small and medium businesses

Image of an ICC Modular Server (IMS)

We’ve updated our high-density servers webpage with a solution tailored to small and medium businesses (SMB): the ICC Modular Server (IMS). Built on principles similar to those of blade servers, the IMS is aimed at organizations that can benefit from upgrading their current (limited) server resources to an entry-level cluster solution. Let’s take a look at the advantages and disadvantages of this product to get an idea of whether it’s right for you.

The ICC Modular Server (IMS) contains compute servers, a separate storage area network (SAN), and management modules all housed and centrally controlled within one 6U enclosure. As the name implies, all components are “modular” (so, easy to expand or replace) and hot-swappable, and cabling is absolutely minimal (just like blade servers).

There are several advantages of this type of solution over, say, having multiple 1U rack servers networked together in one’s office:

  • The IMS’s modular design makes it much easier to maintain
  • The complete IMS system has about the same low noise level as a 1U server (though it houses six compute nodes)
  • The drives are separate from the compute nodes, which allows all six of the compute nodes to access the up to 18TB of storage in the SAN

National Center for Supercomputing Applications turns 25

Miles south of Chicago, amid the wind-swept flatlands of central Illinois, is the home of perhaps the world’s next fastest supercomputer. The National Center for Supercomputing Applications (NCSA) at the University of Illinois in Urbana-Champaign, which is co-developing the Blue Waters supercomputer, is today at the forefront of petascale computing. But 25 years ago, when the Center revved up its first machine, the computing world looked much different.

This article looks back at the highlights in NCSA’s 25-year history, which illustrate well how far computing technology has come in such a short span of time and how many innovations that we now take for granted were made possible by supercomputers. These highlights are based on the slideshow posted on NCSA’s website. (In the interest of full disclosure, ICC works with NCSA on the Dark Energy Survey, so we’re a little biased.)

The first supercomputer at NCSA, which became operational in 1986, was a dual-processor machine that performed at about 400 megaflops. In comparison, the upcoming Blue Waters supercomputer will have 300,000 CPUs and a peak performance of 10 petaflops (that’s 25 million times faster than NCSA’s first supercomputer).
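The “25 million times” figure follows directly from the two numbers quoted above. As a quick check (the variable names here are just for illustration):

```python
# Figures quoted in the paragraph above.
first_machine_flops = 400e6   # about 400 megaflops (1986)
blue_waters_flops = 10e15     # projected peak of 10 petaflops

# Ratio of the two machines' floating point rates.
speedup = blue_waters_flops / first_machine_flops
print(f"Blue Waters vs. first NCSA machine: {speedup:,.0f}x")
```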

In 1998, NCSA came out with its first “cluster”, which connected 128 workstations together and was known as the NT Supercluster. This aggregation of towers looks somewhat comical today, and it wasn’t long before rack servers replaced these bulky form factors.

Graph 500, Green500, HPCC, and SPEC: Alternative benchmarks for high-performance computing

Image of Supermicro SuperRack

Supercomputers have become a vital part of almost any innovative project undertaken by collaborative teams in the developed world. Server clusters can be found anywhere from the offices of small businesses to compartments in U.S. Navy submarines.

So which are the fastest supercomputers on earth? The usual measurement for high-performance computing (HPC) clusters is the TOP500 ranking, which is based on the High Performance LINPACK (HPL) benchmark. LINPACK stands for “linear equations software package”, and the benchmark measures how fast a supercomputer can solve a system of linear equations. The results are reported in units of billions of floating point operations per second (GFLOPS).
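To make the idea concrete, here is a toy sketch of a LINPACK-style measurement in plain Python. It is not HPL itself, which is a heavily optimized parallel code. It only shows the principle: time the solution of a dense system of n equations, then divide an operation count by the elapsed time. The function names are hypothetical, and the operation count of 2/3·n³ + 2·n² is an assumed, commonly quoted estimate for a dense solve.

```python
import random
import time


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def linpack_style_gflops(n, seed=0):
    """Time one dense solve of size n and convert the result to GFLOPS."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_linear_system(a, b)
    elapsed = time.perf_counter() - start
    # Assumed operation count for a dense solve: 2/3 n^3 + 2 n^2.
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    # Largest residual |a x - b| as a sanity check on the answer.
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return flops / elapsed / 1e9, residual


gflops, residual = linpack_style_gflops(120)
print(f"{gflops:.4f} GFLOPS, max residual {residual:.2e}")
```

Interpreted Python on one core will report a tiny fraction of a GFLOPS, which is the point of the comparison: the machines on the TOP500 list run the same kind of problem at vastly larger sizes, spread across thousands of processors.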

The High Performance LINPACK metric has long been the established standard for measuring computing performance, with intense competition worldwide for the lead spot in the TOP500. But some scientists criticize the TOP500 ranking for giving an incomplete picture of performance. The risk, as Mark Anderson describes in an article in IEEE Spectrum magazine, is that it motivates computer hardware manufacturers to develop less-effective technologies.

GPU workstation sale (and other news)

Image of Supermicro workstation

Wow, this is the first update in a while on the ICC blog. We have been working on several web-based projects that have been keeping us busy, and I would like to highlight some of them (and other news) in this post.

Website and product news

First of all, as you may have noticed, the HPC by ICC section of our site launched earlier this month. It describes ICC solutions for high-performance computing clusters, with an outline of the Platform Computing HPC Suite, an industry-leading cluster management software package, as well as a diagram that explains common cluster components. Our goal is to open up high-performance computing to industries that have been slow to adopt it, even though HPC may save them a lot of money in the long term and help them stay competitive. If you think you could benefit from an upgrade to your IT infrastructure, feel free to contact us for a free consultation.

At SC10, Supermicro unveiled their GPU SuperBlade server modules (SBI-7126TG), which will offer perhaps the highest-density CPU-GPU computing power available on the market. A 42U rack composed of six 7U blade enclosures, each with ten GPU SuperBlade modules, can carry 120 CPUs and 120 GPUs. For comparison, a rack with 42 standard dual-processor 1U servers would have 84 CPUs and no GPUs. We will have these high-density server products available on our site soon after they are released. Read more about them in the Supermicro press release or on this podcast interview with Tau Leng, GM of Supermicro, by insideHPC.
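The rack density comparison can be worked through in a few lines. Note that the per-module counts below are inferred rather than stated in the announcement: 120 CPUs and 120 GPUs spread over 60 modules implies two of each per module.

```python
# GPU SuperBlade rack, using the figures quoted above.
ENCLOSURES_PER_RACK = 6
ENCLOSURE_HEIGHT_U = 7
MODULES_PER_ENCLOSURE = 10
# Inferred, not stated in the post: 120 CPUs and 120 GPUs over
# 60 modules implies two CPUs and two GPUs per module.
CPUS_PER_MODULE = 2
GPUS_PER_MODULE = 2

rack_height_u = ENCLOSURES_PER_RACK * ENCLOSURE_HEIGHT_U
modules = ENCLOSURES_PER_RACK * MODULES_PER_ENCLOSURE
blade_cpus = modules * CPUS_PER_MODULE
blade_gpus = modules * GPUS_PER_MODULE

# Comparison rack: 42 standard dual-processor 1U servers, no GPUs.
standard_cpus = 42 * 2

print(f"Blade rack: {rack_height_u}U, {blade_cpus} CPUs, {blade_gpus} GPUs")
print(f"1U rack:    42U, {standard_cpus} CPUs, 0 GPUs")
```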

China builds the world’s fastest supercomputer

Photo of Tianhe-1A supercomputer courtesy of NVIDIA.com

After a run of almost a year, the Jaguar supercomputer at Oak Ridge National Laboratory in Tennessee has relinquished its title as the world’s fastest computer. This honor now belongs to the Tianhe-1A supercomputer located in the National Supercomputing Center in Tianjin, China.

Tianhe-1A is expected to officially become the leader of the TOP500.org list of the world’s fastest supercomputers sometime in mid-November. It clocked an impressive 2.507 petaflops on the LINPACK benchmark, which is about the sum of the performance of supercomputers #6 to #10 on the TOP500 list, according to insideHPC. Jaguar, now the second most powerful supercomputer in the world, posted a LINPACK performance of about 1.75 petaflops.

Although Tianhe-1A may re-ignite the anxiety in the West that usually accompanies news of great achievements from East Asia, this is not the first time that America or Europe has lost the #1 place on the TOP500. In 2002, Japan captured the top spot with its Earth Simulator (ES) supercomputer, which remained the world’s fastest until September of 2004, when IBM’s Blue Gene/L, a system built for Lawrence Livermore National Laboratory, surpassed it. The quasi-geopolitical competition for computing power is far from over, but China’s ascendancy is actually one of the less interesting things about Tianhe-1A.

Tianhe-1A can potentially usher in a new era in “personal supercomputing”. It is the first leader of the TOP500 to make extensive use of GPUs (Graphics Processing Units). In fact, it comprises 7,168 NVIDIA Tesla M2050 GPUs and 14,336 Intel CPUs. In comparison, Jaguar has 37,376 AMD CPUs and no GPUs.

OpenSFS formed to lead development of Lustre file system

Photo accessed on Wikipedia (http://en.wikipedia.org/wiki/File:Open_HDD_and_SSD.JPG)

OpenSFS (Open Scalable File Systems, Inc.), a non-profit corporation, has recently been formed to continue development of the Lustre file system alongside Oracle, which owns and maintains the Lustre code even though it is an open-source technology.

Lustre has been and continues to be used by many high-performance clusters for managing storage systems. When Oracle acquired Sun Microsystems, the former owner of Lustre, earlier this year, it became the keeper of the Lustre source code. Oracle has been developing Lustre for its own hardware, but there is a large sector of Lustre users who have an interest in continuing to add their own functionality to the file system. Enter OpenSFS. While these users are explicitly not seeking to branch off from Oracle’s Lustre project (they are doing the exact opposite, actually, in hoping to have their modifications adopted as future updates to Oracle Lustre), they want to develop the file system for non-Oracle deployments.

Membership in the OpenSFS partnership, according to an HPC Wire article, is based upon contribution. For $5,000 a year, one can become a participant on a working group in OpenSFS; for $50,000 a year, one can manage a working group; for $500,000 a year, one can become a member of the board of executives. Currently about 20 organizations have become members of OpenSFS in one form or another. The founding members were Cray, DataDirect Networks, Oak Ridge National Laboratory, and Lawrence Livermore National Laboratory (Lustre updates developed by OpenSFS will be tested first on supercomputers at the latter two locations).

This is an exciting development because it signals a strong effort to continue development of Lustre for applications beyond those needed by Oracle. Hopefully this will become an active community that will keep this heavily-deployed open-source project alive for many years to come.

How a SAS switch can improve storage management

LSI SAS 6Gb/s switch and accessories

Last week, LSI announced their release of “the industry’s first 6Gb/s SAS switch”. The switch offers unique opportunities for cluster managers to improve the architecture of their storage systems.

The value of the SAS switch lies in transforming a cluster from a NAS (network-attached storage) structure into a DAS (direct-attached storage) structure. With DAS, storage data does not have to be converted from the SAS protocol to the network protocol (Ethernet or InfiniBand) and back to SAS. The bottleneck of the middle step is eliminated: the LSI switch allows all I/O of data to happen through just the SAS protocol. This is especially useful for clusters that have, or plan to upgrade to, 6Gb/s RAID controllers, since their throughput will be higher when connected to a 6Gb/s switch rather than to a network.

Another advantage of switching to a DAS configuration for a cluster is that it moves the RAID controllers from the storage nodes to the compute nodes. In a NAS cluster, each storage node typically has its own RAID controller, which communicates with the compute nodes through a network. In a DAS cluster with a SAS switch, the storage nodes are JBODs (“Just A Bunch Of Drives”, essentially hard drive warehouses without other computing components within their chassis) that are all accessed by RAID controllers located directly inside the compute nodes.

This configuration separates the RAID controllers from the storage drives and centralizes each group for simpler management and improved performance. The cluster administrator can now let any number of RAID controllers access any number of drives on separate JBOD-based storage. The process that allows this kind of interaction is known as SAS zoning and is illustrated in the diagram below:

Diagram showing the DAS configuration of a cluster with a SAS switch, RAID controllers located on the compute nodes, and SAS zoning of the JBOD storage nodes.
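For readers who think better in code, SAS zoning can be pictured as a permission table kept by the switch. The sketch below is only a toy model under that assumption, not LSI’s management interface: the class, the method names, and the node and drive names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ZonedSasSwitch:
    """Toy model: a controller sees only drives in zones it was granted."""
    drive_zone: dict = field(default_factory=dict)        # drive -> zone
    controller_zones: dict = field(default_factory=dict)  # controller -> zones

    def attach_drive(self, drive, zone):
        """Place a JBOD drive into a zone."""
        self.drive_zone[drive] = zone

    def grant(self, controller, zone):
        """Allow a RAID controller (in a compute node) to access a zone."""
        self.controller_zones.setdefault(controller, set()).add(zone)

    def visible_drives(self, controller):
        """Return the drives this controller is allowed to see."""
        zones = self.controller_zones.get(controller, set())
        return sorted(d for d, z in self.drive_zone.items() if z in zones)


switch = ZonedSasSwitch()
# Two JBOD enclosures, each contributing drives to its own zone.
for slot in range(3):
    switch.attach_drive(f"jbod1-drive{slot}", "zone-a")
    switch.attach_drive(f"jbod2-drive{slot}", "zone-b")

# RAID controllers live in the compute nodes and are granted zones.
switch.grant("node1-raid", "zone-a")
switch.grant("node2-raid", "zone-a")
switch.grant("node2-raid", "zone-b")

print(switch.visible_drives("node1-raid"))
print(switch.visible_drives("node2-raid"))
```

In this model, node1’s controller sees only the drives in the first JBOD, while node2’s controller sees the drives in both, all without the drives being tied to any one storage node’s own controller.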

For more detailed information about the various uses of the LSI 6Gb/s SAS switch, read their white paper about this product. As storage technology continues to evolve, new solutions such as a DAS cluster configuration with a 6Gb/s SAS switch are helping overcome the various I/O bottlenecks that hamper computing performance.

HPC and the life sciences

Connected network cables

This week, a team from our company visited a large laboratory located in the Chicago area. IT representatives there told us how a major focus for them has been migrating their computing resources from a model of individual workgroups using separate clusters to a shared private cloud that all research teams in the facility can access for running their jobs. This shift to private clouds for getting the most out of dedicated clusters is a hot topic of conversation in the HPC world.

HPC in the Cloud recently published an article responding to a case study written by Platform Computing about the implementation of a private cloud at the Harvard Medical School. Both are worth a read if you are interested in the challenges encountered by small- and medium-sized life sciences organizations when they try to adopt HPC clusters.

HPC holds much promise for organizations such as the Harvard Medical School. With middleware such as that from Platform Computing (we are biased, I must admit, since this is what HPC clusters by ICC deploy as well), it is getting easier to operate an HPC cluster with hosts running different operating systems and applications. It used to be that this multiplicity of software on the same cluster would cause extensive compatibility and usability problems, but not so much anymore. End-users in the life sciences (such as medical researchers) are benefiting from computing applications that are productive and easy to use.

So Harvard Medical School, as the HPC in the Cloud article describes, has migrated from an inefficient computing model of unshared individual computers scattered across various laboratories to a centralized private cloud that can be accessed by any of those users and managed as one unit. Simplifying maintenance while maximizing accessibility to HPC resources by medical school staff is most likely going to save money and increase the pace of innovation in the long run.

While this is a hopeful case study that sheds light on how other organizations can pool their computing resources to great effect, challenges remain for spreading this model to other small- and medium-size laboratories and businesses. For one, private medical companies are heavily regulated by the government and their IT infrastructure has to incorporate many time-consuming applications to store detailed records.

HPC is becoming more affordable and easier to use, but software has to continue evolving to accommodate the particular context of each industry. Only then will the life sciences (not to mention other markets) have a truly turn-key HPC solution that can benefit labs and private companies of every size.