Web Services for Distributed Scientific Computing
In recent years, there has been a trend towards multi-disciplinary, multi-institutional, collaborative projects. In these projects, a number of research groups with disparate interests or skills come together for a limited time to solve a common problem. It is not uncommon for a research group to be simultaneously involved in a number of such efforts (our group is one such example). This situation presents a challenging problem to the computational and computer scientist, namely how best to deploy software. The traditional approach has been for a project to select one platform, or a small number of platforms, to support and then require everyone to port their codes to those platforms. When a group is involved in several such projects, the number of platforms it must support can grow quite quickly. In the end, a group can find itself spending more time porting and maintaining software than researching and developing it.
Fortunately for the researcher, the business world has encountered a similar problem. In the brave new world of e-commerce, it is thought that, in order to survive, a business must be able to respond quickly to emerging opportunities. One approach for doing this is to organize a business around a number of small teams. These teams can be loosely combined in order to quickly address emerging trends, disbanded, and then recombined with other teams. Teams from different companies can be combined to address shared problems or interests. Instead of developing monolithic business applications, a new form of software organization is required for this environment.
This new form appears to be Web Services. Web Services are software modules that are accessible using standard Internet protocols and formats, such as HTTP, XML and SOAP. Web Services can be discovered using UDDI and combined to build larger applications using WSFL and BPEL4WS. Although Web Services are a relatively new concept and their protocols are still being developed, there are already a number of commercial systems for providing Web Services.
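To make the protocol stack above concrete, here is a minimal sketch of what a SOAP 1.1 call looks like on the wire: an XML envelope that would be sent as the body of an HTTP POST. The service namespace, the operation name and the parameter are hypothetical and chosen only for illustration; nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:mesher"  # hypothetical service namespace (assumption)


def build_request(operation, params):
    """Return a SOAP 1.1 envelope (bytes) that calls `operation` with `params`."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def http_headers(operation):
    """Headers a SOAP 1.1 client sends with the HTTP POST (action URI is illustrative)."""
    return {
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": f'"{SERVICE_NS}#{operation}"',
    }


def parse_request(payload):
    """Recover (operation, params) from an envelope, as a server would."""
    envelope = ET.fromstring(payload)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body is the call
    operation = call.tag.split("}", 1)[1]           # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


if __name__ == "__main__":
    payload = build_request("GenerateMesh", {"elementSize": 0.5})
    print(payload.decode("utf-8"))
    print(parse_request(payload))
```

Because the message is plain XML over HTTP, a client and a server written in different languages on different platforms can exchange it, which is the property the rest of this page relies on.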
This is what makes Web Services so attractive for computer and computational science. Unlike Grid frameworks, which are complex and not available for a wide range of platforms, the infrastructure for Web Services is relatively simple and widely available. Using this infrastructure, it is possible for multi-disciplinary, multi-institutional projects to build complex applications without having to deploy their codes on a number of platforms. There are a number of advantages to using Web Services in this manner:
- Interoperability is achieved without having to run all of the codes on a single platform at a single location.
- A module can be used without its source code being exposed. This is important in certain cases that involve intellectual property rights.
- In order to fix a bug, only one location needs to be updated.
- We have found that the overhead of using our Web Services approach is very small, usually less than 10%.
Within the Adaptive Software Project, we have taken this approach for developing state-of-the-art simulations for studying problems in chemically reacting fluid flow, thermal dynamics, structural mechanics and fracture mechanics at a number of different length scales. The work flow and geographic distribution of Web Services for one of our problems, the Basic Pipe problem, is shown below.
This work is being done as part of the Adaptive Software Project.
SOAP::Clean
As part of the project, we have developed a new Web Services framework, SOAP::Clean. Our framework differs from other Web Services frameworks in that it is designed for the particular needs of computer and computational scientists who need to deploy legacy codes. However, because it uses the standard Web Services protocols, our framework is able to interoperate with other Web Services servers and clients.
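The general idea behind deploying a legacy code as a service can be sketched as treating the executable as a function from input files to output files, which a service layer can then expose. The sketch below is in Python and is not SOAP::Clean's interface (SOAP::Clean is a Perl module); the function name `run_legacy`, the file names and the stand-in "legacy" command are all assumptions made for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_legacy(command, inputs, output_names):
    """Run a legacy command in a scratch directory and collect its output files.

    command: argument list for the executable.
    inputs: dict mapping file name -> bytes, written before the run.
    output_names: names of files to read back after the run.
    """
    with tempfile.TemporaryDirectory() as scratch:
        workdir = Path(scratch)
        for name, data in inputs.items():
            (workdir / name).write_bytes(data)
        # check=True raises CalledProcessError if the legacy code exits non-zero.
        subprocess.run(command, cwd=workdir, check=True)
        return {name: (workdir / name).read_bytes() for name in output_names}


# Stand-in for a legacy code (hypothetical): reads in.txt, writes out.txt upper-cased.
LEGACY = [
    sys.executable,
    "-c",
    "open('out.txt', 'w').write(open('in.txt').read().upper())",
]

if __name__ == "__main__":
    print(run_legacy(LEGACY, {"in.txt": b"mesh"}, ["out.txt"]))
```

The legacy code itself is not modified or recompiled; only the thin wrapper needs to know which files it reads and writes.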
Client-side Scripting Languages
Once the components of an application have been deployed as Web Services, the next task is to build the simulation out of these components. We are designing a scripting language that can specify the dataflow between these components. The language must be elegant enough to allow users to build their simulations conveniently. It must be expressive enough to specify the interactions that users would want between the components of their simulation. Finally, it should be simple enough that the compiler can extract the dependence information and generate an efficient parallel schedule. We are looking at compiler and program analysis techniques to tackle these problems effectively.
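As an illustration of what "extracting dependence information and generating a parallel schedule" can mean, the following sketch groups the components of a dataflow graph into stages whose members do not depend on one another and could therefore be invoked concurrently. It is written in Python rather than in the scripting language under design, and the component names are hypothetical.

```python
def parallel_schedule(dataflow):
    """Group components into stages; components within a stage are independent.

    dataflow: dict mapping component name -> list of components it reads from.
    Returns a list of stages, each a sorted list of component names.
    """
    remaining = {name: set(deps) for name, deps in dataflow.items()}
    done = set()
    stages = []
    while remaining:
        # A component is ready once everything it reads from has been produced.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(
                "cyclic or unknown dependences among: " + ", ".join(sorted(remaining))
            )
        stages.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return stages


# Hypothetical components of a simulation and the data they consume.
PIPELINE = {
    "geometry": [],
    "mesher": ["geometry"],
    "fluid_solver": ["mesher"],
    "thermal_solver": ["mesher"],
    "fracture_solver": ["fluid_solver", "thermal_solver"],
}

if __name__ == "__main__":
    for number, stage in enumerate(parallel_schedule(PIPELINE), start=1):
        print(number, stage)
```

In this example the two solvers that only need the mesh land in the same stage, so a scheduler could call both services at once. A real compiler would also have to weigh data transfer costs between sites, which this sketch ignores.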
An important issue that crops up in running distributed applications is fault tolerance. One of the goals of this research is to come up with a distributed environment where programs written in this language can continue to run in the presence of failures. We aim to look at ways of efficiently checkpointing the state of the running simulation and also have distributed control of the simulation so that there is no single point of failure.
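One simple form of checkpointing is to record the result of each completed step so that a restarted run skips work that has already been done. The sketch below shows only that idea, under the assumption that step results are small and JSON-serializable; the function names are hypothetical, and it does not address distributed control or the single point of failure mentioned above.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved results, or an empty dict when no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_checkpoint(path, results):
    """Write to a temporary file, then rename it over the old checkpoint.

    The rename means a crash during the write leaves the previous
    checkpoint intact rather than a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(results, f)
    os.replace(tmp, path)


def run_with_checkpoints(steps, path):
    """Run (name, function) steps in order, skipping any already checkpointed.

    Each function receives the dict of results so far and returns a
    JSON-serializable value.
    """
    results = load_checkpoint(path)
    for name, func in steps:
        if name in results:
            continue  # completed in an earlier run
        results[name] = func(results)
        save_checkpoint(path, results)
    return results
```

If a run fails partway through, calling `run_with_checkpoints` again with the same checkpoint path resumes from the first step that has no recorded result.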
Publications:
- A Distributed System Based on Web-services for Computational Science Simulations. 20th ACM International Conference on Supercomputing (ICS), 2006
- Computational Science Simulations based on Web Services. International Conference on Computational Science, 06/23/2003
- Post-Cluster Computing and the Next Generation of Scientific Applications. Sixth World Multiconference on Systemics, Cybernetics and Informatics, 07/14/2002
- Parallel FEM Simulation of Crack Propagation on the AC3 Velocity Cluster. 7th Workshop on Cluster-Based Computing, 05/06/2000
- Parallel FEM Simulation of Crack Propagation - Challenges, Status, and Perspectives. 7th International Workshop on Solving Irregularly Structured Problems in Parallel, 05/05/2000
Software
- SOAP::Clean - a Perl module for exposing legacy applications as web services





