Composition of Grid-enabled Web Services for Integration and Sharing of Distributed Resources through Web-based Interfaces

By

Niraj Kumar

Software Developer

Kolkata - 700026.

West Bengal, India

E-mail: nirajkumariitkgp@gmail.com

Contact No: (Mobile).

© 2006 Niraj Kumar. All rights reserved.

ABSTRACT

Traditional computer architectures and integration mechanisms are biased towards tightly coupled client-server designs and centralized databases. However, the phenomenal growth of web technologies and the emergence of the Web as the world's biggest database have pushed the ability of people and organizations to use these resources effectively to its limits. Computer scientists, researchers and organizations throughout the world are trying to develop mechanisms for making effective use of these distributed and heterogeneous resources to gain a competitive advantage in the market. In this study, we propose to develop grid-enabled web services exposed through web-based interfaces, which may considerably enhance our ability to share distributed and heterogeneous resources and services among institutions and organizations throughout the world.

Key Words: XML, SOAP, UDDI, WSDL, Web Services, Grid Computing, Java.

THE PROBLEM

To meet the challenges posed by a competitive world, there is a need for a system that can integrate the various distributed databases and web resources on the WWW and bring these heterogeneous sources of data onto a common platform, or into the form required by the user. There is also a need for a mechanism that allows resources and services to be shared among organizations and institutions. Through portal interfaces, clients should be able to obtain the required information in real time; the system should seamlessly integrate resources and services spread across the WWW and across organizations and institutions while hiding the implementation details from the user. The system should be highly flexible and scalable, and it should be possible to add or remove services at any time without affecting its performance. This requires that clients and servers be independent of each other.

Introduction

Today's organizations and institutions need to share and integrate various resources and services to meet the task-specific requirements of particular users and applications. This is a very challenging task, because these resources and services differ in structure, content and query language, return data in different formats, and run on different underlying hardware and networks. Furthermore, their interfaces and formats are prone to being updated without warning. As problems grow more complex, a typical user task requires interaction with a number of resources and services that are distributed over geographically distant locations and controlled by different agencies. Many of these agencies have security concerns and want to share only limited resources with others. They also need the flexibility to add or remove resources, so the proposed system should keep running even when some resources and services are no longer available or when new ones are added. This requires that machines be able to communicate with one another without much human intervention.

Current web applications follow the traditional client-server model of software architecture: client and server are closely interrelated, inflexible and tightly coupled. As a result, the whole system can become dysfunctional if either the client or the server is changed independently. In addition, traditional database and information integration systems (such as Enterprise Resource Planning) are biased towards centralization, where the various resources of the organization are integrated through centralized databases, mostly with proprietary software. This is contrary to the spirit of the Web: a loose, open confederation of resources held together by simple protocols. To address these problems, distributed computing technologies and standards such as DCOM, CORBA and RMI emerged during the 1990s. In practice, however, they are not well suited to the Internet, and they introduce a degree of dependency and/or platform issues. In particular, they do not allow a client application to be written without knowledge of the architecture of the participating distributed objects. In the meantime, advances in Servlet and JavaServer Pages (JSP) technologies, and the emergence of platforms such as J2EE from Sun Microsystems and .NET from Microsoft, brought the development of fast and relatively secure web-based applications within the realm of reality.

This set the stage for XML-based Web services, a new technology standard that enables communication between heterogeneous computer systems. Web services have emerged as a standard only in the last three years. At its core, the technology is simply XML moving from one computer to another in a form that each computer can reliably process. It is a significant improvement over traditional systems integration and has significant implications for any organization, because it expands the scope of computer-to-computer communication. Web services are built on three primary Internet standards: the Simple Object Access Protocol (SOAP), the Web Services Description Language (WSDL), and the Universal Description, Discovery, and Integration (UDDI) directory. These standards are currently supported by all major software technology vendors; SOAP and WSDL are published through the W3C, while UDDI is maintained by OASIS. Grid-based computer applications can be considered the next level in this chain of events: they make even heavy-duty tasks solvable by using diversified and heterogeneous resources and services.

In this case study, we propose to develop an XML-based Web services and grid-based solution, using various Java-based technologies, that can facilitate the transfer and sharing of resources across organizations and institutions through web-based interfaces on a portal. We want to make it possible to share and transfer not only lightweight resources and services but also heavyweight ones. Our aim is to develop a full-fledged, independent software product and methodology that can be used directly by any portal that needs to share and integrate resources over the Internet or among enterprises and their partners, and that can be marketed as an independent product in its own right.

GRID COMPUTING AND WEB SERVICES FOR THE FUTURE

Imagine a scenario in which, with just an interface, anybody is able to run any program without downloading any software, and all barriers of platforms, databases and networks vanish. The grid computing systems of the future should be able to provide this. Autonomic computing and smart network technologies are also expected to emerge that can automatically detect changes in a system and take appropriate action. Such systems are likely to provide user-friendly interfaces for remote job monitoring and to show the status of the result computed at each node in real time. More and more web-based interfaces are likely to be added for all activities related to grid computing. There are four primary reasons for this. First, grid systems require dynamic discovery and composition of services in heterogeneous environments, which necessitates mechanisms for registering and discovering interface definitions and endpoint implementation descriptions, and for dynamically generating proxies based on (potentially multiple) bindings for specific interfaces. WSDL supports this requirement by providing a standard mechanism for defining interfaces separately from their embodiment within a particular binding (transport protocol and data encoding format). Second, numerous tools already exist for WSDL, including processors that can generate language bindings for a variety of languages and platforms. Third, using HTTP for communication makes it possible to reach systems sitting behind firewalls, since firewalls usually do not block the HTTP port. Fourth, any computationally intensive task needs to interact with more than one computer, database or other resource at geographically distributed locations, which requires a way of defining workflows. The Web Services Flow Language (WSFL) and Microsoft's XLANG, both XML languages for describing workflow processes in distributed and heterogeneous environments, offer excellent potential for this purpose.
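
As a rough illustration of the first reason, the sketch below keeps an interface definition separate from its binding and generates a proxy at run time. It is written in Python for brevity rather than in the Java stack proposed in this paper, and it does not parse real WSDL. The names ServiceProxy, local_binding and logging_binding, and the operation submitJob, are hypothetical.

```python
class ServiceProxy:
    """Exposes the operations of an interface through a pluggable binding."""

    def __init__(self, operations, binding):
        self._operations = frozenset(operations)  # the interface definition
        self._binding = binding                   # how a call is carried out

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails, i.e. for operations.
        if name.startswith("_") or name not in self._operations:
            raise AttributeError(name)

        def call(**params):
            return self._binding(name, params)

        return call


def local_binding(operation, params):
    """Hypothetical binding that runs the operation in-process."""
    handlers = {"submitJob": lambda p: "accepted:" + p["name"]}
    return handlers[operation](params)


def logging_binding(operation, params):
    """Hypothetical second binding: same interface, different carrier."""
    logging_binding.calls.append(operation)
    return local_binding(operation, params)


logging_binding.calls = []

interface = ["submitJob"]  # stands in for a WSDL port type
direct = ServiceProxy(interface, local_binding)
logged = ServiceProxy(interface, logging_binding)
```

Client code calls direct.submitJob(name="fem") or logged.submitJob(name="fem") in exactly the same way; only the binding passed to the proxy differs.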

The Web services framework is likely to be integrated with the grid computing systems of the future. Attempts in this direction are already being made, but the work has yet to mature to the stage where the potential of Web services can be fully harnessed for grid computing. Grid computing has started to leverage Web services to define standard interfaces for business services and institutional needs. The grid is likely to provide a virtual integrated environment in which people from different organizations and locations work together to solve specific problems; this is a typical case of dynamic resource sharing and information exchange. The grid computing platform is likely to allow resource discovery, resource sharing and collaboration in a distributed environment in more user-friendly ways.

Future-generation grid-enabled Web services should be able to accomplish the following:

Use computing power more efficiently. Jobs can be sent to the node that has the least load.

Break up complex jobs and run them on multiple nodes in parallel, providing a significant performance increase. This kind of structure is known as a computational grid.

Store large amounts of data in a structure that spreads over many systems, yet can still be accessed as if it were part of a single node. This structure, similar to a federated database, is known as a data grid.

Run different parts of an application on systems with different characteristics.

However, any grid system requires its users to submit the appropriate input files for their requirements and problems, and to define the algorithm for the problem in a suitable language. Using a grid efficiently therefore requires considerable domain expertise in the problem area as well as an understanding of the processes involved in grid systems.
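
The first capability in the list above, sending each job to the node with the least load, can be sketched as a greedy assignment over a priority queue. This is only an illustration in Python, not the scheduler of any particular grid system. The function name assign_tasks, the node names and the integer load units are assumptions.

```python
import heapq


def assign_tasks(task_costs, node_loads):
    """Greedily give each task to the node with the smallest current load.

    task_costs: task name -> estimated cost (same hypothetical unit as loads)
    node_loads: node name -> current load
    Returns a dict mapping node name -> list of assigned task names.
    """
    heap = [(load, name) for name, load in node_loads.items()]
    heapq.heapify(heap)
    assignment = {name: [] for name in node_loads}
    for task, cost in task_costs.items():
        load, name = heapq.heappop(heap)         # least-loaded node right now
        assignment[name].append(task)
        heapq.heappush(heap, (load + cost, name))  # account for the new work
    return assignment


plan = assign_tasks({"t1": 3, "t2": 3, "t3": 3}, {"node-a": 5, "node-b": 1})
```

Here node-b starts with the lighter load, so it receives t1 and t2 before node-a becomes the less loaded node and receives t3.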

Using SOAP for Communication in Grid environment

We need a mechanism for sending and receiving communication to and from remote services and resources in the grid environment. Web services expose the methods of objects via SOAP. The following steps need to be followed:

The client application builds a SOAP message, an XML document that describes the desired request/response operation.

The client sends the SOAP message to a JSP page on a web server that listens for SOAP requests.

The SOAP server parses the SOAP package and invokes the appropriate method of the appropriate object in its domain, passing in the parameters included in the SOAP document.

The requested object performs the indicated function and returns data to the SOAP server, which packages the response in a SOAP envelope. The server wraps the SOAP envelope in a response object (for example, a servlet response), which is sent back to the requesting machine.

The client receives the object, strips off the SOAP envelope and sends the response document to the program that originally requested it, completing the request/response cycle.
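
The request/response cycle above can be sketched without any network transport, using only an XML library. The example is in Python rather than the JSP/servlet setting described in the text, and it omits HTTP, faults and typed parameters. The service namespace urn:example:grid-jobs, the add operation and the function names are hypothetical; the envelope namespace is the SOAP 1.1 one.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SVC_NS = "urn:example:grid-jobs"                       # hypothetical service


def build_request(method, params):
    """Client side: wrap a method call and its parameters in an envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def handle_request(xml_text, operations):
    """Server side: parse the envelope, invoke the method, wrap the result."""
    call = ET.fromstring(xml_text).find(f"{{{SOAP_NS}}}Body")[0]
    method = call.tag.split("}", 1)[1]            # drop the namespace part
    args = {child.tag: child.text for child in call}
    result = operations[method](**args)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    reply = ET.SubElement(body, f"{{{SVC_NS}}}{method}Response")
    ET.SubElement(reply, "return").text = str(result)
    return ET.tostring(envelope, encoding="unicode")


def read_response(xml_text):
    """Client side: strip the envelope and return the payload text."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_NS}}}Body")
    return body[0].find("return").text


operations = {"add": lambda a, b: int(a) + int(b)}  # parameters arrive as text
request = build_request("add", {"a": 2, "b": 3})
response = handle_request(request, operations)
```

In a deployment the request string would be posted over HTTP to the listening page and the response string returned in the HTTP reply; here the two functions are simply called in sequence.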

Managing Work Flow in Grid

Once the resources have been discovered, workflow in the grid can be established using the Web Services Flow Language (WSFL) and Microsoft's XLANG, which are XML languages for describing workflow processes and spawning them. WSFL specifies how one Web service is interfaced with another. With it, we can determine whether a Web service should be treated as a single activity in a workflow or as a series of activities. While WSFL complements WSDL (Web Services Description Language) and is transition-based, XLANG is an extension of WSDL and is block-structured. WSFL supports two model types, flow models and global models. The flow model describes the business process that a collection of Web services needs to achieve. The global model describes how the Web services interact with one another. XLANG, on the other hand, allows the orchestration of Web services into business processes and composite Web services. WSFL is strong on model presentation, while XLANG does well with long-running interactions between Web services. Web services and resources can be declared as private or public.
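
To make the idea of a transition-based flow model concrete, the sketch below represents a workflow as activities joined by control links and runs them in dependency order. It is a Python illustration of the concept only: it does not use WSFL or XLANG syntax, and the activity names and the run_flow function are hypothetical. It relies on graphlib, which is in the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter


def run_flow(activities, links, data):
    """Run activities in an order that respects the control links.

    activities: name -> function taking and returning a dict of job data
    links: list of (source, target) pairs; target runs after source
    """
    graph = {name: set() for name in activities}
    for source, target in links:
        graph[target].add(source)          # target depends on source
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        data = activities[name](data)
    return order, data


activities = {
    "discover": lambda d: {**d, "node": "node-a"},
    "transfer": lambda d: {**d, "staged": True},
    "execute": lambda d: {**d, "result": sum(d["inputs"])},
    "collect": lambda d: d,
}
links = [("discover", "transfer"), ("transfer", "execute"), ("execute", "collect")]
order, output = run_flow(activities, links, {"inputs": [1, 2, 3]})
```

Each activity here stands in for a call to a Web service; a real flow description would also carry data links and error handling.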

Monitoring Remote Jobs in Grid environment

In a complex system like the grid, monitoring is essential for understanding its operation and for debugging, failure detection and performance optimization. The monitoring system must be able to provide information about the current state of various grid entities, such as grid resources and running jobs, and to provide notifications when certain events (e.g. system failures, performance problems) occur. Monitoring jobs requires interoperation between the monitoring system and other grid services. A running application consists of processes running on the hosts that constitute the grid resource. Processes are identified locally by the operating system through process identifiers (PIDs). The local resource management system (LRMS) controls the jobs running on the hosts that belong to a grid resource. It allocates hosts to jobs, starts and stops jobs on user request, and possibly restarts jobs in case of an error. It may also checkpoint jobs and migrate them between hosts of the resource, which can be considered a special case of job startup. The LRMS identifies each job it manages by a local job identifier (LJID). To monitor a job, the monitoring system has to know the relation between LJIDs and PIDs. There are various ways to accomplish this, and each grid system implements it differently. Future-generation grid systems should provide job submission, job monitoring, job status and job output through web-based interfaces, to make them accessible to the common man.
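
One simple way to keep the relation between LJIDs and PIDs is a pair of lookup tables that are updated as processes start and exit. The Python sketch below is an assumption about how such bookkeeping might look, not the mechanism of any specific grid system. Because PIDs are only unique per host, it keys processes by a (host, pid) pair; the class and method names are hypothetical.

```python
from collections import defaultdict


class JobMonitor:
    """Tracks which processes belong to which local job identifier (LJID)."""

    def __init__(self):
        self._procs_by_job = defaultdict(set)  # LJID -> {(host, pid), ...}
        self._job_by_proc = {}                 # (host, pid) -> LJID
        self._known_jobs = set()

    def process_started(self, ljid, host, pid):
        self._known_jobs.add(ljid)
        self._procs_by_job[ljid].add((host, pid))
        self._job_by_proc[(host, pid)] = ljid

    def process_exited(self, host, pid):
        """Forget the process and return the LJID it belonged to."""
        ljid = self._job_by_proc.pop((host, pid))
        self._procs_by_job[ljid].discard((host, pid))
        return ljid

    def status(self, ljid):
        if ljid not in self._known_jobs:
            return "unknown"
        return "running" if self._procs_by_job[ljid] else "finished"


monitor = JobMonitor()
monitor.process_started("job-7", "host-a", 4101)
monitor.process_started("job-7", "host-b", 4101)  # same PID, different host
```

A web-based status page could then call status("job-7") for each job a client has submitted.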

System Architecture, Design and Modeling

Our architecture focuses on providing a virtual integrated environment among organizations and institutions through web-based interfaces that are easy to use, flexible and secure. It provides a mechanism for sharing a large number of resources, such as real-time and on-demand video lectures, scientific databases, queries to partner institutions and lecture notes, although our primary focus is on providing computational resources for computation-intensive scientific and engineering tasks. The architecture aims to support platform-independent and heterogeneous resources.

Initially, our focus is on supporting the Java, C/C++ and Fortran languages. For scripting, we intend to support Perl, shell scripts and DOS batch scripts. As for operating systems, we intend to support Windows and Unix-based platforms. To make this possible, our architecture is XML-based and is intended to combine Web services concepts with grid computing concepts. It is based on autonomic computing concepts and is also intended to integrate intelligent network technologies such as Jini with grid computing, so that the system can dynamically sense changes in the network environment and take appropriate action automatically. The architecture is designed to accommodate any number of systems and services being added to or removed from the system in real time. In addition, it is designed to display the status and the output of jobs at each node in real time.

A job is submitted by the client to the web portal through a graphical user interface (GUI). The web portal delegates the management of jobs to schedulers. A scheduler divides a job into smaller tasks (for an independent job, a task is a subset of the parameters that can be executed independently) and sends the tasks to the resources for execution. Ease of use is achieved by encapsulating the system behind a simple, well-defined interface. The execution service provided by the resources is wrapped inside a Web service interface, so it can be consumed easily by any user. The scheduler likewise encapsulates the complexity of job scheduling behind a Web service interface. Using Web service interfaces in this way allows an easy client-side implementation. The web portal provides GUIs for job submission and management, allowing the client to submit and monitor jobs easily. The parameter file can contain a range or a list of input values. The scheduler parses the parameter file and splits the input values so that it can distribute the job to many resource machines, each of which receives a different range or list of input values that is a subset of the submitted input values.
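
The splitting step described above can be sketched as dividing the submitted input values into near-equal contiguous subsets, one per resource machine. This is a minimal Python illustration under the assumption that the parameter file has already been parsed into a sequence of values; the function name split_parameters is hypothetical.

```python
def split_parameters(values, n_resources):
    """Split input values into at most n_resources contiguous subsets.

    Subset sizes differ by at most one; empty subsets are dropped when
    there are fewer values than resources.
    """
    if n_resources < 1:
        raise ValueError("at least one resource is required")
    values = list(values)
    base, extra = divmod(len(values), n_resources)
    chunks, start = [], 0
    for i in range(n_resources):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        if size:
            chunks.append(values[start:start + size])
        start += size
    return chunks


tasks = split_parameters(range(1, 11), 3)
# tasks == [[1, 2, 3, 4], [5, 6, 7], [8, 9, 10]]
```

Each subset would then be sent to one resource as an independent task.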

The status of each subpart of the job submitted to each node is displayed in real time through the web-based interface. It is also proposed to display the performance of each node with respect to a particular job to the respective client through the web-based interface, together with the available memory, the current load and the average CPU performance of each node. After the computation is complete, the final result is generated and displayed in real time through the web-based interface, and an event is generated automatically to send the output to the client by email or to a directory on the client's computer.

[Figure: Simplified view of integrating two organizations' resources for the grid. CyberSWIFT and the partner organizations each have a web server and application server, a database server, and other resources (computers, video lectures, files, etc.). The two sides are connected through the WWW and the web portal using communication protocols such as HTTP and HTTPS; information flows between the two institutions as XML over HTTP, XML over FTP or XML over HTTPS.]

[Figure: Grid-enabled Web service dataflow diagram. Clients 1, 2 and 3 send HTTP requests to the web portal. Labeled steps: authenticate with the server; connect to the server; find the appropriate nodes; transfer the data; transfer the file (XML over FTP); submit the jobs; collect the result; close the connection; integrate the result; result.]

Proposed Design Methodology

The design methodology adopted for this study proceeds in stages, and each stage can be considered a separate software module. The primary challenge in designing grid-based Web services is to look beyond the traditional client-server paradigm, in which the client is tightly coupled to the server. Here the client has to be designed to be completely independent of the server, so that it keeps working even when changes take place on the server side (which, in this case, is not under our control: services can be withdrawn and new services added without our knowledge). On the server side, the system must give partner institutions a choice about the types of services and resources they want to share, and must allow them to withdraw services and resources or add new ones at any time. A system of this kind should therefore be based on the following design principles:

Client and server applications should be independent of one another.

Applications should be built from discrete components that coordinate server-based modules.

Services and resources should be discovered by querying directories.

Services should be transient.

Services should support extension and should degrade gracefully when no longer needed.

A mechanism to describe the services (example: a WSDL implementation).

A mechanism to communicate with the services (example: a SOAP implementation).

A mechanism to submit services and resources to a registry (an implementation of UDDI).

A mechanism to discover available services and resources (an implementation of UDDI).

A mechanism to break complex jobs into simple jobs that are submitted to the available, least-loaded nodes. When the load on any node changes, it should be capable of automatically redistributing the jobs to the least-loaded nodes.

A mechanism to display, in real time, the status of the jobs submitted to each node, with related statistics on memory status and CPU performance and usage and the changes in them. It should also display the output produced by each node.

A mechanism to send the result back to the client (using SOAP).
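
The registry-related principles in the list above (submitting, withdrawing and discovering services through a directory) can be illustrated with a small in-memory stand-in. The Python sketch below is not the UDDI API and does not implement the UDDI data model; the class ServiceRegistry, its methods, the categories and the example.org endpoints are all hypothetical.

```python
class ServiceRegistry:
    """In-memory directory that partners publish to and clients query."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, endpoint, category, public=True):
        self._entries[name] = {
            "endpoint": endpoint,
            "category": category,
            "public": public,
        }

    def withdraw(self, name):
        # Withdrawing an unknown service is not an error: services are transient.
        self._entries.pop(name, None)

    def find(self, category, include_private=False):
        """Return the names of matching services in alphabetical order."""
        return sorted(
            name
            for name, entry in self._entries.items()
            if entry["category"] == category
            and (entry["public"] or include_private)
        )


registry = ServiceRegistry()
registry.publish("fem-solver", "http://partner.example.org/fem", "compute")
registry.publish("render", "http://partner.example.org/render", "compute", public=False)
registry.publish("lectures", "http://partner.example.org/video", "media")
```

Because clients look services up at run time instead of hard-coding endpoints, a partner can withdraw or add a service without any change on the client side.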

With these guidelines in mind, the system design should proceed in the following stages:

STAGE I:

In this stage we propose to design the user interfaces so that clients need to view as few pages as possible, and so that the interfaces are independent of the services and resources that happen to be available.

STAGE II:

In this stage we have developed small, elementary grid computing systems for elementary computations. We have evaluated various available grid computing systems, such as Globus, Unicore, Condor, JCGrid, JGrid and OptimalGrid. We also propose to test all of these grid systems by running sample jobs across a number of Windows and Unix-based environments. The idea is simply to decide which of these grid systems is best suited to our purpose. Here we need to take into account the computing facilities that are available or likely to become available for this purpose, the operating systems they run, the technical level and commitment of the people involved, the network facilities available, and security, firewalls and other administrative decisions about how to give clients and partner institutions access to our facilities. We also need to consider the corresponding environments at our partner institutions, as well as the general state of affairs and understanding of grid-based systems in the country, their current and future potential, and the requirements for such a system.

In this stage we also propose to develop web-based interfaces for job submission, job monitoring, output condition monitoring and output display, specific to the grid system we plan to adopt in view of our suitability and capabilities. Once this interface is up, users should be able to start the server and submit their jobs. Similar interfaces will be provided to the partner institutions, with the difference that they will not be public: they will serve only to transfer the partners' resources to IIITMK or to submit services and resources to the registry.

STAGE III:

In this stage we implement the logic behind the web interfaces using Web services concepts and a suitable grid system, and we customize that system according to our requirements and feasibility. We combine the computing power of the various available PCs on Windows and Unix platforms, make the system fully operational, and provide facilities for remote job submission and status monitoring through a single web interface for any computation-intensive job. We also provide support for the various available languages and scripts and integrate the whole system with partner institutions and industry. We implement autonomic grid computing concepts and intelligent network concepts with suitable technologies, so that the system can meet the requirements of future-generation grid systems. We optimize the whole system to get the best possible throughput and CPU utilization from the available nodes at geographically distributed locations. We also aim to enhance our ability to apply Web services concepts to grid computing, and to train people and students in these technologies and systems so that the system can be used with maturity.

STAGE IV:

In this stage we develop web-based interfaces for many other grid computing systems and continuously improve and upgrade our system as newer technologies are developed. We also try to make grid computing technologies available to the common man, so that they require very little technical knowledge of computers. As future systems are likely to bring devices of different kinds and capabilities in terms of memory and CPU, we aim to provide grid facilities through all of those devices. We also look into the possibility of developing cutting-edge technology and products in these areas, as well as algorithms for defining workflows, job monitoring and so on in a distributed environment. Finally, we try to look into the possibility of a system in which anybody can use software without downloading it, by sending the appropriate program to it.

Summary and Conclusion

In this study, we have presented a framework for sharing distributed and heterogeneous resources and services among organizations and institutions, together with the various modules of the framework. We have reviewed some currently available grid systems and their usefulness through sample case studies, taking into account the strengths and weaknesses of the organizations. We have implemented sample SOAP, Axis and Jini based Web services and made them available through our web-based interfaces. We have emphasized the importance of combining autonomic computing, intelligent network and Web services concepts with current grid computing systems to make them more effective, more efficient and reachable to the common man.

In summary, we recommend as the grid computing system for the future a simple system that combines the power of Web services, autonomic computing and intelligent network concepts, that can pool the computing power of twenty to thirty computers on different platforms, possibly geographically distributed, through web-based interfaces, and that hides the complexity of the implementation from its users.

Challenges and Potential Research Directions

In this section we point out the potential challenges and future research directions of this study.

Publicizing grid-enabled Web services through web-based interfaces may raise security concerns. For example, if the services are open to anyone and everyone, hackers and malicious users can overload the system by submitting dummy jobs. We may consider providing these facilities over HTTPS, or developing different levels of security mechanisms for such a system.

Developing browser plug-ins specifically for grid computing is another interesting area that requires our attention.

Web services have advanced significantly in the last few years, but their focus so far has been on providing lightweight services. We can, however, think of providing services that require no software to be downloaded and installed: anybody should be able to use them simply by sending a request with the appropriate input files. We can consider developing some services of this kind in the future.

Developing a pricing mechanism for making services of this kind available is another important area that can be explored.

Developing domain-specific tools for using grid systems of this kind in an optimal way, for example in drug design, earth sciences and bioinformatics, is another area that requires further exploration.
