[IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung,...

6
Dynamic Process Migration Framework Amirreza Zarrabi * , Khairulmizam Samsudin * and Amin Ziaei * Department of Computer and Communication Systems, Faculty of Engineering, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia Email: [email protected] Faculty of Computer Science and Information Technology, University of Malaya, 50603 Lembah Pantai, Kuala Lumpur, Malaysia Email: [email protected] Abstract—Process migration refers to the act of transferring a process in the middle of its execution in a network. The majority of today’s computing power exists in the form of workstations. Increasing use of autonomous workstations connected by a high- speed network represents substantial opportunity for sharing more resources and designing a highly available system much cheaper than a high performance single machine. The process migration framework as a bare facility should be included in all Operating Systems (OSs). In this paper, we describe the design and implementation of process migration framework for Linux OS which aims to segregate the process migration mechanism from the system design and provide the capability of dynamically extending the process migration system on demand to increase compatibility and reduce the system overhead. Index Terms—Process Migration Mechanism; Migration Framework I. I NTRODUCTION The primary reason to transfer a program between machines in a network is performance augmentation [1]. This could be accomplished in the form of dynamic load distribution, reduction of the network traffic, or relocation of a currently running program from a machine prone to fault. The literature [2], [3] reveals that in all existing imple- mentations, the process migration mechanism is already hard- coded inside the migration system. Consequently, all processes undergo similar phases during the migration event irrespective to the requirement of migration system applications or intrinsic features of the processes. Support for third party modules or dedicated hardware cannot be simply incorporated inside the migration events and the processes accessing these resources are simply considered as unmigratable. We present a dynamic process migration framework to include the bare process migration support inside the Linux OS. Its architecture tries to segregate the process migration mechanism from the system design so that the minimum flexibility is provided for users to adopt the migration event according to their constraints and obtain the capability of dy- namically extending the process migration system on demand to increase compatibility and reduce the system overhead. The remainder of the paper is organized as follows: Sec- tion 2 provides an overview of process migration. Section 3 describes the generic architecture of process migration framework. Section 4 presents the system design. The process migration framework implementation is discussed in section 5. Section 6 and 7 are devoted to the framework evaluation. Finally, we present some concluding remarks. II. OVERVIEW OF PROCESS MIGRATION Process migration introduces opportunities for designing of different multicomputer configurations. A multicomputer configuration is any arrangement of several computers, which is used to support specific services or applications [4]. Two major types of process migration exist: conventional process migration and migration with distributed resources. Resources on different machines are local in conventional pro- cess migrations and some methods are required to access the remote resources, e.g. using Global Memory Service (GMS) [5] to access the memory pages on remote machines. Con- versely, distributed resources belong to the universal reservoir and the process resources are uniquely identified and accessi- ble throughout the network. Process migration can be easily implemented as the resource distribution would be used to access the migrated process resources, e.g. Distributed Shared Memory (DSM) [6] provides the shared memory paradigm so that a migrated process can access to its address space and open files after migration. A. Process Migration Mechanism A method should be deployed to handle the process states throughout the migration event irrespective to the process migration type. It is referred to as the migration mechanism. The process state should be treated in one of the following three methods while the process migration is in progress [7]: Ignore process states on the source machine and simply use the corresponding resources on the destination. Move process states to the destination machine by ex- tracting the required information and reinstate them. Preserve process states on the source machine and for- ward requests for operations back to the source. B. Development of Process Migration Systems Condor [8] is a software package that supports user level process migration. Any system call made by the migrated process communicates with the source machine and performs a remote procedure call. Programs submitted to be run by condor should be linked to specific C library. 978-1-4673-4992-5/13/$31.00 ©2013 IEEE 2013 International Conference of Information and Communication Technology (ICoICT) 410

Transcript of [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung,...

Page 1: [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung, Indonesia (2013.03.20-2013.03.22)] 2013 International Conference of Information

Dynamic Process Migration FrameworkAmirreza Zarrabi∗, Khairulmizam Samsudin∗ and Amin Ziaei†

∗Department of Computer and Communication Systems, Faculty of Engineering,Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia

Email: [email protected]†Faculty of Computer Science and Information Technology,

University of Malaya, 50603 Lembah Pantai, Kuala Lumpur, MalaysiaEmail: [email protected]

Abstract—Process migration refers to the act of transferring aprocess in the middle of its execution in a network. The majorityof today’s computing power exists in the form of workstations.Increasing use of autonomous workstations connected by a high-speed network represents substantial opportunity for sharingmore resources and designing a highly available system muchcheaper than a high performance single machine.

The process migration framework as a bare facility shouldbe included in all Operating Systems (OSs). In this paper, wedescribe the design and implementation of process migrationframework for Linux OS which aims to segregate the processmigration mechanism from the system design and provide thecapability of dynamically extending the process migration systemon demand to increase compatibility and reduce the systemoverhead.

Index Terms—Process Migration Mechanism; MigrationFramework

I. INTRODUCTION

The primary reason to transfer a program between machinesin a network is performance augmentation [1]. This couldbe accomplished in the form of dynamic load distribution,reduction of the network traffic, or relocation of a currentlyrunning program from a machine prone to fault.

The literature [2], [3] reveals that in all existing imple-mentations, the process migration mechanism is already hard-coded inside the migration system. Consequently, all processesundergo similar phases during the migration event irrespectiveto the requirement of migration system applications or intrinsicfeatures of the processes. Support for third party modules ordedicated hardware cannot be simply incorporated inside themigration events and the processes accessing these resourcesare simply considered as unmigratable.

We present a dynamic process migration framework toinclude the bare process migration support inside the LinuxOS. Its architecture tries to segregate the process migrationmechanism from the system design so that the minimumflexibility is provided for users to adopt the migration eventaccording to their constraints and obtain the capability of dy-namically extending the process migration system on demandto increase compatibility and reduce the system overhead.

The remainder of the paper is organized as follows: Sec-tion 2 provides an overview of process migration. Section3 describes the generic architecture of process migrationframework. Section 4 presents the system design. The process

migration framework implementation is discussed in section5. Section 6 and 7 are devoted to the framework evaluation.Finally, we present some concluding remarks.

II. OVERVIEW OF PROCESS MIGRATION

Process migration introduces opportunities for designingof different multicomputer configurations. A multicomputerconfiguration is any arrangement of several computers, whichis used to support specific services or applications [4].

Two major types of process migration exist: conventionalprocess migration and migration with distributed resources.Resources on different machines are local in conventional pro-cess migrations and some methods are required to access theremote resources, e.g. using Global Memory Service (GMS)[5] to access the memory pages on remote machines. Con-versely, distributed resources belong to the universal reservoirand the process resources are uniquely identified and accessi-ble throughout the network. Process migration can be easilyimplemented as the resource distribution would be used toaccess the migrated process resources, e.g. Distributed SharedMemory (DSM) [6] provides the shared memory paradigm sothat a migrated process can access to its address space andopen files after migration.

A. Process Migration Mechanism

A method should be deployed to handle the process statesthroughout the migration event irrespective to the processmigration type. It is referred to as the migration mechanism.The process state should be treated in one of the followingthree methods while the process migration is in progress [7]:

• Ignore process states on the source machine and simplyuse the corresponding resources on the destination.

• Move process states to the destination machine by ex-tracting the required information and reinstate them.

• Preserve process states on the source machine and for-ward requests for operations back to the source.

B. Development of Process Migration Systems

Condor [8] is a software package that supports user levelprocess migration. Any system call made by the migratedprocess communicates with the source machine and performsa remote procedure call. Programs submitted to be run bycondor should be linked to specific C library.

978-1-4673-4992-5/13/$31.00 ©2013 IEEE

2013 International Conference of Information and Communication Technology (ICoICT)

410

Page 2: [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung, Indonesia (2013.03.20-2013.03.22)] 2013 International Conference of Information

LOCUS [9] is a distributed OS in which the operation distri-bution is achieved by OS packaging up a message and sendingit to the relevant machine at any point within the execution ofthe system call when remote service is needed. The LOCUSfilesystem appears to the user to be a single rooted tree whichfacilitates the process migration. It is practical as LOCUSpursues the UNIX design of accessing resources through files.

DEMOS/MP [10] is a distributed OS which is based onthe message passing paradigm. It provides a location transpar-ent interprocess communication even between processes ondifferent machines. The location independent communicationmechanism and the fact that the kernel can participate inmessage communication in the same manner as any processsimplifies the implementation of process migration.

Kerrighed [11] formerly known as Gobelins is an OS aimingat giving a single system image. It has been implemented bymodifications of Linux kernel and extending system mech-anisms with the concept of container [12] which allow theimplementation of DSM to simplify the migration of theaddress space and open files.

Some of the migration systems rely on virtualization facil-ities which can be in OS or hardware level.

Zap [13] is implemented as a kernel module in Linux thatprovides a thin virtualization on top of the operating systemand presents the process group with the same virtualized viewof the system. This decouples processes from rest of operatingsystem and simplifies migration events. A checkpoint/restartfeature is presented for containers in OpenVZ [14]. It is im-plemented as loadable kernel modules plus a set of user spaceutilities. It relies on OpenVZ container-type virtualization tofulfill checkpoint/restart requirements.

Hardware virtualization approaches like Xen [15] andVMware [16] allow checkpointing and restarting only anentire operating system environment, and they can not providecheckpointing and restarting of small sets of processes. Thatleads to higher migration overhead.

III. PROCESS MIGRATION FRAMEWORK ARCHITECTURE

Architecture is roughly the prudent partitioning of a wholesystem into parts, with specific relations among the parts [17].Three components exist in any migration system, including:

• Checkpoint/Restart Subsystem.• Migration Coordinator on source machine.• Migration Daemon on destination machine.

A. Checkpoint/Restart Subsystem

The checkpoint/restart subsystem exists on both sourceand destination machines and accommodates the fundamentalproperty of exporting and importing process states from/to theunderlying OS. It is designated to be at the lowest layer inthe system architecture with the expectation of accessing toall process resources.

In any migration event two checkpoint/restart subsystemsare participating, one instance checkpoints the process while

the other restores the process simultaneously. The contem-porary definition of checkpoint and restart1, which relies onthe concept of separation of the checkpoint and restart eventsshould be adapted. Each peer should be able to divert theremote machine from its ordinary execution sequence byemploying specific protocol.

B. Migration Coordinator/Daemon

Checkpoint/restart subsystem provides infrastructure forprocess migration. However, the event could not be initiatedunless the source and destination machines agree on certainprerequisites. The migration coordinator derives all desiredarguments for the event and invokes the checkpoint/restartsubsystem interface to carry out the requested event.

The counterpart to the migration coordinator on the remotemachine is the migration daemon. The migration coordinatorinstructs the daemon for initializing an appropriate environ-ment for migration. It establishes equipment for the coordi-nator to call the checkpoint/restart subsystem interface on theremote machine.

IV. SYSTEM DESIGN

When drawing system architecture many design decisionsremain unbound and are left to the downstream designers andimplementers [17]. Level of implementation is an importantaspect of system design which is not specified in processmigration architecture.

The practice of integrating part of the system in user spacewould offer great flexibility for performing convenient tasksusing existing user space tools. The optimum system designcould be constructed by combining different implementationlevels in the process migration framework.

A. Kernel Level Infrastructure

The checkpoint/restart subsystem is selected to be a kernellevel infrastructure as it should abide by the behavioral con-

Figure 1. Process Migration System Package Diagram.

1Application checkpoint and restart refers to the ability of saving the stateof a running application so that it can later resume its execution from the timeof the checkpoint [18].

978-1-4673-4992-5/13/$31.00 ©2013 IEEE

2013 International Conference of Information and Communication Technology (ICoICT)

411

Page 3: [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung, Indonesia (2013.03.20-2013.03.22)] 2013 International Conference of Information

straints described in previous sections which are only feasiblein kernel space.

We tried to identify existing similarities in previous check-point/restart implementations [18], [19] and define a multilayerstructure. Each checkpoint/restart system is comprised of atleast four layers, including: kernel subsystem specific layer,transfer medium layer, core system control layer, and userspace interface layer. Fig. 1 presents the organization ofsystem structure.

A hierarchical structure is defined which logically partitionsthe system functionality. System implementation and debug-ging are easier by separating system functions through a divideand conquer methodology. Moreover, communication protocolis simpler to design and modify in a layered architecture.

1) Kernel Subsystem Specific Layer : Linux kernel consistsof various subsystems which are interacting with the processescontext [20]. Exporting the state of a process implicitly indi-cates the need for extracting the state of all these subsystems.

The kernel subsystem specific layer resorts to callbacksas a method to obtain enough knowledge on managementpolicy of each subsystem. Each kernel subsystem or third partymodule with significant states has to register predefined set ofoperations in this layer. Refer to Table I for the minimal listof operations.

Table IKERNEL SUBSYSTEM OPERATION

Operation Description

checkpoint Sore kernel subsystem states.restart Restore kernel subsystem states.fault Recover an entitya from error state.status Extract the current status of ongoing event.dump Handle fatal error incidents.a Entity refers to an element used as a unit in checkpoint and restart event,

e.g. a memory page is considered as an entity for memory subsystem.

The fault method would be invoked to divert the check-point or restart event from the normal execution sequence. Itcould be accomplished only if they could be conducted inmultiple steps. Therefore, the checkpoint and restartmethods should be iterative type. That is, a single call to thesemethods could decline to fulfill the respective event.

The shared resources managed by the kernel subsystemsor third party modules should undergo the checkpoint orrestart event with extra considerations. They require to becheckpointed once for all the processes that reference theresources and should remain in the consistent state while acheckpoint event is in progress. Processes should be restartedwhile referencing to their formerly shared resources [19].

2) Transfer Medium Layer : The transfer medium layerspecies how the extracted states from kernel subsystems orthird party modules should be handled. The primary objectiveof this layer is to decouple low speed transfer medium fromthe checkpoint and restart events. Various transfer mediumscould be deployed, ranging from memory as backing store to

a standard filesystem interface of particular OS [21] and aspecifically optimized network protocols [22].

Irrespective to the characteristics of transfer mediums, theyshould provide a facility to store and retrieve data. In transfermedium layer, callbacks are employed to abstract the pecu-liarities of a transfer medium. The minimal operations arerepresented in Table II.

Table IITRANSFER MEDIUM OPERATIONS

Operation Description

popdata Retrieve data into kernel module.pushdata Store kernel subsystem in the medium.command Send or receive medium dependent command.flush Flush all buffers in the transfer medium.

The command method is used to send low-level instructionsto transfer medium layer in order to modify its behavior, e.g.changing the internal buffer size. The same method can beexploited by kernel subsystems or third party modules to sendtheir internal command to peer modules on the remote machineif the network medium is used.

3) Core System Control Layer : All process specific in-formation concerned with handling a Linux multithreadedprocess within the checkpoint/restart subsystem are maintainedat this layer. The checkpoint and restart events on multipleprocesses are kept thoroughly isolated in this layer. Therefore,the concurrency of events is guaranteed while the kernelsubsystem specific layer takes care of shared resources.

The core system control layer should not make any as-sumption about specific process undergoes an event. As aworkaround, each process is treated as an object with its ownconstructor/de-constructor functions which are specified basedon request attributes. Table III shows the operations utilizedfor each process [23].

Table IIIPROCESS SPECIFIC CONSTRUCTOR/DE-CONSTRUCTOR

Operation Description

setup Called before launching event to prepare process.restart Called when restarting process.cleanup Called for terminating current success/failed event.

A generic watchdog should be implemented within thislayer which would verify event progress periodically. If noprogress occurs over the specified amount of time, the eventis considered as dead and rollback event is initiated to retrievethe last stable state or produces appropriate error if rollbackcannot accomplish.

4) User Space Interface Layer : This layer introduces asingle entry point to the checkpoint/restart subsystem thataccepts a request for a particular event from a user evaluatesthe request and issues the required sequence of calls to coresystem control layer. In other words, it is a component of thesystem that translates a user action into one or more requests

978-1-4673-4992-5/13/$31.00 ©2013 IEEE

2013 International Conference of Information and Communication Technology (ICoICT)

412

Page 4: [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung, Indonesia (2013.03.20-2013.03.22)] 2013 International Conference of Information

for system functionality and feedback about the consequencesof this action [24].

B. User Level Tools

The user level tools refer to the applications that deploy thecheckpoint/restart subsystem. It should establish an appropri-ate environment on the remote machine and issue series ofrequests to underlying checkpoint/restart subsystems.

The checkpoint/restart subsystem attempts to isolate the pro-cesses concurrent events. However, many applications consistof multiple cooperating processes and not only must the stateassociated with each process be migrated but the processesrelationships such as parent/child relationships and identifiersshould be considered. The Linux inheritance feature whilecreating new process could be exploited to reach this goal [25].The special kernel subsystem specific module would take careof the identifiers. Users level tools solely need to be concernedabout two factors:

• If process P1 is the parent of process P2 then whenrestarting, P1 must create P2 before initiating the restartevent on itself.

• If process P1 consists of many threads then one of thesethreads should create other threads.

V. SYSTEM IMPLEMENTATION

The execution context for checkpoint and restart events isinitiated inside the core system control layer. The executioncontext could be external or internal [26] according to the userrequest. In the internal context, the respective event wouldhappen within the context of the process itself, contrary tothe external context which the event would be accomplishedwithin a different process context.

The worker method related to the requested event wouldbe invoked in the execution context. These methods rely onkernel subsystems or third party modules and transfer mediumcontexts to carry out the actual events. It would satisfy therequirements of the event by calling the process correspondingconstructor/de-constructor methods.

A. User Space Interface

The interface between the kernel space and the user spaceis implemented as character device driver [27]. The ioctlsystem call is exploited for interaction with the device file.It multiplexes the different commands to the appropriatefunctions in the user space interface layer. Table IV lists allioctl commands and their respective descriptions.

In any repetitive user data access, the system should havesome means of establishing context [28]. In the user spaceinterface layer, each request is an abstract that constructs thecontext of the event. Any request is identified with an integervalue, which is returned to the user space when a request issubmitted.

While the request ID represents a submitted request locally,a dedicated token is used for distributed representation of anevent. For each event, an instance of this token should becreated, which is used in the handshaking sequence.

Table IVUSER SPACE INTERFACE IOCTL COMMANDS

ioctl Commands Description

Request Submission/Removal

SUBMIT_REQUEST Submit a new request for an event.DISMISS_REQUEST Drop a finished or pending request

before initiating the event.

Handshaking Sequence Managementa

PREPARE_REQUEST_TOKEN Initialize a distributed token.RESPONSE_REQUEST_TOKEN Start handshaking sequence.GRANT_REQUEST_TOKEN Conclude handshaking sequence

Event Progress Control

DO_REQUEST_EVENT Initiate an event for the currentlypending request

THREAD_JOIN_EVENT Join the calling thread to any eventpending on its thread group

KILL_FORCE_EVENT Kill the event after being initiatedimmaturely

a Handshaking is separated from request submission and removal asthe checkpoint/restart subsystem could be deployed with disk transfermedium for process checkpointing which does not require handshaking.

B. Multithreaded Process State Consistency

The kernel subsystems and third party modules requireconsistent access to all process states. However, a runningprocess would modify its states as a part of the executionside effect, and some of these states are only available inparticular process execution modes [29]. Therefore, the coresystem control layer should temporarily render the processinactive at specific points in its execution to achieve therequested consistency. The user space interface layer providestwo methods to quiesce a process, including: synchronous andasynchronous.

In synchronous method, all threads inside a process shouldcollaborate to quiesce the process. All threads of the processshould call a specified system call to force entering to thekernel space where the thread would be blocked. However, theasynchronous method would exploit Linux freezer subsystem.

C. Process Migration Event Strategies

A process migration event strategy refers to one of theoptions that a user can choose to checkpoint and restart aprocess on different machines. Depending on the executioncontext type in the core system control layer and quiescingmethod, various strategies could be deployed for migration ofa process in the user space interface layer.

VI. MIGRATION EVENT BEHAVIOR

A single process was migrated and the steps taken by thesystem are depicted in Fig. 2. The proposed migration systemwould not stick to the complete transfer of a specific statebut would give the opportunity to all kernel subsystems orthird party modules to be involved in every attempt. Therefore,they could assert their internal erroneous state and preventunnecessary operations.

978-1-4673-4992-5/13/$31.00 ©2013 IEEE

2013 International Conference of Information and Communication Technology (ICoICT)

413

Page 5: [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung, Indonesia (2013.03.20-2013.03.22)] 2013 International Conference of Information

Figure 2. Migration Event Behavior in the Proposed Migration System

(a) (b) (c)

Figure 3. Machine CPU/Memory Load: (a) source machine before migration, (b) source machine after migration, (c) destination machine after migration

The most demanding step in the process migration is theprocess address space transfer. The address space kernelmetadata is passed to the destination machine allows theprocess to resume its execution on the destination machine.The pages would be transferred as a concluding step in themigration event. The destination machine would inform thesource machine for the successful restoration of address space.However, the confirmation commands transferred to the sourcemachine could overlap the requests for an out-of-order page.

The CPU and memory loads for the source and destinationmachines were monitored throughout the migration event asdemonstrated in Fig. 3.

VII. FRAMEWORK EVALUATION

The migration framework uniquely identifies the kernelsubsystems or third party modules so that the migrationmechanism can be specified while a request for an eventis submitted to the checkpoint/restart subsystem. A samplescenario is demonstrated in Fig. 4 where an interactive processis migrated. The migrated process uses resources which areinitially allocated by the temporary daemon child if the userdeclines to handle the respective resources using the kernelsubsystems or third party modules. As a result, the migrationevent could be attuned by the user to fulfill the requirementof interactive process migration. A distributed terminal couldbe implemented in user space exploiting the Linux pseudoterminal devices [30]. The temporary daemon child wouldduplicate the standard input, output, and error streams to referto a same distributed pseudo terminal slave device. Other

process states are handled by the checkpoint/restart subsystem,e.g. process address space and processor execution context.

The portability of a process migration system relates to thediversity of the supported underlying environment. The mi-gration framework defines a programming interface which canbe utilized by programmers or hardware vendors to introduceany anonymous kernel subsystem or dedicated hardware tothe migration system. This would relieve the system fromconfining in an environment boundary and allow dynamicextension of the process migration system.

Transparency requires the migrated process not to be aware

Figure 4. Sample: Interactive User Process Migration

978-1-4673-4992-5/13/$31.00 ©2013 IEEE

2013 International Conference of Information and Communication Technology (ICoICT)

414

Page 6: [IEEE 2013 International Conference of Information and Communication Technology (ICoICT) - Bandung, Indonesia (2013.03.20-2013.03.22)] 2013 International Conference of Information

Table VCOMPARISON WITH OTHER PROCESS MIGRATION SYSTEMS

Project Name FlexibleMigrationMechanism

Portability Transparency

OpenMOSIX [31] Hard-Coded No Default OnlyKerrighed [11] Hard-Coded No Default OnlyCondor [8] Hard-Coded No Default OnlyCurrent Framework User Defined Yes Process Aided

of the migration event. However, processes typically havemore knowledge about their own behavior which could beutilized for better overall performance. Using process migra-tion event strategies, a process could decide to take part inmigration while being completely aware of the event or requestfor fully transparent event. As a result, both legacy processeswithout any knowledge of migration system existence andthose which are aware of migration system could be migratedirrespective to their internal implementation in a single event.

To obtain a clear insight on contributions made by thisarchitecture, refer to Table V for a comparison between someof the proprietary products and the current architecture. Themigration framework supports flexible migration mechanismcontrary to other products, which have their migration mech-anism hard-coded inside the process migration system. Thetransparency provided by the migration system could obtainhints from the user process for a more efficient migrationevent if user process decides to participate otherwise normaltransparency would be enforced.

VIII. CONCLUSION

In this paper we presented a process migration frameworkfor Linux OS. Concentration is devoted to the architecture,design and implementation of the framework. No assumptionis made for the underlying OS or hardware architecture. Thisframework provides the opportunity for users to implementtheir own fully tuned process migration system accordingto their demands. The characteristic of the whole system isdefined by the subset of kernel subsystem or transfer mediummodules loaded in to the framework.

This architecture imposes no limitation on the implementa-tion of the kernel subsystem of transfer medium module as itsupports the concept of migration event strategies. Therefore,the same framework can be exploited for implementation ofthe checkpoint/restart system.

REFERENCES

[1] M. Steen and A. Tanenbaum, “Distributed systems: Principles andparadigms, 2nd edition,” 2007.

[2] D. Milojicic, F. Douglis, Y. Paindaveine, R. Wheeler, and S. Zhou,“Process migration,” ACM Computing Surveys (CSUR), vol. 32, no. 3,pp. 241–299, 2000.

[3] J. Smith, “A survey of process migration mechanisms,” ACM SIGOPSOperating Systems Review, vol. 22, no. 3, pp. 28–40, 1988.

[4] S. Burd, Systems Architecture. COURSE TECHNOLOGY, 2010.[5] M. Feeley, W. Morgan, E. Pighin, A. Karlin, H. Levy, and C. Thekkath,

“Implementing global memory management in a workstation cluster,” inACM SIGOPS Operating Systems Review, vol. 29, no. 5. ACM, 1995,pp. 201–212.

[6] G. Vallee, C. Morin, R. Lottiaux, J. Berthou, and I. Malen, “Processmigration based on gobelins distributed shared memory,” in ClusterComputing and the Grid, 2002. 2nd IEEE/ACM International Sympo-sium on. IEEE, 2002, pp. 325–325.

[7] F. Douglis, “Transparent process migration in the sprite operatingsystem,” Ph.D. dissertation, Citeseer, 1990.

[8] M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny, Checkpointand migration of UNIX processes in the Condor distributed processingsystem. Computer Sciences Department, University of Wisconsin, 1997.

[9] B. Walker, G. Popek, R. English, C. Kline, and G. Thiel, “The locusdistributed operating system,” in ACM SIGOPS Operating SystemsReview, vol. 17, no. 5. Acm, 1983, pp. 49–70.

[10] M. Powell and B. Miller, Process migration in DEMOS/MP. ACM,1983, vol. 17, no. 5.

[11] C. Morin, P. Gallard, R. Lottiaux, and G. Vallée, “Towards an efficientsingle system image cluster operating system,” Future Generation Com-puter Systems, vol. 20, no. 4, pp. 505–521, 2004.

[12] R. Lottiaux and C. Morin, “Containers: A sound basis for a true singlesystem image,” in Cluster Computing and the Grid, 2001. Proceedings.First IEEE/ACM International Symposium on. IEEE, 2001, pp. 66–73.

[13] S. Osman, D. Subhraveti, G. Su, and J. Nieh, “The design and imple-mentation of zap: A system for migrating computing environments,”ACM SIGOPS Operating Systems Review, vol. 36, no. SI, pp. 361–376,2002.

[14] A. Mirkin, A. Kuznetsov, and K. Kolyshkin, “Containers checkpointingand live migration,” in Ottawa Linux Symposium, 2008.

[15] Xen. (2012) Home of the xen hypervisor, the powerful open sourceindustry standard for virtualization. http://www.xen.org/.

[16] VMware. (2012) Vmware virtualization software for desktops, serversand virtual machines for public and private cloud solutions.http://www.vmware.com/.

[17] P. Clements, F. Bachmann, L. Bass, D. Garlan, P. Merson, J. Ivers,R. Little, and R. Nord, Documenting Software Architectures: Views andBeyond, ser. SEI Series in Software Engineering. Addison-Wesley,2010.

[18] O. Laadan and S. Hallyn, “Linux-cr: Transparent application checkpoint-restart in linux,” in Proceedings of the 12th Annual Ottawa LinuxSymposium (OLS), Ottawa, Canada, 2010.

[19] J. Duell, P. Hargrove, and E. Roman, The design and implementation ofBerkeley Lab’s linuxcheckpoint/restart. Citeseer, 2005.

[20] W. Mauerer, Professional Linux kernel architecture. Wrox, 2008.[21] H. Zhong and J. Nieh, “Crak: Linux checkpoint/restart as a kernel

module,” Department of Computer Science, Columbia University, Tech.Rep. CUCS-014-01, 2001.

[22] R. Prasad, M. Jain, and C. Dovrolis, “Socket buffer auto-sizing for high-performance data transfers,” Journal of GRID computing, vol. 1, no. 4,pp. 361–376, 2003.

[23] W. Dieter and J. Lumpp Jr, “User-level checkpointing for linuxthreadsprograms,” in Proceedings of the FREENIX Track: 2001 USENIX AnnualTechnical Conference. USENIX Association, 2001, pp. 81–92.

[24] B. Myers and M. Rosson, “Survey on user interface programming,” inProceedings of the SIGCHI conference on Human factors in computingsystems. ACM, 1992, pp. 195–202.

[25] O. Laadan and J. Nieh, “Transparent checkpoint-restart of multipleprocesses on commodity operating systems,” in 2007 USENIX AnnualTechnical Conference on Proceedings of the USENIX Annual TechnicalConference. USENIX Association, 2007, p. 25.

[26] R. Gioiosa, J. Sancho, S. Jiang, and F. Petrini, “Transparent, incrementalcheckpointing at kernel level: a foundation for fault tolerance for parallelcomputers,” |, p. 9, 2005.

[27] J. Corbet, A. Rubini, and G. Kroah-Hartman, Linux device drivers, ser.O’Reilly Software Series. O’Reilly, 2005.

[28] S. Smith, J. Mosier, M. Corporation, and U. S. A. F. S. C. E. S. Division,Guidelines for designing user interface software. Citeseer, 1986.

[29] G. Vallee, R. Lottiaux, D. Margery, C. Morin, and J. Berthou, “Ghostprocess: a sound basis to implement process duplication, migrationand checkpoint/restart in linux clusters,” in Parallel and DistributedComputing, 2005. ISPDC 2005. The 4th International Symposium on.IEEE, 2005, pp. 97–104.

[30] R. Stevens and S. Rago, Advanced Programming in the UNIX (R)Environment. Addison-Wesley Professional, 2005.

[31] M. Bar, A. Maya, and K. Snehal, “openmosix. linux congress 2003,”2003.

978-1-4673-4992-5/13/$31.00 ©2013 IEEE

2013 International Conference of Information and Communication Technology (ICoICT)

415