? L.B. Ryzhyk, A.Y. Burtsev, 2005 
44                                    ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 
??? 681.142.2 
ARCHITECTURAL DESIGN OF E1 DISTRIBUTED OPERATING 
SYSTEM 
L.B. RYZHYK, A.Y. BURTSEV 
This paper presents the distributed operating system architecture based on the con-
cept of replication of distributed objects. A complete or partial copy of distributed 
object?s state is placed in each node where the object is used. Copy coherence is en-
sured by replication algorithms. For each object the most efficient access algorithm, 
taking its semantics into account, can be applied. All E1 subsystems are designed to 
support replication, which makes E1 a convenient platform for developing reliable 
distributed applications. 
INTRODUCTION 
In modern operating systems distributed computations support is usually limited 
to network protocol stack. However, construction of distributed applications re-
quires more advanced communication facilities such as remote procedure calls, 
distributed synchronization primitives and distributed shared memory. The grow-
ing complexity of software systems necessitates a new software layer, providing 
developers with efficient, reliable and secure access to network resources.  
Currently, this layer is most frequently implemented by middleware systems. 
Middleware is defined as a layer of software above the operating system but be-
low the application program that provides a common programming abstraction 
across a distributed system [1, 2]. For example, in distributed data processing sys-
tems, component-oriented middleware, which supports the common object model 
in different network nodes, is widely used [3, 4, 5, 6]. 
The alternative approach consists of integrating distributed computations 
support into the operating system. Nowadays, the advanced communication 
facilities have become an essential software component like file system or 
inter-process communication facilities. OS-level implementation allows the 
construction of the most effective architecture, supporting the unified set of 
primitives for access to local and remote resources. 
Distributed OS is a software platform providing applications with common 
execution environment within distributed system, including means of access to 
hardware and software resources of the system and application communication 
facilities.  
This paper presents architectural design of E1 distributed operating system. 
Such OS should meet three major requirements: 
1. Convenient interface. Due to the nature of distributed systems, it is more 
difficult for users and software developers to work in them, than in centralized 
ones. Among the complexity factors one can name: heterogeneity of access to 
local and remote resources, high probability of faults, asynchronous communica-
tion environment, non-uniform memory access. To enable computations in such 
an environment, the distributed OS must support a set of abstractions, isolating 
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 45 
developers from the listed complexities and providing a convenient interface to all 
the resources of a distributed system. 
2. Efficiency. OS efficiency is determined mainly by temporal 
characteristics of access to various resources. In the distributed environment 
network latencies become a productivity bottleneck. Therefore distributed OS 
should minimize the influence of remote communication on software operation. 
3. Reliability. In the absence of fault tolerance mechanisms, a single node 
or network connection failure can put the whole distributed system out of order 
and cause loss of data. Therefore the distributed OS should provide reliable 
computations support, including redundant storage and execution, as well as fault 
recovery. 
1. ?1 CONCEPTS 
This section presents our approach to implementation of the above requirements. 
Convenient interface 
To provide applications with convenient interface to all computer network re-
sources, ?1 implements a Single System Image abstraction, which implies that for 
application software the distributed system looks like a centralized one. This fea-
ture allows a developer to ignore the physical layout of resources but instead fo-
cus on the functionality they provide. 
Implementation of single system image in ?1 is based on abstraction of the 
distributed object. Distributed objects encapsulate state and functionality of all 
OS components. Each object exposes a set of well defined interfaces that can be 
invoked by other objects. Objects are globally accessible by their interfaces from 
all nodes of a system. 
Both OS components and application software relies on a single E1 object 
model, i.e. ?1 applications are constructed as a collection of distributed objects. 
To an application programmer the computer network looks and fills like a single 
virtual computer, its software structured as a set of objects. Access to the 
hardware resources, as well as the interaction between software components are 
reduced to invoking methods on the corresponding objects. 
Efficiency 
The distributed software systems consist of interacting components located in dif-
ferent network nodes. As the operations, performed in each node, often depend on 
instructions and data received from remote components, the communication la-
tencies eventually affect the performance of the entire system. Two popular tech-
niques, used to overcome this effect are: replacing remote communication by lo-
cal operations, and removing remote communication beyond the critical execution 
paths. Replacing remote communication by a local interaction implies that the 
state of a server object is cached in the client nodes. In this case read operations 
are performed locally on the cached copy of an object. Modifications can some-
times also be applied locally with the subsequent delayed delivery of changes to a 
server. Removing the remote communication beyond the critical paths allows the 
reduction of the time spent by main computational threads waiting for remote 
messages. For this purpose additional helper threads, that speculatively obtain the 
data, required by main computations, are used.  
L.B. Ryzhyk, A.Y. Burtsev 
ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 46 
Object replication constitutes a generalization of the indicated approaches. 
In ?1 a complete or partial copy of a distributed object?s state can be placed in 
each node where the object is used. The state of an object is synchronized (repli-
cated) among nodes. Each invocation of an object method is handled by its replica 
in the node, where the call originates. Communication with the remote replicas 
occurs only when required by the replication protocol, for example, when it is 
necessary to obtain a missing part of an object state. 
Thus, the distributed communication in ?1 is moved inside the distributed 
object. Hence, efficiency of access to an object is determined by efficiency of the 
replication strategy. Obviously, there is no single replication strategy, equally ef-
fective for all types of objects. Therefore ?1 does not impose the use of any spe-
cific strategy or a collection of strategies. Instead, E1 provides services and tools 
to simplify the construction of replicated objects. In effect, for each class of ob-
jects the most efficient access algorithm, which takes into account its semantics, 
can be applied. Such algorithm can be either selected from a set of existing repli-
cation strategies, or designed specifically for the given class of objects. 
Reliability 
E1 provides support for reliable distributed applications development through rep-
lication and persistence. Replication can appear not only as a means of efficient 
access to an object, but also as a redundancy mechanism. For example, by sup-
porting consistent copies of an object in n different nodes, it is possible to tolerate 
up to n ? 1 node crashes [7]. Thus, replication utilizes hardware redundancy of 
the distributed system to provide reliable execution of applications.  
Persistence is the ability of the objects to exist for unlimited time, 
irrespectively of whether a system functions continuously. For this purpose a 
copy of an object is kept in nonvolatile storage and is being synchronized with an 
active copy. The stored object state is always correct, even in the face of 
hardware failures. (As for now, support for persistence in E1 is not designed in 
sufficient details. Therefore, it is not covered in this paper.) 
Another important principle underlying the ?1 architecture is component 
model support. According to this principle, the replicated objects model is 
extended to a valid component model. Such architecture makes ?1 a convenient 
platform for the development of distributed applications. 
On the low level, the E1 component model relies on the execution 
primitives, which are essentially different from the ones used by the conventional 
OS. The primary execution abstraction in the conventional systems is process or 
task, representing an instance of a program, loaded into memory. Each task runs 
in a separate address space. Within a task several execution threads can exist. This 
model does not appropriately support interacting objects of medium granularity 
[8]. Therefore, we abandon it for the new execution model, tailored for 
component systems. In ?1 all executable code and data belong to objects. All 
objects reside within a single 64-bit address space. ?1 supports the migrating 
threads model [8], in which execution of a thread, invoking an object method, is 
transferred to the context of the invoked object. Migrating threads allow the 
departure from a server-style object design, where an object runs one or several 
threads to process incoming method invocations. 
Since both OS services and application software are developed within the 
framework of a single ?1 component model, the model has to be highly flexible, 
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 47 
while introducing minimal overhead. These requirements have guided the design 
of ?1 component services, presented in the following sections of this paper. 
2. COMPARISON WITH OTHER SYSTEMS 
Modern distributed operating systems can be divided into two classes, based on 
the method of access to distributed system resources: client/server systems and 
distributed shared memory (DSM)-based systems. ?1 implements a third ap-
proach, based on replicated objects. This section presents a brief characteristic of 
existing architectures and compares them to the ?1 architecture. 
In client/server OS?es, similar to E1, all resources of the distributed system 
are represented by objects, which are uniformly accessible from all nodes. 
However, objects are not physically distributed. Each object is located in one of 
system nodes under control of a server process. Global availability of objects is 
provided by the remote method invocation mechanism, which hides the 
distributed nature of interactions from the client. Two well-known examples of 
client/server distributed OS?es are Mach [9] and Chorus [10]. The advantage of 
client/server architecture is its relative simplicity. However, it does not provide a 
locality of access to resources and, therefore, does not eliminate the influence of 
network latencies on the performance of the system. Another disadvantage of 
client/server architecture is the lack of reliability mechanisms. Failure of a single 
node can cause a wave of software failures all over the system, a phenomenon 
known as the ?domino effect?. Furthermore, client/server architecture lacks 
scalability, as it does not support load balancing among nodes. 
DSM-based OS?es [11, 12, 13, 14] use essentially different approach to 
implement the distributed object model. The main idea underlying these systems 
is to emulate common memory in the distributed environment. The state and 
executable code of each object are globally accessible from each node by their 
virtual addresses. On the first access to an object, the OS creates local copies of 
its pages. The copies are synchronized using memory coherence algorithms [15]. 
These algorithms can be thought of as universal replication strategies, applicable 
for any types of objects. Unfortunately, they often fail to provide acceptable 
efficiency of access. To implement efficient access to an object the replication 
algorithm should take into account its semantics. Algorithms, working on the 
level of virtual memory pages are obviously unaware of object semantics. Thus, 
we observe a natural trade-off between generality and efficiency of the replication 
strategy. 
In E1, the efficient and reliable access to each object is ensured by selecting 
replication strategy on the basis of the object?s semantics.  
Two examples of standard E1 replication strategies are client/server 
replication and memory object replication. These strategies reproduce the types of 
access to distributed objects used respectively by client/server and DSM-based 
OS?es. Thus, ?1 can be considered a generalization of these architectures. 
3. ?1 OVERVIEW 
3.1. Distributed object 
Distributed objects are first-class citizens in E1. All OS services, as well as appli-
cation software are constructed from distributed objects. 
L.B. Ryzhyk, A.Y. Burtsev 
ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 48 
All objects reside in a single virtual 64-bit address space. Each object expos-
es one or several interfaces consisting of a set of methods. Each distributed object 
interface is identified by its unique 64-bit address. Any object, knowing this ad-
dress, can invoke the interface methods from any network node. All interfaces in 
?1 contain a standard method of navigation between the interfaces of the same 
distributed object. 
Objects in ?1 can be physically distributed, i.e. keeping partial or complete 
copies of the state in several nodes. The copy of an object?s state in one of the 
system nodes is called distributed object replica. The distribution of the state 
among replicas and replica synchronization is called object replication. 
The E1 distributed object architecture aims to separate an object?s semantics 
and replication strategy. An object developer implements only the object?s seman-
tics or functionality in local (non-replicated) cases, while a replication strategy 
supplier implements the replication algorithm. Replication strategy can be univer-
sal, i.e. applicable to objects of various classes. At the same time, objects of the 
same class can be replicated using different strategies. 
To achieve the goal above, we put forward the distributed object architec-
ture, in which object semantics and replication strategy are implemented by sepa-
rate structural units. In E1, distributed objects are composed of local objects. A 
local object is limited to one node of the distributed system. Note that a similar 
distributed object architecture has been implemented by Globe object-oriented 
middleware [6]. 
The E1 local object resembles the structure of a C++ object [16]. It consists 
of a fixed-size section, containing data members and pointers to interfaces (me-
thod tables), and the data structures, dynamically allocated by the object from 
heap. In terms of C++, the interfaces of the local object are purely virtual base 
classes, from which the object is inherited. A similar approach is taken by COM [4]. 
The distributed object architecture is shown in Fig. 1. 
In a simple case when the distributed object has only one replica, it is identi-
fied with a single local object, semantics object. Semantics object contains the 
distributed object state, exposes the distributed object interfaces and implements 
its functionality. 
When the reference to the distributed object is created in the node, where 
there is no replica of the given object yet, a new replica is created in this node. 
The structure of distributed object with several replicas is shown in Figure 1. A 
copy of the semantics object is placed in each node, where the distributed object 
is represented. To ensure global accessibility of the distributed object interfaces 
by their virtual addresses, semantics objects are placed to the same virtual memo-
ry location in all nodes. The distributed object integrity is maintained by replica-
tion objects, complementing the semantics objects in each node. Replication ob-
jects implement the distributed object replication protocol. Replication object 
substitutes implementations of semantics object interfaces by its own implementa-
tions, which allows it to process the distributed object method invocations. (De-
tailed discussion of interface substitution technique lies beyond the scope of this 
paper.) While processing the invocation, replication object can refer to the seman-
tics object to execute necessary operations over the local object state, as well as 
communicate with remote replication objects to perform synchronization and re-
mote execution of operations. Interface substitution is transparent for other ob-
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 49 
jects and can be thought of as aggregation of the semantics object by the replica-
tion object. Such architecture eliminates the overhead of supporting replication 
objects for the distributed objects that are not actually distributed, i.e. have only 
one replica. If an object with several replicas eventually remains with only one 
replica, its replication object is deleted. 
The presented distributed object architecture has two important advantages. 
First of all, it effectively separates the object?s semantics and replication strategy. 
Secondly, it does not impose any essential limitations on replication algorithms 
used. Hence, for each object the access protocol, providing high efficiency, while 
preserving required reliability guarantees, can be applied. 
3.2. ?1 architecture 
Fig. 2 shows a generalized ?1 architecture. E1 consists of a microkernel and a set 
of distributed objects acting at the user level. The microkernel supports a minimal 
set of primitives that are necessary for OS construction, such as: address spaces, 
threads, IPC and interrupts dispatching. All operating system and application 
functionality is implemented by objects. 
Microkernel-based design has a number of advantages. First, it is potentially 
more reliable than conventional monolithic architecture, as it allows the major 
part of operating system functionality to be moved beyond the privileged kernel. 
Second, microkernel implements a flexible set of primitives, providing a high 
level of hardware abstraction, while imposing little or no limitations on operating 
system architecture. Therefore, building an operating system on top of an existing 
microkernel is significantly easier than developing from scratch. Besides, since 
Semantics
object
Node 1 Node 2
Node 3
Replication
object
Replication
object
Semantics
object
Semantics
object
Replication
object
Object interface
Network
Interface 1 Interface 1
Interface 1
Fig. 1. The distributed object architecture 
Semantics 
object 
Replication 
j t 
L.B. Ryzhyk, A.Y. Burtsev 
ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 50 
operating system services run at user level, rather than inside the kernel, it is 
possible to replace or update certain services at run-time, or even start several 
versions of a service simultaneously. Third, and finally, some of the existing 
microkernels achieve an IPC performance an order of magnitude over monolithic 
kernels [17]. Among these are microkernels of the L4 family [18, 19, 20, 21]. For 
object-oriented operating systems, like E1, it is extremely important to minimize 
the latency of control transfer between address spaces; therefore, L4 has been 
selected as the microkernel of ?1. 
4. OBJECT INTERACTION AND PROTECTION 
In order to perform useful operations, objects interact by means of method calls. 
Safe execution of applications is provided by a protection mechanism which 
guarantees that object interaction is controlled by precisely defined access control 
policy. 
4.1. Protection Domains 
OS protection model has to be based on facilities provided by the underlying 
hardware platform, primarily, virtual memory mechanisms in modern micropro-
cessors these are. Therefore, object protection is closely related to virtual memory 
organization. 
In ?1 all objects reside in a single virtual address space. Object interfaces are 
invoked directly by their virtual addresses, just as in C++ methods are invoked 
through the pointer to an object. The major advantage of such a virtual memory 
organization is a convenient programming model, which greatly simplifies the 
communication between objects. A single E1 address space spans the whole 
distributed system. Hence, all objects in the system are accessible by their unique 
virtual addresses from any network node. 
Nevertheless, in order to provide effective object isolation in E1, we 
introduce the notion of protection domain, representing a part of a single virtual 
address space, within which one or several distributed objects are located. Each 
object in E1 belongs to exactly one domain. Associated to each domain is a 
Fig. 2. Generalized ?1 architecture 
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 51 
separate protection context, isolating internal domain objects from the other 
objects in the system. However, objects inside domain are not protected from 
each other. Method calls inside domain do not require the protection context 
switch. Thus, domains offer a trade-off between efficiency and safety of 
interaction. 
Domains provide global isolation of objects within the framework of a dis-
tributed system. If the object has several replicas, then in every node its replica 
resides in the same domain and at the same memory address. Therefore, if two 
objects are isolated from each other, i.e. reside in different domains, then their 
replicas will be placed in different domains in all nodes. Like other ?1 primitives, 
domains are distributed objects. A replica of each domain is placed in each node, 
where there is a replica of at least one object, belonging to this domain.  
4.2. Crossdomain calls 
Objects in ?1 interact via method calls. This type of communication is synchron-
ous. Each call is accompanied by a set of input and output parameters, specified 
by the object developer by means of Interface Definition language (IDL). 
In ?1 all method calls are executed by the local replica of the invoked object. 
In order to guarantee that such a replica will exist and will not be deleted by the 
garbage collection system, one must create a reference to an object before using 
any of its methods. 
Object methods are invoked through a pointer to one of its interfaces. Since 
all objects in ?1 are located in a single address space, this pointer is valid in any 
system node and in any protection domain.  
Within the domain boundaries, method calls work very similar to C++ 
language: arguments are placed in stack and registers, and the control is 
transferred to the address specified in the method table of an invoked object. 
Implementation of crossdomain calls is more complicated, although for the 
interacting objects the difference is transparent. An attempt to access an object 
outside the local domain triggers a page fault exception, handled by 
Crossdomain Adapter (CA), located in the same domain as the object where the 
exception occurs. The CA?s task is to prepare the stack, containing the invocation 
arguments, which will be mapped into the target domain and on which the method 
will be executed. All arguments are copied directly to the new stack. Although 
crossdomain call mechanism does not explicitly support passing large data arrays 
without copying, a similar functionality can be achieved by passing pointers to 
objects, representing shared memory regions. 
Having created the call stack, the CA transfers control to the microkernel to 
complete the call. The kernel then refers to the Object Registry (see Section 5.1) 
for the validation of the caller?s capabilities to invoke the given operation, and 
finally maps the call stack to the target domain and transfers control to the called 
object. Return from crossdomain call occurs in a similar way, through the target 
domain CA. 
4.3. Threads 
The ?1 execution model is based on the migrating threads concept [8]. At any 
point in time each thread runs in the context of a specific object. During method 
invocation, execution of a thread is transferred to the target object. Thus, the 
thread is not permanently bound to any specific object or domain. As shown in 
L.B. Ryzhyk, A.Y. Burtsev 
ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 52 
[8], migrating threads are more appropriate for object-oriented environment, than 
traditional static threads.  
Since in ?1, distributed object invocation is actually an invocation of its 
local replica, it does not cause the transfer of thread execution to a remote node. 
There is, however, one particular situation, when such transfer occurs. It is when 
the replication strategy requires migration of object replica between network 
nodes. The object state is then moved to the target node, along with all of its 
threads. After completing execution within the migrated object replica, threads 
return to their home nodes. 
5. COMPONENT SERVICES 
This section describes the ?1 services, which extend the distributed object model 
to a full-featured component model. Among these are Object Registry, Access 
Control Server, Global Naming Server, and garbage collection system. (Detailed 
description of dynamic class loading mechanism lies beyond the scope of this pa-
per.) 
5.1. Object Registry 
Object Registry lies at the heart of the ?1 component model. It maintains the 
information about all local replicas of distributed objects, including their types, 
virtual addresses, host domain IDs and reference counting information. The 
Registry coordinates execution of such operations as creation and deletion of the 
distributed objects and their replicas, crossdomain calls and garbage collection. 
Crossdomain calls validation 
At the time of crossdomain call, the microkernel refers to the Object Registry 
through an IAccessValidator interface to assure the existence of the invoked ob-
ject?s replica in a local node, and also to validate the caller?s rights to perform the 
given operation. 
The Registry itself does not implement access control policy. Instead, for the 
verification of call legitimacy it refers to the Access Control Server, which will be 
discussed in the next section. 
To improve the efficiency of crossdomain communication, information on 
objects and rights can be cached by the microkernel, which avoids having to look 
up the Registry for each crossdomain call.  
5.2. Access Control Server 
Access Control Server (ACS) is a distributed object, which enforces a single 
access control policy across the distributed system by verifying the legitimacy of 
each call. 
Selection of an operating system access control model is very challenging. 
Having its own limitations and drawbacks, none of the existing protection models 
can be considered generally optimal. Therefore ?1 does not impose any specific 
access control policy to be implemented by ACS. Nor does it limit the ACS 
replication strategy or data structures used to store information on rights. 
However, the ACS must implement the IAccessControl interface, used by the 
Object Registry for crossdomain calls validation. The main method of the 
IAccessControl interface, namely ValidateAccess of this interface confirms or 
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 53 
denies the validity of a call, based on the thread identifier, caller and callee 
identities, and the invoked method. 
A variety of protection models can be implemented within the framework of 
the presented approach, including various capability [22, 23, 24, 25] and access 
control list (ACL) [26] models. It is also possible to select subjects and objects of 
the model in different ways. Some possible choices for objects are: distributed 
object, a single interface or even method. While for the role of subjects, one can 
use distributed object, protection domain or user.  
5.3. Global Name Server 
The Global Name Server (GNS) implements a distributed object location proto-
col, which maps the object?s virtual address to the list of its contact points, i.e. 
network nodes, containing the object?s replicas. GNS is used by the Object Regi-
stry to initiate the creation of a new distributed object replica in a local node. 
The choice of a specific object location algorithm, implemented by GNS, 
should be based on the scale of the system and on the frequency with which nodes 
are added to and removed from it. For small systems a centralized protocol with 
one or several name servers is preferable. For large-scale systems with stable 
structures the hierarchy of domain servers [27] is usually used. While for highly 
dynamic systems decentralized naming protocols, e.g. [28], are most effective. 
5.4. Garbage collection 
The purpose of the ?1 garbage collection system is to detect and delete of unused 
distributed object replicas.  
In each node, the garbage collection system maintains only the information 
concerning local replicas. For each replica, the list of strong references to it, as 
well as the list of references it holds to other replicas, are stored. Both distributed 
and local references are taken into account. This information is sufficient to trace 
any changes in the reference graph, including those caused by node or network 
connection failures, while the simple references counting does not account for 
such situations correctly. 
Cyclic distributed garbage collection in E1 is based on the partial reference 
graph tracing procedure, which verifies the reachability of some specified replica 
from Root Object Set [29]. The Root Set consists of system objects, which by 
definition are never regarded as garbage.  
6. REPLICATION 
In ?1 the efficiency of the access to distributed object is determined by the effi-
ciency of the replication strategy. The most efficient strategies are those that take 
into account the properties of certain object categories.  
6.1. Distributed object replicas communication 
Any non-trivial replication strategy requires some communication layer to organ-
ize the interaction between distributed object replicas. In ?1, such a layer is pro-
vided by the Group RPC (GRPC) service, supporting transparent invocation of 
remote replication objects. GRPC in turn relies upon the Group Communication 
L.B. Ryzhyk, A.Y. Burtsev 
ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 54 
mechanism which supports the exchange of unicast and multicast messages with 
various delivery ordering and reliability properties. 
Group communication mechanism. For the purpose of this discussion, a 
group is a communication-level abstraction, which corresponds to a set of a single 
distributed object?s replicas. The ?1 group communication system includes two 
main services: group membership service and message delivery service.  
Group membership service allows the addition and removal of object repli-
cas dynamically. In addition, it is responsible for maintaining the consistency of 
the group in the face of hardware and software failures, which might cause unex-
pected replica crashes or group fragmentation. This is a nontrivial task, since in an 
asynchronous distributed environment it is impossible to distinguish a node crash 
from temporary inaccessibility caused by network delays [30]. To overcome this 
obstacle, one can use a distributed algorithm, determining accessible group mem-
bers and reaching a consensus concerning a new group structure among its surviv-
ing members [31]. Such an algorithm is implemented by a special membership 
service component ? Failure Detector (FD).  
If some of the group members become inaccessible as a result of network 
partitioning, rather than node failures, group fragmentation occurs. In this case, 
group membership service initiates formation of a new group in each fragment. 
Later on, the fragments may remerge into a single group again. 
Message delivery service provides primitives for sending unicast and multi-
cast messages between group members. For each message session, the delivery 
protocol properties can be specified. The most important ones are reliability of 
delivery and message ordering guarantees.  
We plan to build the ?1 group communication system on one of the existing 
implementations [32, 33, 34]. Such systems are initially designed to provide rep-
lication support within more complex software systems. Therefore they can be 
relatively easily integrated into ?1. Also, being highly modular, they can be easily 
extended to support new message delivery properties [33].  
Group RPC. Message-oriented communication primitives form the basis for 
distributed object replicas interaction. However, it is desirable to provide the rep-
lication strategy developer with a more convenient procedural model, allowing 
direct access to methods of the remote replication objects. In the case of point-to-
point communication, the remote procedure call (RPC) mechanism is generally 
used to invoke operations on remote objects. The group remote procedure call 
(GRPC) is the generalization of RPC for the case of multicast communication. On 
the basis of group communication services described above, the GRPC imple-
ments a single primitive allowing a simultaneous invocation of several remote 
objects. 
Like regular RPC, GRPC implements remote invocation with the help of 
client and server stabs. Stubs are compiled automatically from IDL definitions of 
objects?. Client stubs locally expose interfaces of remote replication objects. Each 
call to a client stub is converted into a message, sent to one or several remote rep-
licas by means of a group communication system. The message is delivered to a 
server stub, which transforms it into a call of an appropriate replication object 
method. The result of the invocation is sent back to the caller?s client stub. Hav-
ing obtained the necessary number of responses (determined by the semantics of 
the call), the client stub returns control to the calling object. 
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 55 
6.2. Serialization interface 
Since replication object is generally unaware of the semantics object?s structure, 
the semantics object must implement serialization operations itself. These opera-
tions are available to the replication object through the ISerializable interface. 
Some languages, e.g. Java and C#, provide support for automatic object 
serialization/deserialization, based on run-time type information. We expect that 
these languages will be widely used for application programming in ?1.  
However, in order to support languages, such as C++, in which automatic 
object serialization is not generally possible either at the time of compilation, or at 
run-time [35], it is desirable to develop a language-independent method for 
generating serialization interface.  
In ?1 the support for automatic objects serialization is provided by the 
memory management system. Each local object consists of a static part and 
dynamically allocated data. The dynamic memory allocation interface is exposed 
by Heap objects. Heap object represents a continuous virtual memory area, upon 
which allocation and deallocation operations are defined. Each domain provides 
the default local heap, which can be used by all its objects. Besides, any object 
can create a separate heap and allocate memory only from it. To serialize such 
object, it is enough to store the structure of the heap plus the object?s static part in 
some data packet. At object deserialization, the heap is restored in a new node in 
the same virtual address. So, the problem of serialization/deserialization of the 
semantics object is reduced to a far simpler problem of serialization/deserialization 
of a Heap object. This approach is language-independent and can be used for 
objects of any type. However, it introduces certain memory overhead, since using 
a separate heap per object implies that the object?s dynamic data occupies an 
integral number of physical memory pages. 
CONCLUSION 
This paper presents the architectural design of the E1 distributed operating sys-
tem. In E1 the abstraction of the replicated distributed object is used as a building 
block for both OS components and application software. Since the distributed ob-
ject?s interfaces are globally uniformly accessible across the network, the distrib-
uted nature of the system is hidden from application developers and users. Select-
ing replication strategy for each object according to its semantics allows for 
efficient access, while providing the required degree of reliability. The internal 
architecture of the distributed object effectively separates its semantics and repli-
cation algorithm, which actually reduces the task of distributed object develop-
ment to programming of a local (non-replicated) object. 
Remote object access protocols are implemented by the developers of repli-
cation strategies. Most replication strategies are universal, i.e. can be applied to 
objects of various types. However, the replication strategy can be designed for a 
particular type of objects, which allows it to maximize the efficiency of access by 
taking type-specific properties into account. ?1 provides support for the replica-
tion object development, including group communication system, object persis-
tency, and special synchronization primitives. 
?1 runs on top of a microkernel which supports a minimal set of primitives 
like address spaces, threads, IPC, interrupts dispatching. All operating system and 
L.B. Ryzhyk, A.Y. Burtsev 
ISSN 1681?6048 System Research & Information Technologies, 2005, ? 1 56 
application functionality is implemented by distributed objects. We believe that 
microkernel-based architecture improves modularity and reliability of the system, 
as well as reduces control transfer costs via the kernel, which is especially impor-
tant for systems oriented at intensive communication of medium-grained objects. 
Further work on E1 includes extending the presented architecture with 
support for object persistence. After that we plan to proceed to implementation of 
the first E1 prototype and its subsequent analysis. 
REFERENCES 
1. Bakken E.D. Middleware // Encyclopedia of Distributed Computing. ? Kluwer Aca-
demic Press. ? 2003. ? 1110 p. 
2. Bernstein P.A. Middleware // Communications of the ACM. ? 39. ? ? 2. ? Feb. 
? 1996. ? P. 68? 98. 
3. Object Management Group (OMG). The Common Object Request Broker: Architec-
ture and Specification. Version 3.0 // July. ? 2002. ? P. 1150. 
4. Microsoft Corporation and Digital Equipment Corporation. The Component Object 
Model Specification. Version 0.9 // October. ? 1995. ? P. 274. 
5. Sun Microsystems, Inc. EJB specification 2.0. ? 640 p. 
6. Van Steen M., Homburg P., Tanenbaum A.S. Globe: A Wide ? Area Distributed 
System // IEEE Concurrency.  ? January ? March. ? 1999. ? P. 70?78. 
7. Schneider F. Implementing fault ? tolerant services using the state machine approach: 
a tutorial // ACM Computing Surveys. ? 22(4). ? December. ? 1990. ? 
P. 290?319. 
8. Ford B., Lepreau J. Microkernels Should Support Passive Objects. // Proc. I ? 
WOOOS'93. ? December ? 1993. P. 226?229. 
9. Mach:A New Kernel Foundation for UNIX Development / M.J. Accetta, R.V. Baron, 
W. Bolosky et al. // Proc. of the Summer 1986 USENIX Conference. ? July. ? 
1986. ? P. 93?113. 
10. Chorus Distributed Operating Systems / M. Rozier, V. Abrossimov, F. Armand et al. 
//  Computing Systems. ? 1988. ? 1, ? 4. ? P. 82?98. 
11. Chase J. S. An Operating System Structure for Wide-Address Architectures // PhD 
Thesis, Department of Computer Science and Engineering, University of Wash-
ington. ?August. ? 1995. ? P. 144. 
12. Heiser G., Elphinstone K., Russell S., Vochteloo J. Mungi: A distributed single ad-
dress ? space operating system // Technical Report 9314. School of Computer 
Science and Engineering, The University of New South Wales. ? 1993. ? 
P. 271?280. 
13. The ARIAS Distributed Shared Memory: an Overview / P. Dechamboux, J.-P. Fassino, 
Hagimont D. et al. // Lecture Notes in Computer Science. ? 1997. ? 1175. ? 
P. 56?73. 
14. The Design and Implementation of the Clouds Distributed Operating System / 
P. Dasgupta, R.C. Chen, S. Menon et al. // Computing Systems Journal. ? 3. ? 
USENIX. ? Winter. ? 1990. ? P. 11?46. 
15. Li K. Shared Virtual Memory on Loosely Coupled Multiprocessors // PhD thesis, 
Yale. ? September. ? 1986. ? P. 213. 
16. Stroustrup B. The C++ Programming Language (3rd Edition) // Addison ? Wesley. 
? 1997. ? P. 911. 
17. Achieved IPC performance (still the foundation for extensibility) / J. Liedtke, 
K. Elphinstone, S. Sch?nberg et al. // Proc. 6th Workshop on Hot Topics in Op-
erating Systems (HotOS). ? Chatham (Cape Cod), MA. ? May. ? 1997. ? 
P. 28?31. 
Architectural design of E1 distributed operating system 
???????? ??????????? ?? ???????????? ??????????, 2005, ? 1 57 
18. Liedtke J. L4 reference manual (486, Pentium, Pro) // Research Report RC 20549. ? 
IBM T. J. Watson Research Center, Yorktown Heights, NY. ? September. ? 
1996. ? P. 53. 
19. Elphinstone K., Heiser G., L4 Reference Manual // Technical Report UNSW ? CSE 
? TR ? 9709. ? School of Computer Science and Engineering, University of 
New South Wales. ? December. ? 1997. ? P. 79. 
20. Potts D., Winwood S., Heiser G. L4 Reference Manual: Alpha 21x64 // Technical 
Report UNSW ? CSE ? TR ? 0104. ? University of New South Wales. ? 
Sydney. ? March. ? 2001. ? P. 79. 
21. The L4Ka team. L4 experimental kernel reference manual, version X.2 // ? Febru-
ary. ? 2002. ? P. 182. 
22. Tanenbaum A. S., Mullender S.J., Renesse R. van. Using Sparse Capabilities in a Dis-
tributed Operating System // Proc. Sixth International Conference on Distributed 
Computing Systems. ? IEEE. ? 1986. ? P. 558?563. 
23. Gong L. A Secure Identity ? Based Capability System // 
24. Jones A. K., Lipton R. J., Snyder L. A Linear Time Algorithm for Deciding Security 
// Proc. 17th Symposium on Foundations of Computer Science. ? Houston, Tex-
as. ? 1976. ? P. 33?41. 
Proc. IEEE Symposium on 
Security and Privacy. ? 1989. ? P. 56? 65. 
25. Bishop M., Snyder L. The transfer of information and authority in a protection sys-
tem // Proc. 17th ACM Symposium on Operating Systems Principles. ? 1979. ? 
P. 45?54. 
26. Ritchie D. M., Thompson K. The UNIX time sharing system // Comm. ACM 17:7. ? 
July. ? 1974. ? P. 365?375. 
27. Mockapetris P., Dunlap K. J. Development of the Domain Name System // Proc. 
ACM SIGCOMM. ? Stanford, CA. ? 1988. ? P. 123?133. 
28. Stoica I., Morris R., Karger D., Kaashoek M. F., Balakrishnan H. Chord: A scalable 
peer ? to ? peer lookup service for Internet applications // Technical Report TR 
? 819. ? MIT. ? March. ? 2001. ? P. 149?160. 
29.  Wilson P.R. Uniprocessor garbage collection techniques // Technical report, Univer-
sity of Texas. ? January. ? 1994. ? P. 1?42. 
30. Ricciardi A., Schiper A., Birman K. Understanding Partitions and the ?No Partition? 
Assumption // Proc. 4th IEEE Workshop on Future Trends of Distributed Sys-
tems. ? Lisboa. ? September. ? 1993. ? P. 12?26. 
31. Schiper A., Ricciardi A. Virtually ? Synchronous Communication Based on a Weak 
Failure Suspector // Proc. 23rd International Symposium on Fault ? Tolerant 
Computing Systems. ? Toulouse, France. ? June. ? 1993. ? P. 534?543. 
32. Birman K. P., Joseph T. A., Exploiting Virtual Synchrony in Distributed Systems // 
Proc. 11th ACM Symp. on Operating Systems Principles. ? Austin, TX. ?
November. ? 1987. ? P. 123?138. 
33.  Renesse R. van, Birman K. P., Glade B., Guo K., Hayden M., Hickey T. M., Malki D., 
Vaysburd A., Vogels W. Horus: A Flexible Group Communication Subsystem // 
Technical Report TR 95 ? 1500. ? Cornell University, Ithaca, NY.? 1995.? P. 14. 
34. Amir Y., Dolev D., Kramer S., Malki D. Transis: A communication sub ? system for 
high availability // Proc. 22nd IEEE Fault ? Tolerant Computing Symposium 
(FTCS). ? July. ? 1992. ? P. 76?84. 
35. Shapiro M., Gautron P., Mosseri L. Persistence and Migration for C++ Objects // 
Proc. Third European Conference on Object ? Oriented Programming. ? 
1989. ? P. 191?204. 
Received 08.05.2003 
__________________________ 
From the Editorial: The article corresponds completely to submitted manuscript.