...

Supporting multiple tenants requires a top-down evaluation of the software components in a typical enterprise software stack such as the CollectionSpace Service Architecture. This document examines the CollectionSpace architecture with a view to supporting multiple tenants.


...

CollectionSpace is expected to use the three kinds of storage systems listed below. When performing I/O operations, the CollectionSpace runtime would use the tenant-specific data available in the runtime context to isolate data per tenant. How the runtime context is populated with tenant-specific information is addressed in the section on security. For now, assume that the tenant-specific information is available when storage operations are performed.

  1. Document repository (backed by SQL datastore)
  2. SQL datastore
  3. File system
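
The idea of a storage operation reading the tenant from the runtime context can be sketched as follows. This is a minimal illustration, not the CollectionSpace implementation: `TenantContext`, `set_tenant`, `require_tenant`, and `TenantScopedStore` are hypothetical names, and the in-memory dictionary merely stands in for any of the three storage systems above.

```python
from contextvars import ContextVar
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantContext:
    """Hypothetical holder for the tenant-specific data of one request."""
    tenant_id: str
    tenant_name: str


# Assumption: the security layer populates this before any storage call.
_current_tenant = ContextVar("current_tenant")


def set_tenant(ctx):
    """Bind a tenant to the current runtime context; returns a reset token."""
    return _current_tenant.set(ctx)


def require_tenant():
    """Return the current tenant, or fail loudly if none was set."""
    try:
        return _current_tenant.get()
    except LookupError:
        raise RuntimeError("no tenant in runtime context") from None


class TenantScopedStore:
    """In-memory stand-in for a storage system that isolates data per tenant."""

    def __init__(self):
        self._data = {}  # (tenant_id, key) -> value

    def put(self, key, value):
        # Every write is keyed by the tenant found in the runtime context.
        self._data[(require_tenant().tenant_id, key)] = value

    def get(self, key):
        # A read can only see entries written under the same tenant id.
        return self._data.get((require_tenant().tenant_id, key))
```

Because the store never accepts a tenant id as an argument, a caller cannot read another tenant's data by passing a different id; isolation depends only on the context being populated correctly.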

Document Repository

As described earlier, Nuxeo is used in CollectionSpace predominantly for entity object management. The Nuxeo Repository is the core component providing this functionality. A repository stores documents representing CollectionSpace entities in a tree-like structure that groups documents inside folders hierarchically, as shown in the figure below.

Figure: Nuxeo repository hierarchy (diagram: nuxeo_repository_hierarchy)

The top-level node in this structure is called a Nuxeo domain, which is distinct from a CollectionSpace domain. By default, each Nuxeo repository instance creates one Nuxeo domain. The diagram above shows workspaces for the Collection Object and Location entities of the CollectionSpace system.

...

In a Nuxeo repository, document types are generally shared among the organizations using the repository. In a multi-tenant environment, however, this is useful only for document types that are common to the tenants. In CollectionSpace, some document types could be specific to a tenant or a group of tenants because of the Schema extensions feature. See tenant-aware repository binding for more details.

In the following sections, we discuss several approaches that could be considered to accommodate the multi-tenant scenario in the Nuxeo Repository.

...

In this approach, each tenant is assigned its own Nuxeo repository. CollectionSpace would use a Nuxeo repository backed by a SQL datastore, so each tenant would have a separate J2EE datasource. This datasource should be created at the time of tenant provisioning.

Figure: Repository per tenant (diagram: repository_per_tenant)
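
A minimal sketch of the bookkeeping this approach implies: provisioning a tenant records one repository and one datasource for it, and the service layer looks them up later. The class names and the naming conventions (`repo-<tenant>`, `jdbc/cspace-<tenant>`) are assumptions made for illustration; a real deployment would register the datasource with the application server rather than only record its name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryBinding:
    """Hypothetical record tying a tenant to its own repository and datasource."""
    repository_name: str
    datasource_jndi_name: str


class RepositoryPerTenantRegistry:
    """Tracks the one-repository-per-tenant assignment made at provisioning."""

    def __init__(self):
        self._bindings = {}  # tenant_id -> RepositoryBinding

    def provision(self, tenant_id):
        # Each tenant gets exactly one repository; re-provisioning is an error.
        if tenant_id in self._bindings:
            raise ValueError(f"tenant {tenant_id!r} is already provisioned")
        binding = RepositoryBinding(
            repository_name=f"repo-{tenant_id}",            # assumed convention
            datasource_jndi_name=f"jdbc/cspace-{tenant_id}",  # assumed convention
        )
        self._bindings[tenant_id] = binding
        return binding

    def lookup(self, tenant_id):
        # Used at request time to find the tenant's repository.
        try:
            return self._bindings[tenant_id]
        except KeyError:
            raise LookupError(f"unknown tenant {tenant_id!r}") from None
```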

Pros:

  1. This approach offers very clean isolation between tenants. Repositories are not shared, so each tenant's documents and document types are kept in entirely separate storage areas.
  2. Backup and restore at the repository level is possible.
  3. Failures are isolated. If one repository is unavailable, only a single tenant loses service. Tenants using other repositories can continue to work.

...

Under this scenario, each tenant could be assigned a Nuxeo domain, as shown in the figure below. Multiple Nuxeo domains could reside in the same physical Nuxeo repository. The CollectionSpace service layer would keep an association between the domain id (or name) and the tenant id (or name). A Nuxeo domain could be created at the time of tenant provisioning.

A Nuxeo domain is just a core document type. In the Nuxeo APIs (repository session), the get, update and delete operations on a document do not require a fully-qualified id (domain-workspace-document id) because document ids are globally unique. Only the create operation accepts a parent id. We should explore whether PathRef could be used instead of IdRef for get, update and delete operations. That way we could address a document by a fully-qualified path.

Figure: Domain per tenant (diagram: domain per tenant)
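
The fully-qualified path idea can be sketched without the Nuxeo API. The example below only shows how a service layer might build a path from the tenant's domain and check that a path stays inside that domain. The `/<domain>/<workspace>/<document>` layout and all names are assumptions for illustration; whether PathRef fits the actual repository layout is the open question noted above.

```python
from posixpath import normpath


class DomainPerTenantPaths:
    """Keeps the tenant-to-domain association and builds document paths."""

    def __init__(self):
        self._domain_by_tenant = {}  # tenant_id -> Nuxeo domain name

    def provision(self, tenant_id, domain_name):
        # Recorded when the domain is created at tenant provisioning time.
        self._domain_by_tenant[tenant_id] = domain_name

    def document_path(self, tenant_id, workspace, doc_name):
        # Reject segments that could escape the tenant's domain.
        for part in (workspace, doc_name):
            if "/" in part or part in ("", ".", ".."):
                raise ValueError(f"invalid path segment {part!r}")
        domain = self._domain_by_tenant[tenant_id]
        return f"/{domain}/{workspace}/{doc_name}"

    def owns(self, tenant_id, path):
        # Normalize first so "../" cannot be used to reach another domain.
        domain = self._domain_by_tenant[tenant_id]
        return normpath(path).startswith(f"/{domain}/")
```

Addressing documents by path makes the tenant boundary visible in every reference, which a globally unique id does not.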

Pros:

  1. There would be clean isolation between document instances that belong to different tenants.
  2. All document types within the same repository are shared between tenants, so document types need not be replicated.

...

  1. A single repository instance could become a single point of failure. Availability decreases because the failure of one repository affects more tenants. This could be mitigated by scaling out.
  2. Nuxeo's hierarchy table, which is accessed to browse anything in the repository, becomes very hot.
  3. No isolation of document types. All the document types (or CollectionSpace schema templates) will be available to all the tenants sharing the repository, even if not relevant to them.
  4. Additional routing and lookup logic would be required in the service layer to identify, at runtime, the repository hosting the tenant found in the context.

...

A tenant-aware binding would include these repository-specific components. The elements of a tenant-aware binding are explained in detail in tenant-aware_bindings.

SQL datastore

In CollectionSpace v1.0, we plan to use MySQL 5.x as the SQL datastore for several kinds of data. This could include the out-of-the-box user registry, organization registry, audit trail logs, CollectionSpace ID space, and any information that requires reliable persistence and recoverability.

...

Figure: Basic table with tenant (diagram: basic_table_tenant)

File system

Various configuration-related artifacts are stored on the file system by CollectionSpace as well as by the infrastructure used by the CollectionSpace service layer. These include XML schema files, properties files, log files, and connection configuration (for databases, FTP servers, identity providers, third-party web services, etc.).

...

In addition to the above steps, a museum might choose to customize the schemas of various entities and relationships in CollectionSpace as described in Schema Extension. This would require additional steps, including creating new document types, optionally creating workflows, generating and deploying tenant-aware bindings, and testing. Deploying tenant-aware bindings would involve, among other things, creating the necessary domain and workspaces in the repository.

...

In this section, we discuss the impact of a multi-tenant environment on security architecture. The following topics are covered.

  1. Account provisioning
  2. Authentication process
  3. Authorization
  4. Audit trail
  5. Callout

Account provisioning

See Account Service Description and Assumptions (excerpt: account provisioning description).

...

To scale out the document repository as discussed in Approach #2, we can use a hybrid of Approach #1 and Approach #2. Here, we would divide the CollectionSpace museum domains (e.g., life sciences, art history, anthropology and archaeology, architecture) into separate repositories. Each Nuxeo domain (e.g. Domain-Tenant-1) in a repository still represents a tenant (e.g. Tenant-1). That is, one repository would host multiple tenants from the same CollectionSpace domain (e.g. anthropology museums). The CollectionSpace server could also be scaled out: there could be one or more CollectionSpace servers, each using one or more Nuxeo repositories.

Figure: Multi-tenant scale out (diagram: multi-tenant scale out)
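
The two-level placement in the hybrid approach can be sketched as below: the CollectionSpace domain selects the repository, and the tenant gets its own Nuxeo domain inside it. The `Placement` and `HybridPlacement` names and the repository names are hypothetical; the `Domain-<tenant>` naming follows the Domain-Tenant-1 example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where a tenant's documents live: a repository and a Nuxeo domain in it."""
    repository: str
    nuxeo_domain: str


class HybridPlacement:
    """One repository per CollectionSpace domain, one Nuxeo domain per tenant."""

    def __init__(self, repository_by_cspace_domain):
        # e.g. {"anthropology": "repo-anthropology"}; names are assumptions.
        self._repo_by_domain = dict(repository_by_cspace_domain)
        self._placements = {}  # tenant_id -> Placement

    def provision(self, tenant_id, cspace_domain):
        # The museum domain picks the repository (KeyError if none is configured).
        repository = self._repo_by_domain[cspace_domain]
        placement = Placement(repository, f"Domain-{tenant_id}")
        self._placements[tenant_id] = placement
        return placement

    def route(self, tenant_id):
        # Runtime lookup: which repository and Nuxeo domain serve this tenant?
        return self._placements[tenant_id]

    def tenants_in(self, repository):
        # Tenants affected if the given repository becomes unavailable.
        return sorted(
            t for t, p in self._placements.items() if p.repository == repository
        )
```

The `tenants_in` helper shows the trade-off: a repository failure affects only the tenants of one CollectionSpace domain rather than all tenants.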

Pros:

Cons:

Issues:

CollectionSpace ID space and service

...