Improving CollectionSpace Search

Search functionality in the new CollectionSpace UI is taking shape, with the implementation of keyword search nearly complete. The new UI includes significant improvements to search, making search results more usable and easier to customize.

 

Result columns based on record type

CollectionSpace is able to perform searches on a single record type, but currently the search results must be displayed in the same way for every record type, making them less understandable.

Here's an example of a search on cataloging (object) records in CollectionSpace 4.4:

Here's an example of a search on group records:

The result tables in both of these searches have identical column headers, and the same number of columns. This is a restriction of the UI, and it can lead to confusion. For example, the first two column headers are labeled "ID Number" and "Summary", but these labels do not necessarily correspond to the names of any fields in the records. For object records, the "Summary" column actually contains data from the Title field. For group records, the "ID Number" column contains data from the Title field, and the "Summary" column contains data from the Group Owner field. "ID Number" and "Summary" are by necessity generic descriptors that could conceivably apply to any field in any record – and in the case of group records, "ID Number" turns out not to be generic enough, because the content is likely not even a number.

The Record Type column is another example of the limitations of this one-size-fits-all approach. That column is useful for searches that are not restricted to a single record type. But when a search is limited to a single record type, every row in the result table will contain the same value in that column: the record type that was searched. The column then conveys no useful information, and would be better replaced with another field, or simply removed to make more room for the other columns. That's not currently possible. Since the column must appear on searches across all record types, it must be present on searches for single record types as well.

These problems are solved in the new UI by making search results separately configurable for each record type, so that the most meaningful information can be displayed for each search context.

Here's what a search on object records looks like in CollectionSpace 5.0:

This is a search on group records:

Note that the headers of the first two columns in both result sets are labeled with the actual field names of their contents, and that there is no record type column.

Searches on all record types produce a different table:

Since the first two columns contain data pulled from different fields on different record types, the headers actually do have to be generic. Here "ID Number" has been replaced with the even more generic "Record", since the contents may be neither an ID nor a number. The Type column is now present, since it is useful in this context.
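The idea of separately configured columns, with a generic fallback for searches across all record types, can be sketched as a simple lookup. The configuration shape, the field names, most of the labels, and the columnsFor helper are hypothetical illustrations, not the new UI's actual configuration format; only the "Record", "Type", and "Owner" headers come from the descriptions above.

```javascript
// Hypothetical column configuration: one entry per record type, plus a
// generic entry used when a search spans all record types.
const searchColumns = {
  object: [
    { field: 'objectNumber', label: 'Identification number' },
    { field: 'title', label: 'Title' },
    { field: 'updatedAt', label: 'Updated at' },
  ],
  group: [
    { field: 'title', label: 'Title' },
    { field: 'owner', label: 'Owner' },
    { field: 'updatedAt', label: 'Updated at' },
  ],
  all: [
    { field: 'docNumber', label: 'Record' },
    { field: 'docName', label: 'Summary' },
    { field: 'docType', label: 'Type' },
    { field: 'updatedAt', label: 'Updated at' },
  ],
};

// Return the columns for a record type, falling back to the generic set
// when the record type has not been configured.
function columnsFor(recordType) {
  const isConfigured = Object.prototype.hasOwnProperty.call(searchColumns, recordType);
  return isConfigured ? searchColumns[recordType] : searchColumns.all;
}
```

With a structure like this, the Type column only needs to exist in the generic set, so single record type searches never have to carry it.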

 

Formatted data

In the current UI, it's difficult to format the data returned in search results. This is evident in the contents of the "Updated At" column in the CollectionSpace 4.4 result sets. These values are not formatted in a user-friendly way. They're just timestamps pulled out of the database, formatted for machine-readability. In the CollectionSpace 5.0 result sets, the raw timestamps have been passed through a locale- and time zone-aware formatter. It's easy to configure a formatter to be applied to any search result column. This could either be a custom JavaScript function, or one of several built-in formatters. In addition to the timestamp formatter, other built-in formatters include: localization/translation of raw strings, as in the Type column in the All Records search; and extraction of display names from CollectionSpace ref names (URIs), as in the Owner column in the Groups search.
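As a rough sketch of what such formatters might look like, here are two plain JavaScript functions. The function names and signatures are hypothetical (the UI's actual formatter interface isn't shown here), and the ref name handling assumes the display name is the quoted text at the end of the ref name.

```javascript
// Extract the display name from a ref name. Assumes the display name is
// the single-quoted text at the end, e.g. ...:item:name(JaneDoe1)'Jane Doe'.
// Values that don't match are returned unchanged.
function formatRefName(refName) {
  const match = /'(.*)'$/.exec(refName || '');
  return match ? match[1] : refName;
}

// Format a machine-readable timestamp for a given locale and time zone.
// Values that can't be parsed as dates are returned unchanged.
function formatTimestamp(timestamp, locale = 'en-US', timeZone = 'UTC') {
  const date = new Date(timestamp);

  if (Number.isNaN(date.getTime())) {
    return timestamp;
  }

  const formatter = new Intl.DateTimeFormat(locale, {
    dateStyle: 'medium',
    timeStyle: 'short',
    timeZone,
  });

  return formatter.format(date);
}
```

Both functions take a raw value and return something friendlier, which is all a text formatter really needs to do.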

A formatter is not restricted to producing text. For example, the search result table for media handling records in CollectionSpace 5.0 looks like this:

In this case, the contents of the Thumbnail column are formatted from the blobCsid field for each media handling record. The output of the formatter is an image, retrieved from a URL that is computed from the value of blobCsid.
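A formatter along these lines might look like the following sketch. The function names are hypothetical, and the URL pattern is an assumption about how the services layer exposes blob thumbnails; substitute whatever URL your deployment actually uses.

```javascript
// Compute a thumbnail URL from a blobCsid. The path below is an assumed
// pattern for the services layer's blob resource, not a documented contract.
function thumbnailUrl(blobCsid, servicesUrl = '/cspace-services') {
  if (!blobCsid) {
    return null;
  }

  return `${servicesUrl}/blobs/${encodeURIComponent(blobCsid)}/derivatives/Thumbnail/content`;
}

// In a browser, produce an image element instead of text. Returns null
// when there is no blob, or when no DOM is available.
function formatThumbnail(blobCsid, doc = globalThis.document) {
  const url = thumbnailUrl(blobCsid);

  if (!url || !doc) {
    return null;
  }

  const img = doc.createElement('img');

  img.src = url;
  img.alt = 'Thumbnail';

  return img;
}
```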

 

Try it out

This functionality is now available on the demo of the new UI at http://nightly.collectionspace.org/drydock/login. Currently, only keyword searches on All Record Types, Object, Group, and Media Handling work properly. Other record types have not been fully configured, so searches may not produce results.

 

Configuring the CollectionSpace UI currently requires a somewhat daunting level of knowledge: about CollectionSpace's source code repository and its layout; about multi-tenancy and the file overlay model; and about deploying changes into the CollectionSpace server using development tools like Ant and Maven. For the new UI, the goal is to make this process more accessible. It should be possible to jump in with some basic web development skills, and no knowledge of the CollectionSpace source code, build process, or server architecture.

 

Installation

The new CollectionSpace UI can be displayed on any web page. Here's a simple example:

Publish the page to a web server. That's it.

Here's a breakdown: The UI is distributed as a JavaScript file, which is loaded into the page (on line 8). This JS file is a minified bundle – think of it as a binary distribution. There's no need to download, edit, or build source code. When the bundle is loaded, it makes the cspaceUI variable available in the global namespace. cspaceUI is a function, which when called (on line 10), renders the UI into the page – by default, inside the main tag (line 7).
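The script portion of such a page can be sketched as follows. The startUI wrapper and its guard are my own additions for illustration; the only thing taken from the description above is that the bundle defines a global cspaceUI function, and that calling it renders the UI.

```javascript
// Intended to run after the bundle's script tag has loaded and defined the
// global cspaceUI function. Returns false, doing nothing, if the bundle is
// not present (for example, if it failed to load).
function startUI(globalObject = globalThis) {
  if (typeof globalObject.cspaceUI !== 'function') {
    return false;
  }

  // With no options, the UI renders into the page's main element.
  globalObject.cspaceUI();

  return true;
}

startUI();
```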

This example loads the JS bundle from the UNPKG CDN. It will always be available there, but UNPKG isn't meant for production use. The bundle will also be shipped with future tarball releases of CollectionSpace, set up to be served from CollectionSpace's Tomcat server. You can serve it from there in production, or move it to another web server in your environment. Some HTML pages like the one above will also be shipped with CollectionSpace, as starting points for further customization, or for use as-is. They'll be set up to be served from CollectionSpace's Tomcat server as well, but they can also be moved to any other server.

There's also an unminified JavaScript bundle available, cspaceUI.js. This bundle contains verbose warning messages, and is suitable for development use.

 

Configuration

Configuration is done by passing options to cspaceUI(). The options are in a state of flux as development of the UI continues, but these are some examples that work today:

Pointing to services on a different host

There's no requirement for the UI to be served from the same server as the services layer. By default, the UI will attempt to connect to the REST API at the same host from which it originated, but it can also be configured with the URL of a remote host (line 12):
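Just the script portion of such a page is sketched here. The option name serverUrl is an assumption; since the options are still in flux, check the name against the build you're using.

```javascript
// Assumed option name: serverUrl. The host is the remote services layer.
const remoteHostOptions = {
  serverUrl: 'http://cspace-services.hoster.com',
};

// Only call cspaceUI where the bundle has actually been loaded.
if (typeof globalThis.cspaceUI === 'function') {
  globalThis.cspaceUI(remoteHostOptions);
}
```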

This assumes that the services layer at http://cspace-services.hoster.com has been configured to accept CORS requests from http://mymuseum.org.

Supplying translated/customized messages

A museum may want to replace some UI messages with different text, or translate all of the messages to another language. The message bundle is a configuration option:
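A sketch of what that might look like follows. The option name messages and the message keys are assumptions made for illustration; the real keys depend on the UI's message bundle.

```javascript
// Hypothetical message keys, mapped to French replacements.
const frenchMessages = {
  'loginForm.title': 'Connexion',
  'loginForm.prompt': 'Veuillez vous connecter pour continuer.',
};

// Assumed option name: messages.
const messageOptions = {
  messages: frenchMessages,
};

// Only call cspaceUI where the bundle has actually been loaded.
if (typeof globalThis.cspaceUI === 'function') {
  globalThis.cspaceUI(messageOptions);
}
```

Overriding one or two keys would customize individual messages; supplying every key amounts to a full translation.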

Changing the container element

The UI doesn't have to be rendered into the main element. A different container element may be specified by supplying a CSS selector:
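For instance, assuming the option is named container and the page contains an element with the (made up) id cspace:

```javascript
// Assumed option name: container. The value is a CSS selector that should
// match an element already present on the page, here <div id="cspace">.
const containerOptions = {
  container: '#cspace',
};

// Only call cspaceUI where the bundle has actually been loaded.
if (typeof globalThis.cspaceUI === 'function') {
  globalThis.cspaceUI(containerOptions);
}
```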

Supplying an outer header/footer/sidebar

The UI is rendered into a specific element on the page, leaving anything else that might be on the page undisturbed. This allows you to supply your own headers, footers, sidebars, or other content surrounding the CollectionSpace UI container:

Changing the logo or other styles

A museum may want to replace the CollectionSpace logo with its own, or change other styles. This is easily accomplished with some CSS:

In this example, a class (mymuseum) has been added to the main element into which the UI is rendered (line 12). A CSS rule also has been added, using both that class and the one provided by the UI (lines 6-8). Since that rule's selector is more specific than any that exist in the UI's default stylesheet, it overrides the default styling. There are many other ways to achieve this override – the mymuseum class isn't even necessary, since the selector main .cspace-ui-LoginPage--common would also be more specific than any in the default stylesheet – but this way is nicely readable.

More to come...

As development continues, additional configuration options will become available. This could include: adding and removing record types, replacing the form templates for records, changing the columns in search results, and more.

 

Use Cases

Some interesting use cases are enabled by this simplified configuration:

  • Split hosting. A CollectionSpace hosting provider might host only the back end services, while a client museum hosts the UI on its own server in its own domain. This allows a museum to hire out the hard parts – JEE server and RDBMS management – to experts, while maintaining local control of UI look and feel.
  • Multi-language support. A single CollectionSpace back end can easily be configured with many front-ends. For example, an HTML page at mymuseum.org/cspace/en could be configured to display the CollectionSpace UI in English, while another page at mymuseum.org/cspace/fr could be configured in French, with both pointing to the same services layer.
  • Role-based UI. Multiple front-ends could be configured differently for use by museum staff with different roles, for example, registrars vs. photographers. Differences might include the fields that are displayed in each record type. This would be similar to using "templates" in the current UI, but persistent, and able to reach across multiple record types.
  • Collection-based UI. Museums with disparate collections, e.g. natural history and art, could also configure multiple front-ends, each tailored to a different collection. Again, this might be addressed today with "templates," but the multiple front-end approach is persistent and more comprehensive.

Demo Time!

A live, editable example is on CodePen. You can even log in, and create object records on nightly.

Our First UI Milestone

In planning out the new UI, we established four milestones over the first year, at three-month intervals. The first milestone was set for the beginning of October, and I'm happy to announce that we finished it on schedule. The target for this milestone included two groups of functionality: log in/log out, and basic record editing for a single record type.

There was a significant startup cost paid during this period; about 1/3 of the time was spent updating the services layer, another 1/3 was spent on research and planning, and only the final 1/3 was spent doing actual UI development. For the next three-month period, I'm anticipating that there will still be some research and planning, but a lot more actual development. We should be able to build features out pretty quickly now.

Demo Time!

A build of the new UI is running at http://nightly.collectionspace.org/drydock. After logging in, you can create a new object record at http://nightly.collectionspace.org/drydock/record/object, or edit an existing one at http://nightly.collectionspace.org/drydock/record/object/[csid]. Note that all of the fields are displayed in free text inputs. Implementation of dropdowns, autocomplete inputs, dates, structured dates, and other field components will be done with the "advanced record editing" group of functions, scheduled for the second milestone. These features are mostly spec'ed out in JIRA.

Looking Ahead

In the next week or two I'll be working on developer documentation, so that other people can start to implement forms for the remaining record types. Then I'll work on implementing advanced record editing features, sidebar components, and keyword search, which will take us to the end of the year.

We're looking for contributors. If you'd like to help create the next version of the CollectionSpace user interface, please let us know!

 

 

As part of the UI rewrite, three new features have landed in the CollectionSpace services layer. These will be available in the next release of CollectionSpace (tentatively numbered 4.5). You can try them today by building and deploying the master branch to your own server, or by hitting our nightly build server at http://nightly.collectionspace.org:8180/cspace-services.

 

JSON Output

The REST API now returns data in JSON format when requested, and accepts data in JSON format when it is provided. This is a necessary step towards eliminating the application layer.

The services layer treats XML as its first-class serialization format; it contains code that explicitly generates XML and expects to receive XML. For this reason, the JSON provided as output and expected as input is not a direct serialization of CollectionSpace resources. Instead, JSON is produced by converting XML output as it exits the services layer, and JSON is converted back into XML as it enters the services layer. This results in a couple of oddities, like using @ to prefix fields that are translated from XML attributes, and using @xmlns: to prefix fields that correspond to XML namespace declarations.
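To make those conventions concrete, here's a small sketch of the mapping. It is not the services layer's conversion code; the element representation and function name are invented for illustration, and the handling of repeated elements (collecting them into an array) is an assumption.

```javascript
// Convert a minimal element description into a JSON-ready value.
// An element is { name, attributes, children, text }. Text on elements that
// also have attributes or children is not handled in this sketch.
function elementToJson(element) {
  const { attributes = {}, children = [], text = '' } = element;
  const attributeNames = Object.keys(attributes);

  // A leaf element with no attributes becomes a plain string.
  if (attributeNames.length === 0 && children.length === 0) {
    return text;
  }

  const result = {};

  // XML attributes get an @ prefix. A namespace declaration is an attribute
  // named xmlns:prefix, so it comes out as @xmlns:prefix.
  for (const attributeName of attributeNames) {
    result[`@${attributeName}`] = attributes[attributeName];
  }

  for (const child of children) {
    const value = elementToJson(child);

    if (Object.prototype.hasOwnProperty.call(result, child.name)) {
      // Assumption: repeated sibling elements are collected into an array.
      result[child.name] = [].concat(result[child.name], value);
    } else {
      result[child.name] = value;
    }
  }

  return result;
}
```

An element with an xmlns:ns2 declaration and an objectNumber child would come out as an object with the keys @xmlns:ns2 and objectNumber.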

To request data in JSON format, send the HTTP Accept header, with value application/json:

To send data in JSON format, send the HTTP Content-Type header, with value application/json:
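Here's how the two headers might be set from JavaScript using fetch. This is a sketch: the helper names are invented, the resource paths and credentials in the usage notes are placeholders, and basic auth is used only to keep the example short.

```javascript
const servicesUrl = 'http://nightly.collectionspace.org:8180/cspace-services';

// Build a basic auth header value. btoa handles only Latin-1 characters,
// which is enough for this sketch.
function basicAuth(username, password) {
  return `Basic ${btoa(`${username}:${password}`)}`;
}

// Arguments for fetch() that request a resource as JSON.
function jsonReadRequest(path, authorization) {
  return [`${servicesUrl}/${path}`, {
    method: 'GET',
    headers: {
      Accept: 'application/json',
      Authorization: authorization,
    },
  }];
}

// Arguments for fetch() that send a JSON payload to a resource.
function jsonCreateRequest(path, payload, authorization) {
  return [`${servicesUrl}/${path}`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: authorization,
    },
    body: JSON.stringify(payload),
  }];
}
```

To actually make a request, spread the result into fetch, e.g. fetch(...jsonReadRequest('collectionobjects', basicAuth(username, password))), then call json() on the response.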


OAuth 2 Tokens

The REST API now grants and accepts authorization tokens. This is another step towards eliminating the application layer. 

Tokens are granted following the OAuth 2 specification. Currently only the resource owner password credentials grant and refresh token grant are supported. Only one client identifier is currently allowed: cspace-ui. In the terminology of the specification, the cspace-ui client represents a public client and a user-agent-based application. Since this client is public (and therefore incapable of keeping a secret), the client secret (sent as the basic auth password in the following examples) is empty.

Tokens may be obtained by sending a POST request to the /oauth/token endpoint:
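A sketch of that request from JavaScript follows. The helper name is invented, and the form parameters follow the OAuth 2 specification's resource owner password credentials grant; the client identifier and empty secret are as described above.

```javascript
// Arguments for fetch() that request tokens with a username and password.
function passwordGrantRequest(servicesUrl, username, password) {
  // The client id is cspace-ui, and the client secret is empty.
  const clientCredentials = btoa('cspace-ui:');

  const form = new URLSearchParams({
    grant_type: 'password',
    username,
    password,
  });

  return [`${servicesUrl}/oauth/token`, {
    method: 'POST',
    headers: {
      Authorization: `Basic ${clientCredentials}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body: form.toString(),
  }];
}
```

Spread the result into fetch to send it. Per the OAuth 2 specification, the JSON response carries the tokens in its access_token and refresh_token fields.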

This returns an access token and a refresh token.

The access token may be used to authorize requests against CollectionSpace resources. This is done by sending the HTTP Authorization header:
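In JavaScript, that might look like the following sketch. The helper name is invented; the Bearer scheme is the standard way to present an OAuth 2 access token.

```javascript
// Build request headers that present an access token as a bearer token,
// and ask for the response in JSON format.
function bearerHeaders(accessToken) {
  return {
    Authorization: `Bearer ${accessToken}`,
    Accept: 'application/json',
  };
}
```

The result can be passed as the headers option to fetch when requesting any CollectionSpace resource.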

The refresh token may be used to obtain a new access token – typically, after the current access token has expired. This is done by sending a POST to the /oauth/token endpoint:
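The refresh request has the same shape as the original token request, with a different grant type. Again, this is a sketch with an invented helper name, using the form parameters defined by the OAuth 2 specification's refresh token grant.

```javascript
// Arguments for fetch() that exchange a refresh token for a new access token.
function refreshGrantRequest(servicesUrl, refreshToken) {
  // The client id is cspace-ui, and the client secret is empty.
  const clientCredentials = btoa('cspace-ui:');

  const form = new URLSearchParams({
    grant_type: 'refresh_token',
    refresh_token: refreshToken,
  });

  return [`${servicesUrl}/oauth/token`, {
    method: 'POST',
    headers: {
      Authorization: `Basic ${clientCredentials}`,
      'Content-Type': 'application/x-www-form-urlencoded',
    },
    body: form.toString(),
  }];
}
```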

Notes:

  1. Access tokens are set to expire in one hour, and refresh tokens in 12 hours. Once the refresh token has expired, the user must re-authenticate with their username and password.
  2. The tokens are JWT, but this may change in the future. For now they should be considered opaque strings.
  3. The above examples are contrived to demonstrate how the API is used. For one-off curl requests, it's easier to use basic auth, which is still supported. Token auth is preferred when you're using a client that handles the token management for you.
  4. Tokens are signed with a random key generated when the services layer starts. Restarting the CollectionSpace server changes the signing key, invalidating all outstanding tokens. This means that all users must re-authenticate with their username and password. Restarting CollectionSpace is in fact the only way to revoke an outstanding token. There is no way to revoke an individual token; they will all be invalidated.

Future enhancements will include configurability – for example, of allowed client identifiers and associated secrets, token expiration times, and token signing keys – and support for additional OAuth 2 grant types. Additional design work around revoking individual tokens is also required.

 

CORS

The REST API may now be configured to accept cross-origin requests. This allows flexibility in deploying JavaScript applications that utilize CollectionSpace services. An application served from a museum's own domain (e.g. http://hearstmuseum.berkeley.edu) may now access CollectionSpace at a hosting provider's domain (e.g. http://cspace.berkeley.edu). For example, a hosting provider might choose to host only the CollectionSpace services layer, while a museum hosts the UI on their own servers.

Cross-origin requests are not allowed by default. Domains may be whitelisted for CORS on a single CollectionSpace server by adding a security.properties file to Tomcat's lib directory. The cors.allowed.origins property accepts a comma-separated list of domains:

security.properties
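To illustrate the whitelist idea, the sketch below parses a sample property line and checks an origin against it. The origins are placeholders, the helper names are invented, and the matching shown (exact comparison, with whitespace around commas ignored) is an assumption for illustration, not the services layer's actual logic.

```javascript
// Sample contents of security.properties, with placeholder origins.
const securityProperties = 'cors.allowed.origins=http://hearstmuseum.berkeley.edu,http://mymuseum.org';

// Read the list of allowed origins from the text of a properties file.
function allowedOrigins(propertiesText) {
  const prefix = 'cors.allowed.origins=';

  const line = propertiesText
    .split(/\r?\n/)
    .map((candidate) => candidate.trim())
    .find((candidate) => candidate.startsWith(prefix));

  if (!line) {
    return [];
  }

  return line
    .slice(prefix.length)
    .split(',')
    .map((origin) => origin.trim())
    .filter(Boolean);
}

// An origin is accepted only if it exactly matches a whitelisted entry.
function isOriginAllowed(origin, propertiesText) {
  return allowedOrigins(propertiesText).includes(origin);
}
```

Note that an origin includes the scheme (and the port, if it isn't the default), so http://mymuseum.org and https://mymuseum.org are different origins.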

Alternatively, the default value of cors.allowed.origins may be modified in the source tree by editing applicationContext-security.xml.


Losing the Application Layer

Some significant changes that will happen to CollectionSpace with the rewrite of the UI aren't actually in the UI at all.

You probably know that CollectionSpace is made up of three layers: the services layer, the application layer, and the UI. While working on the prototype of the new UI last year, I considered making a pretty drastic simplification: Could we have a UI that connects directly to the services layer, allowing us to remove the application layer? I concluded that we probably could, without too much trouble. In the intervening year, it's become even more enticing.

I'm taking this opportunity to greatly reduce the application layer, and hopefully to eliminate it completely. To understand what this means, let's examine what the application layer does. I think there are five things:

  1. Translate services layer XML payloads to JSON for use by the UI
  2. Provide cookie/session-based authentication to the UI
  3. Serve static UI assets (JavaScript, CSS, HTML, icon images, etc.)
  4. Configure the UI layer
  5. Configure the services layer

The Proposal

I believe that most of the things the application layer does could be better done elsewhere.

  1. The services layer could optionally produce and accept JSON payloads, in addition to XML.
  2. The services layer could provide token-based (bearer) authentication, in addition to the username/password (basic) authentication currently supported.
  3. Static UI assets could be served by any HTTP server. We could configure the Tomcat server distributed with CollectionSpace to do this, and deployers could optionally use their own web server.
  4. Configuration of the UI could be done in the UI layer. The configuration currently done in the application layer is already tightly coupled to HTML templates, JavaScript, JSON configuration, and message keys in the UI layer, so most of the time you have to modify both layers anyway. It's difficult for implementers to figure out that what appears to be a UI-only change (for example, adding a field to the advanced search screen) actually also requires a change to the application layer.
  5. Configuration of the services layer could be done in the services layer. XML configuration in the application layer is currently used to generate three kinds of artifacts for the services layer: XML schema files (XSD), tenant configuration (aka tenant bindings) files (XML), and Nuxeo doctype bundles (jar files). One option is to retain some kind of XML configuration similar to what currently exists in the application layer, but simplified and with UI-specific configuration removed. Those files could be moved into the services layer. Another possibility is to use XSD files as the primary source of configuration, and generate tenant configuration and Nuxeo doctype bundles from those. That would be nice in that we wouldn't be inventing our own configuration file format, but it's possible that we can't put all the information needed to configure the services layer into valid XSD files. I'll be investigating these options further.

Why?

There are a few important advantages to eliminating the application layer.

Improved Reliability

Currently the application layer has an idea of the state of the system, via its configuration files. The services layer also has an idea of the state of the system, via its databases and Nuxeo. These do not always agree. For example, you might add some fields to the application layer configuration, but forget to run ant deploy_services_artifacts, so those fields are not known to the services layer. This can result in inscrutable errors when saving records. Once the UI is able to talk directly to the services layer, the services layer becomes the single source of truth for the state of the system, eliminating a class of problems that are difficult to debug.

Better Performance

The application layer is itself a web application (in the Java EE sense), which connects to the services layer application over HTTP. This results in greater memory usage in the JVM vs. a single application, and incurs additional HTTP overhead on each request. The XML to JSON conversion is also very memory and processor intensive, and could be improved. Removing the application layer and rewriting the conversion code should reduce the memory required to run a CollectionSpace server, and make some requests faster.

Simpler Configuration

As it's grown over time, the XML configuration in the application layer has accrued inconsistencies, misnomers, and workarounds that cause confusion for customizers. Fixing these, along with co-locating all UI-related configuration in the UI layer, would improve the developer experience.

 

What do you think? Let us know at talk@lists.collectionspace.org. In future posts I'll provide updates on the status of each change I've described here.

CollectionSpacers: I sent out a note in June about our engagement of UX designer Tim Stutt to work with us on a review of the CollectionSpace user experience. As those of you who have been working with us for a long time know, we spent considerable time at the very beginning of the project working on user experience design. Heading into a new grant and a major re-write of the CSpace UI Code, it was the perfect time for a review.

 

Tim sat down with a number of daily users and observed how CSpace is used to complete day-to-day work. As a result of these observations and interviews, he's identified a number of areas where we can make improvements - some big and some small. Tim's final report can be found via this link or on the User Interface Design page.

 

The Functional Working Group has been tasked with ensuring that the recommendations in the report are: evaluated, added to JIRA where appropriate, and integrated into the plan of work for Project Drydock, along with all the other bugs and improvement requests that have been made over the years. As previously noted, we are not planning a full re-design of the UX, but where we're able to make changes that will improve our lives and workflows, we will strive to do so.

 

As always, please email talk@lists.collectionspace.org with any questions, thoughts, or notes about the review. 
UI Rewrite: A One-Year Mission

As part of the recently awarded Mellon Foundation grant I'm excited to begin a year-long effort to rewrite the CollectionSpace user interface, with technical support from Richard and the Technical Working Group, and functional guidance from Megan and the Functional Working Group. In short, we're aiming to make it easier for developers to customize and extend CollectionSpace, while providing a more responsive and productive experience for end users.

Previously...

This work will build on the prototype completed early last year, which is a proof-of-concept using the React JavaScript library. The original proposal from February 2015 outlines the motivation for this work, and describes some of the technology that was used. The prototype itself is live, with the ability to create, search, and update object records. (Enter * in the search box to list all records). The result of last year's work was presented to the Technical Working Group in April of 2015.

Stay Tuned!

You can follow the rewrite issue-by-issue in JIRA, in the project code-named Drydock.

I'll be writing regular blog posts to keep you updated on our progress, and so you'll know what to expect when we're done. In the next few posts I'll cover the lessons learned from the prototype phase, which will influence the work to come.

Your opinions will also influence the work to come, so please keep an eye on these posts, and raise your questions and concerns with me, Richard, Megan, and the Talk list.


CollectionSpace Community:

June 30th marked the end of our most recent grant cycle, which makes it a great time to look back on what we’ve accomplished over the past two years. Our goals for this grant cycle were ambitious, and we’re delighted with the progress we’ve made. We are also very pleased to announce that The Andrew W. Mellon Foundation has awarded us a new grant, which will enable us to continue working on our sustainability efforts, award more mini-grants, and complete some long-overdue infrastructure upgrades to the core application.

Goal: Transition support of CollectionSpace from the original project partners at the Museum of the Moving Image and the University of California, Berkeley to a new organization, LYRASIS. The transition from MMI and UCB to LYRASIS was wrapped up fairly early in the project, a process that was made much easier once we convinced Richard Millet to become the program’s Technical Lead. All websites, demos, communications, and other tools are fully integrated at LYRASIS, leaving our partners at MMI and UCB free to focus on governance rather than wiki up-time.

Goal: Establish a governance structure. Speaking of governance, a huge thanks to Laurie Arp of LYRASIS for managing our first-ever vote for new members for the Leadership and Functional Working Groups and the continuance of members in our Technical Working Group. The linchpins of our governance structure, the working groups set the overall direction for the project, provide technical guidance, and recommend and create requirements for new features and functionality. The full list of current working group members can be found on the program wiki.

Goal: Focus on community engagement and promotion. It has been a very exciting two years for community engagement and promotion. We kicked off a monthly webinar series that has had hundreds of attendees. We’ve presented at or attended conferences across the country, presenting on diverse topics from linked open data to sustaining community source software. We’ve worked with new communities of practice to develop profiles to support bonsai gardens, design materials collections, public art, and local history. As always, the voices of our implementers and users have been invaluable in getting the word out about CollectionSpace at conferences, on listservs, and via in-person workshops and demos.

Goal: Continue stewardship and development of the application. Four public releases, five new community-developed profiles, 547 issues closed, the Nuxeo upgrade - the list goes on and on. We had a huge number of community contributions, from new procedures to webapps and reports. With version 4.4 due out this week, we’re excited to move on to the next phase of development.

Goal: Plan for membership, sponsorship, and service provider support models. We launched our membership program in the fall of 2015, and are very encouraged by our inaugural group of twelve organizational members and one sustaining member representing the five University of California, Berkeley implementers. A strong membership is key to both our sustainability and governance plans, and we look forward to integrating all our new implementers and members into the wider community. We’ve also brought on several new service providers, all of whom have already made valuable contributions to the code and community. A strong service provider network is essential for extending the reach of the program team, and enabling adoption for organizations without sufficient IT resources to install, configure, and migrate data in-house.

All these accomplishments have been made possible through the support of our dedicated community of implementers, many of whom provide governance and financial support, code to improve the application, support for new users, and a dozen other contributions to this community-source endeavor. The program team is grateful for all your contributions.


Here’s to another great year,

Megan