Electronic Resource License Metadata Project: Update 7

May 2nd, 2010

Week 13: April 19 - 23

This week I heard back from the Form Builder developers, who announced that they will release an upgrade to their webform utility that includes email attachments.  It is not expected to be released until the middle or end of May.

The Notre Dame developers also released the Coral Organization module, which is an add-on to the Licensing module.  Stanford developers installed the module, and I tested it thoroughly and found a few technical problems.  Most of them had already been resolved by Notre Dame and required only simple updates to our installation.  Some of the problems involved activating certain functions, which weren’t clearly spelled out in the Coral technical documentation.  I also discovered that I wasn’t able to access our Coral installation from home, due to security settings added when our developers installed the software.  The solution involved installing VPN software, which allows a secure connection between the Stanford network and non-Stanford IP addresses.  Within a few minutes, I was able to continue my testing and start creating documentation from the comfort of my home.

Week 14: April 26 - 30

With the technical problems for the most part resolved, I began to create custom documentation that would guide Stanford staff with license uploads and data entry.  The Coral Licensing documentation will include one section on creating a new license record and another section on editing and adding metadata to license records.  The database structure and rules that I had previously created during my review of a sample set of licenses were instrumental in the creation of the license module documentation.  Without a clear sense of whether our customization requests will go through, I specified a way to use expression-dependent qualifiers from a single list of qualifiers with the existing software.  If our modification requests go through, it shouldn’t take too much effort to also modify the documentation.

A roadblock I discovered while creating documentation was that I was able to upload documents and attachments to Coral, but when trying to view the documents, I received an error message: “Not Found: The requested URL /licensing/documents/…pdf was not found on this server.”  I reported the issue to Stanford developers, who found a permissions issue, which again wasn’t described in the Coral technical documentation.  After adjusting the permissions of that upload folder, the developer was able to access the documents, but I am still not.  The same error occurs when I try to access attachments uploaded to the /licensing/attachments folder.  Although this isn’t preventing me from creating documentation, the inability to access uploaded license PDFs is a serious problem.  Hopefully the issue will be resolved soon.

Electronic Resource License Metadata Project: Update 6

April 20th, 2010

Week 11: April 5 - 9

This week I worked on linking the various webform options with the Jira ticketing system.  This can be accomplished by having the webform email its submissions to Jira, which automatically generates a new Jira ticket.  The main challenge remains forwarding the email attachments, which neither Drupal nor Form Builder does to our full liking.  Form Builder has absolutely no attachment capability at this time.  Drupal allows attachments, but instead of forwarding them via email, it stores them in a database and, upon submission, emails a URL to the attachment file.  I have been in communication with the Form Builder developers, who say that attachments will be implemented in the near future.  Drupal also allows attachments of new Microsoft document types (.docx and .xlsx), which is a requirement.  Unfortunately, the existing Drupal webform module will most likely not be modified.  At this point, our webform options remain unclear and may require sticking with what we are currently using: a custom-scripted webform.
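To make the difference concrete, here is a minimal Python sketch of the behavior we are after: a submission email that carries the uploaded file itself instead of a link to it.  This is not how Drupal, Form Builder or our custom-scripted webform is implemented.  The function name, the addresses and the sample file are all hypothetical, and the message is only built here, not sent.

```python
from email.message import EmailMessage


def build_submission_email(sender, jira_address, subject, body, filename, file_bytes):
    """Build an email that carries the uploaded file itself, not a link to it."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = jira_address
    msg["Subject"] = subject
    msg.set_content(body)
    # Generic binary type; a real form would choose the type from the file extension.
    msg.add_attachment(
        file_bytes,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return msg


# Hypothetical addresses and file contents, for illustration only.
msg = build_submission_email(
    "webform@example.edu",
    "jira@example.edu",
    "New license submission",
    "Vendor: Example Press",
    "license.docx",
    b"placeholder bytes",
)
attachment_names = [part.get_filename() for part in msg.iter_attachments()]
print(attachment_names)
```

With a message like this, whoever handles the ticket has the document in hand and does not need to authenticate to a second system to retrieve it.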

We finally received ABBYY FineReader 10.  I spent some time testing and creating instructional documentation.  The improved quality of this new version is evident.  Rendering is far more accurate and editing converted documents is simpler.

University of Notre Dame released an add-on module for Coral called “organizations,” which stores organization and contact information.  The module makes it easier to browse by organization, find licenses associated with each organization, assign relationships among organizations, and generally store more information about the organizations than what was possible before.

Week 12: April 12 - 16

This week I researched alternative Drupal webform modules.  As with many open source products, Drupal evolves with the input and labor of volunteers.  Several people have suggested adding attachments to the next webform module version, but for some reason no programmers have responded to the call.  Since Drupal is a content management system, it makes sense that attachments would be stored in a database of some sort instead of simply being forwarded via email.

The DLSS department has set a date for installing Coral on our server space (erm.stanford.edu) by April 23.  At this point, the standard version will be installed, while Stanford programmers consult with Notre Dame programmers on the feasibility of adding our suggested modifications to the next Coral release.  Either way, I will soon have a functional system for which to create instructional documentation.

Electronic Resource License Metadata Project: Update 5

April 8th, 2010

Week 09: March 22 – 26

This week I began testing various webform options for license submissions.  The existing webform was developed using custom HTML, CSS and web scripting.  Adding fields requires a base level of programming and HTML expertise.  Creating webforms in Drupal, on the other hand, is a relatively simple process.  I began by creating a sample webform, adding the specified fields in various styles, including drop down menus, checkboxes, radio buttons, grid/survey and attachment selection.  Drupal provides almost everything specified by the project manager, except that the attachment function does not send the actual document as an email attachment.  Instead, upon submission it emails only a URL to where the document is stored in the Drupal database.  The problem with this is that it creates an extra step for the Electronic Acquisitions staff, who must authenticate to Drupal to obtain the attached license.  The ease of webform creation may warrant that extra step.

Week 10: March 29 – April 2

This week I tested another webform option that Stanford offers called Form Builder.  Any Stanford staff member can create a custom webform and store the settings in his or her AFS space on the university server.  Form Builder is by far the simplest way to create a webform that I have come across.  I created another webform sample that included all the field options.  Custom drop down menus, checkboxes, radio buttons and grid/surveys can be created, in addition to a few pre-built field groups including Name and Address.  The one drawback is that it provides no attachment functionality.

This week I also met with the electronic acquisitions specialist who will take over supporting the ERM database when my project concludes.  I trained her to scan documents, convert them into searchable PDF files, and edit the rendered version.

The Coral ERM open source software was released this week.  We created a project charter request for the Digital Libraries Department to modify Coral and install it on our server.

Electronic Resource License Metadata Project: Update 4

March 21st, 2010

Week 07: March 8 – 12

This week I continued comparing the potential of Filemaker Pro 9.0 and Coral ERM software to meet our specific needs.  Although Coral was developed to meet the needs of the University of Notre Dame, we concluded that the web user interface, accessibility to an unlimited number of users, the minimal learning curve for data entry and searching, and aesthetics make it a better choice.  The main limitation of Coral is the inability to create unique qualifiers (descriptive choices) for any given expression type (descriptive category).  The software relies more on copying and pasting the actual license text to describe expression types, although there is a customizable set of global terms from a drop-down menu that can be used to describe any expression.  We would prefer the ability to create unique customized lists of qualifier terms for each expression, in addition to copying and pasting the full text of the license section being described and the section number where it is located in the license.  Unique lists of qualifier terms would allow for searching and aggregating based upon those terms.  Coral also has serious limitations for searching and aggregating based upon qualifier terms.  There is a possibility that the Stanford Libraries web programmers may be able to modify the software to meet our specific needs.

I also continued with license reviews.

Week 08: March 15 – 19

This week I revisited the license rules and database structure that I had previously created with Filemaker Pro in mind.  Based upon my license review, I developed helpful instructions for anyone performing license review and data entry.  The basic interpretive rules should apply to Coral or any other database that we adopt, and will be an important part of the documentation that I will later develop.

If modifying the Coral ERM software proves to be problematic, we thought of a way to cope with the global qualifier list.  There are only six expression types that require unique lists of qualifiers.  We could simply populate the global qualifier list and precede the qualifiers for those six expressions with the name of the expression, followed by the qualifier choice (e.g., “Archival Copy Type – At Request,” “Archival Copy Type – Only if Publisher Fails,” etc.).  Searching by qualifier choice will still be impossible without software modification.
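The naming convention can be sketched in a few lines of Python.  This only illustrates how prefixed entries would be built and read back; it is not Coral code and does not touch Coral’s database.  The function names are hypothetical, and the second expression type in the sample data is invented for illustration.

```python
# Space, en dash, space, matching entries such as "Archival Copy Type – At Request".
SEPARATOR = " \u2013 "


def build_global_list(qualifiers_by_expression):
    """Flatten per-expression qualifier lists into one prefixed global list."""
    return [
        f"{expression}{SEPARATOR}{qualifier}"
        for expression, qualifiers in qualifiers_by_expression.items()
        for qualifier in qualifiers
    ]


def choices_for(expression, global_list):
    """Recover the qualifier choices that belong to one expression type."""
    prefix = expression + SEPARATOR
    return [entry[len(prefix):] for entry in global_list if entry.startswith(prefix)]


sample = {
    "Archival Copy Type": ["At Request", "Only if Publisher Fails"],
    "Hypothetical Expression": ["Yes", "No"],  # invented for illustration
}
global_list = build_global_list(sample)
archival_choices = choices_for("Archival Copy Type", global_list)
print(global_list)
print(archival_choices)
```

Because the expression name is part of every entry, staff can tell at a glance which choices apply to which expression, even though the software itself still treats the list as one undifferentiated set.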

Toward the end of the week, the project manager and I met with two programmers from the Stanford Libraries.  We showed them the software and their preliminary opinion was that it shouldn’t take too much effort to modify the software to our specification, though a final estimate of the labor hours would require access to the actual source code.  The University of Notre Dame should release their software for use within a few weeks.  If we do choose to modify the software, we will not be able to benefit from any future modules or updates released.

I also continued with license reviews.

Electronic Resource License Metadata Project: Update 3

March 9th, 2010

Week 05: February 22 – 26

This week I downloaded and tested the trial version of ABBYY FineReader 10.  The software has made noticeable improvements upon its previous version in its ability to recognize tiny or blurred text.  This software also allows the conversion of an existing PDF to a machine readable and editable format.  The upgrade from our “lite” version 9 to the “professional” version 10 apparently costs 179.99, but I need to get someone from their company to confirm this.

I also met with the electronic acquisitions specialist who deals with connection problems and security breaches.  Security breaches fall into two categories.  One type is excessive use or downloading of particular electronic resources, which shows up in a breach log identifying the user, who is then contacted and made aware of the violation.  Another type is an EZproxy server breach, which flags unusual use of a Stanford network username (sunet ID) and password, usually from someplace overseas.  Access issues consist of proxy setup problems, sunet ID problems, Firefox caching, network issues and problems with opening PDFs.

Finally, I met with an analyst to discuss the E-Loader, which is a way to obtain MARC records from the vendor.  Periodically, vendors send a file with information on newly purchased books, newly unavailable books and edits to existing records.  Based upon the comparison of the vendor list and the library catalog, it is determined which MARC records need to be obtained.  The MARC records that the vendor supplies still need to be verified for accuracy prior to applying them.
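The comparison step can be pictured as a simple set difference.  The Python sketch below is only an illustration of that idea, not the E-Loader itself.  The function names and identifiers are hypothetical, and I am assuming each title can be matched on a single shared identifier.

```python
def records_to_obtain(vendor_ids, catalog_ids):
    """Return vendor titles that have no catalog record yet, in sorted order."""
    return sorted(set(vendor_ids) - set(catalog_ids))


def records_no_longer_listed(vendor_ids, catalog_ids):
    """Return cataloged titles the vendor file no longer lists, in sorted order."""
    return sorted(set(catalog_ids) - set(vendor_ids))


# Hypothetical identifiers, for illustration only.
vendor_file = ["id-003", "id-001", "id-004"]
catalog = ["id-001", "id-002"]

to_obtain = records_to_obtain(vendor_file, catalog)
to_review = records_no_longer_listed(vendor_file, catalog)
print(to_obtain)
print(to_review)
```

Either list is only a starting point, since the MARC records supplied by the vendor still have to be checked for accuracy before they are applied.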

I also continued with license reviews.

Week 06: March 1 – 5

This week I installed Filemaker Pro 9.0 and began learning its operation.  I accessed an e-book entitled, “Learn FileMaker Pro 9,” and began creating simple database tables.  We have been working with a member of the SULAIR Tech Support department, who developed one possible database structure to help us get started.  I eventually copied her database and began adding fields to various tables.  Connecting relational tables has proved to be a challenge.  My first project will be a single table consisting of all the fields that we came up with during the license sample review.

This week we gained access to a test version of Coral, the University of Notre Dame ERM software.  This system uses drop down menus to apply descriptive terms to various license attributes, but mainly uses notes fields where the actual passage can be copied and pasted.  From our point of view, a major limitation is that the system only allows a single list of qualifiers (descriptive choices) to be used to describe all expression types (attributes).  Our original idea was to have unique sets of qualifiers for each attribute.  We will investigate the labor involved in modifying the code to better suit our needs.

Electronic Resource License Metadata Project: Update 2

February 23rd, 2010

Week 03: February 8 - 12

This week I met with two library employees who administer SFX, the Ex Libris service that interconnects links to electronic resources such as journals and e-books. This service helps in catalog record creation, link resolving and provides seamless access to resources from various places within the OPAC and library website. Members of the Electronic Acquisitions Department verify the proper loading of new electronic journal or e-book purchases and at times request the manual creation of catalog records.

The project manager and I again met with the head of the SSRC and manager of Strategic Digital Projects to discuss our progress and get help interpreting certain phrases we have been finding during the license review. We discussed the changes we have been making to our metadata terms and categories. We are developing a set of rules for data input, interpretation and synonyms that is steadily growing as we continue reviewing our sample set of licenses.

The OCR program we chose is called ABBYY FineReader. OCR (optical character recognition) software interprets the text of scanned documents and creates PDF or Word documents that include machine-editable text. We hope that such documents can eventually be uploaded and attached, or copied and pasted, into database records for full-text searching. We have a library of hundreds of scanned license PDFs that need to be interpreted using OCR. Unfortunately, we discovered that the version of ABBYY FineReader that we acquired only interprets documents that are scanned immediately prior, with no ability to interpret existing PDFs. I will look into options for upgrading our software.

I also continued with my license review, adding rules and synonyms to our metadata list.

Week 04: February 15 - 19

This week I continued with license review.

The project manager and I met with a member of the SULAIR tech support team who has expertise in the FileMaker Pro relational database application. Several library departments use FileMaker Pro for various database needs and store data on a server devoted to this program. We had previously met with the tech support member to discuss our needs, and for this meeting she created a small database structure and demonstrated how data is entered and retrieved. There is still a question about how license amendment records will be stored (child and parent records) and how we will relate amendments to the original contracts. The project manager then requested an FMP license and a guide book so that I can get started on developing our database. On the side, the project manager is investigating our possible use of Coral, the ERM software that Notre Dame is using. Coral is programmed in PHP and its data is stored in a MySQL database. Although we will be using FMP for this project, I will continue to work with the project manager to investigate other methods.

Electronic Resource License Metadata Project: Update 1

February 6th, 2010

Week 01: January 26-29

This first week of the project consisted of several introductions, meeting pertinent people associated with electronic acquisitions and licenses. This week I met with the Ordering Operation Manager who gave me a step-by-step overview of the ordering process beginning with the order request from a selector to the bibliographic information being entered into the library catalog and link set up. I also met with a member of the tech support department, who helps to support Filemaker Pro. She considered our needs and gave us an overview of the requirements for our project.

I also met with the head of the Social Science Resource Center, who reviews and approves the license agreements prior to ordering. Also at this meeting was the manager of Strategic Digital Projects, who works on adding content into the Stanford Digital Repository. Both of these people have an interest in the ability to easily search for and analyze license information through the use of the license database that I will create during this project.

I also began the tasks of planning and developing the structure of the license database, based upon the analysis of several electronic resource licenses. I worked closely with the Electronic Resources and Technology Librarian to come up with a set of key attributes that she considered useful. The categories included general information, perpetual access, requirements if any Stanford staff member discovered a breach in the license terms, and public services terms. We chose to use a free online application called Dropbox.com to share files and collaborate during this project.

Week 02: February 1-5

This week I spent several hours reviewing more licenses and revising the database structure, based upon new information and scenarios I am finding while analyzing the sample licenses. Every time I discover a new way vendors describe certain requirements, I adjust the way that information will be captured during data entry. The goal is to come up with a way that is general enough to apply to all licenses, while specific enough to make queries meaningful. I am reviewing my work with the Electronic Acquisitions librarian on a weekly basis.

I met with two technical specialists for electronic resources, who gave me an overview of loading free e-resources, including journals, databases and single documents found for free on the web. Some free e-resources are included in SFX, an automated linking solution, requiring only a quick activation for creating a catalog record and link. Others require original cataloging.

I also began testing ABBYY FineReader, which is an OCR software package acquired for this project to convert paper licenses into PDFs with searchable and selectable text.

Drupal Internship Update 4

May 1st, 2009

This last part of my internship consisted of attending meetings, discussing feedback and more tech support.  I also spent some time with the department senior developer and learned more about the server side of our Drupal installation.  In the final days I will create an intro Drupal screencast.

Week 12 (April 6 – 10)

Attended a meeting with the Art library staff to discuss their progress and concerns. They were concerned about the server speed, which would change when the new server is installed and the websites migrated. They were also concerned about the lack of print-friendly screens and the attachment size limits that were currently in place. We increased their maximum upload file size and discussed the possibility of installing a module to help with printing appearances. This week I also learned how to add a new panel to homepages for featuring timely information such as events and new offerings. It involved creating new views and panel views and adding them to all group homepage templates. The users then have the ability to tag content with a meta tag associated with this region of the screen to make it appear.

Week 13 (April 13 – 17)

Attended the webteam meeting this week. We discussed the different projects that every member was working on. A large recent release was for the Tel Aviv exhibition, for which the webteam created a Drupal website. Timelines for the upcoming Drupal migration were also discussed. I also met with my internship supervisor this week. We discussed my progress and decided that the remainder of my internship should shift to fulfilling the odds and ends of my learning objectives, aside from documentation creation. This includes module creation, automatically changing URL paths when the migration to the new server occurs, the actual migration process and creating a new intro screencast. I met with a senior web developer, Jon Lavigne, who showed me the process of migrating a Drupal website from one server to another. In Drupal, this includes backing up and transferring a MySQL database and the entire Drupal file directory. Modules also have to be updated. Challenges include making sure the URLs all change when the custom script is run, resetting permissions accurately and adjusting the authentication module.
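To give a sense of what changing the URLs involves, here is a small Python sketch that rewrites links from an old host name to a new one in a piece of stored content.  This is not the custom script mentioned above, which I have not reproduced here.  The function name and both host names are hypothetical.

```python
import re


def rewrite_urls(text, old_host, new_host):
    """Point links at new_host instead of old_host, leaving other hosts untouched."""
    # The lookahead requires the host name to end here, so a longer host that
    # merely starts with old_host is not rewritten by mistake.
    pattern = re.compile(
        r"(https?://)" + re.escape(old_host) + r"(?=[/\"'\s?#:]|$)"
    )
    return pattern.sub(lambda match: match.group(1) + new_host, text)


# Hypothetical hosts and content, for illustration only.
before = (
    '<a href="http://old.example.edu/node/1">Guide</a> '
    "http://old.example.edu.other.org/"
)
after = rewrite_urls(before, "old.example.edu", "new.example.edu")
print(after)
```

A check like the lookahead matters because a plain find-and-replace could also alter addresses that only resemble the old one, which is exactly the sort of thing that has to be verified after the migration.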

Week 14 (April 20 – 24)

Attended a meeting with the Earth Science Library GIS staff to discuss their progress and concerns. They were concerned about the level of HTML knowledge that is actually needed to create the look and feel of their Drupal website. We displayed the different options available for people with different levels of HTML expertise. We also discussed the way in which Drupal can display book cover images and bibliographic information. Creating a download page was also discussed, which I will meet with Jon Lavigne to discuss later in the semester.

Week 15 (April 27 – May 1)

Attended a meeting with the Education Library staff to get feedback. They have a Drupal installation outside of our installation, but recently added a subject specialist site to our installation. They provided me with some valuable feedback for my documentation. They would like to see more information about views and taxonomy. I also attended another web team meeting. We discussed the ways in which files are named and stored in our “digital stacks.” An application called “Pair Trees” (similar to tiny url) can provide logical file naming and mask the actual location of downloads. The migration of our Drupal installation to the new server finally occurred. I will spend some time verifying URLs and content.

Drupal Internship Update 3

April 9th, 2009

I am performing tech support now with no need for help from other DLSS staff. The documentation is basically complete. I will soon meet with my internship supervisor to discuss the outcome. I was finally introduced to theme development, which was one of the main topics of the Drupal class I completed a few weeks back.

Week 9 (Mar 16 – 20)

I spent most of this week providing technical support to the libraries just getting started with their Drupal sites. The Earth Science Library is creating three separate sites – GIS, Maps and one for the physical library. Continued working on the instructional documentation.

Week 10 (Mar 23 – 27)

Continued formatting and testing the new documentation section. Learned about creating a custom theme, based on an HTML or PDF file. The Drupal web developer, Jessie Keck, is working on a project to create a Drupal site that matches the general appearance of an existing Stanford image gallery. This involved editing template files for block, node, page, box and comment.

Week 11 (Mar 30 – Apr 3)

Attended a meeting to discuss deadlines for the rollout of upgrades and new features. The science and engineering libraries are hoping to have the search capabilities improved as soon as possible. This will include testing different search terms and documenting the search results. Within 2 months, the entire SULAIR Drupal installation will be migrated to an improved server which should improve performance. The upgrade to Drupal version 6 will have to wait until the panes module is out of beta for that version.

Drupal Internship Update 2

March 13th, 2009

This second period of my internship was marked by more independent assignments and a crash course in Drupal theming and module development.  I have a new-found appreciation for anyone ever tasked with creating instructional documentation.  I will be glad when I can move on to some actual programming assignments, but in the meantime, documentation is my priority.  I hope the new layout I am designing will encourage more people to use it and create their sites more independently as a result.

Week 5 (Feb 16 – 20)
Attended a 3-day advanced Drupal training seminar that concentrated on creating custom themes, modules and general tips.  The themes section took us through the complete process of changing a theme.  It included adjusting and replacing variables in template files and overriding certain theme programming functions.  The modules section was a bit over my head, but I got a good idea of how to research programming standards that are in place for various core module files, which allows for customizing them to create new modules.  The tips section introduced some of the new capabilities of the next Drupal release version.  We also learned some testing methods and applied them to our current library Drupal website.

Week 6 (Feb 23 – 27)
Continued testing the current instructional documentation.  Learned about the uploading of files using the Drupal interface and loaded the updated screen captures for the instructional documentation. Attended a meeting that discussed the planned implementation of Google Analytics on the Drupal sites.

Week 7 (Mar 2 – 6)
Discussed my progress with the project manager in our monthly 1 on 1 meeting.  Was assigned to interview users currently creating their new websites about their use and opinions of the instructional documentation.  Was assigned to create accompanying screencasts of each instructional document.  Continued to update the testing sandbox environment by verifying that user access control and management settings matched with the production environment.  Learned about the view filters that allow aggregation of content using Boolean and multiple taxonomy terms.

Week 8 (Mar 9 – 13)
Attended a meeting about website monitoring, testing systems, Google Analytics and moving to the updated version of Drupal.  Started creating a new way to present the instructional documentation that groups together concepts and tasks that are logically related.