Staff may occasionally want to download EAD from ArchivesSpace, manipulate the data in Excel, and re-import the file into ArchivesSpace.
These instructions provide steps to do so.
- Open the desired resource record in ArchivesSpace
- Click on the “Export” drop-down button
- Select “Download EAD” (keep the EAD3 schema option checked)
- Open the downloaded EAD file in oXygen XML Editor. If you don't have a copy of the aspace-plus-excel-at-yale-2020-11-15.xpr oXygen project, see step 7 of "Excel to EAD and EAD to ArchivesSpace" to download a copy
- Run the 2 EAD3-to-Excel transformation
- In your Downloads folder, open the most recent file produced by oXygen in Excel. The file name should look something like: “[call number]_UTC_ead_bpg-excel.xml”
- Note: You can also open this file by dragging and dropping it into an open, blank Excel spreadsheet
- Manipulate the data, as needed, in Excel
- Once you are finished manipulating the data in the Excel spreadsheet, save the file. Make sure you save it as an XML Spreadsheet 2003 (*.xml)
- From there, follow steps 7-15 in the EAD and Excel instructions. Beinecke staff should check with Alicia Detelich before re-importing any files using this method. A few other changes may be needed to avoid reintroducing messy data into ArchivesSpace.
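Before moving on to the re-import steps, it can help to confirm that Excel really saved the file as an XML Spreadsheet 2003 rather than as .xlsx or another format. The following is a minimal sketch, not part of the documented workflow: the function name `is_xml_spreadsheet_2003` is hypothetical, and the check relies only on the fact that XML Spreadsheet 2003 files have a `Workbook` root element in the `urn:schemas-microsoft-com:office:spreadsheet` namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by the root element of an XML Spreadsheet 2003 file.
SPREADSHEET_NS = "urn:schemas-microsoft-com:office:spreadsheet"


def is_xml_spreadsheet_2003(path):
    """Return True if the file at `path` is well-formed XML whose root
    element is a SpreadsheetML 2003 <Workbook>. (Hypothetical helper.)"""
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        # Not well-formed XML at all, for example a .xlsx file (a zip archive).
        return False
    return root.tag == f"{{{SPREADSHEET_NS}}}Workbook"
```

Calling `is_xml_spreadsheet_2003` with the path to the saved file should return `True` for a correctly saved spreadsheet. A `False` result suggests re-saving from Excel with the XML Spreadsheet 2003 file type selected.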
Caveats:
- If you intend to re-import your data into ArchivesSpace, be aware that the system may create duplicate top containers. It's best to always test your import in ArchivesSpace TEST and confirm that you're able to reconcile the data and the top containers.
- Keep in mind that the spreadsheet has no column corresponding to ArchivesSpace's "general note" field. As of February 2019, the EAD to Excel transformation retains any "general notes" from ArchivesSpace, but moves them to the scope and contents column.
- The EAD to Excel transformation does *not* support every possible type of EAD encoding. EAD elements/attributes that will be dropped during the transformation process include: table, ref, extref, ptr, daogrp, bibliography, fileplan, index, note, @role (in origination elements), @script, @calendar, @certainty, etc. All of these elements and attributes are retained at the collection level, but they are not currently mapped anywhere within the container list section that is output to the Excel spreadsheet. The text from each element should still survive, but not the extra markup.
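To see ahead of time how much markup a given finding aid stands to lose, the downloaded EAD can be scanned for the elements and attributes named in the last caveat. The sketch below is an illustration only and is not part of the transformation itself. It assumes the container list is the content of the `dsc` element, counts only the names listed above (that list ends in "etc.", so it is not exhaustive), and treats "@role (in origination elements)" as any `role` attribute on or inside an `origination` element. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Names copied from the caveat above; the source list is not exhaustive.
DROPPED_ELEMENTS = {"table", "ref", "extref", "ptr", "daogrp",
                    "bibliography", "fileplan", "index", "note"}
DROPPED_ATTRIBUTES = {"script", "calendar", "certainty"}


def local_name(tag):
    """Strip a '{namespace}' prefix so the scan is namespace-agnostic."""
    return tag.rsplit("}", 1)[-1]


def _walk(elem, in_dsc, in_origination, counts):
    name = local_name(elem.tag)
    # Assumption: the container list is everything inside <dsc>.
    in_dsc = in_dsc or name == "dsc"
    in_origination = in_origination or name == "origination"
    if in_dsc:
        if name in DROPPED_ELEMENTS:
            counts[f"<{name}>"] += 1
        for attr in elem.attrib:
            attr_name = local_name(attr)
            # Assumption: @role only matters on or inside <origination>.
            if attr_name in DROPPED_ATTRIBUTES or (
                attr_name == "role" and in_origination
            ):
                counts[f"@{attr_name}"] += 1
    for child in elem:
        _walk(child, in_dsc, in_origination, counts)


def find_dropped_markup(root):
    """Count listed elements/attributes that occur inside the container list."""
    counts = Counter()
    _walk(root, False, False, counts)
    return counts
```

One way to use it would be `find_dropped_markup(ET.parse("my_export.xml").getroot())`, where `my_export.xml` is a placeholder for the downloaded EAD file. An empty result means none of the listed markup appears in the container list; any non-zero counts point to markup worth reviewing before relying on the round trip.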