GETTING STARTED WITH DIGITAL PRESERVATION
Assuming you already have a personal computer, you can get started with digital preservation by purchasing a digitizing device, M-DISCs, and an M-DISC READY writer, and by obtaining the software you need to create archival file formats such as PDF/A and JPEG 2000.
The digitizing device can be a scanner or a digital camera. Perhaps you will want both. Scanners are easier to use, but not as versatile as digital cameras.
When using a scanner, always scan at the highest resolution for which you can afford the archival storage capacity. At the very least, you should scan at 300 dpi (dots per inch) if you never intend to print larger than the resulting digital record size. 1200 or greater dpi is recommended if you think you will ever want to print a larger version of the record.
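The arithmetic behind this advice can be sketched in a few lines of Python. The function names and the 4 x 6 inch photograph are illustrative assumptions, not part of any scanner's software, and the size estimate assumes 24-bit color with no compression.

```python
def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a scan of a width_in x height_in inch original."""
    return round(width_in * dpi), round(height_in * dpi)


def max_print_inches(pixels_wide, pixels_high, print_dpi=300):
    """Largest print size (in inches) that still delivers print_dpi."""
    return pixels_wide / print_dpi, pixels_high / print_dpi


def uncompressed_megabytes(pixels_wide, pixels_high, bytes_per_pixel=3):
    """Approximate uncompressed size in MB (1 MB = 1,000,000 bytes)."""
    return pixels_wide * pixels_high * bytes_per_pixel / 1_000_000


# Hypothetical example: a 4 x 6 inch photograph scanned at 1200 dpi.
w, h = scan_pixels(4, 6, 1200)        # 4800 x 7200 pixels
print(max_print_inches(w, h))         # (16.0, 24.0)
print(uncompressed_megabytes(w, h))   # 103.68
```

In this example, a 1200 dpi scan can be printed at four times the original dimensions while still delivering 300 dpi, at the cost of roughly 100 MB per uncompressed image.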
The scanning device you purchase will have software that allows you to set the desired dpi.
When using a digital camera to digitize a physical record, make sure you have natural, flat, uniform lighting so you can avoid shadows and reflections. Using a tripod is recommended, especially to keep the lens parallel to the record being photographed (camera lenses magnify skew if they are not parallel).
Most digital cameras allow you to choose a resolution (image size) setting, so always choose the highest setting available. Then, when you load your digital pictures onto your computer, they will have maximum resolution when functioning as source files for the archival versions you create. Although these camera pictures will require significant capacity on your computer’s hard disc when you load them, you can delete them once you have created archival files and written them to M-DISCs.
If you have analog audio recordings that you want to preserve digitally, you can purchase an audio digitizer. These USB devices can digitize virtually any type of analog audio signal so the recording can be archived in the WAV format.
To digitize your analog family movies, a professional service is recommended to minimize cost and maximize quality. The same applies to 35mm slides, although slide scanning attachments may be available for the scanner you purchase (but they may be pricey). For more information on scanning slides, see reference .
If using a service, be sure to specify which archival file format you want for the output: either AVI (.avi) or QuickTime (.mov) for digital video, and lossless JPEG 2000 for digitized slides. If JPEG 2000 is unavailable, you might ask for lossless TIFF and then convert the files to JPEG 2000 yourself.
ADDING DESCRIPTIVE INFORMATION TO RECORDS
Once you have the necessary software, digitizing equipment, and M-DISCs with writer in place, you are ready to get started with digital preservation. However, before you attempt to preserve any records, it is important that you develop a plan to add descriptive information (called metadata in the digital preservation industry) to the digital records you are planning to preserve.
At a minimum, descriptive information should include both contextual and historical information.
Contextual information describes what the record is (for example, a copy of someone’s death certificate or a photograph of a named person). Contextual information also relates the record to its environment, throwing more light on the person(s) to whom the record applies. The more complete and descriptive the contextual information you add to a digital record, the more valuable, interesting, and cherished the record will become to you, your posterity, and your extended family.
Historical information provides the source of the record (for example, the county, city, town, or church archive from which a copy of a birth certificate was obtained). It should also identify the creator of the record, if such information can be determined. This is important for copyright reasons, which are discussed below.
Your plan to add descriptive information to digital records should begin with file names. A file name can contain both contextual and historical information. For example, when the author scanned a photograph of a distant relative, the scanning software gave the output file a generated name of 110237489853.tif. One would never know from this file name what the record actually is (other than a TIFF image). But by changing the file name to Photo of Esther Elizabeth Knight on her wedding day 8 May 1917.tif, anyone looking at the file name will know immediately what the record is. When searching the contents of an M-DISC, having this much information in every file name listed will certainly help you zero in on the object of your search very quickly!
A caution is in order here. Older versions of personal computer operating systems have a limit of 256 characters to identify the location of a file on the computer’s hard disc (called the file path). These 256 characters include the file name as well as the names of all folders that must be opened to navigate to the file. Folder names may also be descriptive; therefore, the more nesting of folders you use, the fewer characters will be left for the file name.
However, newer operating system versions allow up to 256 characters for the file name regardless of folder names.
In general, it is best to rename files with descriptive information when you first create or load them—otherwise, you may never get around to doing it.
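The renaming step and the path-length caution can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the 256-character limit is taken from the caution above as an assumption, so check the actual limit on your own operating system.

```python
from pathlib import Path

MAX_PATH_CHARS = 256  # assumed limit for illustration; verify for your system


def descriptive_path(original, description):
    """Return the new path: same folder and extension, descriptive name."""
    original = Path(original)
    return original.with_name(description + original.suffix)


def fits_path_limit(path, limit=MAX_PATH_CHARS):
    """True if the full path (folders plus file name) is within the limit."""
    return len(str(path)) <= limit


def rename_descriptively(original, description):
    """Rename the file on disk, refusing to overwrite or exceed the limit."""
    target = descriptive_path(original, description)
    if not fits_path_limit(target):
        raise ValueError(f"path too long ({len(str(target))} characters)")
    if target.exists():
        raise FileExistsError(target)
    return Path(original).rename(target)


# Preview the new name without touching any file.
new_path = descriptive_path(
    "110237489853.tif",
    "Photo of Esther Elizabeth Knight on her wedding day 8 May 1917",
)
print(new_path.name)
print(fits_path_limit(new_path))  # True
```

Checking the length before renaming avoids discovering the problem later, when a deeply nested folder makes the file impossible to open or copy.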
In order to create a full set of descriptive information, you should also add reference information (or tags—another type of metadata) to files when you create them. Reference information (metadata) allows search software to assist you in locating and accessing records.
When the author scanned file 110237489853.tif as explained above, he also added the following tags by clicking on the appropriate software option buttons—
- Title: Esther Elizabeth Knight on her wedding day 8 May 1917
- Subject: Esther Elizabeth Knight wedding photo
- Author: in the public domain
- Keywords: Esther, Elizabeth, Knight, wedding, 1917, bride, photo, public domain
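To show how such tags support searching, here is a minimal Python sketch. It does not reproduce what the author's scanning software does; as a stand-in, it stores the same four tags in a plain dictionary that could be saved as a separate JSON text file next to the image. The function names and the JSON layout are assumptions made for illustration.

```python
import json


def make_tags(title, subject, author, keywords):
    """Bundle the four tags into a plain dictionary."""
    return {"Title": title, "Subject": subject,
            "Author": author, "Keywords": list(keywords)}


def to_sidecar_json(tags):
    """Serialize the tags as JSON text, ready to save next to the image."""
    return json.dumps(tags, indent=2, ensure_ascii=False)


def matches(tags, word):
    """Case-insensitive keyword lookup, as a search tool might do."""
    return word.lower() in (k.lower() for k in tags["Keywords"])


tags = make_tags(
    title="Esther Elizabeth Knight on her wedding day 8 May 1917",
    subject="Esther Elizabeth Knight wedding photo",
    author="in the public domain",
    keywords=["Esther", "Elizabeth", "Knight", "wedding", "1917",
              "bride", "photo", "public domain"],
)
print(to_sidecar_json(tags))
print(matches(tags, "BRIDE"))  # True
```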
If tags are to be used effectively, both file creation software and search software must support such tags. It has already been pointed out that soft Xpansion Perfect PDF Master (which is free for personal use) does not allow the addition of tags when creating PDF/A files—you must purchase soft Xpansion’s business version of this product to get this capability.
Any time you deal with records, make sure you adhere to copyright law with regard to copying, printing, and distribution. This applies whether you are working with digital records or physical records.
To avoid violations, track down the source or owner of each record (if possible), then apply the relevant copyright law. A wonderfully clear and concise summary of copyright law as it pertains to genealogy has been written by Michael Patrick Goad (reference 9). Please take time to study his short, well-written article. Some key points from it are reproduced here:
- If an original work of authorship was created after 1977, it’s copyrighted and it’s going to be for a very long time. The earliest that any work created after that will lose its copyright will be about 2049 – that’s assuming that the author died right after he authored the work.
- If it was created before 1923, there is no copyright on it anymore, so long as it was published. If it wasn’t published, it may still be protected by copyright.
- Works published before March 1, 1989 without proper copyright notice are almost always in the public domain because, under the law that existed before that, a proper copyright notice was required for copyright protection.
- Works published from 1923 to 1963 had to be renewed after an initial copyright term for protection to continue. The U.S. Copyright Office estimates that over 90% of works eligible for renewal were never renewed.
A second article, written by Gary Hoffman, augments Goad’s article with further insight (reference 10). Please review this article as well.
An M-DISC is designed to be permanent; therefore, you cannot change anything after it is written. Before writing any records to an M-DISC, you will want to organize them so as to be as efficient in writing to the disc as possible. You can write the entire disc at one time, or you may write just a portion of it and add files later (depending on your disc writer software). In general, writing one record at a time is not practical.
The number of records (files) you can store on an M-DISC depends on the disc type and the average size of the records you want to write, as shown here—
[Table: disc type | number of 1 MB records that can be stored | number of 2.5 MB records that can be stored; MB means megabyte]
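For a DVD M-DISC, whose 4700-megabyte capacity is the figure given in this paper, the counts follow from simple division. The sketch below is an approximation: the function name is hypothetical, and the calculation ignores file system overhead, so real counts will be somewhat lower.

```python
DVD_M_DISC_MB = 4700  # DVD M-DISC capacity in megabytes (4.7 gigabytes)


def records_per_disc(capacity_mb, average_record_mb):
    """Whole number of records of the given average size that fit."""
    return int(capacity_mb // average_record_mb)


print(records_per_disc(DVD_M_DISC_MB, 1))    # 4700
print(records_per_disc(DVD_M_DISC_MB, 2.5))  # 1880
```

The same function can be used for other disc types by passing in their capacity in megabytes.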
To simplify the writing process, it is recommended that you first copy the files to a temporary folder and monitor the size of the folder as you proceed. For Windows, this can be done by hovering your cursor over the folder name; a pop-up will display the total size of the folder. You should not exceed a folder size of 4700 megabytes (4.7 gigabytes) when writing to a DVD M-DISC.
Once the temporary folder is populated with the desired files, you can start the writing (i.e., etching or burning) process. If the folder size exceeds the disc’s capacity, writing will stop when the disc is full, leaving all remaining files unwritten. Of course, maximizing the number of files written to each disc minimizes the number of discs required.
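If you prefer to check the folder size with a script rather than by eye, the following Python sketch adds up the sizes of all files in the temporary folder and compares the total with the DVD M-DISC limit. The function names are hypothetical, and the limit is simply the 4700-megabyte figure used in this paper; the usable capacity reported by your disc writer software may differ slightly.

```python
from pathlib import Path

DVD_M_DISC_BYTES = 4_700_000_000  # 4700 megabytes, the DVD M-DISC figure


def folder_size_bytes(folder):
    """Sum the sizes of all files in folder, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def fits_on_disc(total_bytes, capacity_bytes=DVD_M_DISC_BYTES):
    """True if the given total size does not exceed the disc capacity."""
    return total_bytes <= capacity_bytes
```

A typical use would be `fits_on_disc(folder_size_bytes("path/to/temporary folder"))`, run just before you start writing the disc.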
An important preservation principle developed at Stanford University is LOCKSS (Lots Of Copies Keep Stuff Safe). The basic concept is this—the more copies you store in different locations, the safer your records will be.
To apply LOCKSS to your archive, you should write a minimum of two discs per set of files and store them in two different locations as far apart as practical. Writing three discs and storing them in three different locations is even better. Perhaps you can exchange archival discs with friends and/or family to enhance the safety of your archived data.
Of course, applying LOCKSS to your archive requires that you get organized and develop a process to track locations and contents of the archival discs. Fortunately, there is an abundance of software available to help you do this, such as Microsoft Access or Intuit QuickBase (an online database).
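As a rough illustration of what such tracking involves, here is a minimal Python sketch that stands in for a database. The disc labels, file set names, and storage locations are invented for the example, and the function name is hypothetical. It lists each file set that is not yet stored in at least two different locations.

```python
from collections import defaultdict

# Hypothetical catalog: one entry per written disc.
catalog = [
    {"disc": "DVD-001-A", "file_set": "Knight family photos", "location": "Home safe"},
    {"disc": "DVD-001-B", "file_set": "Knight family photos", "location": "Sister's house"},
    {"disc": "DVD-002-A", "file_set": "Birth certificates", "location": "Home safe"},
]


def under_protected(entries, minimum_locations=2):
    """Return file sets stored in fewer than minimum_locations distinct places."""
    locations = defaultdict(set)
    for entry in entries:
        locations[entry["file_set"]].add(entry["location"])
    return sorted(name for name, places in locations.items()
                  if len(places) < minimum_locations)


print(under_protected(catalog))  # ['Birth certificates']
```

Counting distinct locations rather than discs matters here: two copies kept in the same place do not satisfy the LOCKSS principle.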
SHARING YOUR DIGITAL RECORDS
As mentioned in the beginning of this paper, sharing a digital record with others is fast and easy—as long as you have an internet connection and email services. The author uses Yahoo email (mail.yahoo.com) because it is free of charge and offers unlimited storage capacity. Also, it allows you to attach a file as large as 20 megabytes to an email. However, whether or not someone can receive such a large file depends on his or her email capabilities.
Should you want to send someone a file that is larger than the person’s email software will accept, you can use a free transfer service instead.
TransferBigFiles.com is a website that allows you to transfer large files over the internet at no charge. YouSendIt.com will also do this for a fee. Once you upload a file that you want to transfer, a link is provided to your intended recipient. That person need only click on the link to download the file to his or her computer.
BACKING UP YOUR ARCHIVED RECORDS
A side benefit of an email service that provides unlimited storage capacity is that it provides a means to extend the LOCKSS principle for your personal archive. By sending yourself emails with attached preservation files, you can create a collection of such emails that will be stored on the email service provider’s computer infrastructure. In effect, you can back up your archive on this infrastructure.
You should never rely on this approach to be your primary or even secondary archive, however, since the email service provider could start limiting storage capacity at any time or could even go out of business. And organizing so many emails to function as your primary archive might be difficult. Also, you may have difficulty accessing your email inbox when you urgently need to retrieve a record from your digital archive.
Online (cloud) backup is also becoming a popular way to back up family history records because of its convenience. But newcomers to cloud backup have much to learn and consider. The Library of Congress has published a blog post (reference 11) that explains these considerations. You should review it if you are interested in exploring cloud backup.
However, you should never rely on cloud backup as your primary or even secondary archive. There is no guarantee that your data will be saved indefinitely. Some cloud backup services (including Amazon Web Services) have already crashed, resulting in lost data for some customers. Also, information in the cloud can be hacked.
Bottom line, you should not count on cloud backup services alone to protect your important family history records!
AS TIME GOES BY . . .
It is important that you, your posterity, and your extended family monitor technology changes and take appropriate actions as needed. These actions, which comprise the ongoing aspects of digital preservation, include—
- Transforming file formats that are becoming obsolete to their replacement formats.
- Migrating files to newer archival storage media so they can continue to be rendered if current M-DISC technology becomes obsolete.
Clearly, digital preservation is not a one-time activity, nor is it a single-generation project. Your responsibility in the digital preservation chain is to gather, digitize, and preserve records the very best you can, then pass them on to the next generation of your posterity and/or extended family that has been prepared to carry on the work.
In many respects, digital preservation is like a relay race—you carry the baton for a period of time and then pass it on to the next runner. To prevent the baton from being dropped during the handoff, you and the next runner must work together in perfect synchronization. This means preparing and motivating the next runner to carry on the race without missing a step.
As this process is carried on from one generation to the next, your digital family history records can be preserved in perpetuity.
Yes, it takes work—but the payback cannot be measured and is eternal.
Recommended PDF/A links—
- www.aiim.org/documents/standards/19005-1_FAQ.PDF (frequently asked questions)
- www.adobe.com/enterprise/pdfs/pdfarchiving.pdf (white paper)
Disclaimer—the recommendations in this paper are the personal opinions of the author. Use them at your own risk. The author is not liable for any consequences resulting from using his recommendations.