Sunday, January 30, 2011

Scanning Technology

Today I participated in the monthly Scanfest put on by various members of the genealogy blogging community.  It was my first occasion to join in, and I had a great time.  I must admit there was more 'fest' and less 'scan' than I had anticipated!  It can be difficult to concentrate on your scanning while trying to follow the conversation.  I was able to get a couple of pages of photo album scanned, named appropriately and saved in an organized fashion, so progress was made.

So I mentioned we had some conversation.  What did we discuss?  Well, one thing we talked about at length was scanners.  How appropriate!  Specifically, we talked about the Flip-Pal portable scanners that Dick Eastman has reported on recently.  The general consensus is that they're nice to have when you're away from home, and do a good job of stitching together multiple scans of larger objects.  This is necessary because the maximum size they can cover in one pass is 4"x6".

A number of us talked about scanning film and slides.  For this sort of work, you need either a dedicated film scanner or a flatbed scanner that has the capability built in or accepts attachments for scanning such media.  Older HP scanners used to come with a triangular prism-like device for scanning transparent media.  Better scanners, like the CanoScan series from Canon, have a backlit lid and frames to hold the transparencies.  I happen to have the CanoScan 8800F, recently replaced by the 9000F, so that's the one I can speak to.  If memory serves, a few others in the discussion have the same model.  I strongly recommend it as a great all-purpose scanner.  The downside is that it's not portable.  It also doesn't handle legal-size pages, so if you scan a lot of those, you might want to look into a model that will.

Another question to ask about any scanner is what light source it uses.  Most scanners still use a cold cathode lamp (similar to a fluorescent light).  These take some time to warm up, and have a lamp life of about 5,000 hours.  That warmup time slows down your scanning, forcing you to wait each time you scan.  In addition, the light changes over time, becoming dimmer and acquiring a color cast which can make your scans look 'off'.  Some newer designs, like my CanoScan 8800F, use white LED backlights.  LEDs have the benefit of being instantly on and ready to scan, with no warmup time.  They are also usually rated at 50,000 hours of life or more, and they don't change color over time.  If you're looking at new scanners, I strongly recommend an LED backlit model.

We discussed scanner settings, and what settings to use with different media.  My personal recommendation is to scan at a minimum of 300 dots per inch (dpi).  Even newsprint I scan at that minimum.  Most photos I scan at 600dpi or more, with high quality formal portraits at 1200dpi.  You can always reduce once you've made the scan, but you cannot increase without rescanning at the higher resolution.  Slides and film I always scan at 4800dpi, the highest native resolution of my 8800F scanner.  Newer models like the 9000F can do 9600dpi or more.  Slides and film frames are small; scanning at high resolution lets you blow the resulting image up without loss of quality.
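To put some numbers on those resolutions, here is a small Python sketch of the arithmetic.  It's only an illustration: the function names are my own invention, the 36mm x 24mm frame is the usual size of a 35mm slide, and the file size assumes plain uncompressed 24-bit color (3 bytes per pixel), which real scanner software may or may not produce.

```python
MM_PER_INCH = 25.4

def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of an original (size in inches) scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)

def uncompressed_megabytes(width_px, height_px, bytes_per_pixel=3):
    """Rough size of the raw pixel data, assuming 24-bit color by default."""
    return width_px * height_px * bytes_per_pixel / 1_000_000

# A 6"x4" snapshot at 600dpi versus a 35mm slide frame (assumed 36mm x 24mm) at 4800dpi.
snapshot = scan_pixels(6, 4, 600)
slide = scan_pixels(36 / MM_PER_INCH, 24 / MM_PER_INCH, 4800)

for name, (w, h) in [("snapshot", snapshot), ("slide", slide)]:
    print(f"{name}: {w} x {h} pixels, about {uncompressed_megabytes(w, h):.1f} MB uncompressed")
```

The tiny slide ends up with more pixels than the much larger print, which is why the high setting matters for film.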

We talked about graphic image formats a lot, too.  Virtually all scanners will do a variety of formats, with TIFF (Tagged Image File Format) being the preferred choice.  Why?  Well, TIFF (sometimes abbreviated as TIF) is normally saved as a "lossless" format.  That means the computer doesn't discard any of the data in the image, even if it compresses the file.  You can repeatedly edit and save a TIFF file with no loss of quality.  Not so with JPEG (Joint Photographic Experts Group) files, usually abbreviated as JPG.  JPG is a "lossy" format, which means that the computer compresses the data in a manner that causes some of the data to be lost, with minimal perceptible difference in the image.  The problem with this is that repeatedly editing and saving a JPG file causes repeated compression and data loss, leading to steadily decreasing image quality.  Why would we use JPG, then?  Well, JPG is very effective at compressing images to save disk space.  Back when space was much more expensive, this was a big issue.  As well, transmitting large image files over dialup was painfully slow.  JPG was designed to fix both issues, and does a generally good job of it.  For any files you need to have around for a long time, or need to edit repeatedly, or archive, TIFF is the better choice for image quality.  The good news is once you have the TIFF file, you can convert it to JPG and save a copy to use when you need to send the file across a network or through e-mail.  You could also convert a JPG to TIFF, but even the initial saving of the JPG file loses some of the data, so you end up with a lower quality TIFF file.
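For the programmers in the crowd, here is a toy Python illustration of the lossless versus lossy idea.  To be clear, this is NOT how TIFF or JPG actually work inside: the pixel values are made up, zlib simply stands in for "some lossless compression", and the rounding function is a crude stand-in for the kind of detail a lossy format throws away.

```python
import zlib

# A made-up row of 8-bit grayscale pixel values (hypothetical data).
pixels = bytes([12, 13, 15, 200, 201, 203, 90, 91, 94, 255])

# Lossless: compress, then decompress, and every byte comes back unchanged.
restored = zlib.decompress(zlib.compress(pixels))

def quantize(data, step=8):
    """Toy 'lossy' step: round each value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)

# Lossy: nearby values collapse together, and the originals cannot be recovered.
lossy = quantize(pixels)
changed = sum(1 for before, after in zip(pixels, lossy) if before != after)

print("lossless round trip identical:", restored == pixels)
print("lossy copy identical:", lossy == pixels)
print("pixel values altered by the lossy step:", changed)
```

Once the lossy step has run, there is no way to get the original values back from the copy, which is the reason to keep the TIFF as your master file.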

We also talked about using a digital camera when something cannot be scanned.  For instance, something larger than the area of your scanner requires very finicky handling and software stitching to make a combined image, if you can even manage to scan it at all.  Three-dimensional objects do better with a digital camera than a flatbed scanner.  The problem here is that most cameras shoot JPG files, at least by default, and many can do no other format.  For snapshots, that's usually fine, but for archival purposes we're back at the image quality issue.  Digital Single Lens Reflex (DSLR) cameras can also shoot RAW format, which is the native format of the camera.  Technically, ALL digital cameras shoot a RAW format, but only the high-end cameras allow you to use that format instead of JPG.  Also, each camera has its own RAW format!  If you're forced to use a camera that only exports JPG files, you should shoot using the highest quality and resolution the camera is capable of, then immediately convert the file to TIFF to avoid any further loss of quality.

Convert the files, eh?  How?  Well, we talked somewhat about image editing software.  My software of choice is Adobe Photoshop Elements.  Others like Paint Shop Pro (PSP).  Some don't do much editing at all, only using what Picasa offers for minor tweaks.  Most software will allow you to load an image, then select File, Save As, and then choose the destination format you want.  Usually, if you're following my advice and saving in TIFF format originally, you'll want to make copies in JPG or PNG (Portable Network Graphics) formats for use on your own blog or in e-mails.  In fact, to share some examples of what I was scanning today, I had to make JPG copies to upload to the forum.

I could go on for hours.  That's an occupational hazard for IT guys like me!  I'll close by saying that the next Scanfest event will be on February 27th at 11:00AM Pacific Time.  Keep an eye out for instructions on how to join!

UPDATE:  In hindsight, I realize I should have saved this for a Tech Tuesday.  Chalk it up to the excitement of my first Scanfest...

Tech Tuesday – Have you stumbled upon a piece of technology or new Web-based application that would be of interest to your fellow genealogy colleagues? Post at your blog on Tech Tuesday and show us the ins and outs of this technology and how it can benefit the genealogy community.  This is a new series suggested by Donna Peterson of Hanging with Donna and in the past there have been many iterations of this series: the National Archives and Records Administration (NARA) blog Narations as well as The Family Curator by Denise Levenick.  (Blurb shamelessly stolen from Thomas MacEntee at Geneabloggers!)