filenames on a PC
Tagged: filename pdf
October 24, 2016 at 12:07 am #135080
Need some input on using a proper filename for saving scanned newspaper clippings to our PC. Each file is in pdf and bmp.
When I became Curator, I started scanning all newly acquired LOOSE clippings. The contents of scanned pdf’s are searchable in Windows Explorer. One donor recently gave us 500+ loose clippings from 1940-1970 (plus 11 scrapbooks full of glued in place clippings). The scrapbook clippings will not be scanned.
Since our inception in 1975, we’ve accumulated 3 file drawers of clippings. Last year I relabeled each folder (such as “School-graduation (281)”). The number was to be used as part of a filename system for scanned clippings.
What filename would be best for saving the pdf’s (and bmp’s) on the PC? Should I include the YYYY-MM-DD date of the article? The Donor #? The file cabinet folder # (the 281 above)? All of these? In what order (such as: article date, Donor #, folder #.pdf)? Should I omit the file cabinet folder # and put similar clippings in sub-folders on the PC?
My professional work is with databases. I often find I make things too hard for non-computer people. At least specific pdf’s can be found by typing in a search word in Windows Explorer. It is very unfortunate that most of our help do not (and will not) use our computer. If we keep saving file cabinets full of newspaper clippings, it will become harder to find specific clippings. Plus we do NOT have the room.
October 24, 2016 at 11:59 am #135082 Quinn Morgan Ferris, Participant
Thanks for your question. As a conservator, I am not an expert on the subject of naming conventions, but I will do my best to offer helpful advice. I am wondering what type of institution you are working for, and if they have any other digital collections that have been named before? If there are other digital collections present, it may help to follow whatever naming convention has already been established.
If this is not the case, in my experience naming your files in a semi-intuitive way depends on three factors–consistency, logic, and the necessity of using a unique identifier. Using the date of the article might seem like a good idea until you realize that you have several articles in the same location from the same date, so it helps to do a brief assessment of the items in the collection before going through and renaming all of the files.
Based on the information you gave me, I would recommend two things. First, start from the largest set of data and work your way down, for example collection_foldernumber_pdfnumber, where collection is the largest grouping of items, foldernumber is a smaller grouping, and pdfnumber is the smallest grouping or unique identifier. Second, try to keep your digital naming somewhat consistent with the labeling and naming that you have done for the actual artifacts. Ideally, this would eliminate confusion later.
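As a rough illustration of the largest-to-smallest pattern described above, here is a minimal Python sketch. The function name `build_filename`, the collection label "clippings", and the three-digit zero padding are assumptions made for the example, not part of any institution's convention.

```python
def build_filename(collection: str, folder_number: int, item_number: int,
                   ext: str = "pdf") -> str:
    """Join the parts from the largest grouping to the smallest.

    Zero padding (an assumption) keeps names sorting in numeric order
    when they are listed alphabetically in a file browser.
    """
    return f"{collection}_{folder_number:03d}_{item_number:03d}.{ext}"


# Hypothetical example: item 1 in folder 281 of a "clippings" collection.
print(build_filename("clippings", 281, 1))
```

Fixed-width numbers are a common choice here because "010" sorts after "002", whereas an unpadded "10" would sort before "2".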
Naturally, as someone only hearing a little about your collection, it is difficult for me to tell you what the best method would be. I am attaching two pdfs that outline the naming conventions of Harvard and University of Colorado. Additionally, you can visit this link for the Stanford University Library to get a sense of how different institutions approach this issue.
Hope this helps!
October 25, 2016 at 12:36 am #135087
Quinn, et al:
Sorry for this long reply. I often find myself trying to be too concise.
Our institution is a Historical Society which includes a Museum of Local History. The only other digital files are about 500 photos (about 10% of our photos). The photo filenames on the PC are the same as the accession number on the original photos (EX: P-434.tif).
Scanning the largest set of data first and working to the smaller sets is wise. In our case ONLY newly acquired newspaper clippings will be scanned and saved on the PC. In the future, when we have time (or if an old clipping is found to be too fragile), the clippings currently in the file cabinet will be scanned.
I agree that consistency, logic, and a unique identifier are a must. When I use dates, I’ve always used YYYY-MM-DD with placeholders (EX: 1999-03-05 for Mar 5, 1999 and 0000-02-10 for Feb 10, no year).
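The placeholder idea can be sketched in a few lines of Python. The function name `date_prefix` is hypothetical, and extending the zero placeholder to an unknown month or day is an assumption; the post only describes a placeholder for a missing year.

```python
from typing import Optional


def date_prefix(year: Optional[int], month: Optional[int],
                day: Optional[int]) -> str:
    """Return a YYYY-MM-DD string, writing zeros for any unknown part."""
    # "value or 0" turns None into 0 before the zero-padded formatting.
    return f"{year or 0:04d}-{month or 0:02d}-{day or 0:02d}"


print(date_prefix(1999, 3, 5))    # Mar 5, 1999
print(date_prefix(None, 2, 10))   # Feb 10, year unknown
```

Because every part has a fixed width, the resulting filenames sort chronologically, with undated items grouped at the top.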
It is true that when more than one newspaper clipping has the same publishing date, there would be a problem with the filename. IF I include the accession number of the Donor in the filename the problem goes away. This is what I originally thought of doing. Unfortunately this makes the filename cumbersome:
A) “1999-12-24 2016-03-112.pdf”
B) “1999-12-24 2016-03-117.pdf”
For both A and B, the articles were published on Dec 24, 1999.
For “A”, the accession number is 2016-03-112
For “B”, the accession number is 2016-03-117
(The accession # is the year donated, the donor#, and the sequence# for the year)
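For anyone who wants to split such an accession number back into its parts, here is a minimal Python sketch. The names `Accession` and `parse_accession` are hypothetical, and the code assumes the three parts are always separated by hyphens as in the two examples above.

```python
from typing import NamedTuple


class Accession(NamedTuple):
    year_donated: int
    donor: int
    sequence: int


def parse_accession(text: str) -> Accession:
    """Split 'YEAR-DONOR-SEQUENCE' into its three numeric parts."""
    year, donor, sequence = text.split("-")
    return Accession(int(year), int(donor), int(sequence))


print(parse_accession("2016-03-112"))
```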
Our current newspaper clippings in the file cabinet do NOT have accession numbers on the clippings. I NOW think it might be best to omit the accession number from the filename of the scanned clippings. When the clippings in the file cabinet EVENTUALLY get scanned, I would have a bigger problem. In the long run, I think it might be best to save the scanned clippings with only the publishing date but add a unique 3-digit sequence number:
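One way the publishing date plus a 3-digit sequence number might be generated is sketched below in Python. The exact layout is not shown in the post, so the underscore separator, the function name `next_filename`, and the sample names are all assumptions.

```python
import re
from typing import Iterable


def next_filename(existing: Iterable[str], date: str, ext: str = "pdf") -> str:
    """Return the next unused 'DATE_NNN.ext' name for the given date."""
    pattern = re.compile(rf"^{re.escape(date)}_(\d{{3}})\.{re.escape(ext)}$")
    # Collect the sequence numbers already used for this date.
    used = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    sequence = max(used, default=0) + 1
    if sequence > 999:
        raise ValueError("more than 999 clippings share this date")
    return f"{date}_{sequence:03d}.{ext}"


# Hypothetical existing files in one sub-folder.
files = ["1999-12-24_001.pdf", "1999-12-24_002.pdf", "1999-12-25_001.pdf"]
print(next_filename(files, "1999-12-24"))
```

Restarting the sequence for each date keeps the numbers short; a single running number across all clippings would also work but would need more digits over time.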
We have about 300 file folders in our newspaper file cabinet. Each file folder is labeled by subject (Hotels, School-events, School-graduation, Politics, etc). On the PC I’ll probably use sub-folders using the same names as the file folders in the file cabinet. This way all the scanned clippings of the same subject will be in the same sub-folder on the PC. No need to make the filename longer than it needs to be. The good thing is the contents of all the pdf’s can be searched in Windows Explorer.
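The sub-folder arrangement could be expressed in Python roughly as follows. The root folder `C:/Clippings`, the function name `clipping_path`, and the sample filename are hypothetical; only the subject names come from the folder labels mentioned above.

```python
from pathlib import PureWindowsPath

# Hypothetical root folder on the PC that holds all scanned clippings.
ROOT = PureWindowsPath("C:/Clippings")


def clipping_path(subject: str, filename: str) -> PureWindowsPath:
    """Place a scanned clipping in the sub-folder named after its subject."""
    return ROOT / subject / filename


path = clipping_path("School-graduation", "1999-12-24_001.pdf")
print(path)
```

`PureWindowsPath` only builds the path text and does not touch the disk, so the sketch can be tried on any machine before real folders are created.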
David Cranston, Curator
Hadley-Lake Luzerne Historical Society
52 Main St – PO Box 275
Lake Luzerne, NY 12846-0275
October 28, 2016 at 10:56 am #135096 Quinn Morgan Ferris, Participant
Thank you for clarifying some of the details. Sounds like you’ve hit upon a formula that works for you. Good luck with all the scanning and naming!
All the best,
October 28, 2016 at 7:17 pm #135099
By the way, I mentioned the contents of our pdf files are searchable in Windows Explorer. Normally the contents of a pdf are NOT searchable in Windows Explorer. As a warning, IF you decide to start saving hundreds of pdf’s and expect the contents to be searchable, you will probably need to change your Indexing settings. What those settings are and how to change them can be found elsewhere on the internet. Also, the scanned pdf’s will need to be put through an OCR (Optical Character Recognition) program.
Our Society’s PC is an old Windows XP, and mine is Windows 7.
- The forum ‘Connecting to Collections Care Forum Archives – 2015 through 2018’ is closed to new topics and replies.