AH Image Data Collector for Photoshop CS / CS2 and Bridge

Andrew · Post by **Andrew** » Fri Oct 07, 2005 9:53 am

January 1 2007: I am releasing a modified version of AH Image Data Collector today. It should be compatible with CS / CS2 / CS3. It is similar to the old version so the description that follows should be enough to get you going. If there are any technical issues please report them in this thread.

This script was created in response to a request at the Adobe Forum. AH Image Data Collector is a JavaScript script for Photoshop CS / CS2 that is also integrated with Bridge. It writes image data to an external data file. Image data is any part of a file's metadata as displayed in Photoshop's File Info panel, plus the file name and file path.

You can work with groups of files selected in Bridge or the File Browser, or select files within a system of subfolders using powerful selection criteria, e.g. jpg files with a*h in the name, excluding those which have c or d in the name.

The subfolders checkbox has been replaced by a 'Levels' checkbox. If set to 0 it does not explore subfolders; if set to 1 it explores the first level of subfolders; if set to 2 it explores all subfolders of the target folders.
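As a rough illustration of the three Levels settings, here is a minimal sketch. It assumes a hypothetical in-memory folder tree (`name`, `files`, `folders`) and a made-up function name `collectFiles`; the script's actual traversal code is not shown in this thread.

```javascript
// Hypothetical in-memory folder tree; the real script works on folders on disk.
var tree = {
  name: "target",
  files: ["a.jpg"],
  folders: [
    {
      name: "sub1",
      files: ["b.jpg"],
      folders: [{ name: "sub1a", files: ["c.jpg"], folders: [] }]
    }
  ]
};

// levels: 0 = target folder only, 1 = plus first-level subfolders,
// 2 = all subfolders at any depth (as described for the Levels setting).
function collectFiles(folder, levels, depth) {
  depth = depth || 0;
  var found = folder.files.slice();
  var descend = levels === 2 || (levels === 1 && depth === 0);
  if (descend) {
    folder.folders.forEach(function (sub) {
      found = found.concat(collectFiles(sub, levels, depth + 1));
    });
  }
  return found;
}

console.log(collectFiles(tree, 0)); // target folder only
console.log(collectFiles(tree, 1)); // adds first-level subfolders
console.log(collectFiles(tree, 2)); // adds every nested subfolder
```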

You specify the data required by using keywords in the Data Selection panel. Each part of the data request is separated from the next by a '+' e.g. 'crdater+_+crtimehm' and you can insert plain text wherever you wish. The following are the keywords.

xmeta:fieldID - inserts image metadata, e.g. xmeta:exif:FNumber. Multiple id-fields can be specified. For detail go here.
xmeta::fieldID - note the double colon - returns the metadata in xml format with identifiers. For detail go here.
origname - file name excluding extension.
namewext - file name including extension.
fullname - file name with full path, e.g. /c/myimages/IMG1000.tif.
fullpath - file path, e.g. /c/myimages.
xnl - inserts a newline in the output data.
/re/replacetext - does search / replace within the file name, go here for details (RegExp based).
crdater - reverse creation date (best for computer sort order): yymmdd.
crdate - creation date: ddmmyy.
crdateus - US style creation date: mmddyy.
crtime - creation time - hhmmss.
crtimehm - creation time - hhmm.
orignum3 - the rightmost 3 consecutive digits in the original file name, e.g. 'Img2390' becomes '390'. Similarly orignum2.
autonum100 - counts from 100; any start point can be specified, e.g. autonum1.
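To make the '+' syntax concrete, here is a minimal sketch of how a control string could be expanded for one file. Everything in it is an assumption for illustration: the function names, the `{ path, created }` file shape, and the use of a plain `Date` for the creation date. The xmeta, /re/ and autonum keywords are left out, and this is not the script's actual code.

```javascript
// Hypothetical helper names throughout; xmeta, /re/ and autonum are omitted.
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// file: { path: "/folder/name.ext", created: Date } (an assumed shape).
function expandControlString(control, file) {
  var slash = file.path.lastIndexOf("/");
  var folder = file.path.slice(0, slash);
  var nameWithExt = file.path.slice(slash + 1);
  var dot = nameWithExt.lastIndexOf(".");
  var baseName = dot > 0 ? nameWithExt.slice(0, dot) : nameWithExt;

  var d = file.created;
  var yy = pad2(d.getFullYear() % 100);
  var mm = pad2(d.getMonth() + 1);
  var dd = pad2(d.getDate());
  var hh = pad2(d.getHours());
  var mi = pad2(d.getMinutes());
  var ss = pad2(d.getSeconds());

  var keywords = {
    origname: baseName,
    namewext: nameWithExt,
    fullname: file.path,
    fullpath: folder,
    xnl: "\n",
    crdater: yy + mm + dd,
    crdate: dd + mm + yy,
    crdateus: mm + dd + yy,
    crtime: hh + mi + ss,
    crtimehm: hh + mi
  };

  return control.split("+").map(function (part) {
    if (Object.prototype.hasOwnProperty.call(keywords, part)) {
      return keywords[part];
    }
    var m = /^orignum(\d+)$/.exec(part);
    if (m) {
      // Greedy ".*" pushes the match to the rightmost run of N digits.
      var found = new RegExp(".*(\\d{" + m[1] + "})").exec(baseName);
      return found ? found[1] : "";
    }
    return part; // anything else is treated as plain text
  }).join("");
}

var file = { path: "/c/myimages/Img2390.jpg", created: new Date(2005, 9, 7, 9, 53, 0) };
console.log(expandControlString("crdater+_+crtimehm", file)); // 051007_0953
console.log(expandControlString("origname+,+orignum3", file)); // Img2390,390
```

One consequence of splitting on '+' in this sketch is that plain text cannot itself contain a '+'.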

Here are some sample strings and their results.

CONTROL STRING
fullname+xnl+xmeta::exif:ISOSpeedRatings/rdf:Seq/rdf:li+xnl

DATA
/e/photo20d/test/test/041115_1856_28-10.jpg
<exif:ISOSpeedRatings>
<rdf:Seq>
<rdf:li>400</rdf:li>
</rdf:Seq>
</exif:ISOSpeedRatings>

/e/photo20d/test/test/041115_400 copy.jpg
<exif:ISOSpeedRatings>
<rdf:Seq>
<rdf:li>400</rdf:li>
</rdf:Seq>
</exif:ISOSpeedRatings>

/e/photo20d/test/test/test.jpg
<exif:ISOSpeedRatings>
<rdf:Seq>
<rdf:li>400</rdf:li>
</rdf:Seq>
</exif:ISOSpeedRatings>

CONTROL STRING
namewext+xnl+xmeta:exif:ISOSpeedRatings/rdf:Seq/rdf:li+,+xmeta:exif:FNumber+,+xmeta:exif:MeteringMode+xnl

DATA
041115_1856_28-10.jpg
400,28/10,5
041115_400 copy.jpg
400,28/10,5
test.jpg
400,28/10,5

CONTROL STRING
fullpath+,+fullname+,+xmeta:exif:ISOSpeedRatings/rdf:Seq/rdf:li+,+xmeta:exif:FNumber+,+xmeta:exif:MeteringMode+xnl

DATA
/e/photo20d/test/test,/e/photo20d/test/test/041115_1856_28-10.jpg,400,28/10,5
/e/photo20d/test/test,/e/photo20d/test/test/041115_400 copy.jpg,400,28/10,5
/e/photo20d/test/test,/e/photo20d/test/test/test.jpg,400,28/10,5
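Comma-separated output like the last sample is easy to read back into records. The following is a minimal sketch, not part of the script: `parseRow` and `rationalToNumber` are hypothetical names, the field order is taken from the control string above, and the simple split assumes that paths contain no commas.

```javascript
// Convert an EXIF rational such as "28/10" to a number.
// Values without a slash are passed through Number().
function rationalToNumber(text) {
  var m = /^(-?\d+)\/(\d+)$/.exec(text);
  if (!m) return Number(text);
  return Number(m[1]) / Number(m[2]);
}

// Field order follows the control string:
// fullpath, fullname, ISOSpeedRatings, FNumber, MeteringMode.
// Assumes no commas inside the path or file name.
function parseRow(line) {
  var parts = line.split(",");
  return {
    folder: parts[0],
    file: parts[1],
    iso: Number(parts[2]),
    fNumber: rationalToNumber(parts[3]),
    meteringMode: Number(parts[4])
  };
}

var row = parseRow("/e/photo20d/test/test,/e/photo20d/test/test/test.jpg,400,28/10,5");
console.log(row.file);    // /e/photo20d/test/test/test.jpg
console.log(row.iso);     // 400
console.log(row.fNumber); // 2.8
```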

For a lot more detail read the help file:

bb/files/imagedata-help.htm

Andrew

YrbkMgr · Post by **YrbkMgr** » Fri Oct 07, 2005 4:10 pm

Andrew,

I installed the Data Collector script and read the help and the reference for the renamer. I followed your instructions, but I do not know whether there is any special designation for "Description" within the metadata other than Description.

This Screenshot of the dialog is what was entered, with exactly 2 files selected. It produced said DAT file with only the origname data, no Description.

So it works, but, I'm not able to extract the critical piece, Description. Thoughts?

Also, it would be helpful if it were possible to create an output filename prompt so that one could point to a destination folder for the DAT and an optional filename.

Peace,
Tony

Andrew · Post by **Andrew** » Fri Oct 07, 2005 8:36 pm

I'll do a bit more on this later today. You need to supply the exact XML descriptor to find a metadata element. Open an image file that has the full set of IPTC data you are using, go to File Info, go to Advanced, save the XML data from your file, and open it in a text editor; you will see what I mean. Near the top you would see, for example:

<exif:FNumber>63/30</exif:FNumber>

To capture that value you need to use:

xmeta:exif:FNumber
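In other words, the part after `xmeta:` names the XML element whose content is wanted. A minimal sketch of that idea follows; `fieldFromKeyword` and `readField` are hypothetical names, and the lookup is a plain string search that only handles opening tags without attributes. It is not the script's own code.

```javascript
// One line of saved XMP, as in the example above.
var xmp = "<exif:FNumber>63/30</exif:FNumber>\n";

// Strip the leading "xmeta:" to get the element name, e.g. "exif:FNumber".
function fieldFromKeyword(keyword) {
  return keyword.replace(/^xmeta:/, "");
}

// Return the text between <fieldId> and </fieldId>, or "" if not found.
// Only handles opening tags that carry no attributes.
function readField(xml, fieldId) {
  var open = "<" + fieldId + ">";
  var close = "</" + fieldId + ">";
  var start = xml.indexOf(open);
  if (start === -1) return "";
  start += open.length;
  var end = xml.indexOf(close, start);
  return end === -1 ? "" : xml.slice(start, end);
}

console.log(readField(xmp, fieldFromKeyword("xmeta:exif:FNumber"))); // 63/30
```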

I am going to turn this into a metadata manager in the next couple of weeks, it will work with metadata writing it to external files, writing it to text layers, anything else I can think of. It will also have more options, like being able to render the target data still wrapped in xml identifiers.

The destination control you asked for should be pretty straightforward too.

Of course, there is the question, what are you really achieving with the external file since all you are doing is duplicating xmp values that the file already has. Perhaps you should think more how you want to use it.

Andrew

YrbkMgr · Post by **YrbkMgr** » Fri Oct 07, 2005 9:57 pm

Andrew,

Last thing first:

Andrew wrote: "Of course, there is the question, what are you really achieving with the external file since all you are doing is duplicating xmp values that the file already has. Perhaps you should think more how you want to use it."

I know how I want to use it. The thumbnail sketch is that I have millions of images with embedded exif data. I need a cross reference so that I can have a database of file name and image description. So I need a process to extract existing data and log those two params. Later, I'll need to parse it, depending on its final output structure, and merge it into a database, but that's later. For now, I need to know what filename contains which description.

One of the issues is that the XML descriptor is, in my case, apparently bohemian in format. While your example of FNumber is straightforward, my data apparently isn't.

Here's a snippet of the data:

<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:format>image/jpeg</dc:format>
<dc:creator>
<rdf:Seq>
<rdf:li>My Old Yearbook ™ CD</rdf:li>
</rdf:Seq>
</dc:creator>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="x-default">COHEN_SEYMOUR</rdf:li>
</rdf:Alt>
</dc:description>
</rdf:Description>

I need the information "COHEN_SEYMOUR". Playing games with different permutations, one syntax that got close was:

origname+_+xmeta:dc:description

But that produced the following:

Jamaica55_014_02_
_rdf_Alt_
_rdf_li xml_lang=_x-default__COHEN_SEYMOUR_-rdf_li_
_-rdf_Alt_

It's not as clean as I need to get it. The "Jamaica55" stuff is the filename. What seems to be screwing me up is the rdf:li and the "xml_lang=_x-default" being included.
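The underscores and hyphens in that output are consistent with a character substitution applied to the raw XML. The sketch below is inferred purely from the sample output: `sanitizeForFileName` is a hypothetical name, and the exact set of characters the script replaces is an assumption.

```javascript
// Inferred from the output above: "/" becomes "-", and < > : " become "_".
// The script's real mapping may cover more characters than these.
function sanitizeForFileName(text) {
  return text.replace(/\//g, "-").replace(/[<>:"]/g, "_");
}

console.log(sanitizeForFileName("<rdf:Alt>"));
// _rdf_Alt_
console.log(sanitizeForFileName('<rdf:li xml:lang="x-default">COHEN_SEYMOUR</rdf:li>'));
// _rdf_li xml_lang=_x-default__COHEN_SEYMOUR_-rdf_li_
```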

Now you know that I'm not a scripting wizard, but I can generally follow logic - I just don't have programming experience so I'm still confused.

Finally, I know you're putting thought into this, and I'm not interested in dragging you around by the nose so to speak - I am grateful for what you've done thus far so if this is too much of an annoyance, just say, it's cool.

Peace,
Tony

Andrew · Post by **Andrew** » Fri Oct 07, 2005 10:11 pm

Hi Tony, no problem, we will get there soon. What you have there are multi-layered unique IDs, so you have to specify each layer of the ID (it is briefly mentioned in the docs):

xmeta:dc:description/rdf:Alt

That will get closer. What I cannot remember is whether I set it up to deal with IDs that were not identical for open and close. I know I had both at one stage but I may have simplified it. You could try

xmeta:dc:description/rdf:Alt/rdf:li xml:lang="x-default"

and

xmeta:dc:description/rdf:Alt/rdf:li
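For illustration, a layered ID can be resolved by descending one element at a time while letting the opening tag carry attributes. This is a minimal sketch under stated assumptions, not the script's implementation: `innerOf` and `readPath` are hypothetical names, the matching is regular-expression based rather than a real XML parser, and nested elements with the same name are not handled.

```javascript
var xmp =
  "<dc:description>\n" +
  " <rdf:Alt>\n" +
  '  <rdf:li xml:lang="x-default">COHEN_SEYMOUR</rdf:li>\n' +
  " </rdf:Alt>\n" +
  "</dc:description>";

// Inner text of the first <tag ...>...</tag> in xml, or null if absent.
// The opening tag may carry attributes; the closing tag is just </tag>.
function innerOf(xml, tag) {
  var open = new RegExp("<" + tag + "(\\s[^>]*)?>").exec(xml);
  if (!open) return null;
  var start = open.index + open[0].length;
  var end = xml.indexOf("</" + tag + ">", start);
  return end === -1 ? null : xml.slice(start, end);
}

// Walk a "/"-separated path such as dc:description/rdf:Alt/rdf:li.
function readPath(xml, path) {
  var current = xml;
  var tags = path.split("/");
  for (var i = 0; i < tags.length && current !== null; i++) {
    current = innerOf(current, tags[i]);
  }
  return current === null ? "" : current.replace(/^\s+|\s+$/g, "");
}

console.log(readPath(xmp, "dc:description/rdf:Alt/rdf:li")); // COHEN_SEYMOUR
```

In this sketch a path segment that includes the attribute, such as `rdf:li xml:lang="x-default"`, finds no matching closing tag and yields an empty string, so the bare element name is the form to use.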

In any case I will fix it so it works. I will also disable the function that converted all characters that are prohibited in file names to _.

Don't worry about me, this has been a very useful exercise which will build into lots of other things I am working on.

Andrew

YrbkMgr · Post by **YrbkMgr** » Fri Oct 07, 2005 10:45 pm

Andrew,

You are too cool for school.

First, here's what
origname+_+xmeta:dc:description/rdf:Alt/rdf:li
resulted in

Jamaica55_014_02_xml_lang=_x-default__BLANCO_SANDRA

SO flippin' close.

Adding
/rdf:li xml:lang="x-default"

Returned nothing, just the filename.

Andrew wrote: "I will also disable the function that converted all characters that are prohibited in file names to _"

That would be excellent.

It's true that I read about the "/", but I was reading too fast and forgot that I had to use it. Looks like it works, but there may be an issue with the begin tag not matching the end tag as you said.

See, here's a little more data. I use Russell Brown's caption maker on these images. It won't work when I enter a document title, so I've been forced to use Description so that caption maker will run (see Russell Brown's web site if you don't know what I mean).

But when I look at File Info, there's like 8 different fields for "Description". So really, I have a feeling that if I were able to use the Title field instead of Description your script would probably be easier (at least in specifying the correct XMP field), and data entry for our images would go faster because we wouldn't have to Tab to Description. We could just enter data in the Title field. But as I say, the caption maker won't work with the field "Document Title". For some reason the "Description" field that comes up with File Info, while listed first, appears to be rather bohemian. <throws hands up> I don't know...

Not that it's all that relevant, but I thought I'd mention it.

Andrew · Post by **Andrew** » Sun Oct 09, 2005 5:16 am

One thing I notice that is very annoying is that the command string is being truncated when you reopen the script. I will see if I can do something about that.

NOW FIXED

Andrew

YrbkMgr · Post by **YrbkMgr** » Sun Oct 09, 2005 7:02 pm

Andrew,

Nice script. Functionally, it's near perfect. For my purposes, I can run with it nicely. Your fixes for the subfield variations produce exactly what I'm after.

If you were so inclined, the following would be improvements I would recommend, based on how I would use this delicious little script.

It would be great to have a "picklist" of "meta data". That is to say, that if a user could pick an image, have the script read all the fields in the image, and then the user could tick them off to include in the data selection params. This would avoid a non-scripter type from having to carefully analyze XMP data to grab the correct field. It would have to be a subroutine obviously, or possibly a completely separate script that runs and then passes field data to the DataCollector script for selection - I'm not sure how possible it is, but food for thought. One challenge would be, I would think, the order - a user would want control of the order of the fields. I'm just thinking of a way for someone to avoid having to decrypt XMP field delimiters.

Also, I would recommend a browse for the placement of the results file. A user might want the option to store all results files in a single location, rather than having to scour for them once generated. Imagine 100 folders that the user ran this script on, all in spurious locations on a HD. It might be nice to put the data files in a single, unified location so that they might be imported into a database all in one fell swoop.

This last one isn't so trivial as it may seem at first. In a week's time, I can easily run this script to capture data from 1,000,000 images, week after week. In a production type setting, where most image processing requires little user intervention, I could ostensibly have 300 different folders upon which I run the script, all within the PS interface (or Bridge if I installed your other little utility). One of the things I have to do now is, once the scripts are run all in one fell swoop, I have to go mine the files to put them in a central location.

Now, my plan for a workaround to this situation is to name each results file with a common string. That way I can use the search engine in XP to find them, and even though they are unique filenames, they will have some common string that is unlikely to be in other files - if that makes sense. As an example, two results files might be:

snakeboy_Jamaica55_Data
snakeboy_Hillsborough95_Data

Searching on snakeboy allows me to collect the results files with XP search.

I'm only sharing this so as to say, it's not critical for me, and I have no idea what your development plans are, but perhaps the above is useful to you.

I am absolutely stoked about my new capabilities with your script. Thank you Andrew.

Peace,
Tony

Andrew · Post by **Andrew** » Sun Oct 09, 2005 7:39 pm

Hi

Glad it works for you. Yes, I thought about both your suggestions but decided that until I see this is more widely used I would stay with things as they are. On the file location thing, I think your solution is pretty effective and will not waste you much time. One way to make the search faster might be to alter the extension of the dat file - it's defined near the beginning of the script. On the xmp delimiters, what I might do at some stage is simply list them in this thread here and people can cut and paste from them. In most cases the user only needs to make occasional changes to the command string so it is probably manageable at that level.

Andrew

Andrew · Post by **Andrew** » Sun Dec 31, 2006 10:21 pm

New CS3 compatible version released 1 Jan 2007 (see first post) - please report any problems here.

Andrew