I'm trying to write a very simple Perl script that does the following:

1) Open up all the XML files in a directory.
2) Use regex to grab 3 attribute values.
3) Output $1 $2 $3 values from step 2 into a file.

I am only having trouble with step 2. Is it possible to grab these values using a regex? I'm not sure what the matching syntax would look like in Perl.

Example XML file:
<test office="U.S.House" state="US" more text here="dont want this"  filter="Unfiltered" more useless text="here" />
<useless>dont need this</useless>

I need to grab the values of the office, state, and filter attributes. If I try office="(.*)", it matches more than I want. Any help would be appreciated.



Hi John,

Sorry for the delay in responding to your question. It has been a very busy week and this is the first time I have been able to sit down to catch up on my personal email.

Rather than trying to use a regex to parse complex documents, I recommend a CPAN module such as XML::Simple, which makes the job easy. I have used this module to process XML data many times with complete success.
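
As a rough idea of what the module approach looks like, here is a minimal sketch using XMLin from XML::Simple. One caveat: the sample as posted is not well-formed XML (attribute names such as "more text here" contain spaces, and there are two top-level elements), so a real parser would reject it. The sketch therefore assumes cleaned-up input wrapped in a hypothetical <root> element; the attribute names "extra" and "note" are also made up for illustration.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);   # CPAN module, not part of core Perl

# ASSUMPTION: well-formed input. <root>, "extra" and "note" are hypothetical
# stand-ins for the parts of the posted sample that are not valid XML.
my $xml = <<'END_XML';
<root>
  <test office="U.S.House" state="US" extra="dont want this" filter="Unfiltered" note="here" />
  <useless>dont need this</useless>
</root>
END_XML

# ForceArray makes <test> an array ref even when there is only one element;
# KeyAttr => [] turns off XML::Simple's automatic folding on name/key/id.
my $data = XMLin($xml, ForceArray => ['test'], KeyAttr => []);

for my $test (@{ $data->{test} }) {
    print join("\t", @{$test}{qw(office state filter)}), "\n";
}
```

To read from a file instead, XMLin also accepts a file name in place of the XML string.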

If you want to see how this approach compares with a regex, please let me know and I'll write up a sample script that will process your data.
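
For the regex route itself, office="(.*)" matches too much because .* is greedy: it runs to the last double quote on the line. Matching "anything except a quote" with [^"]* stops at the first closing quote. Below is a minimal sketch of all three steps under some assumptions: each <test ...> tag sits on a single line, values are always double-quoted, and the helper names attr_value and process_dir are made up for illustration.

```perl
use strict;
use warnings;

# Return the value of one attribute from a tag string, or undef if absent.
sub attr_value {
    my ($line, $name) = @_;
    # \b stops "office" from matching inside a longer name such as "suboffice";
    # [^"]* ends at the first closing quote, unlike the greedy .*
    return $line =~ /\b\Q$name\E="([^"]*)"/ ? $1 : undef;
}

# Steps 1-3: read every .xml file in $dir and write one tab-separated line
# per <test> tag to $out_path. ASSUMPTION: $dir contains no spaces
# (glob splits its pattern on whitespace).
sub process_dir {
    my ($dir, $out_path) = @_;
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    for my $file (glob "$dir/*.xml") {
        open my $in, '<', $file or die "Cannot read $file: $!";
        while (my $line = <$in>) {
            next unless $line =~ /<test\b/;
            my @values = map { attr_value($line, $_) // '' } qw(office state filter);
            print {$out} join("\t", @values), "\n";
        }
        close $in;
    }
    close $out;
}

# Demonstration on the sample line from the question.
my $sample = '<test office="U.S.House" state="US" more text here="dont want this"  filter="Unfiltered" more useless text="here" />';
my @found = map { attr_value($sample, $_) } qw(office state filter);
print join("\t", @found), "\n";

# Run against a directory only when one is given on the command line.
process_dir($ARGV[0], $ARGV[1] // 'attributes.txt') if @ARGV;
```

Looking up each attribute separately means the order of the attributes inside the tag does not matter, which a single three-capture pattern would depend on.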



