You are here:

Excel/Copy and Paste 1 row from 45,000 xml files

Advertisement


Question
My goals are:

1. Read a directory and subdirectories
2. Open each *.xml file.
3. For each xml file that is opened, copy row 3 to another activework (“MyWork”) starting at Row 1.
4. Using - Lastrow = cells(rows.count,1).end(xlup).row
5. Final Results: 45,000 lines copied in activework(“MyWork”) one after the other, 1 line for each xml file which is Row 3

Problems:
1.  can’t get the MyPath to read the subdirectories
2.  can’t get the Row 3 from the open xml to copy to the activeworkbook.

My Code:

Sub SelectNodes()
Dim MyPath as string
Dim MyFile as string
Dim wbOpen as Workbook
Dim lastRow as long

Note: There are 2 more subdirectories after FFFFFFF, and then the xml files …….file1.xml

MyPath = “C:\AAAAAA\BBBBBBB\CCCCCC\DDDDDD\EEEEEE\FFFFFFF”

MyFile = Dir(MyPath & “*.xml”)

Do While Len(myFile) > 0

Set wbOpen = Workbooks.Open(Filename:=MyPath & MyFile)

Lastrow=cells(rows.count,1).end(xlup).row

Rows(“3:3”).Select    ....selects from open xml file

Selection.copy

Windows(“MyWork.xlxs”).Activate

Ranges(“A1:A” * lastrow).Select

ActiveSheet.Paste

MyFile = Dir

Close MyFile …. Xml file

Loop

End sub

Answer
Achilles,

Here is some code which I think will work.  I used as much of your code (and coding style) as I could so it would not look to foreign to you.  I integrated that with code that does a recursive search of a directory structure.  I wasn't sure on your comment on where your XML files are located. Your path

MyPath = "C:\AAAAAA\BBBBBBB\CCCCCC\DDDDDD\EEEEEE\FFFFFFF\"

would mean that the first level of XML files are in the FFFFFFF subdirectory.  If that is not the case, adjust the path to the highest level subdirectory that includes all the XML files below it (either in that subdirectory or lower subdirectories).  

Also, I have never tried opening an XML file directly in Excel.  I am taking your word that Excel can open a file with an XML extension directly and put your desired line in row 3.

I have tested the code and know that it will list all the files specified.  I can't test the code of yours that I put into the procedure to do the opening and copying of workbooks.  


Dim aFiles() As String, iFile As Integer


Sub SelectNodes()
Dim MyPath As String
Dim MyFile As String
Dim wbOpen As Workbook
Dim lastRow As Long
Dim Counter As Integer
Dim sh As Worksheet

iFile = 0
ReDim aFiles(1 To 1)
MyPath = "C:\AAAAAA\BBBBBBB\CCCCCC\DDDDDD\EEEEEE\FFFFFFF\"
ListFilesInDirectory MyPath

Windows("MyWork.xlxs").Activate
Set sh = ActiveSheet
sh.Cells.ClearContents
For Counter = 1 To iFile
  'Worksheets("Sheet1").Cells(Counter, 1).Value = aFiles(Counter)
   Set wbOpen = Workbooks.Open(Filename:=aFiles(Counter))
   ' get lastrow from sheet where data will be pasted
   lastRow = sh.Cells(sh.Rows.Count, 1).End(xlUp).Row + 1
   Rows(3).Select   ' ....selects from open xml file
   Selection.Copy
   Windows(“MyWork.xlxs”).Activate
   Range("A1:A" & lastRow).Select
   ActiveSheet.Paste
   wbOpen.Close SaveChanges:=False
Next
sh.rows(1).EntireRow.Delete

End Sub


Sub ListFilesInDirectory(Directory As String)
Dim aDirs() As String, iDir As Integer, stFile As String


' use Dir function to find files and directories in Directory
' look for directories and build a separate array of them
' note that Dir returns files as well as directories when vbDirectory specified
iDir = 0
stFile = Directory & Dir(Directory & "*.*", vbDirectory)
Do While stFile <> Directory
 Debug.Print stFile
  If Right(stFile, 2) = "\." Or Right(stFile, 3) = "\.." Then
    ' do nothing - GetAttr doesn't like these directories
  ElseIf GetAttr(stFile) = vbDirectory Then
    ' add to local array of directories
    iDir = iDir + 1
    ReDim Preserve aDirs(iDir)
    aDirs(iDir) = stFile
  Else
    ' add to global array of files
    ' restrict actions to files that end in xml
    If UCase(Right(stFile, 3)) = "XML" Then
    iFile = iFile + 1
    ReDim Preserve aFiles(1 To iFile)
    aFiles(iFile) = stFile
    End If
  End If
  stFile = Directory & Dir()
Loop


' now, for any directories in aDirs call self recursively
If iDir > 0 Then
  For iDir = 1 To UBound(aDirs)
    ListFilesInDirectory aDirs(iDir) & Application.PathSeparator
  Next iDir
End If
End Sub


The code consists of a header declaration and two routines.  You should copy it all and paste it into a new module.  You would run the macro SelectNodes

--
Regards,
Tom Ogilvy

About Excel
This topic answers questions related to Microsoft Excel spreadsheet (or workbook) stand-alone or Mircrosoft Office Excel including Excel 2003, Excel 2007, Office 2000, and Office XP. You can get Excel help on Excel formulas(or functions), Excell macros, charting in Excel, advanced features, and the general use of Excel. This does not provide a general Excel tutorial nor the basics of using a spreadsheet. It provides specific answers to using Microsoft Excel only. If you do not see your Excel question answered in this area then please ask an Excel question here

Excel

All Answers


Answers by Expert:


Ask Experts

Volunteer


Tom Ogilvy

Expertise

Selected as an Excel MVP by Microsoft since 1999. Answering Excel questions in Allexperts since its inception in 2001. Able to answer questions on almost all aspects of Excel's internal capabilities. If seeking a VBA solution, please specify that in your question itself so I give you the answer you want. [Excel has weak protection - if you are distributing an application, I don't answer questions on how to protect your project from your users.]

Experience

Extensive experience.

Education/Credentials
Master of Science (MS) degree Operations Research (ORSA)

Awards and Honors
Microsoft MVP in Excel.

©2016 About.com. All rights reserved.