Excel/Sorting lines in one document into search-term-entitled smaller documents.
QUESTION: I have several hundred text files consisting of lines taken from anonymous medical reports, each in a directory by itself.
I also have a list of search terms:
NAME: SEARCH TERMS.txt
1. Open both files in TextPad
2. Copy the first search term from the list (ex: 'apple')
3. Open FIND in TextPad, then MARK ALL lines containing 'apple' in SOURCE1.txt.
4. Then go to EDIT>CUT OTHER>BOOKMARKED LINES
5. PASTE the lines from step 4 into a new document
6. SAVE AS apple.txt
7. Repeat until I reach the end of the list of search terms
Then, in the original folder, I will have:
SOURCE1.txt (with all terms in the text files below removed)
I am OK with TextPad macros, MS Word 2007 or Notepad++, or whatever you suggest for this task. Also, if there are too many steps here, or this is too complicated, I am willing to look at simply rearranging the lines in SOURCE1.txt into groups that I can manually save as apple.txt, melon.txt, etc. I am just looking for a way to automate all or part of this process to use my time more efficiently. After I am finished with this part, I have to edit every resultant file to find the best expressions for each term. That part will take a long time, so I want to cut out the first steps if possible.
As always, if this task is not up your alley, please refer me to someone who specializes in this type of thing. Many thanks!
Jonathan in Buffalo, Minnesota, USA
ANSWER: Hi Jonathan.
I believe the "find" utility in Windows should help you with this, but I need to do some testing to confirm it.
You can try it out as well: type "find /?" at the command prompt to see the help for that command.
---------- FOLLOW-UP ----------
QUESTION: This utility searches for files. Remember, I am "harvesting" lines from within a single document and creating multiple documents, one with each search term. If this is what you mean, I will investigate further. I look forward to your answer, if you have one.
ANSWER: Here are two examples. One pulls out the lines that contain the word apple, and the other pulls out the lines that do not contain the word apple.
C:\Documents and Settings\user>copy con sample.txt
1 file(s) copied.
C:\Documents and Settings\user>find "apple" sample.txt
C:\Documents and Settings\user>find/?
Searches for a text string in a file or files.
FIND [/V] [/C] [/N] [/I] [/OFF[LINE]] "string" [[drive:][path]filename[ ...]]
  /V         Displays all lines NOT containing the specified string.
  /C         Displays only the count of lines containing the string.
  /N         Displays line numbers with the displayed lines.
  /I         Ignores the case of characters when searching for the string.
  /OFF[LINE] Do not skip files with offline attribute set.
  "string"   Specifies the text string to find.
  [drive:][path]filename
             Specifies a file or files to search.
If a path is not specified, FIND searches the text typed at the prompt
or piped from another command.
C:\Documents and Settings\user>find /v "apple" sample.txt
C:\Documents and Settings\user>
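The two FIND commands above split a file into the lines that contain a term and the lines that do not. If a script is easier to work with than the command prompt, the same split might be sketched in Python as below. The function name split_lines and the sample lines are made up for illustration and are not part of the original exchange. As with FIND, the match is a plain substring test and is case-sensitive unless asked otherwise.

```python
def split_lines(lines, term, ignore_case=False):
    """Return (matching, remaining) for one search term.

    matching  ~ what `find "term" file` prints
    remaining ~ what `find /v "term" file` prints
    ignore_case plays the role of the /I switch.
    """
    needle = term.lower() if ignore_case else term
    matching, remaining = [], []
    for line in lines:
        haystack = line.lower() if ignore_case else line
        if needle in haystack:
            matching.append(line)
        else:
            remaining.append(line)
    return matching, remaining


# Hypothetical sample lines, for illustration only.
sample = ["red apple pie", "ripe melon", "Apple juice", "plain toast"]

matching, remaining = split_lines(sample, "apple")
print(matching)   # ['red apple pie']
print(remaining)  # ['ripe melon', 'Apple juice', 'plain toast']
```

Passing ignore_case=True would also catch "Apple juice", which is probably what you want for text taken from reports.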
Do let me know if this helps.
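If running FIND once per term by hand is still too slow, the whole cut-and-paste routine from the question could be automated along these lines. This is only a rough sketch under several assumptions that the question does not confirm: the terms file holds one plain term per line, both files are UTF-8 text, matching should ignore case, and every term is usable as a file name. The function name harvest is hypothetical.

```python
from pathlib import Path


def harvest(source_path, terms_path, out_dir=None):
    """Move the lines that contain each search term into <term>.txt.

    Terms are processed in list order. Once a line has been moved it is
    gone from the pool, so a line matching several terms ends up in the
    file of the first one, just as with the manual cut-and-paste steps.
    Returns the lines left over in the source file.
    """
    source = Path(source_path)
    out_dir = Path(out_dir) if out_dir else source.parent
    remaining = source.read_text(encoding="utf-8").splitlines()
    terms = Path(terms_path).read_text(encoding="utf-8").splitlines()

    for term in (t.strip() for t in terms):
        if not term:
            continue  # skip blank lines in the terms file
        needle = term.lower()  # assumption: case-insensitive matching
        matched = [ln for ln in remaining if needle in ln.lower()]
        if not matched:
            continue  # no file is created for a term with no hits
        remaining = [ln for ln in remaining if needle not in ln.lower()]
        out_file = out_dir / f"{term}.txt"
        out_file.write_text("\n".join(matched) + "\n", encoding="utf-8")

    # This overwrites the source file, so try it on a copy first.
    leftover = "\n".join(remaining) + ("\n" if remaining else "")
    source.write_text(leftover, encoding="utf-8")
    return remaining
```

Calling harvest("SOURCE1.txt", "SEARCH TERMS.txt") from the folder that holds both files should leave apple.txt, melon.txt and so on next to a trimmed SOURCE1.txt. Since the function rewrites the source file, it is worth testing on a copy of one directory before looping over all of them.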