C/Help needed to write a program in c

Question
Hi,

I need to write a C program to remove duplicates from a file.

I have a wordlist.txt file with around 700k words. There are lots of duplicates in the file, and I need to write a program in C to remove them.

I have just started learning C. I have written only the start of the program, using books and websites as references, and I don't know if it's correct. Please take a look at the code below and tell me how to write this program. Thanks.

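#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

/* Print usage information (this helper was called but never defined) */
static void syntax(void)
{
   fprintf(stderr, "Usage: program <wordlist file>\n");
}

int main(int argc, char *argv[])
{
   FILE *fp;
   long lfilelen;         /* Length of file */
   char *cFile;          /* Dynamically allocated buffer ( entire file ) */
   size_t nread;         /* Number of bytes actually read */

   if(argc != 2)
   {
       syntax();
       return 1;
   }

   fp = fopen(argv[1], "r"); /* Open file in text mode */

   if(fp == NULL)          /* Could not open file */
   {
       printf("Error opening %s: %s (%d)\n", argv[1], strerror(errno), errno);
       return 1;
   }

   fseek(fp, 0L, SEEK_END); /* Position to end of file */
   lfilelen = ftell(fp);    /* Get file length */
   rewind(fp);          /* Back to start of file */

   if(lfilelen < 0)        /* ftell failed */
   {
       printf("Error reading length of %s\n", argv[1]);
       fclose(fp);
       return 1;
   }

   cFile = calloc(lfilelen + 1, sizeof(char));

   if(cFile == NULL)
   {
       printf("\nInsufficient memory to read file\n");
       fclose(fp);
       return 1;
   }

   /* Read entire file into cFile; in text mode fewer bytes than
      lfilelen may be returned, so terminate at the actual count */
   nread = fread(cFile, 1, lfilelen, fp);
   cFile[nread] = '\0';
   fclose(fp);

   /* Duplicate removal goes here */

   free(cFile);
   return 0;
}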

Answer
Hi Nildeep,

So far your program looks good. By the end of it you have read the entire file into the buffer. Now you just have to think about how to check for duplicates and how to remove them.

I assume that by duplication you mean the same word occurring more than once. To start with, you can write an O(n^2) algorithm using two nested loops that checks whether each word occurs more than once. If it does, overwrite every later occurrence of that word (for example, replace the word with spaces).

This way you can remove the duplicates.
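
The two-loop idea could be sketched roughly as follows. This is only an illustration, not a finished solution: the function names split_words and remove_duplicates are made up for this example, it assumes the words in the buffer are separated by whitespace, and instead of overwriting duplicates with spaces it compacts an array of word pointers so that only the first occurrence of each word is kept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split buf in place on whitespace and store a
   pointer to each word in words[]. Returns the number of words stored
   (at most max_words). The buffer is modified by strtok. */
size_t split_words(char *buf, char **words, size_t max_words)
{
    size_t n = 0;
    char *tok;

    assert(buf != NULL && words != NULL);
    for (tok = strtok(buf, " \t\r\n");
         tok != NULL && n < max_words;
         tok = strtok(NULL, " \t\r\n"))
        words[n++] = tok;
    return n;
}

/* Hypothetical helper: O(n^2) duplicate removal using two loops.
   Keeps the first occurrence of each word, preserves the original
   order, and returns the number of unique words now at the front
   of words[]. */
size_t remove_duplicates(char **words, size_t n)
{
    size_t i, j, kept = 0;

    for (i = 0; i < n; i++) {
        int seen = 0;
        /* Compare against every word already kept */
        for (j = 0; j < kept; j++) {
            if (strcmp(words[i], words[j]) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            words[kept++] = words[i];
    }
    return kept;
}
```

In the program above, you would call split_words on cFile after the fread, then remove_duplicates on the resulting array, and finally write the remaining words to an output file. With around 700k words the nested loops may take a long time, but the approach is easy to check for correctness.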
Try this first; later we can go over other algorithms that make it run faster.
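
As one possible example of such a faster algorithm (the answer does not say which ones it has in mind, so this is an assumption), the words could be sorted with the standard qsort function so that duplicates end up next to each other, and then removed in a single pass. The names cmp_words and sort_unique are made up for this sketch, and note that sorting changes the original word order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: each element is a char * */
static int cmp_words(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp(*pa, *pb);
}

/* Hypothetical helper: sort the word pointers, then keep one copy of
   each run of equal words. Returns the number of unique words now at
   the front of words[]. Sorting dominates the cost, so this is about
   O(n log n) comparisons rather than O(n^2). */
size_t sort_unique(char **words, size_t n)
{
    size_t i, kept;

    assert(words != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(words, n, sizeof words[0], cmp_words);

    kept = 1;
    for (i = 1; i < n; i++) {
        /* After sorting, a duplicate always equals the last kept word */
        if (strcmp(words[i], words[kept - 1]) != 0)
            words[kept++] = words[i];
    }
    return kept;
}
```

If the original order of the word list matters, the two-loop version keeps it, while this one returns the words in sorted order.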

Regards,
Abhishek Kumar

Abhishek Kumar

Expertise

I can answer questions related to basic concepts, arrays, expressions, pointers, and queries related to coding.

Experience

I have a bachelor's degree in Computer Science. For the past 5 years I have been working in C, and I have a good command of most areas of the C language.

Education/Credentials
I did my Bachelor's in Computer Science and Engineering.

©2012 About.com, a part of The New York Times Company. All rights reserved.