| Programming PHP, Perl, Ruby on Rails, AJAX, HTML, XHTML, CSS, JavaScript, MySQL and any other coding topics. |
| | #1 (permalink) |
| DomainersUniversity.com Team Leader | Delete Duplicates in a CSV file? I have a CSV file containing 36,000 records. Each record contains two fields: email address and first name. The problem is that there are many duplicate email addresses. What is the simplest way to delete the dupes? I could bring them into a spreadsheet, sort, and delete manually, but I don't have 5 hours to waste. Ideas? |
| |
| | #2 (permalink) |
| Senior Member | Make a simple PHP script to do it: PHP Code: That should work... tell me if it doesn't and I'll see what I can do.
__________________ Hacksar.com - Your source for random computer tips and tricks! MySiteMemberships.com - Keep track of your site registration information! Like my post? Rep is appreciated! |
| |
| | #4 (permalink) |
| Senior Member | Oh sorry XD. Just make a PHP file, put it on some web host that supports PHP, and put your CSV file in the same folder as the PHP file you made. The other.csv is just a file that the script will create to save the updated version (without duplicates; I have a habit of NEVER overwriting old files). If you can't get this to work, I can upload it to my own host and you can do it then. I also found this PHP compiler (bambalam) which basically turns your PHP code into an exe. It's my newfound love, it's great. So if worst comes to worst, I can make you an exe that you can just run (only if you want). Edit: I just realized, if you have duplicated emails, are the names also duplicated, or are they different? If they differ, the script would be slightly different.
Last edited by nasaboy007; 08-11-2008 at 04:16 PM. |
| |
| | #6 (permalink) |
| DomainersUniversity.com Team Leader | Thanks! I'm gonna try it out. EDIT: So I created a web page called script.php, containing nothing but the above code. I placed my file names.csv in the same folder. I then entered the url of the script into my browser and hit enter. All I got was a display of the code. What am I doing wrong? Last edited by Gene; 08-11-2008 at 04:18 PM. |
| |
| | #7 (permalink) |
| Senior Member | Oh sorry, you need to put <?php at the beginning and ?> at the end (wrapping it in PHP tags). So like this: PHP Code: If it still doesn't work, give me a few lines of the CSV as a sample so I can see exactly what needs to be done. |
| |
| | #8 (permalink) |
| DomainersUniversity.com Team Leader | I shoulda known that. Okay, so I added the tags. Now when I run it I get:
Fatal error: Function name must be a string in /home/username/public_html/names/script.php on line 9
aaaaaaa@juno.com,AL
bbbbbbb@Yahoo.Com,ALACITA
ccccccc@yahoo.com,ALADAS
Code:
1 <?php
2
3 $filename = "names.csv";
4 $file = fopen($filename, "r");
5 $read = fread($file, filesize($filename));
6
7 $split = array_unique(explode("\n", $read));
8
9 $fclose($file);
10
11 $filename2 = "other.csv";
12 $file2 = fopen($filename2, "a");
13
14 foreach($split as $key=>$value) {
15 if($value != "") {
16 fwrite($file2, $value . "\n");
17 }
18 }
19
20 fclose($file2);
21
22 ?>
Last edited by Gene; 08-11-2008 at 04:31 PM. |
| |
| | #10 (permalink) |
| DomainersUniversity.com Team Leader | Yes, like this:
aaaaaaa@juno.com,AL
bbbbbbb@Yahoo.Com,ALACITA
ccccccc@yahoo.com,ALADAS
36,000 of them. Thanks for your help nasaboy007... I'll be back tomorrow. Gotta log off now. Rep added. |
| |
| | #11 (permalink) |
| Senior Member | D'oh, I put in a $ for the fclose on line 9. Although it still doesn't seem to work... let me get it working on localhost and then I'll post it up. EDIT: OK, this should work. The only thing is, open the CSV file, go to the end of the last line, and hit Enter (adding another line break). I don't know why, but if there isn't an extra line at the end of the file and the last line is one of the duplicates, it won't be removed. Adding the extra line fixes that. PHP Code:
Last edited by nasaboy007; 08-11-2008 at 05:04 PM. |
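One possible explanation for the trailing-newline quirk, offered as an assumption rather than something confirmed in the thread: if the CSV was saved with Windows line endings, splitting on "\n" leaves a carriage return on every line except an unterminated last one, so that last line never compares equal to its duplicates. A minimal sketch for checking this from a shell follows; the helper name `count_cr_lines` is hypothetical.

```shell
# Hypothetical helper: report how many lines of a file end with a
# carriage return (the "\r" left over from Windows "\r\n" line endings).
count_cr_lines() {
    # $1: path to the CSV file to inspect
    awk '/\r$/ { n++ } END { printf "%d of %d lines end with CR\n", n, NR }' "$1"
}

# Example usage with the file name from the thread:
# count_cr_lines names.csv
```

If the count is one less than the total, the last line is the odd one out, which would match the behavior described above.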
| |
| | #12 (permalink) |
| NamePros Legend | If you have Excel, you can try one of these Excel add-ons to remove duplicate entries, cells, or entire rows:
Duplicate Manager 1.1 http://www.download.com/Duplicate-Ma...dlPid=10752056
ConnectCode Duplicate Remover 1 http://www.download.com/ConnectCode-...dlPid=10785494
HTH |
| |
| | #13 (permalink) |
| Domains my Dominion | If you like Unix shell scripting: assuming both email and first name are the same (duplicate lines are identical), and assuming your CSV file is named file.csv and located in folder /var:
Code: sort /var/file.csv | uniq > output.csv
If the first names are not identical across dupes, this one looks only at the email addresses. But the output file will contain only email addresses; the first-name column gets discarded.
Code: cut -d "," -f 1 /var/file.csv | sort | uniq > output.csv
__________________ Buy now - MassDeveloper.com $500 |
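A minimal sketch of how the shell approach could be extended to de-duplicate on the email address alone while keeping the first-name column, assuming the two-column layout shown in the thread (email first, then first name). The function name `dedupe_by_email` is hypothetical. Treating addresses case-insensitively (so `Yahoo.Com` and `yahoo.com` match) and stripping a trailing carriage return are assumptions about what the data needs, not something the thread confirms.

```shell
# Hypothetical helper: keep only the first row seen for each email address.
dedupe_by_email() {
    # $1: input CSV with "email,first name" rows
    awk -F',' '
        { sub(/\r$/, "") }        # drop a trailing carriage return, if any
        $1 == "" { next }         # skip blank lines
        !seen[tolower($1)]++      # print a row only the first time its email appears
    ' "$1"
}

# Example usage with the file names used earlier in the thread:
# dedupe_by_email names.csv > other.csv
```

Because it remembers emails in an array instead of sorting, this keeps the rows in their original order and retains the first name from the first occurrence of each address.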
| |
| | #14 (permalink) | |
| DomainersUniversity.com Team Leader | This seems to have worked! It failed at first but I needed to change file permissions, then it worked. Many thanks! Quote:
| |
| |
| | #15 (permalink) |
| Senior Member | Great, glad I could be of assistance! |
| |