For this guide we will take thousands of CSV files and combine them into one single file, which will be uploaded to a database. The combined file will have a new column that holds the originating filename for each row. When I scrape sites, my output files are often split up, and after I clean them up I need to combine them. This is how I do it.
On my Windows 10 computer, I will copy my CSV files over to a secondary drive, leaving the originals intact.
Then I will open a command prompt as an administrator and navigate to that drive. If the files are on my H: drive, I will navigate there by typing
H:
Then press Enter
Now I will navigate to the “dataset” folder
cd dataset
I will then list the directory contents to confirm that this folder holds my CSV files
dir
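Once I see all of my CSV files I will enter the following on a single line (the delims= and usebackq options keep filenames that contain spaces intact):
for /f "delims=" %a in ('dir /b *.csv') do for /f "usebackq tokens=*" %b in ("%a") do echo %b,%a>> combined-date.csv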
Then press Enter
This will create the combined file. If “date” in the command is replaced with today’s date, the file will be called combined-04132021.csv
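For anyone who would rather run a script, the same combine step can be sketched in Python. This is only a minimal sketch under some assumptions, not the method used above: the function name combine_csv_files and its parameters are placeholders of my own, and the files are assumed to be UTF-8 text. Unlike the command above, it overwrites the output file instead of appending to it, and it uses the csv module so that quoted fields containing commas stay intact.

```python
import csv
from pathlib import Path


def combine_csv_files(folder, output_name):
    """Write every row of every CSV in `folder` to one output file,
    adding the source filename as a final column."""
    folder = Path(folder)
    output_path = folder / output_name
    # List the inputs before creating the output, and exclude the output
    # by name, so the combined file is never read back into itself.
    sources = sorted(p for p in folder.glob("*.csv") if p.name != output_name)
    # Mode "w" replaces an existing output file (the cmd version appends).
    with output_path.open("w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for source in sources:
            with source.open(newline="", encoding="utf-8") as in_file:
                for row in csv.reader(in_file):
                    if row:  # skip blank lines
                        writer.writerow(row + [source.name])
    return output_path


# Example call (illustrative paths, matching the folder used in this guide):
# combine_csv_files(r"H:\dataset", "combined-04132021.csv")
```

Header rows are not treated specially here, so if each source file starts with a header line, that line appears once per file in the combined output, just as it does with the command-prompt version.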
And I can then upload that file to my database easily
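The upload step depends on which database is in use, which this guide does not specify. As a rough illustration only, here is a sketch that loads the combined file into a local SQLite database using Python's built-in sqlite3 module. SQLite is a stand-in of my choosing, and the function name load_combined_csv, the table name combined_rows, and the generic column names col1, col2, and so on are all hypothetical. It assumes every row has the same number of columns.

```python
import csv
import sqlite3


def load_combined_csv(csv_path, db_path, table="combined_rows"):
    """Insert every row of the combined CSV into a SQLite table.

    Returns the number of rows inserted.
    """
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        rows = [row for row in csv.reader(csv_file) if row]
    if not rows:
        return 0
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("rows have differing column counts")
    # Generic column names, because the real schema is not known here.
    # The last column is the source filename added by the combine step.
    columns = ", ".join(f"col{i + 1} TEXT" for i in range(width))
    placeholders = ", ".join("?" for _ in range(width))
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
            conn.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})", rows
            )
    finally:
        conn.close()
    return len(rows)
```

For a server database such as MySQL or PostgreSQL, the bulk import tool that ships with that database would normally take the place of this function.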