Splitting a large csv file into chunks

Recently I had to split a 30-thousand-line CSV file into smaller files of 1 thousand lines each. Naturally I looked for a tool that already does this, and I found that the split utility comes installed on most Ubuntu versions (sweet)!

So I removed the first line (which is the header) and went to town.
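
For the record, here is one way that step could look. It is only a minimal sketch: sample.csv, sample-body.csv and the generated rows are made-up stand-ins so the snippet runs on its own, not the real 30k-line file.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the real file: one header line plus five data rows.
printf 'header1, header2\n' > sample.csv
seq 1 5 | sed 's/$/, value/' >> sample.csv

# Remember the header for later, then copy everything from line 2 onward.
header=$(head -n 1 sample.csv)
tail -n +2 sample.csv > sample-body.csv
```

Writing to a new file keeps the original intact. With GNU sed, sed -i '1d' on the file would drop the header in place instead.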

On the first try:

split -l 1000 -d 30k-lines-file.csv 1k-lines-file-

OK, but it produced the following file names:

1k-lines-file-00
1k-lines-file-01
1k-lines-file-02
...

Not good. How do I get back the .csv extension?

A little more googling and I found a patch from the split mailing list from 2007! Apparently there is an option, --additional-suffix:

split -l 1000 -d --additional-suffix=.csv 30k-lines-file.csv 1k-lines-file-

Ahh... a little better
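
Before going further, it can be worth checking that nothing got lost in the split. The following is a rough sketch, assuming GNU split. body.csv and the part- prefix are hypothetical names, and the file is generated on the spot so the snippet is self-contained.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical header-less input: 2500 lines.
seq 1 2500 > body.csv

# Same split invocation as above, just with stand-in names.
split -l 1000 -d --additional-suffix=.csv body.csv part-

# Show the line count of each chunk.
wc -l part-*.csv

# Glue the chunks back together (the glob sorts them by suffix)
# and compare the result byte for byte with the input.
cat part-*.csv | cmp - body.csv && echo "chunks match the original"
```

The last chunk simply holds whatever is left over, so it will usually be shorter than the others.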

Now to add the header back as the first line of each file (substitute your own header text for header1, header2):

sed -i '1 i header1, header2' 1k-lines-file-*.csv
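
If you would rather not type the header by hand, the whole thing can be strung together so the header is read from the original file. This is only a sketch under a few assumptions: GNU split with --additional-suffix is available, and input.csv, the chunk- prefix and the generated rows are hypothetical stand-ins for the real data.

```shell
set -eu

# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input: one header line plus 2500 data rows.
printf 'header1, header2\n' > input.csv
seq 1 2500 | sed 's/$/, value/' >> input.csv

# Grab the header straight from the file.
header=$(head -n 1 input.csv)

# Feed everything after the header to split; "-" means read from stdin.
tail -n +2 input.csv | split -l 1000 -d --additional-suffix=.csv - chunk-

# Put the header back on top of every chunk.
for f in chunk-*.csv; do
    { printf '%s\n' "$header"; cat "$f"; } > "$f.tmp"
    mv "$f.tmp" "$f"
done
```

Here the original file is never modified, and each chunk ends up with the header plus up to 1000 data rows.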

That's all folks!


