Recently I had to split a 30,000-line CSV file into smaller files of 1,000 lines each. Naturally I looked for a tool that already does this, and I found that the split utility comes installed on most Ubuntu versions (sweet)!
So I removed the first line (which is the header) and went to town.
On the first try:
split -l 1000 -d 30k-lines-file.csv 1k-lines-file-
OK, but it produced the following file names (with -d the numeric suffixes are two digits wide and start at 00):
1k-lines-file-00
1k-lines-file-01
1k-lines-file-02
...
Not good. How do I get back the .csv extension?
A little more googling and I found a patch from the split mailing list from 2007! Apparently there is a parameter called --additional-suffix:
split -l 1000 -d --additional-suffix=.csv 30k-lines-file.csv 1k-lines-file-
Ahh... a little better
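To see the naming scheme without touching the real file, here is a small sketch that runs the same flags on throwaway data. The names sample-lines.csv and sample-chunk- are made up for illustration, and the chunk size is 10 instead of 1000. It assumes GNU split, since --additional-suffix is a GNU option.

```shell
# Sample data standing in for the real file: 25 lines, no header.
seq 1 25 > sample-lines.csv

# Same flags as the command above, with 10 lines per chunk instead of 1000.
split -l 10 -d --additional-suffix=.csv sample-lines.csv sample-chunk-

# List the results: sample-chunk-00.csv, sample-chunk-01.csv, sample-chunk-02.csv
ls sample-chunk-*.csv
```

The last chunk just holds whatever is left over (5 lines here), so the line count does not need to divide evenly.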
Now to add the header back as the first line of each file (header1, header2 stands in for the real header row):
sed -i '1 i header1, header2' 1k-lines-file-*.csv
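If you would rather not delete the header by hand or retype it in the sed command, the whole thing can probably be chained together. This is only a sketch under a few assumptions: input-with-header.csv and the part- prefix are hypothetical names, the sample data is generated on the spot, chunks are 10 lines rather than 1000, and GNU split is available.

```shell
# Hypothetical sample: one header line plus 25 data rows.
{
  echo "header1,header2"
  for i in $(seq 1 25); do echo "$i,value$i"; done
} > input-with-header.csv

# Remember the header, then split only the data rows (10 per chunk here).
# The "-" tells split to read from standard input.
header=$(head -n 1 input-with-header.csv)
tail -n +2 input-with-header.csv | split -l 10 -d --additional-suffix=.csv - part-

# Prepend the saved header to each chunk via a temporary file.
for f in part-*.csv; do
  { printf '%s\n' "$header"; cat "$f"; } > "$f.tmp" && mv "$f.tmp" "$f"
done
```

Reading the header with head means the original file stays untouched, and every chunk gets exactly the header the source file had.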
That's all folks!
Cover photo by Jaxon Lott on Unsplash