After waiting for Google Fusion Tables to upload a largish file (70MB) with a .csv suffix, I noticed that the file was actually tab separated (hint: head filename.csv previews the first 10 rows on the Unix command line, and head -n 20 filename.csv > testfile.csv writes the first 20 rows to a smaller test file) and that, at the moment, Google Fusion Tables only likes comma separated (CSV) text file uploads. So I was wondering: what's an easy way of converting the data in a tab separated file to a comma separated file? Note that the data cells may contain commas, and may also contain control characters, etc.
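A quick way to catch this before uploading is to count the delimiters on the first row. This is only a rough sketch: count_delims is a made-up helper name, and it assumes the first row is representative of the rest of the file.

```shell
# Hypothetical helper: report how many tabs and commas appear on the
# first line of the file given as $1.
count_delims() {
    first=$(head -n 1 "$1")
    # tr -cd deletes every character except the one listed, so wc -c
    # then counts only that character. The trailing tr strips the
    # padding some wc implementations add.
    tabs=$(printf '%s' "$first" | tr -cd '\t' | wc -c | tr -d ' ')
    commas=$(printf '%s' "$first" | tr -cd ',' | wc -c | tr -d ' ')
    printf 'tabs=%s commas=%s\n' "$tabs" "$commas"
}
```

Running count_delims filename.csv and seeing plenty of tabs but few commas is a strong hint that the .csv suffix is misleading.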
|
TSV to CSV is less of a misery than most conversions, but the solution depends on how irritating your data is. Sed is definitely the way to go on this:
And at this point you're so far down the rabbit hole that you probably want to start over.
Tab problem: are you using a Mac? I despise their sed.
Thanks for that; just one note: on my Mac (;-) I needed to escape the escape when escaping the ", that is, three backslashes before the replacement quote: sed 's/"/\\\"/g'
(03 Jun '11, 12:39)
psychemedia ♦♦
|
To answer my own question:
Originally, I tried the sed expression 's/^/"/; s/$/"/; s/\t/","/g;' but my version of sed didn't like it; instead I replaced the \t with a literal tab, typed as Ctrl-V TAB (from http://www.unix.com/unix-dummies-questions-answers/71183-why-tab-could-not-recognized-grep-sed.html ).
PS By the by, whilst looking around for a solution, I came across the xlrd Python library, which looks as if it can be used to read Excel spreadsheets.
Doh - I also need to escape any pre-existing " before adding more...
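Putting those pieces together, something along these lines might work. It is a sketch rather than a tested recipe for every sed: tsv_to_csv is a made-up function name, it assumes the input has no quoting of its own and no line breaks inside cells, and it escapes existing quotes by doubling them (the usual CSV convention) rather than with a backslash. The tab is produced with printf, so nothing depends on sed understanding \t.

```shell
# Hypothetical helper: read TSV on stdin, write fully quoted CSV on stdout.
tsv_to_csv() {
    # Hold a literal tab character so the script does not rely on sed
    # interpreting \t, which not every sed does.
    tab=$(printf '\t')
    # 1. Double any pre-existing quotes before adding new ones.
    # 2. Add an opening quote at the start of each line.
    # 3. Add a closing quote at the end of each line.
    # 4. Turn every tab into quote-comma-quote.
    sed -e 's/"/""/g' \
        -e 's/^/"/' \
        -e 's/$/"/' \
        -e "s/${tab}/\",\"/g"
}
```

Usage would be tsv_to_csv < filename.csv > converted.csv (converted.csv is just an example name). Because every cell ends up quoted, commas inside cells are harmless; control characters other than tab and newline pass through unchanged.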
(03 Jun '11, 12:42)
psychemedia ♦♦
|
Something like this, perhaps? https://gist.github.com/1006465 (PHP; converts between the two formats using an actual CSV parser)
Get the Data