How to read large files on a Mac (CSV, JSON, etc.)
I was playing with web scraping the other day and ended up with a big JSON file full of data.
I wanted to quickly have a look inside and see if everything was fine, if I was collecting everything I needed, if the structure was correct.
The file wasn’t even that big, around 20 MB. But there was a lot of nesting in the data, and whatever I tried to use just froze my (usually effortlessly powerful) Mac for minutes, until I force-quit the apps I had tried to use.
So I converted the file to CSV instead to try my luck with that. It ended up having 200+ columns and 80k+ rows. Reasonable, or so I thought.
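A minimal sketch of how this kind of conversion can work, assuming the JSON is a list of records. The names `flatten` and `records_to_csv` are hypothetical helpers for illustration, not the script actually used here. It shows why heavy nesting blows up the column count: every nested key and every list index becomes a column of its own.

```python
import csv
import io
import json


def flatten(value, prefix=""):
    """Turn nested dicts/lists into one flat dict with dotted keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot from the accumulated path.
        flat[prefix.rstrip(".")] = value
    return flat


def records_to_csv(records):
    """Flatten each record and write all of them as CSV text."""
    rows = [flatten(record) for record in records]
    # Collect every column name seen in any row, in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns, out.getvalue()


# Made-up sample data, only to show the shape of the result.
sample = json.loads(
    '[{"id": 1, "tags": ["a", "b"], "author": {"name": "x"}},'
    ' {"id": 2, "tags": ["c"]}]'
)
columns, text = records_to_csv(sample)
print(columns)  # ['id', 'tags.0', 'tags.1', 'author.name']
```

Two tiny records already produce four columns, so a few levels of nesting across 80k+ rows can plausibly reach 200+.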
Nothing worked. Excel and Apple’s Numbers froze and gave up completely. “Outsourcing” it to a Google Spreadsheet didn’t work either. I was stuck.
Time to hit Google.
“How to open big files on a mac” gave surprisingly little, and most of it was 10+ years old.
After a much too long search, on some forum or Stack Overflow, I don’t even remember now, I finally found the solution. It didn’t sound good at first, but I tried it and it worked.
It’s a simple, free tool: Hex Fiend.
On their website, it says it’s “A fast and clever open source hex editor for Mac OS X.” I didn’t want to edit any hex, I just wanted to see what’s inside my (not even that large) file.
Furthermore, the website looks like it was built in the 90s and hasn’t been touched since. Which may very well be the case.
More importantly, though, the tool works and is blazingly fast, and when you turn off the hex view, what’s left is the pure file contents.
It worked perfectly with my file, but I wanted to see if it could easily handle much larger files. It handled all of them equally fast.
Here’s a 190 MB JSON file:
This is how the file looks when you first open it in Hex Fiend.
But when you uncheck “Views > Hexadecimal”, it’s just the file contents.
I tested it on a few more files, e.g. a 117 MB CSV file. It opened like it was a 3 KB txt file. Perfect. Even when quickly scrolling through the whole file, it was absolutely instantaneous.
So there you have it. I’m writing this down so I don’t forget what to use to have a peek inside any huge file, and for anyone else who needs it.