
Handling Large CSV Files for Digital Forensics and Incident Response

I would love to hear about your techniques for large CSVs. Here’s a roundup of tools. I’ve only tested the first seven.

Free Tools

  1. Timesketch/Kibana à la Skadi (Linux)

    • I tend to use Timesketch for its collaboration and multi-timeline capabilities. There are four installation methods: Docker only, OVA, Vagrant, or installer script. Docker-only instructions are here.

  2. Timeline Explorer (Windows)

    • Usually works on large CSVs despite a dialog box pop-up. If it doesn’t, email Eric and he will work with you to troubleshoot. Thank him profusely.

  3. Visual Studio Code + Excel Viewer (Windows, Mac, Linux)

    • Although Visual Studio Code extensions have a hard-coded limit of 50 MB, I mention it for those who already use VS Code.

    • “…use the explorer context menu or editor title menu to invoke the Open Preview command” to put the data into columns

  4. Gnumeric (Linux)

    • Maximum number of rows = 16,777,216

    • WSL Instructions

      • Steps on Windows

      • Steps on Ubuntu

        • apt-get install gnumeric
        • echo export DISPLAY=:0.0 >> ~/.bashrc
        • sed -i 's+<listen>unix:tmpdir=/tmp</listen>+<listen>tcp:host=localhost,port=0</listen>+g' /usr/share/dbus-1/session.conf
        • Reopen Bash

      • Source: Sous-système Windows pour Linux : Ubuntu sur Windows (in English: Windows Subsystem for Linux: Ubuntu on Windows)

  5. Woanware’s LogViewer2 (Windows)

  6. Linux CLI

    • I find myself using mostly the following for log manipulation: grep, awk, cut, rev, sed, sort, uniq, tail, head, cat, wc, tr

  7. Import into Excel Data Model (Windows, Mac)

  8. Import into Access

    • External Data | New Data Source | From File | Text File

  9. sift (Linux)

  10. csvkit (Linux)

  11. liquid Large File Editor (Windows)

  12. VisiData (Mac or Linux)

  13. CSView (Windows or Mac)

    • Handles files > 4 GB

  14. reCsvEdit (Windows, Mac, Linux)

    • Handles files > 1 GB

  15. OpenRefine (Windows, Mac, Linux)
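
As a rough illustration of the Linux CLI entry (item 6), the sketch below chains a few of those utilities (tail, wc, cut, sort, uniq, grep, awk, tr) over a tiny made-up timeline. The file name, column layout, and event values are all hypothetical. It also assumes no quoted fields containing commas, since cut splits naively on every comma; for such files a CSV-aware tool like csvkit is the safer choice.

```shell
#!/bin/sh
set -eu

# Hypothetical sample data; in practice this would be the large CSV under review.
workdir="$(mktemp -d)"
csv="$workdir/timeline.csv"
cat > "$csv" <<'EOF'
timestamp,host,user,event
2019-01-01T10:00:00Z,WS01,alice,logon
2019-01-01T10:05:00Z,WS01,alice,process_start
2019-01-01T10:06:00Z,WS02,bob,logon
2019-01-01T10:07:00Z,WS02,bob,logon_failed
2019-01-01T10:08:00Z,WS02,bob,logon_failed
2019-01-01T10:09:00Z,WS03,carol,logon
EOF

# Count data rows (everything after the header line).
row_count="$(tail -n +2 "$csv" | wc -l | tr -d ' ')"

# Frequency of each value in column 4 (event), most common first.
event_counts="$(tail -n +2 "$csv" | cut -d, -f4 | sort | uniq -c \
    | sort -k1,1nr -k2 | awk '{print $2 "=" $1}')"

# Rows ending in ",logon_failed", reduced to host,user and counted per pair.
failed_logons="$(grep ',logon_failed$' "$csv" | cut -d, -f2,3 | sort | uniq -c \
    | awk '{print $2 ":" $1}')"

echo "rows: $row_count"
echo "$event_counts"
echo "failed: $failed_logons"

rm -rf "$workdir"
```

Because every stage streams line by line (sort spills to temporary files when needed), the same pipeline shape tends to hold up on multi-gigabyte files where a spreadsheet would not open at all.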

Commercial Tools

  1. 010 Editor using Column Mode (Windows, Mac, Linux) | $129.95 (commercial) | $49.95 (home/academic)

    • Handles files > 50 GB

  2. Delimit Pro (Windows) | $49 annually

    • Up to 2 billion rows and 2 million columns

  3. Tablecruncher Pro (Mac) | $29

    • Handles files > 2 GB and 15 million rows
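
Several entries above quote hard limits, such as 50 MB for Visual Studio Code extensions and 16,777,216 rows for Gnumeric. A quick pre-check from the shell can narrow down which tools are worth trying before you attempt to open a file. The following is only a sketch: the helper name `csv_triage` and the sample file are hypothetical, "50 MB" is assumed to mean 50 MiB, and `wc -l` counts newline characters, so quoted fields with embedded newlines would inflate the row estimate.

```shell
#!/bin/sh
set -eu

# Limits as quoted in the list above.
VSCODE_MAX_BYTES=$((50 * 1024 * 1024))   # assumption: "50 MB" treated as 50 MiB
GNUMERIC_MAX_ROWS=16777216

# csv_triage FILE: hypothetical helper that reports size, line count,
# and whether the file fits under each quoted limit.
csv_triage() {
    file="$1"
    bytes="$(wc -c < "$file" | tr -d ' ')"
    lines="$(wc -l < "$file" | tr -d ' ')"   # line count, header included
    echo "bytes=$bytes lines=$lines"
    if [ "$bytes" -le "$VSCODE_MAX_BYTES" ]; then
        echo "vscode=ok"
    else
        echo "vscode=too_large"
    fi
    if [ "$lines" -le "$GNUMERIC_MAX_ROWS" ]; then
        echo "gnumeric=ok"
    else
        echo "gnumeric=too_many_rows"
    fi
}

# Demo on a tiny made-up file.
workdir="$(mktemp -d)"
printf 'timestamp,event\n2019-01-01T10:00:00Z,logon\n2019-01-01T10:05:00Z,logoff\n' \
    > "$workdir/small.csv"
triage_output="$(csv_triage "$workdir/small.csv")"
echo "$triage_output"
rm -rf "$workdir"
```

If a file fails both checks, that points toward the streaming or indexed options in the list (the Linux CLI, Timesketch, or one of the large-file editors) rather than a spreadsheet-style viewer.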

Marcus Thompson | Tools, Logs