> sorting with skipped headers is a mess I like these command line tools, but I ...

cosmojg · on March 16, 2023

It's a matter of perspective.

I like programming languages, but I think they can cripple someone actually learning Unix!

At the end of the day, you should just use whatever tools make you the most productive most quickly.

coldtea · on March 16, 2023

The whole point of UNIX userland is to not have to write a custom program for every simple case that just needs recombining some existing basic programs in a pipeline...

2h · on March 16, 2023

> simple case

That's just it, though: the last example is not a simple case, which is why it is awkward by the commenter's own admission. Command line tools are fine, but you need to know when to set the hammer down and pick up the chainsaw.

coldtea · on March 16, 2023

>the last example is not a simple case

As far as shell scripting goes, this is hardly anything to write home about. Looks simple enough to me.

It just retains the header by printing the header first as is, and then sorting the lines after the header. It's immediately obvious how to do it to anybody who knows about head and tail.
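For anyone who wants the idea spelled out, here is a rough sketch of the same "header first, then sort the rest" logic. It is written in Python only to match the other code in this thread; it is not the original pipeline. The function name `sort_keep_header` and the sample rows are made up for illustration.

```python
def sort_keep_header(lines, column, sep=",", reverse=False):
    """Return lines with the first (header) line kept on top and the rest sorted."""
    if not lines:
        return []
    # Same split as `head -n 1` and `tail -n +2` would give.
    header, body = lines[0], lines[1:]
    # Plain string comparison on a single field (0-based column index).
    body_sorted = sorted(body, key=lambda line: line.split(sep)[column], reverse=reverse)
    return [header] + body_sorted


# Hypothetical data, not the thread's example.csv.
sample = [
    "name,index",
    "b,2",
    "c,3",
    "a,1",
]
print("\n".join(sort_keep_header(sample, column=1, reverse=True)))
```

Like the shell version, this compares fields as strings and splits naively on the separator, so it assumes no quoted commas in the data.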

And with Miller it's even simpler than that, still on the command line...

carb · on March 16, 2023

To me the last example is still simple. When I encounter this in the wild, I don't really care about preserving the header.

  tail -n +2 example.csv | sort -r -k4 -t','

Or, more often, I just do this and ignore the header:

  sort -r -k4 -t',' example.csv

Keeping the header feels awkward, but using `sort` to reverse sort by a specific column is still quicker to type and execute (for me) than writing a program.

gabrielsroka · on March 16, 2023

I thought the Go code looked way too complex and Python would be simpler. Yes and no.

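  import csv
  
  filename = 'example.csv'
  sort_by = 'index'
  reverse = True
  
  # newline='' lets the csv module handle line endings itself
  with open(filename, newline='') as f:
      lines = list(csv.DictReader(f))
  
  # Convert the sort column to int so that 10 sorts after 9, not before 2
  for line in lines:
      line[sort_by] = int(line[sort_by])
  lines.sort(key=lambda line: line[sort_by], reverse=reverse)
  
  # Naive output: no quoting, so fields containing commas will break
  print(','.join(lines[0].keys()))
  for line in lines:
      print(','.join(str(v) for v in line.values()))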

cmdlineluser · on March 16, 2023

Perhaps a `DictWriter` would simplify things:

    import csv
    import sys
    
    filename = "example.csv"
    sort_by = "index"
    reverse = True
    
    with open(filename, newline="") as f:
        reader = csv.DictReader(f)
        writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(sorted(reader, key=lambda row: int(row[sort_by]), reverse=reverse))

gabrielsroka · on March 16, 2023

I thought about that, but 1) it seemed like cheating to write to standard out, and 2) you're assuming that the column to sort by is an integer, whereas I split that conversion into its own step.

But yours has the advantage of being able to support more complex CSVs.
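One possible middle ground between the two versions, as a sketch only: keep `DictWriter` for proper quoting, but write into an in-memory buffer instead of standard out, and make the type conversion a parameter instead of hard-coding `int`. The names `sort_csv` and `convert` are hypothetical, and the sample data is invented.

```python
import csv
import io


def sort_csv(text, sort_by, convert=str, reverse=False):
    """Return CSV text sorted by one column, with the header kept first.

    `convert` turns the raw string cell into a sortable value,
    e.g. int for a numeric column or str for plain text.
    """
    reader = csv.DictReader(io.StringIO(text, newline=""))
    out = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n" endings.
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(sorted(reader, key=lambda row: convert(row[sort_by]), reverse=reverse))
    return out.getvalue()


# Invented sample with a quoted comma, to show the quoting survives.
sample_text = 'name,index\n"Smith, J",2\nb,10\na,1\n'
print(sort_csv(sample_text, "index", convert=int, reverse=True))
```

This sketch assumes the input has a header row; an empty input would need its own check before `writeheader()`.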