Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> sorting with skipped headers is a mess

I like these command line tools, but I think they can cripple someone actually learning programming language. For example, here is a short program that does your last example:

https://go.dev/play/p/9bASZ97lLWv



It's a matter of perspective.

I like programming languages, but I think they can cripple someone actually learning Unix!

At the end of the day, you should just use whatever tools make you the most productive most quickly.


The whole point of UNIX userland is to not have to write a custom program for every simple case that just needs recombining some existing basic programs in a pipeline...


> simple case

that's just it though, the last example is not a simple case, hence why the last example is awkward by the commenters own admission. command line tools are fine, but you need to know when to set the hammer down and pick up the chainsaw.


>the last example is not a simple case

As far as shell scripting goes, this is hardly anything to write home about. Looks simple enough to me.

It just retains the header by printing the header first as is, and then sorting the lines after the header. It's immediately obvious how to do it to anybody who knows about head and tail.

And with Miller it's even simpler than that, still on the command line...


To me the last example is still simple. When I encounter this in the wild, I don't really care about preserving the header.

  tail -n +2 example.csv | sort -r -k4 -t','
Or more often, I just do this and ignore the header

  sort -r -k4 -t',' example.csv
Keeping the header feels awkward, but using `sort` to reverse sort by a specific column is still quicker to type and execute (for me) than writing a program.


I thought the Go code looked way too complex and Python would be simpler. Yes and no.

  import csv
  
  filename = 'example.csv'
  sort_by = 'index'
  reverse = True
  
  with open(filename) as f:
      lines = [d for d in csv.DictReader(f)]
  
  for line in lines:
      line['index'] = int(line['index'])
  lines.sort(key=lambda line: line[sort_by], reverse=reverse)
  print(','.join(lines[0].keys()))
  for line in lines:
      print(','.join(str(v) for v in line.values()))


Perhaps a `DictWriter` would simplify things:

    import csv
    import sys
    
    filename = "example.csv"
    sort_by = "index"
    reverse = True
    
    with open(filename, newline="") as f:
        reader = csv.DictReader(f)
        writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(sorted(reader, key=lambda row: int(row[sort_by]), reverse=reverse))


I thought about that but 1) it seemed like cheating to write to standard out, 2) you're assuming that the column to sort by is an integer whereas I broke that code up a little bit.

But yours has the advantage of being able to support more complex CSVs.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: