
Out of curiosity, are there Linux tools to go from binary files to audio and back again? e.g. `cat archive.zip | faxify -wav > archive.zip.wav`


Minimodem

Encode to opus:

      cat foo.zip | minimodem --tx -f out.flac 300

      opusenc --bitrate 64 out.flac out.opus

Decode from opus:

      opusdec out.opus out.wav

      minimodem --rx -f out.wav 300 > foo2.zip


fantastic, thanks!


Thanks for the reply, I'm also a big fan of documentation living inside the repo. I appreciate the framework you describe for the readme as well. Similar to _kb's sibling comment, that sounds like a sane approach to meeting the needs of contributors to and supporters of the codebase.

FWIW, I'm struggling with documentation that supports the mix of roles on the team in addition to coders/contributors - QA engineers, product owners, project managers, support teams, etc. A big part is capturing requirements as they flow into proposed solutions and out into support documentation.

In reply to _kb, it did occur to me that there are many job descriptions at play here, and that change is inevitable. Perhaps the answer is simply that it takes many people and several genres of software to keep things from falling through the cracks.


For sure.

I know JIRA isn't everyone's cup of tea, but your mention of people across domains reminded me of Confluence. I only brought up JIRA because it was connected to GitHub and other services (sales, marketing, designers, etc.), which made it easy for people across domains to discuss things, have a common place for everyone on a project, and share information in a meaningful way.

The knowledge lived in one space together, so it was kind of cool that I would pull some data from our data sets to help marketing frame their messaging, or see designers and product managers work together on mocks and share customer insights, which the product engineering teams would then build their tickets from. I could get insight into what work the technical teams were doing and help them deploy their software.


Thanks for this link, I agree, that is a fantastic framework for documenting software.

The idea that a reader of a piece of documentation is approaching it with a particular goal is spot on. I suppose I envision more goals/contexts, or perhaps another layer of them, in addition to what is outlined here.

A set of use cases I'm thinking of involve the process of changing what a piece of software does. As design documents arise that may or may not apply to a future version of the software, where should that live? Perhaps during the design phase of that release it belongs in one place while designers are iterating on its contents, and later it moves to another place when it is considered a stable reference document?

Plenty of capital-P Process behind this question, I suppose. And perhaps there's no getting around some amount of the "librarian" work of moving content around to reflect its use.


MADR [1] in its reframed "Any Decision Record" form can be a good tool for that. RFDs [2] also appear to capture a similar intent.

[1]: https://adr.github.io/madr/

[2]: https://rfd.shared.oxide.computer/rfd/0001
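For anyone unfamiliar, a MADR record is just a short markdown file checked into the repo; everything below is an invented example following the MADR section headings:

```markdown
# Use PostgreSQL as the primary datastore

## Context and Problem Statement
We need one transactional datastore that every service can rely on.

## Considered Options
* PostgreSQL
* MySQL

## Decision Outcome
Chosen option: PostgreSQL, because it best covers our JSON and
full-text needs with the operational experience we already have.
```

Because the records live next to the code, they move, branch, and get reviewed with it.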


The symptoms we experience are:

- not capturing key information, often because the challenge of finding a place makes a project out of a task, which then gets dropped

- capturing the information in an island somewhere, making it unfindable later on, e.g. a loose document in a user's Confluence space

With a team of folks with different skill levels, when there's information to capture, I'd like there to be a clear single place for it to live so that it's easy to place there and easy to find again.


Interesting. I would say that John didn't build a solution that met all of the use cases that the business needed.

* To ensure accuracy of transactions, he should have built reporting to meet the needs of accountants.

* To support change, he should have kept his components separate and provided modest testing examples.

* To support developers and operations, he should have found and documented the external dependencies along with steps to verify that they are in place.

Whether product managers didn't uncover these requirements or the business didn't prioritize them, these aspects of the definition of "done" have more to do with the environment the engineer is working in and less to do with his or her work.


Thank you for the tips!


I appreciate this:

> In general, reviewers should favor approving a CL once it is in a state where it definitely improves the overall code health of the system being worked on, even if the CL isn’t perfect.

Reviewers are human too, and can occasionally get lost in the weeds nitpicking a PR/CL.


What tends to be the first indication of breaches? It's one thing to do a forensic analysis after learning of a breach, and it's another to detect it in the first place.


I worked at a company that logged every single SQL query and built a rule set based on that. It may not have been the most efficient approach, but it worked great. There was basically a whitelist of sorts, and if a query's structure wasn't in there, action was taken. It also worked by knowing which queries came in what order when doing certain things.
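That approach can be sketched roughly like this: normalize each query down to its structural shape by masking out literals, then check the shape against a known-good set. The regex-based normalization below is a simplified illustration; a production system would normalize at the parser level.

```python
import re

def fingerprint(query: str) -> str:
    """Reduce a query to its structural shape by masking literals."""
    q = query.strip().lower()
    q = re.sub(r"'[^']*'", "?", q)  # mask string literals
    q = re.sub(r"\b\d+\b", "?", q)  # mask numeric literals
    q = re.sub(r"\s+", " ", q)      # collapse whitespace
    return q

# Shapes observed during normal operation form the whitelist.
ALLOWED = {
    fingerprint("SELECT * FROM users WHERE id = 123"),
    fingerprint("UPDATE users SET name = 'bob' WHERE id = 7"),
}

def is_allowed(query: str) -> bool:
    """Flag any query whose shape was never seen in normal traffic."""
    return fingerprint(query) in ALLOWED
```

A real deployment would also track query ordering, as the parent describes, so that a legitimate-looking query arriving out of sequence still raises an alert.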


This sounds a lot like an IDS for SQL. I've worked with government agencies that focus very heavily on IDS in firewall systems.

So not only do they catch attacks early, in the perimeter network, but they also often block legitimate traffic and have to handle such cases regularly.

But that comes with the territory of a default-deny policy. The best IDS solutions also cost a ton of money; I believe they come from companies like Checkpoint, Cisco and Symantec.


What tooling did you use to audit queries?


Not parent but it reads like they wrote their own (presumably driven by DB server log data with query logging enabled).


Mostly, Troy Hunt e-mailing clueless companies saying "Hey, this data breach I got sent seems to check out as real - at least a few of the users in it have confirmed to me that these are, or recently were, their $company website credentials."

(Only mostly joking...)


I’ve done a few small IR jobs in my time, and also have a hobby of reading every breach report that comes out.

It seems the vast majority of breach discovery amongst typical companies is an engineer going “hrmm that’s odd”: a router at 100% CPU because it’s currently part of a DDoS attack. A DBA noticing a huge query they don’t recall running. Unusual login times for administrative accounts. Having email systems sinkholed for sending spam. And of course “all my files are encrypted?”


It depends on attack surface and what tooling you already have in place. But for example:

> Finding suspicious outbound network activity

https://blog.rapid7.com/2016/05/09/introduction-to-osquery-f...
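For a concrete flavor of the osquery approach, a query along these lines surfaces processes holding connections to addresses outside common private ranges (the address filters here are a simplified illustration, not a complete RFC 1918 check):

```sql
-- Processes with open sockets to addresses outside common private ranges.
SELECT p.name, p.path, s.remote_address, s.remote_port
FROM process_open_sockets AS s
JOIN processes AS p ON s.pid = p.pid
WHERE s.remote_port != 0
  AND s.remote_address NOT LIKE '127.%'
  AND s.remote_address NOT LIKE '10.%'
  AND s.remote_address NOT LIKE '192.168.%';
```

Scheduled through osquery's query packs, the diff between runs is what surfaces new, unexpected outbound activity.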


I'm working in a vanilla PHP codebase right now, and I see all sides - the fractal design fails and the surprising improvements to the language. I've just read up on the architecture of Laravel, and given the public opinion of it, it must be a very nice framework.

What strikes me most about PHP is the fundamental request/response execution model. Your execution context begins when a request is received and ends when you send the last byte or terminate the request. There are no startup health checks, no cache warming, and no other bootstrapping of your service unless you jump through convoluted hoops on your own. Your service is either accepting requests or it isn't. You either lazy load your data into APCu or you don't. I've leveraged AWS health checks to achieve these in the past, but that path is not very maintainable.
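For what it's worth, the lazy-load-into-APCu pattern looks like the sketch below, where the first request that needs a value pays the bootstrap cost (the key name and loader function are made up for illustration):

```php
<?php
// Check the shared APCu cache first; compute and store only on a miss.
// 'app_config' and loadConfigFromDisk() are hypothetical.
function cachedConfig(): array {
    $config = apcu_fetch('app_config', $hit);
    if (!$hit) {
        $config = loadConfigFromDisk();          // expensive bootstrap work
        apcu_store('app_config', $config, 300);  // cache for 300 seconds
    }
    return $config;
}
```

Which still means the warm-up happens during a live request rather than before the server starts accepting traffic.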

Inevitably in the course of maintaining a service, I find use cases for a phase of execution that should occur before the server is live or shared static memory that should be available at all times, but (unless I don't know something) those are things that PHP doesn't do.

This is the reason that I find PHP to be a bizarro language - for its fundamental design assumption!


This execution model is just good old CGI scripts [0].

[0]: https://en.wikipedia.org/wiki/Common_Gateway_Interface
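Right - the server spawns a fresh process per request, the script writes headers, a blank line, and the body to stdout, then exits: the same one-shot lifecycle PHP bakes in. A minimal sketch in Python:

```python
#!/usr/bin/env python3
# Minimal CGI-style handler: build the whole response (headers, blank
# line, body), write it to stdout, and let the process exit.
import os
import sys

def cgi_response() -> str:
    body = f"Hello from a fresh process, pid {os.getpid()}\n"
    return "Content-Type: text/plain\r\n\r\n" + body

if __name__ == "__main__":
    sys.stdout.write(cgi_response())
```

No state survives between requests; anything that needs to persist lives outside the process, exactly as in PHP's default model.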


Actually you can compile PHP code and optimize it.


Quite a premise: "Giant monolithic source-code repositories are one of the fundamental pillars of the back end infrastructure in large and fast-paced software companies."


Facebook, Google, Airbnb, Quora and many more all use monorepos.

Obviously there are many others who do not (Amazon comes to mind), but it's reasonable to claim that monorepos are widely used and fundamental where they are used.


Microsoft uses one for Windows as well; the repo was so large that they wrote their own Git filesystem to power it.


Does anybody know what these companies' development environments look like? I know about Piper at Google, but how do the rest manage? Does every single engineer have the entire monorepo on their machine?


At Facebook, a virtual filesystem (https://github.com/facebookexperimental/eden) + change monitor (https://facebook.github.io/watchman/) = source control operations run in O(files I have modified) instead of O(size of repository) time


Very interesting, thanks!


Most places I know of use Git Virtual File System or equivalents.


It is my understanding that VFSForGit only works on Windows.


The github repo has instructions for running it on Mac and says that the stable Mac version is under active development.


Airbnb uses a monorepo for JVM-based projects, but most of Airbnb's code, at least as of mid-2017, was not run on the JVM and was hosted multi-repo.


What are AirBnb's unique scaling issues other than just being a web app with tons of usage?

Uber has navigation, route optimization, queuing, etc. Facebook has to propagate activity out to massive and complex social network graphs.

I'm not discounting the toughness of operating at Airbnb's scale, but from my limited understanding it seems like they are not solving a new problem.


Map search, automatic dynamic pricing, predicting busy periods, predicting user & provider preferences, etc.


I was a data scientist, so I knew of monorail but never touched that side of things :)


Notably, Netflix doesn't (or at least didn't):

https://medium.com/netflix-techblog/towards-true-continuous-...


TIL! Most of my experience with large, single-repository projects is with plain monoliths. The design goal we strive for tends to be a microservice architecture, assuming that isolation of responsibility leads to more maintainability, better decision making, etc. I can see how, with a well-disciplined team, the monorepo could offer the best of both worlds.


Many companies do use a monorepo. Many other companies do not. There are trade offs.


What all these companies have in common is a huge budget they can invest in their build systems to overcome the shortcomings of monorepos.

Monorepos do have some benefits, but they also come with an immense cost.


But that's what the OP's quote is saying. "large" companies use them.


all of those companies were once small too


I like it at Google!


Google has the team + tooling to properly support it. The same cannot be said for many other orgs.


Google has more people working on the problem than many other companies have employees.


I don’t doubt it. They also do more traffic through their VCS than most companies do through their main product.


Do they? They didn't used to. In 2015 we were routinely dead in the water, unable to test and deploy anything from our google3 projects because some random submitted a CL for a project we didn't even care about. Teams would appoint "build cops" whose job was to complain as quickly as possible, because that's all we could do about it.

Every problem you could have with bad dependencies is entirely self-inflicted. The Right Thing™ is to choose a known-good version, and update when you have the bandwidth to pay down the tech debt.


Many teams


Yes, my first thought was "I wish the systems I worked on in big corps were big monolithic giants."


Which huge successful software companies don't use a monorepo?


Amazon.


I think these count as huge (although maybe not when next to Google), but Spotify and Netflix.


I'm willing to agree with that premise.

