"Check that function XYZ is called", "return abc when XYZ is called" and so on are the bad kind of mock, the kind people got badly bitten by.
The good kind is a minimally correct fake implementation that doesn't really need any mocking library to build.
Tests should not be brittle, rigidly restating the order of function calls and expected responses. That's a whole lot of ceremony that doesn't really add confidence in the code, because it misses many classes of errors and requires pointless updates to match the implementation 1:1 every time it changes. It's effectively just writing the implementation twice, if you squint at it a bit.
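To make the two kinds concrete, here is a rough sketch in Python. Everything in it (`register`, `InMemoryUserRepo`, the repository methods) is a made-up example rather than anything from a real codebase; only `unittest.mock.Mock` is a real library class.

```python
from unittest.mock import Mock


# Hypothetical code under test: register a user unless the email is taken.
def register(repo, email):
    if repo.find_by_email(email) is not None:
        return False
    repo.save({"email": email})
    return True


# The "bad kind": the test restates which calls the implementation makes.
def scripted_mock_test():
    repo = Mock()
    repo.find_by_email.return_value = None
    assert register(repo, "a@example.com") is True
    repo.find_by_email.assert_called_once_with("a@example.com")
    repo.save.assert_called_once_with({"email": "a@example.com"})


# The "good kind": a minimally correct in-memory fake, no mocking library.
class InMemoryUserRepo:
    def __init__(self):
        self._users = {}

    def find_by_email(self, email):
        return self._users.get(email)

    def save(self, user):
        self._users[user["email"]] = user


def fake_test():
    repo = InMemoryUserRepo()
    assert register(repo, "a@example.com") is True
    assert register(repo, "a@example.com") is False  # duplicate is rejected
    assert repo.find_by_email("a@example.com") == {"email": "a@example.com"}


scripted_mock_test()
fake_test()
```

The scripted test has to change if `register` starts calling a different lookup method, even when the behaviour stays the same. The fake-based test only asserts on outcomes.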
In reflection-heavy environments, and with injection- and reflection-heavy frameworks (.NET, Java), the distinction is a bit more obvious and relevant. In some cases the mock configuration blossoms into what is essentially a parallel implementation, leading to the brittleness discussed earlier in the thread.
Technically, creating a shim or stub object is mocking, but “faking” isn’t using a mocking framework to track incoming calls or internal behaviours. Done properly, IMO, you’re using inheritance, and the opportunity the TDD process gives you, to polish and refine the inheritance story and internal interface of key subsystems. Much like TDD helps design interfaces by giving you earlier external interface consumers, you also get early inheritors if you are, say, creating test services with fixed output.
In ideal implementations those stub or “fake” services answer the “given…” part of user stories, leaving minimal, focused tests. Delivering hardcoded dictionaries of test data built with appropriate helpers is minimal and easy to keep up to date without undue extra work, and doing that kind of stub work often identifies early re-use needs and benefits in the code-base. The exact features needed to evolve the system as unexpected change requests roll in are there already, as QA and end-users are the system's second rodeo, not its first.
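A minimal sketch of that stub-by-inheritance idea, in Python. The names (`PriceService`, `FixedPriceService`, `make_prices`) and the prices are invented for illustration, and it assumes the production class can be split so that only the data source needs overriding.

```python
class PriceService:
    """Hypothetical production service; the real one would hit a remote API."""

    def fetch_prices(self):
        raise NotImplementedError("network call in the real system")

    def price_of(self, sku):
        return self.fetch_prices()[sku]

    def total(self, skus):
        return sum(self.price_of(sku) for sku in skus)


def make_prices(**overrides):
    """Test-data helper: sensible defaults, overridden per test."""
    prices = {"apple": 100, "pear": 150}
    prices.update(overrides)
    return prices


class FixedPriceService(PriceService):
    """Early inheritor: overrides only the data source, keeps the real logic."""

    def __init__(self, prices):
        self._prices = prices

    def fetch_prices(self):
        return self._prices


# Given apples cost 120, when I buy two apples and a pear, the total is 390.
service = FixedPriceService(make_prices(apple=120))
assert service.total(["apple", "apple", "pear"]) == 390
```

The stub carries the "given" part, so the test body is one line of setup and one assertion. Writing the subclass is also what forces `fetch_prices` to exist as a clean seam in the first place.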
The mocking antipatterns cluster around ORM misuse and tend to leak implementation details (leading to those brittle tests), and are often comorbid with anemic domains and other cargo-cult cruft. Needing heavy-duty mocking utilities and frameworks on a system you own is a smell.
For corner cases and exhaustiveness I prefer to be able to do meaningful integration tests in memory as far as possible too (in conjunction with more comprehensive tests). Faster feedback means faster work.
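As one possible shape for an in-memory integration test, here is a sketch using Python's built-in `sqlite3` with a `:memory:` database. The schema, the `count_active_users` function and the `make_db` helper are all assumptions made up for the example; it also assumes the production database is close enough to SQLite for the queries in question.

```python
import sqlite3


# Hypothetical data-access function under test.
def count_active_users(conn):
    row = conn.execute("SELECT COUNT(*) FROM users WHERE active = 1").fetchone()
    return row[0]


def make_db(rows):
    """Test helper: build a throwaway in-memory database with the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, active INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


conn = make_db([("ann", 1), ("bob", 0), ("cy", 1)])
assert count_active_users(conn) == 2
conn.close()
```

The real SQL runs against a real engine, but nothing touches disk or network, so the feedback loop stays fast.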
Why is "check that XYZ is called, with return value ABC" bad, as long as XYZ is an interface method?
Why is a minimally correct fake any better than a mock in this context?
Mocks are not really about order of calls unless you are talking about different return values on different invocations. A fake simply moves the cheese to setting up data correctly, as your tests and logic change.
The point is to test against a model of the dependency, not just the expected behaviour of the code under test. If you just write a mock that exactly corresponds to the test that you're running, you're not testing the interface with the underlying system, you're just running the (probably already perfectly understandable) unit through a rote set of steps, and that's both harder to maintain and less useful than testing against a model of the underlying system.
(And IMO this should only be done for heavyweight or difficult-to-control components of the system, where necessary to improve test runtime or expand the range of testable conditions. Always prefer testing as close to the real system as reasonably practical.)
The kind of mocks the OP is arguing against are not really a model of the dependency, they're just a model of a particular execution sequence in the test, because the mock is just following a script. Nothing in it ensures that the sequence is even consistent with any given understanding of how the dependency works, and it will almost certainly need updating when the code under test is refactored.
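A small Python sketch of what "following a script" can hide. The buggy `pop_value` function and the `DictStore` fake are hypothetical, chosen only to show one case where the script and the dependency's real behaviour disagree.

```python
from unittest.mock import Mock


# Hypothetical buggy code under test: deletes the key, then reads it back.
def pop_value(store, key):
    store.delete(key)
    return store.get(key)  # bug: the read happens after the delete


# Scripted mock: returns what the script says, whatever was called before.
mock_store = Mock()
mock_store.get.return_value = "v"
assert pop_value(mock_store, "k") == "v"  # passes despite the bug


# Fake: a small model of the store, so calls affect each other.
class DictStore:
    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


fake_store = DictStore({"k": "v"})
assert pop_value(fake_store, "k") is None  # the model exposes the bug
```

The mock has no notion that `delete` changes what `get` should return. The fake does, because it holds state, which is the sense in which it is a model rather than a script.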
My point is that a fake doesn't magically fix this issue. Both are narrow models of the underlying interface. I still don't quite understand why a mock is worse than a fake when it comes to narrow models of the interface. If there is a method that needs to be called with a specific setup, there is no practical difference between a fake and a mock.
Again, none of this is a replacement for writing integration tests where possible. Mocks have a place in the testing realm and they are not an inherently bad tool.