One of the most crucial requirements of successful testing is getting the test results fast. Of the many things that contribute to this, the most important is isolating what's being tested from everything outside the specific test's scope. Most of the time this means running the test without a database, UI, web service calls or disk access (except when you're specifically testing that part). This can be achieved by replacing the costly calls with stubs or mocks for the duration of the test run.
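To make the substitution concrete, here is a minimal sketch in Python using the standard library's unittest.mock. The PriceService class and its rate_client collaborator are hypothetical names made up for this illustration; the only point is that the slow web service call is replaced by a stub while the test runs.

```python
from unittest.mock import Mock


class PriceService:
    """Hypothetical class under test; it depends on a slow exchange-rate lookup."""

    def __init__(self, rate_client):
        self._rate_client = rate_client

    def price_in(self, amount_eur, currency):
        # In production this call would go over the network.
        rate = self._rate_client.get_rate("EUR", currency)
        return round(amount_eur * rate, 2)


def test_price_in_uses_rate_from_client():
    # The stub stands in for the web service for the duration of the test.
    rate_client = Mock()
    rate_client.get_rate.return_value = 1.25

    service = PriceService(rate_client)

    assert service.price_in(10, "USD") == 12.5
    rate_client.get_rate.assert_called_once_with("EUR", "USD")


test_price_in_uses_rate_from_client()
```

The test never touches the network, so it runs in milliseconds and fails only when the logic in PriceService itself is wrong.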
Let's look at the common understanding of a unit:
Intuitively, one can view a unit as the smallest testable part of an application. In procedural programming a unit could be an entire module but is more commonly an individual function or procedure. In object-oriented programming a unit is often an entire interface, such as a class, but could be an individual method.
If we treat a class method as a unit, then using the available tools we can achieve both the ideal isolation and a high degree of code coverage (even 95-100%). What we get as a result is a set of classes that are working perfectly in isolation.
But do they work together?
Well, we can't really tell without running other kinds of tests. A type system will catch many of the errors, but only if one is available. If not, even changing a method signature will most probably break the system, and the isolated tests will not notice.
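The signature problem is easy to reproduce. The sketch below is in Python with unittest.mock; Mailer and Signup are hypothetical classes invented for this illustration. Mailer.send has gained a required parameter, Signup still calls it the old way, and the isolated test keeps passing because a plain Mock accepts any call.

```python
from unittest.mock import Mock, create_autospec


class Mailer:
    # The signature was changed: 'body' is a newly added required parameter.
    def send(self, recipient, subject, body):
        return f"to={recipient} subject={subject} body={body}"


class Signup:
    def __init__(self, mailer):
        self._mailer = mailer

    def register(self, email):
        # Still uses the old two-argument call.
        self._mailer.send(email, "Welcome!")
        return True


def isolated_test_passes():
    mailer = Mock()  # a plain Mock accepts any arguments
    result = Signup(mailer).register("a@example.com")
    mailer.send.assert_called_once_with("a@example.com", "Welcome!")
    return result


def real_collaboration_fails():
    try:
        Signup(Mailer()).register("a@example.com")
    except TypeError:
        return True  # the units do not actually fit together
    return False


def autospec_mock_fails():
    # A mock built from the real class checks call signatures.
    mailer = create_autospec(Mailer, instance=True)
    try:
        Signup(mailer).register("a@example.com")
    except TypeError:
        return True
    return False
```

Signature-checking mocks such as create_autospec narrow this particular gap, but they only verify the shape of the call, not that the two classes produce the right result together.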
The way to solve the problem is to test the integration of the units. But we really want this process to be fast enough so that the code-test cycle can be run often. Basically, we need another level of tests, or - to put it another way - we need tests with less granular units. That's a lot of work, and in the end it's a not-so-obvious way of introducing duplication: we test every part of a scenario and then the whole scenario itself.
Then how about being less strict with the definition of a unit and redefining it so that it's not only a class method anymore? Unfortunately, the only definition I can think of right now is a bit vague: a closed set of code that makes sense in isolation.
If we have some kind of a framework class (date handling, i18n, general serialization), then it's perfectly fine as a unit of test on its own. On the other hand, there might be a lot of technical code that is there just to support a set of web page interaction scenarios - in that case the whole page seems like a better choice for the unit of test. But again, handling the negative cases (the different paths of failure) is easier when the units are as small as possible.
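As a rough sketch of what a page-sized unit might look like, consider the Python example below. All the names (SignupPage, EmailValidator, InMemoryUserStore) are hypothetical and chosen only for illustration. The page is tested as a whole, with only the storage replaced by an in-memory fake, while the failure path is checked on the small validator directly.

```python
class EmailValidator:
    def is_valid(self, email):
        return "@" in email and "." in email.split("@")[-1]


class InMemoryUserStore:
    """Fake standing in for the database; the only substituted part."""

    def __init__(self):
        self.emails = []

    def add(self, email):
        self.emails.append(email)


class SignupPage:
    def __init__(self, store):
        self._validator = EmailValidator()
        self._store = store

    def submit(self, form):
        email = form.get("email", "").strip().lower()
        if not self._validator.is_valid(email):
            return {"status": 400, "error": "invalid email"}
        self._store.add(email)
        return {"status": 200, "message": f"Welcome, {email}!"}


def test_signup_happy_path():
    # Big unit: one test exercises normalization, validation and storage.
    store = InMemoryUserStore()
    response = SignupPage(store).submit({"email": "  Ann@Example.com "})
    assert response == {"status": 200, "message": "Welcome, ann@example.com!"}
    assert store.emails == ["ann@example.com"]


def test_validator_rejects_domain_without_dot():
    # Small unit: failure paths are easier to enumerate here.
    assert not EmailValidator().is_valid("ann@localhost")


test_signup_happy_path()
test_validator_rejects_domain_without_dot()
```

The page-level test covers three pieces of production code with a handful of lines, while the validator test can list edge cases without setting up a whole form submission.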
Varying the size of a test unit is important for maintaining good quality of the tests: bigger units cover more production code while requiring less test code; smaller units make sense for widely used types and are useful for pinpointing failure paths.