Part of the system responsible for generating test cases.
Using information about usage patterns together with data from static analysis, we can generate test stubs for each function or method in a module or class. Since we already know the structure of the code, we can also generate setup and teardown code for methods of the same class that will, for example, initialize a sample object.
Note: pyreverse includes py2tests, a unittest skeleton generator. All of its functionality (generating test stubs for classes and functions inside modules) is already handled by Pythoscope.
There are at least a few ways to organize and run your test cases. Differences include:
- the testing framework used: unittest or doctest
- whether to use one-to-one correspondence between test classes and application classes
- usage of if __name__ == '__main__': in test modules
- whether test classes should subclass unittest.TestCase or not
- usage of runner- or framework-specific functionality, e.g. raising SkipTest in nose
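For illustration, this is what a generated test module might look like under one particular set of those choices (unittest as the framework, one-to-one correspondence between test and application classes, subclassing unittest.TestCase, and the `if __name__ == '__main__':` idiom); the `Calculator` class is a made-up stand-in for code that would normally be imported from the application module:

```python
import unittest


class Calculator:
    # stands in for the application class, normally imported from its module
    def add(self, a, b):
        return a + b


class TestCalculator(unittest.TestCase):
    # one-to-one correspondence: one test class per application class
    def setUp(self):
        self.object = Calculator()

    def test_add(self):
        self.assertEqual(self.object.add(2, 3), 5)


if __name__ == '__main__':
    unittest.main()
```

Each of the choices above changes this layout: with doctest there would be no classes at all, and without TestCase subclassing a third-party runner has to collect the test functions itself.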
QuickCheck-style fuzz testing, deriving code contracts
Another idea worth exploring is taking information about the values passed around and deriving Eiffel-style contracts for methods and functions. It would work like this:
- Generate random input of some chosen type. We could use function contract information gathered earlier, but if that's not available we can continue anyway. Not only the values of arguments should vary, but their number as well (important for testing functions with optional arguments).
- Call the function with generated input.
- Record the result.
- Generate a test case based on this.
Test cases don't have to be generated immediately; I'd rather see the calls grouped by their result (into equivalence classes) and then put into separate test cases.
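The steps above can be sketched as a small fuzzing loop that varies both the values and the number of arguments, records each result, and groups the calls by result into rough equivalence classes (the `record_calls` helper and the `sign` sample function are hypothetical, chosen only to make the sketch self-contained):

```python
import random


def record_calls(function, arg_types, runs=20):
    """Fuzz `function` with random inputs and group the observed
    calls by their result (rough equivalence classes)."""
    generators = {
        int: lambda: random.randint(-100, 100),
        str: lambda: ''.join(random.choice('abc') for _ in range(3)),
    }
    classes = {}
    for _ in range(runs):
        # vary the number of arguments too, to exercise optional ones
        count = random.randint(1, len(arg_types))
        args = tuple(generators[t]() for t in arg_types[:count])
        try:
            result = repr(function(*args))
        except TypeError:
            continue  # too few arguments for this function
        classes.setdefault(result, []).append(args)
    return classes


def sign(x, zero=0):
    # sample function under test, with an optional argument
    return (x > zero) - (x < zero)

groups = record_calls(sign, [int, int])
# each key is a distinct observed result; each equivalence class
# can then become one generated test case
```

Turning each equivalence class into a single test case keeps the generated suite small while still covering every distinct behavior that was observed.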
Using this method we'll be able to come up with new test cases without any user interaction, possibly going beyond normal system usage and capturing "accidental" system behavior, which I'm guessing could be a real time saver for legacy systems.
- Generated tests can be declarative in nature, inspired by Haskell's QuickCheck (see Peckcheck). Using type information (or, more generally, the range of typical/accepted values) gathered from coverage reports, we can generate new input/output pairs to enhance the generated tests even further (and possibly find bugs). Care must be taken to ensure that generated tests don't fail (i.e. that our generalizations are right).
- How can we ensure the safety of generated tests? The test generator doesn't know the meaning of the functions it generates tests for. For example, it may generate a test case for remove_glob with an argument '/*', which could potentially wipe out all of the developer's files. Maybe we can "guess" the necessary stubs for particular test cases? Calling os.remove indicates a need for a filesystem stub, while modules importing sqlite3 may require a database stub, etc.
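The declarative, QuickCheck-style checking mentioned above can be reduced to a tiny checker that runs a property over many generated inputs and reports the first counterexample (this is a sketch of the idea only, not Peckcheck's actual API):

```python
import random


def check_property(prop, generator, trials=100):
    """Minimal QuickCheck-style checker: run `prop` on many
    generated inputs and return the first counterexample, if any."""
    for _ in range(trials):
        value = generator()
        if not prop(value):
            return value  # counterexample found
    return None


# property: sorting is idempotent
counterexample = check_property(
    lambda xs: sorted(sorted(xs)) == sorted(xs),
    lambda: [random.randint(0, 9) for _ in range(random.randint(0, 5))])
# counterexample is None when the property held for every trial
```

A real implementation would also shrink the counterexample to a minimal failing input, which is where most of QuickCheck's practical value lies.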
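One way to attack the safety problem is to run generated tests with the dangerous entry points replaced by recording stubs, so a call like os.remove('/*') is observed but never executed. A sketch using unittest.mock (the `run_test_safely` helper and the `remove_glob` function are hypothetical; real sandboxing would need to cover far more entry points than this):

```python
import os
import unittest.mock as mock


def run_test_safely(test_callable, dangerous=('remove', 'rmdir', 'unlink')):
    """Run a generated test with filesystem-destructive os functions
    replaced by stubs that only record their arguments."""
    calls = []

    def recorder(name):
        return lambda *args, **kwargs: calls.append((name, args))

    patchers = [mock.patch.object(os, name, recorder(name))
                for name in dangerous if hasattr(os, name)]
    for p in patchers:
        p.start()
    try:
        test_callable()
    finally:
        for p in patchers:
            p.stop()  # restore the real os functions
    return calls


def remove_glob(pattern):
    # hypothetical function under test that deletes files
    os.remove(pattern)

calls = run_test_safely(lambda: remove_glob('/*'))
# os.remove was stubbed, so nothing was deleted; the call was recorded
```

The recorded calls are also exactly the signal the generator needs for "guessing" stubs: a test whose run touched os.remove clearly needs a filesystem stub in its setup.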