All measurements must be available once execute() has returned. Because the observed code sections are not required to join() the threads they spawn, ByCounter itself has to wait for all spawned threads before execute() returns in order to fulfil this condition.
Waiting for all threads is not always possible when some of the executed code is not instrumented: uninstrumented code running in parallel threads can start new threads even after execute() has begun joining. ByCounter is unaware of such a thread and cannot wait for it. Nor can ByCounter assume that instrumented code will always be executed.
New options have been added for the cases in which threads are only spawned from instrumented code. InstrumentationParameters has a new property provideJoinThreadsAbility (true by default) that makes it possible to wait for threads spawned from instrumented code. ExecutionSettings has a new property waitForThreadsToFinnish (true by default) that enables this behaviour for BytecodeCounter.execute calls. In addition, CountingResultCollector has the method joinSpawnedThreads(), which lets the user decide manually when to join; it only works if provideJoinThreadsAbility is enabled.
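The general idea behind these options can be illustrated with a minimal, self-contained sketch. It is not ByCounter's implementation: the class JoinSpawnedThreadsSketch, the registry, spawnTracked() and the simplified execute() are hypothetical stand-ins, and only the method name joinSpawnedThreads() is borrowed from the description above. The sketch assumes that every thread is started through the tracked path, which corresponds to the case where threads are only spawned from instrumented code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not ByCounter code.
final class JoinSpawnedThreadsSketch {
    // Registry of threads started through the tracked path (assumption:
    // instrumented code would register every thread it spawns here).
    private static final List<Thread> spawned = new CopyOnWriteArrayList<>();

    // Stand-in for the collected measurements.
    static final AtomicInteger results = new AtomicInteger();

    // Registers the thread before starting it, so a later join can see it.
    static Thread spawnTracked(Runnable body) {
        Thread t = new Thread(body);
        spawned.add(t);
        t.start();
        return t;
    }

    // Joins every registered thread. The size is re-read on each iteration,
    // so threads registered by tracked threads while we wait are joined too.
    static void joinSpawnedThreads() throws InterruptedException {
        for (int i = 0; i < spawned.size(); i++) {
            spawned.get(i).join();
        }
        spawned.clear();
    }

    // Simplified execute(): runs the observed code and, if requested,
    // waits for all tracked threads before reporting the result.
    static int execute(Runnable observed, boolean waitForThreads)
            throws InterruptedException {
        observed.run();
        if (waitForThreads) {
            joinSpawnedThreads();
        }
        return results.get();
    }

    // A worker that finishes late and then records one result.
    static Runnable slowWorker() {
        return () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            results.incrementAndGet();
        };
    }

    public static void main(String[] args) throws InterruptedException {
        int count = execute(() -> {
            // The observed code spawns threads but never joins them.
            for (int i = 0; i < 3; i++) {
                spawnTracked(slowWorker());
            }
        }, true);
        System.out.println("results after execute: " + count);
    }
}
```

With waiting enabled, all three workers have recorded their result by the time execute() returns. A thread started with a plain new Thread(...).start() would bypass the registry, which mirrors the limitation for uninstrumented code described above.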
A test case for these features has been added to TestThreads.