3577827a

By: Tom Sydney Kerckhove <syd@cs-syd.eu>

Run mutation children in parallel, one worker per RTS capability

Uses mapConcurrently with a QSem to bound concurrency to the value
returned by getNumCapabilities, i.e. the number of RTS capabilities
(set via +RTS -N).  Each child process is independent, so the only
synchronisation needed between them is the semaphore itself.
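
For reference, a minimal sketch of the bounding pattern described above,
not the code from this commit.  It assumes the async package for
mapConcurrently; the names mapBounded and mapWithCapabilities are
hypothetical and chosen only for illustration.

```haskell
import Control.Concurrent (getNumCapabilities)
import Control.Concurrent.Async (mapConcurrently)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)

-- | Run an action over every input, with at most @n@ actions in flight.
-- (Hypothetical helper; the commit's own function may look different.)
mapBounded :: Int -> (a -> IO b) -> [a] -> IO [b]
mapBounded n act xs = do
  -- Clamp to at least one slot so the semaphore never starts at zero.
  sem <- newQSem (max 1 n)
  -- bracket_ releases the slot even if the action throws.
  mapConcurrently (bracket_ (waitQSem sem) (signalQSem sem) . act) xs

-- | Same as 'mapBounded', bounded by the number of RTS capabilities.
mapWithCapabilities :: (a -> IO b) -> [a] -> IO [b]
mapWithCapabilities act xs = do
  n <- getNumCapabilities
  mapBounded n act xs
```

mapConcurrently still starts one lightweight thread per input; the
semaphore only limits how many are past waitQSem at a time.  Results
come back in input order.  For +RTS -N to take effect, the executable
would need to be linked with -threaded and have RTS options enabled.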