By: Tom Sydney Kerckhove <syd@cs-syd.eu>
Run mutation children in parallel: N workers = getNumCapabilities

Uses mapConcurrently with a QSem to bound concurrency to the number of
RTS capabilities (set via +RTS -N). Each child process is independent,
so no synchronisation is needed between them beyond the semaphore.
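
A minimal sketch of the bounding pattern, for reviewers. It assumes the async package for mapConcurrently; the names mapBounded and withSlot are hypothetical and are not taken from this change. The action passed in stands for whatever launches one mutation child.

```haskell
import Control.Concurrent (getNumCapabilities)
import Control.Concurrent.Async (mapConcurrently) -- from the async package
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)

-- | Hypothetical helper: run the action over every input concurrently,
-- with at most as many actions in flight as there are RTS capabilities.
mapBounded :: (a -> IO b) -> [a] -> IO [b]
mapBounded action inputs = do
  workers <- getNumCapabilities -- reflects +RTS -N
  sem <- newQSem workers
  -- mapConcurrently starts one thread per input; the semaphore is what
  -- limits how many of them run the action at the same time.
  mapConcurrently (withSlot sem . action) inputs

-- | Hold one semaphore slot for the duration of the action. bracket_
-- releases the slot even if the action throws.
withSlot :: QSem -> IO a -> IO a
withSlot sem = bracket_ (waitQSem sem) (signalQSem sem)
```

Results come back in input order, because mapConcurrently preserves the structure of the list it is given. Without -threaded and +RTS -N, getNumCapabilities returns 1 and the children run one at a time.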