Hello,
I would like to assess the performance of NEST on different hardware, and I have a few questions, if I may.
I am thinking about reusing one of the benchmarks here: https://github.com/compneuronmbu/nest-benchmarks.
Let's assume we have two nodes, A and B, where A has N cores and B has M cores. I want to know whether NEST runs faster on A or on B for a particular test case, so I fill each node completely with MPI ranks and OpenMP threads. If my understanding is correct, NVP = MPI ranks x OpenMP threads, hence NVP = N on A and NVP = M on B. For most applications I would simply conclude "the application runs faster on A (or B)". What worries me with NEST is that N and M may differ, so the two runs use different NVP, and the documentation says that different NVP produce different results. That makes me hesitant to claim that NEST is faster on A or on B if the two runs are not computing the same thing. Could you confirm whether this is an issue, and perhaps give me some guidelines on benchmarking NEST? That would be fantastic. The same concern applies to a strong-scaling study on a single node, for instance.
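To make this concrete, here is a minimal PyNEST sketch of the setup I have in mind (this is only my assumption of how to set it up; the kernel parameter names may differ between NEST versions, so please correct me if they are off):

    # Sketch only: fill one node with MPI ranks x OpenMP threads so that NVP = cores.
    # Launched, e.g., as:  mpirun -np <ranks> python run_benchmark.py
    import nest

    ranks = nest.NumProcesses()          # MPI ranks NEST was started with
    threads_per_rank = 4                 # placeholder: cores_per_node // ranks

    nest.SetKernelStatus({'local_num_threads': threads_per_rank})
    nvp = nest.GetKernelStatus('total_num_virtual_procs')
    print('NVP =', nvp)                  # should equal ranks * threads_per_rank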
I could instead run NEST on the least common multiple of N and M (let's call it P), which would mean P/N nodes of A and P/M nodes of B; for example, with N = 64 and M = 48 we get P = 192, i.e. 3 nodes of A versus 4 nodes of B. I guess this would make sense for NEST, but it is no longer a single-node comparison, rather a core-versus-core comparison.
I hope this makes sense, and please feel free to correct me if I have said something wrong; I do not have a lot of experience with NEST.
Best regards, Conrad Hillairet.
Hello Conrad,
You understood NVP correctly. The dynamics of the model (what gets computed), and thus the computational load, depend on the size of the network, i.e., the number of neurons and synapses. In our benchmarks this is typically controlled by a "scale" parameter. As long as that scale parameter is the same, results are comparable even if the NVP values differ (pathologies are possible, see below).
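Purely as an illustration (the base numbers below are placeholders, not the exact values of any particular benchmark script), the scale parameter enters roughly like this:

    # Illustration only: how a "scale" parameter typically controls problem size.
    scale = 10

    NE = int(9000 * scale)       # excitatory neurons
    NI = int(2250 * scale)       # inhibitory neurons
    indegree = 11250             # synapses per neuron, kept fixed above a minimum scale
    n_synapses = (NE + NI) * indegree

    print('neurons: %d, synapses: %d' % (NE + NI, n_synapses))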
The benchmarks report the number of spikes generated (Rate_sum). As long as that number is nearly the same across runs, the benchmark timings are comparable, and the variation in Rate_sum gives a rough indication of the uncertainty in the results. The reason is that the rate directly reflects the number of spikes exchanged, and spike delivery is a major part of the work done.
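If you record spikes yourself, a quick sanity check along these lines is enough (a sketch only; the recording device is called spike_detector in NEST 2.x and spike_recorder in NEST 3, and all sizes below are placeholders):

    # Sketch: check that two runs did comparable work by comparing average rates.
    import nest

    sr = nest.Create('spike_recorder')       # 'spike_detector' in NEST 2.x
    # ... build network, connect all neurons to sr, nest.Simulate(t_sim) ...

    t_sim = 1000.0                           # simulated time in ms (placeholder)
    n_neurons = 112500                       # placeholder network size
    n_spikes = nest.GetStatus(sr, 'n_events')[0]   # with MPI: spikes on this rank only

    rate = n_spikes / n_neurons / (t_sim / 1000.0)  # spikes per neuron per second
    print('average rate: %.2f spikes/s' % rate)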
If you perform weak-scaling experiments, the scale will change, but even then results are comparable as long as Rate_sum stays approximately constant (for small scales the rate may vary more, because the number of synapses per neuron changes with scale; if I remember correctly, this happens for scale < 10).
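In other words, for a weak-scaling series you would grow the scale in proportion to NVP, roughly like this (baseline values are placeholders):

    # Sketch: in weak scaling, grow the scale with NVP so that the work per
    # virtual process stays roughly constant.
    base_nvp = 48
    base_scale = 10

    def weak_scale(nvp):
        return base_scale * nvp / base_nvp

    for nvp in (48, 96, 192, 384):
        print(nvp, weak_scale(nvp))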
For strong scaling, the scale stays constant and results are directly comparable. Simulation results will still not be identical for different NVP, because each VP has its own random number stream, so different NVP means different random number sequences. But this does not influence the workload and thus does not affect benchmarking results. Again, checking Rate_sum can be useful.
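For completeness, this is how the per-VP streams are seeded in NEST 2.x (in NEST 3 a single rng_seed replaces this scheme); it only illustrates why a different NVP gives different random sequences, not a different workload:

    # Sketch of the NEST 2.x seeding scheme: one random stream per virtual process.
    import nest

    msd = 123456                                            # master seed (placeholder)
    n_vp = nest.GetKernelStatus('total_num_virtual_procs')

    nest.SetKernelStatus({
        'grng_seed': msd + n_vp,                            # global RNG seed
        'rng_seeds': list(range(msd + 1, msd + n_vp + 1)),  # one seed per VP
    })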
In extremely rare cases (and, as far as I can remember, not observed so far for the benchmarks discussed here), the dynamics of the model may enter a pathological state in which, e.g., all neurons fire at their maximal rate. This would invalidate the results, but again it would show up as a change in Rate_sum.
You are probably aware of the following paper
van Albada SJ, Rowley AG, Senk J, Hopkins M, Schmidt M, Stokes AB, Lester DR, Diesmann M and Furber SB (2018) Performance Comparison of the Digital Neuromorphic Hardware SpiNNaker and the Neural Network Simulation Software NEST for a Full-Scale Cortical Microcircuit Model. Frontiers in Neuroscience 12:291. http://dx.doi.org/10.3389/fnins.2018.00291
which discusses comparison of simulation performance across different architectures in detail.
Best, Hans Ekkehard
--
Prof. Dr. Hans Ekkehard Plesser
Head, Department of Data Science
Faculty of Science and Technology, Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway
Phone: +47 6723 1560   Email: hans.ekkehard.plesser@nmbu.no   Home: http://arken.nmbu.no/~plesser
Hi Conrad,
I wanted to let you know that the benchmark repository you are referring to, https://github.com/compneuronmbu/nest-benchmarks, is a bit outdated and no longer maintained; I should have marked this in the README and will do so now. It is of course still possible to use, though. We have moved our benchmarking efforts to GIN on G-Node, which makes it easier to handle our data. That repository (https://gin.g-node.org/nest/nest-benchmarks) is currently private, as we are actively working on the setup at the moment, but let me know if you would like access and I can add you.
Best wishes, Stine