Using every bit of the intra-node and inter-node communication features of a modern architecture with MPI and OpenMP. Here, we will also look into some useful GCC features.
These days, operating systems provide APIs and CPUs provide special
instructions to run some of these problematic scenarios faster and/or
safer. With the help of POSIX Threads (pthreads), we can run several
threads of a program concurrently without having to implement
synchronization concepts such as mutexes or read-write locks ourselves.
CPU manufacturers introduced a technique called SIMD, an acronym for
Single Instruction, Multiple Data. Examples of SIMD are the AVX
instructions on Intel and AMD processors and NEON on ARM. These special
instructions apply the same operation to a chunk of data elements in a
single instruction. General-purpose GPUs are also used to accelerate
this kind of processing by copying the data into VRAM once and then
computing on it with many small compute units, such as the CUDA cores in
Nvidia graphics cards.
Of course, all of this only pays off if the application actually needs it and can take advantage of it, since not many applications can run on a GPU or on special hardware such as an FPGA.