High Performance Computing 2019
Lecture 5: Asynchronous 1D Jacobi Partitioning
Assignment
Complete the Jacobi solver below. The scaffold head ends inside the fixpoint loop, and the scaffold foot expects the asynchronous halo exchange with the neighbouring ranks to happen in between, posted as the four non-blocking requests req[0] to req[3].
Scaffold Head
#include <iostream>
#include <cstdint>
#include <cstdlib>
#include <ctime>
#include <vector>
#include <limits>
#include <string>
#include <algorithm>
#include <cassert>

#include <mpi.h>
#include "include/bitmap_IO.hpp"

// sponsored by stack overflow: http://stackoverflow.com/questions/440133
std::string random_string(size_t length) {

    auto randchar = []() -> char {
        const char charset[] = "0123456789"
                               "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                               "abcdefghijklmnopqrstuvwxyz";
        return charset[rand() % (sizeof(charset) - 1)];
    };

    srand(time(0));
    std::string str(length, 0);
    std::generate_n(str.begin(), length, randchar);

    return str;
}

int main(int argc, char* argv[]) {

    MPI::Init(argc, argv);

    const uint64_t height = 3*4*5*6+2;
    const uint64_t width  = 2*3*4*5*6+2;
    const uint64_t num_ranks = MPI::COMM_WORLD.Get_size();
    const uint64_t rank      = MPI::COMM_WORLD.Get_rank();

    // make indexing great again!
    auto at = [&] (const uint64_t& row, const uint64_t& col) {
        return row*width+col;
    };

    // check for consistency of dimensions
    assert(height > 2 && (height-2) % num_ranks == 0);
    assert(width  > 2 && (width -2) % num_ranks == 0);

    const bool is_root = rank == 0;
    std::vector<double> image(height*width*is_root, 0);

    if (is_root) {
        // draw a checkerboard
        const uint64_t stride = 181;
        for (uint64_t row = 0; row < height; ++row)
            for (uint64_t col = 0; col < width; ++col)
                image[at(row, col)] = (row/stride + col/stride) % 2;
    }

    // double buffers for Jacobi iteration:
    // we start with ying and update to yang, and copy yang back to ying
    const uint64_t local_height = (height-2)/num_ranks+2;
    std::vector<double> ying(local_height*width, 0);
    std::vector<double> yang(local_height*width, 0);

    //////////////////////////////////////////////////////////////////////////
    // SCATTER
    //////////////////////////////////////////////////////////////////////////

    // compute counts and displacements
    // (vectors rather than variable-length arrays, which are not valid C++)
    std::vector<int32_t> counts(num_ranks, 0);
    std::vector<int32_t> displs(num_ranks, 0);

    for (uint64_t proc = 0; proc < num_ranks; ++proc) {
        counts[proc] = (local_height  )*width;
        displs[proc] = (local_height-2)*width*proc;
    }

    // scatter image to ying
    MPI::COMM_WORLD.Scatterv(image.data(), counts.data(), displs.data(),
                             MPI::DOUBLE, ying.data(), counts[rank],
                             MPI::DOUBLE, 0);

    //////////////////////////////////////////////////////////////////////////
    // FIXPOINT COMPUTATION
    //////////////////////////////////////////////////////////////////////////

    auto update = [&] (const uint64_t& row, const uint64_t& col) {
        return 0.25*(ying[at(row+1, col  )] + ying[at(row-1, col  )] +
                     ying[at(row  , col+1)] + ying[at(row  , col-1)]);
    };

    uint64_t counter = 0, print_every = 1024;
    double error = std::numeric_limits<double>::infinity();

    while (error > 1E-4 && counter < 1UL << 14) {
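Between head and foot the student inserts the asynchronous halo exchange. The following is a minimal sketch of that missing middle, not the official solution: it assumes only the request naming the scaffold foot relies on (req[1]/req[3] complete the receives, req[0]/req[2] the sends); pred and succ are hypothetical helper names for the neighbouring ranks.

    // --- sketch of the student part: asynchronous halo exchange ---

    // neighbours in the 1D row partitioning; the boundary ranks talk to
    // MPI::PROC_NULL, which turns the matching operations into no-ops
    const int pred = rank == 0           ? MPI::PROC_NULL : int(rank)-1;
    const int succ = rank+1 == num_ranks ? MPI::PROC_NULL : int(rank)+1;

    // the scaffold foot waits on req[1]/req[3] before relaxing the
    // boundary rows, and on req[0]/req[2] before copying yang back to ying
    MPI::Request req[4];

    // exchange with pred: send first owned row (1), receive halo row (0)
    req[0] = MPI::COMM_WORLD.Isend(&ying[at(1, 0)], width,
                                   MPI::DOUBLE, pred, 0);
    req[1] = MPI::COMM_WORLD.Irecv(&ying[at(0, 0)], width,
                                   MPI::DOUBLE, pred, 1);

    // exchange with succ: send last owned row (local_height-2),
    // receive halo row (local_height-1)
    req[2] = MPI::COMM_WORLD.Isend(&ying[at(local_height-2, 0)], width,
                                   MPI::DOUBLE, succ, 1);
    req[3] = MPI::COMM_WORLD.Irecv(&ying[at(local_height-1, 0)], width,
                                   MPI::DOUBLE, succ, 0);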
Scaffold Foot
        // relax interior that does not depend on communication
        for (uint64_t row = 2; row < local_height-2; ++row)
            for (uint64_t col = 1; col < width-1; ++col)
                yang[at(row, col)] = update(row, col);

        // wait for received data
        req[1].Wait();
        req[3].Wait();

        // relax lower row=1
        for (uint64_t col = 1; col < width-1; ++col)
            yang[at(1, col)] = update(1, col);

        // relax upper row=local_height-2
        for (uint64_t col = 1; col < width-1; ++col)
            yang[at(local_height-2, col)] = update(local_height-2, col);

        // wait for sent data to ensure the copy does not corrupt it
        req[0].Wait();
        req[2].Wait();

        double local_error = 0;
        for (uint64_t row = 1; row < local_height-1; ++row) {
            for (uint64_t col = 1; col < width-1; ++col) {
                const double residue = ying[at(row, col)]-yang[at(row, col)];
                local_error += residue*residue;
                ying[at(row, col)] = yang[at(row, col)];
            }
        }

        // every process needs the same error => Allreduce
        MPI::COMM_WORLD.Allreduce(&local_error, &error, 1,
                                  MPI::DOUBLE, MPI::SUM);

        // status update every print_every iterations
        if (counter++ % print_every == print_every-1 && is_root)
            std::cout << "# Squared error after " << counter
                      << " iterations: " << error << std::endl;
    }

    // final status
    if (is_root)
        std::cout << "# Final squared error after " << counter
                  << " iterations: " << error << std::endl;

    //////////////////////////////////////////////////////////////////////////
    // GATHER
    //////////////////////////////////////////////////////////////////////////

    // gather image from ying: the first and last rank also contribute their
    // fixed boundary row, everyone else only the owned rows
    for (uint64_t proc = 0; proc < num_ranks; ++proc) {
        const bool lo = proc == 0, hi = proc+1 == num_ranks;
        counts[proc] = (local_height-!lo-!hi)*width;
        displs[proc] = proc > 0 ? displs[proc-1]+counts[proc-1] : 0;
    }

    // non-first ranks skip halo row 0, which duplicates the predecessor's
    // last owned row and is already placed by the predecessor's contribution
    MPI::COMM_WORLD.Gatherv(ying.data() + (is_root ? 0 : width), counts[rank],
                            MPI::DOUBLE, image.data(), counts.data(),
                            displs.data(), MPI::DOUBLE, 0);

    //////////////////////////////////////////////////////////////////////////
    // CHECK
    //////////////////////////////////////////////////////////////////////////

    if (is_root) {
        std::string filename = random_string(8)+".bmp";
        dump_bitmap(image.data(), height, width, "www/"+filename);

        std::cout << "# See http://iaimz105.informatik.uni-mainz.de:8000/"
                  << filename << "\nParallel programming is "
                  << (counter == 8353 ? "fun!" : "error-prone!") << std::endl;
    }

    MPI::Finalize();
}
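The foot fixes the overlap pattern: the interior rows 2 to local_height-3 depend only on local data, so they are relaxed while the halo messages are still in flight; only the two rows adjacent to the halos wait on the receives, and the sends are completed before ying is overwritten for the next iteration.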
Start time:
Thu 10 Oct 2019 10:15:00
End time:
Tue 11 Feb 2020 10:15:00
General test timeout:
10.0 seconds
Tests
Command line arguments:
6
Comment prefix:
#
Given input:
Expected output:
Parallel programming is fun!
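Note that the scaffold foot prints this line only when the solver converges after exactly 8353 iterations (counter == 8353); an incorrect halo exchange changes the iteration count and makes the program print "Parallel programming is error-prone!" instead.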