Ahmed Elbossily

21335 Lüneburg, Universitätsallee 1, 12.217
Phone 04131.677-1847 (secretary's office), ahmed.elbossily@leuphana.de


Publications

Journal contributions

  1. Experimental–numerical investigation of force-controlled friction extrusion process via feedback-controlled simulation
    Ahmed Elbossily (Author), Zina Kallien (Author), Lars Rath (Author), Rupesh Chafle (Author), Mohamadreza Afrasiabi (Author), Benjamin Klusemann (Author), 09.02.2026, in: Advances in Industrial and Manufacturing Engineering, 12, 2666-9129, p. 100184, 16 p.

    Research output: Journal contributions > Journal articles > Research > peer-review

  2. Numerical Investigation of Deposition Efficiency Influencing Factors in the Friction Surfacing Process
    Ahmed Elbossily (Author), Zina Kallien (Author), Rupesh Chafle (Author), Kirk A. Fraser (Author), Mohamadreza Afrasiabi (Author), Benjamin Klusemann (Author), 01.01.2026, in: Key Engineering Materials, 1050, 1662-9795, p. 1, 7 p.

    Research output: Journal contributions > Conference abstract in journal > Research > peer-review

  3. GPU-accelerated meshfree computational framework for modeling the friction surfacing process
    Ahmed Elbossily (Author), Zina Kallien (Author), Rupesh Chafle (Author), Kirk A. Fraser (Author), Mohamadreza Afrasiabi (Author), Markus Bambach (Author), Benjamin Klusemann (Author), 01.10.2025, in: Computational Particle Mechanics, 12, 5, p. 3721-3745, 25 p.

    Research output: Journal contributions > Journal articles > Research > peer-review

Courses

Ahmed Mohamed Fathy Mohamed Ahme Elbossily
Next appointment:
Tuesday, 2026-05-05 at 16:15
The course is structured into four modules: C++ Foundations, Object-Oriented Programming, Memory Management, and Multi-Threading. Additionally, there is a Capstone project at the end of the course that incorporates all the concepts learned.

Throughout the course, you will develop mini-programs that solve mechanical problems.
Next appointment:
Tuesday, 2026-04-28 at 10:15
Ahmed Mohamed Fathy Mohamed Ahme Elbossily
This course teaches students how to harness Graphics Processing Units (GPUs) through parallel programming. Programs traditionally execute on the CPU. In this course, students will learn how to run programs on a GPU and measure the resulting difference in processing speed. GPUs were originally used almost exclusively for video rendering and gaming; with general-purpose GPU programming, they can also speed up other data-parallel programs.

The C++ programming language will be used for creating GPU programs in this course. The first two modules will provide students with a strong foundation in C++ programming. Students will then apply their C++ knowledge to work with CUDA and execute programs on a GPU.

Each module ends with a project that reinforces the concepts covered. To keep the course general, these projects focus on image-processing tasks such as image coloring and blurring. After all, who doesn’t enjoy working with images?

Module 1: C++ Fundamentals
Topics:
* Basic syntax: data types, control flow, functions
* Containers: `std::vector`, `std::map`
* File I/O
* Introduction to OpenCV for image load/display
Project #1:
Build an image converter program: load a color image,
convert it to grayscale, and save the result.
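
The grayscale step of Project #1 can be sketched without any image library. This is a minimal illustration, not the course's reference solution: the function name `to_grayscale`, the interleaved RGB buffer layout, and the BT.601 luma weights are all assumptions made here. In the course itself, loading and saving would go through OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts an interleaved 8-bit RGB buffer
// (R,G,B,R,G,B,...) to one gray byte per pixel.
// The weights are the ITU-R BT.601 luma coefficients (an assumption;
// the course may use a different formula).
std::vector<std::uint8_t> to_grayscale(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0 && "expected 3 bytes per pixel");
    std::vector<std::uint8_t> gray(rgb.size() / 3);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const double r = rgb[3 * i + 0];
        const double g = rgb[3 * i + 1];
        const double b = rgb[3 * i + 2];
        // +0.5 rounds to nearest before the truncating cast.
        gray[i] = static_cast<std::uint8_t>(0.299 * r + 0.587 * g + 0.114 * b + 0.5);
    }
    return gray;
}
```

For example, `to_grayscale({255, 0, 0})` returns a single pixel with value 76, since pure red contributes only the 0.299 weight.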
-------------------------------------------------------------------------------------------
Module 2: Memory Management
Topics:
* Stack vs. heap memory model
* Dynamic allocation: `new/delete` and `malloc/free`
* Common pitfalls: memory leaks
Project #2:
Extend Project #1 into a mini “Image Editor” that applies
two operations—grayscale and Gaussian blur—with
user-specified parameters.
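
The Module 2 topics (heap allocation with `new`/`delete` and the risk of leaks) can be illustrated with a small pixel buffer that owns its memory. This is only a sketch under stated assumptions: the class name `ImageBuffer` and its interface are hypothetical and not taken from the course material.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: owns a heap-allocated grayscale pixel buffer.
class ImageBuffer {
public:
    ImageBuffer(std::size_t width, std::size_t height)
        : width_(width), height_(height),
          data_(new std::uint8_t[width * height]()) {}  // () zero-initializes

    // delete[] matches new[]; without this destructor the buffer would leak.
    ~ImageBuffer() { delete[] data_; }

    // A default copy would leave two objects deleting the same pointer
    // (double free), so copying is disabled in this sketch.
    ImageBuffer(const ImageBuffer&) = delete;
    ImageBuffer& operator=(const ImageBuffer&) = delete;

    // Row-major pixel access with a debug-time bounds check.
    std::uint8_t& at(std::size_t x, std::size_t y) {
        assert(x < width_ && y < height_);
        return data_[y * width_ + x];
    }

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

private:
    std::size_t width_;
    std::size_t height_;
    std::uint8_t* data_;
};
```

Tying the `delete[]` to the destructor means the memory is released whenever an `ImageBuffer` goes out of scope, which is one common way to avoid the leak pitfall listed above.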
-------------------------------------------------------------------------------------------
Module 3: GPU Programming with CUDA
Topics:
* CUDA programming model: kernels, threads, blocks, grids
* Host vs. device memory; data transfers
* Simple reduction operations on GPU
Project #3:
Port your blurring operation to CUDA. Load the image on
the host, copy it to device memory, launch a blur kernel,
copy back, and save. Benchmark CPU vs. GPU runtimes.
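
A CPU reference is useful both as the baseline for the benchmark and for checking the GPU output. The sketch below is plain C++, not CUDA, and rests on several assumptions: it uses a box blur as a simpler stand-in for the Gaussian blur, the names `blur_pixel` and `box_blur` are hypothetical, and the nested loops only imitate how a 2D grid of blocks and threads would map onto pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-pixel work: mean of the (2r+1)x(2r+1) neighbourhood.
// Neighbours outside the image are skipped.
// In a CUDA port, this body would run once per thread.
std::uint8_t blur_pixel(const std::vector<std::uint8_t>& in,
                        int width, int height, int x, int y, int radius) {
    int sum = 0;
    int count = 0;
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            const int nx = x + dx;
            const int ny = y + dy;
            if (nx >= 0 && nx < width && ny >= 0 && ny < height) {
                sum += in[static_cast<std::size_t>(ny) * width + nx];
                ++count;
            }
        }
    }
    return static_cast<std::uint8_t>(sum / count);
}

// CPU emulation of a 2D launch with square blocks of block_dim threads.
std::vector<std::uint8_t> box_blur(const std::vector<std::uint8_t>& in,
                                   int width, int height, int radius,
                                   int block_dim = 16) {
    assert(width > 0 && height > 0 && radius >= 0 && block_dim > 0);
    assert(in.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> out(in.size());
    const int grid_x = (width + block_dim - 1) / block_dim;   // ceiling division
    const int grid_y = (height + block_dim - 1) / block_dim;
    for (int by = 0; by < grid_y; ++by)
        for (int bx = 0; bx < grid_x; ++bx)
            for (int ty = 0; ty < block_dim; ++ty)
                for (int tx = 0; tx < block_dim; ++tx) {
                    // Mirrors blockIdx * blockDim + threadIdx in a kernel.
                    const int x = bx * block_dim + tx;
                    const int y = by * block_dim + ty;
                    if (x >= width || y >= height) continue;  // guard for partial blocks
                    out[static_cast<std::size_t>(y) * width + x] =
                        blur_pixel(in, width, height, x, y, radius);
                }
    return out;
}
```

The bounds guard matters whenever the image size is not a multiple of the block size, because the last row and column of blocks then contain threads with no pixel to process.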
-------------------------------------------------------------------------------------------
Module 4: GPU Hardware & Optimization
Topics:
* GPU architecture deep dive: SMs, warps
* Memory hierarchy: shared and global memory
* Coalesced memory accesses
Project #4:
Students will revisit the previous projects, applying their
knowledge of the GPU memory hierarchy and data transfers to
further speed up the programs.
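
Why coalesced access matters can be seen with a deliberately simplified model, written here in plain C++ rather than CUDA. It counts how many memory segments one warp touches when thread `t` reads element `t * stride`. The function name `segments_touched` and the 128-byte segment size are assumptions for illustration; real hardware behaviour depends on the GPU generation and cache configuration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Simplified model: number of distinct memory segments touched by one warp
// when thread t reads the element at index t * stride.
// warp_size = 32 matches NVIDIA warps; elem_bytes = 4 models a float;
// segment_bytes = 128 is an illustrative assumption.
std::size_t segments_touched(std::size_t stride,
                             std::size_t warp_size = 32,
                             std::size_t elem_bytes = 4,
                             std::size_t segment_bytes = 128) {
    assert(segment_bytes > 0);
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < warp_size; ++t) {
        const std::size_t byte_address = t * stride * elem_bytes;
        segments.insert(byte_address / segment_bytes);
    }
    return segments.size();
}
```

In this model, a stride of 1 (neighbouring threads read neighbouring floats) touches a single segment, while a stride of 32 (for example, walking down a column of a 32-wide row-major image) touches 32 segments, one per thread.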
Next appointment:
Wednesday, 2026-04-29 at 12:15