High-Performance Reservoir Simulations on Modern CPU-GPU Computational Platforms
| Content Provider | Semantic Scholar |
|---|---|
| Author | Bogachev, Kirill; Milyutin, Sergey; Telishev, Alexey; Nazarov, Volodymyr; Shelkov, V. G.; Eydinov, Dmitry; Malinur, O.; Hinneh |
| Copyright Year | 2018 |
| Abstract | With the growing complexity and grid resolution of numerical reservoir models, the computational performance of simulators becomes essential for field development planning. To achieve optimal performance, reservoir modeling software must be aligned with modern trends in high-performance computing and support the new architectural hardware features that have become available over the last decade. A combination of factors, such as more economical parallel computing platforms on the hardware market and software that utilizes them efficiently for optimal parallel performance, helps to reduce the computation time of reservoir models significantly. This removes the perennial problem of lacking the time and resources to run the required number of cases, which in turn makes field development decisions more reliable and balanced. In the past, advances in simulation performance were limited by the memory bandwidth of CPU-based computer systems. During the last several years, a new generation of graphical processing units (GPUs) became available for high-performance computing with floating-point operations, making them a very suitable platform for reservoir simulation. The GPUs currently on the market offer significantly higher memory bandwidth than CPUs, and thousands of computational cores that can be utilized efficiently for simulations. In this paper, we present results of running a full-physics reservoir simulator on a CPU+GPU platform and provide detailed benchmark results based on hundreds of real field models. The technology proposed in this paper demonstrates a significant speed-up, especially for models with a large number of active grid blocks. In addition to the benchmark results, we analyze specific hardware options available on the market, with their pros and cons, and give some recommendations on the value for money of various options applied to reservoir simulation. |
AAPG ICE 2018: <3005448>

Introduction

In 1965 Gordon Moore predicted that the number of transistors in a dense integrated circuit would double every two years [1]. This is referred to as Moore's law, and it has held true over the last 50 years. In recent years this growth has mostly been achieved by increasing the number of CPU cores integrated within one shared-memory chip. The number of cores available on CPUs has grown from 1-2 in 2005 to 20+ and keeps growing every year [2]. In practice, multicore supercomputers are no longer exclusive: they are available to everyone in the shape of laptops or desktop workstations with large amounts of memory and high-performance computing capabilities. There has also been significant improvement in cluster computing, which only 10-15 years ago was very expensive and required special infrastructure, costly cooling systems, and significant support effort. These days, clusters have also become economical and easy-to-use machines. It has been demonstrated that reservoir simulation time can be scaled efficiently on modern CPU-based workstations and clusters, provided the simulation software is implemented properly to support the features of modern hardware architecture [3,4]. It is fair to conclude that simulation time is no longer a principal constraint for projects, as it can always be reduced by adding computational power. In addition to in-house hardware, one can also consider simulation resources available in the cloud on a pay-per-use model [5]. This makes reservoir modeling solutions much more scalable in terms of time, money, and human and computational resources. Reservoir engineers from both large and small companies can access nearly unlimited simulation resources and reduce simulation time to a minimum whenever required. There is another hardware technology that is progressing rapidly: graphical processing units (GPUs).
Modern GPUs have thousands of cores suitable for high-performance parallel simulations. However, it is not only the large number of cores that makes GPUs particularly attractive for this purpose. One of the most critical parameters for parallel simulation is memory bandwidth, which is effectively the speed of communication between the computational cores. Fig. 1 shows a typical breakdown of the time spent computing a reservoir model. A significant part of it is the linear solver (60% in this case), and this task is the most demanding with respect to memory bandwidth due to the large volume of communication. That is why reservoir simulation is often considered a problem of the so-called 'memory-bound' type. If we compare the memory bandwidth of the CPU and GPU platforms available today (Fig. 2), the difference is about one order of magnitude: roughly 100 GB/s versus 1000 GB/s, respectively. Moreover, GPU progress has been very rapid over the last several years, and the trend clearly shows that the gap in memory bandwidth keeps growing [6]. This means that GPU platforms will become more and more efficient for parallel computing, and for reservoir simulation in particular.

Figure 1: Memory bound problems in reservoir simulations
Figure 2: Floating point operations and memory bandwidth progress, GPU vs. CPU

Hybrid CPU-GPU Approach

There have been a few recent developments applying GPUs to reservoir simulation, e.g. [7,8,9]. The typical approach is to adapt the whole simulation process to the GPU. In this work, we use a hybrid approach in which the workload is distributed between CPU and GPU in an efficient manner. This method was first introduced in [12], and this paper is an extension of that work.
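A rough illustration of why the solver's share of runtime matters when only part of the work is offloaded is given by Amdahl's law. In the sketch below, the 60% solver share is taken from Fig. 1; the 10x local solver speed-up is an assumed, hypothetical figure, not a measured one:

```python
# Amdahl's law: overall speed-up when a fraction f of the runtime is
# accelerated locally by a factor s, and the remaining (1 - f) is unchanged.
def amdahl_speedup(f, s):
    return 1.0 / ((1.0 - f) + f / s)

# Solver at ~60% of runtime (Fig. 1), hypothetical 10x GPU solver:
print(round(amdahl_speedup(0.60, 10.0), 2))  # → 2.17

# When the solver dominates (~90% of runtime), the bound is much higher:
print(round(amdahl_speedup(0.90, 10.0), 2))  # → 5.26
```

These bounds are consistent in spirit with the 2-3x and ~6x accelerations reported later in the paper for typical models and for SPE 10, respectively.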
The idea of the proposed technology is to delegate the most computationally intensive parts of the job to the GPU and leave everything else on the CPU side. In [12], the part of the job assigned to the GPU was limited to the linear solver. This worked very well for mid-size and large problems (say, 1 million active blocks or more) and provided many-fold acceleration compared to conventional CPU-based simulators; in many cases a 2-3x acceleration was observed. In specific cases like the SPE 10 model [10], where nearly 90% of the time is spent in the linear solver due to the extreme contrast in the reservoir properties, the speed-up from adding the GPU was almost 6x (Fig. 3). In this work, we extend the approach to compositional models and also include EOS flash calculations in the part of the computation run on the GPU.

Flash calculations for compositional models

For compositional models, the parameters of phase equilibrium of the hydrocarbon mixture must be identified. This part can take as much as 10-20% of the total modeling time for real cases. As opposed to linear equations with sparse matrices, the standard solution methods for flash calculations [13,14] are bound primarily by double-precision computation speed, not by memory bandwidth. In addition, the algorithm for phase equilibrium calculations can be naturally vectorized as a sequence of iterations over sets of grid blocks, where the calculations for each block are independent of all other grid blocks. To achieve maximum computational efficiency on CPUs, we can move from single-block calculations to the embedded CPU instructions SSE/AVX [15], which operate on vectors of parameters for multiple grid blocks at the same time. Flash calculations in fluid flow models can also be carried out efficiently on graphical cards, especially for cases with large areas where both oil and gas phases are present.
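To make the "independent per grid block" point concrete, here is a minimal sketch of one common building block of flash calculations: the Rachford-Rice equation for the vapor fraction, solved by bisection for a batch of grid blocks. The compositions and K-values are made-up illustrative numbers; production simulators use the full EOS-based methods of [13,14], vectorized with SSE/AVX or GPU kernels:

```python
# Rachford-Rice: solve sum_i z_i*(K_i - 1) / (1 + V*(K_i - 1)) = 0 for the
# vapor fraction V in (0, 1). f(V) is monotonically decreasing, so bisection
# is safe whenever the block is in the two-phase region (f(0) > 0 > f(1)).
def vapor_fraction(z, K, tol=1e-12):
    def f(V):
        return sum(zi * (Ki - 1.0) / (1.0 + V * (Ki - 1.0))
                   for zi, Ki in zip(z, K))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Each grid block is independent of the others, so this loop is exactly what
# SSE/AVX or a GPU kernel processes as one vectorized batch (made-up data):
blocks = [([0.5, 0.5], [2.0, 0.5]), ([0.3, 0.7], [3.0, 0.4])]
fractions = [vapor_fraction(z, K) for z, K in blocks]
print(round(fractions[0], 6))  # → 0.5 for this symmetric two-component case
```

The per-block independence is what makes the workload compute-bound rather than memory-bound, and hence a good fit for wide vector units.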
A comparison of the simulation performance of this method is discussed in the results section.

Figure 3: SPE 10 model simulation performance on a laptop with and without GPU, and on a dual-CPU workstation commonly used for reservoir simulations.

The workload distribution between CPUs and GPUs can be summarized as follows (Fig. 4). The GPU is assigned the most computationally intensive parts of the simulation, such as the linear solver and the flash calculations, and everything else remains on the CPU side. Communication between CPUs and GPUs goes via PCI Express slots. In this example, a dual-socket CPU system is equipped with multiple graphical cards, and each CPU communicates with its corresponding GPUs. The multiple graphical cards communicate directly via fast interconnect channels. This is the most general scheme; in practice, most systems used for reservoir simulation consist of one or two GPUs connected to one or two PCIe slots, respectively. In this paper, we only consider examples with single- or dual-GPU systems and a single GPU per PCIe slot.

Hardware Options Available

Table 1 presents some hardware alternatives currently available on the market. CPU models are shown in the upper part of the table, and the key GPU models are listed at the bottom. We also provide pricing estimates for the reader's information. Note that prices vary by region and change over time, so the information here is a rough approximation, but it gives a fair picture of the cost level and the price ratio between the different options. Several parameters are fundamentally different between CPUs and GPUs:

▪ Memory bandwidth. As discussed above, this parameter is one of the most critical for parallel computational performance in reservoir simulation.
As one can see from the table, graphical processors offer a significantly better solution here, including the basic (GTX) models for the mass market, which can be purchased at a very reasonable cost. The more advanced and expensive Tesla cards have higher bandwidth. However, in our observations, GTX processors work very well in many cases and achieve almost the same level of computational performance as the more expensive
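The bandwidth advantage discussed above can be turned into an attainable-performance bound with a simple roofline estimate. The sketch below uses the ~100 GB/s vs. ~1000 GB/s figures from the text; the peak-FLOP numbers and the sparse matrix-vector (SpMV) traffic estimate are hypothetical round numbers, not vendor specifications:

```python
# Roofline model: attainable GFLOP/s = min(peak compute, bandwidth * AI),
# where AI is arithmetic intensity in FLOPs per byte of memory traffic.
def attainable_gflops(peak_gflops, bandwidth_gbs, ai):
    return min(peak_gflops, bandwidth_gbs * ai)

# CSR SpMV, the core operation of the linear solver: ~2 FLOPs per nonzero
# against ~12 bytes of traffic (8-byte value + 4-byte column index).
AI_SPMV = 2.0 / 12.0

cpu = attainable_gflops(1000.0, 100.0, AI_SPMV)   # bandwidth-limited
gpu = attainable_gflops(7000.0, 1000.0, AI_SPMV)  # still bandwidth-limited
print(round(gpu / cpu, 1))  # → 10.0
```

At such low arithmetic intensity, both platforms sit on the bandwidth-limited slope of the roofline, so the attainable speed-up tracks the memory-bandwidth ratio rather than the peak-FLOP ratio; this is the quantitative content of the "memory-bound" label used above.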
| File Format | PDF, HTM/HTML |
| Alternate Webpage(s) | https://rfdyn.com/wp-content/uploads/2018/11/2018-AAPG-ICE-3005448-.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |