HPCS2016 Keynotes

The International Conference on High Performance Computing & Simulation

(HPCS 2016)

The 14th Annual Meeting

July 18 – 22, 2016

Innsbruck, Austria

http://hpcs2016.cisedu.info or http://cisedu.us/rp/hpcs16

HPCS 2016 KEYNOTES

Tuesday Keynote: Big Data Analysis in European Clouds:

The Challenges for Life Sciences

Steven J. Newhouse

Head of Technical Services, European Bioinformatics Institute

European Molecular Biology Laboratory, U.K.

NOTES (See file below. Forthcoming)

Wednesday Keynote: PCubeS/IT - A Type Architecture and Portable

Parallel Language for Hierarchical Parallel Machines

Andrew S. Grimshaw

Department of Computer Science, University of Virginia, VA, USA

Co-architect, XSEDE

NOTES (See file below. Forthcoming)

Thursday Keynote: e-Research & the Art of Linking Astrophysics & Deforestation,

via Smartening Energy Systems and Detecting Energy Theft

David Wallom

Oxford e-Research Centre,

University of Oxford, U.K.

NOTES (See file below. Forthcoming)

Thursday Keynote II

& Closing Plenary: Big Data Analysis Made Scalable on Clouds

Domenico Talia

University of Calabria & DtoK Lab, Italy

NOTES (See file below)

______________________________________________________________________

Tuesday Keynote: Big Data Analysis in European Clouds:

The Challenges for Life Sciences

Steven J. Newhouse

Head of Technical Services, European Bioinformatics Institute

European Molecular Biology Laboratory, U.K.

ABSTRACT

The life sciences community is one of many science domains having to deal with the challenges of big data analysis. Unlike many other research areas, life sciences data can contain significant personal information (e.g. medical records), and the consequences of any data analysis can have profound health implications for the individual (e.g. a cancer diagnosis). It is therefore not surprising that such data is highly regulated, is frequently kept within the organization in which it is collected (e.g. a hospital), and has its external network access highly controlled. Any platform for analyzing life science data therefore has to reflect the complexity of the underlying data.

ELIXIR, a Europe-wide research infrastructure for the life sciences, is building a compute platform to support such data analysis activities. Building upon previous work in the e-Infrastructure/cyber-infrastructure communities, the compute platform has been working to establish a federated cloud infrastructure to support both flagship science use cases and research activities coming from the ‘long tail’.

The presentation will review some of the early results of the ELIXIR-EXCELERATE project and how this work is being aligned with both institutional activities at the European Bioinformatics Institute and, more broadly, the agenda of the European Cloud Initiative, in order to support the research challenges coming from leading European life science researchers.

______________________________________________________________________

Wednesday Keynote: PCubeS/IT - A Type Architecture and Portable

Parallel Language for Hierarchical Parallel Machines

Andrew S. Grimshaw

Department of Computer Science, University of Virginia, VA, USA

Co-architect, XSEDE

ABSTRACT

Writing portable parallel applications remains a challenge, particularly in the presence of increasingly heterogeneous and deep node architectures. Achieving good performance is especially challenging for novice parallel programmers who lack experience optimizing performance on different types of hardware. Snyder, in his seminal work “Type Architectures, Shared Memory, and the Corollary of Modest Potential,” argues that the salient features of an architecture must be reflected in the programming language. The PCubeS type architecture represents a parallel machine as a finite hierarchy of parallel processing spaces, each having fixed (possibly zero) compute and memory capacities and containing a finite set of uniform, independent sub-spaces.

The IT language is a PCubeS language in which computations are defined to take place in a corresponding hierarchy of logical processing spaces, each of which may impose a different partitioning of data structures. The programmer is responsible for decomposing the problem into multiple spaces, selecting the best decomposition of variables in each space, and mapping the logical IT spaces to the physical spaces of the target machine. Only the last step, mapping logical to physical spaces, differs for each target machine; the rest of the IT program remains the same across targets.
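The separation described above can be sketched conceptually. The following is an illustrative Python sketch, not IT syntax (which is not shown in this abstract): a two-level hierarchy of logical spaces, each imposing its own partitioning of a data array, with the machine-specific logical-to-physical mapping isolated as a final, separate step. All names here are hypothetical.

```python
# Illustrative sketch (not IT syntax): a two-level hierarchy of logical
# processing spaces, each imposing its own partitioning of the data.

def partition(data, parts):
    """Split a list into `parts` contiguous blocks (one space's decomposition)."""
    size = (len(data) + parts - 1) // parts
    return [data[i * size:(i + 1) * size] for i in range(parts)]

# Space A: coarse partitioning, e.g. across hosts.
space_a = partition(list(range(16)), 2)

# Space B: each Space A block is further partitioned, e.g. across cores.
space_b = [partition(block, 4) for block in space_a]

# Mapping logical spaces to physical spaces is the only machine-specific
# step: here, a trivial map from (host, core) indices to sub-partitions.
physical_map = {(h, c): space_b[h][c]
                for h in range(len(space_b))
                for c in range(len(space_b[h]))}

print(physical_map[(0, 0)])  # first core of first host gets [0, 1]
```

Changing the target machine would only change how `physical_map` is built; the decomposition into spaces stays the same, mirroring the portability claim above.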

The IT compiler and run-time system are responsible for breaking up and executing the code for each logical space on the specified hardware space, and for managing all communication and synchronization between partitions within a space and between different logical spaces. Two IT compilers have been completed: a multicore compiler and a distributed memory/multicore compiler that uses MPI as the inter-host communication mechanism. A third compiler, which adds GPGPU support to the distributed memory/multicore compiler, is in development.

In this talk I will briefly review what makes efficient parallel programming difficult, and why the problem is only getting worse as machines are developed with ever deeper memory hierarchies and greater node heterogeneity. I will then introduce the PCubeS type architecture and the IT programming language as a mechanism for addressing the efficient, portable parallel programming problem, illustrated via a series of sample programs. Finally, I will present performance results for several application kernels on multicore, distributed memory/multicore, and distributed memory/GPGPU platforms.

______________________________________________________________________

Thursday Keynote: e-Research & the Art of Linking Astrophysics & Deforestation,

via Smartening Energy Systems and Detecting Energy Theft

David Wallom

Oxford e-Research Centre,

University of Oxford, U.K.

ABSTRACT

The core activity in e-Research is the application of ICT tools and technologies to solve a problem in a specific area of research. Maximizing the value of the insights, tools, or services developed in this way requires reusing these products, either within the same area or in others that share the application design model. As an interdisciplinary domain, e-Research is uniquely positioned to transcend domain boundaries and cherry-pick the best ideas and implementations from solutions in any number of areas to solve complex and rare problems. This talk will show how, building on a solution originally developed to support radio astronomy, the Energy & Environmental ICT group at OeRC was able to reuse that work to solve new problems, first in energy systems, then in smart metering and energy theft detection, and finally in supply chain analysis. At each step we will discuss the particular challenges of switching between domains while clearly identifying the problems solved.

______________________________________________________________________

Thursday Keynote II

& Closing Plenary: Big Data Analysis Made Scalable on Clouds

Domenico Talia

University of Calabria & DtoK Lab, Italy

ABSTRACT

The size of data stored in digital repositories is going to increase beyond any previous estimate, and data stores and sources are becoming ever more pervasive and distributed. Professionals and scientists need advanced data analysis tools and services, coupled with scalable architectures, to support the extraction of useful information from big data repositories. Cloud computing systems offer effective support for addressing both the computational and data storage needs of big data mining and parallel knowledge discovery applications. In fact, complex data mining tasks involve data- and compute-intensive algorithms that require large, efficient storage facilities together with high-performance processors to obtain results in acceptable time.

In this talk we introduce the topic and the main research issues in the area of cloud-based data analysis. We discuss how to make data mining services scalable and present a Data Mining Cloud Framework designed for developing and executing distributed data analytics applications as workflows of services. In this environment, data sets, analysis tools, data mining algorithms, and knowledge models are implemented as single services that can be combined, through a visual programming interface, into distributed workflows to be executed on clouds. The main features of the programming interface are described, and a performance evaluation of knowledge discovery applications is reported.
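The workflow-of-services idea can be sketched in miniature. The following Python sketch is purely illustrative: the class and method names are hypothetical and are not the actual Data Mining Cloud Framework API, which the abstract describes only at a high level.

```python
# Hypothetical sketch of composing single services (data sets, tools,
# mining algorithms) into a workflow; names are illustrative only.

class ServiceNode:
    """A workflow node wrapping one service; nodes chain into a pipeline."""
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.downstream = []

    def then(self, node):
        """Connect this node's output to another service node."""
        self.downstream.append(node)
        return node

    def run(self, data):
        result = self.fn(data)
        for node in self.downstream:
            result = node.run(result)
        return result

# A toy workflow: load a data set -> filter it -> mine a simple model.
load = ServiceNode("load",   lambda _: [("a", 1), ("b", 2), ("a", 3)])
filt = ServiceNode("filter", lambda rows: [r for r in rows if r[1] > 1])
mine = ServiceNode("mine",   lambda rows: {k: v for k, v in rows})

load.then(filt).then(mine)
model = load.run(None)
print(model)  # {'b': 2, 'a': 3}
```

In the framework described above, such a graph would be assembled visually rather than in code, and each node would run as a distributed cloud service rather than a local function.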