Digital Science Webinar: Science in the Cloud


As part of a continuing series, on Thursday 25th May we hosted a Digital Science thought leadership webinar on trends in cloud-based computing, the importance of failure in innovation, and how failure can lead to great science. We discussed the benefits of investing in cloud-based applications and infrastructure, and what industry leaders like Amazon and CERN are planning to develop over the next five years.

Our speakers included:

  • Host: Laura Wheeler – Head of Digital Communications, Digital Science
  • Steve Scott – Director of Portfolio Development, Digital Science
  • Brendan Bouffler – Global Manager, Amazon Web Services Research Cloud Program
  • Tim Bell – Group leader of the Computer and Monitoring group within the IT department, CERN
  • Dan Valen, Product Manager, Figshare

Laura Wheeler (@laurawheelers) kicked off by giving a brief overview of the esteemed panel and their backgrounds before handing over to Steve Scott, our first speaker. Steve started the webinar by providing an overview of the cloud and its various applications drawing on a rather tasty metaphor – making a pizza!

After walking the audience through the cloud-based infrastructures, Steve reflected on an important point: researchers now need a vastly different set of skills from those once required.

“Researchers of the past were not necessarily associated with the finer details of IT procurement – it wasn’t a hot topic for them. That’s changing. Scientists must now navigate the landscape of digital information management…One of the challenges for the cloud services is about training for researchers.”

Steve elaborated with some important points:

  • Physical hardware is on the way out! It takes a huge amount of time to install and can become redundant and out of date very quickly – with cloud-based services you rent the hardware and the deployment of services can be instantaneous.
  • Storing large amounts of data is a challenge. A USB drive can be lost or become corrupted easily. Lab notes taken in notebooks are perishable.

Quote from a lab at a national institute:

“We discovered a promising new way to run a major experiment we carried out seven years ago and needed to find the original datasets used. No-one could!”

Steve finished his presentation by talking about the collaborative nature of the cloud. Teams from across the globe are able to work on datasets easily and efficiently in real time!

Next up, we had Brendan “Boof” Bouffler, Global Manager at Amazon Web Services Research Cloud Program, who started with a funny anecdote about Jeff Bezos and the origins of Amazon.

“I think that when Jeff Bezos set the company up he may not have expected to have a research computing team inside the company but he got one! We actually got there because we discovered that in order to do this stuff [Amazon’s online shopping] we had to get very good at building big data centres. When you build big data centres, you start to find out what you can do with them…My team’s task is to make this crazy business applicable to the scientists and to the community.”

Brendan then talked about the fundamental steps in the scientific method and how the tools he helps create amplify these processes, allowing scientists to work better and more efficiently, asking more questions and getting quicker responses. It’s also important to mention that scientists hoping to process data are able to do so on a number of different platforms. If one doesn’t work, try another!

Image: Brendan Bouffler

Just like in all endeavors – failure is a chance to learn. Every time you carry out the scientific method, you get closer to finding out how the universe works.

Brendan referenced the pace at which technologies progress and how he envisages computing becoming a commodity that people use when they want, paying only for the time they use. Imagine a world, Brendan says, where every time we needed electricity to power our electronics we had to spin up a generator in our back gardens – apply this same logic to computing! Brendan spent the rest of his presentation commenting on what his team is focused on now and in the future. He also ran through the AWS platform. The possibilities that cloud computing offers industries like medicine are endless – it’s changing the way we cure disease!

Image: Brendan Bouffler

Brendan closed his presentation by pointing to the Amazon Web Services (AWS) Researcher’s Handbook he has helped create. The AWS Research Cloud Program helps you focus on science, not servers! Join the AWS Research Cloud Program and download the handbook here.

Next up, we had Tim Bell, Group leader of the Computer and Monitoring group within the IT department at CERN, who walked us through CERN’s current uses of cloud computing and its plans for the future.

But before that, Tim introduced The Large Hadron Collider!

Image Credit: Tim Bell

“Over the space of the past five years, we’ve been deploying a private cloud that allows the physicists to become familiar with the use of cloud technologies, and to be using those technologies in order to be driving their workloads through software rather than having people running around the computer centre with cables!”

Tim stated that it is very important that computing power keeps up with what the physics requires. He then ran through the ways in which CERN is working toward commercial opportunities by partnering with other cloud-based services such as Google and Amazon.

Image Credit: Tim Bell

It’s not just about infrastructure, though. Tim mentioned the software service solutions running on top of the CERN cloud. One example is the online repository Zenodo, which was created to help researchers at institutions of all sizes share results in a wide variety of formats across all fields of science. Tim also mentioned the CERN Open Data Portal, where anyone can access curated LHC data and become a physicist for a day! It is a useful tool that schools are using to teach pupils how professional scientists work.

Tim then talked about the challenges facing CERN today:

  • Purchasing cloud resources through public procurement is very hard.
  • Cost models in which accessing a data set carries a cost.
  • Data-intensive science.
  • The mix of skills required is changing over time. We now need contract managers rather than employees who change the disks and install the machines.
  • Security.
  • Scale – finding methods that CERN can scale.

Our final speaker was Dan Valen, Product Manager at Figshare, who focused his presentation on why Figshare chose to be a cloud-based platform and why it chose AWS as its cloud provider.

“Dealing with different file types has always been a challenge at Figshare. Taking on the tasks of handling large file uploads – specifically now that we can handle file uploads of up to five terabytes. To then build file previews to visualize them in the browser and ingest and expose the different files and metadata requires a solid infrastructure.”

Dan went on to list the important reasons why cloud-based data is vitally important to everyone from librarians and scientists to publishers and university administrators by drawing on a list of data storage horror stories.

There’s a real danger in storing items locally without a data preservation and archiving policy in place. Dan went on to state the key benefits of cloud-hosted services, including predictable cost over time and low maintenance. The most important benefit, however, is excellent security. AWS are able to offer the same types of security they offer to some of the world’s most data-sensitive organizations! The rest of Dan’s presentation ran through Figshare’s infrastructure and its storage workflows.

The webinar ended with a lively Q&A debate led by Laura Wheeler; great questions prompted great responses! Using #DSwebinar, our audience was able to interact with our panel, throwing their opinions into the mix. If you feel you still have something to say – we’re all ears! Tweet us @digitalsci using #DSwebinar.

The post Digital Science Webinar: Science in the Cloud appeared first on Digital Science.
