What can machine learning reveal about the solid Earth?

Posted: March 24, 2019 by oldbrew in data, Earthquakes, Geology, methodology, research, volcanos

The ability to recognize patterns in Earth’s behaviour by sifting through masses of geological data could be programmed into machines.

Scientists seeking to understand Earth’s inner clockwork have deployed armies of sensors listening for signs of slips, rumbles, exhales and other disturbances emanating from the planet’s deepest faults to its tallest volcanoes.

“We measure the motion of the ground continuously, typically collecting 100 samples per second at hundreds to thousands of instruments,” said Stanford geophysicist Gregory Beroza. “It’s just a huge flux of data.”

Yet scientists’ ability to extract meaning from this information has not kept pace, reports Phys.org.

The solid Earth, the oceans and the atmosphere together form a geosystem in which physical, biological and chemical processes interact on scales ranging from milliseconds to billions of years, and from the size of a single atom to that of an entire planet.

“All these things are coupled at some level,” explained Beroza, the Wayne Loel Professor in the School of Earth, Energy & Environmental Sciences (Stanford Earth). “We don’t understand the individual systems, and we don’t understand their relationships with one another.”

Now, as Beroza and co-authors outline in a paper published March 21 in the journal Science, machine-learning algorithms are helping scientists answer persistent questions about how the Earth works. These algorithms are trained to explore the structure of ever-expanding geologic data streams, build upon observations as they go, and make sense of increasingly complex, sprawling simulations.

“When I started collaborating with geoscientists five years ago, there was interest and curiosity around machine learning and data science,” recalled Karianne Bergen, lead author on the paper and a researcher at the Harvard Data Science Initiative who earned her doctorate in computational and mathematical engineering from Stanford. “But the community of researchers using machine learning for geoscience applications was relatively small.”

That’s changing rapidly. The most straightforward applications of machine learning in Earth science automate repetitive tasks like categorizing volcanic ash particles and identifying the spike in a set of seismic wiggles that indicates the start of an earthquake.

This type of machine learning is similar to applications in other fields that might train an algorithm to detect cancer in medical images based on a set of examples labeled by a physician.

More advanced algorithms unlocking new discoveries in Earth science and beyond can begin to recognize patterns without working from known examples.

“Suppose we develop an earthquake detector based on known earthquakes. It’s going to find earthquakes that look like known earthquakes,” Beroza explained. “It would be much more exciting to find earthquakes that don’t look like known earthquakes.”

Beroza and colleagues at Stanford have been able to do just that by using an algorithm that flags any repeating signature in the sets of wiggles picked up by seismographs – the instruments that record shaking from earthquakes – rather than hunting for only the patterns created by earthquakes that scientists have previously catalogued.
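The idea of flagging any repeating signature, without a catalogue of known events, can be sketched in a few lines. This is a minimal illustration that assumes a brute-force normalized correlation between every pair of sliding windows; it is not the Stanford group's algorithm, and the function name `find_repeating_windows`, the window length, the threshold and the synthetic trace are all made up for the example.

```python
import numpy as np

def find_repeating_windows(trace, win, step, threshold=0.9):
    """Return (start_a, start_b, cc) for pairs of non-overlapping windows
    whose normalized correlation coefficient is at least `threshold`."""
    starts = np.arange(0, len(trace) - win + 1, step)
    windows = np.stack([trace[s:s + win] for s in starts]).astype(float)
    # Remove each window's mean and scale it to unit length, so the dot
    # product of two windows is their correlation coefficient.
    windows -= windows.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(windows, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    windows /= norms
    cc = windows @ windows.T
    pairs = []
    for a in range(len(starts)):
        for b in range(a + 1, len(starts)):
            # Skip overlapping windows: they are trivially similar.
            if starts[b] - starts[a] >= win and cc[a, b] >= threshold:
                pairs.append((int(starts[a]), int(starts[b]), float(cc[a, b])))
    return pairs

# Synthetic "seismogram": weak noise plus the same (hypothetical) wavelet
# buried at two different times.
rng = np.random.default_rng(0)
trace = 0.05 * rng.standard_normal(2000)
wavelet = np.sin(2 * np.pi * np.arange(100) / 20) * np.hanning(100)
trace[300:400] += wavelet
trace[1500:1600] += wavelet

pairs = find_repeating_windows(trace, win=100, step=50)
for start_a, start_b, cc in pairs:
    print(f"samples {start_a} and {start_b} look alike (cc = {cc:.2f})")
```

Nothing in the code says what an earthquake looks like; it only reports that two stretches of the record resemble each other. Comparing every pair of windows scales quadratically with record length, so a real system working on years of continuous data would need a much faster similarity search.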

Both types of algorithms – those with explicit labeling in the training data and those without – can be structured as deep neural networks, which act like a many-layered system in which the result of some transformation of data in one layer serves as the input for a new computation in the next layer.
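The layered structure described above can be illustrated with a small forward pass. This is only a sketch under stated assumptions: the layer sizes, the random (untrained) weights and the ReLU transformation are arbitrary choices for illustration, not details taken from the paper.

```python
import numpy as np

def forward(x, layers):
    """Pass x through each (weights, bias) layer in turn and return the
    output of every layer, starting with the raw input."""
    activations = [x]
    for weights, bias in layers:
        # The output of one layer becomes the input of the next.
        x = np.maximum(0.0, x @ weights + bias)  # ReLU transformation
        activations.append(x)
    return activations

rng = np.random.default_rng(0)
sizes = [8, 16, 16, 2]  # hypothetical: 8 input features, 2 outputs
layers = [
    (0.1 * rng.standard_normal((n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

batch = rng.standard_normal((4, 8))  # 4 made-up input examples
activations = forward(batch, layers)
print([a.shape for a in activations])
```

Training, with or without labels, would consist of adjusting the weights and biases; the chaining of layers shown here stays the same either way.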

Among other efforts noted in the paper, these types of networks have allowed geoscientists to quickly compute the speed of seismic waves – a critical calculation for estimating earthquake arrival times – and to distinguish shaking caused by Earth’s natural motion from shaking caused by explosions.

Full article here.

  1. ivan says:

    Why do I get the feeling that this is the beginning of the next great ‘scare the people’ scam?
    Use of computer models – check.
    Potential for possible great harm – check.
    Only those in the know can interpret the results – check.
    Produces reasons for shutting down fracking – check.
    Produces reasons to stop oil drilling – check.
    Has the potential for real scary headlines – check.

    If the climatologists can’t get the results, the seismologists are the next to try.

    I hope I’m wrong but the potential is there.

  2. Gamecock says:

    Agreed, Ivan.

    Many years ago, when I worked in the state pollution control department, we could measure some things down to parts per million. Like dissolved oxygen in ppm to one decimal place. Years later, when I heard that some things were being measured down to parts per billion, I thought no good would come of it. I still think that is too small to be relevant. Maybe there are some special cases. So we get people wailing about ‘pollution’ in parts per billion, and it’s needless hysteria.

    As in, the toxicity is in the dose. The writer assumes something of value will come from machine ‘learning’ of the data.

    “It would be much more exciting to find earthquakes that don’t look like known earthquakes.”

    Assuming they don’t throw data away, when they get an earthquake (or ‘earthquakes that don’t look like known earthquakes,’ whatever that means), they can examine the data leading up to it, and then catalog it as a ‘known’ earthquake.

    ‘The ability to recognize patterns in Earth’s behaviour by sifting through masses of geological data could be programmed into machines.’

    That is using computers to analyze data. We’ve been doing it for generations.

    ‘That’s changing rapidly. The most straightforward applications of machine learning in Earth science automate repetitive tasks like categorizing volcanic ash particles and identifying the spike in a set of seismic wiggles that indicates the start of an earthquake.’

    That’s not ‘machine learning.’ It’s basic computerization.

  3. oldbrew says:

    Going beyond routine algorithms…

    Deep learning

    Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.

    Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to and in some cases superior to human experts.


  4. oldmanK says:

    Machine learning is like going out to hunt elephants. This is how it works; choose your method.


  5. oldbrew says:

    Obscure headline of the day…

    Algorithm designs optimized machine-learning models up to 200 times faster than traditional methods
    by Rob Matheson, Massachusetts Institute of Technology
    MARCH 21, 2019

    In a paper being presented at the International Conference on Learning Representations in May, MIT researchers describe a neural architecture search (NAS) algorithm that can directly learn specialized convolutional neural networks (CNNs) for target hardware platforms, when run on a massive image dataset, in only 200 GPU hours, which could enable far broader use of these types of algorithms.

    Resource-strapped researchers and companies could benefit from the time- and cost-saving algorithm, the researchers say. The broad goal is “to democratize AI,” says co-author Song Han.

    – – –
    Gamecock says: That’s not ‘machine learning.’ It’s basic computerization.

    But it’s not basic any more. The latest advances should mean much more can be done within the resource and/or time limit(s), maybe even things that were not feasible at all until now.

  6. stpaulchuck says:

    As a computer science major (BSCS) I entered the world of ‘machine learning’ about a year ago. I have been reading all the research papers I could get a hold of on the subject and then started building my own models and morphing them along the way as others gained traction on some nasty problems in time series data – the holy grail of statistics and machine learning.

    As for deep learning architectures, most of them are a waste of computer time. Look up ‘backpropagation’, which is the key to self-training architectures. It is basically a set of correction factors fed back into the layers to converge the prediction values with the real results from a training set. It has been found that almost all the correction takes place in the layer next to the output layer. Each layer further back gets less and less correction, leaving those layers bound to their initial parameter values – useless.

    Having said that, there are a couple newer layouts that seem to hold promise, echo state networks and reservoir networks. I am about to rejoin the fray and work with these layouts. One paper shows success with Mackey-Glass generated data which looks like stochastic time series data but does have rhythm/pattern to it. Completely stochastic time series data is still out of reach though.

    Which brings me to this article. It is likely they WILL find patterns in this data, but whether they are mere artifacts of the data or actually useful detections of Earth movement events is an open question. Much like Mackey-Glass, I am sure there are chained events that form patterns. Unfortunately, I believe that too many of the chains are too long or have too many connections for a pattern to be easily detected. The beating of butterfly wings. I do wish them good luck though. Perhaps they’ll figure out an architecture that will help my research. 😉

  7. Gamecock says:

    “It would be much more exciting to find earthquakes that don’t look like known earthquakes.”

    You codify earthquakes. You use the computer to look at the data for them. Finding ‘earthquakes that don’t look like known earthquakes’ is a bug, not a feature.

  8. Oldbrew, I have lost Roger’s email. I was going to send you, through him, this link to a YouTube video on a subject about which I made a comment a few posts down.

    It is fairly long but explains with experiments and knowledge of MRI (used in medical scanning) why much of the money spent on space research is a waste of money and time. The presenter clearly states that there are no black holes and no big bang. Also, most stars (including the sun) do not have a gas plasma core but have a solid or liquid core and none of these stars are black bodies.