#asteroidWeAreComingAfterYou received a People's Choice Nomination.

THE CHALLENGE: Near Earth Objects Machine Learning

Solar System

For this challenge, we invite you to become "virtual contributors" to the Asteroid Grand Challenge and develop a hypothetical method, concept note or simple prototype that demonstrates how Machine Learning could be used to help us avoid the same fate as the dinosaurs.

Explanation

Given the challenges in tracking and characterizing Near Earth Objects (NEOs), we attempted to speed up characterization through machine learning. We analyzed the available NEO data and used supervised learning to establish correlations between the available parameters. This led to a method that determines the diameter of a known asteroid more accurately when its albedo is unknown.

Currently, the diameter of an asteroid is usually estimated empirically from the following equation [1]:

$$D = \frac{1329}{\sqrt{p}} \times 10^{-0.2H}$$

where

D is the diameter (km)

H is the absolute magnitude: the visual magnitude an observer would record if the asteroid were placed 1 AU from both the observer and the Sun, at zero phase angle.

p is the albedo: the ratio of the light reflected by a body to the light received by it.

However, in most cases the albedo of an asteroid is not known and only the absolute magnitude is available [2]. Scientists therefore assume an albedo in the range 0.05 to 0.25, which introduces high variability and uncertainty into the diameter estimate. For example, an asteroid with an absolute magnitude of 15.5 is considered to have a diameter of 2-5 km. We demonstrate that supervised learning methods can predict the diameter more accurately when albedo data is not available. For the same absolute magnitude of 15.5, the diameter range can be reduced to around 2.2-4.8 km.
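The 2-5 km range quoted for H = 15.5 can be checked with a short calculation. The sketch below assumes the standard magnitude-albedo relation D = 1329 / sqrt(p) × 10^(-0.2 H), with D in km; the function name `diameter_km` is ours and purely illustrative.

```python
import math


def diameter_km(h_mag: float, albedo: float) -> float:
    """Diameter in km from absolute magnitude H and albedo p.

    Assumes the standard relation D = 1329 / sqrt(p) * 10^(-0.2 * H).
    """
    return 1329.0 / math.sqrt(albedo) * 10 ** (-0.2 * h_mag)


h_mag = 15.5
# Sweep the albedo range that is usually assumed when no measurement exists.
for albedo in (0.05, 0.15, 0.25):
    print(f"H = {h_mag}, p = {albedo:.2f} -> D = {diameter_km(h_mag, albedo):.2f} km")
```

For H = 15.5 this gives roughly 4.7 km at p = 0.05 and 2.1 km at p = 0.25, which matches the 2-5 km spread and shows that the whole uncertainty comes from the albedo assumption.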

Our solution consists of a research method and a script that reads data from the Minor Planet Center (MPC) web service [3] and prepares training and testing sets consisting of the available absolute magnitude and orbital data. The training set is used to build a model that predicts diameter. The testing set is then used to validate the predictions and compute the mean squared error (MSE), which is compared with the MSE of diameters calculated with an assumed albedo for the same testing set. We tried three learning methods: linear regression, Lasso, and Bayesian linear regression. Bayesian linear regression determined asteroid diameters most accurately when albedo data was missing. Predictions from our model show a reduced MSE, which indicates improved accuracy. We believe that a larger dataset and a more sophisticated algorithm could further improve the accuracy of the diameter predictions.
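The train/test comparison described above can be sketched as follows. This is a minimal illustration and not our actual script: it uses a synthetic catalogue instead of MPC data, it invents a trend between albedo and semi-major axis so that an orbital feature carries information, it regresses log10(D) on H and the semi-major axis, and it computes the Bayesian linear regression posterior mean with fixed prior and noise precisions. All names and constants are assumptions. Only the standard library is used; a library implementation such as scikit-learn's `BayesianRidge` would also estimate the precisions from the data.

```python
import math
import random


def diameter_km(h_mag, albedo):
    """Assumed standard relation: D = 1329 / sqrt(p) * 10^(-0.2 * H), in km."""
    return 1329.0 / math.sqrt(albedo) * 10 ** (-0.2 * h_mag)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_bayesian_linear(rows, targets, prior_precision=1e-3, noise_precision=1.0):
    """Posterior mean of the weights: (alpha*I + beta*X^T X)^-1 * beta*X^T y."""
    k = len(rows[0])
    lhs = [[noise_precision * sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    for i in range(k):
        lhs[i][i] += prior_precision
    rhs = [noise_precision * sum(r[i] * t for r, t in zip(rows, targets))
           for i in range(k)]
    return solve(lhs, rhs)


def mse(pred, true):
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


# Synthetic catalogue (NOT real data): albedo falls with semi-major axis, plus noise.
rng = random.Random(0)
samples = []
for _ in range(400):
    h_mag = rng.uniform(10.0, 18.0)
    semi_major_au = rng.uniform(2.0, 3.5)
    log_albedo = -0.6 - 0.45 * (semi_major_au - 2.0) + rng.gauss(0.0, 0.05)
    samples.append((h_mag, semi_major_au, diameter_km(h_mag, 10 ** log_albedo)))

train, test = samples[:300], samples[300:]

# Fit log10(D) as a linear function of [1, H, a] on the training set only.
weights = fit_bayesian_linear([[1.0, h, a] for h, a, _ in train],
                              [math.log10(d) for _, _, d in train])

# Evaluate on the held-out set, in km, against fixed-albedo baselines.
true_d = [d for _, _, d in test]
model_d = [10 ** (weights[0] + weights[1] * h + weights[2] * a) for h, a, _ in test]
model_mse = mse(model_d, true_d)
baseline_mse = {p: mse([diameter_km(h, p) for h, _, _ in test], true_d)
                for p in (0.05, 0.15, 0.25)}

for p, value in baseline_mse.items():
    print(f"MSE with assumed albedo {p:.2f}: {value:.3f}")
print(f"MSE with Bayesian linear regression: {model_mse:.3f}")
```

The numbers printed by this sketch depend entirely on the invented albedo trend and say nothing about real asteroids. The point is the structure: fit on the training set, predict on the held-out set, and compare the model's MSE with each fixed-albedo baseline on the same objects.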

Figure 1a shows a scatter plot of real diameter vs. calculated diameter (with assumed albedo). Depending on the albedo assumed, the range of calculated values is large and the accuracy is low. Machine learning algorithms can use the additional available data to narrow the predictions and improve accuracy. Figure 1b shows the values predicted with Bayesian linear regression vs. the real diameters, indicating a possible improvement in accuracy for determining NEO diameter.

| Method | MSE |
|---|---|
| Diameter calculation with albedo = 0.05 | 27.169 |
| Diameter calculation with albedo = 0.15 | 2.641 |
| Diameter calculation with albedo = 0.25 | 6.691 |
| Predicted diameter using Bayesian Linear Regression | 1.859 |
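To put the MSE figures above in relative terms, the short calculation below computes the percentage reduction in MSE achieved by the Bayesian linear regression model against each fixed-albedo calculation. The variable names are illustrative; the numbers are the four MSE values reported above.

```python
# MSE of the fixed-albedo diameter calculations, keyed by assumed albedo.
baseline_mse = {0.05: 27.169, 0.15: 2.641, 0.25: 6.691}
# MSE of the diameters predicted with Bayesian linear regression.
model_mse = 1.859

# Relative reduction in MSE, in percent, against each baseline.
reduction_pct = {p: 100.0 * (1.0 - model_mse / m) for p, m in baseline_mse.items()}

for p, r in reduction_pct.items():
    print(f"vs albedo = {p:.2f}: MSE reduced by {r:.1f}%")
```

The reduction is about 93% against albedo = 0.05, about 30% against albedo = 0.15, and about 72% against albedo = 0.25, so the model improves even on the best single albedo assumption for this testing set.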

In the future, we believe this work could be made more accurate by using more sophisticated machine learning techniques and larger datasets from more sources. This could significantly improve predictions of asteroid parameters.

Resources Used

References:

[1] http://arxiv.org/pdf/1109.4096v1.pdf

[2] http://neo.jpl.nasa.gov/glossary/h.html

[3] http://minorplanetcenter.net/web_service

Feature image taken from: