『Abstract
Assuming a study region in which each cell has associated with
it an N-dimensional vector of values corresponding to N predictor
variables, one means of predicting the potential of some cell
to host mineralization is to estimate, on the basis of historical
data, a probability density function that describes the distribution
of vectors for cells known to contain deposits. This density estimate
can then be employed to predict the mineralization likelihood
of other cells in the study region. However, owing to the curse
of dimensionality, estimating densities in high-dimensional input
spaces is exceedingly difficult, and conventional statistical
approaches often break down. This article describes an alternative
approach to estimating densities. Inspired by recent work in the
area of similarity-based learning, in which input takes the form
of a matrix of pairwise similarities between training points,
we show how the density of a set of mineralized training examples
can be estimated from a graphical representation of those examples
using the notion of eigenvector graph centrality. We also show
how the likelihood for a test example can be estimated from these
data without having to construct a new graph. Application of the
technique to the prediction of gold deposits based on 16 predictor
variables shows that its predictive performance far exceeds that
of conventional density estimation methods, and is slightly better
than the performance of a discriminative approach based on multilayer
perceptron neural networks.
Key Words: Mineral deposit prediction; density estimation; eigenvector
graph centrality; similarity-based learning』
Introduction
Similarity-based density estimation
Eigenvector graph centrality
Estimating likelihoods on test data
Converting distances to similarities
Empirical results
A two-dimensional illustrative example
Gaussian mixture models
Similarity-based and kernel approaches
Full 16-dimensional input space
Discussion and concluding remarks
References