Data fusion for sociocultural place understanding using deep learning
The International Society for Optical Engineering, or SPIE (formerly known as the Society of Photographic Instrumentation Engineers), has published a range of peer-reviewed scientific journals for over 50 years. A recent issue of the Proceedings of SPIE, covering research presented at the society's recent conference in Orlando, included an article by GA-CCRi data scientists Jake Popham, Mike Forkin, Nick Hamblet, and Bryce Inouye titled “Data fusion for sociocultural place understanding using deep learning.”
Their article describes an ongoing project at GA-CCRi that uses the Accumulo key-value store, the Rya RDF triplestore, the GeoMesa spatio-temporal database, and the Solr free-text indexing system to combine data from OpenStreetMap, the GDELT Global Knowledge Graph, Twitter, DBpedia, and overhead imagery into a knowledge graph that enables the identification of connections, patterns, and relationships between pieces of data from these disparate sources. It does this by using deep learning techniques to create feature vectors that encapsulate the contributions of the various data sources into a single fixed-dimension vector per entity, where entities are people, places, groups, Tweets, concepts, and more. Feature vectors are indexed directly to enable a nearest neighbor lookup, but they are primarily intended as compact knowledge representations for use in downstream modeling that will produce socio-cultural outputs, spatial and otherwise.
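To make the nearest neighbor lookup concrete, here is a minimal sketch of the idea in plain Python. It is not the project's implementation: the entity IDs, the three-dimensional vectors, the choice of cosine similarity, and the brute-force scan are all assumptions made for illustration, and a production system would more likely rely on a dedicated index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_id, vectors, k=2):
    """Return the IDs of the k entities most similar to query_id.

    Brute-force scan over every stored vector (an assumption for
    illustration, not the indexing scheme described in the paper).
    """
    query = vectors[query_id]
    scored = [
        (cosine_similarity(query, vec), other_id)
        for other_id, vec in vectors.items()
        if other_id != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other_id for _, other_id in scored[:k]]

# Hypothetical feature vectors: one fixed-dimension vector per entity,
# regardless of whether the entity is a place, a Tweet, or a group.
entity_vectors = {
    "place:cafe": [0.9, 0.1, 0.0],
    "place:restaurant": [0.8, 0.2, 0.1],
    "tweet:123": [0.1, 0.9, 0.2],
    "group:club": [0.0, 0.3, 0.9],
}

print(nearest_neighbors("place:cafe", entity_vectors, k=2))
```

Because every entity type shares the same vector space, a single lookup like this can return places, Tweets, and groups side by side.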
The paper describes how the embedding space generated by this knowledge graph can then be used by models such as TransE and TransH for tasks such as land use prediction.
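For readers unfamiliar with TransE, the sketch below shows its core scoring idea: a relation is modeled as a translation, so a triple (head, relation, tail) is plausible when head + relation lands close to tail in the embedding space. The entity names, the `has_land_use` relation, and the hand-picked two-dimensional vectors are hypothetical, and framing land use prediction as ranking candidate tails is an assumption for illustration rather than the setup used in the paper. Real embeddings would be learned from the knowledge graph.

```python
import math

def transe_score(head, relation, tail):
    """TransE distance ||head + relation - tail||; lower is more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )

# Toy, hand-chosen embeddings (hypothetical; not learned from data).
entities = {
    "parcel_a": [0.0, 0.0],
    "residential": [1.0, 1.0],
    "industrial": [-1.0, 2.0],
}
relations = {"has_land_use": [1.0, 1.0]}

def rank_tails(head_id, relation_id, candidate_ids):
    """Order candidate tail entities from most to least plausible."""
    head = entities[head_id]
    relation = relations[relation_id]
    return sorted(
        candidate_ids,
        key=lambda tail_id: transe_score(head, relation, entities[tail_id]),
    )

ranking = rank_tails("parcel_a", "has_land_use", ["industrial", "residential"])
print(ranking)
```

TransH follows the same pattern but first projects the head and tail onto a relation-specific hyperplane before applying the translation, which lets one entity play different roles under different relations.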
This combination of socio-cultural data from sources such as Twitter with the geospatial storage and analytics that GeoMesa provides is typical of much of the deep learning data fusion work that we’re doing at GA-CCRi, and we’re happy that SPIE gave us the opportunity to spread the word!
This material is based upon work supported by the Engineer Research and Development Center (ERDC) – Construction Engineering Research Laboratory. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the ERDC-CERL.