Transforming big data into smart data involves deriving value from harnessing the volume, variety, and velocity of big data using semantics and the semantic web. This allows making sense of big data by providing actionable information that improves decision making. Examples discussed include a healthcare application called kHealth that uses personal sensor data along with population level data to provide personalized and timely health recommendations and interventions for conditions like asthma.
Presentation at the AAAI 2013 Fall Symposium on Semantics for Big Data, Arlington, Virginia, November 15-17, 2013
Additional related material at: http://wiki.knoesis.org/index.php/Smart_Data
Related paper at: http://www.knoesis.org/library/resource.php?id=1903
Abstract: We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the five V's of Big Data, where four of the Vs are harnessed to produce the fifth V - value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing can be done at a level independent of heterogeneity of data formats and media. To handle the challenge of Velocity, we seek to use continuous semantics capability to dynamically create event or situation specific models and recognize new concepts, entities and facts. To handle Veracity, we explore the formalization of trust models and approaches to glean trustworthiness. The above four Vs of Big Data are harnessed by the semantics-empowered analytics to derive Value for supporting practical applications transcending physical-cyber-social continuum.
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ... - Amit Sheth
107 slides•3.5K views
Featured Keynote at Worldcomp'14, July 2014: http://www.world-academy-of-science.org/worldcomp14/ws/keynotes/keynote_sheth
Video of the talk at: http://youtu.be/2991W7OBLqU
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is human health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information, etc.). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will put forward the concept of Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If I am an asthma patient, for all the data relevant to me with the four V-challenges, what I care about is simply, “How is my current health, and what is the risk of having an asthma attack in my personal situation, especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city. I will present examples from a couple of these.
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha... - Amit Sheth
91 slides•11.1K views
Keynote at the Workshop on Building Research Collaboration: Electricity Systems. Purdue University, West Lafayette, IN. Aug 28-29, 2013.
Abstract:
Big Data has captured much interest in research and industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on technology that handles volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc.), and the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity. However, the most important feature of data, the raison d'etre, is neither volume, variety, velocity, nor veracity, but value. In this talk, I will emphasize the significance of Smart Data, and discuss how it can be realized by extracting value from Big Data. Accomplishing this task requires organized ways to harness and overcome the original four V-challenges; and while the technologies currently touted may provide some necessary infrastructure, they are far from sufficient. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and leverage some of the extensive work that predates Big Data.
For achieving energy sustainability, Smart Grids are known to transform the way we generate, distribute, and consume power. An unprecedented amount of data is being collected from smart meters, smart devices, and sensors throughout the power grid. I will discuss the central question of deriving Value from the entire smart grid data deluge by discussing novel algorithms and techniques such as Semantic Perception for dealing with Volume, use of ontologies and vocabularies for dealing with Variety, and Continuous Semantics for dealing with Velocity. I will discuss scenarios that exemplify the process of deriving Value from Big Data in the context of Smart Grid.
Additional background is at: http://wiki.knoesis.org/index.php/Smart_Data
A previous version of this talk with more technical details but not focused on energy: http://j.mp/SmatData
"Computing for Human Experience: Semantics empowered Cyber-Physical, Social and Ubiquitous Computing beyond the Web" Keynote at On the Move Federated Conferences, Crete, Greece, October 18, 2011.
http://www.onthemove-conferences.org/
Details: http://wiki.knoesis.org/index.php/Computi
Presented at the Panel on Sensor, Data, Analytics and Integration in Advanced Manufacturing, at the Connected Manufacturing track of Bosch-USA organized "Leveraging Public-Private Partnerships for Regional Growth Summit". Panel statement: Sensors, data and analytics are the core of any smart manufacturing system. What are the main challenges to create actionable outputs, replicate systems and scale efficiency gains across industries?
Moderator: Thomas Stiedl, Bosch
Panelists:
1. Amit Sheth, Wright State University
2. Howie Choset, Carnegie Mellon University
3. Nagi Gebraeel, Georgia Institute of Technology
4. Brian Anthony, Massachusetts Institute of Technology
5. Yarom Polsky, Oak Ridge National Laboratory
For in-depth look:
Smart IoT: IoT as a human agent, human extension, and human complement
http://amitsheth.blogspot.com/2015/03/smart-iot-iot-as-human-agent-human.html
Semantic Gateway: http://knoesis.org/library/resource.php?id=2154
SSN Ontology: http://knoesis.org/library/resource.php?id=1659
Applications of Multimodal Physical (IoT), Cyber and Social Data for Reliable and Actionable Insights: http://knoesis.org/library/resource.php?id=2018
Smart Data: Transforming Big Data into Smart Data...: http://wiki.knoesis.org/index.php/Smart_Data
Historic use of the term Smart Data (2004): http://www.scribd.com/doc/186588820
Physical-Cyber-Social Computing involves the integration of observations from physical sensors, knowledge and experiences from cyber systems, and social interactions from people. This will allow machines to understand contexts, correlate multi-domain data, and provide personalized solutions by leveraging background knowledge spanning physical, cyber, and social domains. Semantic computing plays a key role in bridging differences between domains to derive insights. The vision is for systems that can proactively initiate information needs with minimal human involvement.
Knowledge Will Propel Machine Understanding of Big Data - Amit Sheth
56 slides•1.2K views
1) Amit Sheth presented on how knowledge can help machines better understand big data.
2) He discussed challenges like understanding implicit entities, analyzing drug abuse forums, and understanding city traffic using sensors and text.
3) Sheth argued that knowledge graphs and ontologies can help interpret diverse data types and provide contextual understanding to help solve real-world problems.
Smart Data - How you and I will exploit Big Data for personalized digital hea... - Amit Sheth
89 slides•76.6K views
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health, which relates to making better decisions about our health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what is the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including the Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g., modelling for constrained environments, complexity issues, and time/location dependency of data), integration, analysis, and reasoning will be discussed. The tutorial will describe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g., the W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be given. Information processing, reasoning, and knowledge extraction, along with existing solutions related to these topics, will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT-related ontology development, linked data, domain knowledge integration and management, querying large-scale IoT data, and AI applications for automated knowledge extraction from real-world data.
Related: Semantic Sensor Web: http://knoesis.org/projects/ssw
Physical-Cyber-Social Computing: http://wiki.knoesis.org/index.php/PCS
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers - Amit Sheth
40 slides•17.4K views
Abstract
Kno.e.sis (http://knoesis.org) is a world-class research center that uses semantic, cognitive, and perceptual computing for gathering insights from physical/IoT, cyber/Web, and social and enterprise (e.g., clinical) big data. We innovate and employ semantic web, machine learning, NLP/IR, data mining, network science and highly scalable computing techniques. Our highly interdisciplinary research impacts health and clinical applications, biomedical and translational research, epidemiology, cognitive science, social good, policy, development, etc. A majority of our $12+ million in active funds come from the NSF and NIH. In this talk, I will provide an overview of some of our major research projects.
Kno.e.sis is highly successful in its primary mission of exceptional student outcomes: our students have exceptional publication and real-world impact and our PhDs compete with their counterparts from top 10 schools for initial jobs in research universities, top industry research labs, and highly competitive companies. A key reason for Kno.e.sis' success is its unique work culture involving teamwork to solve complex problems. Practically all our work involves real-world challenges, real-world data, interdisciplinary collaborators, path-breaking research to solve challenges, real-world deployments, real-world use, and measurable real-world impact.
In this talk, I will also seek to discuss our choice of research topics and our unique ecosystem that prepares our students for exceptional careers.
Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ... - Amit Sheth
70 slides•2.5K views
Keynote at Web Intelligence 2017: http://webintelligence2017.com/program/keynotes/
Video: https://youtu.be/EIbhcqakgvA Paper: http://knoesis.org/node/2698
Abstract: While Bill Gates, Stephen Hawking, Elon Musk, Peter Thiel, and others engage in OpenAI discussions of whether or not AI, robots, and machines will replace humans, proponents of human-centric computing continue to extend work in which humans and machines partner in contextualized and personalized processing of multimodal data to derive actionable information.
In this talk, we discuss how maturing towards the emerging paradigms of semantic computing (SC), cognitive computing (CC), and perceptual computing (PC) provides a continuum through which to exploit the ever-increasing and growing diversity of data that could enhance people’s daily lives. SC and CC sift through raw data to personalize it according to context and individual users, creating abstractions that move the data closer to what humans can readily understand and apply in decision-making. PC, which interacts with the surrounding environment to collect data that is relevant and useful in understanding the outside world, is characterized by interpretative and exploratory activities that are supported by the use of prior/background knowledge. Using the examples of personalized digital health and a smart city, we will demonstrate how the trio of these computing paradigms form complementary capabilities that will enable the development of the next generation of intelligent systems. For background: http://bit.ly/PCSComputing
Physical Cyber Social Computing: An early 21st century approach to Computing ... - Amit Sheth
123 slides•10.1K views
Keynote given at WiMS 2013 Conference, June 12-14 2013, Madrid, Spain. http://aida.ii.uam.es/wims13/keynotes.php
Video of this talk at: http://videolectures.net/wims2013_sheth_physical_cyber_social_computing/
More information at: http://wiki.knoesis.org/index.php/PCS
and http://knoesis.org/projects/ssw/
Replacing earlier versions: http://www.slideshare.net/apsheth/physical-cyber-social-computing & http://www.slideshare.net/apsheth/semantics-empowered-physicalcybersocial-systems-for-earthcube
Abstract: The proper role of technology to improve human experience has been discussed by visionaries and scientists from the early days of computing and electronic communication. Technology now plays an increasingly important role in facilitating and improving personal and social activities and engagements, decision making, interaction with physical and social worlds, generating insights, and just about anything that an intelligent human seeks to do. I have used the term Computing for Human Experience (CHE) [1] to capture this essential role of technology in a human centric vision. CHE emphasizes the unobtrusive, supportive and assistive role of technology in improving human experience, so that technology “takes into account the human world and allows computers themselves to disappear in the background” (Mark Weiser [2]).
In this talk, I will portray physical-cyber-social (PCS) computing that takes ideas from, and goes significantly beyond, the current progress in cyber-physical systems, socio-technical systems and cyber-social systems to support CHE [3]. I will exemplify future PCS application scenarios in healthcare and traffic management that are supported by (a) a deeper and richer semantic interdependence and interplay between sensors and devices at physical layers, (b) rich technology-mediated social interactions, and (c) the gathering and application of collective intelligence characterized by massive and contextually relevant background knowledge and advanced reasoning in order to bridge machine and human perceptions. I will share an example of PCS computing using semantic perception [4], which converts low-level, heterogeneous, multimodal and contextually relevant data into high-level abstractions that can provide insights and assist humans in making complex decisions. The key proposition is that PCS computing will need to move away from traditional data processing to multi-tier computation along the data-information-knowledge-wisdom dimension, which supports reasoning to convert data into abstractions that humans are adept at using.
[1] A. Sheth, Computing for Human Experience
[2] M. Weiser, The Computer for the 21st Century
[3] A. Sheth, Semantics empowered Cyber-Physical-Social Systems
[4] C. Henson, A. Sheth, K. Thirunarayan, Semantic Perception: Converting Sensory Observations to Abstractions
Smart Data and real-world semantic web applications (2004) - Amit Sheth
2 slides•195 views
Probably the first recorded use of "smart data" for achieving the Semantic Web and for realizing productivity, efficiency, and effectiveness gains by using semantics to transform raw data into Smart Data.
2013 retake on this is discussed at: http://wiki.knoesis.org/index.php/Smart_Data
This is a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for material genome initiative, etc. Dec 2015 edition.
1) The document discusses a semantics-based approach to machine perception that uses semantic web technologies to derive abstractions from sensor data using background knowledge on the web.
2) It addresses three primary issues: annotation of sensor data, developing a semantic sensor web, and enabling semantic perception intelligence at the edge on resource-constrained devices.
3) The approach represents background knowledge and sensor observations using ontologies, and uses deductive and abductive reasoning over these representations to interpret sensor data at multiple levels of abstraction.
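The interplay of abductive and deductive reasoning in point 3 can be illustrated with a toy example. This is only a sketch under stated assumptions, not the approach's actual implementation: plain Python dictionaries stand in for ontologies, and the explanation and property names in `KNOWLEDGE` are hypothetical.

```python
# Toy background knowledge: each candidate explanation maps to the observable
# properties it can account for. All names here are hypothetical.
KNOWLEDGE = {
    "flu": {"fever", "cough", "fatigue"},
    "asthma_exacerbation": {"wheezing", "cough", "shortness_of_breath"},
    "common_cold": {"cough", "sneezing"},
}


def explain(observed):
    """Abductive step: keep explanations that cover every observed property."""
    observed = set(observed)
    return sorted(name for name, props in KNOWLEDGE.items() if observed <= props)


def discriminating(observed, explanations):
    """Deductive step: derive the properties the candidates imply, then keep
    those that are unobserved and not shared by all candidates."""
    if not explanations:
        return []
    implied = set.union(*(KNOWLEDGE[e] for e in explanations))
    shared = set.intersection(*(KNOWLEDGE[e] for e in explanations))
    return sorted(implied - set(observed) - shared)


candidates = explain({"cough"})                        # three candidates remain
next_questions = discriminating({"cough"}, candidates)  # what to observe next
refined = explain({"cough", "wheezing"})               # one more observation
```

Here a single observation leaves several explanations open, the deductive step suggests which further properties would separate them, and one additional observation narrows the set to a single abstraction.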
There is a rapid intertwining of sensors and mobile devices into the fabric of our lives. This has resulted in unprecedented growth in the number of observations from the physical and social worlds reported in the cyber world. Sensing and computational components embedded in the physical world are termed a Cyber-Physical System (CPS). The current science of CPS is yet to effectively integrate citizen observations in CPS analysis. We demonstrate the role of citizen observations in CPS and propose a novel approach to perform a holistic analysis of machine and citizen sensor observations. Specifically, we demonstrate the complementary, corroborative, and timely aspects of citizen sensor observations compared to machine sensor observations in Physical-Cyber-Social (PCS) Systems.
Physical processes are inherently complex and embody uncertainties. They manifest as machine and citizen sensor observations in PCS Systems. We propose a generic framework to move from observations to decision-making and actions in PCS systems consisting of: (a) PCS event extraction, (b) PCS event understanding, and (c) PCS action recommendation. We demonstrate the role of Probabilistic Graphical Models (PGMs) as a unified framework to deal with uncertainty, complexity, and dynamism that help translate observations into actions. Data-driven approaches alone are not guaranteed to synthesize PGMs that accurately reflect real-world dependencies. To overcome this limitation, we propose to empower PGMs using declarative domain knowledge. Specifically, we propose four techniques: (a) automatic creation of massive training data for Conditional Random Fields (CRFs) using domain knowledge of entities, used in PCS event extraction, (b) Bayesian Network structure refinement using causal knowledge from ConceptNet, used in PCS event understanding, (c) knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS), used in PCS event understanding, and (d) transformation of knowledge of goals and actions into a Markov Decision Process (MDP) model, used in PCS action recommendation.
We evaluate the benefits of the proposed techniques on real-world applications involving traffic analytics and Internet of Things (IoT).
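To make technique (d) more concrete, the sketch below encodes goals as rewards and the effects of actions as transition probabilities in a tiny MDP, then derives an action recommendation. It is an illustration only: the two states, two actions, probabilities, and rewards are hypothetical, and value iteration is chosen here as a standard solver rather than taken from the abstract.

```python
# Hypothetical two-state traffic MDP.
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    "free_flow": {
        "keep_route": [(0.9, "free_flow", 1.0), (0.1, "congested", -1.0)],
        "reroute": [(1.0, "free_flow", 0.5)],
    },
    "congested": {
        "keep_route": [(0.8, "congested", -1.0), (0.2, "free_flow", 1.0)],
        "reroute": [(0.7, "free_flow", 0.5), (0.3, "congested", -1.0)],
    },
}


def q_value(state, action, values, gamma):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in P[state][action])


def value_iteration(gamma=0.9, tol=1e-8):
    """Repeat Bellman optimality backups until the values stop changing."""
    values = {s: 0.0 for s in P}
    while True:
        new = {s: max(q_value(s, a, values, gamma) for a in P[s]) for s in P}
        if max(abs(new[s] - values[s]) for s in P) < tol:
            return new
        values = new


def greedy_policy(values, gamma=0.9):
    """Recommend, for each state, the action with the highest Q-value."""
    return {s: max(P[s], key=lambda a: q_value(s, a, values, gamma)) for s in P}


values = value_iteration()
policy = greedy_policy(values)
```

With these made-up numbers the recommended policy keeps the current route in free flow and reroutes under congestion; changing the rewards, which stand in for goals, changes the recommendation.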
Semantics-empowered Smart City applications: today and tomorrow - Amit Sheth
71 slides•922 views
Citation:
Amit Sheth, "Semantics-empowered Smart City applications: today and tomorrow," Keynote presented at the 6th Workshop on Semantics for Smarter Cities (S4SC 2015), collocated with the 14th International Semantic Web Conference (ISWC2015), Bethlehem, PA, USA. Oct 11-12, 2015.
http://kat.ee.surrey.ac.uk/wssc/index.html
Abstract: There has been massive growth in potentially relevant physical (sensor/IoT)- cyber (Web)- social data related to activities and operations of cities and citizens. As part of our participation in smart city projects, including the EU-funded CityPulse project, we have analyzed a large number of use cases with inputs from city administrations and end users, and developed a few early applications. In this talk, I will present some exciting smart city applications possible today and venture to speculate on some future ones where Big Data technologies and semantic computing, including the use of domain knowledge, play a critical role.
Abstract: http://j.mp/1MhWWei
Healthcare applications now have the ability to exploit big data in all its complexity. A crucial challenge is to achieve interoperability or integration so that a variety of content from diverse physical (IoT)- cyber (web-based)- and social sources, with diverse formats and modality (text, image, video), can be used in analysis, insight, and decision-making. At Kno.e.sis, an Ohio Center of Excellence in BioHealth Innovation, we have a variety of large, collaborative healthcare/clinical/biomedical projects, all involving domain experts and end-users, and access to real-world data that include: clinical/EMR data (of individual patients and that related to public health), data from a variety of sensors (IoT) on and around patients (measuring real-time physiological and environmental observations), social data (Twitter, Web forums, PatientsLikeMe), Web search logs, etc. Key projects include: Prescription drug abuse online-surveillance and epidemiology (PREDOSE), Social media analysis to monitor cannabis and synthetic cannabinoid use (eDrugTrends), Modeling Social Behavior for Healthcare Utilization in Depression, Medical Information Decision Assistant and Support (MIDAS) with application to musculoskeletal issues, kHealth: A Semantic Approach to Proactive, Personalized Asthma Management Using Multimodal Sensing (also for Dementia), and Cardiology Semantic Analysis System (with applications to Computer Assisted Coding and Computerized Document Improvement).
This talk will review how ontologies or knowledge graphs play a central role in supporting semantic filtering, interoperability and integration (including the issues such as disambiguation), reasoning and decision-making in all our health-centric research and applications. Additional relevant information is at the speaker’s HCLS page. http://knoesis.org/amit/hcls
- The document describes a method for understanding city traffic dynamics by utilizing sensor data that measures average speed and link travel time, as well as textual data from tweets and official traffic reports.
- It builds statistical models to learn normal traffic patterns from historical sensor data and identifies anomalies, then correlates anomalies with relevant traffic events extracted from tweets and reports.
- The method was evaluated on data collected for the San Francisco Bay Area, and it was able to scale to large real-world datasets by exploiting the problem structure and using Apache Spark for distributed processing. Events extracted from social media provided complementary information to sensor data for explaining traffic anomalies.
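The pipeline summarized above (learn normal patterns, flag anomalies, correlate them with textual events) can be sketched as follows. This is a simplified stand-in, not the document's method: a per-hour mean and standard deviation with a z-score threshold replaces the statistical models it builds, and the record layouts, threshold, and sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def learn_baseline(history):
    """history: (hour_of_day, speed) pairs. Returns {hour: (mean, std)}."""
    by_hour = defaultdict(list)
    for hour, speed in history:
        by_hour[hour].append(speed)
    # stdev needs at least two samples per hour
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def find_anomalies(observations, baseline, z_threshold=3.0):
    """observations: (day, hour, speed) triples. Flags large deviations."""
    anomalies = []
    for day, hour, speed in observations:
        if hour not in baseline:
            continue
        mu, sigma = baseline[hour]
        if sigma > 0 and abs(speed - mu) / sigma > z_threshold:
            anomalies.append((day, hour, speed))
    return anomalies


def explain_anomalies(anomalies, events):
    """events: (day, start_hour, end_hour, text). Pairs anomalies with
    events whose time window contains the anomaly."""
    explained = []
    for day, hour, speed in anomalies:
        for ev_day, start, end, text in events:
            if ev_day == day and start <= hour <= end:
                explained.append(((day, hour, speed), text))
    return explained


# Hypothetical sample data for one road link.
history = [(8, 60.0), (8, 62.0), (8, 58.0), (8, 61.0), (8, 59.0),
           (9, 55.0), (9, 57.0), (9, 56.0)]
observations = [(1, 8, 59.0), (1, 9, 30.0)]
events = [(1, 9, 10, "accident reported in a tweet")]

baseline = learn_baseline(history)
anomalies = find_anomalies(observations, baseline)
explained = explain_anomalies(anomalies, events)
```

A fuller version would also match on location (the road link) and would run the per-link computations in parallel, which is where a distributed engine such as the Apache Spark setup mentioned above would come in.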
Guidance for Incorporating Big Data into Humanitarian Operations - 2015 - web... - Katie Whipkey
42 slides•424 views
This document provides guidance on incorporating big data into humanitarian operations. It defines big data as large, complex datasets that exceed traditional data analysis capabilities. Big data is characterized by its volume, variety, velocity and value. The document outlines the history of big data and provides an overview of different big data types. It also discusses benefits and challenges, as well as important considerations around policy, acquisition, use, and timeline for humanitarian organizations looking to utilize big data.
ON EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL IN... - Amit Sheth
40 slides•596 views
Keynote: SECOND INTERNATIONAL WORKSHOP IN MULTIMEDIA PRAGMATICS (MMPrag 2019), San Jose, California, 28-30 March 2019
http://mipr.sigappfr.org/19/keynote-speakers/
The Holy Grail of machine intelligence is the ability to mimic the human brain. In computing, we have created silos in dealing with each modality (text/language processing, speech processing, image processing, video processing, etc.). However, the human brain’s cognitive and perceptual capability to seamlessly consume (listen and see) and communicate (writing/typing, voice, gesture) multimodal (text, image, video, etc.) information challenges machine intelligence research. Emerging chatbots for demanding health applications present the requirements for these capabilities. To support the corresponding data analysis and reasoning needs, we have to explore a pedagogical framework consisting of semantic computing, cognitive computing, and perceptual computing (http://bit.ly/w-SCP). In particular, we have been motivated by the brain’s amazing perceptive power that abstracts massive amounts of multimodal data by filtering and processing them into a few concepts (representable by a few bits) to act upon. From the information processing perspective, this requires moving from syntactic and semantic big data processing to actionable information that can be woven naturally into human activities and experience (http://bit.ly/w-CHE). Exploration of the above research agenda, including powerful use cases, is afforded by a growing number of emerging technologies and their applications, such as chatbots and robotics. In this talk, I will provide these examples and share the early progress we have made towards building health chatbots (http://bit.ly/H-Chatbot) that consume contextually relevant multimodal data and support different forms/modalities of interactions to achieve various alternatives for digital health (http://bit.ly/k-APH). I will also discuss the indispensable role of domain knowledge and personalization using domain and personalized knowledge graphs as part of various reasoning and learning techniques.
This document discusses the rise of big data and data-driven economies. It notes that data has become a new class of economic asset and that many governments and organizations have recognized the importance of harnessing big data. It then describes some of the key characteristics of big data and drivers that are generating large volumes of data such as mobile devices, the internet of things, user-generated content, and cloud computing. The remainder of the document discusses concepts such as the data value chain, different types of data analytics, and various use cases and case studies to illustrate how big data is being applied.
Social media provides a natural platform for the dynamic emergence of citizen (as) sensor communities, where citizens share information, express opinions, and engage in discussions. Often such an Online Citizen Sensor Community (CSC) has stated or implied goals related to workflows of organizational actors with defined roles and responsibilities. For example, a community of crisis response volunteers can inform the prioritization of responses to resource needs (e.g., medical) to assist the managers of crisis response organizations. However, in CSC, there are challenges related to information overload for organizational actors, including finding reliable information providers and finding the actionable information from citizens. This threatens awareness and articulation of workflows to enable cooperation between citizens and organizational actors. CSCs supported by Web 2.0 social media platforms offer new opportunities and pose new challenges. This work addresses issues of ambiguity in interpreting unconstrained natural language (e.g., ‘wanna help’ appearing in both types of messages for asking and offering help during crises), sparsity of user and group behaviors (e.g., expression of specific intent), and diversity of user demographics (e.g., medical or technical professional) for interpreting user-generated data of citizen sensors. Interdisciplinary research involving social and computer sciences is essential to address these socio-technical issues in CSC, and to allow better accessibility to user-generated data at a higher level of information abstraction for organizational actors. This study presents a novel web information processing framework focused on actors and actions in cooperation, called Identify-Match-Engage (IME), which fuses top-down and bottom-up computing approaches to design a cooperative web information system between citizens and organizational actors. It includes (a) identification of action-related seeking-offering intent behaviors from short, unstructured text documents using a classification model based on both declarative and statistical knowledge, (b) matching of intentions about seeking and offering, and (c) engagement models of users and groups in CSC to prioritize whom to engage, by modeling context with social theories using features of users, their generated content, and their dynamic network connections in the user interaction networks. The results show that the fusion of top-down knowledge-driven and bottom-up data-driven approaches improves modeling efficiency for intent and engagement over conventional bottom-up approaches alone. Applications of this work include use of the engagement interface tool during recent crises to enable efficient citizen engagement for spreading critical information about prioritized needs, so that citizens donate only the required supplies. The engagement interface application also won the United Nations ICT agency ITU's Young Innovator 2014 award.
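The identify and match steps of IME can be illustrated with a minimal sketch. It covers only a declarative (rule-based) part; the abstract describes a classifier that also uses statistical knowledge, which is not reproduced here. The patterns, the resource list, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical declarative patterns; the actual IME rules are not given in the abstract.
SEEK_PATTERNS = [r"\bneed\b", r"\blooking for\b", r"\bwhere can i (get|find)\b"]
OFFER_PATTERNS = [r"\bdonat(e|ing)\b", r"\bcan (give|provide|offer)\b", r"\bhave extra\b"]
RESOURCES = {"water", "blankets", "food", "shelter"}  # assumed resource vocabulary


def classify_intent(text):
    """Identify: label a short message as 'offer', 'seek', or 'other'."""
    lowered = text.lower()
    if any(re.search(p, lowered) for p in OFFER_PATTERNS):
        return "offer"
    if any(re.search(p, lowered) for p in SEEK_PATTERNS):
        return "seek"
    return "other"


def mentioned_resources(text):
    """Return the known resource terms that appear in the message."""
    return RESOURCES & set(re.findall(r"[a-z]+", text.lower()))


def match(messages):
    """Match: pair each seeking message with offers that share a resource."""
    seekers = [(m, mentioned_resources(m)) for m in messages if classify_intent(m) == "seek"]
    offerers = [(m, mentioned_resources(m)) for m in messages if classify_intent(m) == "offer"]
    return [
        (s, o, sorted(rs & ro))
        for s, rs in seekers
        for o, ro in offerers
        if rs & ro
    ]
```

Calling `match` on a handful of crisis messages returns triples of (seeking message, offering message, shared resources). The engage step, which ranks whom to contact, would need user and network features that are outside this sketch.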
Cities are composed of complex systems with physical, cyber, and social components. Current work on extracting and understanding city events relies mainly on technology-enabled infrastructure to observe and record events. In this work, we propose an approach to leverage citizen observations of various city systems and services such as traffic, public transport, water supply, weather, sewage, and public safety as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel training data creation process for training sequence labeling models. Our automatic training data creation process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over four months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.
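The idea of creating sequence-labeling training data automatically from instance-level domain knowledge can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the two gazetteers, the BIO tag names, and the function names are assumptions, and the output would still have to be fed to a sequence labeling model that is not shown.

```python
import re

# Hypothetical gazetteers standing in for instance-level domain knowledge.
LOCATIONS = {"bay bridge", "market street", "i-80"}
EVENT_TERMS = {"accident", "traffic jam", "road closure"}


def tokenize(text):
    """Lowercase the text and split it into word-like tokens (hyphens kept)."""
    return re.findall(r"[A-Za-z0-9\-]+", text.lower())


def bio_label(tokens, phrases, tag, labels):
    """Mark each still-unlabeled occurrence of a gazetteer phrase with B-/I- tags."""
    for phrase in phrases:
        words = phrase.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            span_free = all(label == "O" for label in labels[i:i + n])
            if tokens[i:i + n] == words and span_free:
                labels[i] = "B-" + tag
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + tag


def auto_annotate(text):
    """Return (token, label) pairs usable as training data for a sequence labeler."""
    tokens = tokenize(text)
    labels = ["O"] * len(tokens)
    bio_label(tokens, LOCATIONS, "LOC", labels)
    bio_label(tokens, EVENT_TERMS, "EVENT", labels)
    return list(zip(tokens, labels))
```

Running `auto_annotate` over a large collection of tweets would yield labeled sequences without manual annotation, at the cost of whatever noise the gazetteer matching introduces.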
The document discusses challenges in analytics for big data. It notes that big data refers to data that exceeds the capabilities of conventional algorithms and techniques to derive useful value. Some key challenges discussed include handling the large volume, high velocity, and variety of data types from different sources. Additional challenges include scalability for hierarchical and temporal data, representing uncertainty, and making the results understandable to users. The document advocates for distributed analytics from the edge to the cloud to help address issues of scale.
This document discusses the challenges of building a network infrastructure to support big data applications. Large amounts of data are being generated every day from a variety of sources and need to be aggregated and processed in powerful data centers. However, networks must be optimized to efficiently gather data from distributed sources, transport it to data centers over the Internet backbone, and distribute results. The unique demands of big data in terms of volume, variety and velocity are testing whether current networks can keep up. The document examines each segment of the required network from access networks to inter-data center networks and the challenges in supporting big data applications.
I have framed this talk to encourage Pharmacy students to embrace computing in general, and data science and artificial intelligence techniques in particular. The reason is that data-driven science has overtaken traditional lab science; chemistry and biology that underlie pharmacy have become data-driven sciences, and a significant majority of the new jobs in pharma industries demand data analysis skills. Increasingly, traditional bioinformatics approaches are being complemented or replaced by machine learning or deep learning algorithms, especially for cases that have large data sets. I will provide a few examples (e.g., drug discovery, finding adverse drug reactions and broadly pharmacovigilance, and selecting patients for clinical trials) to demonstrate how big data and/or AI are indispensable to pharma research and industry today.
Philosophy of Big Data: Big Data, the Individual, and Society
Melanie Swan
34 slides•6.7K views
Philosophical concepts elucidate the impact the Big Data Era (exabytes/year of scientific, governmental, corporate, personal data being created) is having on our sense of ourselves as individuals in society as information generators in constant dialogue with the pervasive information climate.
Ohio Center of Excellence in Knowledge-Enabled Computing at Wright State (Kno.e.sis)
Center overview: http://bit.ly/coe-k
Invitation: http://bit.ly/COE-invite
The document provides an overview of big data analytics for healthcare. It begins with motivating examples that demonstrate how big data can help improve healthcare outcomes and lower costs. It then discusses the main sources of healthcare data, including structured EHR data like billing codes, labs, and medications, as well as unstructured clinical notes. The document outlines challenges in analyzing these different types of complex healthcare data. It also introduces a healthcare analytics platform that can extract and select features from various data sources to build predictive models. Finally, it discusses techniques for clinical text mining, including named entity recognition and negation analysis.
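Negation analysis in clinical text mining can be illustrated with a small rule-based sketch in the spirit of cue-and-scope approaches. The cue lists, the window size, and the function name are assumptions made for this example and are not taken from the summarized document; real clinical systems use much larger cue sets and handle more scope rules.

```python
import re

# Hypothetical, deliberately tiny cue lists.
NEGATION_CUES = {"no", "not", "denies", "without"}
TERMINATORS = {"but", "however", "although"}  # words that end a negation scope


def is_negated(sentence, concept, window=5):
    """Return True if a negation cue precedes the first mention of `concept`.

    The search looks back at most `window` tokens and stops at a terminator.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    concept_tokens = concept.lower().split()
    n = len(concept_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == concept_tokens:
            for tok in reversed(tokens[max(0, i - window):i]):
                if tok in TERMINATORS:
                    break
                if tok in NEGATION_CUES:
                    return True
            return False
    return False
```

A named entity recognizer would normally supply the concept mentions; here the concept string is passed in directly to keep the sketch self-contained.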
This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including the Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g. modelling for constrained environments, complexity issues and time/location dependency of data), integration, analysis, and reasoning will be discussed. The tutorial will describe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g. the W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be discussed. Information processing, reasoning, and knowledge extraction, along with existing solutions related to these topics will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT related ontology development, linked data, domain knowledge integration and management, querying large-scale IoT data, and AI applications for automated knowledge extraction from real world data.
Related: Semantic Sensor Web: http://knoesis.org/projects/ssw
Physical-Cyber-Social Computing: http://wiki.knoesis.org/index.php/PCS
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Amit Sheth
40 slides•17.4K views
Abstract
Kno.e.sis (http://knoesis.org) is a world-class research center that uses semantic, cognitive, and perceptual computing for gathering insights from physical/IoT, cyber/Web, and social and enterprise (e.g., clinical) big data. We innovate and employ semantic web, machine learning, NLP/IR, data mining, network science and highly scalable computing techniques. Our highly interdisciplinary research impacts health and clinical applications, biomedical and translational research, epidemiology, cognitive science, social good, policy, development, etc. A majority of our $12+ million in active funds come from the NSF and NIH. In this talk, I will provide an overview of some of our major research projects.
Kno.e.sis is highly successful in its primary mission of exceptional student outcomes: our students have exceptional publication and real-world impact and our PhDs compete with their counterparts from top 10 schools for initial jobs in research universities, top industry research labs, and highly competitive companies. A key reason for Kno.e.sis' success is its unique work culture involving teamwork to solve complex problems. Practically all our work involves real-world challenges, real-world data, interdisciplinary collaborators, path-breaking research to solve challenges, real-world deployments, real-world use, and measurable real-world impact.
In this talk, I will also seek to discuss our choice of research topics and our unique ecosystem that prepares our students for exceptional careers.
Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ...
Amit Sheth
70 slides•2.5K views
Keynote at Web Intelligence 2017: http://webintelligence2017.com/program/keynotes/
Video: https://youtu.be/EIbhcqakgvA Paper: http://knoesis.org/node/2698
Abstract: While Bill Gates, Stephen Hawking, Elon Musk, Peter Thiel, and others engage in OpenAI discussions of whether or not AI, robots, and machines will replace humans, proponents of human-centric computing continue to extend work in which humans and machines partner in contextualized and personalized processing of multimodal data to derive actionable information.
In this talk, we discuss how maturing towards the emerging paradigms of semantic computing (SC), cognitive computing (CC), and perceptual computing (PC) provides a continuum through which to exploit the ever-increasing and growing diversity of data that could enhance people’s daily lives. SC and CC sift through raw data to personalize it according to context and individual users, creating abstractions that move the data closer to what humans can readily understand and apply in decision-making. PC, which interacts with the surrounding environment to collect data that is relevant and useful in understanding the outside world, is characterized by interpretative and exploratory activities that are supported by the use of prior/background knowledge. Using the examples of personalized digital health and a smart city, we will demonstrate how the trio of these computing paradigms form complementary capabilities that will enable the development of the next generation of intelligent systems. For background: http://bit.ly/PCSComputing
Physical Cyber Social Computing: An early 21st century approach to Computing ...
Amit Sheth
123 slides•10.1K views
Keynote given at WiMS 2013 Conference, June 12-14 2013, Madrid, Spain. http://aida.ii.uam.es/wims13/keynotes.php
Video of this talk at: http://videolectures.net/wims2013_sheth_physical_cyber_social_computing/
More information at: http://wiki.knoesis.org/index.php/PCS
and http://knoesis.org/projects/ssw/
Replacing earlier versions: http://www.slideshare.net/apsheth/physical-cyber-social-computing & http://www.slideshare.net/apsheth/semantics-empowered-physicalcybersocial-systems-for-earthcube
Abstract: The proper role of technology to improve human experience has been discussed by visionaries and scientists from the early days of computing and electronic communication. Technology now plays an increasingly important role in facilitating and improving personal and social activities and engagements, decision making, interaction with physical and social worlds, generating insights, and just about anything that an intelligent human seeks to do. I have used the term Computing for Human Experience (CHE) [1] to capture this essential role of technology in a human centric vision. CHE emphasizes the unobtrusive, supportive and assistive role of technology in improving human experience, so that technology “takes into account the human world and allows computers themselves to disappear in the background” (Mark Weiser [2]).
In this talk, I will portray physical-cyber-social (PCS) computing that takes ideas from, and goes significantly beyond, the current progress in cyber-physical systems, socio-technical systems and cyber-social systems to support CHE [3]. I will exemplify future PCS application scenarios in healthcare and traffic management that are supported by (a) a deeper and richer semantic interdependence and interplay between sensors and devices at physical layers, (b) rich technology-mediated social interactions, and (c) the gathering and application of collective intelligence characterized by massive and contextually relevant background knowledge and advanced reasoning in order to bridge machine and human perceptions. I will share an example of PCS computing using semantic perception [4], which converts low-level, heterogeneous, multimodal and contextually relevant data into high-level abstractions that can provide insights and assist humans in making complex decisions. The key proposition is that PCS computing will need to move away from traditional data processing to multi-tier computation along the data-information-knowledge-wisdom dimension that supports reasoning to convert data into abstractions that humans are adept at using.
[1] A. Sheth, Computing for Human Experience
[2] M. Weiser, The Computer for the 21st Century
[3] A. Sheth, Semantics empowered Cyber-Physical-Social Systems
[4] C. Henson, A. Sheth, K. Thirunarayan, Semantic Perception: Converting Sensory Observations to Abstractions
Smart Data and real-world semantic web applications (2004)
Amit Sheth
2 slides•195 views
Probably the first recorded use of "smart data" for achieving the Semantic Web and for realizing productivity, efficiency, and effectiveness gains by using semantics to transform raw data into Smart Data.
2013 retake on this is discussed at: http://wiki.knoesis.org/index.php/Smart_Data
This is a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for material genome initiative, etc. Dec 2015 edition.
1) The document discusses a semantics-based approach to machine perception that uses semantic web technologies to derive abstractions from sensor data using background knowledge on the web.
2) It addresses three primary issues: annotation of sensor data, developing a semantic sensor web, and enabling semantic perception intelligence at the edge on resource-constrained devices.
3) The approach represents background knowledge and sensor observations using ontologies, and uses deductive and abductive reasoning over these representations to interpret sensor data at multiple levels of abstraction.
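The abductive step in the summary above (going from observed properties to the features that could explain them) can be sketched with plain Python sets. This is a simplification under stated assumptions: the background knowledge is a small hypothetical dictionary rather than an ontology, the feature and property names are invented, and the summarized work's actual reasoning machinery is not reproduced.

```python
# Hypothetical background knowledge: feature -> properties expected to be observed.
KNOWLEDGE = {
    "blizzard": {"low temperature", "snowfall", "high wind"},
    "flurry": {"low temperature", "snowfall"},
    "rain storm": {"rainfall", "high wind"},
}


def explain(observed):
    """Abduction: features whose expected properties account for every observation."""
    observed = set(observed)
    return sorted(f for f, props in KNOWLEDGE.items() if observed <= props)


def discriminate(observed):
    """Properties worth observing next because they separate the candidate explanations."""
    observed = set(observed)
    candidates = explain(observed)
    if len(candidates) < 2:
        return []
    expected = [KNOWLEDGE[f] - observed for f in candidates]
    # Keep properties expected by some candidates but not by all of them.
    return sorted(set().union(*expected) - set.intersection(*expected))
```

On a resource-constrained device, a loop that alternates `explain` and `discriminate` would ask only for the sensor readings that can still change the conclusion.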
There is a rapid intertwining of sensors and mobile devices into the fabric of our lives. This has resulted in unprecedented growth in the number of observations from the physical and social worlds reported in the cyber world. A system of sensing and computational components embedded in the physical world is termed a Cyber-Physical System (CPS). The current science of CPS has yet to effectively integrate citizen observations into CPS analysis. We demonstrate the role of citizen observations in CPS and propose a novel approach to perform a holistic analysis of machine and citizen sensor observations. Specifically, we demonstrate the complementary, corroborative, and timely aspects of citizen sensor observations compared to machine sensor observations in Physical-Cyber-Social (PCS) Systems.
Physical processes are inherently complex and embody uncertainties. They manifest as machine and citizen sensor observations in PCS Systems. We propose a generic framework to move from observations to decision-making and actions in PCS systems consisting of: (a) PCS event extraction, (b) PCS event understanding, and (c) PCS action recommendation. We demonstrate the role of Probabilistic Graphical Models (PGMs) as a unified framework to deal with uncertainty, complexity, and dynamism that help translate observations into actions. Data-driven approaches alone are not guaranteed to be able to synthesize PGMs reflecting real-world dependencies accurately. To overcome this limitation, we propose to empower PGMs using declarative domain knowledge. Specifically, we propose four techniques: (a) automatic creation of massive training data for Conditional Random Fields (CRFs) using domain knowledge of entities used in PCS event extraction, (b) Bayesian Network structure refinement using causal knowledge from ConceptNet used in PCS event understanding, (c) knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS) used in PCS event understanding, and (d) transforming knowledge of goals and actions into a Markov Decision Process (MDP) model used in PCS action recommendation.
We evaluate the benefits of the proposed techniques on real-world applications involving traffic analytics and Internet of Things (IoT).
Semantics-empowered Smart City applications: today and tomorrow
Amit Sheth
71 slides•922 views
Citation:
Amit Sheth, "Semantics-empowered Smart City applications: today and tomorrow," Keynote presented at the 6th Workshop on Semantics for Smarter Cities (S4SC 2015), collocated with the 14th International Semantic Web Conference (ISWC2015), Bethlehem, PA, USA. Oct 11-12, 2015.
http://kat.ee.surrey.ac.uk/wssc/index.html
Abstract: There has been a massive growth in potentially relevant physical (sensor/IoT)- cyber (Web)- social data related to activities and operations of cities and citizens. As part of our participation in smart city projects, including the EU-funded CityPulse project, we have analyzed a large number of use cases with inputs from city administrations and end users, and developed a few early applications. In this talk, I will present some exciting smart city applications possible today and venture to speculate on some future ones where Big Data technologies and semantic computing, including the use of domain knowledge, play a critical role.
Abstract: http://j.mp/1MhWWei
Healthcare applications now have the ability to exploit big data in all its complexity. A crucial challenge is to achieve interoperability or integration so that a variety of content from diverse physical (IoT)- cyber (web-based)- and social sources, with diverse formats and modality (text, image, video), can be used in analysis, insight, and decision-making. At Kno.e.sis, an Ohio Center of Excellence in BioHealth Innovation, we have a variety of large, collaborative healthcare/clinical/biomedical projects, all involving domain experts and end-users, and access to real world data that include: clinical/EMR data (of individual patients and that related to public health), data from a variety of sensors (IoT) on and around patients measuring real-time physiological and environmental observations, social data (Twitter, Web forums, PatientsLikeMe), Web search logs, etc. Key projects include: Prescription drug abuse online-surveillance and epidemiology (PREDOSE), Social media analysis to monitor cannabis and synthetic cannabinoid use (eDrugTrends), Modeling Social Behavior for Healthcare Utilization in Depression, Medical Information Decision Assistant and Support (MIDAS) with application to musculoskeletal issues, kHealth: A Semantic Approach to Proactive, Personalized Asthma Management Using Multimodal Sensing (also for Dementia), and Cardiology Semantic Analysis System (with applications to Computer Assisted Coding and Computerized Document Improvement).
This talk will review how ontologies or knowledge graphs play a central role in supporting semantic filtering, interoperability and integration (including the issues such as disambiguation), reasoning and decision-making in all our health-centric research and applications. Additional relevant information is at the speaker’s HCLS page. http://knoesis.org/amit/hcls
- The document describes a method for understanding city traffic dynamics by utilizing sensor data that measures average speed and link travel time, as well as textual data from tweets and official traffic reports.
- It builds statistical models to learn normal traffic patterns from historical sensor data and identifies anomalies, then correlates anomalies with relevant traffic events extracted from tweets and reports.
- The method was evaluated on data collected for the San Francisco Bay Area, and it was able to scale to large real-world datasets by exploiting the problem structure and using Apache Spark for distributed processing. Events extracted from social media provided complementary information to sensor data for explaining traffic anomalies.
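The learn-normal-then-flag-anomalies idea in the summary above can be sketched with a per-hour mean and standard deviation. This is a stand-in under stated assumptions: the summary does not say which statistical models were used, the z-score threshold and function names are invented for illustration, and the distributed processing with Apache Spark is not shown.

```python
from collections import defaultdict
from statistics import mean, stdev


def learn_baseline(history):
    """history: iterable of (hour_of_day, avg_speed). Returns {hour: (mean, std)}."""
    by_hour = defaultdict(list)
    for hour, speed in history:
        by_hour[hour].append(speed)
    # stdev needs at least two samples per hour.
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def find_anomalies(baseline, readings, threshold=3.0):
    """Flag readings more than `threshold` standard deviations from the hourly mean."""
    anomalies = []
    for hour, speed in readings:
        if hour not in baseline:
            continue
        mu, sigma = baseline[hour]
        if sigma > 0 and abs(speed - mu) / sigma > threshold:
            anomalies.append((hour, speed))
    return anomalies


def correlate(anomalies, events):
    """Attach textual events (hour, description) reported in the same hour."""
    by_hour = defaultdict(list)
    for hour, description in events:
        by_hour[hour].append(description)
    return {a: by_hour.get(a[0], []) for a in anomalies}
```

Matching on the hour alone is the crudest possible correlation; a realistic version would also compare the location of the road link with the location mentioned in the tweet or report.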
Guidance for Incorporating Big Data into Humanitarian Operations - 2015 - web...
Katie Whipkey
42 slides•424 views
This document provides guidance on incorporating big data into humanitarian operations. It defines big data as large, complex datasets that exceed traditional data analysis capabilities. Big data is characterized by its volume, variety, velocity and value. The document outlines the history of big data and provides an overview of different big data types. It also discusses benefits and challenges, as well as important considerations around policy, acquisition, use, and timeline for humanitarian organizations looking to utilize big data.
ON EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL IN...
Amit Sheth
40 slides•596 views
Keynote: SECOND INTERNATIONAL WORKSHOP IN MULTIMEDIA PRAGMATICS (MMPrag 2019), San Jose, California, 28-30 March 2019
http://mipr.sigappfr.org/19/keynote-speakers/
The Holy Grail of machine intelligence is the ability to mimic the human brain. In computing, we have created silos in dealing with each modality (text/language processing, speech processing, image processing, video processing, etc.). However, the human brain’s cognitive and perceptual capability to seamlessly consume (listen and see) and communicate (writing/typing, voice, gesture) multimodal (text, image, video, etc.) information challenges machine intelligence research. Emerging chatbots for demanding health applications present the requirements for these capabilities. To support the corresponding data analysis and reasoning needs, we have to explore a pedagogical framework consisting of semantic computing, cognitive computing, and perceptual computing (http://bit.ly/w-SCP). In particular, we have been motivated by the brain’s amazing perceptive power that abstracts massive amounts of multimodal data by filtering and processing them into a few concepts (representable by a few bits) to act upon. From the information processing perspective, this requires moving from syntactic and semantic big data processing to actionable information that can be woven naturally into human activities and experience (http://bit.ly/w-CHE). Exploration of the above research agenda, including powerful use cases, is afforded by a growing number of emerging technologies and their applications, such as chatbots and robotics. In this talk, I will provide these examples and share the early progress we have made towards building health chatbots (http://bit.ly/H-Chatbot) that consume contextually relevant multimodal data and support different forms/modalities of interactions to achieve various alternatives for digital health (http://bit.ly/k-APH). I will also discuss the indispensable role of domain knowledge and personalization using domain and personalized knowledge graphs as part of various reasoning and learning techniques.
Based on our discussion, it seems you've been feeling down lately due to being away from family. While distance can be difficult, trying to visit more or video chat may help you feel less isolated. Remember that these feelings will pass. Would you like to discuss strategies to lift your mood until you can see them again? I'm here to help however I can.
Patient: Yes, please provide some strategies.
AI WORLD: I-World: EIS Global Innovation Platform: BIG Knowledge World vs. BI...Azamat Abdoullaev
25 slides•3.3K views
Future World Projects
Global Intelligence Platform
Smart World
Smart Nation
Smart Cities Global Initiative
Smart Superpower Projects
Big Data and Big Knowledge, etc.
This document summarizes the transition from clinical information systems to health grids and the future of health research infrastructure. It discusses trends like rising populations in Asia, increasing resource scarcity, and the need for multidisciplinary and open collaboration. Health grids are presented as enabling virtual collaborations across institutions. Key areas like medical imaging, computational models, and genomic medicine are highlighted. Adoption challenges and requirements like reliable, usable infrastructure are also summarized.
The document discusses developing a Personalized Health Knowledge Graph (PHKG) to support personalized preventative healthcare applications. PHKG integrates medical knowledge and personal health data to provide context-specific and personalized insights. It proposes an architecture with a knowledge graph, rule-based inference engine, and integration of knowledge from ontology catalogs. Challenges include modeling personalization/context, analyzing IoT data, and reusing knowledge from existing health resources. The solution is demonstrated for asthma management using the KHealth dataset and ontologies. Future work includes additional disease cases and dynamic knowledge graph evolution.
Amit Sheth, Pramod Anantharam, Krishnaprasad Thirunarayan, "kHealth: Proactive Personalized Actionable Information for Better Healthcare", Workshop on Personal Data Analytics in the Internet of Things at VLDB2014, Hangzhou, China, September 5, 2014.
Accompanying Video: http://youtu.be/pqcbwGYHPuc
Paper: http://www.knoesis.org/library/resource.php?id=2008
The document discusses the potential for citizens to play an increased role in public health through new technologies. It envisions citizens serving as "sentinels" by opting to share personal health data to help public health surveillance. Citizens could also use health data from social networks and devices to connect with others and access health services and programs. New tools are needed to engage citizens as "scientists" by giving them access to and abilities to analyze public health data.
Framework for understanding data science.pdfMichael Brodie
30 slides•22 views
The objective of my research is to provide a framework with which the data science community can understand, define, and develop data science as a field of inquiry. The framework is based on the classical reference framework (axiology, ontology, epistemology, methodology) used for 200 years to define knowledge discovery paradigms and disciplines in the humanities, sciences, algorithms, and now data science. I augmented it for automated problem-solving with (methods, technology, community) [1][2]. The resulting data science reference framework is used to define the data science knowledge discovery paradigm in terms of the philosophy of data science addressed in [1] and the data science problem-solving paradigm, i.e., the data science method, and the data science problem-solving workflow, addressed in [2][3]. The framework is a much called for unifying framework for data science as it contains the components required to define data science. For insights to better understand data science, this paper uses the framework to define the emerging, often enigmatic, data science problem-solving paradigm and workflow, and to compare them with their well-understood scientific counterparts – scientific problem-solving paradigm and workflow.
The objective of my current research [4] is to develop a 21st C re-conception of data. Unlike 20th C data that are assets, 21st C data science data is phenomenological – a resource in which to discover phenomena and their properties, previously and otherwise impossible.
[1] Brodie, M.L., Defining data science: a new field of inquiry, arXiv preprint https://doi.org/10.48550/arXiv.2306.16177 Harvard University, July 2023.
[2] Brodie, M.L., A data science axiology: the nature, value, and risks of data science, arXiv preprint http://arxiv.org/abs/2307.10460 Harvard University, July 2023.
[3] Brodie, M.L., A framework for understanding data science, arXiv preprint https://arxiv.org/abs/2403.00776 Harvard University, March 2024.
[4] Brodie, M.L., Re-conceiving data in the 21st Century. Work in progress, Harvard University.
The document discusses health informatics research at a computer science department. It defines health informatics as the development of concepts, structures, frameworks and systems to enable efficient and effective healthcare. It outlines several potential areas of health informatics research including health information management, intelligent health systems, health user interfaces, health communications, mathematical computing in health and operating systems for health. It also lists faculty involved in health informatics research and provides an overview of the department's health informatics activities and progress.
This document discusses geohealth, which combines geospatial data and digital technologies to improve public health. It addresses challenges like personalized healthcare, data-driven societies, and smart environments. Geohealth research focuses on topics like infection prevention, one health, and quantified self. Combining eHealth platforms, geospatial data, and other digital tools can help monitor health risks in real-time and tailor interventions. Collaboration across different fields and countries is needed to further geohealth research and applications. The goal is to use new technologies and data to more effectively ensure safety, health, and well-being.
This document discusses geohealth, which combines geospatial data and digital technologies to improve public health. It addresses challenges like personalized healthcare, data-driven societies, and smart environments. Geohealth research focuses on topics like infection prevention, one health, and quantified self. Combining eHealth platforms, geospatial data, and other digital tools can help monitor health risks in real-time, predict disease outbreaks, and develop tailored interventions. The document also discusses collaborations and education initiatives around geohealth.
Building Effective Visualization Shiny WVFOlga Scrivner
36 slides•156 views
This document provides an overview of web visualization tools and frameworks for business intelligence and data visualization. It discusses reactive web frameworks, the Shiny application framework from RStudio, and the Web Visualization Framework (WVF) developed by the Cyberinfrastructure for Network Science Center. Examples of visualizations created with Shiny and WVF are presented, including Sankey diagrams, streamgraphs, heatmaps, and network maps. The document concludes by discussing the future outlook for WVF and promoting an online course on information visualization.
This document discusses occupational health and safety management systems and high-performance work systems. It defines biomedical and health informatics, public health informatics, visual analytics, and geovisualization. It presents the University of Illinois Health system's current paper-based occupational health workflow and its proposed electronic, data-driven workflow using Qualtrics, ESRI, IBM SPSS, and Cerner software. It demonstrates predictive analytics on employee health reports to provide real-time metrics and optimize decisions using geographic information systems.
This document discusses teenage sex in the 21st century with regards to emerging technologies like AI, blockchain, cloud computing, and big data. It notes that while these technologies are widely discussed, few people truly have expertise in working with them, though most believe others do. It suggests substituting other modern technologies like AI, blockchain, or IoT into the same analogy. The document then touches on various topics related to digital health transformation including healthcare IT, smart machines, and challenges with digitization versus true digital transformation.
The future interface of mental health with information technology: high touch...HealthXn
60 slides•326 views
The document discusses the future of mental health and technology, including:
- Technology may help address challenges in healthcare systems but also presents pitfalls if not implemented carefully.
- The roles of health professionals and patients may change as technology becomes more integrated in care, requiring new skills.
- Data and information from various sources can provide insights if analyzed properly, but also raise privacy and security concerns.
- Future health systems will rely more on knowledge management and using data/analytics to provide personalized, predictive care while maintaining the human touch.
Kaiser Permanente is the largest integrated healthcare delivery system in the US, serving around 11 million members through its 38 hospitals, 608 medical offices, and other facilities. It has a long history of technology innovation dating back to the 1960s, when its founders envisioned using computers to create lifelong health records for members. Today, Kaiser Permanente is using big data and machine learning applied across its clinical, financial, and operational data to improve healthcare delivery and outcomes for its members. It has built a data platform and analytics infrastructure to support these efforts, and engages in initiatives like internal data science competitions to advance its analytics capabilities.
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, Variety, and Velocity using Semantic Techniques and Technologies
1. Transforming Big Data into Smart Data:
Deriving Value via harnessing Volume, Variety and Velocity
using semantics and Semantic Web
Put Knoesis Banner
Keynote at 30th IEEE International Conference on Data Engineering (ICDE) 2014
Amit Sheth
LexisNexis Ohio Eminent Scholar & Exec. Director,
The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State, USA
5. Only 0.5% to 1% of the data is used for analysis.
http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode
http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
6. Variety – not just structure but modality: multimodal, multisensory
Semi-structured
8. Questions typically asked on Big Data
• What if your data volume gets so large and varied you don't know how to deal with it?
• Do you store all your data?
• Do you analyze it all?
• What is coverage, skew, quality? How can you find out which data points are really important?
• How can you use it to your best advantage?
http://www.sas.com/big-data/
10. Illustrative Big Data Applications
• Prediction of the spread of flu in real time during H1N1 2009
– Google tested a mammoth 450 million different mathematical models to evaluate the search terms, arriving at 45 important parameters
– The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system [Big Data, Viktor Mayer-Schönberger and Kenneth Cukier, 2013]
• FareCast: predicting the direction of air fares over different routes [Big Data, Viktor Mayer-Schönberger and Kenneth Cukier, 2013]
• New York City manholes problem [ICML Discussion, 2012]
11. What is missing?
The current focus is mainly on serving business intelligence and targeted analytics needs, not on serving complex individual and collective human needs (e.g., empowering humans in health, fitness, and well-being; better disaster coordination; personalized smart energy).
12. Many opportunities, many challenges, lessons to apply
Highly personalized/individualized/contextualized
Incorporate real-world complexity:
- multi-modal and multi-sensory nature of the physical world and human perception
Can more data beat better algorithms?
Can Big Data replace human judgment?
13. What is needed?
• Not just data to information, not just analysis, but actionable information that delivers insight and supports better decision making right in the context of human activities
Data Information
Actionable: An apple a day keeps the doctor away
14. What is needed? Taking inspiration from cognitive models
• Bottom-up and top-down cognitive processes:
– Bottom-up: find patterns, mine (ML, …)
– Top-down: infusion of models and background knowledge (data + knowledge + reasoning)
Left (plans)/Right (perceives) Brain
Top (plans)/Bottom (perceives) Brain
http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270
15. What is needed?
• Ambient processing as much as possible while enabling natural human involvement to guide the system
Smart Refrigerator: Low on Apples
Adapting the Plan: shopping for apples
16. Makes sense to a human
Is actionable – timely and better decisions/outcomes
17. My 2004-2005 formulation of SMART DATA - Semagix
Formulation of Smart Data strategy providing services for Search, Explore, Notify.
“Use of Ontologies and Data repositories to gain relevant insights”
18. Smart Data (2013 retake)
Smart data makes sense out of Big Data.
It provides value from harnessing the challenges posed by the volume, velocity, variety, and veracity of big data, in turn providing actionable information and improving decision making.
19. Another perspective on Smart Data
OF human, BY human, FOR human
Smart data is focused on the actionable value achieved by human involvement in the data creation, processing, and consumption phases for improving the human experience.
20. Another perspective on Smart Data
OF human, BY human, FOR human
21. ‘OF human’: Relevant Real-time Data Streams for Human Experience
Petabytes of Physical (sensory)-Cyber-Social Data every day!
More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
22. Another perspective on Smart Data
OF human, BY human, FOR human
23. ‘BY human’: Involving Crowd Intelligence in data processing workflows
Use of Prior Human-created Knowledge Models
Crowdsourcing and Domain-expert guided Machine Learning Modeling
24. Another perspective on Smart Data
OF human, BY human, FOR human
25. ‘FOR human’: Improving Human Experience
Weather Application
Asthma Healthcare Application
Population Level; Personal; Public Health
Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO level
Observations: Luminosity; CO level
CO in gush during daytime
Action in the Physical World: Close the window at home during the day to avoid CO in gush, and so avoid asthma attacks at night
26. ‘FOR human’: Improving Human Experience
Weather Application
Power Monitoring Application
Population Level Observations
Personal Level Observations
Electricity usage over a day, device at work, power consumption, cost/kWh, heat index, relative humidity, and public events from social stream
Action in the Physical World: Washing and drying has resulted in significant cost since it was done during the peak load period. Consider changing this time to night.
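The power-monitoring recommendation on this slide can be sketched as a small time-of-use cost comparison. This is only an illustrative sketch, not the method used in the talk: the peak window, the two rates, and the function names (`run_cost`, `cheapest_start`, `advise`) are all assumptions made up for the example.

```python
# Hypothetical time-of-use tariff; the hours and prices are assumptions.
PEAK_HOURS = range(14, 20)   # assumed peak window: 14:00 to 19:59
PEAK_RATE = 0.30             # assumed price per kWh during peak hours
OFF_PEAK_RATE = 0.10         # assumed price per kWh otherwise


def rate_at(hour):
    """Price per kWh for a given hour of the day (0-23)."""
    return PEAK_RATE if hour in PEAK_HOURS else OFF_PEAK_RATE


def run_cost(start_hour, duration_hours, kwh_per_hour):
    """Cost of running an appliance for whole hours starting at start_hour."""
    return sum(
        rate_at((start_hour + i) % 24) * kwh_per_hour
        for i in range(duration_hours)
    )


def cheapest_start(duration_hours, kwh_per_hour):
    """Earliest start hour with the lowest total cost."""
    return min(range(24), key=lambda h: run_cost(h, duration_hours, kwh_per_hour))


def advise(start_hour, duration_hours, kwh_per_hour):
    """Return a human-readable suggestion, or None if nothing can be saved."""
    current = run_cost(start_hour, duration_hours, kwh_per_hour)
    best_start = cheapest_start(duration_hours, kwh_per_hour)
    best = run_cost(best_start, duration_hours, kwh_per_hour)
    if current - best > 1e-9:
        return (
            f"Running at {start_hour}:00 falls in the peak period; "
            f"consider starting at {best_start}:00 instead."
        )
    return None


if __name__ == "__main__":
    # A 2-hour wash-and-dry cycle drawing 3 kWh per hour, started at 17:00.
    print(advise(17, 2, 3.0))
```

The point of the sketch is the shape of the output: the raw meter readings are reduced to a single suggestion a person can act on, which is what the slide means by actionable information.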
27. Everyone and everything has Big Data – it is Smart Data that matters!
28. I will use applications in 3 domains to demonstrate
• Healthcare: ADHF, Asthma, GI
– Using kHealth system
• Social Media Analysis: Crisis coordination
– Using Twitris platform
• Smart Cities: Traffic management
29. Smart Data Applications
• Healthcare: ADHF, Asthma, GI
– Using kHealth system
• Social Media Analysis: Crisis coordination
– Using Twitris platform
• Smart Cities: Traffic management
30. A Historical Perspective on Collecting Health Observations
2600 BC (Imhotep): Diseases treated only by external observations. Doctors relied only on external observations.
~1815 (Laennec’s stethoscope): First peek beyond just external observations. The stethoscope was the first instrument to go beyond just external observations.
Today: Information overload! Though the stethoscope has survived, it is only one among many observations in modern medicine.
Image Credit: British Museum
http://en.wikipedia.org/wiki/Timeline_of_medicine_and_medical_technology
Editor's Notes
#2: Starting slide. Various Big Data problems: traditional examples vs. examples of what we are doing. Variety and Velocity more than Volume. kHealth problem. People will be interested in Smart Data. Traditional ML techniques, High Performance Computing, Statistics. Human level of abstraction is Smart Data.
#4: Note: For images and sources, if not on slides, please see the slide notes. Some images were taken from Web search results, and all such images belong to their respective owners; we are grateful to the owners for the usefulness of these images in our context.
#7: Types of data. Formats of data. Also talk about the increase in the platforms that help generate these data.
#8: Example high-velocity Big Data applications at work: financial services, stock brokerage, weather tracking, movies/entertainment, and online retail. Fast data (the rate at which data is coming in, especially from mobile, social, and sensor sources); rapid changes in the data content; stream analysis to cope with the incoming data for real-time online analytics.
#12: http://radhakrishna.typepad.com/rks_musings/2013/04/big-data-review.html
Google predicted the spread of flu in real time:
- after analyzing two datasets: (a) the 50 million most common terms that Americans type, and (b) data on the spread of seasonal flu from the public health agency
- tested a mammoth 450 million different mathematical models to test the search terms, comparing their predictions against the actual flu cases
- the model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system (Big Data, Viktor Mayer-Schönberger and Kenneth Cukier, 2013)
#13: Better Algorithms Beat More Data – And Here’s Why: http://allthingsd.com/20121128/better-algorithms-beat-more-data-and-heres-why/
Big Data Cannot Replace Human Judgment: http://www.matchcite.com/blog/blog/2012/july/big-data-cannot-replace-human-judgment.aspx
** Comments about the articles
#17: Top and bottom part of the brain: http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270
The top part of the brain is known for generating plans. The bottom part of the brain deals with current situational awareness. Perception through the senses happens in the primitive part of the brain (mostly subconsciously). Machine perception allows us to transform low-level sensor observations into higher-level abstractions that are directly communicable to the upper part of the brain (non-subconscious). Thus, people can understand and adapt their plans quickly with abstractions. The left brain here is generating the plan of having an apple a day for healthy living. The right part of the brain identifies an apple through the senses.
#18: Communicating the “abstraction” of fewer apples at home through “ambient processing/intelligence”. The left/top part of the brain will adapt the plan to shopping for apples soon so that the overall plan of having an apple a day can be achieved.
#19: Smart data makes sense out of big data – it provides value from harnessing the challenges posed by volume, velocity, variety and veracity of big data, to provide actionable information and improve decision making.
#25: All the data related to human activity, existence, and experiences. More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
#27: Information is CREATED by humans with the machinery available: Wikipedia tools, sensors, and social networks. Information is STORED in a man+machine-readable format, LOD. Information is PROCESSED using the LOD and human-assisted, knowledge-based techniques. Higher-level abstraction of the information is now consumed in many mechanistic ways (including GIS) to provide EXPERIENCE for humans. Example of human-guided modeling and improved performance: http://research.microsoft.com/en-us/um/people/akapoor/papers/IJCAI%202011a.pdf
#29: Actionable information example: In the asthma use case we have a sensor, Sensordrone, which records luminosity and CO levels. A high correlation between CO level and luminosity is found. This is actionable information for the user, interpreted as CO in gush during daytime => the mitigating action can be “closing the window” during the day.
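The reasoning in this note (correlate two sensor streams, then turn a strong correlation into a suggested action) can be sketched as below. This is a minimal illustration under stated assumptions, not the kHealth implementation: the function names, the 0.8 threshold, and the sample readings are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(var_x * var_y)


def suggest_action(luminosity, co_level, threshold=0.8):
    """Map a strong luminosity/CO correlation to a suggested action.

    The threshold is an assumed cut-off chosen only for illustration.
    """
    if pearson(luminosity, co_level) >= threshold:
        return "CO rises with daylight: consider closing the window during the day"
    return None


if __name__ == "__main__":
    # Made-up readings taken at the same instants over one day.
    luminosity = [5, 10, 200, 800, 900, 850, 300, 20]
    co_level = [0.5, 0.6, 2.0, 6.5, 7.2, 6.9, 2.8, 0.7]
    print(suggest_action(luminosity, co_level))
```

A correlation alone does not establish that daylight-time ventilation is the cause, so a real system would presumably combine it with background knowledge before recommending an action.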
#30: Also, we have a weather application which performs abstraction on weather sensor observations to identify blizzard conditions (food for actions!):
-- 20,000 weather stations (with ~5 sensors per station)
-- Real-Time Feature Streams
- live demo: http://knoesis1.wright.edu/EventStreams/
- video demo: https://skydrive.live.com/?cid=77950e284187e848&sc=photos&id=77950E284187E848%21276
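The abstraction step described in this note, turning low-level weather observations into a higher-level label such as a blizzard condition, might look roughly like the rule-based sketch below. The `Observation` fields, the thresholds, and the three-hour persistence rule are assumptions for illustration (loosely modeled on commonly cited blizzard criteria), not the rules used by the system in the talk.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One hourly reading from a hypothetical weather station."""
    wind_mph: float
    visibility_miles: float
    snowing: bool


def is_severe(obs):
    """Assumed per-hour rule: snow, strong wind, and very low visibility."""
    return obs.snowing and obs.wind_mph >= 35 and obs.visibility_miles <= 0.25


def abstract_weather(hourly, min_hours=3):
    """Label a sequence of hourly observations.

    Returns "Blizzard" if the severe rule holds for at least min_hours
    consecutive observations, otherwise "NoBlizzard".
    """
    run = 0
    for obs in hourly:
        run = run + 1 if is_severe(obs) else 0
        if run >= min_hours:
            return "Blizzard"
    return "NoBlizzard"


if __name__ == "__main__":
    storm = [Observation(40, 0.1, True)] * 3
    print(abstract_weather(storm))
```

Applied per station, a rule like this reduces many raw sensor values to one label per station, which is the kind of volume reduction the note attributes to abstraction.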
#33: http://www.huffingtonpost.com/2012/10/30/hurricane-sandy-power-outage-map-infographic_n_2044411.html
I would like to start with a motivational example here.
#34: Fraustino, Julia Daisy, Brooke Liu and Yan Jin. “Social Media Use during Disasters: A Review of the Knowledge Base and Gaps,” Final Report to Human Factors/Behavioral Sciences Division, Science and Technology Directorate, U.S. Department of Homeland Security. College Park, MD: START, 2012.

Disaster communication deals with disaster information disseminated to the public by governments, emergency management organizations, and disaster responders as well as disaster information created and shared by journalists and the public. Disaster communication increasingly occurs via social media in addition to more conventional communication modes such as traditional media (e.g., newspaper, TV, radio) and word-of-mouth (e.g., phone call, face-to-face, group). Timely, interactive communication and user-generated content are hallmarks of social media, which include a diverse array of web- and mobile-based tools.

Disaster communication deals with (1) disaster information disseminated to the public by governments, emergency management organizations, and disaster responders often via traditional and social media; as well as (2) disaster information created and shared by journalists and affected members of the public often through word-of-mouth communication and social media.

For information seeking. Disasters often breed high levels of uncertainty among the public (Mitroff, 2004), which prompts them to engage in heightened information seeking (Boyle, Schmierbach, Armstrong, & McLeod, 2004; Procopio & Procopio, 2007). As expected, information seeking is a primary driver of social media use during routine times and during disasters (Liu et al., in press; PEW Internet, 2011).

For timely information. Social media provide real-time disaster information, which no other media can provide (Kavanaugh et al., 2011; Kodrich & Laituri, 2011). Social media can become the primary source of time-sensitive disaster information, especially when official sources provide information too slowly or are unavailable (Spiro et al., 2012). For example, during the 2007 California wildfires, the public turned to social media because they thought journalists and public officials were too slow to provide relevant information about their communities (Sutton, Palen, & Shklovski, 2008). Time-sensitive information provided by social media during disasters is also useful for officials. For example, in an analysis of more than 500 million tweets, Culotta (2010) found Twitter data forecasted future influenza rates with high accuracy during the 2009 pandemic, obtaining a 95% correlation with national health statistics. Notably, the national statistics came from hospital survey reports, which typically had a lag time of one to two weeks for influenza reporting.

For unique information. One of the primary reasons the public uses social media during disaster is to obtain unique information (Caplan, Perse, & Gennaria, 2007). Applied to a disaster setting, which is inherently unpredictable and evolving, it follows that individuals turn to whatever source will provide the newest details. Oftentimes, individuals experiencing the event first-hand are on the scene of the disaster and can provide updates more quickly than traditional news sources and disaster response organizations. For instance, in the Mumbai terrorist attacks that included multiple coordinated shootings and bombings across two days, laypersons were first to break the news on Twitter (Merrifield & Palenchar, 2012). Research participants report using social media to satisfy their need to have the latest information available during disasters and for information gathering and sharing during disasters (Palen, Starbird, Vieweg, & Hughes, 2010; Vieweg, Hughes, Starbird, & Palen, 2010).

For unfiltered information. To obtain crisis information, individuals often communicate with one another via social media rather than seeking a traditional news source or organizational website (Stephens & Malone, 2009). The public check in with social media not only to obtain up-to-date, timely information unavailable elsewhere, but also because they appreciate that information may be unfiltered by traditional media, organizations, or politicians (Liu et al., in press).

To determine disaster magnitude. The public uses social media to stay apprised of the extent of a disaster (Liu et al., in press). They may turn to governmental or organizational sources for this information, but research has shown that if the public do not receive the information they desire when they desire it, they, along with others, will fill in the blanks (Stephens & Malone, 2009), which can create rumors and misinformation. On the flipside, when the public believed that officials were not disseminating enough information regarding the size and trajectory of the 2007 California wildfires, they took matters into their own hands, using social media to track fire locations in real-time and notify residents who were potentially in danger (Sutton, Palen, & Shklovski, 2008).

To check in with family and friends. While Americans predominately use social media to connect with family and friends (PEW Internet, 2011), during disasters those connections may shift. For those with family or friends directly involved with the disaster, social media can provide a way to ensure safety, offer support, and receive timely status updates (Procopio & Procopio, 2007; Stephens & Malone, 2009). In a survey of 1,058 Americans, the American Red Cross (2010) found that nearly half of their respondents would use social media to let loved ones know they are safe during disasters. After the 2011 earthquake and tsunami in Japan, the public turned to Twitter, Facebook, Skype, and local Japanese social networks to keep in touch with loved ones while mobile networks were down (Gao, Barbier, & Goolsby, 2011). Researchers also note that disasters may enhance feelings of affection toward family members, and indeed survey participants reported expressing more positive emotions toward their loved ones than usual as a result of the September 11 terrorist attacks, even if they were not directly impacted by the disaster (Fredrickson et al., 2003). Finally, disasters can motivate the public to reconnect with family and friends via social media (Procopio & Procopio, 2009; Semaan & Mark, 2012).

To self-mobilize. During disasters, the public may use social media to organize emergency relief and ongoing assistance efforts from both near and afar. In fact, one research group dubbed those who surge to the forefront of digital and in-person disaster relief efforts as “voluntweeters” (Starbird & Palen, 2011). Other research documents the role of Facebook and Twitter in disaster relief fundraising (Horrigan & Morris, 2005; PEJ, 2010). Research also reveals how social media can help identify and respond to urgent needs after disasters. For example, just two hours after the 2010 Haitian earthquake Tufts University volunteers created Ushahidi-Haiti, a crisis map where disaster survivors and volunteers could send incident reports via text messages and tweets. In less than two weeks, 2,500 incident reports were sent to the map (Gao, Barbier, & Goolsby, 2011).

To maintain a sense of community. During disasters the media in general and social media in particular may provide a unique gratification: sense of community. That is, as the public logs in online to share their feelings and thoughts, they assist each other in creating a sense of security and community, even when scattered across a vast geographical area (Lev-On, 2011; Procopio & Procopio, 2007).
As Reynolds and Seeger (2012) observed, social media create communities during disasters that may be temporary or may continue well into the future. To seek emotional support and healing. Finally, disasters are often inherently tragic, prompting individuals to seek not only information but also human contact, conversation, and emotional care (Sutton et al., 2008). Social media are positioned to facilitate emotional support, allowing individuals to foster virtual communities and relationships, share information and feelings, and even demand resolution (Choi & Lin, 2009; Stephens & Malone, 2009). Indeed, social media in general and blogs in particular are instrumental for providing emotional support during and after disasters (Macias, Hilyard, & Freimuth, 2009; PEJ New Media Index, 2011). Additionally, social media in general and Twitter in particular can aid healing, as research finds during both natural disasters, such as Hurricane Katrina (Procopio & Procopio, 2007), and man-made disasters, such as the July 2011 attacks in Oslo, Norway (Perng et al., 2012).
#35: http://www.buzzfeed.com/annanorth/how-social-media-is-aiding-the-hurricane-sandy-rec -- Facebook help during Hurricane Sandy
http://blog.twitter.com/2012/10/hurricane-sandy-resources-on-twitter.html -- Twitter page for Hurricane Sandy
http://www.treehugger.com/culture/12-ways-help-hurricane-sandy-relief-efforts.html
Categorization of severity based on weather conditions. Actionable information is contextually dependent.
#36: http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
Let me consider one small example of how social data (and, in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account: context.
Sensor data for emergency responders. Who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition
Need for abstraction: semantic perception needs abstraction.
90 + heart problem: don't run out
23: run out
#37: http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
http://www.internews.org/sites/default/files/resources/InternewsEurope_Report_Japan_Connecting%20the%20last%20mile%20Japan_2013.pdf
Let me consider one small example of how social data (and, in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account and changes context accordingly.
Sensor data for emergency responders. Who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition
Need for abstraction: semantic perception needs abstraction.
90 + heart problem: don't run out
23: run out
#38: http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
Let me consider one small example of how social data (and, in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account and changes context accordingly.
Sensor data for emergency responders. Who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition
Need for abstraction: semantic perception needs abstraction.
90 + heart problem: don't run out
23: run out
#39: http://www.buzzfeed.com/jackstuef/the-man-behind-comfortablysmug-hurricane-sandys
During the storm last night, user @comfortablysmug was the source of a load of frightening but false information about conditions in New York City that spread wildly on Twitter and onto news broadcasts before Con Ed, the MTA, and Wall Street sources had to take time out of the crisis situation to refute them.
#40: We face challenges like these with data every time. The most important thing, though, is what you aim to do with the data, that is, what value you intend to provide from it.
#43: -- Contextual Questioning – Potential Information needed from Humans
#45: “2600 BC – Imhotep wrote texts on ancient Egyptian medicine describing diagnosis and treatment of 200 diseases in 3rd dynasty Egypt.”
Sir William Osler, 1st Baronet, was a Canadian physician and one of the four founding professors of Johns Hopkins Hospital. He was called the father of modern medicine. Sir William Osler called Imhotep the true father of medicine.
Observations related to the human body were quite limited.
Initially, doctors communicated with patients, asking for their symptoms (subjective).
Laennec's [René Théophile Hyacinthe Laënnec, a French physician] stethoscope was the first peek into observations of the human body (objective).
Now, there are petabytes of data being generated from observations of the human body.
#46: Larry Smarr is a professor at the University of California, San Diego, and he was diagnosed with Crohn's disease.
What's interesting about this case is that Larry diagnosed himself.
He is a pioneer in the area of Quantified Self, which uses sensors to monitor physiological symptoms.
Through this process he discovered inflammation, which led him to the discovery of Crohn's disease.
This type of self-tracking is becoming more and more common.
#47: - With this ability, many problems could be solved
- For example, we could help solve health problems (before they become serious health problems) through monitoring symptoms and real-time sense making, acting as an early warning system to detect problematic health conditions
#54: Amit Sheth, Pramod Anantharam, Cory Henson, 'Physical-Cyber-Social Computing: An Early 21st Century Approach,' IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb., 2013.
#56: Research on asthma has three phases:
- Data collection: what signals to collect?
- Analysis: what analysis is to be done?
- Actionable information: what action to recommend?
In the next slide, we take a peek into the analysis that we do for asthma.
#59: What is the current state of a person/patient? => Summarize all the observations (sensor and personal) into a single score indicating the health of the person.
Instead of presenting all the raw data, which may not be comprehensible to the patient, we empower them by providing actionable summaries. The raw data is often too much: for example, the asthma application we have developed collects CO, temperature, and humidity every 10 seconds, resulting in 8,640 observations/day.
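The arithmetic behind the observation count, and the idea of replacing raw readings with a compact summary, can be sketched as follows. This is only an illustration: `summarize` and its min/mean/max output are hypothetical stand-ins, not the health score computed by the application described in these notes.

```python
from statistics import mean

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def observations_per_day(interval_s):
    """Number of sampling instants in one day for a fixed sampling interval."""
    return SECONDS_PER_DAY // interval_s


def summarize(readings):
    """Hypothetical summary: collapse one day of raw readings into three numbers."""
    return {"min": min(readings), "mean": mean(readings), "max": max(readings)}


if __name__ == "__main__":
    print(observations_per_day(10))      # one reading every 10 seconds
    print(summarize([1.0, 2.0, 3.0]))    # made-up readings
```

Sampling every 10 seconds gives 86,400 / 10 = 8,640 instants per day, which matches the figure in the note above.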
#60: What is the likely state of the person in the future? => Given the current state and the changing environmental conditions, estimate the state of the person by summarizing it into a number that is actionable. For example, the vulnerability score for a person with asthma is computed from environmental factors (pollen, air quality, external temperature, and humidity) and the current state of the patient. Intuitively, when both are in a poor environmental state, a person with well-controlled asthma should have a lower vulnerability score than a person with poorly controlled asthma.
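A toy sketch of the intuition only: the notes do not give the scoring formula, so the normalization of each factor to the range 0 to 1, the averaging, and the control-level multipliers below are all assumptions made for illustration.

```python
# Hypothetical multipliers per asthma control level (NOT from the source).
CONTROL_MULTIPLIER = {
    "well_controlled": 1.0,
    "poorly_controlled": 2.0,
}


def vulnerability(environment, control_level):
    """Toy score: mean environmental risk scaled by the patient's control level.

    `environment` maps factor name -> risk in [0, 1], where 1 is worst (assumed scale).
    """
    env_risk = sum(environment.values()) / len(environment)
    return env_risk * CONTROL_MULTIPLIER[control_level]


if __name__ == "__main__":
    poor_env = {"pollen": 0.9, "air_quality": 0.8, "temperature": 0.7, "humidity": 0.6}
    print(vulnerability(poor_env, "well_controlled"))
    print(vulnerability(poor_env, "poorly_controlled"))
```

Whatever the real formula is, it should preserve the ordering shown here: in the same poor environment, the well-controlled patient scores lower.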
#64: In the absence of declarative knowledge in a domain, we resort to statistical approaches to glean insights from data.
Even if there is declarative knowledge of a domain, it may have to be personalized.
The CO level may be related to the luminosity as observed by the Sensordrone: as it gets brighter, the CO level also increases => high CO level in the daytime.
If such an insight is provided to a person, the interpretation can be:
- Some activity inside the house leads to high CO levels
- Outside activity leads to high CO levels inside the house
Since the person knows that he/she is away from the house during mornings, it has to be something from outside.
- The person narrows it down to a possibly open window at home (one they often forget to close)
#65: There are two components in making sense of health signals:
- Health signal extraction: processing, aggregating, and abstracting from raw sensor/textual data to create human-intelligible abstractions
- Health signal understanding: deriving (1) connections between abstractions and (2) an action recommendation: continue, contact nurse, or contact doctor
#67: Only score-based structure extraction is presented here. Other popular structure extraction techniques include constraint-based approaches, which find independences between the random variables X1, …, Xn.
I-Map => different structures can result in the same log-likelihood score. Thus, recovering the original structure of the graph that generated the data from data alone is considered impossible! We turn to declarative knowledge for rescue: (1) to choose promising structures and (2) to break ties when two structures result in the same score.
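The tie between structures can be seen on the smallest possible case: two variables X and Y, where the structures X -> Y and Y -> X reach the same maximized log-likelihood on any dataset. The sketch below is a generic illustration with made-up counts, not the scoring code used in the work described here.

```python
import math
from collections import Counter


def max_loglik(data, parent):
    """Maximized log-likelihood of a two-node network over (x, y) rows.

    parent=0 scores the structure X -> Y; parent=1 scores Y -> X.
    Parameters are the maximum-likelihood estimates (relative frequencies).
    """
    n = len(data)
    joint = Counter(data)
    marginal = Counter(row[parent] for row in data)
    total = 0.0
    for row, count in joint.items():
        p_parent = marginal[row[parent]] / n
        p_child_given_parent = count / marginal[row[parent]]
        total += count * (math.log(p_parent) + math.log(p_child_given_parent))
    return total


if __name__ == "__main__":
    # Made-up binary observations of (x, y).
    data = [(0, 0)] * 30 + [(0, 1)] * 10 + [(1, 0)] * 5 + [(1, 1)] * 55
    print(max_loglik(data, parent=0), max_loglik(data, parent=1))
```

Both calls factor the same empirical joint distribution, so the scores agree up to floating-point error; data alone cannot pick the edge direction, which is where declarative knowledge can break the tie.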
#69: A massive amount of data is collected by sensors and mobile devices, yet patients and doctors care about “actionable” information.
This data has all four Vs of big data, and we used knowledge-enabled techniques to transform it into value.
In the context of PD, we analyzed a massive amount of sensor data collected by sensors on smartphones to understand the detection and characterization of PD severity.
#70: Main idea: Prior knowledge of PD was used to facilitate its detection from massive sensor data by reducing the search space.
Details:
- Declarative knowledge of PD includes PD severity levels and their symptoms, as shown in the logical rule above
- Each PD severity level is a conjunction of a set of PD symptoms
- Each symptom was mapped to its manifestation in sensor observations
- The availability of declarative knowledge significantly improved the analytics by aiding the feature selection process
The graphs above contrast the physical movements and voice of two control group members and two PD patients.
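The details above can be sketched as a small rule table. The logical rule from the slide is not reproduced in these notes, so every severity level, symptom name, and sensor feature name below is a hypothetical placeholder; only the shape (a severity level as a conjunction of symptoms, symptoms mapped to sensor features) comes from the text.

```python
# Hypothetical rules: each severity level is a conjunction (set) of symptoms.
SEVERITY_RULES = {
    "level_1": {"symptom_a"},
    "level_2": {"symptom_a", "symptom_b"},
    "level_3": {"symptom_a", "symptom_b", "symptom_c"},
}

# Hypothetical mapping from each symptom to the sensor features that manifest it.
SYMPTOM_TO_FEATURES = {
    "symptom_a": {"accelerometer_feature_1"},
    "symptom_b": {"accelerometer_feature_2", "audio_feature_1"},
    "symptom_c": {"audio_feature_2"},
}


def relevant_features():
    """Feature selection: keep only features tied to some known symptom."""
    features = set()
    for symptom_features in SYMPTOM_TO_FEATURES.values():
        features |= symptom_features
    return features


def matching_levels(observed_symptoms):
    """Severity levels whose every required symptom has been observed."""
    return sorted(
        level
        for level, required in SEVERITY_RULES.items()
        if required <= observed_symptoms
    )


if __name__ == "__main__":
    print(sorted(relevant_features()))
    print(matching_levels({"symptom_a", "symptom_b"}))
```

The search-space reduction comes from `relevant_features`: analysis only needs to look at features that some rule can use, rather than everything the sensors record.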
#79: The perception cycle contains two primary phases:
- Explanation: translating low-level signals into high-level abstractions; inference to the best explanation
- Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations)
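A minimal set-based sketch of the two phases, assuming a toy knowledge base that maps each feature to the properties it explains. The knowledge base contents and the function names are illustrative assumptions; this is not the OWL-based formalization or the published algorithm.

```python
# Toy knowledge base (illustrative only): feature -> properties it explains.
KB = {
    "flu": {"fever", "cough"},
    "cold": {"cough", "sneezing"},
    "allergy": {"sneezing", "itchy_eyes"},
}


def explain(observed):
    """Explanation: features that account for every observed property."""
    return {feature for feature, props in KB.items() if observed <= props}


def discriminate(candidates, observed):
    """Discrimination: unobserved properties expected by some, but not all, candidates."""
    expected = [KB[feature] - observed for feature in candidates]
    if len(expected) < 2:
        return set()  # nothing left to tell apart
    return set().union(*expected) - set.intersection(*expected)


if __name__ == "__main__":
    observed = {"cough"}
    candidates = explain(observed)               # two candidate explanations remain
    print(discriminate(candidates, observed))    # properties worth checking next
    print(explain(observed | {"fever"}))         # one targeted observation settles it
```

Only the properties returned by `discriminate` need to be sensed next, which is the alternative to blindly collecting every observation.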
#84: A single-feature (disease) assumption means that all the observed properties (symptoms) must be explained by a single feature, i.e., this framework is not expressive enough to model comorbidity, where more than one feature (disease) may co-exist.
For example, if two diseases cause disjoint symptoms and all the symptoms of both diseases are observed, this framework will not be able to find a coverage and returns no diseases.
The parsimony criterion is the single-feature assumption, used to choose from among multiple explanations.
Not true: if multiple diseases account for a single property…
Rewrite with more relaxed parsimony criteria (complex, cannot be modeled in OWL).
Make the KB more intelligent: create an individual that represents the two diseases which together explain a symptom.
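The failure case and the proposed workaround can be illustrated with plain sets. The disease and symptom names are placeholders, and `with_composite` is a hypothetical helper standing in for "create an individual that represents the two diseases"; the actual knowledge base is modeled in OWL, which this sketch does not attempt to reproduce.

```python
# Placeholder knowledge base: two diseases with disjoint symptoms.
KB = {
    "disease_a": {"s1", "s2"},
    "disease_b": {"s3", "s4"},
}


def explain_single(kb, observed):
    """Single-feature assumption: one feature must cover all observed properties."""
    return {feature for feature, props in kb.items() if observed <= props}


def with_composite(kb, first, second):
    """Return a copy of the KB with an extra entry representing both features together."""
    extended = dict(kb)
    extended[first + "+" + second] = kb[first] | kb[second]
    return extended


if __name__ == "__main__":
    all_symptoms = {"s1", "s2", "s3", "s4"}
    print(explain_single(KB, all_symptoms))  # empty: no single disease covers everything
    extended = with_composite(KB, "disease_a", "disease_b")
    print(explain_single(extended, all_symptoms))
```

The composite entry keeps the explanation procedure unchanged while letting it return a comorbid explanation, at the cost of adding combinations to the knowledge base by hand.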
#93: Intelligence distributed at the edge of the network.
Requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize Semantic Web (SW) technologies.
#94: Intelligence distributed at the edge of the network.
Requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize Semantic Web (SW) technologies.
Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices,' ISWC 2012.
#95: Compute machine perception inferences (i.e., explanation and discrimination) of high complexity on resource-constrained devices in milliseconds.
Difference between the other systems and what this system provides.
#96: Intelligence at the edge. Shipping computation and domain models to the edge (distributed).
#101: http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding
http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html
http://mashable.com/2012/10/31/hurricane-sandy-facebook/
Our lab has quite a bit of social data research going on, so I would like to focus on the use of social networks during these disasters and crises.
Twitter and Facebook are used massively during disasters. During Hurricane Sandy there were …
Beyond this, a major outbreak of tweets occurred during the Japan earthquake, exceeding 2,000 tweets/sec.
So why do people use social networks to this extent during disasters?
#114: Categorization of severity based on weather conditions. Actionable information is contextually dependent.
#116: - 1 (+half) minute
Alright, so let's motivate with this situation during an emergency.
- Various actors: resource seekers, responder teams, and resource providers at remote sites
- Each of these actor groups has questions: needs, providers, responders: wondering!
Here we have social networks to connect these actors and bridge the gap as a communication platform.
But their potential for effective help is yet to be realized.
#117: Talk about what kind of smart data we provide that helps the actions of crisis response coordination.
#118: Source: Purohit et al. 2013 (https://docs.google.com/a/knoesis.org/document/d/1aBJ2egHICUwaWxR8jOoTIUfEYj1QAnUt0q7haIKoYGY/edit#, http://www.knoesis.org/library/resource.php?id=1865)
#126: Definition of the event
US Elections and some changes/subevents --- Primaries --- Debates -- People/Places/Organizations involved in the event
Arab Spring -- Subevents during those -- Egypt protests
#133: Pucher, J., Korattyswaroopam, N., & Ittyerah, N. (2004). The crisis of public transport in India: Overwhelming needs but limited resources. Journal of Public Transportation, 7(4), 1-30.
#136: Twitter as a source of real-time information
There are over 200 million users generating 500 million tweets/day
Twitter as a source of events in a city
Citizens use Twitter to express their concerns about city infrastructure that impacts their lives
#137: The red tweets are the tweets related to city infrastructure, e.g., traffic.
There are two steps in converting raw tweets from a city into city-related events:
- City event annotation: a sequence labeling technique to spot location and event terms
- City event extraction: aggregating all the location + event terms to derive events
To do this aggregation, we follow some principles that characterize city events.
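The hand-off between the two steps can be sketched as follows, assuming the annotator emits BIO-style tags. The tag names `B-EVENT`, `I-EVENT`, `B-LOCATION`, `I-LOCATION`, and `O` are assumptions for illustration; the notes do not state the tag set used in the experiments.

```python
def extract_spans(tagged_tokens):
    """Group BIO-tagged tokens into (type, text) spans.

    `tagged_tokens` is a list of (token, tag) pairs as a sequence labeler might emit.
    """
    spans = []
    current_type, current_tokens = None, []

    def flush():
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in tagged_tokens:
        if tag.startswith("B-"):
            flush()  # a new span starts, so close any open one
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)  # continue the open span
        else:
            flush()  # "O" or an inconsistent tag ends the open span
            current_type, current_tokens = None, []
    flush()
    return spans


if __name__ == "__main__":
    # Made-up tweet, already annotated.
    tweet = [("Accident", "B-EVENT"), ("on", "O"),
             ("Bay", "B-LOCATION"), ("Bridge", "I-LOCATION")]
    print(extract_spans(tweet))
```

The resulting location and event spans are the inputs that the extraction step would aggregate across many tweets.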
#138: CRF assigns a tag to each token.
Global normalization is the argmax term.
RHS is just a regression-based implementation of a linear-chain CRF (potentials defined only over adjacent tags).
The LingPipe implementation of CRF is used in our experiments.
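The slide's equation is not reproduced in these notes. For reference, the standard textbook form of a linear-chain CRF is given below; the symbols (feature functions f_k, weights lambda_k, partition function Z) are the usual ones and are assumed here rather than taken from the slide.

```latex
% Conditional probability of a tag sequence y = (y_1, ..., y_T) given tokens x
p(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right)

% Partition function: sums over all candidate tag sequences y'
Z(x) = \sum_{y'} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \right)

% Decoding: the tag sequence assigned to the tokens
\hat{y} = \arg\max_{y} \; p(y \mid x)
```

Each feature function looks only at adjacent tags (y_{t-1}, y_t), which is what makes the chain linear and lets the argmax be computed efficiently with dynamic programming.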
#141: Localized event detection strategy: a city is a composition of smaller geographical units.
We call these geographical units grids.
Geohash provides us a way of compartmentalizing a city into uniquely addressable grids.
Distance is computed using the formula:
dlon = lon2 - lon1
dlat = lat2 - lat1
a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2
c = 2 * atan2(sqrt(a), sqrt(1-a))
d = R * c (where R is the radius of the Earth)
Found the box for the tweet! Its corners are:
37.7545166015625, -122.420654296875
37.7545166015625, -122.40966796875
37.7490234375, -122.40966796875
37.7490234375, -122.420654296875
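The distance formula in this note (the haversine formula) and the grid lookup can be sketched as follows. Two assumptions are made: the Earth radius is taken as the mean value 6371 km, and the grid cell is found by plain arithmetic on a 15-bits-per-axis grid rather than by a geohash library. A 6-character geohash has 30 bits, 15 per axis, and the box in the note has exactly that cell size. The sample tweet coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value and unit)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


def cell_bounds(lat, lon, bits_per_axis=15):
    """Bounding box (lat_min, lon_min, lat_max, lon_max) of the grid cell containing a point."""
    lat_step = 180.0 / 2 ** bits_per_axis
    lon_step = 360.0 / 2 ** bits_per_axis
    lat_min = -90.0 + math.floor((lat + 90.0) / lat_step) * lat_step
    lon_min = -180.0 + math.floor((lon + 180.0) / lon_step) * lon_step
    return lat_min, lon_min, lat_min + lat_step, lon_min + lon_step


if __name__ == "__main__":
    # Hypothetical tweet location that falls inside the box listed in the note.
    box = cell_bounds(37.75, -122.415)
    print(box)
    # Diagonal of that cell, corner to corner.
    print(haversine_km(box[0], box[1], box[2], box[3]))
```

Note that the trigonometric functions expect radians, so the degrees-to-radians conversion at the top of `haversine_km` is required before applying the formula as written in the note.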
#142: These algorithms take the annotated tweets as input and then emit events with their metadata
#143: Now that we have presented the (1) event extraction and (2) event aggregation algorithms, how well are we doing?
We evaluate both components.
The ground truth is the set of events reported on 511.org.
We compare the events we extract from tweets with the 511.org events.
#144: We evaluate the extracted events based on three orthogonal metrics.
We compare (1) events extracted from tweets using our algorithms and (2) 511.org events.
- Complementary events: the events from (1) and (2) may complement each other, i.e., one provides a different view from the other
- Corroborative events: the events from (1) and (2) may support each other, i.e., redundant events
- Timeliness: the events were reported on (1) before they were reported on (2)
#145: Next few slides give examples of the evaluation metric
#149: A 511.org record may have its own timestamp, which may be earlier than the tweets.
#154: More at: http://wiki.knoesis.org/index.php/PCS
And http://knoesis.org/projects/ssw/