See instead more recent version (ICDE2014 keynote): http://j.mp/ICDE-key
A video of a version of this talk: http://youtu.be/8RhpFlfpJ-A
Amit Sheth, "Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity using semantics and Semantic Web," keynote at the 21st Italian Symposium on Advanced Database Systems,
June 30 - July 03 2013, Roccella Jonica, Italy. Also invited talks given in Universities in Spain and Italy in June 2013.
Highlight: How to harness actionable Smart Data from voluminous Big Data with Velocity and Variety, using Semantics and the Semantic Web core to bring Human-Centric Computing into practice.
Abstract from: http://www.sebd2013.unirc.it/invitedSpeakers.html
Big Data has captured much interest in research and industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on technology that handles volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc.), and the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity. However, the most important feature of data, the raison d'etre, is neither volume, variety, velocity, nor veracity, but value. In this talk, I will emphasize the significance of Smart Data, and discuss how it can be realized by extracting value from Big Data. Accomplishing this task requires organized ways to harness and overcome the original four V-challenges; while the technologies currently touted may provide some necessary infrastructure, they are far from sufficient. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and leverage some of the extensive work that predates Big Data. For Volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration, and discuss how this cannot simply be wished away using NoSQL. Lastly, for Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to dynamically create models of new objects, concepts, and relationships and use them to better understand new cues in the data that capture rapidly evolving events and situations.
Additional background at: http://knoesis.org/vision > SmartData and "Semantics-empowered Approaches to Big Data Processing for Physical-Cyber-Social Applications," http://www.knoesis.org/library/resource.php?id=1889 .
Physical Cyber Social Computing: An early 21st century approach to Computing ...Amit Sheth
Keynote given at WiMS 2013 Conference, June 12-14 2013, Madrid, Spain. http://aida.ii.uam.es/wims13/keynotes.php
Video of this talk at: http://videolectures.net/wims2013_sheth_physical_cyber_social_computing/
More information at: http://wiki.knoesis.org/index.php/PCS
and http://knoesis.org/projects/ssw/
Replacing earlier versions: http://www.slideshare.net/apsheth/physical-cyber-social-computing & http://www.slideshare.net/apsheth/semantics-empowered-physicalcybersocial-systems-for-earthcube
Abstract: The proper role of technology to improve human experience has been discussed by visionaries and scientists from the early days of computing and electronic communication. Technology now plays an increasingly important role in facilitating and improving personal and social activities and engagements, decision making, interaction with physical and social worlds, generating insights, and just about anything that an intelligent human seeks to do. I have used the term Computing for Human Experience (CHE) [1] to capture this essential role of technology in a human centric vision. CHE emphasizes the unobtrusive, supportive and assistive role of technology in improving human experience, so that technology “takes into account the human world and allows computers themselves to disappear in the background” (Mark Weiser [2]).
In this talk, I will portray physical-cyber-social (PCS) computing that takes ideas from, and goes significantly beyond, the current progress in cyber-physical systems, socio-technical systems and cyber-social systems to support CHE [3]. I will exemplify future PCS application scenarios in healthcare and traffic management that are supported by (a) a deeper and richer semantic interdependence and interplay between sensors and devices at physical layers, (b) rich technology-mediated social interactions, and (c) the gathering and application of collective intelligence characterized by massive and contextually relevant background knowledge and advanced reasoning in order to bridge machine and human perceptions. I will share an example of PCS computing using semantic perception [4], which converts low-level, heterogeneous, multimodal and contextually relevant data into high-level abstractions that can provide insights and assist humans in making complex decisions. The key proposition is that PCS computing will need to move away from traditional data processing to multi-tier computation along the data-information-knowledge-wisdom dimension, supporting reasoning that converts data into abstractions that humans are adept at using.
[1] A. Sheth, Computing for Human Experience
[2] M. Weiser, The Computer for the 21st Century
[3] A. Sheth, Semantics empowered Cyber-Physical-Social Systems
[4] C. Henson, A. Sheth, K. Thirunarayan, Semantic Perception: Converting Sensory Observations to Abstractions
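The semantic perception idea mentioned in the abstract above, converting low-level observations into higher-level abstractions with the help of background knowledge, can be illustrated with a toy Python sketch. This is only an illustration under stated assumptions, not the method of [4]: the `KNOWLEDGE` table, its condition and property names, and the functions `explain` and `discriminate` are all hypothetical.

```python
# Hypothetical background knowledge (invented for illustration): each candidate
# abstraction is linked to the observable properties it would account for.
KNOWLEDGE = {
    "asthma_flare": {"wheezing", "low_peak_flow", "night_cough"},
    "common_cold": {"night_cough", "runny_nose"},
    "exercise_fatigue": {"elevated_heart_rate"},
}


def explain(observed):
    """Return the abstractions that account for every observed property."""
    observed = set(observed)
    return sorted(name for name, props in KNOWLEDGE.items() if observed <= props)


def discriminate(observed):
    """Return unobserved properties that would narrow down the candidates."""
    observed = set(observed)
    candidates = explain(observed)
    if len(candidates) < 2:
        return []  # nothing left to disambiguate
    useful = set()
    all_props = set().union(*(KNOWLEDGE[c] for c in candidates))
    for prop in all_props - observed:
        holders = [c for c in candidates if prop in KNOWLEDGE[c]]
        # A property helps only if it splits the candidate set.
        if 0 < len(holders) < len(candidates):
            useful.add(prop)
    return sorted(useful)


if __name__ == "__main__":
    print(explain(["night_cough"]))
    print(discriminate(["night_cough"]))
    print(explain(["night_cough", "wheezing"]))
```

With a single observation, several abstractions remain plausible and `discriminate` suggests what to observe next; adding a second observation narrows the result to one abstraction. A real system would draw the background knowledge from an ontology rather than a hard-coded table.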
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Amit Sheth
Featured Keynote at Worldcomp'14, July 2014: http://www.world-academy-of-science.org/worldcomp14/ws/keynotes/keynote_sheth
Video of the talk at: http://youtu.be/2991W7OBLqU
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there is rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is human health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information, etc.). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will forward the concept of Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If I am an asthma patient, for all the data relevant to me with the four V-challenges, what I care about is simply, “How is my current health, and what is the risk of having an asthma attack in my personal situation, especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city. I will present examples from a couple of these.
This is a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for material genome initiative, etc. Dec 2015 edition.
Smart Data - How you and I will exploit Big Data for personalized digital hea...Amit Sheth
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there is rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health, which relates to making better decisions about our health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what is the risk of her having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
2024 Trend Updates: What Really Works In SEO & Content MarketingSearch Engine Journal
The future of SEO is trending toward a more human-first and user-centric approach, powered by AI intelligence and collaboration. Are you ready?
Watch as we explore which SEO trends to prioritize to achieve sustainable growth and deliver reliable results. We’ll dive into best practices to adapt your strategy around industry-wide disruptions like SGE, how to navigate the top challenges SEO professionals are facing, and proven tactics for prioritizing quality and building trust.
You’ll hear:
- The top SEO trends to prioritize in 2024 to achieve long-term success.
- Predictions for SGE’s impact, and how to adapt.
- What E-E-A-T really means, and how to implement it holistically (hint: it’s never been more important).
With Zack Kadish and Alex Carchietta, we’ll show you which SEO trends to ignore and which to focus on, along with the solution to overcoming rapid, significant and disruptive Google algorithm updates.
If you’re looking to cut through the noise of constant SEO and content trends to drive success, you won’t want to miss this webinar.
Storytelling For The Web: Integrate Storytelling in your Design ProcessChiara Aliotta
In these slides I explain how I have used storytelling techniques to elevate websites and brands and create memorable user experiences. You can discover practical tips as I showcase the elements of good storytelling and how it is applied to some examples of diverse brands/projects.
This presentation by Thibault Schrepel, Associate Professor of Law at Vrije Universiteit Amsterdam, was made during the discussion “Artificial Intelligence, Data and Competition” held at the 143rd meeting of the OECD Competition Committee on 12 June 2024. More papers and presentations on the topic can be found at oe.cd/aicomp.
This presentation was uploaded with the author’s consent.
Leveraging Knowledge Graphs for RAG: A Smarter Approach to Contextual AI Appl...All Things Open
Presented at All Things Open AI 2025
Presented by David vonThenen - DigitalOcean
Title: Leveraging Knowledge Graphs for RAG: A Smarter Approach to Contextual AI Applications
Abstract: In the ever-evolving field of AI, retrieval-augmented generation (RAG) systems have become critical for delivering high-quality, contextually relevant answers in applications powered by large language models (LLMs). While vector databases have traditionally dominated RAG applications, graph databases, specifically knowledge graphs, offer a transformative approach to contextual AI that’s often overlooked. This approach provides unique advantages for applications requiring deep insights, intelligent search, and reasoning over both structured and unstructured sources, making it ideal for complex business scenarios.
Attendees will leave with an understanding of how to build a RAG system using a graph database and practical skills for data querying and insights retrieval. By comparing graph and vector database approaches, we’ll highlight when and why graph databases may offer superior benefits for managing complex data relationships. The session will provide concrete examples and advanced techniques, empowering participants to incorporate knowledge graphs into their AI systems for better data-driven outcomes and improved LLM performance. This discussion will conclude with a live demo showcasing key techniques and insights covered in this talk.
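As a rough illustration of the graph-based retrieval step described in the abstract above, the following Python sketch uses a tiny in-memory list of triples in place of a real graph database. It is not the speaker's demo: the triples, the entity names, and the functions `retrieve_facts` and `build_prompt` are hypothetical, and entity matching is naive substring search.

```python
# Hypothetical toy knowledge graph as (subject, relation, object) triples.
# All entities and facts are invented for illustration.
TRIPLES = [
    ("Acme Corp", "supplies", "Widget Inc"),
    ("Widget Inc", "located_in", "Lisbon"),
    ("Globex", "located_in", "Oslo"),
]


def retrieve_facts(question, triples=TRIPLES, hops=2):
    """Collect triples reachable within `hops` edges of entities named in the question."""
    q = question.lower()
    # Seed the search with every entity whose name appears in the question.
    frontier = {e for s, _, o in triples for e in (s, o) if e.lower() in q}
    seen, facts = set(), []
    for _ in range(hops):
        next_frontier = set()
        for t in triples:
            s, _, o = t
            if (s in frontier or o in frontier) and t not in facts:
                facts.append(t)
                next_frontier.update({s, o})
        seen |= frontier
        frontier = next_frontier - seen  # only expand newly reached entities
    return facts


def build_prompt(question, triples=TRIPLES, hops=2):
    """Format the retrieved facts as context ahead of the question for an LLM."""
    facts = retrieve_facts(question, triples, hops)
    context = "\n".join(f"- {s} {r.replace('_', ' ')} {o}" for s, r, o in facts)
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("Where is the company that Acme Corp supplies located?"))
```

The second hop is what a relationship-aware store contributes here: the question never names Widget Inc, yet its location is pulled in by following the `supplies` edge. A production system would replace the list scan with queries against a graph database and use a proper entity linker.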
Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
Bluesky: https://bsky.app/profile/allthingsopen.bsky.social
2025 conference: https://2025.allthingsopen.org/
Leveraging Pre-Trained Transformer Models for Protein Function Prediction - T...All Things Open
Presented at All Things Open AI 2025
Presented by Tia Pope - North Carolina A&T
Title: Leveraging Pre-Trained Transformer Models for Protein Function Prediction
Abstract: Transformer-based models, such as ProtGPT2 and ESM, are revolutionizing protein sequence analysis by enabling detailed embeddings and advanced function prediction. This talk provides a hands-on introduction to using pre-trained open-source transformer models for generating protein embeddings and leveraging them for classification tasks. Attendees will learn to tokenize sequences, extract embeddings, and implement machine-learning pipelines for protein function annotation based on Gene Ontology (GO) or Enzyme Commission (EC) numbers. This session will showcase how pre-trained transformers can democratize access to advanced protein analysis techniques while addressing scalability and explainability challenges. After the talk, the speaker will provide a notebook to test basic functionality, enabling participants to explore the concepts discussed.
Making GenAI Work: A structured approach to implementationJeffrey Funk
Richard Self and I present a structured approach to implementing generative AI in your organization, a #technology that sparked the addition of more than ten trillion dollars to the market capitalisations of the Magnificent Seven (Apple, Amazon, Google, Microsoft, Meta, Tesla, and Nvidia) since January 2023.
Companies must experiment with AI to see if particular use cases can work because AI is not like traditional software that does the same thing over and over again. As Princeton University’s Arvind Narayanan says: “It’s more like creative, but unreliable, interns that must be managed in order to improve processes.”
Fine-Tuning Large Language Models with Declarative ML Orchestration - Shivay ...All Things Open
Presented at All Things Open AI 2025
Presented by Shivay Lamba - Couchbase
Title: Fine-Tuning Large Language Models with Declarative ML Orchestration
Abstract: Large Language Models used in tools like ChatGPT are everywhere; however, only a few organisations with massive computing resources are capable of training such large models. While eager to fine-tune these models for specific applications, the broader ML community often grapples with significant infrastructure challenges.
In the session, the audience will learn how open-source ML tooling like Flyte (a Linux Foundation open-source orchestration platform) can be used to provide a declarative specification for the infrastructure required for a wide array of ML workloads, including the fine-tuning of LLMs, even with limited resources. Attendees will learn how to leverage Flyte's capabilities to streamline their ML workflows, overcome infrastructure constraints, reduce cost, and unlock the full potential of LLMs in their specific use case, making it easier for a larger audience to leverage and train LLMs.
Packaging your App for AppExchange – Managed Vs Unmanaged.pptxmohayyudin7826
Learn how to package your app for Salesforce AppExchange with a deep dive into managed vs. unmanaged packages. Understand the best strategies for ISV success and choosing the right approach for your app development goals.
The Rise of AI Agents-From Automation to Autonomous TechnologyImpelsys Inc.
AI agents are more than just a buzzword—they are transforming industries with real autonomy. Unlike traditional AI, they don’t just follow commands; they think, adapt, and act independently. The future isn’t just AI-enabled—it’s AI-powered.
Dev Dives: Unleash the power of macOS Automation with UiPathUiPathCommunity
Join us on March 27 to be among the first to explore UiPath's innovative macOS automation capabilities.
This is a must-attend session for developers eager to unlock the full potential of automation.
📕 This webinar will offer insights on:
How to design, debug, and run automations directly on your Mac using UiPath Studio Web and UiPath Assistant for Mac.
We’ll walk you through local debugging on macOS, working with native UI elements, and integrating with key tools like Excel on Mac.
👨🏫 Speakers:
Andrei Oros, Product Management Director @UiPath
Silviu Tanasie, Senior Product Manager @UiPath
The Death of the Browser - Rachel-Lee Nabors, AgentQLAll Things Open
Presented at All Things Open AI 2025
Presented by Rachel-Lee Nabors - AgentQL
Title: The Death of the Browser
Abstract: In ten years, Internet Browsers may be a nostalgic memory. As enterprises face mounting API costs and integration headaches, a new paradigm is emerging. The internet's evolution from an open highway into a maze of walled gardens and monetized APIs has created significant challenges for businesses—but it has also set the stage for accessing and organizing the world’s information.
This lightning talk traces our journey from the invention of the browser, through the arms race of scraping for data and access to it, to the dawn of AI agents, showing how the challenges of today opened the door to tomorrow. See how technologies refined by the web scraping community are combining with large language models to create practical alternatives to costly API integrations.
From the rise of platform monopolies to the emergence of AI agents, this timeline-based exploration will help you understand where we've been, where we are, and where we're heading. Join us for a glimpse of how AI agents are enabling a return to the era of free information with the web as the API.
Delivering your own state-of-the-art enterprise LLMsAI Infra Forum
MemVerge CEO Charles Fan describes a software stack that can simplify and expedite the deployment of language models with capabilities such as GPU-as-a-Service, Training-as-a-Service, Inference-as-a-Service, and Transparent Checkpointing.
Don't just talk to AI, do more with AI: how to improve productivity with AI a...All Things Open
Presented at All Things Open AI 2025
Presented by Sheng Liang - Acorn Labs
Title: Don't just talk to AI, do more with AI: how to improve productivity with AI agents
Revolutionizing GPU-as-a-Service for Maximum EfficiencyAI Infra Forum
In this session, we'll explore our cutting-edge GPU-as-a-Service solution designed to transform enterprise AI operations. Learn how our MemVerge.ai platform maximizes GPU utilization, streamlines workload management, and ensures uninterrupted operations through innovative features like Dynamic GPU Surfing. We'll dive into key use cases, from training large language models to enterprise-scale AI deployment. We'll demonstrate how our solution benefits various stakeholders – from platform engineers to data scientists and decision-makers. Discover how our platform optimizes costs while maintaining data security and sovereignty.
How to Leverage AI to Boost Employee Wellness - Lydia Di Francesco - SocialHR...SocialHRCamp
Speaker: Lydia Di Francesco
In this workshop, participants will delve into the realm of AI and its profound potential to revolutionize employee wellness initiatives. From stress management to fostering work-life harmony, AI offers a myriad of innovative tools and strategies that can significantly enhance the wellbeing of employees in any organization. Attendees will learn how to effectively leverage AI technologies to cultivate a healthier, happier, and more productive workforce. Whether it's utilizing AI-powered chatbots for mental health support, implementing data analytics to identify internal, systemic risk factors, or deploying personalized wellness apps, this workshop will equip participants with actionable insights and best practices to harness the power of AI for boosting employee wellness. Join us and discover how AI can be a strategic partner towards a culture of wellbeing and resilience in the workplace.
2024 State of Marketing Report – by HubspotMarius Sescu
https://www.hubspot.com/state-of-marketing
· Scaling relationships and proving ROI
· Social media is the place for search, sales, and service
· Authentic influencer partnerships fuel brand growth
· The strongest connections happen via call, click, chat, and camera.
· Time saved with AI leads to more creative work
· Seeking: A single source of truth
· TLDR; Get on social, try AI, and align your systems.
· More human marketing, powered by robots
Dev Dives: Unleash the power of macOS Automation with UiPathUiPathCommunity
Join us on March 27 to be among the first to explore UiPath innovative macOS automation capabilities.
This is a must-attend session for developers eager to unlock the full potential of automation.
📕 This webinar will offer insights on:
How to design, debug, and run automations directly on your Mac using UiPath Studio Web and UiPath Assistant for Mac.
We’ll walk you through local debugging on macOS, working with native UI elements, and integrating with key tools like Excel on Mac.
This is a must-attend session for developers eager to unlock the full potential of automation.
👨🏫 Speakers:
Andrei Oros, Product Management Director @UiPath
SIlviu Tanasie, Senior Product Manager @UiPath
The Death of the Browser - Rachel-Lee Nabors, AgentQLAll Things Open
Presented at All Things Open AI 2025
Presented by Rachel-Lee Nabors - AgentQL
Title: The Death of the Browser
Abstract: In ten years, Internet Browsers may be a nostalgic memory. As enterprises face mounting API costs and integration headaches, a new paradigm is emerging. The internet's evolution from an open highway into a maze of walled gardens and monetized APIs has created significant challenges for businesses—but it has also set the stage for accessing and organizing the world’s information.
This lightning talk traces our journey from the invention of the browser to the arms race of scraping for data and access to it to the dawn of AI agents, showing how the challenges of today opened the door to tomorrow. See how technologies refined by the web scraping community are combining with large language models to create practical alternatives to costly API integrations.
From the rise of platform monopolies to the emergence of AI agents, this timeline-based exploration will help you understand where we've been, where we are, and where we're heading. Join us for a glimpse of how AI agents are enabling a return to the era of free information with the web as the API.
Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
Bluesky: https://bsky.app/profile/allthingsopen.bsky.social
2025 conference: https://2025.allthingsopen.org/
Delivering your own state-of-the-art enterprise LLMsAI Infra Forum
MemVerge CEO Charles Fan describes a software stack that can simplify and expedite the deployment of language models with capabilities such as GPU-as-a-Service, Training-as-a-Service, Inference-as-a-Service, and Transparent Checkpointing.
EaseUS Partition Master Crack 2025 + Serial Keypiolttruth25
https://ncracked.com/7961-2/
Note: >>👆👆 Please copy the link and paste it into Google New Tab now Download link
EASEUS Partition Master Crack is a professional hard disk partition management tool and system partition optimization software. It is an all-in-one PC and server disk management toolkit for IT professionals, system administrators, technicians, and consultants to provide technical services to customers with unlimited use.
EASEUS Partition Master 18.0 Technician Edition Crack interface is clean and tidy, so all options are at your fingertips. Whether you want to resize, move, copy, merge, browse, check, convert partitions, or change their labels, you can do everything with a few clicks. The defragmentation tool is also designed to merge fragmented files and folders and store them in contiguous locations on the hard drive.
IObit Driver Booster Pro Crack 12.2.0 with License Key [2025]jamesfolkner123
COPY & PASTE LINK👉👉👉https://serialsofts.com/dl/ IOBIT Driver Booster Pro is an application that can update all the drivers and game components present on the computer.
Don't just talk to AI, do more with AI: how to improve productivity with AI a...All Things Open
Presented at All Things Open AI 2025
Presented by Sheng Liang - Acorn Labs
Title: Don't just talk to AI, do more with AI: how to improve productivity with AI agents
Find more info about All Things Open:
On the web: https://www.allthingsopen.org/
Twitter: https://twitter.com/AllThingsOpen
LinkedIn: https://www.linkedin.com/company/all-things-open/
Instagram: https://www.instagram.com/allthingsopen/
Facebook: https://www.facebook.com/AllThingsOpen
Mastodon: https://mastodon.social/@allthingsopen
Threads: https://www.threads.net/@allthingsopen
Bluesky: https://bsky.app/profile/allthingsopen.bsky.social
2025 conference: https://2025.allthingsopen.org/
Revolutionizing GPU-as-a-Service for Maximum EfficiencyAI Infra Forum
In this session, we'll explore our cutting-edge GPU-as-a-Service solution designed to transform enterprise AI operations. Learn how our MemVerge.ai platform maximizes GPU utilization, streamlines workload management, and ensures uninterrupted operations through innovative features like Dynamic GPU Surfing. We'll dive into key use cases, from training large language models to enterprise-scale AI deployment. We'll demonstrate how our solution benefits various stakeholders – from platform engineers to data scientists and decision-makers. Discover how our platform optimizes costs while maintaining data security and sovereignty.
How to Leverage AI to Boost Employee Wellness - Lydia Di Francesco - SocialHR...SocialHRCamp
Speaker: Lydia Di Francesco
In this workshop, participants will delve into the realm of AI and its profound potential to revolutionize employee wellness initiatives. From stress management to fostering work-life harmony, AI offers a myriad of innovative tools and strategies that can significantly enhance the wellbeing of employees in any organization. Attendees will learn how to effectively leverage AI technologies to cultivate a healthier, happier, and more productive workforce. Whether it's utilizing AI-powered chatbots for mental health support, implementing data analytics to identify internal, systemic risk factors, or deploying personalized wellness apps, this workshop will equip participants with actionable insights and best practices to harness the power of AI for boosting employee wellness. Join us and discover how AI can be a strategic partner towards a culture of wellbeing and resilience in the workplace.
2024 State of Marketing Report – by HubspotMarius Sescu
https://www.hubspot.com/state-of-marketing
· Scaling relationships and proving ROI
· Social media is the place for search, sales, and service
· Authentic influencer partnerships fuel brand growth
· The strongest connections happen via call, click, chat, and camera.
· Time saved with AI leads to more creative work
· Seeking: A single source of truth
· TLDR; Get on social, try AI, and align your systems.
· More human marketing, powered by robots
ChatGPT is a revolutionary addition to the world since its introduction in 2022. A big shift in the sector of information gathering and processing happened because of this chatbot. What is the story of ChatGPT? How is the bot responding to prompts and generating contents? Swipe through these slides prepared by Expeed Software, a web development company regarding the development and technical intricacies of ChatGPT!
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
The realm of product design is a constantly changing environment where technology and style intersect. Every year introduces fresh challenges and exciting trends that mold the future of this captivating art form. In this piece, we delve into the significant trends set to influence the look and functionality of product design in the year 2024.
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
Mental health has been in the news quite a bit lately. Dozens of U.S. states are currently suing Meta for contributing to the youth mental health crisis by inserting addictive features into their products, while the U.S. Surgeon General is touring the nation to bring awareness to the growing epidemic of loneliness and isolation. The country has endured periods of low national morale, such as in the 1970s when high inflation and the energy crisis worsened public sentiment following the Vietnam War. The current mood, however, feels different. Gallup recently reported that national mental health is at an all-time low, with few bright spots to lift spirits.
To better understand how Americans are feeling and their attitudes towards mental health in general, ThinkNow conducted a nationally representative quantitative survey of 1,500 respondents and found some interesting differences among ethnic, age and gender groups.
Technology
For example, 52% agree that technology and social media have a negative impact on mental health, but when broken out by race, 61% of Whites felt technology had a negative effect, and only 48% of Hispanics thought it did.
While technology has helped us keep in touch with friends and family in faraway places, it appears to have degraded our ability to connect in person. Staying connected online is a double-edged sword since the same news feed that brings us pictures of the grandkids and fluffy kittens also feeds us news about the wars in Israel and Ukraine, the dysfunction in Washington, the latest mass shooting and the climate crisis.
Hispanics may have a built-in defense against the isolation technology breeds, owing to their large, multigenerational households, strong social support systems, and tendency to use social media to stay connected with relatives abroad.
Age and Gender
When asked how individuals rate their mental health, men rate it higher than women by 11 percentage points, and Baby Boomers rank it highest at 83%, saying it’s good or excellent vs. 57% of Gen Z saying the same.
Gen Z spends the most amount of time on social media, so the notion that social media negatively affects mental health appears to be correlated. Unfortunately, Gen Z is also the generation that’s least comfortable discussing mental health concerns with healthcare professionals. Only 40% of them state they’re comfortable discussing their issues with a professional compared to 60% of Millennials and 65% of Boomers.
Race Affects Attitudes
As seen in previous research conducted by ThinkNow, Asian Americans lag other groups when it comes to awareness of mental health issues. Twenty-four percent of Asian Americans believe that having a mental health issue is a sign of weakness compared to the 16% average for all groups. Asians are also considerably less likely to be aware of mental health services in their communities (42% vs. 55%) and most likely to seek out information on social media (51% vs. 35%).
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
Creative operations teams expect increased AI use in 2024. Currently, over half of tasks are not AI-enabled, but this is expected to decrease in the coming year. ChatGPT is the most popular AI tool currently. Business leaders are more actively exploring AI benefits than individual contributors. Most respondents do not believe AI will impact workforce size in 2024. However, some inhibitions still exist around AI accuracy and lack of understanding. Creatives primarily want to use AI to save time on mundane tasks and boost productivity.
Organizational culture includes values, norms, systems, symbols, language, assumptions, beliefs, and habits that influence employee behaviors and how people interpret those behaviors. It is important because culture can help or hinder a company's success. Some key aspects of Netflix's culture that help it achieve results include hiring smartly so every position has stars, focusing on attitude over just aptitude, and having a strict policy against peacocks, whiners, and jerks.
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
PepsiCo provided a safe harbor statement noting that any forward-looking statements are based on currently available information and are subject to risks and uncertainties. It also provided information on non-GAAP measures and directing readers to its website for disclosure and reconciliation. The document then discussed PepsiCo's business overview, including that it is a global beverage and convenient food company with iconic brands, $91 billion in net revenue in 2023, and nearly $14 billion in core operating profit. It operates through a divisional structure with a focus on local consumers.
Content Methodology: A Best Practices Report (Webinar)contently
This document provides an overview of content methodology best practices. It defines content methodology as establishing objectives, KPIs, and a culture of continuous learning and iteration. An effective methodology focuses on connecting with audiences, creating optimal content, and optimizing processes. It also discusses why a methodology is needed due to the competitive landscape, proliferation of channels, and opportunities for improvement. Components of an effective methodology include defining objectives and KPIs, audience analysis, identifying opportunities, and evaluating resources. The document concludes with recommendations around creating a content plan, testing and optimizing content over 90 days.
How to Prepare For a Successful Job Search for 2024Albert Qian
The document provides guidance on preparing a job search for 2024. It discusses the state of the job market, focusing on growth in AI and healthcare but also continued layoffs. It recommends figuring out what you want to do by researching interests and skills, then conducting informational interviews. The job search should involve building a personal brand on LinkedIn, actively applying to jobs, tailoring resumes and interviews, maintaining job hunting as a habit, and continuing self-improvement. Once hired, the document advises setting new goals and keeping skills and networking active in case of future opportunities.
A report by thenetworkone and Kurio.
The contributing experts and agencies are (in an alphabetical order): Sylwia Rytel, Social Media Supervisor, 180heartbeats + JUNG v MATT (PL), Sharlene Jenner, Vice President - Director of Engagement Strategy, Abelson Taylor (USA), Alex Casanovas, Digital Director, Atrevia (ES), Dora Beilin, Senior Social Strategist, Barrett Hoffher (USA), Min Seo, Campaign Director, Brand New Agency (KR), Deshé M. Gully, Associate Strategist, Day One Agency (USA), Francesca Trevisan, Strategist, Different (IT), Trevor Crossman, CX and Digital Transformation Director; Olivia Hussey, Strategic Planner; Simi Srinarula, Social Media Manager, The Hallway (AUS), James Hebbert, Managing Director, Hylink (CN / UK), Mundy Álvarez, Planning Director; Pedro Rojas, Social Media Manager; Pancho González, CCO, Inbrax (CH), Oana Oprea, Head of Digital Planning, Jam Session Agency (RO), Amy Bottrill, Social Account Director, Launch (UK), Gaby Arriaga, Founder, Leonardo1452 (MX), Shantesh S Row, Creative Director, Liwa (UAE), Rajesh Mehta, Chief Strategy Officer; Dhruv Gaur, Digital Planning Lead; Leonie Mergulhao, Account Supervisor - Social Media & PR, Medulla (IN), Aurelija Plioplytė, Head of Digital & Social, Not Perfect (LI), Daiana Khaidargaliyeva, Account Manager, Osaka Labs (UK / USA), Stefanie Söhnchen, Vice President Digital, PIABO Communications (DE), Elisabeth Winiartati, Managing Consultant, Head of Global Integrated Communications; Lydia Aprina, Account Manager, Integrated Marketing and Communications; Nita Prabowo, Account Manager, Integrated Marketing and Communications; Okhi, Web Developer, PNTR Group (ID), Kei Obusan, Insights Director; Daffi Ranandi, Insights Manager, Radarr (SG), Gautam Reghunath, Co-founder & CEO, Talented (IN), Donagh Humphreys, Head of Social and Digital Innovation, THINKHOUSE (IRE), Sarah Yim, Strategy Director, Zulu Alpha Kilo (CA).
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
The search marketing landscape is evolving rapidly with new technologies, and professionals, like you, rely on innovative paid search strategies to meet changing demands.
It’s important that you’re ready to implement new strategies in 2024.
Check this out and learn the top trends in paid search advertising that are expected to gain traction, so you can drive higher ROI more efficiently in 2024.
You’ll learn:
- The latest trends in AI and automation, and what this means for an evolving paid search ecosystem.
- New developments in privacy and data regulation.
- Emerging ad formats that are expected to make an impact next year.
Watch Sreekant Lanka from iQuanti and Irina Klein from OneMain Financial as they dive into the future of paid search and explore the trends, strategies, and technologies that will shape the search marketing landscape.
If you’re looking to assess your paid search strategy and design an industry-aligned plan for 2024, then this webinar is for you.
5 Public speaking tips from TED - Visualized summarySpeakerHub
From their humble beginnings in 1984, TED has grown into the world’s most powerful amplifier for speakers and thought-leaders to share their ideas. They have over 2,400 filmed talks (not including the 30,000+ TEDx videos) freely available online, and have hosted over 17,500 events around the world.
With over one billion views in a year, it’s no wonder that so many speakers are looking to TED for ideas on how to share their message more effectively.
The article “5 Public-Speaking Tips TED Gives Its Speakers”, by Carmine Gallo for Forbes, gives speakers five practical ways to connect with their audience, and effectively share their ideas on stage.
Whether you are gearing up to get on a TED stage yourself, or just want to master the skills that so many of their speakers possess, these tips and quotes from Chris Anderson, the TED Talks Curator, will encourage you to make the most impactful impression on your audience.
See the full article and more summaries like this on SpeakerHub here: https://speakerhub.com/blog/5-presentation-tips-ted-gives-its-speakers
See the original article on Forbes here:
http://www.forbes.com/forbes/welcome/?toURL=http://www.forbes.com/sites/carminegallo/2016/05/06/5-public-speaking-tips-ted-gives-its-speakers/&refURL=&referrer=#5c07a8221d9b
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
Everyone is in agreement that ChatGPT (and other generative AI tools) will shape the future of work. Yet there is little consensus on exactly how, when, and to what extent this technology will change our world.
Businesses that extract maximum value from ChatGPT will use it as a collaborative tool for everything from brainstorming to technical maintenance.
For individuals, now is the time to pinpoint the skills the future professional will need to thrive in the AI age.
Check out this presentation to understand what ChatGPT is, how it will shape the future of work, and how you can prepare to take advantage.
The document provides career advice for getting into the tech field, including:
- Doing projects and internships in college to build a portfolio.
- Learning about different roles and technologies through industry research.
- Contributing to open source projects to build experience and network.
- Developing a personal brand through a website and social media presence.
- Networking through events, communities, and finding a mentor.
- Practicing interviews through mock interviews and whiteboarding coding questions.
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
1. Core updates from Google periodically change how its algorithms assess and rank websites and pages. This can impact rankings through shifts in user intent, site quality issues being caught up to, world events influencing queries, and overhauls to search like the E-A-T framework.
2. There are many possible user intents beyond just transactional, navigational and informational. Identifying intent shifts is important during core updates. Sites may need to optimize for new intents through different content types and sections.
3. Responding effectively to core updates requires analyzing "before and after" data to understand changes, identifying new intents or page types, and ensuring content matches appropriate intents across video, images, knowledge graphs and more.
A brief introduction to DataScience with explaining of the concepts, algorithms, machine learning, supervised and unsupervised learning, clustering, statistics, data preprocessing, real-world applications etc.
It's part of a Data Science Corner Campaign where I will be discussing the fundamentals of DataScience, AIML, Statistics etc.
Time Management & Productivity - Best PracticesVit Horky
Here's my presentation on by proven best practices how to manage your work time effectively and how to improve your productivity. It includes practical tips and how to use tools such as Slack, Google Apps, Hubspot, Google Calendar, Gmail and others.
The six step guide to practical project managementMindGenius
The six step guide to practical project management
If you think managing projects is too difficult, think again.
We’ve stripped back project management processes to the
basics – to make it quicker and easier, without sacrificing
the vital ingredients for success.
“If you’re looking for some real-world guidance, then The Six Step Guide to Practical Project Management will help.”
Dr Andrew Makar, Tactical Project Management
3. 1% of the data is used for analysis.
http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode
http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
6. Current Focus on Big Data
• Focus on verticals: advertising, social media, retail, financial services, telecom, and healthcare
– Aggregate data, focused on transactions, limited integration (limited complexity), analytics to find (simple) patterns
– Emphasis on technologies to handle volume/scale, and to a lesser extent velocity: Hadoop, NoSQL, MPP warehouse …
– Full faith in the power of data (no hypothesis), bottom-up analysis
7. Questions typically asked on Big Data
• What if your data volume gets so large and varied you don't know how to deal with it?
• Do you store all your data?
• Do you analyze it all?
• How can you find out which data points are really important?
• How can you use it to your best advantage?
http://www.sas.com/big-data/
9. Illustrative Big Data Applications
• Prediction of the spread of flu in real time during H1N1 2009
– Google tested a mammoth 450 million different mathematical models of the search terms, comparing their predictions against the actual flu cases; 45 important parameters were found
– The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013]
• FareCast: predict the direction of air fares over different routes [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013]
• NY city manholes problem [ICML Discussion, 2012]
10. What is missing?
• Current focus is mainly on serving business intelligence and targeted analytics needs, not on serving complex individual and collective human needs (e.g., empowering humans in health, fitness and well-being; better disaster coordination) that are highly personalized/individualized/contextualized
– Incorporate real-world complexity: multi-modal and multi-sensory nature of the real world and human perception
– Need deeper understanding of data and its role in information (e.g., skew, coverage)
• Human involvement and guidance: leading to actionable information, understanding and insight right in the context of human activities
– Bottom-up & top-down processing: infusion of models and background knowledge (data + knowledge + reasoning)
12. Smart Data
Smart data makes sense out of Big Data.
It provides value by harnessing the challenges posed by the volume, velocity, variety and veracity of big data, in turn providing actionable information and improving decision making.
13. Another perspective on Smart Data
“OF human, BY human and FOR human”
Smart data is focused on the actionable value achieved by human involvement in data creation, processing and consumption phases for improving the human experience.
15. Another perspective on Smart Data
“OF human, BY human and FOR human”
16. ‘OF human’: Relevant Real-time Data Streams for Human Experience
Petabytes of Physical (sensory)-Cyber-Social data every day!
More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
17. Another perspective on Smart Data
“OF human, BY human and FOR human”
18. ‘BY human’: Involving Crowd Intelligence in data processing workflows
Use of Prior Human-created Knowledge Models
Crowdsourcing and Domain-expert guided Machine Learning Modeling
19. Another perspective on Smart Data
“OF human, BY human and FOR human”
20. ‘FOR human’: Improving Human Experience
Population Level
Personal
Public Health
Weather Application
Asthma Healthcare Application
Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO2 level
Action in the Physical World
Close the window at home during the day to keep CO2 from gushing in, to avoid asthma attacks at night
21. Why do we care about Smart Data rather than Big Data?
22. Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity using semantics and Semantic Web
Amit Sheth
Keynote at SEBD 2013, July 1, 2013 and invited talk in universities in Spain, June 2013.
The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State, USA
Contributions by many, but Special Thanks to: Pavan Kapanipathi, Pramod Anantharam, Cory Henson, Dr. T.K. Prasad, Maryam Panahiazar, Hemant Purohit
23. Big Data to Smart Data: Disaster Management example
Second-costliest hurricane in United States history, with estimated damage of $75 billion
90-115 mph winds
State of Emergency in New York
285 people killed on the track of Sandy
750,000 without power (NY)
Immense devastation and human suffering
http://www.huffingtonpost.com/2012/10/30/hurricane-sandy-power-outage-map-infographic_n_2044411.html
24. Social (Big) Data during Hurricane Sandy
20 million tweets with “sandy, hurricane” keywords between Oct 27th and Nov 1st
2nd most popular topic on Facebook during 2012
• http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding
• http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html
• http://mashable.com/2012/10/31/hurricane-sandy-facebook/
25. BIG DATA TO SMART DATA: WHY? and FOR WHOM?
WHY?
For information seeking
For timely information
For unique information
For unfiltered information
To determine disaster magnitude
To check in with family and friends
To self-mobilize
To maintain a sense of community
To seek emotional support and healing
FOR WHOM?
Governments
Emergency management organizations
Journalists
Disaster responders
Public
Fraustino et al. Social Media Use during Disasters: A Review of the Knowledge Base and Gaps. US Dept. of Homeland Security, START 2012.
26. Can SNSs make Disaster Management easier? Giving Actionable Information (Smart Data)
Improving situational awareness
- Timely delivery of necessary information to the right people
Improving coordination between resource seekers and suppliers
Detecting the magnitude of a disaster from people's sentiments.
Many more challenges…
http://www.buzzfeed.com/annanorth/how-social-media-is-aiding-the-hurricane-sandy-rec
http://blog.twitter.com/2012/10/hurricane-sandy-resources-on-twitter.html
http://www.treehugger.com/culture/12-ways-help-hurricane-sandy-relief-efforts.html
27. Volume
Twitter hits half a billion tweets a day!
Challenges
Delivering the necessary/actionable information to the right people
http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
28. Velocity
Volume
The @ConEdison Twitter handle, which the company had set up only in June, gained an extra 16,000 followers during the storm. Did the information reach everyone?
Challenges
Delivering the necessary/actionable information to the right people
Rate of Data Arrival
Approximately 7000 tweets per second (TPS)
10 images per second on Instagram
http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
http://www.internews.org/sites/default/files/resources/InternewsEurope_Report_Japan_Connecting%20the%20last%20mile%20Japan_2013.pdf
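The Volume and Velocity figures quoted on these slides can be related with simple arithmetic. The sketch below is only illustrative: it assumes the "half a billion tweets a day" figure is a daily total spread evenly over 24 hours and reads TPS as tweets per second; the variable names are made up for this example.

```python
# Relate the quoted daily tweet volume to a per-second arrival rate.
SECONDS_PER_DAY = 24 * 60 * 60

tweets_per_day = 500_000_000  # "half a billion tweets a day" (figure from the slides)
quoted_tps = 7000             # "Approximately 7000 TPS" (figure from the slides)

# Average arrival rate if the daily volume were spread evenly (an assumption).
average_tps = tweets_per_day / SECONDS_PER_DAY

# How the quoted rate compares with that daily average.
ratio = quoted_tps / average_tps

print(f"average rate: {average_tps:.0f} tweets/second")
print(f"quoted rate is {ratio:.2f}x the daily average")
```

Under that even-spread assumption the quoted rate sits somewhat above the daily average, which would be consistent with a burst of activity, though the slides do not say how the 7000 figure was measured.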
33. Value
-Makes Sense
-Actionable Information
-Decision support/making
Disaster Management
Victims
Timely and Contextual Information about
• Electricity, Food, Water, Shelter and donation offers related to the disaster.
Data
http://www.wired.com/insights/2013/04/big-data-fast-data-smart-data/
35. Applications of Smart Data Analytics
• Healthcare
– kHealth
– SemHeath
• Social event coordination
– Twitris
• Traffic monitoring
– kTraffic
36. The Patient of the Future
MIT Technology Review, 2012
http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/
37. Smart Data in Healthcare
To gain new insight in patient care & early indications of disease
38. Sensing is a key enabler of the Internet of Things
BUT, how do we make sense of the resulting avalanche of sensor data?
50 Billion Things by 2020 (Cisco)
39. Parkinson’s disease (PD) data from The Michael J. Fox Foundation
for Parkinson’s Research.
39
1https://www.kaggle.com/c/predicting-parkinson-s-disease-progression-with-smartphone-data
8 weeks of data from 5 sensors on a smartphone, collected for 16 patients,
resulting in ~12 GB (with a lot of missing data).
Variety Volume
Veracity Velocity
Value
Can we detect the onset of Parkinson’s disease?
Can we characterize the disease progression?
Can we provide actionable information to the patient?
semantics
Representing prior knowledge of PD
led to a focused exploration of this
massive dataset
WHY Big Data to Smart Data: Healthcare example
40. 40
Big Data to Smart Data Using a Knowledge Based Approach
ParkinsonMild(person) = Tremor(person) ∧ PoorBalance(person)
ParkinsonModerate(person) = MoveSlow(person) ∧ PoorSleep(person) ∧ MonotoneSpeech(person)
ParkinsonAdvanced(person) = Fall(person)
Control Group: Movements of an active person have a good distribution over the X, Y, and Z axes
PD Patients: Restricted movements by a PD patient can be seen in the acceleration readings
Control Group: Audio is well modulated, with good variation in the energy of the voice
PD Patients: Audio is not well modulated, representing monotone speech
Declarative Knowledge of
Parkinson’s Disease used to focus
our attention on symptom
manifestations in sensor
observations
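The three declarative rules above can be sketched as a small classifier. This is only an illustration: the `Symptoms` record, the `parkinson_stage` function, and the choice to report the most advanced matching stage first are assumptions, not part of the slides. Detecting each symptom from sensor readings is out of scope here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Symptoms:
    """Boolean symptom flags (hypothetical record; one field per predicate on the slide)."""
    tremor: bool = False
    poor_balance: bool = False
    move_slow: bool = False
    poor_sleep: bool = False
    monotone_speech: bool = False
    fall: bool = False


def parkinson_stage(s: Symptoms) -> Optional[str]:
    """Apply the slide's rules; checking the most advanced stage first is an assumption."""
    if s.fall:  # ParkinsonAdvanced(person) = Fall(person)
        return "advanced"
    if s.move_slow and s.poor_sleep and s.monotone_speech:  # ParkinsonModerate
        return "moderate"
    if s.tremor and s.poor_balance:  # ParkinsonMild
        return "mild"
    return None  # no rule fires


print(parkinson_stage(Symptoms(tremor=True, poor_balance=True)))
```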
41. • 25 million people in the U.S. are diagnosed with
asthma (7 million are children)1.
• 300 million people suffering from asthma
worldwide2.
• Asthma related healthcare costs alone are around
$50 billion a year2.
• 155,000 hospital admissions and 593,000 emergency
department visits in 20063.
41
1http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/
2http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html
3Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics,123(Supplement 3), S131-S145.
Asthma: Severity of the problem
42. Asthma is a multifactorial disease with health signals spanning
personal, public health, and population levels.
42
Real-time health signals from personal level (e.g., Wheezometer, NO in
breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and
population level (e.g., pollen level, CO2) arriving continuously in fine grained
samples potentially with missing information and uneven sampling frequencies.
Variety Volume
Veracity Velocity
Value
Can we detect the asthma severity level?
Can we characterize asthma control level?
What risk factors influence asthma control?
What is the contribution of each risk factor?
semantics
Understanding relationships between
health signals and asthma attacks
for providing actionable information
WHY Big Data to Smart Data: Healthcare example
43. 43
Population Level
Personal
Public Health
Variety: Health signals span heterogeneous sources
Volume: Health signals are fine grained
Velocity: Real-time change in situations
Veracity: Reliability of health signals may be compromised
Value: Can I reduce my asthma attacks at night?
Decision support to doctors
by providing them with
deeper insights into patient
asthma care
Asthma: Demonstration of Value
44. 44
Sensordrone – for monitoring
environmental air quality
Wheezometer – for monitoring
wheezing sounds
Can I reduce my asthma attacks at night?
What are the triggers?
What is the wheezing level?
What is the propensity toward asthma?
What is the exposure level over a day?
What is the air quality indoors?
Commute to Work
Personal
Public Health
Population Level
Closing the window at home
in the morning and taking an
alternate route to office may
lead to reduced asthma attacks
Actionable
Information
Asthma: Actionable Information for Asthma Patients
45. Personal, Public Health, and Population Level Signals for Monitoring Asthma
Asthma Control => Daily Medication
Severity Level of Asthma | Choices for starting therapy (Recommended Action) | Not Well Controlled (Recommended Action) | Poorly Controlled (Recommended Action)
Intermittent Asthma | SABA prn | - | -
Mild Persistent Asthma | Low dose ICS | Medium ICS | Medium ICS
Moderate Persistent Asthma | Medium dose ICS alone or with LABA/montelukast | Medium ICS + LABA/montelukast or High dose ICS | Medium ICS + LABA/montelukast or High dose ICS*
Severe Persistent Asthma | High dose ICS with LABA/montelukast | Needs specialist care | Needs specialist care
ICS = inhaled corticosteroid; LABA = inhaled long-acting beta2-agonist; SABA = inhaled short-acting beta2-agonist
*consider referral to specialist
Asthma Control
and Actionable Information
Sensors and their observations
for understanding asthma
45
46. 46
Personal
Level Signals
Societal Level
Signals
(Personal Level Signals)
(Personalized
Societal Level Signal)
(Societal Level Signals)
Societal Level Signals
Relevant to the
Personal Level
Personal Level Sensors
(kHealth**) (EventShop*)
Qualify Quantify
Action
Recommendation
What are the features influencing my asthma?
What is the contribution of each of these features?
How controlled is my asthma? (risk score)
What will be my action plan to manage asthma?
Storage
Societal Level Sensors
Asthma Early Warning Model (AEWM)
Query AEWM
Verify & augment
domain knowledge
Recommended
Action
Action
Justification
Asthma Early Warning Model
*http://www.slideshare.net/jain49/eventshop-120721, ** http://www.youtube.com/watch?v=btnRi64hJp4
47. 47
Population Level
Personal
Wheeze – Yes
Do you have tightness of chest? –Yes
Observations, Physical-Cyber-Social System, Health Signal Extraction, Health Signal Understanding
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
RiskCategory
<PollenLevel, ChestTightness, Pollution,
Activity, Wheezing, RiskCategory>
<2, 1, 1,3, 1, RiskCategory>
<2, 1, 1,3, 1, RiskCategory>
<2, 1, 1,3, 1, RiskCategory>
<2, 1, 1,3, 1, RiskCategory>
.
.
.
Expert
Knowledge
Background
Knowledge
tweet reporting pollution level
and asthma attacks
Acceleration readings from
on-phone sensors
Sensor and personal
observations
Signals from personal, personal
spaces, and community spaces
Risk Category assigned by
doctors
Qualify
Quantify
Enrich
Outdoor pollen and pollution
Public Health
Health Signal Extraction to Understanding
Well Controlled - continue
Not Well Controlled – contact nurse
Poorly Controlled – contact doctor
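A minimal sketch of the "qualify, then quantify" step on this slide: symbolic observations are mapped to the numeric vector shown above, and a risk category is mapped to its action. The numeric coding (No=0, Yes=1, Low=1, Medium=2, High=3) is an assumption chosen so that the slide's observations reproduce <2, 1, 1, 3, 1>; the names `encode` and `ACTIONS` are hypothetical.

```python
# Feature order taken from the slide's vector template.
FEATURE_ORDER = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]

# Assumed numeric coding of symbolic values (not stated on the slide).
LEVELS = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Risk category -> action, as listed at the end of the slide.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poorly Controlled": "contact doctor",
}


def encode(observations):
    """Turn {feature: symbolic value} into a numeric vector in FEATURE_ORDER."""
    return [LEVELS[observations[name]] for name in FEATURE_ORDER]


observed = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
print(encode(observed), ACTIONS["Not Well Controlled"])
```

The risk category itself is assigned by doctors in the slide's pipeline, so the sketch only looks it up rather than predicting it.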
48. … and do it efficiently and at scale
What if we could automate this
sense making ability?
48
49. People are good at making sense of sensory input
What can we learn from cognitive models of perception?
• The key ingredient is prior knowledge
49
50. * based on Neisser’s cognitive model of perception
Observe
Property
Perceive
Feature
Explanation
Discrimination
1
2
Perception Cycle*
Translating low-level signals
into high-level knowledge
Focusing attention on those
aspects of the environment that
provide useful information
Prior Knowledge
50
51. To enable machine perception,
Semantic Web technology is used to integrate
sensor data with prior knowledge on the Web
51
52. Prior knowledge on the Web
W3C Semantic Sensor
Network (SSN) Ontology Bi-partite Graph
52
55. Explanation
Inference to the best explanation
• In general, explanation is an abductive problem; and
hard to compute
Finding the sweet spot between abduction and OWL
• Single-feature assumption* enables use of OWL-DL
deductive reasoner
* An explanation must be a single feature which accounts for
all observed properties
Explanation is the act of choosing the objects or events that best account for a set of
observations; often referred to as hypothesis building
55
56. Explanation
Explanatory Feature: a feature that explains the set of observed properties
ExplanatoryFeature ≡ ∃ssn:isPropertyOf⁻.{p1} ⊓ … ⊓ ∃ssn:isPropertyOf⁻.{pn}
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Observed Property Explanatory Feature
56
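Under the single-feature assumption, the explanation step amounts to keeping every feature whose known properties include all observed properties. The sketch below uses plain Python sets rather than an OWL reasoner. The property-to-feature links in `KB` are illustrative assumptions, because the slide's graph edges are not recoverable from the text.

```python
# Illustrative prior knowledge: feature -> properties it can account for.
# These links are assumed for the example; the slide only names the nodes.
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explanatory_features(observed, kb=KB):
    """Features that account for every observed property (single-feature assumption)."""
    observed = set(observed)
    return {feature for feature, props in kb.items() if observed <= props}


print(sorted(explanatory_features({"elevated blood pressure", "palpitations"})))
```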
57. Discrimination is the act of finding those properties that, if observed, would help distinguish
between multiple explanatory features
Observe
Property
Perceive
Feature
Explanation
Discrimination
2
Focusing attention on those
aspects of the environment that
provide useful information
Discrimination
57
58. Discrimination
Expected Property: would be explained by every explanatory feature
ExpectedProperty ≡ ∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ∃ssn:isPropertyOf.{fn}
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Expected Property Explanatory Feature
58
59. Discrimination
Not Applicable Property: would not be explained by any explanatory feature
NotApplicableProperty ≡ ¬∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ¬∃ssn:isPropertyOf.{fn}
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Not Applicable Property Explanatory Feature
59
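The discrimination definitions can be sketched with set operations. Expected properties are shared by every explanatory feature, not-applicable properties belong to none of them, and the remaining properties are the ones worth observing next. Treating "neither expected nor not applicable" as the discriminating set follows from the definitions above but is stated here as an assumption, as are the links in `KB` and all function names.

```python
# Illustrative prior knowledge (assumed links; the slide only names the nodes).
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}
ALL_PROPERTIES = set().union(*KB.values())


def expected(features, kb=KB):
    """Properties explained by every explanatory feature."""
    sets = [kb[f] for f in features]
    return set.intersection(*sets) if sets else set()


def not_applicable(features, kb=KB):
    """Properties explained by none of the explanatory features."""
    explained = set().union(*(kb[f] for f in features))
    return ALL_PROPERTIES - explained


def discriminating(features, kb=KB):
    """Properties whose observation would separate the candidate features."""
    return ALL_PROPERTIES - expected(features, kb) - not_applicable(features, kb)


candidates = {"Hyperthyroidism", "Pulmonary Edema"}
print(discriminating(candidates))
```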
61. Through physical monitoring and
analysis, our cellphones could act as
an early warning system to detect
serious health conditions, and
provide actionable information
canary in a coal mine
Our Motivation
kHealth: knowledge-enabled healthcare
61
62. Qualities
-High BP
-Increased Weight
Entities
-Hypertension
-Hypothyroidism
kHealth
Machine Sensors
Personal Input
EMR/PHR
Comorbidity risk score
e.g., Charlson Index
Longitudinal studies of
cardiovascular risks
- Find correlations
- Validation
- domain knowledge
- domain expert
Parameterize the
model
Risk Assessment Model
Current Observations
-Physical
-Physiological
-History
Risk Score
(Actionable Information)
Model Creation
Validate correlations
Historical observations
of each patient
Risk Score: from Data to Abstraction and Actionable Information
62
63. How do we implement machine perception efficiently on a
resource-constrained device?
Use of OWL reasoner is resource intensive
(especially on resource-constrained devices),
in terms of both memory and time
• Runs out of resources with prior knowledge >> 15 nodes
• Asymptotic complexity: O(n^3)
63
64. intelligence at the edge
Approach 1: Send all sensor observations
to the cloud for processing
Approach 2: downscale semantic
processing so that each device is capable
of machine perception
64
Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained
Devices', ISWC 2012.
65. Efficient execution of machine perception
Use bit vector encodings and their operations to encode prior knowledge and
execute semantic reasoning
010110001101
0011110010101
1000110110110
101100011010
0111100101011
000110101100
0110100111
65
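To give a feel for the bit-vector idea, the sketch below assigns each feature one bit position and stores, for every property, a mask of the features it belongs to. Explanation then reduces to a bitwise AND over the masks of the observed properties. This is not the encoding from Henson et al.; the knowledge base, bit layout, and function names are assumptions for illustration.

```python
# Illustrative prior knowledge (assumed links): feature -> its properties.
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}
FEATURES = sorted(KB)  # bit i of every mask stands for FEATURES[i]


def build_masks(kb, features):
    """For each property, set bit i if FEATURES[i] has that property."""
    masks = {}
    for i, feature in enumerate(features):
        for prop in kb[feature]:
            masks[prop] = masks.get(prop, 0) | (1 << i)
    return masks


MASKS = build_masks(KB, FEATURES)


def explain(observed):
    """AND the masks of all observed properties; surviving bits are the explanations."""
    mask = (1 << len(FEATURES)) - 1  # start with every feature as a candidate
    for prop in observed:
        mask &= MASKS.get(prop, 0)  # unknown properties eliminate all candidates
    return [f for i, f in enumerate(FEATURES) if (mask >> i) & 1]


print(explain(["elevated blood pressure", "palpitations"]))
```

Each observation costs one AND on a machine word (or a few words for larger knowledge bases), which is the kind of operation a resource-constrained device handles cheaply.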
66. O(n^3) < x < O(n^4) → O(n)
Efficiency Improvement
• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device
66
67. Semantic Perception for smarter analytics: 3 ideas to take away
1 Translate low-level data to high-level knowledge
Machine perception can be used to convert low-level sensory
signals into high-level knowledge useful for decision making
2 Prior knowledge is the key to perception
Using SW technologies, machine perception can be formalized and
integrated with prior knowledge on the Web
3 Intelligence at the edge
By downscaling semantic inference, machine perception can
execute efficiently on resource-constrained devices
67
68. • Real Time Feature Streams:
http://www.youtube.com/watch?v=_ews4w_eCpg
• kHealth: http://www.youtube.com/watch?v=btnRi64hJp4
68
Demos
69. 73
Smart Data in Social Media Analytics
To Understand the
human social
dynamics in real
world events
70. 0.5B Tweets per day
0.5B Users
60% on Mobile
5530 Tweets per second
related to the Japan earthquake and tsunami
17000 Tweets
per second
74
Twitter During Real-world Events of Interest
http://www.flickr.com/photos/twitteroffice/5897088517/sizes/o/in/photostream/
http://bayarea.sbnation.com/49ers/2013/2/3/3947738/super-bowl-prop-bets-2013-twitter
http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats/
77. 81
Red Color: Negative Topics
Green Color: Positive Topics
Twitris: Sentiment Analysis- Smart Answers with reasoning!
How was Obama doing in the second debate?
SMART DATA IS ABOUT ANALYSIS FOR REASONING
(what caused the positive sentiment for Democrats)
BEHIND THE REAL-WORLD ACTIONS (Democrats’ win)
http://knoesis.wright.edu/library/resource.php?id=1787
78. Top 100 influential users who
talk about Barack Obama
Positive or Negative
Influence
Twitris: Network Analysis
SMART DATA TELLS YOU HOW A SYSTEM CAN BE
TWEAKED FOR THE DESIRED ACTIONS!
Could we engage with users (targeted) with extreme
polarity leaning for Obama to spark an agenda in the whole
network of voters (ACTION)? 82
79. Twitris: Community Evolution
SMART DATA FOCUSES ON THE CAUSALITY
OF CHANGES IN REAL-WORLD ACTIONS!
Romney
Obama
Evolution of influencer interaction networks for Romney vs. Obama
topical communities, during U.S. Presidential Election 2012 debates
Before 1st
debate
After 1st
debate
After
Hurricane Sandy
After 3rd
debate
83
80. The Dead People mentioned
in the event OWC
Twitris: Impact of Background Knowledge
84
81. How People from Different
parts of the world talked
about US Election
Images and Videos
Related to US Election
Twitris: Analysis by Location
85
82. What is Smart Data in the context of
Disaster Management
ACTIONABLE: Timely delivery of
right resources and information to
the right people at right location!
86
Because everyone wants to Help, but DOESN’T KNOW HOW!
83. Join us for the Social
Good!
http://twitris.knoesis.org
RT @OpOKRelief:
Southgate Baptist Church
on 4th Street in Moore
has food, water, clothes,
diapers, toys, and more. If
you can't go,call 794
Text "FOOD" to
32333, REDCROSS to
90999, or STORM to
80888 to donate $10
in storm relief.
#moore #oklahoma
#disasterrelief
#donate
Want to help animals in
#Oklahoma? @ASPCA tells
how you can help:
http://t.co/mt8l9PwzmO
CITIZEN SENSORS
RESPONSE TEAMS
(including humanitarian
org. and ‘pseudo’ responders)
VICTIM SITE
Coordination of
needs and offers
Using Social Media
Does anyone
know where to
send a check to
donate to the
tornado
victims?
Where do I go
to help out for
volunteer work
around Moore?
Anyone know?
Anyone know
where to donate
to help the
animals from the
Oklahoma
disaster? #oklahoma
#dogs
Matched
Matched
Matched
Serving the need!
If you would like to volunteer
today, help is desperately
needed in Shawnee. Call
273-5331 for more info
http://www.slideshare.net/hemant_knoesis/cscw-2012-hemantpurohit-11531612
87
Purohit et al. Framework to Analyze Coordination in Crisis Response, 2012. Int’l Collaboration in-progress:
84. Smart Data from Twitris system for
Disaster Response Coordination
Which are the primary locations with
most negative sentiments/emotions?
Who are all the people to engage
with for better information
diffusion?
Which are the most important
organizations acting at my
location?
Smart data provides actionable information and improves decision making through
semantic analysis of Big Data.
Who are the resource seekers and
suppliers? How can one donate?
88
85. Source: Purohit et al. 2013, Information Filtering and Management Model for Disaster Response Coordination 89
Disaster Response Coordination Framework
86. Disaster Response Coordination:
Twitris Summary for Actionable Nuggets
90
Important tags to
summarize Big Data flow
Related to Oklahoma
tornado
Images and Videos Related
to Oklahoma tornado
87. 91
Disaster Response Coordination:
Twitris Real-time information for needs
Incoming Tweets with need
types to give quick idea of
what is needed and where
currently #OKC
Legends for Different
needs #OKC
(This is a real-time widget for monitoring needs, so it will not be active after the event has passed)
http://twitris.knoesis.org/oklahomatornado
89. Really sparse Signal to Noise:
• 2M tweets during the first week after #Oklahoma-tornado-2013
- 1.3% as the highly precise donation requests to help
- 0.02% as the highly precise donation offers to help
93
• Anyone know how to get involved to
help the tornado victims in
Oklahoma??#tornado #oklahomacity
(OFFER)
• I want to donate to the Oklahoma cause
shoes clothes even food if I can (OFFER)
Disaster Response Coordination:
Finding Actionable Nuggets for Responders to act
• Text REDCROSS to 909-99 to donate to
those impacted by the Moore tornado!
http://t.co/oQMljkicPs (REQUEST)
• Please donate to Oklahoma disaster
relief efforts.: http://t.co/crRvLAaHtk
(REQUEST)
For responders, the most important information is the scarcity and
availability of resources. Can we mine it via Social Media?
90. • Features driven by the experience of domain experts at the
responder organizations
• Examples,
– ‘I want to <donate/ help/ bring>’ for extraction of offering
intention
– ‘tent house’ OR ‘cots’ for shelter need types
94
Disaster Response Coordination:
Human Knowledge to drive information extraction
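The expert-driven features above can be approximated with regular expressions. The two patterns below encode only the slide's examples ('I want to <donate/help/bring>' for offers, 'tent house' or 'cots' for shelter needs). The tag names and the `tag_tweet` function are hypothetical, and a real system would need many more patterns.

```python
import re

# Offering intention: 'I want to <donate/help/bring>' (pattern from the slide).
OFFER_PATTERN = re.compile(r"\bi want to (donate|help|bring)\b", re.IGNORECASE)

# Shelter need type: 'tent house' OR 'cots' (pattern from the slide).
SHELTER_PATTERN = re.compile(r"\b(tent house|cots)\b", re.IGNORECASE)


def tag_tweet(text):
    """Return the set of (hypothetical) tags whose pattern matches the tweet."""
    tags = set()
    if OFFER_PATTERN.search(text):
        tags.add("OFFER")
    if SHELTER_PATTERN.search(text):
        tags.add("SHELTER_NEED")
    return tags


print(tag_tweet("I want to donate to the Oklahoma cause shoes clothes even food if I can"))
```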
91. • A knowledge-driven approach
– A rich inventory of metadata for tweets
– Semantic matching for
needs (query) vs. offers (documents)
• Example,
– @bladesofmilford please help get the word out,we are accepting kid clothes to send
to the lil angels in Oklahoma.Drop off @MilfordGreenPiz (REQUEST)
– I want to donate to the Oklahoma cause shoes clothes even food if I can (OFFER)
95
Disaster Response Coordination:
Automatic Matching of needs and offers
Matching the
competitive intentions
(Needs and Offers) can
relieve humans of the
task of resource
matchmaking for
coordination.
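A minimal sketch of matching a need (the query) against offers (the documents) by shared resource terms. It is a keyword-overlap stand-in, not the semantic matching used in Twitris. The `RESOURCES` vocabulary and both function names are assumptions.

```python
import re

# Hypothetical resource vocabulary; a real system would draw on a domain model.
RESOURCES = {"clothes", "food", "shoes", "water", "shelter", "toys"}


def resources_in(text):
    """Resource terms mentioned in a tweet (lowercased word match)."""
    return set(re.findall(r"[a-z]+", text.lower())) & RESOURCES


def match_offers(need, offers):
    """Rank offers by the number of resource terms shared with the need."""
    wanted = resources_in(need)
    scored = [(offer, wanted & resources_in(offer)) for offer in offers]
    matched = [(offer, shared) for offer, shared in scored if shared]
    return sorted(matched, key=lambda pair: len(pair[1]), reverse=True)


need = ("@bladesofmilford please help get the word out,we are accepting kid clothes "
        "to send to the lil angels in Oklahoma.Drop off @MilfordGreenPiz")
offers = [
    "I want to donate to the Oklahoma cause shoes clothes even food if I can",
    "I can drive volunteers to Moore this weekend",
]
print(match_offers(need, offers))
```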
92. 96
Disaster Response Coordination:
Engagement Interface for responders
What-Where-How-Who-Why
Coordination
Influential users to engage
with and resources for
seekers/supplies at a
location, at a timestamp
Contextual
Information for a
chosen topical tags
93. • Illustrative scenario: #Oklahoma-tornado 2013
97
Disaster Response Coordination:
Anecdote for the value of Smart Data
FEMA asked us to quickly filter
out gas-leak related data
Mining the data for smart nuggets
to inform FEMA (Timely needs)
Engaged with the author of this
information to confirm (Veracity)
e.g., All gas leaks in #moore were capped and stopped by
11:30 last night (at 5/22/2013 1:41:37)
Lots of tweets for ‘how to/where to’ assist (‘pseudo’ responders)
e.g., I want to go to Oklahoma this weekend & do what i can to help those people with
food,cloths & supplies,im in the feel of wanting to help ! :)
94. An event is a dynamic topic that evolves and
might later fork into several distinct events.
Smart Data analytics to capture rapidly evolving social data events
98
Social Media is the pulse of the
populace, a true reflection of
events all over the globe!
97. Dynamic Model Creation:
101
Example of how background knowledge helps us
understand the situation described in the tweets, while
also updating the knowledge model
98. How is Continuous Semantics a form of
Smart Data Analytics?
Keeping the Background Knowledge
abreast with the changes of the event
Smartly learning and adapting data acquisition
(Temporally apt Big Data, i.e. Fast Data)
In-turn providing temporally relevant
Smart Data through analysis
102
99. 103
Smart Data Analytics in Traffic Management
To improve
everyday life,
which is entangled
in our most
common
problem:
being stuck in
traffic
100. By 2001 over 285 million Indians lived in cities, more than in all
North American cities combined (Office of the Registrar General of India 2001)1
1The Crisis of Public Transport in India
2IBM Smarter Traffic
Modes of transportation in Indian Cities
Texas Transportation Institute (TTI)
Congestion report in U.S.
104
Severity of the Traffic Problem
101. Vehicular traffic data from San Francisco Bay Area aggregated from on-road
sensors (numerical) and incident reports (textual)
105
http://511.org/
Every minute update of speed, volume, travel time, and occupancy resulting in
178 million link status observations, 738 active events, and 146 scheduled
events with many unevenly sampled observations collected over 3 months.
Variety Volume
Veracity Velocity
Value
Can we detect the onset of traffic congestion?
Can we characterize traffic congestion based on events?
Can we provide actionable information to decision makers?
semantics
Representing prior knowledge of
traffic led to a focused exploration
of this massive dataset
Big Data to Smart Data: Traffic Management example
104. • Observation: Slow Moving Traffic
• Multiple Causes (Uncertain about the cause):
– Scheduled Events: music events, fair, theatre events, concerts, road
work, repairs, etc.
– Active Events: accidents, disabled vehicles, break down of
roads/bridges, fire, bad weather, etc.
– Peak hour: e.g. 7 am – 9 am OR 4 pm – 6 pm
• Each of these events may have a varying impact on traffic.
• A delay prediction algorithm should process multimodal and
multi-sensory observations.
Uncertainty in a Physical-Cyber-Social System
108
105. • Internal observations
– Speed, volume, and travel time observations
– Correlations may exist between these variables
across different parts of the network
• External events
– Accident, music event, sporting event, and
planned events
– External events and internal observations may
exhibit correlations
Modeling Traffic Events
109
106. Accident
Music event
Sporting event
Road Work
Theatre event
External events
<ActiveEvents, ScheduledEvents>
Internal observations
<speed, volume, travelTime>
Weather
Time of Day
Modeling Traffic Events
110
107. Domain Experts
cold
PoorVisibility
SlowTraffic
IcyRoad
Declarative domain knowledge
Causal
knowledge
Linked Open Data
Cold (YES/NO) IcyRoad (ON/OFF) PoorVisibility (YES/NO) SlowTraffic (YES/NO)
1 0 1 1
1 1 1 0
1 1 1 1
1 0 1 0
Domain Observations
Domain Knowledge
Structure and parameters
Complementing Probabilistic Models with Declarative Knowledge
112
Correlations to causations using
Declarative knowledge on the
Semantic Web
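As a sketch of the "parameters" half of this slide, the four domain observations in the table can be turned into conditional frequencies once a structure says which variables to condition on. The function name and the choice of conditioning variables are assumptions; the slide does not list the model's edges in text.

```python
# The four domain observations from the table above.
COLUMNS = ("Cold", "IcyRoad", "PoorVisibility", "SlowTraffic")
ROWS = [
    (1, 0, 1, 1),
    (1, 1, 1, 0),
    (1, 1, 1, 1),
    (1, 0, 1, 0),
]
RECORDS = [dict(zip(COLUMNS, row)) for row in ROWS]


def conditional(target, given):
    """Estimate P(target = 1 | given) by relative frequency; None if no row matches."""
    matching = [r for r in RECORDS if all(r[k] == v for k, v in given.items())]
    if not matching:
        return None
    return sum(r[target] for r in matching) / len(matching)


# Conditioning SlowTraffic on IcyRoad is an assumed edge, for illustration only.
print(conditional("SlowTraffic", {"IcyRoad": 1}))
```

With so few rows the estimates are only illustrative, which is one reason the slide pairs observations with declarative knowledge.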
108. • Declarative knowledge about various domains
is increasingly being published on the web1,2.
• Declarative knowledge describes concepts and
relationships in a domain (structure).
• Linked Open Data may be used to derive
prior probabilities of events (parameters).
• Explored the use of declarative knowledge for
structure using ConceptNet 5.
1http://conceptnet5.media.mit.edu/
2http://linkeddata.org/
Domain Knowledge
113
109. http://conceptnet5.media.mit.edu/web/c/en/traffic_jam
ConceptNet 5
[Figure: ConceptNet 5 graph around "traffic jam". Concept nodes: go to baseball game, go to concert, traffic accident, car crash, accident, road ice, bad weather, traffic jam, slow traffic, occur twice each day. Relation labels: Causes, CapableOf, HasSubevent, RelatedTo. is_a links connect concepts to the categories ScheduledEvent, ActiveEvent, BadWeather, TimeOfDay, and Delay.]
114
110. Three Operations: Complementing graphical model structure extraction
Traffic data from sensors deployed on road
network in San Francisco Bay Area
Knowledge from ConceptNet5
go to baseball game Causes traffic jam
traffic jam CapableOf occur twice each day
traffic jam CapableOf slow traffic
bad weather CapableOf slow traffic
Add missing random variables
Add missing links
Add link direction
[Figure: graphical model over the nodes baseball game, traffic jam, time of day, bad weather, and slow traffic, shown after each operation]
115
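The three operations can be sketched as one pass over knowledge triples: each triple may add a missing variable, add a missing link, or give direction to a link that statistics found only as an association. The alias from "go to baseball game" to the variable "baseball game", the starting structure, and the function name are assumptions. The triple about "occur twice each day" is left out because its direction relative to "time of day" is not clear from the slide.

```python
# ConceptNet 5 triples taken from the slide (one omitted, see lead-in).
TRIPLES = [
    ("go to baseball game", "Causes", "traffic jam"),
    ("traffic jam", "CapableOf", "slow traffic"),
    ("bad weather", "CapableOf", "slow traffic"),
]
# Assumed mapping from ConceptNet phrases to model variables.
ALIASES = {"go to baseball game": "baseball game"}


def enrich(nodes, undirected_edges, triples, aliases):
    """Apply the three operations; returns (nodes, directed edges, still-undirected edges)."""
    nodes = set(nodes)
    undirected = {frozenset(edge) for edge in undirected_edges}
    directed = set()
    for subject, _relation, obj in triples:
        src = aliases.get(subject, subject)
        dst = aliases.get(obj, obj)
        nodes.update((src, dst))                   # 1. add missing random variables
        undirected.discard(frozenset((src, dst)))  # 3. an existing link gets a direction
        directed.add((src, dst))                   # 2. otherwise this adds a missing link
    return nodes, directed, undirected


# Assumed starting point: a statistically learned, undirected structure.
start_nodes = {"baseball game", "traffic jam", "slow traffic"}
start_edges = [("baseball game", "traffic jam")]
nodes, directed, undirected = enrich(start_nodes, start_edges, TRIPLES, ALIASES)
print(sorted(directed))
```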
111. 116
Scheduled Event
Active Event
Day of week Time of day
delay
Travel time
speed
volume
Structure extracted from
traffic observations
(sensors + textual) using
statistical techniques
Scheduled Event
Active Event
Day of week
Time of day
delay
Travel time
speed
volume
Bad Weather
Enriched structure which has
link directions and new nodes
such as “Bad Weather”
potentially leading to better
delay predictions
Enriched Probabilistic Models using ConceptNet 5
112. Take Away
• It is all about the human – not computing, not
device
– Computing for human experience
• Whatever we do in Smart Data, focus on human-
in-the-loop (empowering machine computing!):
– Of Human, By Human, For Human
– But in serving human needs, there is a lot more than
what current big data analytics handle –
variety, contextual, personalized, subjective, spanning
data and knowledge across P-C-S dimensions
118
113. Acknowledgements
• Kno.e.sis team
• Funds: NSF, NIH, AFRL, Industry…
• Note:
• For images and sources, if not on slides, please see slide notes
• Some images were taken from Web Search results, and all such images belong
to their respective owners. We are grateful to the owners for the usefulness of these
images in our context.
119
114. • OpenSource: http://knoesis.org/opensource
• Showcase: http://knoesis.org/showcase
• Vision: http://knoesis.org/node/266
• Publications: http://knoesis.org/library
120
References and Further Readings
117. Amit Sheth’s
PhD students
Ashutosh Jadhav
Hemant
Purohit
Vinh
Nguyen
Lu Chen
Pavan
Kapanipathi
Pramod
Anantharam
Sujan
Perera
Alan Smith
Pramod Koneru
Maryam Panahiazar
Sarasi Lalithsena
Cory Henson
Kalpa
Gunaratna
Delroy
Cameron
Sanjaya
Wijeratne
Wenbo
Wang
Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students)
118. 124
thank you, and please visit us at
http://knoesis.org
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
Smart Data
#5: Types of Data. Formats of Data. Also talk about the increase in the platforms that help generate these data.
#6: Example high-velocity Big Data applications at work: financial services, stock brokerage, weather tracking, movies/entertainment, and online retail. Fast data (rate at which data is coming, esp. from mobile, social, and sensor sources); rapid changes in the data content; stream analysis to cope with the incoming data for real-time online analytics.
#10: http://radhakrishna.typepad.com/rks_musings/2013/04/big-data-review.html Google predicted the spread of flu in real time after analyzing two datasets: a) the 50 million most common terms that Americans type, and b) data on the spread of seasonal flu from a public health agency. It tested a mammoth 450 million different mathematical models on the search terms, comparing their predictions against the actual flu cases. The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system (Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013).
#11: Better Algorithms Beat More Data - And Here’s Why: http://allthingsd.com/20121128/better-algorithms-beat-more-data-and-heres-why/ Big Data Cannot Replace Human Judgment: http://www.matchcite.com/blog/blog/2012/july/big-data-cannot-replace-human-judgment.aspx **Comments about the articles
#12: Smart data makes sense out of big data – it provides value from harnessing the challenges posed by volume, velocity, variety and veracity of big data, to provide actionable information and improve decision making.
#15: Information is CREATED by human with the Machinery available – Wikipedia tool, sensors and social networks. Information is STORED in Man+Machine readable format, LOD. Information is PROCESSED using the LOD and Human assisted Knowledge-based. Higher level abstraction on info is now consumed in many mechanistic ways (including GIS) to provide EXPERIENCE for humans.
#17: All the data related to human activity, existence and experiences. More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
#19: Information is CREATED by human with the Machinery available – Wikipedia tool, sensors and social networks. Information is STORED in Man+Machine readable format, LOD. Information is PROCESSED using the LOD and Human assisted Knowledge-based. Higher level abstraction on info is now consumed in many mechanistic ways (including GIS) to provide EXPERIENCE for humans. Example of a human guided modeling and improved performance: http://research.microsoft.com/en-us/um/people/akapoor/papers/IJCAI%202011a.pdf
#21: Also, we have a weather application which performs abstraction on weather sensory observations to identify blizzard conditions (food for actions!!): 20,000 weather stations (with ~5 sensors per station). Real-Time Feature Streams - live demo: http://knoesis1.wright.edu/EventStreams/ - video demo: https://skydrive.live.com/?cid=77950e284187e848&sc=photos&id=77950E284187E848%21276
#23: Starting slide Various Big data problems – Traditional examples vs what we are doing examples. Variety and Velocity than Volume. kHealth problem. People will be interested in Smart Data.Traditional ML techniques, High Performance Computing, Statistics. Human level of Abstraction is Smart data.
#24: http://www.huffingtonpost.com/2012/10/30/hurricane-sandy-power-outage-map-infographic_n_2044411.html I would like to start with a motivational example here.
#25: http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html http://mashable.com/2012/10/31/hurricane-sandy-facebook/ We in our lab have quite a bit of Social Data Research going on, so I would like to focus on the use of social networks during these disasters/crises. Twitter and Facebook are massively used during disasters. During Hurricane Sandy there were … Not only this: a major outbreak of tweets occurred during the Japan earthquake, which crossed more than 2000 tweets/sec. So why do people use social networks to this extent during disasters?
#26: Fraustino, Julia Daisy, Brooke Liu and Yan Jin. “Social Media Use during Disasters: A Review of the Knowledge Base and Gaps,” Final Report to Human Factors/Behavioral Sciences Division, Science and Technology Directorate, U.S. Department of Homeland Security. College Park, MD: START, 2012.
Disaster communication deals with disaster information disseminated to the public by governments, emergency management organizations, and disaster responders as well as disaster information created and shared by journalists and the public. Disaster communication increasingly occurs via social media in addition to more conventional communication modes such as traditional media (e.g., newspaper, TV, radio) and word-of-mouth (e.g., phone call, face-to-face, group). Timely, interactive communication and user-generated content are hallmarks of social media, which include a diverse array of web- and mobile-based tools.
Disaster communication deals with (1) disaster information disseminated to the public by governments, emergency management organizations, and disaster responders often via traditional and social media; as well as (2) disaster information created and shared by journalists and affected members of the public often through word-of-mouth communication and social media.
For information seeking. Disasters often breed high levels of uncertainty among the public (Mitroff, 2004), which prompts them to engage in heightened information seeking (Boyle, Schmierbach, Armstrong, & McLeod, 2004; Procopio & Procopio, 2007). As expected, information seeking is a primary driver of social media use during routine times and during disasters (Liu et al., in press; PEW Internet, 2011).
For timely information. Social media provide real-time disaster information, which no other media can provide (Kavanaugh et al., 2011; Kodrich & Laituri, 2011). Social media can become the primary source of time-sensitive disaster information, especially when official sources provide information too slowly or are unavailable (Spiro et al., 2012). For example, during the 2007 California wildfires, the public turned to social media because they thought journalists and public officials were too slow to provide relevant information about their communities (Sutton, Palen, & Shklovski, 2008). Time-sensitive information provided by social media during disasters is also useful for officials. For example, in an analysis of more than 500 million tweets, Culotta (2010) found Twitter data forecasted future influenza rates with high accuracy during the 2009 pandemic, obtaining a 95% correlation with national health statistics. Notably, the national statistics came from hospital survey reports, which typically had a lag time of one to two weeks for influenza reporting.
For unique information. One of the primary reasons the public uses social media during disasters is to obtain unique information (Caplan, Perse, & Gennaria, 2007). Applied to a disaster setting, which is inherently unpredictable and evolving, it follows that individuals turn to whatever source will provide the newest details. Oftentimes, individuals experiencing the event first-hand are on the scene of the disaster and can provide updates more quickly than traditional news sources and disaster response organizations. For instance, in the Mumbai terrorist attacks that included multiple coordinated shootings and bombings across two days, laypersons were first to break the news on Twitter (Merrifield & Palenchar, 2012). Research participants report using social media to satisfy their need to have the latest information available during disasters and for information gathering and sharing during disasters (Palen, Starbird, Vieweg, & Hughes, 2010; Vieweg, Hughes, Starbird, & Palen, 2010).
For unfiltered information. To obtain crisis information, individuals often communicate with one another via social media rather than seeking a traditional news source or organizational website (Stephens & Malone, 2009). The public check in with social media not only to obtain up-to-date, timely information unavailable elsewhere, but also because they appreciate that information may be unfiltered by traditional media, organizations, or politicians (Liu et al., in press).
To determine disaster magnitude. The public uses social media to stay apprised of the extent of a disaster (Liu et al., in press). They may turn to governmental or organizational sources for this information, but research has shown that if the public do not receive the information they desire when they desire it, they, along with others, will fill in the blanks (Stephens & Malone, 2009), which can create rumors and misinformation. On the flipside, when the public believed that officials were not disseminating enough information regarding the size and trajectory of the 2007 California wildfires, they took matters into their own hands, using social media to track fire locations in real-time and notify residents who were potentially in danger (Sutton, Palen, & Shklovski, 2008).
To check in with family and friends. While Americans predominately use social media to connect with family and friends (PEW Internet, 2011), during disasters those connections may shift. For those with family or friends directly involved with the disaster, social media can provide a way to ensure safety, offer support, and receive timely status updates (Procopio & Procopio, 2007; Stephens & Malone, 2009). In a survey of 1,058 Americans, the American Red Cross (2010) found that nearly half of their respondents would use social media to let loved ones know they are safe during disasters. After the 2011 earthquake and tsunami in Japan, the public turned to Twitter, Facebook, Skype, and local Japanese social networks to keep in touch with loved ones while mobile networks were down (Gao, Barbier, & Goolsby, 2011). Researchers also note that disasters may enhance feelings of affection toward family members, and indeed survey participants reported expressing more positive emotions toward their loved ones than usual as a result of the September 11 terrorist attacks, even if they were not directly impacted by the disaster (Fredrickson et al., 2003). Finally, disasters can motivate the public to reconnect with family and friends via social media (Procopio & Procopio, 2009; Semaan & Mark, 2012).
To self-mobilize. During disasters, the public may use social media to organize emergency relief and ongoing assistance efforts from both near and afar. In fact, one research group dubbed those who surge to the forefront of digital and in-person disaster relief efforts as “voluntweeters” (Starbird & Palen, 2011). Other research documents the role of Facebook and Twitter in disaster relief fundraising (Horrigan & Morris, 2005; PEJ, 2010). Research also reveals how social media can help identify and respond to urgent needs after disasters. For example, just two hours after the 2010 Haitian earthquake Tufts University volunteers created Ushahidi-Haiti, a crisis map where disaster survivors and volunteers could send incident reports via text messages and tweets. In less than two weeks, 2,500 incident reports were sent to the map (Gao, Barbier, & Goolsby, 2011).
To maintain a sense of community. During disasters the media in general and social media in particular may provide a unique gratification: sense of community. That is, as the public logs in online to share their feelings and thoughts, they assist each other in creating a sense of security and community, even when scattered across a vast geographical area (Lev-On, 2011; Procopio & Procopio, 2007). As Reynolds and Seeger (2012) observed, social media create communities during disasters that may be temporary or may continue well into the future.
To seek emotional support and healing. Finally, disasters are often inherently tragic, prompting individuals to seek not only information but also human contact, conversation, and emotional care (Sutton et al., 2008). Social media are positioned to facilitate emotional support, allowing individuals to foster virtual communities and relationships, share information and feelings, and even demand resolution (Choi & Lin, 2009; Stephens & Malone, 2009). Indeed, social media in general and blogs in particular are instrumental for providing emotional support during and after disasters (Macias, Hilyard, & Freimuth, 2009; PEJ New Media Index, 2011). Additionally, social media in general and Twitter in particular can aid healing, as research finds during both natural disasters, such as Hurricane Katrina (Procopio & Procopio, 2007), and man-made disasters, such as the July 2011 attacks in Oslo, Norway (Perng et al., 2012).
#27: http://www.buzzfeed.com/annanorth/how-social-media-is-aiding-the-hurricane-sandy-rec -- Facebook help during Hurricane Sandy
http://blog.twitter.com/2012/10/hurricane-sandy-resources-on-twitter.html -- Twitter page for Hurricane Sandy
http://www.treehugger.com/culture/12-ways-help-hurricane-sandy-relief-efforts.html
Categorization of severity based on weather conditions. Actionable information is contextually dependent.
#28: http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
Let me consider one small example of how social data (in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account, i.e., the context.
Sensor data for emergency responders: who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition
Need for abstraction: Semantic Perception needs abstraction. 90 + heart problem: don’t run out. 23: run out.
#29: http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
http://www.internews.org/sites/default/files/resources/InternewsEurope_Report_Japan_Connecting%20the%20last%20mile%20Japan_2013.pdf
Let me consider one small example of how social data (in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account and changes context accordingly.
Sensor data for emergency responders: who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition
Need for abstraction: Semantic Perception needs abstraction. 90 + heart problem: don’t run out. 23: run out.
#30: http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
Let me consider one small example of how social data (in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account and changes context accordingly.
Sensor data for emergency responders: who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition
Need for abstraction: Semantic Perception needs abstraction. 90 + heart problem: don’t run out. 23: run out.
#31: http://www.buzzfeed.com/jackstuef/the-man-behind-comfortablysmug-hurricane-sandys
During the storm last night, user @comfortablysmug was the source of a load of frightening but false information about conditions in New York City that spread wildly on Twitter and onto news broadcasts before Con Ed, the MTA, and Wall Street sources had to take time out of the crisis situation to refute them.
#32: Although we face challenges like these with data every time, the most important thing is what you aim to do with the data. I mean, what value do you intend to provide from the data?
#35: -- Contextual Questioning – Potential Information needed from Humans
#37: Larry Smarr is a professor at the University of California, San Diego, and he was diagnosed with Crohn’s disease. What’s interesting about this case is that Larry diagnosed himself. He is a pioneer in the area of Quantified Self, which uses sensors to monitor physiological symptoms. Through this process he discovered inflammation, which led him to the discovery of Crohn’s disease. This type of self-tracking is becoming more and more common.
#40: A massive amount of data will be collected by sensors and mobile devices, yet patients and doctors care about “actionable” information. This data has all four Vs of Big Data, and we used knowledge-enabled techniques to transform it into value. In the context of Parkinson’s disease (PD), we analyzed a massive amount of data collected by sensors on smartphones to detect PD and characterize its severity.
#41: Main idea: prior knowledge of PD was used to facilitate its detection from massive sensor data by reducing the search space.
Details:
- Declarative knowledge of PD includes PD severity levels and their symptoms, as shown in the logical rule above
- Each PD severity level is a conjunction of a set of PD symptoms
- Each symptom was mapped to its manifestation in sensor observations
- The availability of declarative knowledge significantly improved the analytics by aiding the feature selection process
The graphs above contrast the physical movements and voice of two control group members and two PD patients.
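The rule structure described in this note (severity level = conjunction of symptoms, each symptom mapped to sensor observations) can be sketched roughly as follows. This is only an illustration: the symptom names, sensor feature names, and thresholds are hypothetical and are not taken from the study.

```python
# Hypothetical symptom detectors: each maps sensor features to True/False.
# Feature names and thresholds are illustrative assumptions, not study values.
SYMPTOM_DETECTORS = {
    "tremor": lambda obs: obs["accel_variance"] > 0.5,
    "slow_movement": lambda obs: obs["mean_speed"] < 0.8,
    "monotone_speech": lambda obs: obs["pitch_variance"] < 10.0,
}

# Each severity level is a conjunction (logical AND) of a set of symptoms.
SEVERITY_RULES = {
    "severe": {"tremor", "slow_movement", "monotone_speech"},
    "moderate": {"tremor", "slow_movement"},
    "mild": {"tremor"},
}

def detect_symptoms(obs):
    """Return the set of symptoms whose sensor manifestation is present."""
    return {name for name, detect in SYMPTOM_DETECTORS.items() if detect(obs)}

def classify_severity(obs):
    """Return the most severe level whose symptoms are all present, else None."""
    present = detect_symptoms(obs)
    for level in ("severe", "moderate", "mild"):
        if SEVERITY_RULES[level] <= present:  # subset test = conjunction holds
            return level
    return None

patient = {"accel_variance": 0.9, "mean_speed": 0.5, "pitch_variance": 25.0}
control = {"accel_variance": 0.1, "mean_speed": 1.2, "pitch_variance": 30.0}
print(classify_severity(patient), classify_severity(control))
```

Because the rules name the symptoms up front, only the sensor features that feed those symptoms need to be computed, which is one way to read "reducing the search space".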
#51: The perception cycle contains two primary phases:
- Explanation: translating low-level signals into high-level abstractions; inference to the best explanation
- Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations)
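A minimal sketch of the two phases, using plain Python sets. The knowledge base below (diseases and symptoms) is a made-up example, and the functions are a simplified reading of the notes rather than the actual implementation.

```python
# Hypothetical knowledge base: feature (disease) -> properties (symptoms) it explains.
KB = {
    "flu": {"fever", "cough", "body_ache"},
    "cold": {"cough", "sneezing"},
    "allergy": {"sneezing", "itchy_eyes"},
}

def explain(observed):
    """Explanation: features that account for every observed property."""
    return {feature for feature, props in KB.items() if observed <= props}

def discriminate(candidates, observed):
    """Discrimination: not-yet-observed properties that some, but not all,
    candidate explanations predict, so observing them narrows the candidates."""
    all_props = set().union(*(KB[f] for f in candidates))
    useful = set()
    for prop in all_props - observed:
        explaining = {f for f in candidates if prop in KB[f]}
        if 0 < len(explaining) < len(candidates):
            useful.add(prop)
    return useful

observed = {"cough"}
candidates = explain(observed)               # both flu and cold explain a cough
to_check = discriminate(candidates, observed)  # properties worth sensing next
print(candidates, to_check)
```

Only the properties returned by `discriminate` need to be sensed next, which is the contrast with blindly collecting all observations.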
#56: A single-feature (disease) assumption means that all the observed properties (symptoms) must be explained by a single feature, i.e., this framework is not expressive enough to model comorbidity, where more than one feature (disease) may co-exist. For example, if two diseases cause disjoint sets of symptoms, and all the symptoms of both diseases are observed, then this framework will not be able to find a coverage and will return no diseases.
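The limitation can be reproduced in a few lines. The two diseases and their symptoms here are placeholders chosen so that the symptom sets are disjoint, as in the example in this note.

```python
# Hypothetical knowledge base with two diseases whose symptoms do not overlap.
KB = {
    "disease_a": {"symptom_1", "symptom_2"},
    "disease_b": {"symptom_3", "symptom_4"},
}

def explain_single_feature(observed):
    """Return every single feature that covers ALL observed properties."""
    return {feature for feature, props in KB.items() if observed <= props}

only_a = explain_single_feature({"symptom_1", "symptom_2"})
both = explain_single_feature(
    {"symptom_1", "symptom_2", "symptom_3", "symptom_4"}
)
print(only_a)  # one disease covers both observed symptoms
print(both)    # empty set: no single disease covers all four symptoms
```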
#62: - With this ability, many problems could be solved
- For example: we could help solve health problems (before they become serious health problems) through monitoring symptoms and real-time sense making, acting as an early warning system to detect problematic health conditions
#64: Intelligence distributed at the edge of the network.
Requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize Semantic Web (SW) technologies.
#65: Intelligence distributed at the edge of the network.
Requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize Semantic Web (SW) technologies.
Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices,' ISWC 2012.
#66: Compute machine perception inferences of high complexity (i.e., explanation and discrimination) on resource-constrained devices in milliseconds.
Difference between the other systems and what this system provides.
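To give a feel for why a bit-vector encoding suits resource-constrained devices, here is a small sketch in which explanation reduces to bitwise AND over integers. It illustrates the general idea only; the encoding, the knowledge base, and the function names are assumptions and do not reproduce the algorithm from the cited paper.

```python
# Hypothetical features; bit i of a mask corresponds to FEATURES[i].
FEATURES = ["flu", "cold", "allergy"]

# For each property, a bit mask of the features that can explain it.
PROPERTY_MASK = {
    "fever": 0b001,       # flu
    "cough": 0b011,       # flu, cold
    "sneezing": 0b110,    # cold, allergy
    "itchy_eyes": 0b100,  # allergy
}

def explain(observed):
    """AND together the masks of all observed properties.
    Bits left set mark features that explain every observation."""
    mask = (1 << len(FEATURES)) - 1  # start with all features as candidates
    for prop in observed:
        mask &= PROPERTY_MASK[prop]
    return mask

def decode(mask):
    """Translate a bit mask back into feature names."""
    return [f for i, f in enumerate(FEATURES) if (mask >> i) & 1]

print(decode(explain(["cough"])))
print(decode(explain(["cough", "sneezing"])))
```

Each observation costs one integer AND, with no reasoner or triple store needed at inference time.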
#67: Intelligence at the edge. Shipping computation and domain models to the edge (distributed).
#70: “to help software reusability in order to allow new applications to be built faster and to share innovations (software components, novel approaches) amongst software developers” “to standardize and commoditize back-end data stores so client software may access any Open mHealth-compliant data store in a uniform way (interoperability)” “to produce examples and documentation of these concepts meaningfully and simply”
#71: Observe data from different sensors at the same time.
#72: System Architecture. The figure shows an overview of the SemHealth architecture.
Sensors
- All are Bluetooth sensors already utilized by the current k-Health application to measure weight, heart rate, and blood pressure
Android application
- Reads sensor observations through Bluetooth
- Performs annotation on observations and generates percepts from those observations
- Uploads annotated observations and percepts to the server-side data store
- Retrieves data using DSU API and feeds data to DPU and/or DVU APIs
- Visualizes data through DVU API
  - Considered a “nice to have” as existing visualization may be used as-is
  - Will utilize existing graphing library for Android with Open mHealth-style API that may be translated to browser at a later time
Server-side
- Open mHealth compliant DSU and DPU APIs
- Triple data storage replaces existing SQLite database in k-Health application
- Existing k-Health reasoner now the brains behind DPU
#87: Categorization of severity based on weather conditions. Actionable information is contextually dependent.
#88: - 1 (+half) minute
Alright, so let’s motivate with this situation during an emergency.
- Various actors: resource seekers, responder teams, resource providers at remote sites
- Each of these actor groups has questions (needs, providers, responders): wondering!
Here we have a social network to connect these actors and bridge the communication gap. But its potential use for effective help is yet to be realized, because... (next slide)
#89: Talk about what kind of smart data we provide that helps the actions of crisis response coordination.
#90: Source: Purohit et al. 2013 (https://docs.google.com/a/knoesis.org/document/d/1aBJ2egHICUwaWxR8jOoTIUfEYj1QAnUt0q7haIKoYGY/edit#, http://www.knoesis.org/library/resource.php?id=1865)
#99: Definition of the event
US Elections and some changes/subevents -- Primaries -- Debates -- People/Places/Organizations involved in the event
Arab Spring -- Subevents during those -- Egypt protests
#105: Pucher, J., Korattyswaroopam, N., & Ittyerah, N. (2004). The crisis of public transport in India: Overwhelming needs but limited resources. Journal of Public Transportation, 7(4), 1-30.
#108: Point of this slide: heterogeneity and uncertainty
#109: A single observation of slow moving traffic may have multiple explanations.
#110: Internal observations are limited to whatever the on-road sensors can observe. In the 511.org data we have analyzed, the internal observations are mentioned above.
External events are obtained from sources beyond the on-road sensors, e.g., an agency like 511.org, which reports traffic incidents.
Note that:
- Internal observations are mostly from machine sensors
- External events are mostly textual observations
The analogy in healthcare will be:
- Internal observations: on-body sensors such as heart rate, temperature
- External events: jogging, walking, taking stairs
#112: E.g., the equation for projectile motion may not precisely compute the actual trajectory of the projectile, since air resistance may have been ignored.
#114: Use of open data for parameters is promising and can be explored as future research.
#115: Some facts about the domain of traffic obtained from ConceptNet5.
The types of events are obtained by using the comprehensive subsumption relationship from 511.org.
We propose to use such knowledge to complement the PGM structure learning algorithms.
CapableOf(traffic jam, occur twice each day)
CapableOf(traffic jam, slow traffic)
RelatedTo(accident, car crash)
Causes(road ice, accident)
CapableOf(bad weather, slow traffic)
HasSubevent(go to concert, car crash)
Causes(go to baseball game, traffic jam)
Causes(traffic accident, traffic jam)
BadWeather(road ice)
BadWeather(bad weather)
ScheduledEvent(go to concert)
ScheduledEvent(go to baseball game)
ActiveEvent(traffic accident)
Delay(slow traffic)
Delay(traffic jam)
TimeOfDay(occur twice each day)
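As a rough illustration of how such facts could be queried, the sketch below stores the relational facts from this note as triples and looks up candidate explanations for an observed delay. Treating both Causes and CapableOf as causal links is an assumption made here for illustration; the function name is hypothetical.

```python
# Relational facts from the note above, as (relation, subject, object) triples.
FACTS = [
    ("CapableOf", "traffic jam", "occur twice each day"),
    ("CapableOf", "traffic jam", "slow traffic"),
    ("RelatedTo", "accident", "car crash"),
    ("Causes", "road ice", "accident"),
    ("CapableOf", "bad weather", "slow traffic"),
    ("HasSubevent", "go to concert", "car crash"),
    ("Causes", "go to baseball game", "traffic jam"),
    ("Causes", "traffic accident", "traffic jam"),
]

# Assumption: these two relations are read as "subject may lead to object".
CAUSAL_RELATIONS = ("Causes", "CapableOf")

def candidate_causes(effect):
    """Return concepts linked to `effect` by a causal relation, sorted by name."""
    return sorted(
        subj for rel, subj, obj in FACTS
        if rel in CAUSAL_RELATIONS and obj == effect
    )

print(candidate_causes("slow traffic"))
print(candidate_causes("traffic jam"))
```

The first query returns more than one concept, which matches the earlier point that a single observation of slow-moving traffic may have multiple explanations.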
#116: Declarative knowledge + statistical correlation.
This slide illustrates the three operations used to enrich the correlation structure extracted using statistical methods.
These operations utilize declarative knowledge from ConceptNet5, as shown in each step.
#117: The statistical correlation structure is shown above; the enriched structure is shown below.
The enrichment of the graphical model will potentially allow us to capture the domain more precisely and also improve our prediction, as the model would get closer to the underlying probability distribution in the real world.
The log-likelihood score is one way of quantifying how good a structure is based on the observed data. There may be many candidate structures extracted from data that result in the same log-likelihood score. Declarative knowledge will help us ground statistical models in reality, which will allow us to pick one structure over another.
Pramod Anantharam, Krishnaprasad Thirunarayan and Amit Sheth, 'Traffic Analytics using Probabilistic Graphical Models Enhanced with Knowledge Bases,' 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International Conference on Data Mining (SDM13), pp. 13--20, Texas, USA, May 2-4, 2013.
We stopped at structure extraction for our workshop paper (SIAM ACS workshop) since the declarative knowledge we used (ConceptNet5) and the statistical model (nodes and edges) are at the same level of abstraction.
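The point about candidate structures tying on log-likelihood can be seen with a two-variable toy example: with maximum-likelihood parameters, the structures A -> B and B -> A score the same on any data set, so the data alone cannot choose a direction. The observation counts and the tie-breaking function below are made up for illustration and are not from the paper.

```python
import math
from collections import Counter

# Hypothetical binary observations of (bad_weather, slow_traffic); counts are made up.
VARIABLES = ("bad_weather", "slow_traffic")
DATA = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 15 + [(0, 0)] * 45

def log_likelihood(data, parent, child):
    """Log-likelihood of the two-node structure parent -> child, using
    maximum-likelihood estimates of P(parent) and P(child | parent)."""
    p_idx, c_idx = VARIABLES.index(parent), VARIABLES.index(child)
    n = len(data)
    parent_counts = Counter(row[p_idx] for row in data)
    joint_counts = Counter((row[p_idx], row[c_idx]) for row in data)
    total = 0.0
    for (p_val, _), count in joint_counts.items():
        p_parent = parent_counts[p_val] / n
        p_child_given_parent = count / parent_counts[p_val]
        total += count * (math.log(p_parent) + math.log(p_child_given_parent))
    return total

# Declarative knowledge as directed edges; this one mirrors
# CapableOf(bad weather, slow traffic) from the ConceptNet5 facts.
KNOWN_EDGES = {("bad_weather", "slow_traffic")}

def pick_structure(data, candidates):
    """Among the best-scoring candidates, prefer one supported by knowledge."""
    scores = {edge: log_likelihood(data, *edge) for edge in candidates}
    best = max(scores.values())
    tied = [e for e in candidates if math.isclose(scores[e], best)]
    supported = [e for e in tied if e in KNOWN_EDGES]
    return (supported or tied)[0]

forward = ("bad_weather", "slow_traffic")
backward = ("slow_traffic", "bad_weather")
print(log_likelihood(DATA, *forward), log_likelihood(DATA, *backward))
print(pick_structure(DATA, [backward, forward]))
```

The two scores agree up to floating-point rounding, and the knowledge-supported edge is the one selected.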
#125: More at: http://wiki.knoesis.org/index.php/PCS and http://knoesis.org/projects/ssw/