#import "/globals.typ": *
|
|
|
|
//#outline-slide()
|
|
|
|
= Introduction
|
|
|
|
== Why are we here?
|
|
#slide[
|
|
#set align(center)
|
|
#grid(align: auto, rows: (13fr, 1fr), gutter: 1pt, inset: 1pt,
|
|
[#image("resources/iot-diagram-1.jpg")
|
|
#set text(size: 13pt)
|
|
#link("https://tse3.mm.bing.net/th?id=OIP.o3AVQNkQCCG_2cmhQzD1zQHaEW&pid=Api")
|
|
#v(5pt)]
|
|
)
|
|
]
|
|
#slide[
|
|
#set align(left)
|
|
== Project Description
|
|
To study the privacy and security aspects of IoT devices
|
|
- _systematically_ and
|
|
- _reproducibly_,
|
|
we need an easy-to-use
|
|
- _testbed_
|
|
that
|
|
- _automates_
|
|
#text(size: 0.7em, [(some aspects of)]) the process of experimenting with IoT devices.
|
|
#v(5pt)
|
|
*In this presentation I describe an implementation of such a testbed:* `IOTTB`
|
|
|
|
#speaker-note[
|
|
- _systematically_: standardization,
|
|
- _reproducibly_: a systematic approach promises more reproducible experiments, and thus more verifiable results.
|
|
- _testbed_: an environment which fixes certain parameters
|
|
- _automates_: beyond reproducibility, the amount of manual involvement determines how feasible a reproduction is
|
|
]
|
|
]
|
|
|
|
== Principal Objectives
|
|
#slide[
|
|
#v(5pt)
|
|
== Objectives
|
|
Key objectives:
|
|
+ _Automation recipes_ @fursinckorg2021 for repeated execution of experiments, including data collection and analysis.
|
|
+ _FAIR_ data storage (Findable, Accessible, Interoperable, Reusable) (see @faircsartefacts2022, @go-fair and @wilkinson_fair_2016).
|
|
|
|
]
|
|
|
|
|
|
|
|
= Motivation
|
|
== Problem(s)
|
|
#slide(composer: utils.side-by-side)[
|
|
1 Manual setup and configuration of tools
|
|
- e.g. `tcpdump`, `Wireshark`, `Frida`
|
|
- configurations not interoperable between tools
|
|
#pause
|
|
2 Ad-hoc decisions
|
|
- file/artefact naming
|
|
- measured/extracted data features
|
|
- metadata recorded
|
|
#pause
|
|
3 Tailored utilities
|
|
- lack interoperability
|
|
- require adaptation depending on project
|
|
][
|
|
#pause
|
|
4 Scattered data and lack of standardization
|
|
- Inconsistent data collection and storage
|
|
- Difficult to maintain compatibility across projects
|
|
#pause
|
|
5 Onboarding challenges
|
|
- New members create ad-hoc solutions
|
|
- Perpetuates inefficiency and inconsistency
|
|
]
|
|
|
|
|
|
|
|
== Challenges Faced
|
|
#slide[
|
|
- Problems with current approach:
|
|
+ Inconsistent data collection
|
|
+ Lack of standardized tools and methods
|
|
+ Issues with file naming and data structuring
|
|
- Resulting difficulties:
|
|
+ Compatibility across projects
|
|
+ Onboarding new members
|
|
+ Ad-hoc solutions perpetuating inefficiency
|
|
]
|
|
|
|
= Background
|
|
== IoT Devices
|
|
#slide[
|
|
#set text(size: 14pt)
|
|
#grid(
|
|
rows: (4fr, 7fr),
|
|
gutter: 3pt,
|
|
grid(columns: 4,
|
|
[#figure(image("resources/philips-hue.jpg"),caption: [Smart Lighting])<fig:philips-hue>],
|
|
[#figure(image("resources/echo-dot.jpeg"), caption: [Smart Speakers])<fig:echo-dot>],
|
|
[#figure(image("resources/mi-camera.png", height: 80%), caption: [Home Surveillance Camera])<fig:mi-camera>],
|
|
[#figure(image("resources/meta-quest-2.png"), caption: [VR Headset])<fig:meta-quest-2>]),
|
|
grid(columns: (2.5fr, 3fr,2.5fr, 3fr),
|
|
[#figure(image("resources/dall-e-home-topo-1.jpeg", height: 80%), caption: [Dall-E Diagram of a Smart Home Network])],
|
|
grid.cell(colspan: 1, align: top+left, inset: 0.5em, breakable: true, [
|
|
#set text(size: 15pt)
|
|
#h(12pt)
|
|
#v(12pt)
|
|
IoT devices offer #alert[benefits]:
|
|
- Home lighting control
|
|
- Remote video monitoring
|
|
- Automated cleaning
|
|
#v(-5pt)
|
|
and more! But because they are
|
|
+ Used in Homes
|
|
+ Connected
|
|
- LAN only
|
|
- Internet
|
|
- #text(size:0.8em, [May lead to information leakage])
|
|
]),
|
|
grid.cell(colspan: 1, align: top+left, inset: 1em, breakable: true, [
|
|
#set text(size: 15pt)
|
|
#h(12pt)
|
|
#v(12pt)
|
|
#math.arrow.r.double Security and privacy *risks*
|
|
- Surveillance potential
|
|
- Unauthorized data sharing
|
|
- Vulnerable to bugs and security failures]),
|
|
[#figure(image("resources/dall-e-home-topo-2.jpeg", height: 80%), caption: [Dall-E Schematic Smart Home Network])]
|
|
)
|
|
)
|
|
]
|
|
|
|
#slide[
|
|
#set align(left)
|
|
- *IoT Devices Overview:*
|
|
- Devices connected to the internet (voice assistants, smart watches, smart home gadgets)
|
|
- Embedded with microprocessors and software
|
|
- *Examples of IoT Devices:*
|
|
- Security cameras
|
|
- Home lighting systems
|
|
- Children's toys
|
|
- *Importance of IoT:*
|
|
- Physical dimension (sensors, controllers)
|
|
- Internet connectivity
|
|
]
|
|
|
|
== Testbeds
|
|
#slide[
|
|
#set align(left)
|
|
- *What is a Testbed?*
|
|
- Controlled environment for experiments
|
|
- Ensures reproducibility and standardization
|
|
- *Examples of Testbeds:*
|
|
- Industry and Engineering: Platforms for product development
|
|
- Natural Sciences: Laboratories (e.g., climate chambers, wind tunnels, see @vaughan2005use)
|
|
- Computing: Software testing environments (unit tests, IDEs)
|
|
- Interdisciplinary: Complex systems (e.g., smart electric grid testbeds, see @tbsmartgrid2013)
|
|
]
|
|
|
|
== FAIR Data Principles
|
|
#slide[
|
|
#set align(left)
|
|
- *FAIR Data Principles:* @wilkinson_fair_2016, @go-fair
|
|
- *Findability:* Data should be easy to find
|
|
- *Accessibility:* Data should be accessible under well-defined conditions
|
|
- *Interoperability:* Data should be integrable with other data
|
|
- *Reusability:* Data should be reusable for future research
|
|
- *Purpose:*
|
|
- Improve reusability of scientific data
|
|
- Guide for designing _data storage_ systems
|
|
#speaker-note[
|
|
#set text(size: 0.5em)
|
|
#grid(columns: 2,[
|
|
*Findability:*
|
|
- Ensuring data is easily locatable and identifiable.
|
|
- Use of persistent identifiers like DOIs.
|
|
- Metadata should be richly described to enable precise searching.
|
|
- *Positive Example:* A dataset with a DOI and comprehensive metadata that is indexed in major search engines.
|
|
- *Negative Example:* A dataset stored on a personal computer with no metadata and no persistent identifier.
|
|
|
|
*Accessibility:*
|
|
- Data should be retrievable by authorized users.
|
|
- Use of standardized protocols for data access.
|
|
- Clear access conditions and usage licenses.
|
|
- *Positive Example:* A dataset available through a well-documented API with clear access guidelines and permissions.
|
|
- *Negative Example:* A dataset stored in a proprietary format that requires special software to access, with unclear or restrictive access conditions.
|
|
],[
|
|
*Interoperability:*
|
|
- Data should integrate with other datasets.
|
|
- Use of standardized formats and vocabularies.
|
|
- Ensure compatibility with existing data and tools.
|
|
- *Positive Example:* A dataset in CSV format using standardized column headers that align with other datasets in the field.
|
|
- *Negative Example:* A dataset in a non-standard format with custom jargon that is difficult to merge with other data sources.
|
|
|
|
*Reusability:*
|
|
- Data should be well-documented to allow future use.
|
|
- Include clear licensing for reuse.
|
|
- Ensure data quality and provenance are maintained.
|
|
- *Positive Example:* A dataset with a clear Creative Commons license, detailed documentation, and a version history.
|
|
- *Negative Example:* A dataset with no documentation, unclear provenance, and no stated reuse policy.
|
|
])
|
|
]
|
|
]
|
|
|
|
== Network Traffic
|
|
#slide[
|
|
#set align(left)
|
|
- *Importance of Network Traffic in IoT:*
|
|
+ Captures communication patterns (device-to-server (internet), device-to-device (LAN, e.g., companion apps))
|
|
+ Essential for evaluating performance and identifying unauthorized communications
|
|
- *Protocol Analysis:*
|
|
+ Understand device operation and communication protocols
|
|
+ Identify compatibility, efficiency, and security issues
|
|
- *Flow Monitoring:*
|
|
+ Detect potential security threats (data breaches, unauthorized access, malware)
|
|
+ Monitor for anomalies indicating security incidents or vulnerabilities
|
|
- *Information Leakage:*
|
|
+ Adversaries can passively observe traffic and extract sensitive information
|
|
+ Even encrypted traffic can leak information about the smart environment and users
|
|
See @infoexpiot, @iothome2019, @friesssniffing2018 and @peekaboo2020.
|
|
#speaker-note[
|
|
- Network traffic is important to us for various reasons
|
|
- due to data being encrypted in many cases nowadays
|
|
- most methods boil down to some type of network traffic analysis
|
|
]
|
|
]
|
|
== Findings from Key Studies
|
|
#slide[
|
|
#set align(left)
|
|
*Examples:*\
|
|
- *Leakage:* Personal data and device usage patterns. @infoexpiot
|
|
- *Details:* The study found that IoT devices often leak personal data and detailed usage patterns to third-party servers.
|
|
- *Leakage:* Home device interactions and usage. @iothome2019
|
|
- *Details:* This research revealed that interactions with home devices can be intercepted, providing insights into daily routines and activities.
|
|
- *Leakage:* Device/Network communication _patterns_. @friesssniffing2018
|
|
- *Details:* Sniffing tools can capture communications between IoT devices. WiFi packets expose usage patterns regardless of encryption @peekaboo2020. Those patterns contain features which can be extracted (i.e. leaked) and fed into machine learning models capable of exposing more meaningful information (e.g., identifying devices and their functionality) @alyamiwifi2022.
|
|
In the end, these are all aspects of the same issue: even encrypted traffic leaks information that can be valuable to adversaries.
|
|
#speaker-note[
|
|
Examples:
|
|
- how many people live in a household
|
|
- how many devices are in the household
|
|
- which devices are online, and when
|
|
- who is home, and when
|
|
]
|
|
]
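#slide[
#set align(left)
#set text(size: 12pt)
A toy sketch of the point above, not taken from the cited studies: even with encrypted payloads, packet timestamps and sizes remain visible to a passive observer. The function name `activity_windows`, the 60-second window, the byte threshold and the sample observations are all assumptions chosen for illustration.

```python
from collections import defaultdict


def activity_windows(packets, window_s=60, min_bytes=1000):
    """Return the start times of windows whose byte volume suggests activity.

    packets: iterable of (timestamp_seconds, length_bytes) pairs, i.e. only
    metadata that remains observable when payloads are encrypted.
    """
    volume = defaultdict(int)
    for timestamp, length in packets:
        # Bucket each packet by the window it falls into.
        volume[int(timestamp // window_s) * window_s] += length
    return sorted(start for start, total in volume.items() if total >= min_bytes)


# Hypothetical observations: a burst around t=0..10 s and another at t=130 s.
observed = [(0.5, 600), (3.2, 700), (10.0, 90), (70.0, 80), (130.0, 1500)]
print(activity_windows(observed))
```

With these made-up observations, only the windows starting at 0 s and 120 s exceed the threshold, which is the kind of signal that could hint at when a device is in use.
]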
|
|
|
|
== Packet Capture
|
|
#slide[
|
|
#set align(left)
|
|
- *Network Packet Capture:*
|
|
+ Intercepting and storing data packets on a network
|
|
+ Principal technique for studying device behavior and communication patterns
|
|
- *Importance in IoT Security Research:*
|
|
+ Main data collection mechanism
|
|
+ Essential for analyzing network traffic
|
|
//#math.arrow.r.double Wireshark Example
|
|
#speaker-note[
|
|
- data collection for network traffic
|
|
]
|
|
]
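#slide[
#set align(left)
#set text(size: 11pt)
To make "storing data packets" concrete, here is a minimal sketch that counts the packet records in a classic pcap byte string using only the Python standard library. It assumes the classic pcap layout (24-byte global header, 16-byte record headers, microsecond magic number) and does not handle pcapng or nanosecond-resolution files. The helper names `count_packets` and `build_pcap` are hypothetical and not part of `iottb`.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def count_packets(data: bytes) -> int:
    """Count complete packet records in a classic pcap byte string."""
    if struct.unpack_from("<I", data, 0)[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack_from(">I", data, 0)[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    offset, count = 24, 0  # the global header is 24 bytes long
    while offset + 16 <= len(data):
        # Record header: ts_sec, ts_usec, incl_len, orig_len (4 bytes each).
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", data, offset)
        if offset + 16 + incl_len > len(data):
            break  # truncated final record
        offset += 16 + incl_len
        count += 1
    return count


def build_pcap(payloads):
    """Build a tiny in-memory pcap (link type 1 = Ethernet) for demonstration."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    for i, payload in enumerate(payloads):
        out += struct.pack("<IIII", i, 0, len(payload), len(payload)) + payload
    return out


sample = build_pcap([b"\x00" * 60, b"\x01" * 42])
print(count_packets(sample))
```

In practice a capture tool such as `tcpdump` writes these files; the synthetic sample only keeps the sketch self-contained.
]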
|
|
|
|
== Automation Recipes
|
|
#slide[
|
|
#set align(left)
|
|
- *Automation Recipes:*
|
|
- Platform agnostic automation
|
|
- e.g., install tool y, retrieve dataset x
|
|
- Integrate with existing scripts/tools
|
|
- Examples in ML
|
|
- _Collective Mind Framework:_ @CommonLanguageFacilitate2023, @fursinckorg2021
|
|
- Provides reusable recipes for building, running, benchmarking, and optimizing applications
|
|
- Platform-independent or supplemented with user-specific scripts
|
|
|
|
#speaker-note[
|
|
- *Importance of Automation:*
|
|
- Automates workflows irrespective of underlying tools
|
|
- the agnostic part is just the goal
|
|
- these recipes must be able to integrate well with existing tools, personal scripts
|
|
- Enhances reproducibility and efficiency in experiments
|
|
- Underlying data has a standardized (w.r.t. tooling) format, if tool is available
|
|
]
|
|
]
|
|
|
|
== Summary of Key Points
|
|
#slide[
|
|
#set align(left)
|
|
- *Key Issues Identified:*
|
|
+ Manual setup and configuration of tools
|
|
+ Ad-hoc decisions in file naming, data features, and metadata
|
|
+ Tailored utilities lacking interoperability
|
|
+ Scattered data and lack of standardization
|
|
+ Onboarding challenges for new members
|
|
- *Importance of Addressing These Issues:*
|
|
+ Improve reproducibility and reliability of experiments
|
|
+ Enhance data quality and interoperability
|
|
+ Facilitate easier onboarding and collaboration
|
|
]
|
|
|
|
== Return to ...
|
|
#slide[
|
|
#set align(left)
|
|
- *How IOTTB Addresses These Issues:*
|
|
+ *Automation Recipes:*
|
|
- Standardize the setup and configuration of tools
|
|
- Ensure consistent data collection and analysis processes
|
|
+ *FAIR Data Storage:*
|
|
- Enhance findability, accessibility, interoperability, and reusability of data
|
|
- Improve data management and sharing practices
|
|
+ *Testbed Design:*
|
|
- Provide a controlled environment for reproducible experiments
|
|
- Simplify onboarding and collaboration through standardized procedures
|
|
]
|
|
= #smallcaps[IoTdb]
|
|
== Model Environment
|
|
#slide(composer: (1fr, 1fr))[
|
|
#figure(
|
|
image("resources/network-setup1.png"),
|
|
caption: [Common capture setup. Separate AP, switch and capturing device.]
|
|
)<fig:setup1>
|
|
][
|
|
#figure(
|
|
image("resources/setup2.png"),
|
|
caption: [Setup with AP and "Capture Device" on same machine.]
|
|
)
|
|
]
|
|
|
|
== The testbed
|
|
#slide[
|
|
#align(top + center)[_[...] testbed for IoT devices which automates aspects of running experiments._]
|
|
#pause
|
|
How is this realized?\
|
|
#pause
|
|
*`iottb`*:
|
|
- Python Package
|
|
- Defines Data Storage (implicit in behaviour)
|
|
- Database is a directory hierarchy in a file system
|
|
- DB is a collection of "device"-folders
|
|
- Devices in turn hold some metadata and can have subfolders containing capture data #pause
|
|
- Defines a metadata schema for devices, as well as captures
|
|
- Automates collection of metadata and data
|
|
]
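#slide[
#set align(left)
#set text(size: 11pt)
The directory-based database described above could look roughly like the following sketch. It is not the actual `iottb` code: the file names `device_metadata.json` and `capture_metadata.json`, the metadata fields and the function names are assumptions made for illustration.

```python
import json
import tempfile
import uuid
from pathlib import Path


def add_device(db_root: Path, name: str) -> Path:
    """Create a device folder holding a metadata file; return the folder path."""
    device_dir = db_root / name.lower().replace(" ", "-")
    device_dir.mkdir(parents=True, exist_ok=False)
    metadata = {
        "device_id": str(uuid.uuid4()),  # unique identifier
        "device_name": name,             # human-readable label
    }
    (device_dir / "device_metadata.json").write_text(json.dumps(metadata, indent=2))
    return device_dir


def add_capture(device_dir: Path) -> Path:
    """Create a capture subfolder with its own metadata; return the folder path."""
    capture_id = str(uuid.uuid4())
    capture_dir = device_dir / f"capture-{capture_id}"
    capture_dir.mkdir()
    device = json.loads((device_dir / "device_metadata.json").read_text())
    # The capture metadata points back at the device it belongs to.
    metadata = {"capture_id": capture_id, "device_id": device["device_id"]}
    (capture_dir / "capture_metadata.json").write_text(json.dumps(metadata, indent=2))
    return capture_dir


with tempfile.TemporaryDirectory() as tmp:
    device = add_device(Path(tmp), "Echo Dot")
    capture = add_capture(device)
    print(device.name, capture.parent.name)
```

Folder names stay human readable while the UUIDs inside the metadata provide unique identifiers, which mirrors the trade-off named on the slide.
]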
|
|
|
|
#focus-slide[#align(center+horizon,[DEMO])]
|
|
= Outlook
|
|
== Evaluation
|
|
#slide[
|
|
*FAIR*-ness?\
|
|
#pause
|
|
_Findability_:\
|
|
- supported through use of UUIDs, while maintaining human readability.
|
|
#speaker-note[Findable
|
|
|
|
F1. (Meta)data are assigned a globally unique and persistent identifier
|
|
|
|
F2. Data are described with rich metadata (defined by R1 below)
|
|
|
|
F3. Metadata clearly and explicitly include the identifier of the data they describe
|
|
|
|
F4. (Meta)data are registered or indexed in a searchable resource]
|
|
]
|
|
#slide[
|
|
*FAIR*-ness?\
|
|
_Findability_:\
|
|
- supported through use of UUIDs, while maintaining human readability.
|
|
_Accessibility_:\
|
|
- to a degree, up to the user of the testbed
|
|
- UUID precondition for data met
|
|
- metadata makes sense also without data
|
|
#speaker-note[
|
|
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
|
|
|
|
A1.1 The protocol is open, free, and universally implementable
|
|
|
|
A1.2 The protocol allows for an authentication and authorisation procedure, where necessary
|
|
|
|
A2. Metadata are accessible, even when the data are no longer available
|
|
]
|
|
]
|
|
#slide[
|
|
*FAIR*-ness?\
|
|
_Findability_:\
|
|
- supported through use of UUIDs, while maintaining human readability.
|
|
_Accessibility_:\
|
|
- to a degree, up to the user of the testbed
|
|
- UUID precondition for data met
|
|
- metadata makes sense also without data
|
|
_Interoperability_:\
|
|
- Data formats used are common and well known (JSON, pcap)
|
|
- Metadata schema understandable given example
|
|
#speaker-note[
|
|
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
|
|
|
|
I2. (Meta)data use vocabularies that follow FAIR principles
|
|
|
|
I3. (Meta)data include qualified references to other (meta)data
|
|
]
|
|
]
|
|
#slide[
|
|
*FAIR*-ness?\
|
|
_Findability_:\
|
|
- supported through use of UUIDs, while maintaining human readability.
|
|
_Accessibility_:\
|
|
- to a degree, up to the user of the testbed
|
|
- UUID precondition for data met
|
|
- metadata makes sense also without data
|
|
_Interoperability_:\
|
|
- Data formats used are common and well known (JSON, pcap)
|
|
- Metadata schema understandable given example
|
|
_Reusability_:\
|
|
- Used formats support this.
|
|
- Data capture tool (`iottb`) can be made available
|
|
- + rerun with the same configuration
|
|
#speaker-note[
|
|
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
|
|
|
|
R1.1. (Meta)data are released with a clear and accessible data usage license
|
|
|
|
R1.2. (Meta)data are associated with detailed provenance
|
|
|
|
R1.3. (Meta)data meet domain-relevant community standards
|
|
]
|
|
]
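#slide[
#set align(left)
#set text(size: 11pt)
As a rough illustration of the findability claim, the sketch below looks up folders by identifier by scanning JSON metadata files in a directory tree. The layout, the file name and the identifier value are hypothetical, and a linear scan is only a stand-in for a proper searchable index.

```python
import json
import tempfile
from pathlib import Path


def find_by_id(db_root: Path, identifier: str):
    """Return the folders whose JSON metadata contains the given identifier."""
    hits = []
    for meta_file in sorted(db_root.rglob("*.json")):
        try:
            metadata = json.loads(meta_file.read_text())
        except json.JSONDecodeError:
            continue  # skip files that are not valid JSON
        if isinstance(metadata, dict) and identifier in metadata.values():
            hits.append(meta_file.parent)
    return hits


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    device_dir = root / "echo-dot"
    device_dir.mkdir()
    (device_dir / "device_metadata.json").write_text(
        json.dumps({"device_id": "1234-abcd", "device_name": "Echo Dot"})
    )
    print([p.name for p in find_by_id(root, "1234-abcd")])
```

Because the metadata files are plain JSON, the same lookup could be done with any common tool, which also touches on the interoperability point.
]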
|
|
#slide[
|
|
*Automation Recipes*?\
|
|
- `iottb` automates capture
|
|
- Metadata should allow repeating experiments
|
|
- want: configure capture based on metadata
|
|
]
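#slide[
#set align(left)
#set text(size: 11pt)
The wish to configure a capture from metadata could be sketched as follows. The metadata keys `interface`, `packet_count` and `filter`, the function name and the sample values are assumptions, not the `iottb` schema; the `tcpdump` options used (`-i`, `-w`, `-c`) are standard ones. The sketch only builds the command and does not run it.

```python
import shlex


def tcpdump_command(metadata: dict, output_file: str) -> list:
    """Build a tcpdump argument list from (hypothetical) capture metadata."""
    command = ["tcpdump", "-i", metadata["interface"], "-w", output_file]
    if "packet_count" in metadata:
        # -c stops the capture after the given number of packets.
        command += ["-c", str(metadata["packet_count"])]
    if metadata.get("filter"):
        # A trailing expression is interpreted by tcpdump as a capture filter.
        command += shlex.split(metadata["filter"])
    return command


# Hypothetical metadata recorded during an earlier experiment.
previous_run = {"interface": "wlan0", "packet_count": 100, "filter": "host 192.168.1.20"}
print(shlex.join(tcpdump_command(previous_run, "capture.pcap")))
```

Returning a list rather than a shell string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.
]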
|
|
|
|
= Questions
|
|
|
|
|
|
= Appendix
|
|
#bibliography("presentation-bsc.bib", style: "ieee")
|
|
== Images
|
|
#slide[
|
|
#set text(size: 13pt)
|
|
//#show link: underline
|
|
#show link: set text(stroke: blue)
|
|
*Introduction*#footnote([Images licensed for free sharing and use to the best of my knowledge.])\
|
|
- IoT Network Diagram: #link("https://tse3.mm.bing.net/th?id=OIP.o3AVQNkQCCG_2cmhQzD1zQHaEW&pid=Api")
|
|
- @fig:echo-dot: #link("https://i0.wp.com/thegroyne.com/wp-content/uploads/2018/04/Amazon-Echo-Dot-Altavoces-inteligentes-04.jpeg")
|
|
- @fig:philips-hue: #link("https://www.multimediaplayer.it/wp-content/uploads/kit-philips-hue.jpg")
|
|
- @fig:mi-camera: #link("https://d.otto.de/files/bd42f6e9-ac45-5e1c-8d5f-ac3affcee9d6.pdf")#footnote("Unclear licence")
|
|
|
|
]
|