Trust the Crowd: Wireless Witnessing to Detect Attacks on ADS-B-Based Air-Traffic Surveillance

GLICO · October 29, 2024

Paper_Review


Abstract

Automatic Dependent Surveillance-Broadcast (ADS-B) has been widely adopted as the de facto standard for air-traffic surveillance. Aviation regulations require all aircraft to actively broadcast status reports containing identity, position, and movement information. However, the lack of security measures exposes ADS-B to cyberattacks by technically capable adversaries with the purpose of interfering with air safety. In this paper, we develop a non-invasive trust evaluation system to detect attacks on ADS-B-based air-traffic surveillance using real-world flight data as collected by an infrastructure of ground-based sensors. Taking advantage of the redundancy of geographically distributed sensors in a crowdsourcing manner, we implement verification tests to pursue security by wireless witnessing. At the core of our proposal is the combination of verification checks and Machine Learning (ML)-aided classification of reception patterns, such that user-collected data cross-validates the data provided by other users. Our system is non-invasive in the sense that it neither requires modifications on the deployed hardware nor the software protocols and only utilizes already available data. We demonstrate that our system can successfully detect GPS spoofing, ADS-B spoofing, and even Sybil attacks for airspaces observed by at least three benign sensors. We are further able to distinguish the type of attack, identify affected sensors, and tune our system to dynamically adapt to changing air-traffic conditions.

방송형 자동종속관제(ADS-B)는 항공 교통에 대한 표준으로 널리 채택되었다. 항공 규제들은 신원, 위치, 이동 정보 등을 포함하는 상태 보고서를 적극적으로 방송하도록 요구한다. 하지만, 보안 조치의 부족함은 ADS-B를 항공 안전을 방해 할 목적으로 기술적 능력을 가지는 적에 의한 사이버 공격에 노출시킨다. 이 논문에서는, 우리는 지상-기반 센서 인프라에 의해 수집된 실제 항공 데이터를 가지고 ADS-B기반의 항공 교통 관제에 대한 공격을 탐지하기 위한 비침입적 신뢰 평가 시스템을 개발한다. 크라우드소싱 방식으로 지리적으로 분산된 센서들의 중복성의 이점으로, 우리는 무선 목격을 통한 보안을 추구하기 위한 검증 테스트를 개발한다. 우리 제안의 중점은 사용자가 수집한 데이터가 다른 사용자에 의해 제공된 데이터를 상호 검증하도록, 검증 테스트와 기계 학습 기반의 수신 패턴 분류를 조합하는 것이다. 우리의 시스템은 이미 배포된 하드웨어나 프로토콜들에 대한 수정 없이 오직 이미 사용가능한 데이터만을 활용한다는 점에 있어서 비침입적이다. 우리는 우리의 시스템이 성공적으로 GPS spoofing, ADS-B spoofing, 그리고 심지어 최소 3개의 정상 센서들에 의해 목격되어진 영공에 대한 Sybil attacks를 탐지할 수 있다는 것을 증명했다. 우리는 심지어 공격의 유형을 구분하고, 영향을 받은 센서들을 식별하고, 변화하는 항공 교통 제어를 위해서 동적으로 우리의 시스템을 조정할 수 있다.

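Abstract Summary

ADS-B has become the standard for air-traffic surveillance. However, it lacks security measures and is therefore vulnerable to cyber threats.
Against this backdrop, the authors develop an attack-detection and trust evaluation system that uses data collected from geographically distributed users.
The system is non-invasive, requiring no modifications to deployed hardware or protocols, and it can distinguish between types of attack (GPS spoofing, ADS-B spoofing, Sybil attacks, and so on) and identify the sensors that were attacked.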

Introduction

The monitoring of air traffic has evolved from an analog Radio Detection and Ranging (RADAR)-based system to a digitally-aided surveillance infrastructure. Effective from January 1, 2020, all aircraft are required to be equipped with an Automatic Dependent Surveillance-Broadcast (ADS-B) system to access most of the world's airspace, which hence constitutes the de facto standard for air-traffic monitoring. ADS-B-capable transmitters periodically broadcast status reports that inform others about their identification, position, movement, and additional status codes.

항공 교통 모니터링이 아날로그 RADAR 기반의 시스템에서 디지털을 지원하는 감시 인프라 체계로 발전해왔다. 2020년 1월 1일 부터, 모든 항공기들은 거의 대부분의 전 세계 영공을 접근하기 위해서는 ADS-B 시스템을 장비하도록 요구되어졌고, 이는 곧 사실상의 표준이 되었다. ADS-B를 이용할 수 있는 송신자들은 다른 항공기들에게 그들의 신원, 위치, 이동, 그리고 추가적인 상태코드들을 알리는 상태 보고서를 주기적으로 방송한다.

While the aviation industry is characterized by very long development cycles (up to several decades), applications that mandate high safety guarantees are usually lagging behind advancements on the security side. As such, ADS-B reports are neither encrypted nor authenticated. At the same time, the open specification of ADS-B promotes the collection and free usage of aircraft reports. Simple sensors can decode aircraft broadcast reports and gain a real-time view of their surrounding airspace. A network that combines more than 1000 user-operated ground-based sensors in a crowdsourcing manner is the OpenSky Network. This network collects and stores air-traffic data from around the world and makes them available for research.

항공 산업이 아주 긴 개발 주기가 특성인 반면에, 고수준의 안전을 보장 해야하는 애플리케이션들은 보통 보안 측면의 발전 보다 뒤떨어져있다. ADS-B 보고서들은 암호화나 인증되어지지 않는다. 동시에, ADS-B의 공개 명세서는 항공기 보고서들의 수집과 무료 사용을 촉진시켰다. 단순한 센서들은 항공 방송 보고서들을 해독할 수 있고 그들을 둘러싸고 있는 영공의 실시간 뷰를 얻을 수 있다. OpenSky Network는 1000개 이상의 크라우드 소싱 방식의 유저에 의해 수행되는 지상 기반의 센서들로 구성되어지는 네트워크이다. 이 네트워크는 전 세계로부터 항공 교통 데이터를 수집하고 저장한다 그리고 그것들을 연구를 위해 사용 가능하게 만든다.

Since ADS-B lacks fundamental security practices, the exposure to cyberattacks targeting air traffic has long been discussed. These works demonstrate how attackers can interfere with aircraft sensors and how fake aircraft messages can be injected into air-traffic monitoring systems. For instance, adversaries with commercial off-the-shelf hardware and moderate knowledge can generate arbitrary messages mimicking valid ADS-B reports. The consequences of such attacks range from distraction on the flight deck or in the control room up to violations of mandatory safety separations, and eventually increasing the possibility of aircraft collisions. Since the implementation of these attacks is far from being only of academic nature, security solutions are urgently needed to protect the integrity of air-traffic surveillance. In fact, data trust establishment is an open and central problem in the aviation industry and emerging concerns have already reached the public.

ADS-B가 근본적인 보안 정책이 부족하기 때문에, 항공 교통을 타겟으로하는 사이버공격에 대한 노출은 오랫동안 논의되어왔다. 이러한 작업들은 어떻게 공격자들이 항공기 센서들을 방해할 수 있고 가짜 항공기 메시지들이 어떻게 항공 교통 관제 시스템에 주입될 수 있는지를 증명한다. 예를 들어서, 기성품 하드웨어와 보통의 지식을 가진 적들은 유효한 ADS-B 보고서들을 모방하는 임의의 메시지를 생성할 수 있다. 이러한 공격들의 결과는 비행 갑판 또는 제어실에서의 주의산만에서 의무적인 안전 분리 위반까지에 이르기까지 다양하며, 결국 항공기 충돌의 가능성까지 증가시킨다. 이러한 공격들의 구현이 학문적 성격에 불과하지 않기 때문에, 항공 교통 관제 감시의 무결성을 보호하기 위해 보안 솔루션들이 시급해진다. 사실, 데이터 신뢰 구축은 항공 산업의 개방적이고 중심적인 문제이며, 새로운 우려들은 이미 대중들에게 도달했다.

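Note
"Off-the-shelf" is a term that appears frequently in English-language security reporting, where it works as a kind of adjective. Read literally as "taken off the store shelf", it means that something was simply purchased.
In other words, the term usually carries the following implications:

The attackers need not have the skills to develop the tool themselves.

Ready-made products that anyone can obtain are used in the attack.
In the passage above, "off-the-shelf" can be read as saying that even attackers without expert knowledge can easily carry out attacks on ADS-B by buying ready-made hardware and applying a moderate amount of knowledge.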

To answer the demands for more security in the safety-driven aviation industry, we propose a data-centric trust evaluation system with the goal of assessing the trustworthiness of ADS-B reports using data that is already collected at wide scale. We refer to trust in the sense that messages are trustworthy when they originate from functional, non-malicious sources. In contrast, error-prone or attacker-controlled messages trying to harm the system should be detected. Furthermore, we explore the identification of the type of attack and the traceability of malicious sensors.

안전한 항공 산업의 더 나은 보안에 대한 요구에 대답하기 위해서, 우리는 이미 광범위하게 수집된 데이터를 활용하는 ADS-B 보고서들의 신뢰성을 평가하는 것을 목표로 가지는 데이터 중심의 신뢰 평가 시스템을 제안한다. 우리는 메시지들이 기능적이고, 비위협적인 소스로부터 기원되었을 때, 신뢰적이다라는 의미에서의 신뢰를 말하고 있다. 이와 대조적으로, 시스템에 해를 끼지려고하는 에러가 쉽게 발생하거나 공격자-통제적인 메시지들은 탐지되어야 한다. 게다가, 우리는 공격의 유형 식별과 악의적인 센서들의 추적을 살펴봐야한다.

The development of such a system faces several challenges imposed by the highly regulated aviation industry. Viable solutions need to be non-invasive in the sense that they do not require any modifications on the deployed hard-and software. In particular, security systems should not interfere with other systems already in place to avoid lengthy (re)certification processes. Preferably, solutions are augmentation systems that operate autonomously with sensor input already available. We develop our system to fulfill all these challenges.

이러한 시스템의 발전은 매우 규제되어있는 항공 산업에 의한 여러가지 문제들을 직면한다. 생존가능한 솔루션들은 이미 배포된 하드웨어나 소프트웨어에 대한 어떠한 수정도 요구하지 않는 의미의 비침입성이 필요하다. 특히, 보안 시스템들은 긴 절차의 (재)인증 프로세스들을 피하기 위해서 이미 자리를 잡은 다른 시스템들과 방해하면 안된다. 가급적으로, 솔루션들은 이미 사용가능한 센서 인풋을 가지고 자동적으로 수행하는 증강 시스템들이다. 우리는 이러한 모든 문제들을 충족하기 위해서 우리의 시스템을 개발한다.

At the core of our system, we make use of the crowd-sourcing nature of a sensor network in which user-collected data cross-validates data provided by other users. Forming a network of trusted sensors based on mutual auditing, we pursue wireless witnessing. Wireless witnessing is the collaborative process of observing the status of a distributed wireless system. We apply it in the security context to assess and validate the trustworthiness of ADS-B reports. In particular, we implement a Machine Learning (ML)-based verification test that is trained on typical message reception patterns. The collaboration of sensors characterizes expected reception patterns of aircraft reports transmitted from certain airspace segments while automatically factoring in natural message loss.

우리 시스템의 중심에 있어서, 우리는 유저-수집 데이터들이 다른 유저들에 의해 제공된 데이터를 교차 검증하는 센서 네트워크의 크라우드 소싱 방식의 특성을 활용하는 것이다. 상호 감사를 기반으로하는 신뢰된 센서들의 네트워크를 형성하는 것은, 우리가 무선 목격을 추구하는 것이다. 무선 목격은 분산 무선 시스템의 상태를 관찰하는 협업 절차이다. 우리는 그것을 ADS-B 보고서들의 신뢰성을 평가하고 검증하기 위해서 보안 분야에 적용시켰다. 특히, 우리는 특정 메시지 수신 패턴을 학습한 기계 학습 기반의 검증 테스트를 구현했다. 센서들의 협업은 자연스러운 메시지 손실을 자동적으로 고려하면서 특정 영공 세그먼트로부터 전송되어진 항공기 보고서들의 기대되어지는 수신 패턴들을 특징화 한다.

Our system can reliably differentiate between normal air-traffic broadcast and suspicious reports diverging from expected patterns if at least three sensors observe the same airspace. This assumption is already fulfilled by the majority of the considered airspace. Furthermore, our system can recognize the type of attack, e.g. GPS spoofing or ADS-B spoofing to trace affected sensors and identify the sensor redundancy as an important factor. While minimizing false alarm events, we achieve detection rates beyond 95% for moderate GPS spoofing deviations and any form of ADS-B spoofing. To further harden the network against attacks, new sensors can be integrated by providing consistent snapshots of their airspaces. Since our system is solely based on an already existing infrastructure and does not require any modifications on aviation systems, it is non-invasive and could be implemented today easing very long certification processes. In contrast to existing solutions for air-traffic verification, we do not require the measurement of time, frequency shifts, or any PHY layer features, but only use discrete sensor events.

우리의 시스템은 최소 3개의 센서들이 같은 영공을 관찰 할 경우, 정상 항공 교통 방송과 예상 패턴들과는 다른 의심이 가는 보고서들을 분리할 수 있다. 이러한 가정은 이미 고려되는 영공의 대부분에 의해서 충족된다. 게다가, 우리의 시스템은 GPS spoofing이나 ADS-B spoofing같은 공격의 유형을 구분함으로써 영향을 받은 센서들을 추적하고 센서 이중화를 중요한 요소로 식별할 수 있다. 잘못된 이벤트를 알리는 것을 최소화하는 반면에, 우리는 중간정도의 GPS spoofing 편차와 ADS-B spoofing의 어떠한 형태라도 95%이상의 탐지율을 달성한다. 공격에 대한 네트워크를 더 견고히 하기 위해서, 새로운 센서들이 그들의 영공의 일관된 스냅샷을 제공함으로써 통합되어질 수 있다. 우리의 시스템이 이미 존재하는 인프라를 기반으로 하고 있고 항공 시스템의 어떠한 수정도 요구하고 있지 않기 때문에, 비침입적이고, 매우 긴 인증 절차들을 용이하게 하면서 구현될 수 있다. 기존의 항공 교통 검증에 대한 솔루션들과는 대조적으로, 우리는 시간, 주파수 이동, 또는 모든 PHY 계층 기능들의 측정을 요구하지 않고, 오직 이산 센서 이벤트들을 사용한다.

In summary, the contributions of this paper are:

We propose the first comprehensive approach to evaluate the trustworthiness of ADS-B aircraft reports based on an existing infrastructure of crowdsourcing sensors.
We demonstrate the applicability of our approach by incorporating real-world flight data collected by geographically distributed sensors at a large scale.
We simulate prominent attacks on GPS and ADS-B, detect their presence via validation in our trust system, and draw conclusions about their type and origin.
We elaborate on network expansion and optimized sensor deployment to further harden the network against attacks in the future.

요약해서, 이 논문의 공헌은 다음과 같다.

우리는 기존의 크라우드소싱 센서들의 인프라를 기반으로 ADS-B 보고서들의 신뢰성을 평가하는 최초의 종합적인 방법을 제안한다.
우리는 지리적으로 분산된 센서들에 의해 수집된 실제 항공 데이터를 대규모로 통합하여 우리의 방식의 적용 가능성을 증명한다.
우리는 GPS와 ADS-B에 대한 두드러지는 공격들을 시뮬레이션하고, 우리의 신뢰 시스템의 검증을 통해서 그들의 존재를 탐지하고, 그리고 그들의 유형과 기원에 대한 결론들을 도출했다.
우리는 네트워크 확장과 최적화된 센서 배포에 자세히 설명하여 향후 발생할 공격에 대하여 네트워크를 더욱 견고하게 한다.

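Introduction Summary

This paper introduces a trust evaluation model that pursues wireless witnessing through crowdsourcing and is also non-invasive.
The proposed system can identify the type of attack (GPS spoofing, ADS-B spoofing) and detect attacks effectively.
Because it is non-invasive, it can be applied to existing systems, and the paper goes on to explain how the network can be built out to withstand future attacks.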

System and Attacker Models

We first describe today's air-traffic monitoring techniques with a focus on ADS-B. We then introduce our trust definition and present the consolidated system model. Finally, we define the considered attacker model.

우리는 먼저 ADS-B에 초점을 맞추어 오늘날의 항공 교통 관제 기술에 대해서 설명한다.
이후에 우리의 신뢰 정의를 소개하고 통합 시스템 모델을 도입한다. 마지막으로, 우리는 고려되는 공격 모델을 정의한다.

1. Air-Traffic Monitoring

In recent years, traditional analog RADAR-based systems for air-traffic monitoring have been augmented with digital means for active wireless communication. For the communication with ground stations and other aerial vehicles, aircraft are mandated to be equipped with ADS-B transponders that periodically broadcast status reports. These reports contain aircraft identification, information on speed, track, and acceleration along with further observation data. The positioning information is mainly derived via GPS, which is the preferred method for self-localization.

최근 몇 년 동안에, 항공 교통 모니터링을 위한 전통적인 아날로그 RADAR기반의 시스템들은 능동적인 무선 통신을 위한 디지털 수단으로 보강되어졌다. 지상국과 다른 비행체들과의 통신을 위해서, 비행체들은 주기적으로 상태 보고서를 방송하는 ADS-B 수신기를 장착해야 한다. 이러한 보고서들은 비행기 신원, 스피드 정보, 경로, 가속도와 추가 관찰 데이터 등을 포함한다. 위치 정보는 주로 자체 위치 파악에 선호되는 방식인 GPS를 통해 도출된다.

Since the ADS-B protocol is openly specified, the modulation and data frame patterns are known. ADS-B operates at a frequency of 1,090 MHz and the typical reception range can reach up to 700km. The signals can thus be received by simple consumer-grade hardware such as Universal Software Radio Peripherals (USRPs) or even cheaper Software Defined Radios (SDRs) like RTL-SDR dongles, which are available for as low as $20. The availability of SDRs not only allows passive eavesdropping but also led to software tools for active ADS-B transmission or the generation of fake GPS signals. Surprisingly, the ADS-B protocol lacks fundamental security measures, and neither applies encryption nor authentication.

ADS-B 프로토콜이 공개적이기 때문에, 변조 및 데이터 프레임 패턴들은 알려져있다. ADS-B는 1,090 MHz 주파수 대역에서 수행되고 전형적인 수신 범위는 700km까지 도달할 수 있다. 신호들은 USRP나 20$도 하지 않는 RTL-SDR동글과 같은 훨씬 저렴한 SDR과 같은 단순한 소비자 등급 하드웨어에 의해 수신될 수 있다. SDRs의 사용가능성은 수동적 도청뿐 아니라 활성 ADS-B 전송이나 가짜 GPS 신호들의 생성을 위한 소프트웨어 툴이 될 수 있다. 놀랍게도, ADS-B 프로토콜은 근본적인 보안 측정이 부족하다, 그리고 암호화와 인증이 적용되어 있지 않다.

2. Trust Definition

We define trust in our system as the certainty of an ADS-B report to be the result of normal behavior and not disrupted by malfunctioning or active manipulation. To this end, a trusted report represents valid data transmitted by genuine sources. On the other hand, an untrustworthy report is either erroneous or contains fake data that should be discarded from further processing. While the traditional notion of trust had been entity-centric and rigid, today's fast-changing ad hoc networks necessitate the adjustment of trust models.

우리는 신뢰를 정상 동작의 결과와 오작동이나 능동적인 조작에 의해서 중단되지 않는 ADS-B 보고서의 확실성으로써 정의한다. 끝에서는, 신뢰된 보고서는 진짜 소스들에 의해 전송되어진 유효한 데이터를 의미한다. 반면에, 비신뢰적 보고서는 에러가 존재하거나 이후 절차에서 제거되어져야만 하는 가짜 데이터를 포함하고 있는것을 의미한다. 신뢰의 전통적인 언급이 개체-중심이고 엄격했던 반면에, 오늘날의 빠르게 변화하는 ad-hoc 네트워크들은 신뢰 모델들의 조정을 필수로 한다.

Hence, we seek to establish a data-centric trust model in consideration of short-lived associations in volatile environments as mentioned by Raya et al. In particular, we design a trust system that is driven by data collected by geographically distributed sensors that share their observations within a network. The combination of redundant views enables the systems to cross-validate data and eventually establish a form of wireless witnessing.

따라서, Raya 등이 언급한 바와 같이 변동성이 큰 환경에서의 단명 연관성을 고려하여 데이터 중심 신뢰 모델을 구축하고자 한다. 특히, 우리는 네트워크 안에서 그들의 관측을 공유하는 지리적으로 분산된 센서들에 의해 수집된 데이터로 도출되는 신뢰 모델을 설계했다. 많은 관점의 조합은 시스템들이 데이터를 교차검증 할 수 있도록 하고 결국에는 무선 목격의 형태를 구축한다.

3. Consolidated System Model

We consider the following system model. Aircraft that are equipped with an ADS-B transmitter periodically broadcast status reports which among other information include GPS-derived positions. A set of geographically distributed sensors receive these reports and their observations are shared with others in a crowdsourcing manner. A central server collects and processes the forwarded observations. Overall, we are faced with the high mobility of aircraft, while the receiving sensors are stationary and are less likely to move significantly. Figure 1 depicts an overview of our system model that we consider to assess the trustworthiness of ADS-B reports.

우리는 다음과 같은 시스템 모델을 고려한다. ADS-B 송신기를 장착한 항공기들은 GPS기반의 위치를 포함하는 다른 정보들과 함께 상태 보고서를 주기적으로 방송해야 한다. 일련의 지리적으로 분산된 센서들은 이러한 보고서들을 수신하고 그들의 관측들은 크라우드 소싱 방식으로 다른 센서들과 공유되어 진다. 중앙 서버가 추후 관측들을 위해 이를 저장하고 처리한다. 종합적으로, 수신 센서들은 고정적이고 상당히 덜 움직이는 반면에, 우리는 항공기의 매우 높은 이동성을 직면하게 된다. 그림1은 우리가 ADS-B 보고서들의 신뢰성을 평가하기 위해 고려한 시스템 모델의 개요를 보여준다.

4. Considered Adversary

Our adversary model comprises several prominent attack vectors, which we categorize according to their intended target and their scope. Table $I$ shows an overview. We evaluate our proposed system against these attacks. Moreover, we will argue in Section VI-C that even attackers with complete knowledge about our verification scheme cannot bypass our implementation of wireless witnessing and can still be detected.

우리의 공격자 모델은 몇가지 두드러지는 공격 백터들로 구성되어있으며, 우리는 그들의 의도된 타겟과 그들의 범위에 따라서 목차화했다. 표 1은 개요를 보여준다. 우리는 이러한 공격들에 대해서 우리가 제안한 시스템을 평가한다. 게다가, 우리는 섹션 6-C에서 심지어 공격자들이 우리의 검증 스키마에 대한 온전한 지식을 가진다 할지라도, 우리의 무선 목격 구현을 우회할 수 없고 여전히 감지된다는 것에 대해서 설명할 것이다.

GPS Spoofing. The airborne (self)-positioning sensors process received GPS signals from multiple satellites to embed the results in the broadcasted ADS-B reports. One attack scenario considers the spoofing of GPS signals where an attacker sends out specially crafted signals at a considerable signal strength. As a result, an attacker can inject false positioning or timing information into the aircraft systems inducing the processing of fake attacker-controlled data.

GPS Spoofing. 상공에 있는 위치 센서 프로세스들은 다수의 위성들로부터 GPS 신호들을 받고 방송된 ADS-B 보고서들에 결과를 추가한다. 한 가지 공격 시나리오는 공격자가 상당한 신호 강도로 특수 제작한 신호들을 보내는 GPS 신호들의 Spoofing을 고려할 수 있다. 결과적으로, 공격자는 가짜 위치나 타이밍 정보를 가짜 공격자-제어 데이터를 처리하도록 하면서 항공 시스템에 주입할 수 있다.

ADS-B Spoofing (Single). An attacker capable of generating fake ADS-B messages can transmit arbitrary reports with full control over their contents. These bogus reports may represent, e.g. any aircraft identifier, positioning solution, or movement information. Receivers of such messages will decode the message contents and forward the sensed information to the central server. We differentiate this attack according to the number of affected sensors. An attacker that is limited in its effective range is likely to only affect single sensors due to their broad spatial distribution.

ADS-B Spoofing (Single). 가짜 ADS-B 메시지들을 생성하는 공격 능력은 그들의 내용들에 대한 전체적인 제어를 가지는 임의의 보고서들을 전송할 수 있다. 이러한 가짜 보고서들은 예를들어, 항공기 신원, 위치 솔루션, 또는 이동 정보등을 나타낼 수 있다. 이러한 메시지들을 받는 수신자는 메시지 내용을 해독하고 중앙 서버로 의미있는 정보를 전달할 것이다. 우리는 영향을 받은 센서들의 수에 따라서 이러한 공격을 구분한다. 그것의 유효범위가 제한되는 공격자는 그들의 공간 분포가 넓기 때문에 단일 센서에만 영향을 줄 것 같다.

ADS-B Spoofing (Multiple). A large-scale attacker may also be capable of targeting multiple geographically distributed sensors at the same time. This attacker, however, requires multiple antennas or a highly elevated high-power antenna. The attack is conducted in a broadcast fashion and is expected to affect all sensors within its targeted area. As a result, more than one sensor would receive the same fake report and forward it to the central server.

ADS-B Spoofing (Multiple). 대규모 공격자는 또한 동시에 다수의 지리적으로 분산된 센서들을 타겟으로 할 능력을 가질 수 있다. 하지만, 이러한 공격자는 여러개의 안테나또는 고출력의 안테나를 필요로 한다. 공격은 브로드캐스트 방식으로 실행되며 타겟 영역내의 모든 센서들에 영향을 끼칠 것으로 보인다. 결과적으로, 하나 이상의 센서들이 같은 가짜 보고서를 받을 것이고, 그것을 중앙 서버로 전달할 것이다.

Sensor Control. Due to the open nature of the surveillance network, attackers may operate their own sensors and become part of the crowdsourcing infrastructure. Having full control over a sensor, an attacker is able to inject arbitrary data encapsulated in genuine ADS-B reports. This attack can be performed without broadcasting any signals and can be directly conducted on the network level.

Sensor Control. 감시 네트워크의 공개 성질 때문에, 공겨자들은 그들 자신의 센서들을 작동시켜 크라우드 소싱 인프라의 한 부분이 될 수 있다. 센서에 대한 모든 제어를 가지는 것은 공격자가 진짜 ADS-B 보고서들에 캡슐화된 임의의 데이터를 주입할 수 있다. 이러한 공격은 어떠한 브로드캐스팅 신호들 없이도 수행될 수 있고 직접적으로 네트워크 레벨에서 수행될 수 있다.

Sybil Attack. A large-scale attacker operating a significant number of sensors can perform a Sybil attack with the purpose of overruling the network's protection systems. The sensors may be deployed at different locations to influence several redundant views at the same time. This constitutes one of the most powerful attacks against sensor networks.

Sybil Attack. 상당 수의 센서들을 작동시키는 대규모 공격자는 네트워크의 보호 시스템들을 뒤덮을 목적으로 Sybil 공격을 수행할 수 있다. 센서들은 동시에 다양한 중복 보기에 영향을 주기 위해서 다른 위치에 배포될 수 있다. 이것은 센서 네트워크에 대한 가장 강력한 공격들 중 하나이다.

Design of an ADS-B Trust System

We propose a system to establish a dynamic verification of ADS-B messages for air-traffic surveillance. We first describe the specifics of the analyzed data and state general network statistics. We then define (i) three verification tests checking the contents of a message and (ii) one ML-based classification of the report metadata, i.e., the reception pattern.

우리는 항공 교통 관제를 위해서 ADS-B 메시지들의 동적 검증을 구축하기 위한 시스템을 제안한다. 먼저, 분석된 데이터의 세부사항과 일반 네트워크 통계에 대해서 서술한다. 이후 메시지들의 내용을 확인하는 3가지 검증 테스트와 기계학습 기반의 보고서 메타데이터(수신 패턴) 분류기에 대해서 정의한다.

1. Data Source Specifics

As the source of our considered data, we utilize real-world air-traffic data from the OpenSky Network. The sensors are installed and operated by volunteers, who can either remain anonymous or opt to register by providing personal information. Over 1000 sensors promote the coverage of the network that exhibits a particularly high sensor density in Europe and on the American continent. The network relies on user-provided data, processes it on centralized servers, and offers access to the collected data of around 20 billion messages per day. It is noteworthy that nodes in the network are not equipped with any cryptographic means or certificates, which would hinder the growth of the sensor network and contradict the easy access to the crowdsourcing platform. While other air-traffic sensor networks exist, we make use of the research-friendly data sharing of this network.

우리의 고려된 데이터의 원천에 따르면, 우리는 OpenSky Network로부터 실제 항공 교통 데이터를 활용한다. 센서들은 익명으로 남거나 개인 정보를 제공함으로써 등록하기로 결정한 지원자들에 의해 설치되고 수행된다. 1000개 이상의 센서들은 유럽과 미국 대륙에서 고밀도를 보이는 네트워크의 범위를 촉진시킨다. 네트워크는 중앙 서버들에서 그것을 처리하는 유저-제공 데이터에 의지한다, 그리고 매일 200억개의 수집된 데이터에 대한 접근을 제안한다. 네트워크에 있는 노드들은 센서 네트워크의 성장을 숨기고 크라우드소싱 플랫폼에 대한 쉬운 접근을 부정하는 어떠한 암호학적 의미나 인증서가 장착되어 있지 않다는 점이 주목 할 만하다. 다른 항공 교통 센서 네트워크들은 존재하는 반면에, 우리는 이 네트워크의 연구 친화적 데이터 공유의 사용하도록 만든다.

For the sake of simplicity, we initially restrict the considered ADS-B reports to the European airspace where the OpenSky Network sensor density is the highest. To further reduce complexity, we divide this space into non-overlapping square shaped clusters $C$ with edge lengths of approx. 10km. In total, the considered environment becomes the union of 232,139 different clusters $C_j \in C$ .

단순성을 위해서, 우리는 OpenSky Network 센서 밀도가 가장 높은 유럽 공역의 ADS-B 보고서들로 제한했다. 복잡성을 더 낮추기 위해서, 우리는 10km의 축적으로 중첩되지 않는 네모 모양 $C$ 로 영역을 나누었다. 결과적으로 해당 환경은 232,139개의 클러스터( $C$ )로 나누어 진다.
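The grid partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text only states non-overlapping square clusters with edges of roughly 10 km, so the equirectangular approximation, the Earth radius, and the name `cluster_index` are all assumptions made here.

```python
import math

# Assumption: mean Earth radius of 6371 km, giving ~111.2 km per degree of latitude.
KM_PER_DEG_LAT = 2 * math.pi * 6371.0 / 360.0


def cluster_index(lat_deg, lon_deg, cell_km=10.0):
    """Map a reported position to a (row, col) cell of a ~cell_km grid.

    Hypothetical helper: rows are latitude bands of fixed height, and the
    column width is computed at the centre latitude of each row so that the
    cells within a row do not overlap.
    """
    if not (-90.0 <= lat_deg <= 90.0 and -180.0 <= lon_deg <= 180.0):
        raise ValueError("position outside the defined latitude/longitude range")
    deg_per_row = cell_km / KM_PER_DEG_LAT
    row = int(math.floor((lat_deg + 90.0) / deg_per_row))
    row_center_lat = -90.0 + (row + 0.5) * deg_per_row
    # Guard against a zero or negative cosine right at the poles.
    km_per_deg_lon = KM_PER_DEG_LAT * max(math.cos(math.radians(row_center_lat)), 1e-9)
    col = int(math.floor((lon_deg + 180.0) * km_per_deg_lon / cell_km))
    return row, col
```

Every report whose position falls into the same cell is then attributed to the same cluster $C_j$. A real deployment would probably use a proper map projection instead of this approximation.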

In order to get a better understanding of the data provided by the OpenSky Network, we visualize the sensor coverages and the number of processed ADS-B messages with respect to their spatial distribution. These evaluations are based on data collected from an entire day (February 15, 2020) resulting in a total of 132,883,464 messages broadcasted by real aircraft. Figure 2 depicts a heat map of the spatial distribution of all recorded ADS-B reports. As one can see, most reports originated from a few cluster areas close to central European airports. Notably, the database only contains messages that reached at least one contributing sensor.

OpenSky Network에 의해 제공된 데이터에 대한 더 나은 이해를 돕기 위해서, 우리는 센서 커버리지와 그들의 공간 분포에 따른 ADS-B 메시지들의 수를 시각화 했다. 이러한 평가들은 실제 비행기에 의해 방송되어진 총 132,883,464개의 데이터로(February 15, 2020)부터 수집된 데이터를 기반으로 했다. 그림2는 모든 기록된 ADS-B 보고서들의 공간 분포에 대한 히트맵이다. 그림에서 보이듯이, 대부분의 보고서들은 중부 유럽 공항들과 가까운 몇 개의 클러스터로부터 기원된다. 특히, 데이터베이스는 최소 하나의 센서에 도달한 메시지들만을 포함한다.

The overall coverage of the network is the combination of all participating sensors. Since sensor coverages can significantly overlap with each other, the redundancy is higher in areas with more sensors as compared to rural areas. Figure 3 shows the aggregated sensor coverage of the OpenSky Network as of February 15, 2020. The heatmap depicts the number of sensors that simultaneously cover an indicated area. A total of 729 different sensors reported data for the considered airspace. We notice a strong dominance in Central Europe, where the most participating sensors are operated. Nevertheless, the coverage of the sensor network also limits the applicability of our system. Airspaces covered by no sensors are not protected.

네트워크의 전체적인 커버리지는 모든 참가 센서들의 조합이다. 센서 커비리지가 서로서로 상당히 겹치기 때문에, 시골 지역에 비해서 많은 센서들을 가지는 지역에 견고성이 더 높다. 서로 다른 총 729개의 센서들은 해당 공역들을 위해 데이터를 보고했다. 우리는 대부분의 참가 센서들이 수행되고 있는 중부 유럽에서 강한 영향력을 알았다. 그럼에도 불구하고, 센서 네트워크의 커버리지는 또한 우리 시스템의 적용 가능성을 제한한다. 센서에 의해 커버되지 않는 영공들은 보호되지 않는다.

2. Notations

For the remainder of this paper, we use the following notations. The network is formed by a set of ground-based sensors $S$ , where each sensor is referred to as $S_i \in S$ . Each ADS-B message $m$ can be received by an arbitrary number $\geq$ 1 of sensors $S_i$ , hence the link ( $m, S_i$ ) exists. Due to noise effects and message collisions, message loss can naturally occur and we denote the probability that sensor $S_i$ receives a message transmitted from cluster $C_j$ as $P_{rec}(S_i, C_j)$ . Moreover, the messages are timestamped by the receiving sensors, where $t$ is the issued timestamp. When a message is not picked up by any sensor, it is consequently not in the considered database. Table $II$ summarizes the used notations.

이 논문의 나머지 부분에 대해서, 우리는 다음과 같은 노테이션을 사용한다. 네트워크는 지상 기반 센서 $S$ 의 집합에 의해 형성된다. 각 센서들은 $S_i \in S$ 로 표현된다. 각 ADS-B 메시지 $m$ 은 1 이상의 임의의 개수의 센서 $S_i$ 에 의해 수신될 수 있다. 따라서, ( $m, S_i$ )링크가 존재한다. 소음 효과와 메시지 충돌 때문에, 메시지 손실은 자연스럽게 발생할 수 있고 우리는 센서 $S_i$ 가 클러스터 $C_j$ 로부터 전송되어진 메시지를 받는 확률을 $P_{rec}(S_i, C_j)$ 로써 정의한다. 게다가, 메시지들은 수신 센서들에 의해 시간이 기록된다.

3. ADS-B Message Trust

In order to assess the trustworthiness of ADS-B messages, we design an evaluation process consisting of four verification tests, namely ( $i$ ) sanity, ( $ii$ ) differential, ( $iii$ ) dependency, and ( $iv$ ) cross check. While the former three tests are stated for the sake of completion, we focus on the cross check that is tailored towards the existing sensor infrastructure to implement wireless witnessing. The system overview is depicted in Figure 4 and is developed in the following.

ADS-B 메시지의 신뢰성을 평가하기 위해서, 우리는 주로 온전성, 차별성, 의존성 그리고 상호 검증인 4가지 검증 테스트로 구성된 평가 절차를 설계한다. 앞의 3가지 검증들은 완료의 목적으로 진행되지만, 우리는 무선 목격을 구현하기 위해 이미 존재하는 센서 인프라에 딱 맞는 상호 검증에 집중한다. 시스템 오버뷰는 그림4에 묘사되어있고 뒤에서 서술한다.

1) Sanity Check:
The sanity check represents a message content verification with respect to defined value ranges. Where data values are not restricted by definition, we apply physical possibility bounds. Sanity checks are specific to the message content, i.e., the reported aircraft status. Table $III$ provides an overview of the implemented sanity check.

1) 온전성 검사:
온전성 검사는 정해진 값의 범위에 대하여 메시지 내용 검증을 의미한다. 데이터 값들이 정의에 의해 제한되지 않는다면, 우리는 물리적 가능 범위를 적용한다. 온전성 검사들은 메시지 내용, 즉 보고된 항공기 상태에 따라 다르다. 표 3은 구현된 온전성 검사의 오버뷰를 제공한다.

Position.
The reported position contains information about the latitude, longitude, and altitude. The latitude is only defined in the range of -90 $\degree$ to 90 $\degree$ , whereas the longitude is defined over -180 $\degree$ to 180 $\degree$ . The altitude is not bounded by its definition but by physical restrictions ranging from approx. -3m, which is the altitude of the lowest European airport, Amsterdam Airport Schiphol. For the maximal altitude, we use a bound of 20,000m, which is hardly reachable for casual air traffic.

위치.
보고된 위치는 위도, 경도, 그리고 고도에 대한 정보를 포함한다. 위도는 오직 -90 $\degree$ 에서 90 $\degree$ 까지의 범위만을 정의한다, 반면에 경도는
-180 $\degree$ 에서 180 $\degree$ 까지 정의한다. 고도는 정의에 의해서 제한되지 않지만, 가장 낮은 유럽 공항(Amsterdam Airport Schipholl)의 고도인 -3m 범위로부터 물리적인 규제에 의해 정의된다. 최대 고도를 위해서, 20,000m의 범위를 사용한다. 이는 일반 항공 교통에 대해 도달 할 수 없다.

Movement.
While airborne, the velocity is expected to be positive and bounded by the maximal speed of the specific aircraft type, usually less than approx. 1,200km/h. The direction of movement, referred to as the true track, is defined by the angle aligned with the True North in the range of 0 $\degree$ to 360 $\degree$ . Moreover, the vertical rate is also aircraft-dependent and is expected to not exceed $\pm50 m/s$ .

이동 방향.
하늘에 있는 동안에, 속력은 양수가 되고, 특정 항공기 유형의 최대 속도인 대략 1,200km/h보다 낮은 속도로 제한될 것으로 예상된다. 실제 트랙을 의미하는 이동 방향은 0 $\degree$ 에서 360 $\degree$ 의 범위에서 True North와 정렬된 각도로 정의된다. 게다가, 수직 속도 또한 항공기에 따라 다르고 $\pm50 m/s$ 를 넘지 않을 것으로 예상된다.

Identification.
Each aircraft is assigned a unique identification, the ICAO 24-bit registration identity. This identifier can be checked against databases that contain currently assigned ICAO registrations. In addition, each aircraft is assigned a volatile call sign, which can also be verified.

신원 확인.
각 항공기는 ICAO 24비트 등록 번호라 불리는 고유 ID를 할당받는다. 이 ID는 현재 할당된 ICAO 등록을 포함하는 데이터베이스에 대해 검사할 수 있다. 게다가, 각 항공기는 휘발성의 호출 부호를 할당 받는데, 이 또한 검증될 수 있다.
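The sanity check can be illustrated with a short sketch that uses the bounds quoted above for position, movement, and identification. The `Report` record, its field names, and the optional `known_icao24` set are hypothetical. Table III itself is not reproduced here, so this covers only the limits stated in the text.

```python
import string
from dataclasses import dataclass

ALT_MIN_M, ALT_MAX_M = -3.0, 20000.0  # lowest European airport / upper bound from the text
SPEED_MAX_KMH = 1200.0                # approximate maximal speed from the text
VRATE_MAX_MS = 50.0                   # vertical rate bound from the text


@dataclass
class Report:
    """Hypothetical decoded ADS-B status report."""
    icao24: str
    lat: float
    lon: float
    alt_m: float
    speed_kmh: float
    track_deg: float
    vrate_ms: float


def sanity_violations(r, known_icao24=None):
    """Return the names of all fields that fall outside their allowed range."""
    violations = []
    if not -90.0 <= r.lat <= 90.0:
        violations.append("latitude")
    if not -180.0 <= r.lon <= 180.0:
        violations.append("longitude")
    if not ALT_MIN_M <= r.alt_m <= ALT_MAX_M:
        violations.append("altitude")
    if not 0.0 < r.speed_kmh <= SPEED_MAX_KMH:
        violations.append("speed")
    if not 0.0 <= r.track_deg <= 360.0:
        violations.append("track")
    if abs(r.vrate_ms) > VRATE_MAX_MS:
        violations.append("vertical_rate")
    # A 24-bit ICAO address is written as six hexadecimal digits.
    well_formed = len(r.icao24) == 6 and all(c in string.hexdigits for c in r.icao24)
    registered = known_icao24 is None or r.icao24.lower() in known_icao24
    if not (well_formed and registered):
        violations.append("identification")
    return violations
```

An empty list means the report passed every check. The registration lookup is skipped when no database of assigned addresses is supplied.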

2) Differential Check:
The differential check considers changes between succeeding ADS-B messages from the same aircraft. These checks, therefore, require the assignment of messages to tracks based on the included identifier. In consideration of the message update rate and broadcast frequency, we identify reasonable maximal changes per second that conform to the inertia and aircraft capabilities as well as covered by observations of real flight data. Table $IV$ contains the implemented tolerable parameter changes. In cases where we receive updated ADS-B reports after a prolonged loss of communication, e.g., due to missing sensor coverage, we incorporate the lack of data by scaling the tolerable maximal change with the missed time period.

2) 차별성 검사:
차별성 검사는 같은 항공기로부터 온 계속적인 ADS-B 메시지의 차이점을 검사한다. 그러므로, 이러한 검사들은 내재된 ID를 기반으로 경로에 대한 메시지들의 할당을 요구한다. 메시지의 업데이트 속도와 방송 주파수를 고려하여, 우리는 관성과 항공기 기능들에 부합하는 합리적인 초당 최대 변화를 식별하고 실제 항공 데이터의 관측을 통해 다룬다. 표 4는 구현된 허용되는 파라미터 변화를 포함한다. 센서 커버리지를 잃어버리는 장기적인 통신의 손실 이후에 업데이트된 ADS-B 보고서를 받는 상황에서, 우리는 놓친 시간 동안에 허용 최대 변화를 확장함으로써 데이터의 부족을 통합한다.
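A minimal sketch of the differential check follows, including the scaling of the tolerance by the elapsed time after a reception gap. The per-second limits in `MAX_CHANGE_PER_S` are placeholder values chosen for illustration only; the actual thresholds are in the paper's Table IV, which is not reproduced in this post. The dictionary keys are likewise assumptions.

```python
# Placeholder limits per second (NOT the values from Table IV).
MAX_CHANGE_PER_S = {"alt_m": 50.0, "speed_kmh": 40.0, "track_deg": 10.0}


def differential_violations(prev, curr, limits=MAX_CHANGE_PER_S):
    """Compare two succeeding reports of the same aircraft.

    `prev` and `curr` are dicts holding a timestamp "t" in seconds plus the
    fields named in `limits`. The tolerable change grows linearly with the
    time between the two reports, which accounts for missed messages.
    """
    dt = curr["t"] - prev["t"]
    if dt <= 0:
        raise ValueError("reports must be ordered by increasing timestamp")
    violations = []
    for key, per_second in limits.items():
        delta = abs(curr[key] - prev[key])
        if key == "track_deg":
            # Headings wrap around at 360 degrees, so take the shorter arc.
            delta = min(delta, 360.0 - delta)
        if delta > per_second * dt:
            violations.append(key)
    return violations
```

With these placeholder limits, a 500 m altitude jump within one second is flagged, while the same change spread over 20 seconds of missing coverage is accepted.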

3) Dependency Check:
The dependency check verifies the relationship between physically-dependent parameters of subsequent reports from the same aircraft. We validate reported horizontal and vertical changes based on predictions of the next position and allow for a tolerance up to 100m, which we have empirically derived from the available dataset. A further dependency exists between the reported altitude and the aircraft indicating to be on ground. We coarsely perform this check against the elevation of the highest European airport (1,707 m), Samedan Airport of Switzerland. Notably, more fine-grained information about the geographical topology would greatly benefit the validity. Table $V$ shows the implemented dependency checks.

3) 의존성 검사:
의존성 검사는 같은 항공기로부터 후속 보고서의 물리적으로 의존하는 파라미터들 간의 관계를 검증한다. 우리는 다음 위치에 대한 예측을 기반으로 보고된 수평적, 수직적 변화들을 검증하고, 가능한 데이터 세트로부터 경험적으로 도출한 최대 100m 까지 허용 오차를 허용 한다. 추가 의존성은 보고된 고도와 땅 위에 있다고 표시되는 항공기들 사이에 존재한다. 우리는 가장 높은 유럽 공항(1,707m)인 스위스의 Samedan Airport의 단계에 대해서 이러한 검사를 거칠게 수행한다. 지리적 토폴로지에 대하여 잘 정제된 정보는 유효성에 이득을 준다. 표 5는 구현된 의존성 검사를 보여준다.
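The dependency check can be sketched as a dead-reckoning prediction compared against the next report, using the 100 m tolerance and the 1,707 m ground-elevation bound quoted above. The flat-Earth prediction step, the dictionary fields, and the function names are assumptions for illustration. How the authors predict the next position is not described in this excerpt.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius


def predict_position(prev, dt):
    """Extrapolate position from speed, true track, and vertical rate over dt seconds."""
    dist = prev["speed_ms"] * dt
    track = math.radians(prev["track_deg"])
    north, east = dist * math.cos(track), dist * math.sin(track)
    lat = prev["lat"] + math.degrees(north / EARTH_RADIUS_M)
    lon = prev["lon"] + math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(prev["lat"]))))
    alt = prev["alt_m"] + prev["vrate_ms"] * dt
    return lat, lon, alt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two positions."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dependency_violations(prev, curr, tol_m=100.0, max_ground_alt_m=1707.0):
    """Check that the new report is consistent with the previous one."""
    dt = curr["t"] - prev["t"]
    if dt <= 0:
        raise ValueError("reports must be ordered by increasing timestamp")
    lat, lon, alt = predict_position(prev, dt)
    violations = []
    if haversine_m(lat, lon, curr["lat"], curr["lon"]) > tol_m:
        violations.append("horizontal")
    if abs(alt - curr["alt_m"]) > tol_m:
        violations.append("vertical")
    # An aircraft claiming to be on the ground cannot be above the highest airport.
    if curr["on_ground"] and curr["alt_m"] > max_ground_alt_m:
        violations.append("on_ground")
    return violations
```

The flat-Earth step is only reasonable for short intervals between reports; longer gaps would call for a proper geodesic calculation.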

4) Cross Check:
The cross check utilizes the spatial redundancy of the surveillance network in a collaborating manner. Participating sensors are widely distributed and their coverages overlap significantly, as shown in Figure 3. Even though the sensor locations are unknown, we can determine which sensors observe which airspace via inspecting the reported positions embedded in their received ADS-B reports. Hence, in our grid-based approach, each cluster $C_j$ is dedicated to covering sensors $S_i$ such that the following equation holds:

$$P_{rec}(S_i, C_j) > 0 \qquad (1)$$

4) 교차 검사:
교차 검사는 공동 작업 방식으로 감시 네트워크의 공간 중복성을 활용한다. 참가 센서들은 널리 분산되어 있고 그림3에서 보듯이, 그들의 커버리지는 상당히 중복된다. 비록 센서 위치가 알려져있지 않더라도, 우리는 그들의 수신된 ADS-B 레포트들에 포함된 보고된 위치를 검사하여 어떤 센서들이 어떤 공역을 감시하는지 결정할 수 있다. 더군다나, 우리의 그리드 기반의 접근방식에서 각 클러스터 $C_j$ 는 다음과 같은 공식을 유지하도록 센서 $S_i$ 를 커버하는데 전념한다.

$$P_{rec}(S_i, C_j) > 0 \qquad (1)$$

If multiple sensors $S_i$ cover the same cluster $C_j$ such that $P_{rec}(S_i, C_j) > 0$, we can countercheck a received message by consulting all designated sensors. For each sensor that covers a reported aircraft position, we distinguish two discrete events, namely that the sensor has or has not received the message:

$$X_{m, S_i} = \begin{cases} 0, & \nexists(m, S_i) \\ 1, & \exists(m, S_i) \end{cases} \qquad (2)$$

만약, 여러개의 센서들이 $S_i$ 같은 클러스터를 $C_j$ 커버하고 있다면, $P_{rec}(S_i, C_j) > 0$ , 우리는 모든 지정된 센서들을 고려함으로써 수신된 메시지들을 상호 검사할 수 있다. 보고된 항공기 위치를 커버하는 각 센서들을 위해서, 우리는 센서들이 메시지를 수신 했는지 또는, 수신 하지 못했는지에 대한 2가지 이벤트를 구분한다.

$$X_{m, S_i} = \begin{cases} 0, & \nexists(m, S_i) \\ 1, & \exists(m, S_i) \end{cases} \qquad (2)$$

Due to noise effects and signal collisions, sensors naturally experience a message loss in the range of 10% to 75%, depending on the distance to the origin, obstacles in view, and the airspace density. Hence, a missed report does not by itself imply unusual behavior or the existence of an attack, and this needs to be factored in accordingly. We refer to the combination of events $X_{m,S_i}$, $S_i \in S$ as the observed message reception pattern for a report broadcast from the claimed position. Each sensed message is therefore mapped to a vector representing the reception events for every sensor:

$$\vec{X}_m = [X_{m,S_1}, X_{m,S_2}, \ldots, X_{m,S_{n-1}}, X_{m,S_n}] \qquad (3)$$

노이즈 효과와 신호 충돌 때문에, 센서들은 기원으로부터의 거리, 가시거리의 장애물, 공역 밀도에 따라서 10%에서 75%까지의 메시지 손실을 경험한다. 더군다나, 보고서를 잃어버리는 경우에는 비정상 행동이나 공격의 존재를 암시할 수 없으므로 이에 대한 고려가 필요하다. 우리는 보고서를 위해 관찰된 메시지 수신 패턴이 요청된 위치로부터 방송되어지는 것으로써 이벤트들의 조합( $X_{m,S_i}$ , $S_i \in S$ )을 정의한다. 각 감지된 메시지는 모든 센서를 위해 수신 이벤트들을 의미하는 벡터에 매핑된다.

$$\vec{X}_m = [X_{m,S_1}, X_{m,S_2}, \ldots, X_{m,S_{n-1}}, X_{m,S_n}] \qquad (3)$$

where $n$ is the total number of sensors in the network. For the scenario we consider, we obtain a vector with 729 entries, which represents the message reception pattern. These patterns exhibit a certain variance and cannot be translated into fixed rules due to non-deterministic sensor reception. Hence, we choose a Machine Learning (ML) approach to handle the huge amount of available data and simultaneously account for unknown external effects.

n은 네트워크에 존재하는 센서들의 총 개수이다. 우리의 고려된 시나리오에 대하여, 우리는 메시지 수신 패턴을 나타내는 729개의 엔트리를 가지는 벡터를 얻는다. 이러한 패턴들은 특정 분산을 나타내고 비결정론적인 센서 인식때문에 고정 규칙으로 변화되지 않는다. 게다가, 우리는 많은 양의 가용가능한 데이터를 다루고 동시에 알려지지 않은 외부 요인을 고려하기 위해서 기계 학습(ML)방식을 선택했다.
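Equations (1) to (3) can be illustrated with a small sketch. It reads $P_{rec}(S_i, C_j) > 0$ empirically as "sensor $S_i$ reported at least one message from cluster $C_j$", which is an assumption of this example; the function and variable names are hypothetical.

```python
from collections import defaultdict


def covering_sensors(observations):
    """Map each cluster to the sensors that reported at least one message from it.

    `observations` is an iterable of (sensor_id, cluster_id) pairs. A sensor is
    treated as covering a cluster when its empirical reception probability is
    greater than zero (Eq. 1).
    """
    cover = defaultdict(set)
    for sensor_id, cluster_id in observations:
        cover[cluster_id].add(sensor_id)
    return cover


def reception_vector(receivers, all_sensors):
    """Build the vector of Eq. 3: 1 where the sensor reported the message, else 0 (Eq. 2)."""
    receivers = set(receivers)
    return [1 if sensor in receivers else 0 for sensor in all_sensors]


sensors = ["S1", "S2", "S3", "S4"]
print(reception_vector({"S2", "S4"}, sensors))  # [0, 1, 0, 1]
```

In the scenario of the paper, `all_sensors` would hold 729 entries, so every message maps to a 729-dimensional binary vector.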

In particular, for each of the 132,883,464 recorded ADS-B reports, we determine which of the 729 sensors reported that specific message. In combination with the embedded positioning information, we learn typical reception patterns for the entire day and label the data as the result of normally operating air traffic and sensors. After processing all reports, each cluster $C_j$ is assigned its actually observed message reception patterns, and we assume these patterns to represent normal behavior. We discuss this assumption in Section VI-A and reason about its validity.

특히, 기록된 각 132,883,464개의 ADS-B 보고서들에 대하여, 우리는 729개의 센서들 중 특정 메시지를 기록하는 센서가 어떤 것인지 결정한다. 내재된 위치 정보의 조합과 함께, 우리는 하루 종일 전형적인 수신 패턴을 학습하고 일반적인 항공 교통과 센서들의 작업의 결과로 데이터를 라벨링 한다.
모든 보고서들을 처리한 후에, 각 클러스터 $C_j$ 들은 실제로 관측된 메시지 수신 패턴으로 할당되어지고 우리는 이러한 패턴을 정상 행위로 나타내기 위해서 추측한다. 우리는 섹션 6-A에서 이러한 가정에 대해 논의하고 그것의 유효성에 대한 이유를 설명한다.

Algorithm Choice.
Since our feature space is defined by the number of sensors and each feature is limited to either 0 (not received) or 1 (received), we choose to use Decision Trees (DTs). This choice is in accordance with similar work classifying distributed sensor events. For more information on machine learning algorithms, we refer to an article by Leo Breiman.

알고리즘 선택.
우리의 특징 영역이 센서들의 수에 의해 정의되어지고 각 특징들이 0 (수신 받지 못함)이나 1 (수신 받음)로 제한되기 때문에, 우리는 의사 결정 트리(DTs)를 사용하기로 결정했다. 이 선택은 분산 센서 이벤트들을 분류하는 유사한 작업에 따른 것이다. 기계 학습 알고리즘에 대한 더 많은 정보는 Leo Breimann의 기사를 참고해라.
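To make the choice concrete, the following is a toy decision tree over binary reception vectors, written in plain Python so that it runs without dependencies. It is a simplified sketch using Gini impurity; the paper does not specify the implementation or its hyperparameters, and in practice a library implementation such as scikit-learn's `DecisionTreeClassifier` would be the more likely choice.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, depth=0, max_depth=8):
    """Recursively split on the binary feature with the lowest weighted Gini impurity."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None
    for f in range(len(X[0])):
        left = [label for row, label in zip(X, y) if row[f] == 0]
        right = [label for row, label in zip(X, y) if row[f] == 1]
        if not left or not right:
            continue  # this sensor does not separate the samples
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return majority
    f = best[1]
    zero = [(row, label) for row, label in zip(X, y) if row[f] == 0]
    one = [(row, label) for row, label in zip(X, y) if row[f] == 1]
    return {
        "feature": f,
        0: build_tree([r for r, _ in zero], [l for _, l in zero], depth + 1, max_depth),
        1: build_tree([r for r, _ in one], [l for _, l in one], depth + 1, max_depth),
    }


def predict(tree, x):
    """Follow the splits until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree[x[tree["feature"]]]
    return tree


# Toy data for three sensors: a message seen by a single sensor is labeled an attack.
X = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
y = ["normal", "normal", "normal", "normal", "attack", "attack", "attack"]
tree = build_tree(X, y)
```

Each internal node asks whether one particular sensor received the message, which matches the 0/1 feature space described above.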

4. Attack Analysis

In the case where at least one of our verification tests indicates unusual behavior, an attack analysis is triggered that tries to further reason about $(i)$ the type of attack and $(ii)$ the affected sensors. Depending on which test triggered the attack analysis, different conclusions can be drawn on the cause of an alarm.

우리의 검증 테스트들 중 최소 하나라도 비정상 행동을 보이는 상황에서, 공격 분석은 $(i)$ 공격의 유형과 $(ii)$ 영향을 받은 센서들에 대한 추가적인 추론을 시도하는 트리거가 발생한다. 어떤 테스트가 공격 분석을 트리거했는지에 따라서, 알람의 원인에 대한 다른 결과들이 도출될 수 있다.

1) Type of Attack: We notice that our three attack classes, i.e., GPS spoofing, ADS-B spoofing, and sensor control/Sybil attack, can be characterized by the type of manipulation they cause on the message or on the network. The manipulation can either affect the content of the ADS-B messages directly or, more subtly, the message reception characteristics. While the sanity, differential, and dependency checks verify the message payload, the cross check evaluates the reception pattern. For each attack vector, we identify which verification test is indicative and provide an overview in Table VI.

1) 공격의 유형:
우리의 3가지 공격 클래스들(GPS spoofing, ADS-B spoofing, sensor control/Sybil attack)은 각각의 네트워크에서 메시지에 대한 조작 유형에따라 특징화 될 수 있다는 것을 안다. 이는 ADS-B 메시지의 내용에 대해 직접적으로 행해질 수 있다. 온전성, 차별성, 의존성 검사가 메시지 페이로드를 검증할 수 있는 반면에, 상호 검증은 수신 패턴을 평가한다. 각각의 공격 벡터에 대하여, 우리는 어떤 검증 테스트가 직설적인지 확인하고 표 6에서 오버뷰를 제공한다.

Sanity Check.
The sanity check detects defined value range violations. These can occur when a report is either specifically crafted during an ADS-B spoofing attack or if a sensor is entirely under the control of an attacker.

온전성 검사.
온전성 검사는 정의된 값의 범위의 위반을 감지한다.
이것들은 레포트가 ADS-B spoofing 공격 동안에 특정하게 생성되었거나, 센서가 온전히 공격자의 통제하에 있을 경우에 발생할 수 있다.

Differential Check.
The differential check is indicative of unusual jumps in the data. A GPS spoofing attack may hence be detectable if the position exhibits a sudden jump. All other attacks may also trigger an alarm, depending on the variance in the generated fake data.

차별성 검사.
차별성 검사는 데이터 내에서 일반적이지 않은 점프를 나타낸다. GPS spoofing 공격은 위치가 갑작스러운 점프를 보여준다면, 감지될 수 있다. 모든 다른 공격들은 생성된 가짜 데이터의 변화에 따라 알람을 트리거할 수 있다.

Dependency Check.
The dependency check detects inconsistencies between interdependent data from independent sensors within the aircraft. Since a successful GPS spoofing attack only affects GPS-related sensors, other information on the movement or on the heading will likely result in a violation. Again, other attacks may also fail this test if the fake reports do not satisfy parameter dependencies.

의존성 검사.
의존성 검사는 항공기 내에 있는 독립적인 센서로부터 신뢰할 만한 데이터간의 불규칙성을 감지한다. 성공적인 GPS spoofing 공격은 오직 GPS와 관련된 센서들에만 영향을 주기 때문에, 이동이나 방향에 관련된 다른 정보들은 위반으로 이어질 가능성이 높을 것이다. 다시 말하지만, 가짜 레포트들이 파라미터 의존성을 만족하지 못한다면, 다른 공격들은 또한 이러한 검사들을 실패할 것이다.

Cross Check.
The cross check tries to decide whether a message reception pattern is the result of normal behavior. An aircraft report affected by a GPS spoofing attack indicates a wrong position, and the observed reception pattern, which stems from the real location, will likely differ from the patterns learned for the claimed position. For the other attacks, the validity of the cross check depends upon the number of benign sensors that observe the claimed aircraft position. The more sensors simultaneously cover an area, the less likely it is that only a specific subset of sensors, e.g., those affected by an ADS-B spoofing attack, receives a given message. Similar considerations apply to attackers adding sensors to the network. Unaffected sensors will not report injected messages, which is eventually reflected in an unusual reception pattern. For both attack classes, reception patterns are easier to decide the more sensors participate.

상호 검증.
상호 검증은 만약 메시지 수신 패턴이 정상 행동의 결과인지 아닌지 결정한다. GPS spoofing 공격에 영향을 받은 항공기 레포트는 잘못된 위치를 나타내고 수신 패턴이 실제 위치의 실제 수신 패턴과 다를것이다. 다른 공격들에 대하여, 상호 검증의 유효성은 요청된 항공기 위치를 관측하는 정상 센서들의 수에 달려있다. 더 많은 센서들이 동시에 지역을 커버한다면, 더 적은 특정 수의 센서들(ADS-B spoofing 공격에 의해 영향을 받은)만이 특정 메시지를 수신할 것이다. 공격자들이 네트워크에 센서를 추가하는 것도 비슷한 고려사항이 적용된다. 영향을 받지 않은 센서들은 주입된 메시지들을 보고하지 않을 것이며 이는 결국 비정상적인 수신 패턴에 영향을 줄것이다. 양쪽의 공격 클래스들에 대하여, 수신 패턴들은 더 많은 센서들이 참가하는 것을 결정하기에 쉬울 것이다.

2) Affected Sensors:
If we successfully detect unusual behavior and identify the type of attack, we also try to reason about the affected ADS-B sensors. We generally distinguish between passively and actively participating sensors during an attack. While we can tag all sensors that reported an untrustworthy message as potentially malicious, we are interested in which sensors are indeed under the attacker's control. These compromised sensors are actively trying to disrupt the network. We therefore identify all sensors that report messages clearly assigned to a sensor control/Sybil attack as malicious. Identifying them allows us to disconnect them from the network and restore the network's integrity.

2) 영향을 받은 센서들:
만약 우리가 비정상 행동을 성공적으로 탐지하고 공격의 유형을 식별한다면, 영향을 받은 ADS-B 센서들인지에 대하여 설명하려 할 것이다. 우리는 공격 동안에 참가 센서들이 미활동중인지 활동중인지 구분한다. 비신뢰 메시지를 보고하는 모든 센서들을 우리가 잠재적으로 위협으로 태그할 수 있는 반면에, 우리는 어떤 센서들이 실제로 공격자의 제어 아래에 있는지 궁금해한다. 이러한 오염된 센서들은 네트워크를 방해하려고 적극적으로 노력한다. 그러므로, 우리는 센서 제어/시빌 공격에 대하여 명확하게 할당되어 있는 메시지를 보고하는 모든 센서들을 위협으로 식별한다. 그들의 식별 때문에 네트워크로부터의 연결 해제와 네트워크의 무결성을 복구할 수 있다.

On the other hand, sensors that fell victim to an attack themselves may only be temporarily disconnected from the network. Sensors that are recognized in such a way can later be reactivated once the attack is over. The tracing of affected sensors also allows for a coarse localization of an attack. Even though sensor locations are unknown, the coverages of the sensors can be determined and consequently a rough attacker position could be narrowed down.

반면에, 공격의 희생양이 된 센서들은 일시적으로 네트워크로부터 연결 해제될 것이다. 그러한 방법으로 인식된 센서들은 후에 일단 공격이 끝나면 재활성화될 수 있다. 영향을 받은 센서들의 추적은 또한 공격의 대략적인 위치를 알아낼 수 있다. 센서 위치들을 알 수 없을지라도, 센서들의 커버리지들은 결정될 수 있고, 결과적으로 공격자 위치를 좁힐 수 있다.
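The distinction between actively malicious sensors and passive victims could be bookkept roughly as follows. This sketch assumes the attack analysis has already attached an attack type to each untrustworthy report; the label strings are hypothetical.

```python
def classify_sensors(flagged_reports):
    """Split sensors into attacker-controlled ones and temporary victims.

    `flagged_reports` is an iterable of (sensor_id, attack_type) pairs for
    messages judged untrustworthy. Sensors reporting messages attributed to a
    sensor control/Sybil attack are treated as malicious; all other flagged
    sensors are treated as victims that can be reactivated later.
    """
    malicious, victims = set(), set()
    for sensor_id, attack_type in flagged_reports:
        if attack_type == "sybil":
            malicious.add(sensor_id)
        else:
            victims.add(sensor_id)
    # A sensor seen in a Sybil attack stays malicious even if also flagged elsewhere.
    return malicious, victims - malicious
```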

Simulation

While the characteristics of normally operating air traffic can be learned from the actually received ADS-B reports, attack scenarios need to be emulated based on realistic assumptions and experience. Assuming that no attacks were launched on the selected day (February 15, 2020), we use all reports to map typical reception patterns. In the following, we describe how we simulated the three considered attack classes, i.e., GPS spoofing, ADS-B spoofing, and sensor control/Sybil attack. For each attack, we generate at least as many reports as were received normally, i.e., more than 132 million different fake reports representing each respective attack. Note that this does not reflect the actual distribution between normal and attack reports, but is chosen to establish a reasonable database of fake reports. This allocation is used for the training process only.

정상적으로 수행되는 항공 교통의 성격은 실제로 수신받은 ADS-B 보고서들로부터 학습될 수 있는 반면에, 공격 시나리오들은 실제같은 가정과 경험을 기반으로 에뮬레이트 될 필요가 있다. 공격들은 특정 날짜에 실행되지 않는다고 가정하면서, 우리는 전형적인 수신 패턴을 매핑하기 위해서 모든 보고서들을 사용한다. 다음에서, 우리가 어떻게 세 가지 고려되는 공격 클래스들(GPS spoofing, ADS-B spoofing, and sensor control/Sybil attack)을 시뮬레이션 했는지 설명한다. 각 공격에 대하여, 우리는 정상적으로 수신한 최소 보고서의 개수, 즉 각 공격을 나타내는 1억 3,200만개의 서로 다른 가짜 보고서들을 생성한다. 이것이 정상 보고서와 공격 보고서 간의 실제 분산을 반영하지 않는다는 것을 유의해라. 하지만, 가짜 보고서들의 합리적인 데이터베이스를 구축하기위해 선택되었다. 이러한 할당은 오직 학습 과정만을 위해서 사용되었다.

1. GPS Spoofing

To emulate a successful GPS spoofing attack, we manipulate the reported GPS-derived positioning information embedded in ADS-B reports. More precisely, we randomly sample one ADS-B report from the entire dataset. We then gather all reports from the corresponding aircraft for the preceding 15 min and the next 60 min, representing a 75 min aircraft track. This track is then subject to selected deviations $\alpha$ of $1\degree$, $2\degree$, $5\degree$, $10\degree$, $20\degree$, or $45\degree$ to simulate an attack incrementally leading aircraft off their track starting at $t_{attack}$ = 15 min. Figure 5 depicts this procedure. For each deviation, we replace the GPS position in the reports while all other data fields and the sensors that received the message remain the same. We label the messages after $t_{attack}$ as resulting from a GPS spoofing attack and also keep track of the applied deviation, the distance to the original track, and the elapsed time after the attack has been launched. We repeat this process of randomly sampling reports from the dataset and manipulating the GPS position until the desired number of reports is reached.

성공적인 GPS spoofing attack을 실험하기 위해서, 우리는 ADS-B 보고서들에 내재된 GPS 기반의 위치 정보를 조작한다. 더욱 정확하게는, 우리는 전체 데이터베이스에서 무작위하게 하나의 ADS-B 보고서 표본을 추출한다. 그러고 나서 우리는 이전 15분과 75분간의 항공 경로를 나타내는 다음 60분동안 해당 항공기에서 모든 보고서들을 수집한다. 항공기가 $t_{attack}$ = 15 min에서 수행되는 점진적으로 그들의 경로를 벗어나는 공격을 시뮬레이션 하기 위해서, 이 경로는 특정 편차 $\alpha$ (1 $\degree$ , 2 $\degree$ , 5 $\degree$ , 10 $\degree$ , 20 $\degree$ , or 45 $\degree$ )를 가진다. 그림 5는 이러한 절차를 묘사한다. 각 편차에 대하여, 우리는 다른 모든 데이터 필드와 메시지를 수신하는 센서들이 같게 유지되는 동안에 보고서들에 있는 GPS 위치를 교체한다.
우리는 $t_{attack}$ 이후에 GPS spoofing attack의 결과로써 메시지들을 라벨링하고 적용된 편차의 경로, 원래 경로와의 거리, 그리고 공격이 수행된 이후 경과된 시간을 보관한다. 우리는 데이터베이스로부터 무작위하게 보고서들을 추출하고 GPS 위치를 조작하는 이 과정을 목표된 보고서의 수에 도달할 때까지 반복한다.
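One way to realize such a deviation is to rotate the remainder of the track around the position held at $t_{attack}$. The sketch below follows that interpretation in a local planar frame; the exact geometric construction used by the authors is not given in the text, so this is an assumption, as are the function and parameter names.

```python
import math

T_ATTACK_S = 15 * 60  # the attack starts 15 min into the 75 min track


def deviate_track(track, alpha_deg, t_attack=T_ATTACK_S):
    """Rotate every point after t_attack around the position held at t_attack.

    `track` is a time-ordered list of (t_seconds, x_m, y_m) in a local planar
    frame. Returns (t, x, y, label) tuples with label 1 for spoofed reports.
    """
    a = math.radians(alpha_deg)
    pivot = None
    out = []
    for t, x, y in track:
        if t <= t_attack:
            pivot = (x, y)           # last authentic position so far
            out.append((t, x, y, 0))
            continue
        if pivot is None:            # track starts after t_attack
            pivot = (x, y)
        dx, dy = x - pivot[0], y - pivot[1]
        rx = pivot[0] + dx * math.cos(a) - dy * math.sin(a)
        ry = pivot[1] + dx * math.sin(a) + dy * math.cos(a)
        out.append((t, rx, ry, 1))
    return out
```

Under this construction the distance between the spoofed and the original position grows with the distance flown since $t_{attack}$, which matches the incremental drift described above.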

2. ADS-B Spoofing

When simulating an ADS-B spoofing attack, we are faced with the problem of unknown sensor locations. Even tracing the observed clusters does not reveal a sensor position, since the reception range can vary considerably and may differ by direction. It is noteworthy that an attacker would face the same problem and cannot pinpoint sensors but would need to blindly affect larger regions when targeting multiple sensors. We differentiate the attack according to how many sensors fall victim to it, i.e., a single sensor, multiple sensors, or all sensors within a selected region. Figure 6 illustrates these attacks. To simulate an attacker targeting multiple sensors, we randomly pick sensors up to the average number of observing sensors of the respective cluster.

ADS-B spoofing attack을 시뮬레이션 할 때, 우리는 알 수 없는 센서 위치 문제를 직면한다. 심지어 클러스터들을 추적하더라도, 수신 범위가 매우 다양할 수 있고 방향이 다를 수 있기 때문에 센서의 위치가 드러나지 않는다. 공격자가 같은 문제를 직면하고 센서들의 위치를 잡을 수 없지만, 다수의 센서들을 목표로 할 때 넓은 지역에 구속력 있게 영향을 줄 필요가 있다는 것에 주목할 만하다. 우리는 얼마나 많은 센서들이 공격의 희생양이 되었는지(단일 센서, 다수 센서, 특정 지역의 모든 센서들)에 따라서 공격을 구분한다. 그림 6은 이러한 공격들을 나타낸다. 다수 센서들을 목표로하는 공격자를 시뮬레이션 하기 위해서, 우리는 각 클러스터의 관측 센서들의 평균 수 만큼 무작위하게 센서들을 고른다.

We again generate fake messages for each scenario by randomized sampling from real-world aircraft reports. We extract the corresponding 75 min long track and adjust the receiving sensors depending on the coverage of the considered cluster and how many sensors are affected by the attack. All other data fields remain the same. We use real aircraft reports to represent an attacker trying to inject authentic ghost aircraft into the network by sending those messages to the scenario-dependent number of sensors.

우리는 실제 항공기 보고서에서 무작위 샘플링을 통해 각 시나리오에 대한 가짜 메시지를 다시 생성한다. 우리는 해당 75분 길이의 트랙을 추출하고 고려된 클러스터의 커버리지와 공격의 영향을 받는 센서의 수에 따라 수신 센서를 조정한다. 다른 모든 데이터 필드는 동일하게 유지된다. 우리는 실제 항공기 보고서를 사용하여 시나리오에 따라 달라지는 센서 수에 해당 메시지를 전송하여 실제 유령 항공기를 네트워크에 주입하려는 공격자를 나타낸다.
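The three scenarios differ only in which sensors are marked as receivers of the injected report. A possible selection routine is sketched below; the lower bound of two sensors for the "multiple" scenario and all names are assumptions of this example.

```python
import random


def spoofed_receivers(covering, scenario, avg_observers, rng=random):
    """Pick the sensors that report an injected message under one scenario.

    covering:      list of sensors covering the claimed cluster
    scenario:      "single", "multiple", or "all"
    avg_observers: average number of sensors observing reports in that cluster
    """
    if not covering:
        raise ValueError("cluster has no covering sensors")
    if scenario == "single":
        return {rng.choice(covering)}
    if scenario == "multiple":
        # Randomly pick sensors up to the average number of observers.
        upper = min(max(2, round(avg_observers)), len(covering))
        lower = min(2, upper)
        return set(rng.sample(covering, rng.randint(lower, upper)))
    if scenario == "all":
        return set(covering)
    raise ValueError(f"unknown scenario: {scenario}")
```

The returned set replaces the original receivers of each report in the sampled track, while all other data fields stay unchanged.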

3. Sensor Control/Sybil Attack

In a sensor control/Sybil attack, an attacker adds sensors to the network that are under the attacker's synchronized control. We assume that the attacker's sensors initially behave normally to remain unnoticed prior to any fake message injection. When an attack is launched, all controlled sensors mutually try to report the same fake message. We again differentiate between the number of controlled sensors with regard to the number of benign sensors, i.e., a single sensor or equality between the attacker's sensors and benign sensors.

sensor control/Sybil Attack에서, 공격자는 공격자의 동기화된 통제하에 있는 네트워크에 센서들을 추가한다. 우리는 공격자의 센서들이 어떠한 가짜 메시지 주입 이전에 알아차려지지 않은 채로 남기위해서 초기에 정상적으로 행동하는 것으로 가정한다. 공격이 시작될 때, 모든 통제된 센서들은 서로 같은 가짜 메시지를 보고하려고 한다. 우리는 단일 센서나 공격자의 센서들과 정상 센서들간의 동일성과 같은 정상 센서들의 수에 따라서 통제된 센서들의 수를 구분한다.

The process of sampling and selecting tracks is the same as for ADS-B spoofing. We assume that the attacker utilizes all controlled sensors to inject the same message. Notably, the benign sensors that cover the same area are not affected by a Sybil attack and will consequently not report the injection of such messages.

트랙들을 샘플링하고 선택하는 과정은 ADS-B spoofing을 위한 것과 같다. 우리는 공격자가 같은 메시지를 주입하기 위해서 모든 통제된 센서들을 활용한다고 가정한다. 같은 지역을 커버하는 정상 센서들은 Sybil attack에 영향을 받지 않고, 그러한 메시지들의 주입을 보고하지 않는다는 점에 주목해라.

Evaluation

We split the evaluation of the developed ADS-B trust system into $(i)$ performance of detecting each considered attack, $(ii)$ distinguishing between attack vectors, $(iii)$ identifying affected sensors, $(iv)$ analyzing the impact of different grid resolutions, $(v)$ investigating the time dependency and $(vi)$ estimating the computational performance.

우리는 제안된 ADS-B 신뢰 시스템의 평가를 $(i)$ 각 고려된 공격 탐지의 성능, $(ii)$ 공격 벡터들간의 구분, $(iii)$ 공격 받은 센서들의 식별, $(iv)$ 다양한 그리드 해상도의 영향 분석, $(v)$ 시간 의존성 검사, 그리고 $(vi)$ 계산 성능 추정으로 나눈다.

1. Attack Detection Performance

We approach the attack detection performance in two different ways. First, we consider the classification results of single ADS-B reports without linking consecutive reports, and second, we make decisions on combined aircraft tracks. The training process uses all reports of the selected day as well as the simulated attack vectors based on randomly sampled 75 min long aircraft tracks from the OpenSky Network database according to Section IV. Our attack detection evaluation prototype uses clusters $C_j$ with edge lengths of 10km. We assign each report to its originating cluster indicated by the embedded position splitting up all messages over the observed area. We then perform training with our selected DT classifier by iterating through all clusters.

우리는 서로 다른 2가지 방식으로 공격 탐지 성능 측정에 접근했다. 첫째는, 연이은 레포트들을 연결하는 것 없이 단일 ADS-B 보고서들의 결과를 분류하는 것을 고려했다. 둘째, 결합된 항공기 트랙들에 대해 의사결정했다. 학습 과정은 선택된 날짜의 모든 보고서들 뿐만 아니라, 섹션 IV에서 언급한 the OpenSky Network 데이터베이스로부터 무작위하게 샘플링된 75분간의 항공기 트랙을 기반으로 한 시뮬레이션된 공격 벡터들을 사용한다. 우리의 공격 탐지 평가 프로토타입은 10km의 테두리 길이를 가지는 클러스터들( $C_j$ )을 사용한다. 우리가 각 보고서를 관측 영역에서 모든 메시지를 분할하는 내재된 위치에 의해서 표시되는 원래 클러스터들에 할당한다. 그러고나서 우리는 모든 클러스터들을 반복적으로 DT 분류기로 학습한다.
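Assigning reports to square clusters with 10 km edges can be sketched with a simple planar projection. The equirectangular approximation and the reference latitude are assumptions made for this illustration; the paper does not describe how the grid is laid out.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000.0
EDGE_M = 10_000.0  # 10 km grid resolution of the evaluation prototype


def cluster_id(lat_deg, lon_deg, ref_lat_deg=50.0, edge_m=EDGE_M):
    """Return the (column, row) index of the grid cell containing a position."""
    x = EARTH_RADIUS_M * math.radians(lon_deg) * math.cos(math.radians(ref_lat_deg))
    y = EARTH_RADIUS_M * math.radians(lat_deg)
    return (math.floor(x / edge_m), math.floor(y / edge_m))


def group_by_cluster(reports):
    """Group (lat, lon, reception_vector, label) tuples by their originating cluster."""
    per_cluster = defaultdict(list)
    for lat, lon, vector, label in reports:
        per_cluster[cluster_id(lat, lon)].append((vector, label))
    return per_cluster
```

One classifier is then trained per key of `group_by_cluster`, using only the reception vectors and labels that fall into that cell.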

For testing, we again query the database for 1000 untrained and randomly selected aircraft tracks. We do not impose any restrictions on the selection process except that at least 50% of the broadcast reports must actually be recorded by the network. This filters out tracks that would quickly leave the covered area, i.e., the scope of the network, and hence cannot be classified due to missing reports. We apply the different attack vectors, label each track accordingly, and then classify the resulting reports with the classifier for the designated cluster. For our three attack classes, i.e., GPS spoofing, ADS-B spoofing, and sensor control/Sybil attack, we briefly describe which test triggers an alarm and then focus on the ML-supported cross check, providing True Positive Rates (TPRs) and False Positive Rates (FPRs).

테스트를 위해서, 우리는 다시 1000개의 학습되지 않고, 무작위하게 선택된 항공 트랙을 위해서 데이터베이스에 쿼리를 보낸다. 우리는 방송된 보고서들에 대해서 최소 50%가 실제로 네트워크에 의해서 기록되어야한다는 점을 제외하고는 선정 과정에 관해서 어떠한 제한도 두지 않는다. 이 조건은 커버되는 영역(네트워크의 범위)을 빠르게 떠날 수 있는 트랙을 필터링하기 때문에 누락된 보고서들때문에 DT를 사용할 수 없는 보고서들을 제외한다. 우리는 서로 다른 공격 벡터들을 적용하고, 각 트랙에 따라서 레이블을 지정한 다음, 지정된 클러스터들에 대한 분류기로 결과 보고서들을 분류한다. 우리의 세가지 공격 클래스들(GPS spoofing, ADS-B spoofing, and sensor control/Sybil attack)을 위해서 우리는 간단하게 어떤 테스트가 알람을 트리거하는지 설명하고, TPR과 FPR을 제공하는 ML기반의 상호 검증에 집중한다.

1) GPS Spoofing:
While an incremental position deviation passes the differential check, our dependency check consistently indicates mismatches between predicted positions and the reported GPS position. Even though we account for a specific uncertainty threshold, at one point in time, the attack exceeds this threshold. In consideration of the cross check, the intuition is that the further away an aircraft claims to be from its real position, the more different the reception pattern will be. Notably, the selected cluster for the cross check is determined by the reported/claimed position. If the real position and the spoofed position are still within the same cluster, the reception patterns are the same and a decision towards the presence of a GPS spoofing attack is not possible.

1) GPS Spoofing
증가하는 위치 편차가 차별성 검사를 통과하는 반면에, 우리의 의존성 검사는 예측된 위치와 보고된 GPS 위치간에 불일치를 일관되게 나타낸다. 심지어 구체적인 불확실성 임계값을 고려하더라도 한 시점에서의 공격은 이 임계값을 넘어선다. 상호 검증을 고려할 때, 항공기가 실제 위치로부터 점점 더 멀어질 수록, 더 많은 수신 패턴이 달라질 것이라는 직감이 든다. 상호 검증을 위한 선택된 클러스터는 보고된/주장하는 위치에 의해서 결정된다는 것에 유의하라. 만약 실제 위치와 변조된 위치가 여전히 동일한 클러스터내에 있다면, 수신 패턴들은 같고 GPS spoofing attack의 존재에 대한 결정은 불가능하다.

To assess our detection performance of GPS spoofing attacks, we consider a classifier that has been trained with samples from normal operation and the simulated GPS spoofing reports. We further calculate a score based on the classifier outcome and the total number of reports. Following this metric, a score of 1 means that every report is labeled authentic while a score of 0 means that every report was labeled malicious. We evaluate $(i)$ the average score over all 1000 runs of the classifier with respect to different deviations $\alpha$ from the original track and the elapsed time in Figure 7 and $(ii)$ the average score with respect to the distance to the original track in Figure 8. The distance to the original track is a combination of the applied deviation and the time that has elapsed after the launch of the attack.

GPS spoofing attacks에 대한 우리의 탐지 성능을 평가하기 위해서, 우리는 정상 수행과 시뮬레이션된 GPS spoofing 보고서들로부터의 표본들로 학습된 분류기를 고려한다. 우리는 분류 결과와 보고서의 총 개수를 기반으로 점수를 계산한다. 다음의 측정법에 따르면, 1점은 모든 보고서가 진짜로 라벨링 되었고 0점은 모든 보고서가 악의적이라고 라벨링 되었다는 것을 의미한다. 우리는 그림7에 $(i)$ 원래의 트랙으로부터 다른 편차들( $\alpha$ )과 경과 시간을 이용한 분류기의 1000번의 실행에 대한 평균 점수와 그림8에 $(ii)$ 원래 트랙에 대한 거리에 따른 평균 점수를 평가한다. 원래 트랙으로부터의 거리는 적용된 편차와 공격이 실행된 이후에 경과된 시간의 조합이다.
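The score can be read as the share of reports labeled authentic. The sketch below computes it overall and per elapsed minute, as one might do to produce a curve like the one in Figure 7; the one-minute bucketing is an assumption of this example.

```python
from collections import defaultdict


def authenticity_score(labels):
    """Share of reports labeled authentic: 1 = all authentic, 0 = all malicious."""
    if not labels:
        raise ValueError("no reports to score")
    return sum(1 for is_authentic in labels if is_authentic) / len(labels)


def score_by_minute(classified):
    """Average score per elapsed minute for (elapsed_seconds, is_authentic) pairs."""
    buckets = defaultdict(list)
    for elapsed_s, is_authentic in classified:
        buckets[int(elapsed_s // 60)].append(is_authentic)
    return {minute: authenticity_score(v) for minute, v in sorted(buckets.items())}
```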

Results. While the dependency check is effective in detecting GPS spoofing attacks, in cases where additional information might be missing, the cross check is sufficient to detect such attacks with a high probability after a certain amount of time has passed, see Figure 7. For instance, considering $\alpha = 2\degree$ , $\alpha = 10\degree$ , $\alpha = 45\degree$ the score falls below 0.5 after approx. 20 min, 5 min, and 1 min, respectively.
The rate at which the average score decreases is dominated by the applied deviation $\alpha$ . The higher the deviation, the faster the fake positions approach other clusters, leading to mismatches in the reception patterns. Notably, the average score, even under normal operation, never reaches 1 due to a portion of reports being wrongly classified. We will handle this problem by linking successive reports when deciding aircraft tracks.

결과.
의존성 검사는 GPS spoofing attacks을 탐지하는데에 효과적인 반면에, 추가적인 정보가 누락되는 경우에서 상호 검증은 특정 시간이 지난 후에, 높은 확률로 이러한 공격들을 탐지하는데에 충분하다. 그림 7 참고. 예를 들어서, $\alpha = 2\degree$ , $\alpha = 10\degree$ , $\alpha = 45\degree$ 를 고려하면, 점수는 각각 20분, 5분, 그리고 1분 후에 대략 0.5 밑으로 떨어진다. 평균 점수가 감소하는 비율은 적용된 편차( $\alpha$ )에 의해 다르다. 편차가 클 수록, 가짜 위치들이 다른 클러스터들에 더 빠르게 접근하여 수신 패턴이 불일치한다. 심지어 정상 수행이더라도, 평균 점수는 잘못 분류된 일부 보고서들로 인해 1점에 다다를 수 없다는 것에 유의하라. 우리는 이러한 문제를 항공기 트랙을 결정할 때, 연속적인 보고서들을 연결함으로써 해결한다.

Figure 8 condenses the deviation and the elapsed time into the distance to the original track. The average score quickly approaches 0.5 for distances up to one grid resolution, i.e., 10 km in our evaluation prototype. After this point has been reached, the decline slows down and reaches approx. 0.35 for a distance of two grid resolutions. Further distances only moderately decrease the average score and it nearly stabilizes at this point. We observe that the classifier can differentiate the reception patterns and performs increasingly better the further the spoofed track deviates from the real aircraft track. Note that in the worst case, a distance of approx. $\sqrt2$ times the grid resolution can still point to the same cluster. However, increasing the distance further guarantees different clusters.

그림8은 편차와 경과시간을 원래 트랙에 대한 거리로 합친다. 우리의 평가 프로토타입에서 최대 하나의 그리드 해상도까지의 거리 즉, 10km에 대하여 평균 점수는 빠르게 0.5에 다다른다. 이 지점에 다다른 후에, 감소는 느려지고 두번째 그리드 해상도에 대하여 대략 0.35에 도달한다. 거리가 더 멀어지면 오직 평균 점수가 적당히 감소하고 거의 이 지점에서 안정된다. 우리는 분류기가 수신 패턴들을 분류하고 점점 더 좋게 수행할 수록 실제 항공기 트랙으로부터 위조된 트랙 편차들이 멀어진다는 것을 관찰한다.최악의 경우 그리드 해상도의 약 $\sqrt2$ 배의 거리가 여전히 동일한 클러스터를 가리킬 수 있다는 점에 유의하라. 그러나 거리를 더 늘리면 서로 다른 클러스터가 보장된다.
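The $\sqrt2$ bound follows from the geometry of a square grid cell: two positions in the same cell are farthest apart when they sit on opposite corners. With grid resolution $g$, the worst-case distance is the cell diagonal:

$$d_{max} = \sqrt{g^2 + g^2} = \sqrt{2}\,g \approx 14.1\ \text{km} \quad \text{for } g = 10\ \text{km}$$

Any spoofed position that lies more than $d_{max}$ away from the real position therefore necessarily falls into a different cluster.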

We now approach the question of how we decide aircraft tracks, in contrast to the aforementioned evaluations where we showed average scores over all test runs for individual reports. Figures 7 and 8 show that the score fluctuates and that authentic reports are sometimes labeled as malicious. Even when no attacks are applied, we never reach a perfect score of 1. Hence, the detection of attacks cannot be based on single messages alone without triggering a high number of false alarms. Considering that we designed our system as an augmentation system for attack detection, false alarm events are disruptive and a high number is unacceptable.

앞서 언급한 각 보고서들에 대한 모든 테스트 실행에 대한 평균 점수를 보여주는 평가들과는 대조적으로, 우리는 어떻게 우리가 항공기 트랙들을 선택했는지에 대한 질문에 접근한다. 그림 7과 8은 점수가 변동하고 실제 보고서들이 때때로 악의적인것으로 라벨링되는 것을 보여준다. 심지어 아무 공격도 적용되지 않을 때에도, 우리는 절대 완벽한 1점에 도달하지 못한다. 더군다나, 공격의 탐지는 많은 수의 잘못 울리는 알람들을 트리거 하는 것 없이 단일 메시지 홀로 기반이 될 수 없다. 우리는 우리의 시스템을 공격 탐지를 위한 추가적인 시스템으로써 설계했다는 점을 고려할 때, 잘못 울리는 알람들은 파괴적이고 많은 수는 받아들일 수 없다.

To compensate for single false positives, i.e., malicious patterns detected when no attack is applied, we implement time windowing. In particular, we tested three different time windows $w$ , i.e., 5 min, 10 min, and 15 min. The time windowing is only applied backwards such that the score at time $t$ becomes the average score of all received reports within the last $w$ minutes. The final decision is then based on score thresholds. With the target of minimizing false alarms, we set the threshold at the lowest score that we observed across all randomly selected 1000 aircraft tracks at any given time after $t_{attack}$ . As a result, we achieve a false positive rate of 0% by design with respect to the considered tracks. The selected threshold depends on the length of the time window, where shorter time windows lead to higher thresholds and larger time windows allow tighter thresholds.

공격이 적용되지 않았을 때 탐지되는 악의적인 패턴들과 같은 단일 오탐을 위해 보완하기 위해서, 우리는 타임 윈도우를 구현한다. 특히, 우리는 세 가지 다른 타임 윈도우( $w$ )(5분, 10분, 15분)를 테스트했다. 타임 윈도우는 오직 역방향으로 적용되기 때문에, 시간 $t$ 에서의 점수가 마지막 $w$ 분동안 안에 모든 수신된 보고서들의 평균 점수가 된다. 그러고 나서 최종 결정은 점수 임계값을 기반으로 한다. 오탐을 최소화하는 것을 목표로, 우리는 임계값을 $t_{attack}$ 후에 주어진 시간에서의 무작위하게 선택된 모든 1000개의 항공기 트랙들에서 관측한 가장 낮은 점수로 설정한다. 결과적으로, 우리는 고려된 트랙들에 따라서 디자인함으로써 오탐율 0%를 달성했다. 선택된 임계값은 타임 윈도우의 길이에 따라 다르다. 타임 윈도우가 짧으면 임계값이 높아지고, 타임 윈도우가 길어지면 임계값이 줄어든다.
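The backward time window and the threshold selection can be sketched as follows. The code assumes each track is a time-ordered list of (timestamp, is_authentic) pairs and that an alarm is raised when the windowed score drops strictly below the threshold; both are assumptions of this illustration.

```python
def windowed_scores(reports, window_s):
    """Average score over all reports received within the last `window_s` seconds.

    `reports` is a time-ordered list of (t_seconds, is_authentic) pairs.
    Returns a list of (t, score) with the window (t - window_s, t].
    """
    out = []
    start = 0
    authentic_in_window = 0
    for i, (t, is_authentic) in enumerate(reports):
        authentic_in_window += int(is_authentic)
        # Drop reports that fell out of the backward-looking window.
        while reports[start][0] <= t - window_s:
            authentic_in_window -= int(reports[start][1])
            start += 1
        out.append((t, authentic_in_window / (i - start + 1)))
    return out


def zero_fpr_threshold(normal_tracks, window_s):
    """Lowest windowed score observed on any attack-free track."""
    return min(s for track in normal_tracks for _, s in windowed_scores(track, window_s))


def first_alarm(reports, window_s, threshold):
    """Time of the first report whose windowed score undercuts the threshold, else None."""
    for t, score in windowed_scores(reports, window_s):
        if score < threshold:
            return t
    return None
```

Because the threshold equals the lowest score seen on the attack-free tracks, none of those tracks can undercut it, which is why the false positive rate is 0% by construction for the considered tracks.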

In Table VII, we list the GPS spoofing detection performance considering different deviations and time windows. We analyzed the attack detection rate, i.e., the number of detected attacks compared to all tested runs and the detection delay, i.e., the time at which we observed the threshold violation and raised an alarm. We additionally state the median and the standard deviation. Bold entries mark the best results in each row. We want to highlight that for every configuration the FPR is 0% due to how the threshold is chosen.

표 VII에서, 우리는 다양한 편차와 타임 윈도우를 고려하여 GPS spoofing 탐지 성능을 리스트했다. 우리는 공격 탐지율, 즉 테스트된 모든 실행과 비교하여 탐지된 공격들의 수와 탐지 지연, 즉 우리가 임계값 위반을 관측하고 알람을 울리는 시간을 분석했다. 우리는 추가적으로 중앙값과 표준 편차를 명시한다. 두꺼운 글씨는 각 행에서의 가장 좋은 결과를 표시한다. 우리는 임계값이 선택되는 방식으로 인해 모든 구성에서 FPR이 0%인 것을 강조하고자 한다.

With increasing deviation $\alpha$, the attack detection rate reaches up to approx. 99.5%. An attack counts as detected when the threshold is undercut within the first hour after the launch of the attack. The remaining 0.5% that were not detected are due to very slow or even parked aircraft. The impact of GPS spoofing becomes negligible in such scenarios considering how we simulated it. The rest of the deviated aircraft tracks are detected with a very high probability. The detection delay strongly depends on the applied deviation $\alpha$. For higher values, the average detection delay can go as low as approx. 6 min, with standard deviations of around 8 min. The time window $w$ also impacts the performance. Implementing different time windows is beneficial since the best attack detection rate and detection delay depend on the applied deviation $\alpha$.

증가하는 편차에 따라서, 공격 탐지는 대략 99.5%까지 도달한다. 공격은 공격이 시작된 후의 첫 한 시간 이내에 임계값이 감소하면 탐지된 것으로 간주한다. 탐지되지 않은 0.5%가 누락된 것은 매우 느리거나 주차된 항공기 때문이다. 이러한 시나리오에서는 시뮬레이션 방식을 고려할 때, GPS spoofing의 영향은 무시할 만 하다. 나머지 경로를 벗어난 항공기 트랙들은 매우 높은 확률로 탐지된다. 탐지 지연은 적용된 편차 $\alpha$ 에 매우 의존한다. 값이 높을수록, 평균 탐지 지연은 약 6분 표준 편차로는 8분 정도로 낮아질 수 있다. 타임 윈도우 $w$ 는 또한 성능에 영향을 준다. 다양한 타임 윈도우의 구현은 가장 좋은 공격 탐지율과 탐지 지연이 적용된 편차 $\alpha$ 에 따라 다르기 때문에 유용하다.
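The metrics reported in Table VII can be computed from per-track alarm times as in the following sketch. The one-hour detection horizon and $t_{attack}$ = 15 min come from the text; the use of the population standard deviation and the dictionary keys are assumptions.

```python
from statistics import mean, median, pstdev


def detection_summary(alarm_times, t_attack_s=15 * 60, horizon_s=60 * 60):
    """Summarize detection rate and delay over attacked tracks.

    `alarm_times` holds one entry per attacked track: the alarm time in seconds
    since the start of the track, or None if no alarm was raised. An attack
    counts as detected only if the alarm falls within `horizon_s` after t_attack.
    """
    delays = [
        t - t_attack_s
        for t in alarm_times
        if t is not None and 0 <= t - t_attack_s <= horizon_s
    ]
    return {
        "rate": len(delays) / len(alarm_times),
        "mean_delay_s": mean(delays) if delays else None,
        "median_delay_s": median(delays) if delays else None,
        "std_delay_s": pstdev(delays) if delays else None,
    }
```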

2) ADS-B Spoofing:
For the evaluation of the ADS-B spoofing detection performance, we specifically focus on the outcome of the cross check. Since an attacker is able to generate arbitrary reports, we assume that an attacker can successfully remain undetected by the sanity, differential, and dependency check. Considering the testing set for the cross check, we take the same sampled aircraft tracks from the GPS spoofing evaluation but apply ADS-B spoofing according to Section IV. At time $t_{attack}$ , the attacker launches the spoofing attack representing a scenario where an aircraft track would normally end, but is continued by fake injections into the system. We distinguish between three scenarios depending on the targeted number of sensors (see Figure 6). Notably, we use a classifier that is trained with samples from normal operation and simulated samples from ADS-B spoofing.

2) ADS-B Spoofing
ADS-B spoofing 탐지 성능의 평가를 위해서, 우리는 특히 상호검증의 결과에 집중한다. 공격자가 임의의 보고서들을 생성할 수 있기 때문에, 우리는 공격자가 성공적으로 온전성, 차별성, 의존성 검사에 탐지되지 않았다고 가정한다. 상호검증을 위한 테스트 세트를 고려해보면, 우리는 동일한 GPS spoofing 평가를 위해 표본화된 항공 트랙을 사용한다. 하지만, 섹션 IV에 서 보듯이 ADS-B spoofing에 적용한다. $t_{attack}$ 시간에서, 공격자는 항공 트랙이 정상적으로 끝났지만, 시스템에 가짜 주입으로 인해 계속 진행되고 있는 시나리오를 의미하는 spoofing attack를 실행한다. 우리는 타겟이 된 센서들의 수(그림 6 참조)를 통해서 세 가지 시나리오들을 구분한다. 우리는 정상 수행의 표본과 ADS-B spoofing의 시뮬레이션 표본으로 학습된 분류기를 사용한다는 점에 유의해라.

Results. The resulting average scores of all three scenarios are depicted in Figure 9. One can see that the score for normal operation is very close to 1, while any form of ADS-B spoofing drastically reduces the average score across all 1000 runs. This change occurs almost immediately after the attack has been launched, and the score continues to decrease afterwards. Furthermore, the scenarios impact the scores differently. From an attacker's perspective, injecting reports via multiple but not all sensors is superior to the other strategies.

결과.
모든 세 시나리오들의 평균 점수의 결과는 그림 9에 묘사되어있다. 어떤 것들은 1000번의 수행 동안에 평균 점수가 급작스럽게 하락하는 반면에, 정상 수행에 대한 점수는 1에 매우 가깝다는 것을 볼 수 있다. 이러한 변화는 거의 공격이 시작된 이후에 즉시 발생하고 점점 감소한다. 게다가, 시나리오들은 점수들에 다른 영향을 준다. 공격자의 관점에서는, 모든 센서들이 아닌 복수개의 센서들에게 가짜 보고서들을 주입하는 것은 다른 모든 전략들보다 우월하다.

We argue that even an optimized attacker strategy cannot emulate typical reception patterns by only affecting specific sensors. Since sensors are geographically distributed at unknown positions, an attacker cannot systematically control which and how many sensors receive the fake reports. Eventually, an attacker needs to broadcast from a location close to the claimed position to emulate realistic message reception patterns, virtually becoming a legitimate broadcast from the advertised position.

Even when targeting multiple sensors, constantly missing reports from sensors within the reception range is a strong indication of some kind of injection. Naturally, the number of sensors observing the cluster where the injection takes place impacts the significance. The patterns show less variation when fewer sensors are in operation, and the differences from malicious patterns are less obvious. Figure 10 shows the average score in relation to the number of observing sensors. With only three sensors, the attacker can remain undetected in more cases than in clusters with a sensor coverage of 10, 30, or 50.
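
The effect of sensor coverage can be illustrated with a small sketch. It is not the paper's ML classifier: it assumes, purely for illustration, that every sensor in a cluster receives a genuine report independently with the same probability (`p=0.8` is a hypothetical value), and computes how likely it is that only the few targeted sensors report a message while all others miss it.

```python
from math import prod

def pattern_likelihood(p_receive, received):
    """Probability of one reception pattern, assuming independent sensors.

    p_receive: per-sensor reception probabilities for the cluster (hypothetical values).
    received:  booleans, True where the sensor reported the message.
    """
    return prod(p if r else 1.0 - p for p, r in zip(p_receive, received))

def spoofed_pattern_likelihood(n_sensors, n_targeted, p=0.8):
    # The attacker reaches only `n_targeted` sensors; all others miss the report.
    received = [i < n_targeted for i in range(n_sensors)]
    return pattern_likelihood([p] * n_sensors, received)

# Likelihood of the spoofed pattern under normal operation for growing coverage.
for n in (3, 10, 30, 50):
    print(n, spoofed_pattern_likelihood(n, n_targeted=2))
```

Under these assumptions the spoofed pattern becomes rapidly less plausible as more benign sensors observe the cluster, which matches the qualitative trend described for Figure 10.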

3) Sensor Control/Sybil Attack:
To evaluate our detection performance for sensor control/Sybil attacks, we again focus on the outcome of the cross check. We consider two scenarios with different numbers of compromised sensors: a single sensor, or as many attacker-controlled sensors as there are sensors already observing that specific airspace. Notably, the attacker's sensors initially participate normally and are already considered when message reception patterns are trained. After $t_{attack}$, the attacker starts to use the controlled sensors to inject an aircraft track. Compared to our assumptions for ADS-B spoofing, the attacker is now capable of emulating arbitrary reception patterns using all the controlled sensors, while benign sensors within the same cluster remain unaffected.

Results. The results are very similar to the ADS-B spoofing results. The impact on the score is immediate and can be clearly distinguished from normal behavior. The similar results stem from the benign sensors that are unaffected by the attacker. A message injection from the controlled sensors represents the very unlikely case of a high number of benign sensors missing the same message. The detection of Sybil attacks is hence based on missing reports rather than on all sensors agreeing on the same message. Figure 10 can be applied to this scenario when considering the sensor coverage of only the uncompromised sensors.

Nevertheless, some limitations need to be highlighted. If the attacker controls every sensor for one cluster, arbitrary patterns can be emulated and we have no chance of detecting the attack. However, as soon as the attacker tries to inject reports for clusters that are already observed by sensors, the attack can be detected. The vast majority of airspace is already observed by at least one sensor (see Table IX). We argue that as long as the majority of benign sensors operate normally, the attack can still be detected.

4) Combined Attacks:
Thus far, we have evaluated the detection performance of individual attacks, i.e., GPS spoofing, ADS-B spoofing, and sensor control/Sybil attacks. We now analyze whether any attack combination can increase the attacker's chance of remaining undetected. Notably, sensor control is superior to ADS-B spoofing, since a fully compromised sensor can not only inject any form of false ADS-B reports (as is the case for ADS-B spoofing) but also drop any other messages the sensor may receive. Hence, ADS-B spoofing can be considered a subset of the sensor control/Sybil attack class. The success of their combination is upper bounded by the success of an attacker who instead also controls the sensors affected by ADS-B spoofing. While an attacker controlling a subset of sensors may still decide to additionally spoof other sensors, the detection performance is closely tied to the number of benign sensors.

We focus on reports affected by GPS spoofing and ADS-B spoofing at the same time, i.e., a fake GPS track that is injected via ADS-B spoofing. We set the deviation $\alpha$ to 5 $\degree$ and assume that the attacker injects the track by spoofing multiple sensors. We consider the impact on the detection performance from two different directions. Figure 11 shows the change based on a classifier that is indicative of GPS spoofing. Figure 12 depicts the other perspective, where the ADS-B spoofing classifier evaluates the attack combination.

Results. Comparing the detection performance for fake GPS spoofing reports with and without additional ADS-B spoofing, one can clearly notice the sudden drop in score due to the ADS-B spoofing in the combination. Over the course of 30 min, the average score is constantly lower, rendering the combination unfavorable for the attacker. Surprisingly, from the perspective of ADS-B spoofing, the attack combination actually results in slightly higher scores, and the effect increases over time. It seems that a combination favors the attacker. However, the score differences are due to a change that is not reflected in the figure: by additionally manipulating the GPS positions, the fake track approaches edge areas faster. These areas are observed by fewer sensors, and hence the classification loses significance (compare Figure 10). As long as enough benign sensors are unaffected, no attack combination favors the attacker.

5) From Single Reports to Moving Tracks:
In our evaluation, we linked the classification results of individual reports to make a decision for an entire aircraft track. While single reports may be falsely classified as malicious, time windowing mitigates this effect. The trained models for different clusters are separate, and some may be more concise than others. A fact that facilitates our detection scheme is the intrinsic movement of aircraft, such that a track traverses many different clusters over its course. As a result, the combined decisions of multiple clusters benefit from clusters with higher sensor coverage, eventually yielding a very high classification performance even when clusters are involved that are hard to decide.
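
A minimal sketch of how per-report results could be linked into a track-level decision by time windowing. The names `windowed_scores` and `track_is_suspicious`, the score range of 0 to 1, and the threshold value are assumptions made for illustration; the text does not specify the exact aggregation.

```python
from collections import deque

def windowed_scores(reports, window_s):
    """Yield (timestamp, mean score of the reports in the trailing time window).

    reports: (timestamp_s, score) pairs sorted by timestamp; scores near 1 mean
    "looks normal", scores near 0 mean "looks malicious" (assumed convention).
    """
    window = deque()
    total = 0.0
    for t, score in reports:
        window.append((t, score))
        total += score
        # Drop reports that are older than the window length.
        while window[0][0] < t - window_s:
            _, old_score = window.popleft()
            total -= old_score
        yield t, total / len(window)

def track_is_suspicious(reports, window_s, threshold):
    """Flag the track if any windowed mean score falls below the threshold."""
    return any(mean < threshold for _, mean in windowed_scores(reports, window_s))
```

A single misclassified report lowers the windowed mean only briefly, whereas a sustained run of low scores pulls it below the threshold.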

2. Attack Analysis: Type of Attack

So far, we have used a different classifier for each considered attack vector. The type of attack can be trivially determined by the classifier that indicated the attack. We neglected the possibility that classifiers, e.g., tailored towards GPS spoofing detection, may also raise an alarm when faced with ADS-B spoofing, and vice versa. Note that when no attack is applied, no classifier yields a false alarm due to the way we set our thresholds. We now analyze whether we can tell attack patterns apart. To evaluate the ability to differentiate between our simulated attacks, we transform the binary classification into a multiclass classification that decides the type of attack. We trained a DT classifier with reports from GPS spoofing and ADS-B spoofing. Since both attacks have multiple configurations, we chose a deviation of 20 $\degree$ for GPS spoofing and multiple affected sensors for ADS-B spoofing. We apply a time window of $w = 15$ min and evaluate the result at $t_{attack} + 30$ min. Figure 13 depicts the confusion matrix of the classification results.

Results. Considering aircraft tracks without any attack modification applied, the combined classifier yields no false classifications. For GPS spoofing with $\alpha = 20\degree$, 78.5% of the randomized runs are detected and correctly identified, while 13.9% are still considered normal. Approximately 7.6% of the cases are classified as ADS-B spoofing. In comparison, 85.4% of ADS-B spoofing tracks are classified correctly, 4.2% are decided to be normal, and 10.4% are confused with GPS spoofing. Our classifier struggles with this separation due to the similar impact on reception patterns in the early phases of GPS spoofing. All in all, the majority of attacks were correctly assigned and separated.
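
The percentages above can be arranged as a confusion matrix and checked for consistency. The sketch below uses only the values quoted in the text. The derived "flagged as attack" rate (everything not classified as normal) is an additional reading of those numbers and is not a figure the text reports itself.

```python
LABELS = ("normal", "gps_spoofing", "adsb_spoofing")

# Rows: true class; columns: predicted class in LABELS order, in percent.
CONFUSION = {
    "normal":        (100.0, 0.0, 0.0),
    "gps_spoofing":  (13.9, 78.5, 7.6),
    "adsb_spoofing": (4.2, 10.4, 85.4),
}

def detection_rate(true_label):
    """Share of runs flagged as any attack, regardless of the assigned type."""
    row = dict(zip(LABELS, CONFUSION[true_label]))
    return 100.0 - row["normal"]

for label, row in CONFUSION.items():
    # Every row of a percentage confusion matrix must sum to 100.
    assert abs(sum(row) - 100.0) < 1e-9
    correct = row[LABELS.index(label)]
    print(f"{label}: correctly identified {correct:.1f}%, "
          f"flagged as attack {detection_rate(label):.1f}%")
```

Read this way, about 86.1% of the GPS spoofing runs and 95.8% of the ADS-B spoofing runs raise some alarm, even when the attack type is not assigned correctly.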

3. Attack Analysis: Affected Sensors

We generally differentiate between sensors that fell victim to an attack themselves and sensors that are actively collaborating in it. For instance, in a GPS or ADS-B spoofing attack, sensors may be faced with bogus input data. However, they are still functioning correctly and otherwise conform to their intended behavior. While for GPS spoofing attacks the reception patterns reflect normal behavior (but for a different message origin than claimed), the reception patterns for ADS-B spoofing attacks are altered. When our attack analysis reveals that the attack is of the latter type, the reporting sensors may be disconnected from the network and excluded from the cross-checking procedure of other reports. These sensors are directly affected by the attack and their recordings cannot be trusted. However, once the attack is concluded, the identified sensors may be reactivated to contribute to the network again.

On the other hand, if the attack analysis reveals a sensor control/Sybil attack, we are faced with compromised sensors actively launching attacks on the network. All sensors that reported the reception of identified fake reports need to be considered as part of an attacker-controlled sensor union. Any shared reports from such sensors cannot be considered trustworthy. Their participation in the crowdsourcing network must be shut down and their forwarded reports filtered out accordingly to recover the integrity of the network.

4. Impact of Grid Resolution

The resolution of the underlying grid determines how reports and sensors are assigned to a cluster $C_j$. The higher the grid resolution, the finer the differentiation between regions and, eventually, their reception patterns. However, increasing the grid resolution not only increases the computational load but can also lead to overfitting areas to the monitoring sensors. For instance, since we do not know the exact locations of sensors, we need to learn the observed area from reported ADS-B messages. The chance that a sensor did not report any message from a specific area increases with smaller cell sizes, even though the sensor actually observes that airspace. While we chose a grid size with edge lengths of 10 km to compare the attack detection performance, we also evaluated the impact of different grid resolutions and gained the following insights.
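
The grid assignment can be sketched as follows. The projection is an assumption made for illustration: a simple equirectangular approximation around a hypothetical reference latitude of 50 degrees. The text does not state how the 10 km cells are actually laid out. The helper names `cell_index` and `sensors_per_cell` are also hypothetical.

```python
import math

KM_PER_DEG = 111.32  # approximate length of one degree of latitude in km

def cell_index(lat, lon, cell_km=10.0, ref_lat=50.0):
    """Map a reported position to a (row, col) grid cell index."""
    y_km = lat * KM_PER_DEG
    # Longitude degrees shrink with latitude; a fixed reference keeps cells simple.
    x_km = lon * KM_PER_DEG * math.cos(math.radians(ref_lat))
    return math.floor(y_km / cell_km), math.floor(x_km / cell_km)

def sensors_per_cell(reports, cell_km=10.0):
    """Learn which sensors observe which cell from (lat, lon, sensor_id) reports."""
    cells = {}
    for lat, lon, sensor_id in reports:
        cells.setdefault(cell_index(lat, lon, cell_km), set()).add(sensor_id)
    return cells
```

Because a sensor is only associated with cells from which it has reported at least one message, shrinking `cell_km` makes it more likely that an observed cell ends up without that sensor, which is the overfitting risk described above.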

Results. The larger the extent of a cluster, the more sensors potentially observe at least parts of the area. As a consequence, the reception patterns feature more active sensors and have a higher variance within the same cluster. However, this also makes it harder to draw a clear distinction between normal operation and malicious patterns. On the other hand, clusters with very small areas prevent the estimation of meaningful reception patterns and thus also decrease the validity. Since the attack detection performance is related to the differences in the reception patterns, we determined a reasonable trade-off between sensitivity and generalization, which resulted in the grid resolution of 10 km.

5. Time Dependency

To evaluate the time dependency of our detection scheme, we additionally assess its performance on a dataset gathered on February 17, 2020. This dataset represents a normal weekday, two days after the previously analyzed day. This day was chosen due to a temperature drop and rainy weather and thus represents unfavorable conditions. The number and paths of flights on this new day are similar (but not identical) to those in the previously selected dataset. During this day, the OpenSky Network recorded over 135 million ADS-B reports and 728 active sensors. The structure of the sensor network on both days overlaps strongly and shows only very minor fluctuations. The evaluation steps are kept the same as in our previous analysis, revealing the following results.

Results. Overall, the results show very little deviation from the previous results, and the extent of variation is comparable to the homogeneity of the sensor network. In particular, we present results showing the detection performance for GPS spoofing attacks in Table VIII. The results for both ADS-B spoofing and sensor control/Sybil attacks overlap with the prior results such that differences cannot be captured visually; hence, we abstain from presenting identical figures. All in all, this provides evidence that $(i)$ different flight paths, $(ii)$ varying airspace density, and $(iii)$ changing weather conditions only slightly influence the detection performance of our scheme, indicating its robustness against these parameters.

6. Computational Performance

The implementation of the ML-based cross check imposed the challenge of handling more than 132 million reports from more than 700 sensors, just for a single day and only in Europe. With this massive amount of data, training on the entire dataset became infeasible on off-the-shelf equipment. To bring down the required time for training and classification, we decided to split the data into grids, where the data in each grid can be processed separately. Moreover, the training duration is a one-time cost and was feasible on standard hardware. If implemented on a dedicated server, the required time is expected to be lowered by orders of magnitude. As a result, even retraining on a regular basis becomes possible. The recurring costs of classification, on the other hand, are only a minor fraction of the training duration: all classifications for an entire day took only a few minutes and can thus be performed efficiently in real time.
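
The benefit of splitting the data by grid cell is that each cell can be handled on its own. The sketch below illustrates this with a deliberately simple stand-in for training: instead of the ML classifier used in the paper, it estimates per-cell reception frequencies. The function name `reception_rates_by_cell` and the tuple layout of the input are assumptions.

```python
from collections import defaultdict

def reception_rates_by_cell(observations):
    """Estimate per-sensor reception frequencies separately for every cell.

    observations: iterable of (cell, message_id, sensor_id) tuples.
    Returns {cell: {sensor_id: fraction of the cell's messages that sensor received}}.
    """
    by_cell = defaultdict(lambda: defaultdict(set))
    for cell, message_id, sensor_id in observations:
        by_cell[cell][message_id].add(sensor_id)

    rates = {}
    for cell, messages in by_cell.items():  # each cell is processed independently
        counts = defaultdict(int)
        for sensors in messages.values():
            for sensor_id in sensors:
                counts[sensor_id] += 1
        rates[cell] = {s: n / len(messages) for s, n in counts.items()}
    return rates
```

Since the loop body for one cell never touches data from another cell, the per-cell work could be distributed across processes or machines without coordination.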

Discussion

We discuss important properties of our developed system: $(i)$ implicit trust in the data source, $(ii)$ limitations, $(iii)$ attacker's knowledge, $(iv)$ false alarm events, $(v)$ the current attack resilience, $(vi)$ optimized sensor deployment, and $(vii)$ further extensions.

1. Implicit Data Source Trust

We base the evaluation of our trust system on data provided by the OpenSky Network, which records real-world air-traffic reports. However, we take the data "as is" and consider it to represent normal behavior. We cannot exclude the existence of erroneous data or even reports that resulted from some kind of attack. Nevertheless, we thoroughly analyzed the reports of our selected day (February 15, 2020) without any findings. While our system is designed to analyze live data, it can also be used retrospectively to find unusual events and potential attacks in recorded air-traffic reports.

2. Limitations

While we state that our system can detect all considered attacks (i.e., GPS spoofing, ADS-B spoofing, and sensor control/Sybil attack), our system is subject to limitations. Independent of the attack, any verification can only be applied in covered airspaces (see Figure 3), which excludes, e.g., the open sea. For the cross check, we further require at least three sensors to yield meaningful results. Given these requirements, we achieved detection delays on the order of minutes, which is a limiting factor in situations where fast reactions are required. We tuned our system towards minimal false alarm events, which requires us to delay decisions. Allowing the existence of false alarms can significantly lower this delay.

Some limitations are specific to the type of attack, as explained below:

1) GPS Spoofing:
The limitations of GPS spoofing detection are based on the extent of the applied deviation and the grid resolution. With a finer grid resolution, more subtle deviations can be detected. However, the resolution can only be increased to a certain degree. Based on our simulations, a resolution of 10 km was identified as a good choice. Fixing the grid resolution to 10 km, we consider our system to reliably detect more than 96% of GPS spoofing attacks with a deviation of at least 5 $\degree$ . Smaller deviations can only be detected with lower probability or after significantly more time.

2) ADS-B Spoofing:
When facing an ADS-B spoofing attack, the detection capability of our system requires the positions of sensors to remain concealed such that an attacker cannot selectively target individual sensors with, e.g., multiple antennas. If an attacker can pinpoint sensors to emulate realistic reception patterns, our system would not be able to detect malicious injections.

3) Sensor Control/Sybil Attack:
Naturally, an attacker controlling every sensor could overcome any verification scheme due to full control over reported data. Our detection system relies on the existence of benign sensors. In an area with active malicious sensors, we require at least three benign sensors to be able to detect the attack. Notably, we do not consider any form of identity spoofing, in which reports are injected with sensor identities without any control over the indicated sensors. This must be prevented on other layers.

In circumstances that stay within these limitations, our detection scheme achieves the stated performance figures. Outside the limitations, the performance may be heavily degraded. Fortunately, areas where the number of sensors is a limitation are constantly shrinking due to increasing sensor coverage (see Section VI-E).

3. Attacker's Knowledge

In our performance analysis of detecting ADS-B spoofing and Sybil attacks, we considered attackers controlling a certain number of sensors. An attacker with full awareness of our system might try to optimize the pursued attack strategy and imitate authentic reception patterns. For both ADS-B spoofing and Sybil attacks, this can only be achieved to a certain degree, and we argue that an attacker cannot overcome the detection scheme in regions with enough sensor redundancy. Even a fully aware attacker does not know the exact locations of other sensors, and hence it is not possible to manipulate them in a targeted manner (e.g., through ADS-B spoofing). Moreover, an attacker cannot access the unprocessed readings of other sensors in an effort to localize them. In the case of ADS-B spoofing, where an attacker affects multiple sensors, the actual victims cannot be targeted separately. In the case of a Sybil attack, the attacker could try to emulate realistic reception patterns using the controlled sensors, but cannot do so with the sound user-operated sensors. The better a cluster is covered by benign sensors, the more conspicuous an attack will be. We therefore argue that even an attacker who is fully aware of our system cannot overcome the detection scheme, due to the concealed locations of other sensors.

4. False Alarm Events

We acknowledge that any false alarm event, i.e., a falsely detected attack, greatly hinders the acceptance of our developed system. Especially when considering safety-related air-traffic surveillance, false alarm events would distract air-traffic controllers, leading to the opposite of what we wanted to achieve. With our choice of thresholds, we obtained 0% false positives over a dataset of 1000 randomly sampled tracks. Admittedly, this does not guarantee the absence of false alarms. However, our system can be tuned with updated thresholds and time windows if false alarms arise. Even for broader thresholds, we expect meaningful attack detection rates within reasonable delays.

5. Current Attack Resilience

The crowdsourcing sensors are at the core of our trust system, and their distribution and density are of utmost importance for the detection of attacks. The validity of the cross check, i.e., wireless witnessing, increases with the number of sensors covering the same air segments. Thus, the more redundancy, the more variation exists in the reception patterns and the better malicious attacks and sensors can be detected. We analyzed the current resilience of the OpenSky Network by considering regions with different levels of coverage. Table IX breaks down the total covered area and relates it to the total surface of the European continent.

6. Optimizing Sensor Deployment

To further develop the security of the network, we encourage the deployment of new sensors in less covered areas to improve the current geographical distribution through optimized network expansion. Based on the coverage information of the existing sensors in the network (see Figure 3), we optimize the placement of new sensors with the goal of filling blind spots. Our optimization target is an overall coverage increase and therefore a hardening against attacks.

To provide an overview of areas that would benefit the most from the deployment of new sensors, we weight the need for better coverage according to the current sensor redundancy of the network. The lower the coverage, the higher the demand for new sensors. We restrict possible locations to be on land. We further assume an average reception range of 400 km and simplify the observable airspace to be a circle around the sensor. Figure 14 depicts areas according to their coverage increase for the entire network. While in Central Europe the deployment of new sensors does not significantly impact the overall resilience against attacks, new sensor setups close to the coastlines can greatly increase the attack resilience.
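
A small sketch of how candidate locations could be scored under the stated simplifications (circular coverage with a 400 km radius). The weighting `1 / (1 + coverage)` is an illustrative choice that gives poorly covered cells more weight. It is an assumption, as the text does not give the exact weighting, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def placement_gain(candidate, cells, coverage, range_km=400.0):
    """Sum the need weights of all cells a new sensor at `candidate` would cover.

    cells:    list of (lat, lon) cell centers
    coverage: current number of observing sensors per cell (same order as cells)
    """
    return sum(1.0 / (1 + c)
               for cell, c in zip(cells, coverage)
               if haversine_km(candidate, cell) <= range_km)
```

Ranking land-based candidate positions by `placement_gain` would favor locations whose 400 km circle contains many sparsely observed cells, in line with the observation that coastal deployments add more resilience than deployments in Central Europe.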

7. Extensions

We discuss three extensions of our trust system with the goal of better reflecting real-world characteristics as well as introducing sensor reputation to weight their impact on the trust assessment process. Further, dynamic learning strategies can keep attack detection strategies updated.

Time Dependence.
Since ADS-B broadcasts use the wireless medium, message collisions can occur when the frequency band is saturated. The resulting rate of message loss is dependent on the airspace density which in turn changes over time based on the operating hours of airports. The more aircraft share the same medium, the higher the chances are of messages being lost. While our current system estimates reception probabilities based on averaged one-day observations, a future extension of our trust system may account for time-dependent message loss.

ADS-B 방송이 무선 매체를 사용하기 때문에, 주파수 대역이 포화될 때 메시지 충돌이 발생할 수 있다. 메시지 손실률은 영공 밀도에 따라 달라지며, 이는 공항의 운영 시간에 기반하여 시간이 지남에 따라 변화한다. 더 많은 항공기가 동일한 매체를 공유할수록, 메시지들이 손실될 가능성이 높아진다. 우리의 현재 시스템은 하루 동안의 평균 관측을 기반으로 수신 확률을 추정하지만, 우리 신뢰 시스템의 향후 확장은 시간에 따라 달라지는 메시지 손실을 고려할 수 있다.
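
To make the difference between a one-day average and a time-dependent estimate concrete, here is a minimal sketch. The input format, a list of `(hour, received)` samples for one sensor and one airspace segment, and both function names are assumptions for illustration. The paper does not specify how such an extension would be built.

```python
from collections import defaultdict


def daily_reception_probability(observations):
    """Single averaged estimate that ignores the time of day."""
    return sum(int(received) for _, received in observations) / len(observations)


def hourly_reception_probability(observations):
    """Estimate P(received) separately for each hour of the day."""
    sent = defaultdict(int)
    received_count = defaultdict(int)
    for hour, received in observations:
        sent[hour] += 1
        received_count[hour] += int(received)
    return {hour: received_count[hour] / sent[hour] for hour in sent}


# Illustrative samples: little loss at night, more loss during the evening peak.
observations = (
    [(3, True)] * 9 + [(3, False)] * 1
    + [(18, True)] * 6 + [(18, False)] * 4
)
print(daily_reception_probability(observations))
print(hourly_reception_probability(observations))
```

With these made-up samples the daily average sits between the quiet-hour and peak-hour values. A test tuned to the average would therefore expect too few messages at night and too many during the peak.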

Sensor Reputation.
In the currently deployed crowdsourcing network, we consider each sensor as equivalent to any other sensor. To refine this assumption, sensors may be assigned a reputation rating. A portion of the sensors are operated by personal contacts or registered users. Those sensors are expected to be less likely to participate in active attacks, and we could link the reputation of an operator to the sensors they operate. Furthermore, the hardware implementation could also be taken into account, as some implementations are more robust to defects than others. By incorporating sensor reputation, the validity of distinguishing normal behavior from attack scenarios could be further improved.

현재 배포된 크라우드 소싱 네트워크에서는 각 센서를 다른 센서와 동등한 것으로 간주한다. 이러한 가정을 개선하기 위해서, 센서들에 평판 등급을 할당할 수 있다. 센서들의 일부는 지인 또는 등록된 사용자에 의해서 운영된다. 이러한 센서들은 능동적인 공격에 참가할 가능성이 낮을 것으로 예상되며, 운영자의 평판을 그가 보유한 센서들에 연결할 수 있다. 또한, 일부 구현이 다른 구현보다 결함에 더 강하므로 하드웨어 구현도 고려할 수 있다. 센서 평판을 통합함으로써, 정상 행동과 공격 시나리오를 구분하는 것의 유효성을 더욱 향상시킬 수 있다.
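
One way to picture sensor reputation is as a weight on each sensor's contribution when witnesses are combined. The sketch below is an assumption-laden illustration, not the scheme from the paper: the `weighted_support` function, the verdict format, the reputation scale, and the default weight for unknown sensors are all hypothetical.

```python
def weighted_support(verdicts, reputation, default=0.5):
    """Reputation-weighted share of sensors whose observation supports a report.

    verdicts:   dict mapping sensor id -> True if the sensor supports the report
    reputation: dict mapping sensor id -> non-negative weight (hypothetical scale)
    default:    weight used for sensors without a known reputation
    """
    total = sum(reputation.get(sensor, default) for sensor in verdicts)
    if total == 0:
        raise ValueError("no sensor carries any weight")
    agreeing = sum(
        reputation.get(sensor, default)
        for sensor, supports in verdicts.items()
        if supports
    )
    return agreeing / total


# Two sensors run by known operators agree; one anonymous sensor disagrees.
verdicts = {"known-1": True, "known-2": True, "anonymous": False}
reputation = {"known-1": 1.0, "known-2": 1.0}
print(weighted_support(verdicts, reputation))
```

Giving every sensor the same weight reduces this to the plain fraction of agreeing sensors, which corresponds to the current assumption that all sensors are equivalent.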

Dynamic Learning.
Finally, we envision the implementation of dynamic learning techniques. A dynamic learning approach could constantly update the trained message reception patterns. This allows the system to incorporate shifts that can occur when, e.g., sensors join or leave the network, the reception range of sensors changes, or transmission ranges are altered. Moreover, new attack vectors may arise in the future. (Re-)training our classifiers with updated attack vector definitions ensures that the trust evaluation process remains valid when facing currently unknown attacks.

마지막으로, 우리는 동적 학습 기술들의 구현을 구상한다. 동적 학습 방식은 학습된 메시지 수신 패턴들을 지속적으로 업데이트할 수 있다. 이것은 센서들이 네트워크에 가입하거나 떠나거나, 센서의 수신 범위가 변경되거나, 전송 범위가 변경될 때 발생할 수 있는 변화를 통합할 수 있게 한다. 또한 향후 새로운 공격 벡터가 발생할 수 있다. 업데이트된 공격 벡터 정의로 분류기를 (재)학습하면 현재 알려지지 않은 공격에 직면했을 때 신뢰 평가 프로세스의 유효성을 유지할 수 있다.
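
As a rough illustration of continuously updating a learned reception pattern, the sketch below uses an exponential moving average. This technique is swapped in purely for illustration. The paper does not name a specific update rule, and the function name and the `rate` parameter are assumptions.

```python
def update_reception_probability(current, received, rate=0.1):
    """Move the stored reception probability a small step toward the newest observation.

    current:  previously learned probability in [0, 1]
    received: True if the expected message was received this time
    rate:     hypothetical learning rate; larger values adapt faster but are noisier
    """
    observation = 1.0 if received else 0.0
    return (1.0 - rate) * current + rate * observation


# A sensor that used to receive 90% of messages stops covering an airspace segment.
probability = 0.9
for _ in range(50):
    probability = update_reception_probability(probability, received=False)
print(probability)
```

After enough missed messages the stored probability decays toward zero. The learned pattern can thus follow a sensor that leaves the network or loses range without a full retraining run.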

8. Related Work

This paper is partly based on the work by Raya et al., who were the first to propose a framework for data-centric trust establishment with a focus on short-lived associations in volatile environments, and on resulting work approaching distributed sensor events. While our proposal for trust establishment specifically targets ADS-B-based air-traffic surveillance, similar trust requirements exist in Vehicular Ad Hoc Networks (VANETs) or industrial wireless sensor networks. While Petit et al. discuss detection systems for VANETs based on dynamic thresholds, Ruj et al. focus on validating message consistency to identify misbehavior. Whereas Sun et al. present a trust framework to detect faulty data in VANETs, Hundman et al. apply similar data verification schemes for spacecraft. Dastner et al. classify military aircraft based on their ADS-B report trace. Wang et al. analyze the feasibility of false data filtering in general sensor networks, and Henningsen et al. focus especially on industrial networks. In comparison, our system is tailored towards a network of geographically distributed sensors.

이 논문은 변동성이 큰 환경에서의 단명한 연관성과 그 결과로 발생하는 분산형 센서 이벤트에 접근하는 작업에 초점을 맞춘 데이터 중심 신뢰 구축 프레임워크를 최초로 제안한 Raya 등의 연구를 부분적으로 기반으로 한다. 신뢰 구축에 대한 제안은 특히 ADS-B 기반 항공 교통 감시를 대상으로 하지만, 차량 애드혹 네트워크(VANET) 또는 산업용 무선 센서 네트워크에서도 유사한 신뢰 요구 사항이 존재한다. Petit 등은 동적 임계값을 기반으로 한 VANET의 탐지 시스템에 대해 논의하는 반면, Ruj 등은 잘못된 행동을 식별하기 위한 메시지 일관성을 검증하는 데 중점을 둔다. Sun 등은 VANET의 결함 있는 데이터를 탐지하기 위한 신뢰 프레임워크를 제시하는 반면, Hundman 등은 우주선에 대해 유사한 데이터 검증 체계를 적용한다. Dastner 등은 ADS-B 보고서 추적을 기반으로 군용기를 분류한다. Wang 등은 일반 센서 네트워크에서 허위 데이터 필터링의 가능성을 분석하고 Henningsen 등은 특히 산업 네트워크에 초점을 맞춘다. 이에 비해 우리의 시스템은 지리적으로 분산된 센서 네트워크에 맞게 조정된다.

While in practice still vulnerable, the insecurity of ADS-B has long been highlighted from an academic perspective. Purton et al. analyzed critical information flows and focused primarily on technical solutions. They applied a qualitative assessment method that identified potential shortcomings. In contrast, McCallie et al. applied a risk analysis to assess the impact of different attack vectors and recommended solutions to be incorporated into the ADS-B implementation plan. Moreover, Strohmeier et al. provide an overview of system-inherent problems and illustrate the security challenges of ADS-B in future air-traffic monitoring. Smith et al. empirically analyze pilots' reactions to wireless attacks on avionic systems and show that undetected attacks can lead to dangerous distractions. There are several open attacks targeting ADS-B on different levels. Chevrot et al. present a framework for arbitrary false data injection and outline detection strategies. Nevertheless, we must always consider the necessary effort for an attack and its feasibility in a real-world scenario.

실제로는 여전히 취약하지만 ADS-B의 불안정성은 오랫동안 학계의 관점에서 강조되어 왔다. Purton et al. 는 중요한 정보 흐름을 분석하고 주로 기술적 솔루션에 초점을 맞췄다. 그들은 잠재적 단점을 식별하는 정성적 평가 방법을 적용했다. 반면, McCallie et al. 는 위험 분석을 적용하여 다양한 공격 벡터의 영향을 평가하고 ADS-B 구현 계획에 통합할 솔루션을 권장했다. 또한 Strohmeier et al. 은 시스템 고유의 문제에 대한 개요를 제공하고 향후 항공 교통 모니터링에서 ADS-B의 보안 과제를 설명한다. Smith et al. 는 항공 전자 시스템에 대한 무선 공격에 대한 조종사의 반응을 경험적으로 분석하고 탐지되지 않은 공격이 위험한 주의 분산으로 이어질 수 있음을 보여준다. 다양한 수준에서 ADS-B를 공격하는 여러 가지 공개 공격이 있다. Chevrot et al. 은 임의의 허위 데이터 주입을 위한 프레임워크를 제시하고 탐지 전략을 개요화한다. 그럼에도 불구하고 우리는 항상 실제 시나리오에서 공격에 필요한 노력과 그 실현 가능성을 고려해야 한다.

Moser et al. take a perspective on the feasibility of attacking ADS-B communication and consider an attacker using a multi-device setup. Recent work showed that such strong adversaries become increasingly realistic. Furthermore, Costin and Francillon demonstrated that the step from a scientific attack concept to a real attack is not necessarily too wide and managed to inject fake aircraft messages into live surveillance monitors. Later, Schafer et al. experimentally analyzed the practicability of known threats, revealing startling results. In particular, aircraft instrument landing systems are prone to wireless attacks. Besides these works, which all focus on aviation applications, Balduzzi et al. proved that maritime traffic, which relies on Automatic Identification System (AIS) broadcast messages, can also be the target of successful attacks. While the physical constraints of vehicles differ a lot, the similarity of communication channels helps to map well-known attacks to this new context.

Moser 등은 ADS-B 통신 공격의 가능성에 대한 관점을 취하고 다중 장치 설정을 사용하는 공격자를 고려한다. 최근 연구에 따르면 이러한 강력한 공격자는 점점 더 현실적으로 변한다는 사실이 밝혀졌다. 또한 Costin과 Francillon은 과학적 공격 개념에서 실제 공격으로의 단계가 반드시 너무 넓지는 않으며, 실제 감시 모니터에 가짜 항공기 메시지를 주입하는 데 성공했음을 입증했다. 이후 Schafer 등은 알려진 위협의 실용성을 실험적으로 분석하여 놀라운 결과를 공개했다. 특히 항공기 계기 착륙 시스템은 무선 공격에 취약하다. 항공 애플리케이션에 초점을 맞춘 이러한 연구 외에도 Balduzzi 등은 자동 식별 시스템(AIS) 방송 메시지를 통한 해상 교통도 성공적인 공격의 대상이 될 수 있음을 입증했다. 차량의 물리적 제약은 많이 다르지만, 통신 채널의 유사성은 잘 알려진 공격을 이 새로운 맥락에 매핑하는 데 도움이 된다.

Besides the large body of offensive work, defensive proposals exist in recent research. Strohmeier et al. survey the existing research on countermeasures. More specifically, Ghose and Lazos as well as Schafer et al. and Liu et al. propose the usage of timing or Doppler shift characteristics to detect attacks on ADS-B. While this cannot protect from attacks, it still helps to identify malicious or inaccurate messages. Other location verification schemes and anomaly detection methods are based on RADAR observations, statistical tests, or PHY layer information. Habler and Shabtai use flight route modelling and anomaly detection to identify malicious ADS-B messages, achieving a false alarm rate of 4.5 %. Similar false alarm rates are achieved by Naganawa et al. based on Angle of Arrival (AoA) measurements. Sun et al. also use AoA verification but with a single receiver.

공격적인 작업 외에도 최근 연구에서는 방어적인 제안이 존재한다. Strohmeier 등은 대응 방안에 대한 기존 연구를 조사한다. 보다 구체적으로, Ghose와 Lazos, Schafer 등과 Liu 등은 ADS-B에 대한 공격을 감지하기 위해 타이밍 또는 도플러 시프트 특성을 사용할 것을 제안한다. 이는 공격으로부터 보호할 수는 없지만 여전히 악의적이거나 부정확한 메시지를 식별하는 데 도움이 된다. 기타 위치 확인 체계와 이상 탐지 방법은 레이더 관찰, 통계 테스트 또는 PHY 계층 정보를 기반으로 한다. Habler와 Shabtai는 비행 경로 모델링과 이상 탐지를 사용하여 악의적인 ADS-B 메시지를 식별하여 4.5%의 허위 경보율을 달성한다. 도착 각도(AoA) 측정을 기반으로 Naganawa 등도 유사한 허위 경보율을 달성한다. Sun 등도 AoA 검증을 사용하지만 단일 수신기만을 사용한다.

First results based on cross-referencing within a distributed sensor network are illustrated by Strohmeier et al. Oligeri et al. use IRIDIUM signals to validate GNSS position solutions. While Wesson et al. discuss solutions based on cryptography, Kim et al. evaluate a solution based on protocol extension with timestamps. Our system, on the other hand, requires no additional measurement information different from already collected data and can thus be implemented without any modifications.

분산 센서 네트워크 내의 상호 참조를 기반으로 한 첫 번째 결과는 Strohmeier 등에 의해 설명된다. Oligeri 등은 이리듐 신호를 사용하여 GNSS 위치 솔루션을 검증한다. Wesson 등은 암호화 기반 솔루션을 논의하는 동안, Kim 등은 타임스탬프가 있는 프로토콜 확장을 기반으로 솔루션을 평가한다. 반면에 우리 시스템은 이미 수집된 데이터와 다른 추가 측정 정보가 필요하지 않으므로 수정 없이 구현할 수 있다.

Aside from ADS-B and AIS, the insecurity of GPS has been repeatedly demonstrated. Humphreys et al. were the first to publish an attack on GPS, in which they managed to spoof GPS signals. Tippenhauer et al. later analyzed the requirements of successful GPS spoofing attacks and reasoned about possible attacker positions when facing a specific sensor deployment. Zeng et al. demonstrate the insecurity of road navigation systems via a stealthy manipulation based on GPS spoofing. Considering multiple sensors, countermeasures exist for the detection of GPS spoofing attacks and also for spoofer localization. However, these countermeasures depend on ground-based sensors and do not exploit the network volatility. This limits the impact and consequences to a fraction of real-world use cases.

ADS-B 및 AIS 외에도 GPS의 불안정성은 반복적으로 입증되어 왔다. Humphreys 등은 GPS 신호를 스푸핑하는 데 성공한 GPS 공격을 최초로 발표했다. Tippenhauer 등은 나중에 성공적인 GPS 스푸핑 공격의 요구 사항을 분석하고 특정 센서 배포에 직면했을 때 가능한 공격자 위치를 추론했다. Zeng 등은 GPS 스푸핑을 기반으로 한 은밀한 조작을 통해 도로 내비게이션 시스템의 불안정성을 입증했다. 여러 센서를 고려할 때 GPS 스푸핑 공격 탐지와 스푸퍼 위치 추정에 대한 대응책도 존재한다. 그러나 이러한 대응책은 지상 기반 센서에 의존하며 네트워크 변동성을 활용하지 않는다. 이로 인해 영향과 결과는 실제 사용 사례의 일부로 제한된다.

Overall, we experience a gap between scientifically proposed defenses and deployed countermeasures. As a consequence, protecting ADS-B is an open challenge that demands scientific advances to consider the requirements and limitations of the real world.

전반적으로 우리는 과학적으로 제안된 방어와 배포된 대응 조치 사이에 괴리가 발생한다. 결과적으로 ADS-B를 보호하는 것은 현실 세계의 요구 사항과 한계를 고려한 과학적 발전을 요구하는 미해결 과제이다.

Conclusion

This work approached a trust evaluation system for ADS-B-based air-traffic surveillance using an already existing infrastructure of crowdsourcing sensors. We demonstrated how our solution leverages sensor redundancy to establish wireless witnessing to protect an otherwise unsecured open system. To this end, we tested our system against prominent attack vectors, showing that we can not only detect them but also draw conclusions about their type and the participating sensors. The validity of our trust evaluation depends on the redundancy of sensors observing the same airspace segments. Moreover, we outlined considerations for future sensor deployment, hardening the network's security by optimized expansions.

이 작업은 이미 존재하는 크라우드 소싱 센서들의 인프라를 사용하여 ADS-B 기반의 항공 교통 관제에 대한 신뢰 평가 시스템에 접근했다. 우리는 솔루션이 센서 중복성을 활용하여 보안이 확보되지 않은 개방형 시스템을 보호하기 위해 무선 증인을 설정하는 방법을 시연했다. 이를 위해 저명한 공격 벡터에 대해 시스템을 테스트하여 탐지할 수 있을 뿐만 아니라 유형과 참여 센서에 대한 결론을 도출할 수 있음을 보여주었다. 신뢰 평가의 유효성은 동일한 공역 세그먼트를 관찰하는 센서의 중복성에 따라 달라진다. 또한 향후 최적화된 확장을 통해 네트워크의 보안을 강화하기 위한 고려 사항을 요약했다.

출처 및 인용

https://www.researchgate.net/profile/Nian-Xue/publication/349522456_Trust_the_Crowd_Wireless_Witnessing_to_Detect_Attacks_on_ADS-B-Based_Air-Traffic_Surveillance/links/6034f2bc299bf1cc26e5085b/Trust-the-Crowd-Wireless-Witnessing-to-Detect-Attacks-on-ADS-B-Based-Air-Traffic-Surveillance.pdf

Jansen, K., Niu, L., Xue, N., Martinovic, I., & Pöpper, C. (2021, February). Trust the Crowd: Wireless Witnessing to Detect Attacks on ADS-B-Based Air-Traffic Surveillance. In NDSS.
