Internet-Draft | scion-research_I-D | July 2024 |
Meynell, et al. | Expires 19 January 2025 | [Page] |
TODO Abstract here¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://scionassociation.github.io/scion-research_I-D/draft-meynell-panrg-scion-research-questions.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-meynell-panrg-scion-research-questions/.¶
Discussion of this document takes place on the Path Aware Networking Research Group mailing list (mailto:panrg@irtf.org), which is archived at https://datatracker.ietf.org/rg/panrg. Subscribe at https://www.ietf.org/mailman/listinfo/panrg/.¶
Source for this draft and an issue tracker can be found at https://github.com/scionassociation/scion-research_I-D.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 19 January 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.¶
SCION is an inter-domain network architecture. The specification of its core components, as deployed by some of its early adopters, is outlined in [I-D.scion-dataplane], [I-D.scion-cppki], and [I-D.scion-cp], which are currently under ISE review.¶
The goal of this draft is to explore how SCION and its early deployments try to address open research questions in [RFC9217]. Specifically, there are still many open areas of research around path-aware networking, where SCION with its early deployment experiences and research efforts can provide a contribution. This can also be a starting point for discussions around long-term protocol evolution.¶
This draft assumes the reader is familiar with some of the fundamental concepts defined in the components specification.¶
Note: This is the very first version of the SCION research questions draft, and it merely contains a skeleton of potential topics to be further discussed in this draft. Any feedback is welcome and much appreciated. Thanks!¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The SCION protocol specifies 16 bits and 48 bits to identify the ISD and AS respectively. This identification is used, at the data-plane level, in every packet to fully address the sender and receiver, and at the control-plane level, to identify the PCB sender and hops.¶
Whilst 48 bits for the AS accommodate up to 2^48 (about 2.81475e14) assignments, which is likely to be more than sufficient for the future, 16 bits for the ISD offer only 65,536 possible assignments. Further investigation is needed into whether this is sufficient.¶
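The figures above can be checked with a short sketch. The 64-bit packing used here (ISD in the upper 16 bits, AS in the lower 48) and the helper names are assumptions made for illustration only and are not taken from the components specification.

```python
ISD_BITS = 16  # width of the ISD identifier, as stated above
AS_BITS = 48   # width of the AS identifier, as stated above


def identifier_space(bits: int) -> int:
    """Number of distinct values an identifier of the given width can take."""
    return 1 << bits


def pack_isd_as(isd: int, asn: int) -> int:
    """Combine an ISD and an AS number into a single 64-bit integer.

    Assumed layout for this sketch: ISD in the upper 16 bits, AS in the lower 48.
    """
    if not 0 <= isd < identifier_space(ISD_BITS):
        raise ValueError("ISD does not fit in 16 bits")
    if not 0 <= asn < identifier_space(AS_BITS):
        raise ValueError("AS does not fit in 48 bits")
    return (isd << AS_BITS) | asn


def unpack_isd_as(value: int) -> tuple:
    """Split a packed 64-bit value back into (ISD, AS)."""
    return value >> AS_BITS, value & (identifier_space(AS_BITS) - 1)


if __name__ == "__main__":
    print(identifier_space(ISD_BITS))  # 65536
    print(identifier_space(AS_BITS))   # 281474976710656
```

The asymmetry is the point of the question: the AS space is 2^32 times larger than the ISD space, so any growth model in which ISDs are created frequently would hit the ISD limit long before the AS limit.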
The following questions arise: (not comprehensive)¶
How many ASes do we expect in the SCION network model?¶
Can one AS belong to many ISDs?¶
Are AS numbers unique themselves? Or only unique in combination with an ISD?¶
How many ISDs do we expect?¶
What is the ontology of an ISD?¶
Note that ISDs require CORE links to other ISDs. This reduces the number of ISDs to those that have CORE ASes that can directly connect to CORE ASes in other ISDs. The number of ISDs is still super-exponential asymptotically.¶
Control servers return a large number of path segments. This can cost considerable bandwidth and network egress, while at the same time overloading clients with an unnecessarily large number of segments that mostly consist of redundant information in the form of duplicate links and hops.¶
This may be more problematic in ASes with many end hosts (e.g. IoT), or for end hosts with little computing power or little spare bandwidth.¶
Getting a full path to a remote end host may require three round-trips with the control server.¶
There are multiple possible and independent solution steps here:¶
Compression (idea suggested by Francois Wirz): Segments could be stored in a way that duplicate information (hops & links) is only stored once and the segments contain only references to the hops and links.¶
Allow queries from start AS to end AS across multiple segments. This should be very easy to implement and would be compatible with the current wire protocol (protobuf).¶
Predefine some policies that can be resolved by the control server, e.g. ANY, BEST_LATENCY, BEST_BANDWIDTH, BEST_PRICE, BEST_CO2. For these, a control server could simply calculate 5-10 good paths and return them. Moreover, these could be cached for commonly requested remote ASes. If a user requires a custom policy, they can still resort to requesting actual segments.¶
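The compression idea from the list above can be sketched as follows. This is a minimal illustration, not a proposed wire format: a hop is modelled as a hypothetical (AS label, ingress interface, egress interface) tuple, and each distinct hop is stored once in a shared table while segments carry only indices into that table.

```python
def compress_segments(segments):
    """Store each distinct hop once; segments become lists of table indices."""
    hop_table = []   # unique hops, in order of first appearance
    index_of = {}    # hop -> position in hop_table
    compressed = []
    for segment in segments:
        refs = []
        for hop in segment:
            if hop not in index_of:
                index_of[hop] = len(hop_table)
                hop_table.append(hop)
            refs.append(index_of[hop])
        compressed.append(refs)
    return hop_table, compressed


def decompress_segments(hop_table, compressed):
    """Rebuild the original segments from the table and the references."""
    return [[hop_table[i] for i in refs] for refs in compressed]


# Two made-up segments that share their first hop.
segments = [
    [("A", 0, 1), ("B", 2, 3), ("C", 4, 0)],
    [("A", 0, 1), ("B", 2, 5), ("D", 6, 0)],
]
hop_table, compressed = compress_segments(segments)
```

In this toy input, six hops shrink to five table entries plus small integer references. The saving grows with the amount of overlap between segments, which is the case described above for large segment sets.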
Doing path computation on the control server will initially increase computational cost. However, it would substantially decrease network egress. Caching of paths should reduce CPU cost, maybe even below the current cost of retrieving a large number of segments from the local database and sending them over the network interface.¶
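A rough sketch of server-side policy resolution with caching is shown below. Everything in it is an assumption made for illustration: the `CandidatePath` fields, the `PolicyPathCache` class, and the `compute_candidates` callback standing in for segment combination are hypothetical, and only three of the policy names listed above are modelled. Cache expiry, which a real control server would need in order to respect segment lifetimes, is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    hops: tuple            # hypothetical AS labels along the path
    latency_ms: float      # assumed path metadata
    bandwidth_mbps: float  # assumed path metadata


# Each policy maps a path to a sort key; lower keys are preferred.
POLICIES = {
    "ANY": lambda p: 0,
    "BEST_LATENCY": lambda p: p.latency_ms,
    "BEST_BANDWIDTH": lambda p: -p.bandwidth_mbps,
}


class PolicyPathCache:
    def __init__(self, compute_candidates, max_paths=5):
        self._compute = compute_candidates  # expensive: combines segments
        self._max_paths = max_paths
        self._cache = {}
        self.computations = 0               # counts cache misses

    def lookup(self, dst, policy):
        """Return up to max_paths paths to dst, best first under the policy."""
        key = (dst, policy)
        if key not in self._cache:
            self.computations += 1
            ranked = sorted(self._compute(dst), key=POLICIES[policy])
            self._cache[key] = ranked[: self._max_paths]
        return self._cache[key]
```

Repeated requests for the same destination and policy are then served from memory, which is the effect the paragraph above relies on when it expects caching to offset the added computation.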
Examples for requesting CORE segments between different ISDs or within an ISD (as of 2024-07-12):¶
src | dst | segments returned |
---|---|---|
64-0:0:0 | 64-0:0:0 | 337 |
64-0:0:0 | 65-0:0:0 | 240 |
64-0:0:0 | 67-0:0:0 | 60 |
64-0:0:0 | 64-2:0:13 | 60 |
Limited routing policy options may reduce adoption. For example, a (core) AS may not want to accept transit traffic unless it starts or ends in ASes with special properties: a GEANT AS does not want to allow transit traffic unless it originates or ends in another research AS.¶
One solution could be to add a “confirm full path” flag to certain segments. If this flag is set, the full path (all segments) needs to be authorized by all ASes that insist on authorizing it. This is obviously less scalable but may be viable for ASes that insist on such policies. This also allows for “secret” policies.¶
Collateral: this probably needs a data plane change. Conceptually, we have only a single resulting segment, and that segment needs to be used in full, e.g. no on-path trickery.¶
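The “confirm full path” idea can be illustrated with a small sketch. The function names, the representation of a path as a list of AS labels from source to destination, and the example policy are all hypothetical; how such an authorization would be requested and carried in practice is exactly the open question.

```python
def research_transit_policy(own_as, research_ases):
    """Policy of an AS that carries transit only to or from research ASes."""
    def authorize(path):
        if own_as in (path[0], path[-1]):
            return True  # traffic starts or ends here, so it is not transit
        return path[0] in research_ases or path[-1] in research_ases
    return authorize


def full_path_authorized(path, authorizers):
    """Check the whole path against every on-path AS that insists on it.

    `authorizers` maps an AS label to a callable taking the full path.
    ASes without an entry do not insist on authorizing the path.
    """
    return all(authorizers[asn](path) for asn in path if asn in authorizers)


research = {"uni-x", "lab-y"}
authorizers = {"ren": research_transit_policy("ren", research)}
```

Because each policy is an opaque callable evaluated by the AS that owns it, the sketch is also compatible with “secret” policies: other parties only learn the yes/no outcome.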
Is forward-secrecy DRKey useful and should we develop it?¶
What are the properties of the control-plane?¶
For more info: [I-D.garciapardo-drkey].¶
FABRID [KRAHENBUHL2023] and EPIC [LEGNER2020].¶
At the moment, the SCION implementation does not work out of the box with devices behind NAT, regardless of whether these devices are end-hosts or run SCION services. This is because the NAT mechanism modifies the (UDP/IP) underlay but not the internal destination SCION address. Although this does not concern the SCION protocols themselves, we want to check that this will not be a problem. Critically, the SCION header needs to contain the SRC address as seen by the border router so that the border router can forward incoming response packets to the correct NAT device and port.¶
Possible solutions:¶
Links may get overloaded because the SCION routing system fails to distribute load properly over different links. New/different links might be underutilized.¶
If links become overloaded, there are several ways to handle this. A non-exhaustive list:¶
Squeeze: send an SCMP message to trigger end-hosts to use an alternative path.¶
Steer: send an SCMP message to trigger users to ask the CS for a better path.¶
Reduce: hand out very short-lived paths and let the end-hosts wait for the path to expire, so that they request new paths and (hopefully) decide on a different path.¶
Recommend: let the end-hosts know which paths are recommended by the AS at this time.¶
If a link has good properties, many ASes will disseminate segments, and therefore paths, that go through this link, and the link may become overloaded. See Simon Scherrer's work on Braess Paradox.¶
Either all clients need to constantly check that they choose not the theoretically best path but the one that works best in practice, or we need to find a way for control servers not to disseminate “good” links to all end-hosts.¶
The current consensus is that end-hosts can use multi-pathing and “automatically” converge on the best path, i.e. creating an equilibrium. Again, see Simon Scherrer's work on Braess Paradox.¶
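A toy model of this convergence is sketched below. It is not the model used in the cited work: it assumes two paths whose latency grows linearly with the share of traffic they carry, and end-hosts that repeatedly shift a small amount of traffic towards the currently faster path. All parameters are made up for illustration.

```python
def path_latency(base_ms, slope_ms, share):
    """Assumed linear model: latency grows with the traffic share carried."""
    return base_ms + slope_ms * share


def converge(base, slope, share0=1.0, rate=0.01, steps=200):
    """Return the traffic share on path 0 after repeated small shifts.

    `share0` is the initial fraction of traffic on path 0; the remainder
    uses path 1. Each step moves traffic towards the faster path.
    """
    share = share0
    for _ in range(steps):
        lat0 = path_latency(base[0], slope[0], share)
        lat1 = path_latency(base[1], slope[1], 1.0 - share)
        share -= rate * (lat0 - lat1)
        share = min(1.0, max(0.0, share))  # a share is a fraction of traffic
    return share


BASE = (10.0, 20.0)   # hypothetical idle latencies in ms
SLOPE = (30.0, 10.0)  # hypothetical added latency in ms at full load
share = converge(BASE, SLOPE)
```

With these numbers the shares settle where both paths have the same latency (half of the traffic on each, 25 ms on both). The sketch only shows that an equilibrium is reached under these assumptions; it says nothing about whether that equilibrium is efficient for the network as a whole.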
When a client contacts a server, it is usually understood that it wants the server to use the reverse path to answer back. If the server uses that path for a long period of time, the path will eventually expire. How should the refresh process be standardized?¶
The server must ask the CS for a path, regardless of the client's policy.¶
The client (somehow) sends a new packet with a new path, prompting the server to use this path from now on.¶
There are some nuances: usually the server's API will store the initial address of the client and use it throughout the session. We might need to take this into account.¶
A related question: how long before expiration should we still use a path? How do we handle that?¶
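The client-driven option and the expiration question can be sketched together. This is only an illustration under stated assumptions: the `ReversePath` and `ServerSession` names are hypothetical, expiry is modelled as a Unix timestamp in seconds, and the 30-second safety margin is an arbitrary placeholder for the value this question asks about.

```python
from dataclasses import dataclass


@dataclass
class ReversePath:
    hops: tuple    # hypothetical AS labels, already reversed for replies
    expiry: float  # Unix time in seconds at which the path expires


def usable(path, now, margin_s=30.0):
    """A path is usable while it outlives `now` by more than the margin."""
    return path.expiry - now > margin_s


class ServerSession:
    def __init__(self, initial_path):
        self.path = initial_path

    def on_client_packet(self, reversed_path):
        """Adopt the path of a newer client packet if it expires later."""
        if reversed_path.expiry > self.path.expiry:
            self.path = reversed_path

    def path_for_reply(self, now, margin_s=30.0):
        """Return the stored path, or None if it is too close to expiry."""
        return self.path if usable(self.path, now, margin_s) else None
```

When `path_for_reply` returns `None`, the server has to fall back to some other mechanism, for example the first option above (asking the CS for a path). Note that the session keeps only one path and silently replaces it, which interacts with the nuance that server APIs tend to pin the client's initial address.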
Do we actually need to solve this reverse path refresh problem?¶
CONTRA: It is probably rare that a server needs to send data for a long time without the application layer protocol requiring the client to ever answer back.¶
PRO: The client may happen to have an old-ish path. If we can't refresh, the client always needs to consider whether a path is valid "long enough", which might only be possible to guess.¶
CONTRA: Sending keep-alives sounds like a connection-based protocol. It also means we need to figure out when to stop sending keep-alives.¶
CONTRA: It may be better to solve this in the application layer or in the overlay protocol, where we know more about the potential length of the session, whether this is a singular request/answer type of exchange, or whether more frequent keep-alives are required anyway.¶
IPv6 in the Data Plane¶
SCION-IP translation¶
How can we interface with QUIC Multipath [I-D.ietf-quic-multipath]?¶
TODO Security¶
This document has no IANA actions.¶
TODO acknowledge.¶