Network Coordinate System

{{Cleanup|date=June 2024|reason=The article is entirely in list format}}

A Network Coordinate System (NC system) is a system for predicting characteristics such as the latency or bandwidth of connections between nodes in a network by assigning coordinates to nodes. More formally, It assigns a coordinate embedding \vec c_n to each node n in a network using an optimization algorithm such that a predefined operation \vec c_a \otimes \vec c_b \rightarrow d_{ab} estimates some directional characteristic d_{ab} of the connection between node a and b.{{Cite journal |last1=Donnet |first1=Benoit |last2=Gueye |first2=Bamba |last3=Kaafar |first3=Mohamed Ali |date=2010 |title=A Survey on Network Coordinates Systems, Design, and Security |url=https://ieeexplore.ieee.org/document/5462982 |journal=IEEE Communications Surveys & Tutorials |volume=12 |issue=4 |pages=488–503 |doi=10.1109/SURV.2010.032810.00007 |s2cid=16908400 |issn=1553-877X}}

Uses

In general, Network Coordinate Systems can be used for peer discovery, optimal-server selection, and characteristic-aware routing.

= Latency Optimization =

When optimizing for latency as a connection characteristic i.e. for low-latency connections, NC systems can potentially help improve the quality of experience for many different applications such as:

  • Online Games
  • Forming game groups such that all the players are close to each other and thus have a smoother overall experience.{{Cite journal |last1=Fu |first1=Yongquan |last2=Xiaoping |first2=Xu |date=February 2017 |title=Self-Stabilized Distributed Network Distance Prediction |url=https://ieeexplore.ieee.org/document/7502073 |journal=IEEE/ACM Transactions on Networking |volume=25 |issue=1 |pages=451–464 |doi=10.1109/TNET.2016.2581592 |s2cid=10842765 |issn=1558-2566|url-access=subscription }}
  • Choosing servers as close to as many players in a given multiplayer game as possible.{{Cite book |last1=Agarwal |first1=Sharad |last2=Lorch |first2=Jacob R. |title=Proceedings of the ACM SIGCOMM 2009 conference on Data communication |chapter=Matchmaking for online games and other latency-sensitive P2P systems |date=2009-08-16 |chapter-url=https://dl.acm.org/doi/10.1145/1592568.1592605 |series=SIGCOMM '09 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=315–326 |doi=10.1145/1592568.1592605 |isbn=978-1-60558-594-9|s2cid=7720412 }}
  • Automatically routing game packets through different servers so as to minimize the total latency between players who are actively interacting with each other in the game map.{{Citation needed|date=May 2023|reason=This is an application for NC systems I came up with that seems plausible but I cannot find a source that has specifically described this application. If someone finds a source like this, please add!}}
  • Content delivery networks
  • Directing a user to the closest server that can handle a request to minimize latency.
  • Voice over IP
  • Automatically switch relay servers based on who is talking in a few-to-many or many-to-many voice chat to minimize latency between active participants.
  • Peer-to-peer networks
  • Can use the latency-predicting properties of NC systems to do a wide variety of routing optimizations in peer-to-peer networks.
  • Onion routing networks
  • Choose relays such as to minimize the total round trip delay to allow for a more flexible tradeoff between performance and anonymity.{{Cite journal |last=Sherr |first=Micah |date=2009 |title=Coordinate-based routing for high performance anonymity |url=https://netdb.cis.upenn.edu/papers/msherr-dissertation.pdf |journal=Computer and Information Science at UPenn}}
  • Physical positioning
  • Latency correlates with the physical distances between computers in the real world. Thus, NC systems that model latency may be able to aid in locating the approximate physical area a computer resides in.{{Citation needed|date=May 2023|reason=Since latency-based NC systems correlate with the real-world physical distances between computers, it seems plausible that a NC system could be used for physical positioning. If anyone can find a source that suggests this, please add!}}

= Bandwidth Optimization =

NC systems can also optimize for bandwidth (although not all designs can accomplish this well). Optimizing for high-bandwidth connections can improve the performance of large data transfers.{{Cite journal |last=Liao |first=Yongjun |date=2013-01-11 |title=Learning to Predict End-to-End Network Performance |hdl=2268/136727 |url=https://orbi.uliege.be/handle/2268/136727 |language=English}}{{Cite journal |last1=Ramasubramanian |first1=Venugopalan |last2=Malkhi |first2=Dahlia |last3=Kuhn |first3=Fabian |last4=Abraham |first4=Ittai |last5=Balakrishnan |first5=Mahesh |last6=Gupta |first6=Archit |last7=Akella |first7=Aditya |title=A Unified Network "Coordinate" System for Bandwidth and Latency |url=https://www.microsoft.com/en-us/research/wp-content/uploads/2008/09/tr-2008-124.pdf |journal=Microsoft Research}}{{Cite book |last1=Sherr |first1=Micah |last2=Blaze |first2=Matt |last3=Loo |first3=Boon Thau |title=Privacy Enhancing Technologies |chapter=Scalable Link-Based Relay Selection for Anonymous Routing |date=2009 |editor-last=Goldberg |editor-first=Ian |editor2-last=Atallah |editor2-first=Mikhail J. |chapter-url=https://link.springer.com/chapter/10.1007/978-3-642-03168-7_5 |series=Lecture Notes in Computer Science |volume=5672 |language=en |location=Berlin, Heidelberg |publisher=Springer |pages=73–93 |doi=10.1007/978-3-642-03168-7_5 |isbn=978-3-642-03168-7}}

= Sybil Attack Detection =

Sybil attacks are of much concern when designing peer-to-peer protocols. NC systems, with their ability to assign a location to the source of traffic can aid in building systems that are Sybil-resistant.{{Cite journal |date=2023-05-01 |title=Web3 Sybil avoidance using network latency |journal=Computer Networks |language=en |volume=227 |pages=109701 |doi=10.1016/j.comnet.2023.109701 |issn=1389-1286 |last1=Stokkink |first1=Quinten |last2=Ileri |first2=Can Umut |last3=Epema |first3=Dick |last4=Pouwelse |first4=Johan |doi-access=free }}{{Cite journal |last1=Chan-Tin |first1=Eric |last2=Heorhiadi |first2=Victor |last3=Hopper |first3=Nicholas |last4=Kim |first4=Yongdae |date=July 2015 |title=Hijacking the Vuze BitTorrent network: all your hop are belong to us |journal=IET Information Security |language=en |volume=9 |issue=4 |pages=203–208 |doi=10.1049/iet-ifs.2014.0337 |issn=1751-8717|doi-access=free }}

Design Space

= Landmark-Based vs Decentralized =

Almost any NC system variant can be implemented in either a landmark-based or fully decentralized configuration. Landmark-based systems are generally secure so long as none of the landmarks are compromised, but they aren't very scalable. Fully decentralized NC systems are harder to design as they must be resilient to adversarial nodes, but they can in theory scale indefinitely.

= Euclidean Embeddings =

This design assigns a point in k-dimensional euclidean space to each node in the network and estimates characteristics via the euclidean distance function d_{ab} = ||\vec c_a - \vec c_b|| where \vec c_n represents the coordinate of node n.

Euclidean Embedding designs are generally easy to optimize. The optimization problem for the network as a whole is equivalent to finding the lowest energy state of a spring-mass system where the coordinates of the masses correspond to the coordinates of nodes in the network and the springs between the masses represent measured latencies between nodes. To make this optimization problem function work in a decentralized protocol, each node exchanges its own coordinates with those of a fixed set of peers and measures the latencies to those peers, simulating a miniature spring-mass system where all the masses representing the coordinates of the peers and each mass is connected via a single spring to the node's own "mass" which when simulated, gives a more optimal value for the node's coordinate. All these individual updates allow the network as a whole to form a predictive coordinate space by collaboratively.

The laws of Euclidean space require certain characteristics of the distance function to hold true, such as symmetry (measuring from a \rightarrow b should give the same result as from b \rightarrow a) and the triangle inequality (a \rightarrow b) + (b \rightarrow c) \geq (a \rightarrow c). No real-world network characteristics completely satisfy these laws, but some do more than others and NC systems using euclidean embedding are somewhat accurate when run on datasets containing violations of these laws.

Existing research on the topic includes GNP,{{Cite book |last1=Ng |first1=T.S.E. |last2=Zhang |first2=Hui |title=Proceedings.Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies |chapter=Predicting Internet network distance with coordinates-based approaches |date=June 2002 |chapter-url=https://ieeexplore.ieee.org/document/1019258 |volume=1 |pages=170–179 vol.1 |doi=10.1109/INFCOM.2002.1019258|isbn=0-7803-7476-2 |s2cid=2967523 }} PIC{{Cite book |last1=Costa |first1=M. |last2=Castro |first2=M. |last3=Rowstron |first3=R. |last4=Key |first4=P. |title=24th International Conference on Distributed Computing Systems, 2004. Proceedings. |chapter=PIC: Practical Internet coordinates for distance estimation |date=March 2004 |chapter-url=https://ieeexplore.ieee.org/document/1281582 |pages=178–187 |doi=10.1109/ICDCS.2004.1281582|isbn=0-7695-2086-3 |s2cid=846952 }} Vivaldi,{{Cite journal |last1=Dabek |first1=Frank |last2=Cox |first2=Russ |last3=Kaashoek |first3=Frans |last4=Morris |first4=Robert |date=2004-08-30 |title=Vivaldi: a decentralized network coordinate system |url=https://doi.org/10.1145/1030194.1015471 |journal=ACM SIGCOMM Computer Communication Review |volume=34 |issue=4 |pages=15–26 |doi=10.1145/1030194.1015471 |issn=0146-4833|url-access=subscription }} and Pharos{{Cite journal |author1=Y. Chen |author2=Y. Xiong |author3=X. Shi |display-authors=etal |date=April 2009 |title=Pharos: Accurate and Decentralised Network Coordinate System |url=http://www.cs.duke.edu/~ychen/papers/IET_Pharos.pdf |url-status=dead |journal=IET Communications |volume=3 |issue=4 |pages=539–548 |doi=10.1049/iet-com.2008.0187 |s2cid=11701454 |archive-url=https://web.archive.org/web/20131203021959/http://www.cs.duke.edu/~ychen/papers/IET_Pharos.pdf |archive-date=2013-12-03 |access-date=2013-11-27}}

= Matrix Factorization =

This design imagines the entire network as represented by an incomplete matrix X : \R_{n \times n} where n is the total number of nodes in the network, and any element of the matrix at the intersection between row i and column j of the matrix represents a directional latency measurement from node n_ito node n_j. The goal is to estimate the numbers in the unfilled squares of the matrix using the squares that are already filled in, i.e. performing matrix completion. To estimate a specific latency between two nodes, this method uses the dot product d_{ab} = \vec u_a \vec v_b where \vec u_n/\vec v_n represents a point in a k-dimensional inner product space.

NC system designs using matrix factorization are generally more complicated than their euclidean counterparts. In the centralized variant, matrix completion can be performed directly on a set of landmarks which have measured latency to every other landmark in a set, thus creating a complete matrix X representing the landmark network. This matrix can then be factored on a single computer using non-negative matrix factorization (NNMF) into two matrices U : R_{n \times r} and V : R_{r \times n} such that UV \approxeq X. Since matrix multiplication is essentially doing the dot product for each row and column of the input matrices, coordinates for each landmark j can be represented by two "in" and "out" vectors (\vec u_j and \vec v_j) taken respectively from the jth row of U and the jth column of V. With this, latencies between two landmarks can be approximates by a simple dot product: d_{ij} = \vec u_i \vec v_j. Any node that wants to figure out their own coordinates can simply measure the latency to some subset of all the landmarks, re-create a complete matrix using the landmark's coordinates, and then perform NNMF to calculate their own coordinate. This coordinate can then be used with any other node (landmark or otherwise) to estimate latency to any other coordinate that was calculated via the same set of landmarks.{{Cite journal |last1=Mao |first1=Yun |last2=Saul |first2=Lawrence K. |last3=Smith |first3=Jonathan M. |date=December 2006 |title=IDES: An Internet Distance Estimation Service for Large Networks |url=https://ieeexplore.ieee.org/document/4016146 |journal=IEEE Journal on Selected Areas in Communications |volume=24 |issue=12 |pages=2273–2284 |doi=10.1109/JSAC.2006.884026 |s2cid=12931155 |issn=1558-0008}}

The decentralized variant is decidedly simpler. For a given node, the goal is to minimize the absolute difference (or squared difference) between the measured latencies to the peers and the predicted latencies to the peers. The predicted latency is given by the same equation d_{ij} = \vec u_i \vec v_j where \vec u_iis the outgoing vector of node i and \vec v_j is the incoming vector of node j. This goal (or loss function) can then be minimized using stochastic gradient descent with line search.{{Cite journal |last1=Liao |first1=Yongjun |last2=Du |first2=Wei |last3=Geurts |first3=Pierre |last4=Leduc |first4=Guy |date=2013-10-01 |title=DMFSGD: a decentralized matrix factorization algorithm for network distance prediction |url=https://doi.org/10.1109/TNET.2012.2228881 |journal=IEEE/ACM Transactions on Networking |volume=21 |issue=5 |pages=1511–1524 |doi=10.1109/TNET.2012.2228881 |arxiv=1201.1174 |s2cid=8041240 |issn=1063-6692}}

Existing research on the topic includes IDES, which uses principal component analysis from a small set of beacon or landmark nodes. There is also Phoenix, which proposes a distributed form and incorporates a weight-based trust mechanism using a gossip-based layer to protect against adversarial nodes (NCShield).{{Cite journal |last1=Chen |first1=Yang |last2=Wang |first2=Xiao |last3=Shi |first3=Cong |last4=Lua |first4=Eng Keong |last5=Fu |first5=Xiaoming |last6=Deng |first6=Beixing |last7=Li |first7=Xing |date=December 2011 |title=Phoenix: A Weight-Based Network Coordinate System Using Matrix Factorization |url=https://ieeexplore.ieee.org/document/6092405 |journal=IEEE Transactions on Network and Service Management |volume=8 |issue=4 |pages=334–347 |doi=10.1109/TNSM.2011.110911.100079 |s2cid=8079061 |issn=1932-4537|url-access=subscription }} Finally, DMFSGD proposal stochastic gradient descent instead of alternating least squares to learn factorizations.

= Relative Coordinates =

Relative coordinate systems are a slightly more restrictive, but also more stable form of network coordinate system. They don't allow for prediction between arbitrary pairs of nodes, but they are immune to attempts at network-wide coordinate distortion by instead opting for a 3-way factorization. This factorization is as follows: d_{i,j} = Y_i \phi_i Y_j^T where to predict the one-way latency d_{i,j}, you must collect a vector Y_i of measurements to m other nodes and receive a vector Y_j from node j of measurements to m more nodes (potentially different nodes from measurements in Y_i). Node i then must learn a matrix \phi_i to minimize the prediction error to all known peers. Then latency prediction to unknown peers given a corresponding vector Y_j from such unknown peer can occur.

Alternatives

Network Coordinate Systems are not the only way to predict network properties. There are also methods such as iPlane{{Cite journal |last1=Madhyastha |first1=Harsha V. |last2=Isdal |first2=Tomas |last3=Piatek |first3=Michael |last4=Dixon |first4=Colin |last5=Anderson |first5=Thomas |last6=Krishnamurthy |first6=Arvind |last7=Venkataramani |first7=Arun |date=2006-11-06 |title=iPlane: an information plane for distributed services |url=https://dl.acm.org/doi/10.5555/1298455.1298490 |journal=Proceedings of the 7th Symposium on Operating Systems Design and Implementation |series=OSDI '06 |location=USA |publisher=USENIX Association |pages=367–380 |isbn=978-1-931971-47-8 |archive-url=https://web.archive.org/web/20230126080441/https://courses.cs.washington.edu/courses/cse590s/06sp/iplane.pdf |archive-date=2023-01-26}} and iPlane Nano{{Cite journal |last1=Madhyastha |first1=Harsha V. |last2=Katz-Bassett |first2=Ethan |last3=Anderson |first3=Thomas |last4=Krishnamurthy |first4=Arvind |last5=Venkataramani |first5=Arun |date=2009-04-22 |title=iPlane Nano: path prediction for peer-to-peer applications |url=https://dl.acm.org/doi/10.5555/1558977.1558987 |journal=Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation |series=NSDI'09 |location=USA |publisher=USENIX Association |pages=137–152}} which take a more analytical approach and try to mechanistically simulate the behavior of internet routers to predict by what route some packets will flow, and thus what properties a connection will have.

In The Wild

  • Vuze - BitTorrent Client{{Cite journal |last1=Ledlie |first1=Johnathan |last2=Gardner |first2=Paul |last3=Seltzer |first3=Margo |title=Network Coordinates in the Wild |url=https://people.csail.mit.edu/ledlie/papers/wild07-tr.pdf |journal=USENIX Symposium on Networked Systems Design & Implementation |issue=4 |pages=299–311 |via=}}

References