Talk:Farthest-first traversal

{{GA|04:10, 7 November 2021 (UTC)|page=1|topic=Computing and engineering|oldid=1053957936}}

{{WikiProject banner shell|class=GA|

{{WikiProject Computer science|importance=mid}}

}}

{{Talk:Farthest-first traversal/GA1}}

Different interpretation of farthest insertion algorithm

I see three different logic descriptions for the algorithm and challenge hereby the "correctness" of the description. I quote "correctness" as I am pretty sure there are two different ways to calculate the farthest insertion algorithm. I would regard them as two different algorithms which for the similarity were named identical.

I hereby question the GA category of this article.

Logic differences

  1. selection of node: maximum distance to the set, taking into account all nodes of the set for distance calculation
  2. selection of node: maximum distance to any one point of the set
  3. insertion of node: usually replacing that edge which will lead to minimum increase of total length/distance. Here described in Greedy Algorithm as: "Remove p from the not-yet-selected points and add it to the end of the sequence of selected points." Adding to the end is just wrong.

This article states the calculation logic as: "Each point p after the first must have the maximum possible distance to the set of points earlier than p in the sequence, where the distance from a point to a set is defined as the minimum of the pairwise distances to points in the set."

I read this as maximum of the sum of the (minimum) pairwise distances. While I think this is the better algorithm, most of the times the maximum distance to any one point is evaluated. If the latter is the case, then the animated graphics is incorrect. The topmost point is closer to the first point then the other ones on the top right corner.

This article refers in this respect to Rosenkrantz, D. J.; Stearns, R. E.; Lewis, P. M., II (1977), "An analysis of several heuristics for the traveling salesman problem"

That article is widely referenced for the farthest insertion heuristic. As such it should be reflected in this article how they select the nodes.

The selection of the node is describes as follows:

"We say that a tour is constructed by farthest insertion if each ai, 1 <= i < n, in the definition of an insertion method satisfies

(5.1) d(Ti, ai) = max {d(Ti, x) for x in N-Ti}.

Contrasting (5.1) with (4.2), we observe that farthest insertion has a max

where nearest insertion has a min. "

The authors just search for the max distance to any one of the points and use that node.

They then write "Inserting nearby points late in the approximation is appealing because the short edges used late in

the procedure are less likely to be accidentally deleted by some still later insertion."

If this is the goal, then the logic may not be ideal. Having a graph were one point is very far from the others may insert first all nodes nearest the two farthest nodes while the sum of distances would insert further away nodes first.

The Wikipedia article lacks a description of the goal of the algorithm. It focuses on the WHAT the algorithm does, not WHY.

When one searches the internet for "farthest insertion algorithm" a number of results come up. I could find only one implementation using the sum.

Examples:

  • https://www2.isye.gatech.edu/~mgoetsch/cali/VEHICLE/TSP/TSP015__.HTM, referring to Rosenkrantz et al
  • MIT OpenCourseWare: https://ocw.mit.edu/courses/civil-and-environmental-engineering/1-203j-logistical-and-transportation-planning-methods-fall-2006/lecture-notes/lec16.pdf
  • a typical GitHub implementation: https://flothesof.github.io/farthest-neighbors.html
  • Princeton: https://www.cs.princeton.edu/courses/archive/fall16/cos126/checklist/tspextra.html

So, if the majority has a different understanding I am suggesting to recognize this in the article and state the differences.

Coming back to the Greedy Algorithm:

In the article the authors describe the insertion of the node for all insertion logic as:

"if T passes through more than one point, then

a) find an edge (x, y) in T which minimizes: d(x,k)+d(k,y)-d(x,y)

b) delete edge (x,y) and add edges (x,k) and (k,y) to obtain

TOUR(T, k);

if T passes through a single node i, then make TOUR(T, k) the two node tour

consisting of edges (i, k) and (k, i).

In either case, we say that TOUR(T, k) is obtained by inserting k into T."

I suggest to change that part of the description of the Greedy Algorithm. — Preceding unsigned comment added by GunterS (talkcontribs) 09:28, 15 March 2022 (UTC)

:Where on earth did you get SUM of pairwise distances, when clearly it states MIN of pairwise distances? ——David Eppstein (talk) 16:16, 15 March 2022 (UTC)

::I was looking at the chart. If the MIN of pairwise distances were used, then the third node would be the right one of the two points below the second one. It is clearly farther away from point 1 then the topmost one. So I assumed by Min of pairwise distances is meant MIN of (all aggregated) pairwise distances in the set. Using the plural in an array can mean all. GunterS (talk) 17:10, 15 March 2022 (UTC)

:Also, what "animated graphic" are you talking about? There is only a single image, not an animation. In that single image, the first point (with no arrow into it) is chosen arbitrarily. The second point is farthest from the first one. The third one is farthest from the first two (its minimum distance to the set of the first two points is larger than the minimum distances of any other point). Outside of your confused misinterpretations, there is only one algorithm to be described here. The "greedy algorithm" for TSP that you are describing is not farthest-insertion, it is cheapest-insertion, a different heuristic unrelated to the topic of this article. —David Eppstein (talk) 19:09, 18 March 2022 (UTC)

Citation for Greedy Exact Algorithm

Hello, apologies if this is the incorrect way to raise this issue. The greedy exact algorithm currently lists the article "Clustering to minimize the maximum intercluster distance" by Gonzalez as the source for this algorithm. However, in the copy of the article I retrieved from ScienceDirect, there appears to only be a description of a 2*OPT approximation algorithm, and no description of a greedy exact algorithm, even in the process of describing prior work. Would it be possible for the author to update the citation to reflect the original source of this algorithm? Wbrackenbury (talk) 13:56, 12 September 2022 (UTC)

:It is an exact algorithm for farthest-first traversal. It is a 2-approximation for other problems. Gonzales focuses on the other problems but the algorithm he describes is an algorithm for farthest-first traversal. —David Eppstein (talk) 16:34, 12 September 2022 (UTC)

::Thanks for the clarification! — Preceding unsigned comment added by 2601:249:8C00:6800:F8B1:3974:C9C4:F7D0 (talk) 16:43, 12 September 2022 (UTC)