Transportation Research Part E 166 (2022) 102903
Available online 23 September 2022
1366-5545/© 2022 Elsevier Ltd. All rights reserved.
Journal homepage: www.elsevier.com/locate/tre

Facility Location in Logistics and Transportation: An enduring relationship

Francisco Saldanha-da-Gama
Departamento de Estatística e Investigação Operacional, Centro de Matemática, Aplicações Fundamentais e Investigação Operacional, Faculdade de Ciências, Universidade de Lisboa, Portugal

Keywords: Facility Location; Logistics; Transportation

Abstract

This article aims at contributing to the celebration of the 25th Anniversary of Transportation Research Part E: Logistics and Transportation Review (TRE). It provides an overview of the role of Facility Location in Logistics and Transportation highlighting the contribution of TRE to such an enduring relationship. Several conventional problems are revisited showing that the three above fields have been intertwined for a long time. Nevertheless, the role of Facility Location has become even stronger in the past decades due to challenges posed by new technological developments together with a fast economy globalization and a strong increase in environmental concerns. This has called for the study of more complex problems and the development of comprehensive mathematical models leading to major advances in areas such as reverse and green logistics, humanitarian supply chains, and multimodal transportation, to mention a few. These and other related topics are discussed. Hedging against uncertainty has gained much practical relevance and thus it will be much in focus throughout the paper. Several current trends and future challenges are thoroughly discussed.
These include but are not limited to the steps already made and those still missing for paving the way from Industry 4.0 to Industry 5.0, as well as the challenges posed by data-driven decision making in the era of big data.

E-mail address: fsgama@ciencias.ulisboa.pt.
https://doi.org/10.1016/j.tre.2022.102903
Received 8 July 2022; Received in revised form 31 July 2022; Accepted 2 September 2022

1. Introduction

A facility location problem consists of selecting locations for a facility or equipment to serve a set of demand points or customers in the best possible way. Location Science deals with models and solution techniques for such problems together with the implementation of the solutions in the real world (Laporte et al., 2019). This is a well-established research field that has been very active since the 1960s. Moreover, it has a strong interaction with other disciplines such as Geography, Economics, Transportation, and Logistics, to mention a few.

Marking the 25th anniversary of Transportation Research Part E: Logistics and Transportation Review (TRE), this article aims at providing an overview of the role that Location Analysis has played in the context of the problems stemming from Logistics and Transportation. In fact, despite corresponding to three well-defined research topics, Facility Location, Logistics, and Transportation have always been strongly intertwined.

The relevance of Location Analysis in Transportation was stated directly in the first quantitative approaches known in the area. This is the case with the famous Weber problem (see Pinto 1977, Launhardt 1900, and Weber 1909) that emerged in an industrial context. It consisted of finding the best layout of a railroad to connect two raw-material sources and a plant (whose location was to be determined) and then to connect the plant with the destination of the production output. The objective was to minimize the total weighted transportation costs in the three railroad segments. The location of the industrial facility ended up being a central decision to make because of its impact on the transportation costs. Interestingly, at the time, the costs associated with the facility (land acquisition, construction, setup, etc.) were ignored: only the transportation costs were considered.

Fig. 1. 25 years of TRE: keywords found in the articles that consider problems involving facility location decisions.

The strong connection between location decisions and transportation costs would be sealed in two papers that were published in the very same year: those by Maranzana (1964) and Manne (1964). From the first work it became clear that ignoring transportation costs when making location decisions may lead to sub-optimal solutions for a system as a whole. The second work, which to the best of the author's knowledge introduces the first optimization model for a discrete facility location problem, came with an important element: the setup costs for the facilities were combined with the transportation costs in a single objective function to minimize. Since then, location and transportation decisions have been formally intertwined in many problems.

Nowadays we keep observing many facility location problems emerging in an economic context that combine in a single objective function the costs associated with the facilities and those related to transportation. Salhi and Nagy (1999) present an empirical work showing that in a location-routing problem it is relevant to rely on routing costs when making location decisions. Note, however, that gathering such costs in a single objective function requires them to be somewhat comparable. If not, other alternatives may emerge as more adequate, such as the use of a multicriteria model.
In any case, more often than not, we see location decisions being conditioned by the transportation costs and the transportation costs being determined by the location of the facilities. With the surge of the Third Industrial Revolution – often called the digital revolution – not only have many new problems and challenges emerged but also new tools have been developed for tackling them. A huge development occurred in computing power. Meanwhile, new problems emerged calling for the use of that power. Some areas where significant developments could be observed include Procurement, Materials Handling, Logistics, Transportation, Inventory Management and Supply Chain Management. These are areas that evolved tremendously in the past decades. Furthermore, they started to intersect thus leading to even more challenging problems. Nowadays, activities in Logistics and Transportation are widely accepted as being major components when it comes to managing a supply chain, which is a fundamental structure to support such activities. That structure is in fact a network of related facilities and services. It is in the design of those networks that, again, location analysis plays a major role thus influencing and being influenced by logistics and transportation decisions (Melo et al., 2009). For a unified view of combined facility location and network design problems and for a comprehensive list of bibliographic references until the first decade of the 2000s, the reader can refer to Contreras and Fernández (2012). A privileged outlet for observing the extent to which Facility Location, Logistics and Transportation are so much intertwined is TRE. In fact, a simple search for the articles addressing location decisions and that have been published in the journal since it started being printed led to Fig. 1, where we can observe a word cloud produced with the keywords of those papers. 
In addition to Optimization, two keywords emerge directly in the figure: Transportation Planning and Logistics.

Facility location problems have been classified in different ways. A popular classification is based on the location space: continuous, on a network, or discrete. Given the clear dominance of the latter in Logistics and Transportation, in this paper the focus is put on discrete facility location problems and thus on the discrete optimization models often stemming from them.

Nowadays, Facility Location, Logistics, and Transportation correspond to very extensive research areas that are highly fragmented given their depth and breadth. The same is true when we focus on problems capturing aspects from the three areas.

1.1. Research methodology and scope of the article

The current paper presents a personal perspective of the author on the role of Facility Location in Logistics and Transportation. An overview is presented of the area as a whole as well as of recent developments and future prospects. The article does not intend to be a detailed review since that would not fit the reasonable size of an article. Many interesting and relevant problems will be skipped. This does not mean that the author values them less than the discussed problems; it is just a matter of choosing, among many interesting topics, a few that are pertinent to illustrate the role of location decisions in Logistics and Transportation.

One aspect common to the vast majority of the literature quoted in this paper is that location decisions are among those to be made. This directly excludes much work done in Logistics and Transportation that is certainly of relevance to the existing knowledge. Nevertheless, again, this was a criterion used to narrow the scope of the research discussed, thus making the paper more focused.
For the topics covered by this paper, three search engines were explored: Web of Science, SCOPUS, and Google Scholar. The keywords used in the search included but were not limited to: Facility Location, Location-Routing, Location-Arc Routing, Hub Location, Multi-Echelon Facilities, Transportation, Routing, Last-Mile Distribution, Electric Vehicles, Hybrid Vehicles, Multimodal Transportation, Logistics, Urban Logistics, Reverse Logistics, Humanitarian Logistics, Green Logistics, Logistics Network Design, Supply Chain Management, Supply Chain Network Design, Closed-Loop Supply Chains, Time-Dependent Decisions, Uncertainty, Robust Optimization, Distributionally Robust Optimization, Stochastic Programming, Chance-constrained Programming, Risk, Conditional Value-at-Risk, Data-Driven Optimization. Regarding the papers published in TRE, the whole publication record was explored given the 25th Anniversary celebration of the journal. For all the other journals and apart from several seminal papers or those specific to some particular aspect being discussed, emphasis was put on the past ten years so that recent trends and developments could be better highlighted. The paper was written in a way that makes it self-contained. In particular a brief history is presented showing how the links between Facility Location and Transportation have evolved and how Logistics has also come into play. Several conventional discrete facility location problems are revisited, which are progressively extended to capture and highlight aspects that have become of major relevance in Logistics and Transportation. This is also a means to progressively present the building blocks of an area that has grown immensely and thus to better structure all the knowledge being discussed. Most of these problems are described mathematically by means of discrete optimization models. 
Not only is this a way to better grasp the exact features being discussed but also to emphasize the evolution observed over time in terms of the optimization models proposed. This also leads to gathering in a single document some knowledge that is currently scattered over different sources. Note, however, that not only existing models are presented in this paper. For instance, a new model is introduced for a multi-stage stochastic facility location problem, in line with a current research trend.

Finally, it should be pointed out that it is not a goal of this paper to discuss algorithmic developments. Instead, the focus is put on the classes of problems that have been investigated, how they have evolved, and how Facility Location, Logistics and Transportation decisions have interacted with each other. Nevertheless, throughout the document, when relevant, the reader is referred to the state-of-the-art solution methods for the different problems discussed.

1.2. Structure of the article and information flow

In Fig. 2 we summarize the structure of the paper with emphasis on the flow of information and knowledge throughout the document. As we can observe in the figure, the central element lying at the basis of the discussion is the broad class of facility location problems, namely those that have played a major role in Logistics and Transportation. Several conventional problems are reviewed that include location–allocation decisions, location with routing decisions, and hub location. Afterwards, Logistics comes into play and the discussion is further extended. Note that many problems in Logistics embed knowledge from Transportation. Once the initial discussion is finished, more involved settings are considered. These stem from the inclusion of two major aspects underlying many logistics decisions: time and uncertainty.
At this stage, several major building blocks of the multi-disciplinary area intertwining Facility Location, Logistics and Transportation have been laid down. The discussion proceeds with some advanced modeling aspects that are needed to formulate more comprehensive problems. These include planning for risk-averse decision makers, capturing "soft" constraints by means of chance-constrained programming, and imposing closest-assignment as a way to mimic demand nodes patronizing the closest open facilities or service providers. Finally, several methodologies of relevance for tackling the resulting mathematical models are reviewed.

Facility Location, Logistics and Transportation meet in many application areas. Some of these are discussed, including Reverse Logistics Planning, Disaster Operations Management, Environmental-Friendly Production and Distribution Systems, and Integration with Geographical Information Systems. All this discussion is made considering the challenges but also the opportunities posed by technological developments such as Industry 4.0 and the Internet-of-Things.

Finally, some current challenges are discussed, which open the way to itemizing some future research directions. Again, modern trends are considered, namely the need to pave the way from Industry 4.0 to Industry 5.0 and data-driven optimization.

Fig. 2. Information and knowledge flow throughout the article.

In a more synthetic way, the remainder of the paper is organized as follows. In Section 2 several conventional facility location problems intertwining location and transportation decisions are reviewed. In Section 3, logistics activities are added to the discussion. Section 4 focuses on advanced modeling aspects and successful solution techniques for dealing with the resulting optimization models. In Section 5 several application areas are discussed. Section 6 presents some current trends and future prospects.
The paper ends with some conclusions drawn from all the contents presented.

2. Location and transportation

In a problem embedding transportation decisions, an important feature that one should directly look into concerns the pattern to adopt for visiting the destination nodes or providing service to the customers. Different possibilities exist. If we are dealing with demand for some commodity or service that originates at the demand nodes and that must be supplied from either existing or new facilities to be installed, then the demand nodes can either be directly allocated to one or more such facilities or else they can be included in one or several routes that start and end at the facilities. Alternatively, we may have a set of origin–destination pairs such that some flow needs to be shipped among them. In this case, even if feasible, shipping the flow directly between every origin–destination pair may easily render the resulting system too expensive or too inefficient. Instead, a set of nodes can be set up to act as transshipment nodes that consolidate and redistribute the flow. Such nodes, which are called hubs, may also originate flow or be the destination of some flow.

The above patterns lead to different classes of "conventional" facility location problems that we briefly revisit next, thus making this paper self-contained while gathering knowledge that is currently scattered over many sources.

2.1. Location–allocation and location–transportation problems

The classical transportation problem is one of the most famous problems in Operations Research and Management Science (ORMS). It consists of finding the best way to send a commodity from a finite set of origins $I$ to a finite set of destinations $J$. Each origin $i \in I$ has a limited supply capacity $q_i$ and each destination $j \in J$ has some demand $d_j$. There is a unit transportation cost $c_{ij}$ from origin $i \in I$ to destination $j \in J$.
Considering a decision variable $x_{ij}$ representing the amount shipped from origin $i \in I$ to destination $j \in J$, the problem can be formulated mathematically as follows:

$$\text{minimize} \quad \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij}, \tag{1}$$
$$\text{subject to} \quad \sum_{j \in J} x_{ij} = q_i, \quad i \in I, \tag{2}$$
$$\sum_{i \in I} x_{ij} = d_j, \quad j \in J, \tag{3}$$
$$x_{ij} \ge 0, \quad i \in I,\ j \in J. \tag{4}$$

The above model is credited to Hitchcock (1941). It is assumed (without loss of generality) that the total supply equals the total demand. Although variable costs at the origins can easily be embedded in the parameters $c_{ij}$, the problem ignores fixed costs for using such origins (e.g., setup, production or handling costs). Thinking of applications in which this is a relevant feature, let us denote by $f_i$ the fixed cost for activating origin $i \in I$. To formulate the problem we consider a set of binary variables, $y_i$, $i \in I$, indicating whether an origin is used. The original model can be easily adapted:

$$\text{minimize} \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij}, \tag{5}$$
$$\text{subject to} \quad \sum_{j \in J} x_{ij} \le q_i y_i, \quad i \in I, \tag{6}$$
$$\text{(3), (4),}$$
$$y_i \in \{0, 1\}, \quad i \in I. \tag{7}$$

This model describes a well-known problem in location analysis: the capacitated facility location problem (CFLP) (Fernández and Landete, 2019). In case the total capacity of the selected origins is not enough for covering the demand, we may need to pay a shortage cost at the destination (e.g., an extraordinary urgent order or an outsourcing activity). This can be easily accomplished by considering an infinite-capacity fictitious facility that supplies the missing demand at a unit cost equal to the unit shortage cost. Fernández and Landete (2019) also point out that the above cost-minimization objective is not as restrictive as it may seem at first glance. In fact, if we have some profit from supplying the demand nodes then the problem can still be reformulated as above.

Nowadays, the CFLP and its extensions keep attracting the attention of the scientific community. For instance, Corberán et al.
(2020) investigate a so-called facility location problem with capacity transfers. The authors note that in many realistic settings, there is a capacity surplus at some operating facilities. Thus one can take advantage of such surplus and have the corresponding facilities collaborate with those facing shortage. This is in fact a realistic alternative to directly opening extra facilities or paying for shortages.

One important feature of the CFLP is the possibility of having a destination node supplied by multiple facilities. If this is not the case we fall in a single-allocation scheme. This calls for redefining the domain of the $x$-variables, setting it equal to $\{0, 1\}$, i.e., $x_{ij}$ is equal to one if and only if demand node $j \in J$ is fully supplied by origin $i \in I$. In this case, the allocation decision directly determines the transportation one. The new model can be obtained from the above one with minor changes:

$$\text{minimize} \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij}, \tag{8}$$
$$\text{subject to} \quad \sum_{j \in J} d_j x_{ij} \le q_i y_i, \quad i \in I, \tag{9}$$
$$\sum_{i \in I} x_{ij} = 1, \quad j \in J, \tag{10}$$
$$x_{ij} \in \{0, 1\}, \quad i \in I,\ j \in J, \tag{11}$$
$$y_i \in \{0, 1\}, \quad i \in I. \tag{7}$$

Let us now ignore the setup costs and assume that the decision maker seeks the allocation of every demand node to a single origin irrespective of capacities. Assume also that a certain number of origins, say $p$, is to be set operating. The problem becomes

$$\text{minimize} \quad \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij}, \tag{12}$$
$$\text{subject to} \quad \sum_{i \in I} y_i = p, \tag{13}$$
$$x_{ij} \le y_i, \quad i \in I,\ j \in J, \tag{14}$$
$$\text{(10), (11).}$$

A particular case of interest in practice is that in which the potential locations for the facilities coincide with the locations of the demand nodes, i.e., $I = J$. In that case we can replace $I$ by $J$ in the above model and use the variables $x_{jj}$, $j \in J$, to replace the $y$-variables. This way we will be facing the well-known (discrete) $p$-median problem (Marín and Pelegrín, 2019).
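To make model (12)–(14) concrete, the following is a minimal sketch that solves a tiny instance by full enumeration. It is an illustration only, not a method discussed in the article: the function name `solve_p_median` and the data are hypothetical, and enumeration is only viable for very small instances. It relies on the fact that, without capacities, each demand node is best assigned to its cheapest open origin.

```python
from itertools import combinations


def solve_p_median(cost, demand, p):
    """Enumerate every set of p open origins and return the best
    (objective value, open origins, assignment) for objective (12)."""
    n_origins = len(cost)
    n_demand = len(demand)
    best = None
    for open_set in combinations(range(n_origins), p):   # constraint (13)
        # Constraints (10), (11), (14): one open origin per demand node.
        assignment = [min(open_set, key=lambda i: cost[i][j])
                      for j in range(n_demand)]
        value = sum(cost[assignment[j]][j] * demand[j]
                    for j in range(n_demand))
        if best is None or value < best[0]:
            best = (value, open_set, assignment)
    return best


# Hypothetical instance with I = J (four nodes), unit costs c_ij and demands d_j.
cost = [[0, 2, 9, 10],
        [2, 0, 7, 8],
        [9, 7, 0, 1],
        [10, 8, 1, 0]]
demand = [3, 1, 1, 2]

value, open_set, assignment = solve_p_median(cost, demand, p=2)
print(value, open_set, assignment)
```

For this instance the enumeration opens nodes 0 and 3 and assigns nodes 0 and 1 to the former and nodes 2 and 3 to the latter. Realistic instances would instead be passed to a mixed-integer programming solver or a specialized algorithm.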
Still today, the above problems and their variants are found at the core of much that is done in distribution logistics and transportation planning, as we will see in the following sections.

2.2. Facility location problems with routing decisions

The vehicle routing problem (VRP) is another popular and much relevant problem in ORMS. In its original version, a single depot and a homogeneous fleet were assumed. A natural extension that we directly consider is the multi-depot setting. We keep using some notation already introduced: $I$ is the set of potential depots (origins or facilities); $q_i$ is the capacity of a depot installed at $i \in I$; $J$ is the set of demand nodes. A directed connected graph $G = (V, E)$ with $V = I \cup J$ is assumed for representing the underlying transportation network. A unit transportation cost is associated with each arc in the graph. A set $M$ of homogeneous vehicles, each with some capacity $Q$, is assumed to be available, which allows us to define up to $|M|$ routes. The binary allocation variables, $x_{ij}$, can again be considered to indicate whether demand node $j \in J$ is supplied by a route originated at depot $i \in I$. Additionally, for $m \in M$ and $\ell, \ell' \in V$, $v_{\ell\ell' m}$ is equal to one if route $m$ visits node $\ell'$ immediately after visiting node $\ell$ and zero otherwise. Finally, we consider costs $c_{\ell\ell'}$ representing the shortest unit transportation cost between nodes $\ell$ and $\ell'$ computed in graph $G$ ($\ell, \ell' \in V$).
Assuming that each route starts and ends in the same depot, the multi-depot vehicle routing problem (MDVRP) can be formulated mathematically as follows:

$$\text{minimize} \quad \sum_{m \in M} \sum_{\ell \in V} \sum_{\ell' \in V} c_{\ell\ell'} v_{\ell\ell' m}, \tag{15}$$
$$\text{subject to} \quad \sum_{m \in M} \sum_{\ell \in V} v_{\ell j m} = 1, \quad j \in J, \tag{16}$$
$$\sum_{j \in J} \Big( d_j \sum_{\ell \in V} v_{\ell j m} \Big) \le Q, \quad m \in M, \tag{17}$$
$$\sum_{j \in J} d_j x_{ij} \le q_i, \quad i \in I, \tag{18}$$
$$\sum_{\ell' \in V} v_{\ell\ell' m} - \sum_{\ell' \in V} v_{\ell'\ell m} = 0, \quad m \in M,\ \ell \in V, \tag{19}$$
$$\sum_{\ell \in I} \sum_{\ell' \in J} v_{\ell\ell' m} \le 1, \quad m \in M, \tag{20}$$
$$\sum_{m \in M} \sum_{\ell \in S} \sum_{\ell' \in V \setminus S} v_{\ell\ell' m} \ge 1, \quad I \subseteq S \subset V, \tag{21}$$
$$\sum_{\ell' \in J} v_{i\ell' m} + \sum_{\ell' \in V} v_{j\ell' m} \le 1 + x_{ij}, \quad i \in I,\ j \in J,\ m \in M, \tag{22}$$
$$x_{ij} \in \{0, 1\}, \quad i \in I,\ j \in J, \tag{23}$$
$$v_{\ell\ell' m} \in \{0, 1\}, \quad \ell, \ell' \in V,\ m \in M. \tag{24}$$

The objective function (15) accounts for the total transportation cost. Equalities (16) ensure that every demand node is included in one and only one route. Inequalities (17) guarantee that no route exceeds the vehicle capacity. Constraints (18) restrict the amount served by each depot according to its capacity. Equalities (19) state that a set of routes is actually created. Inequalities (20) ensure that each route uses (at most) a single depot. Constraints (21) ensure that every route involves a depot. Constraints (22) impose that if a route includes the depot $i$ and the demand node $j$ then $x_{ij}$ must be equal to one, i.e., $j$ is served by a route originated at (and destined to) depot $i$. Conversely, in case $x_{ij}$ is equal to zero, then no route can involve simultaneously the depot $i$ and the demand node $j$. Finally, (23) and (24) specify the domain of the decision variables.

Regarding the above model we are not discussing feasibility issues, i.e., we assume that the set of feasible solutions is non-empty. Many extensions of the MDVRP have been investigated, such as the existence of time windows for visiting the nodes, the possibility of having routes starting and ending in different depots, etc.
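As an illustration of what the MDVRP constraints require of a candidate solution, the sketch below checks a set of explicit routes and evaluates objective (15). It is a hypothetical helper, not part of the article: routes are given as node sequences rather than through the variables $v_{\ell\ell' m}$, the name `evaluate_mdvrp`, the coordinates and the Manhattan cost function are assumptions made for the example, and the limit of $|M|$ routes is not checked.

```python
def evaluate_mdvrp(routes, depots, demand, Q, q, cost):
    """Return the total cost (15) of the routes, or None if they are infeasible.

    routes: list of node sequences [depot, j1, ..., jk, depot]
    demand: dict mapping demand node -> d_j; q: dict mapping depot -> q_i
    cost:   function (a, b) -> unit cost c_ab
    """
    depot_load = {i: 0 for i in depots}
    served = []
    total = 0
    for route in routes:
        start, end, customers = route[0], route[-1], route[1:-1]
        # Each route starts and ends at the same depot, cf. (19)-(21).
        if start not in depot_load or start != end:
            return None
        # A route uses a single depot, cf. (20).
        if any(node in depot_load for node in customers):
            return None
        load = sum(demand[j] for j in customers)
        if load > Q:                      # vehicle capacity (17)
            return None
        depot_load[start] += load
        served.extend(customers)
        total += sum(cost(a, b) for a, b in zip(route, route[1:]))
    if sorted(served) != sorted(demand):  # every node exactly once (16)
        return None
    if any(depot_load[i] > q[i] for i in depots):  # depot capacity (18)
        return None
    return total


# Hypothetical instance: two depots, three demand nodes, Manhattan costs.
pos = {"A": (0, 0), "B": (10, 0), 1: (2, 0), 2: (2, 3), 3: (9, 0)}


def manhattan(a, b):
    (xa, ya), (xb, yb) = pos[a], pos[b]
    return abs(xa - xb) + abs(ya - yb)


depots = ["A", "B"]
demand = {1: 4, 2: 3, 3: 5}
q = {"A": 10, "B": 6}
routes = [["A", 1, 2, "A"], ["B", 3, "B"]]
print(evaluate_mdvrp(routes, depots, demand, Q=8, q=q, cost=manhattan))
```

With a vehicle capacity of 8 both routes are feasible. Lowering it to 6 makes the first route, which carries 7 units, infeasible, so the function returns `None`.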
Moreover, other alternative models have been proposed, namely seeking the development of efficient exact algorithms for small to medium-sized instances. The interested reader should refer to the vast existing literature, which includes Contardo and Martinelli (2014), Crevier et al. (2007), Hesam Sadati et al. (2021), Ramos et al. (2020), Salhi et al. (2014), Tu et al. (2014), and Zhen et al. (2020).

The MDVRP does not include fixed costs for using the depots. If such costs are relevant, then depot selection (i.e., location decisions) influences the total cost. We use the notation already presented for facility setup costs $f_i$ and for the location variables (depot selection) $y_i$ ($i \in I$). Additionally, we consider that a fixed cost $F$ is paid for every vehicle used. The new problem can be formulated mathematically as follows:

$$\text{minimize} \quad \sum_{i \in I} f_i y_i + \sum_{m \in M} \sum_{\ell \in V} \sum_{\ell' \in V} c_{\ell\ell'} v_{\ell\ell' m} + F \sum_{m \in M} \sum_{\ell \in I} \sum_{\ell' \in J} v_{\ell\ell' m}, \tag{25}$$
$$\text{subject to} \quad \text{(16), (17), (19)–(24),}$$
$$\sum_{j \in J} d_j x_{ij} \le q_i y_i, \quad i \in I, \tag{9}$$
$$y_i \in \{0, 1\}, \quad i \in I. \tag{7}$$

This model was introduced by Prins et al. (2006, 2007) and describes the well-known capacitated location-routing problem (CLRP) (Laporte et al., 1988). The interested reader can refer to Albareda-Sambola and Rodríguez-Pereira (2019) and Mara et al. (2021) for recent overviews on this family of problems. Variations and extensions of the CLRP keep attracting the attention of the scientific community. For instance, Allahyari et al. (2021) include a new risk index in the context of a routing problem involving security carriers for high-value shipment transportation.

An alternative perspective to what was just presented corresponds to having demand occurring along the arcs or edges of the underlying network. Popular applications include garbage collection and road maintenance.
This setting leads to the class of so-called capacitated arc-routing problems (CARP) and consequently to capacitated location-arc routing problems (CLARP). It is a well-known fact that one can transform a VRP into an ARP and vice versa (see, e.g., Baldacci and Maniezzo 2006, Golden and Wong 1981, Longo et al. 2006, and Pearn et al. 1987). Nevertheless, solving a CARP as a CVRP is often impractical. This explains why specialized models and algorithms have been proposed in the literature. The same analysis can be translated to the multi-depot versions of both problems and consequently to the problems in which the location of the depots is also a decision to make.

Again we consider a transportation network $G = (V, E)$ with $V = I \cup J$. $E$ contains all the arcs between depots and demand nodes as well as those between demand nodes only. $M$ is again a set of homogeneous vehicles, each with capacity $Q$. We denote by $d_{(\ell,\ell')}$ the demand in arc $(\ell, \ell') \in E$. Let $E_d$ be the set of arcs in $E$ that actually have demand; the other arcs can be used but no service is provided. We can consider a complete graph, $\tilde{G} = (\tilde{V}, \tilde{E})$, such that $\tilde{V}$ contains all nodes in $I$ as well as all nodes that correspond to either a start- or an end-node of an arc in $E_d$. The unit cost of an arc $(\ell, \ell') \in \tilde{E}$ is denoted by $\tilde{c}_{\ell\ell'}$. Such cost is equal to $c_{\ell\ell'}$ if $(\ell, \ell') \in E_d$ and it is equal to the smallest unit transportation cost between $\ell$ and $\ell'$ in $G = (V, E)$ otherwise. If both $\ell, \ell' \in I$, then we set $\tilde{c}_{\ell\ell'} = \infty$.

The allocation variables need to be redefined so that they refer to arcs: $x_{i(\ell,\ell')}$ is equal to 1 if and only if the demand in arc $(\ell, \ell') \in E_d$ is supplied from depot $i \in I$. Regarding the routing variables $v$, they are now defined only in $\tilde{E}$: $v_{\ell\ell' m}$ is equal to one if route $m$ uses arc $(\ell, \ell')$ and zero otherwise, $(\ell, \ell') \in \tilde{E}$.
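The construction of the costs $\tilde{c}_{\ell\ell'}$ on the complete graph $\tilde{G}$ can be sketched as follows. This is one possible reading of the rules stated above, not code from the article: shortest-path costs in $G$ are obtained with the Floyd–Warshall algorithm (any all-pairs shortest-path routine would do), and the function name `build_complete_costs` and the small network are hypothetical.

```python
import math


def build_complete_costs(nodes, arc_cost, demand_arcs, depots):
    """Return a dict {(l, l2): cost} over the complete graph on the node set
    made of the depots plus the end-nodes of the demand arcs E_d."""
    # All-pairs shortest unit costs in G = (V, E) via Floyd-Warshall.
    dist = {(a, b): 0 if a == b else arc_cost.get((a, b), math.inf)
            for a in nodes for b in nodes}
    for k in nodes:
        for a in nodes:
            for b in nodes:
                if dist[a, k] + dist[k, b] < dist[a, b]:
                    dist[a, b] = dist[a, k] + dist[k, b]

    depot_set = set(depots)
    kept = depot_set | {node for arc in demand_arcs for node in arc}
    tilde_c = {}
    for a in kept:
        for b in kept:
            if a == b:
                continue
            if a in depot_set and b in depot_set:
                tilde_c[a, b] = math.inf          # depot-to-depot arcs are forbidden
            elif (a, b) in demand_arcs:
                tilde_c[a, b] = arc_cost[a, b]    # serving the arc costs c itself
            else:
                tilde_c[a, b] = dist[a, b]        # deadheading along a cheapest path
    return tilde_c


# Hypothetical network: depot "D", nodes 1-3, a single demand arc (1, 2).
nodes = ["D", 1, 2, 3]
arc_cost = {("D", 1): 4, (1, 2): 10, (1, 3): 1, (3, 2): 2, (2, "D"): 3}
tilde_c = build_complete_costs(nodes, arc_cost, {(1, 2)}, ["D"])
print(tilde_c[1, 2], tilde_c["D", 2], tilde_c[1, "D"])
```

In this example node 3 carries no demand arc, so it disappears from $\tilde{V}$ and survives only through the shortest-path costs, while the demand arc $(1, 2)$ keeps its own cost of 10 even though a cheaper path from 1 to 2 exists through node 3.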
The multi-depot CARP can be formulated mathematically as follows:

$$\text{minimize} \quad \sum_{m \in M} \sum_{(\ell,\ell') \in \tilde{E}} \tilde{c}_{\ell\ell'} v_{\ell\ell' m}, \tag{26}$$
$$\text{subject to} \quad \sum_{m \in M} v_{\ell\ell' m} = 1, \quad (\ell, \ell') \in E_d, \tag{27}$$
$$\sum_{(\ell,\ell') \in E_d} d_{(\ell,\ell')} v_{\ell\ell' m} \le Q, \quad m \in M, \tag{28}$$
$$\sum_{(\ell,\ell') \in E_d} d_{(\ell,\ell')} x_{i(\ell,\ell')} \le q_i, \quad i \in I, \tag{29}$$
$$\sum_{\ell' \in \tilde{V}} v_{\ell\ell' m} - \sum_{\ell' \in \tilde{V}} v_{\ell'\ell m} = 0, \quad m \in M,\ \ell \in \tilde{V}, \tag{30}$$
$$\sum_{\ell \in I} \sum_{\ell' \in \tilde{V}} v_{\ell\ell' m} \le 1, \quad m \in M, \tag{31}$$
$$\sum_{m \in M} \sum_{\ell \in S} \sum_{\ell' \in \tilde{V} \setminus S} v_{\ell\ell' m} \ge 1, \quad I \subseteq S \subset \tilde{V}, \tag{32}$$
$$\sum_{\ell'' \in \tilde{V} \setminus I} v_{i\ell'' m} + v_{\ell\ell' m} \le 1 + x_{i(\ell,\ell')}, \quad i \in I,\ (\ell, \ell') \in E_d,\ m \in M, \tag{33}$$
$$x_{i(\ell,\ell')} \in \{0, 1\}, \quad i \in I,\ (\ell, \ell') \in E_d, \tag{34}$$
$$v_{\ell\ell' m} \in \{0, 1\}, \quad (\ell, \ell') \in \tilde{E},\ m \in M. \tag{35}$$

We describe neither the objective function nor the constraints since their meaning is straightforward after the discussion presented for the MDVRP. As before, by including costs associated with the origins to operate we are led to a mathematical model for the CLARP (see Borges Lopes et al. 2014):

$$\text{minimize} \quad \sum_{i \in I} f_i y_i + \sum_{m \in M} \sum_{(\ell,\ell') \in \tilde{E}} \tilde{c}_{\ell\ell'} v_{\ell\ell' m} + F \sum_{m \in M} \sum_{\ell \in I} \sum_{\ell' \in J :\, (\ell,\ell') \in \tilde{E}} v_{\ell\ell' m}, \tag{36}$$
$$\text{subject to} \quad \text{(27), (28), (30)–(35),}$$
$$\sum_{(\ell,\ell') \in E_d} d_{(\ell,\ell')} x_{i(\ell,\ell')} \le q_i y_i, \quad i \in I, \tag{37}$$
$$y_i \in \{0, 1\}, \quad i \in I. \tag{7}$$

When it comes to combining facility location and routing decisions, other variants have been studied, such as the location-or-routing problem introduced by Arslan (2021), in which a demand node can be supplied as part of a route in case it is within a certain coverage radius of the facility/depot where the route originates. Otherwise, the customer is brought to a facility and thus it is served directly according to a location–allocation scheme.

2.3. Hub location

When a commodity is to be shipped across many origin–destination pairs we often look for designing a network that allows savings in transportation costs in addition to improving the overall shipment efficiency. This calls for the location of so-called hubs, which are nodes acting as transshipment points that allow consolidating and redistributing the flow.

Consider a set of nodes $V$ such that some flow $d_{ij}$ is to be sent from $i$ to $j$, with $i, j \in V$. This flow is usually referred to as "demand" and thus we keep denoting it using notation $d$. Let $K \subseteq V$ be the set of nodes that can be selected as hubs, i.e., we assume that the potential hub locations correspond to nodes that also originate or are the destination of some flow and thus they are also included in $V$. The decisions to make include the network design (hubs and hub edges to establish) and the flow routing. If a hub is set up at node $k$, then a capacity $q_k$ is associated with it. Regarding the costs, we define $f_k$ as a fixed cost for setting up a hub at node $k \in K$ and $g_{k\ell}$ as the cost for setting up an edge between hubs $k$ and $\ell$ ($k, \ell \in K$, $k < \ell$). In hub location problems, all flow is usually charged in all path segments used from the origin to the destination. Nevertheless, the inter-hub flow is often discounted. This is due, for instance, to economies of scale, thus encouraging larger amounts to be shipped in hub edges. Hence, we define $c_{\ell\ell'}$ as the unit cost for the flow sent directly from $\ell$ to $\ell'$ ($\ell, \ell' \in V$). Additionally, we consider a discount factor, $\alpha \in [0, 1]$, for the inter-hub flow cost. When needed, we use $O_i$ ($D_i$) to represent all the flow originated at (destined to) $i \in V$.

In a single-allocation hub location problem each node is allocated to one and only one hub. The problem we are revisiting, which includes hub network design decisions, can be formulated using three sets of decision variables: (i) node-allocation to hubs, (ii) inter-hub edges and (iii) flows.
Accordingly, we keep using binary allocation variables $x_{ik}$ equal to one if and only if node $i \in V$ is allocated to hub $k \in K$ (so that $x_{kk} = 1$ indicates that a hub is set up at node $k$). For $k, \ell \in K$ with $k < \ell$, $z_{k\ell}$ is a binary variable indicating whether hub edge $\{k, \ell\}$ is installed. Finally, for every $i \in V$ and $k, \ell \in K$, we consider flow variables $u_{ik\ell}$ indicating the amount of flow originated at node $i$ that is routed via hubs $k$ and $\ell$ in this sequence (thus using the corresponding hub edge). The problem can be formulated mathematically as follows:

\begin{align}
\text{minimize} \quad & \sum_{k \in K} f_k x_{kk} + \sum_{k \in K} \sum_{\ell \in K,\, k < \ell} g_{k\ell} z_{k\ell} + \sum_{i \in V} \Big( O_i \sum_{k \in K} c_{ik} x_{ik} \Big) \notag \\
& \quad + \alpha \sum_{i \in V} \sum_{k \in K} \sum_{\ell \in K} c_{k\ell} u_{ik\ell} + \sum_{i \in V} \Big( D_i \sum_{k \in K} c_{ki} x_{ik} \Big), \tag{38} \\
\text{subject to} \quad & \sum_{k \in K} x_{ik} = 1, && i \in V, \tag{39} \\
& \sum_{i \in V} O_i x_{ik} + \sum_{i \in V} \sum_{\ell \in K} u_{i\ell k} \le q_k x_{kk}, && k \in K, \tag{40} \\
& \sum_{\ell \in K,\, \ell \ne k} u_{ik\ell} - \sum_{\ell \in K,\, \ell \ne k} u_{i\ell k} = O_i x_{ik} - \sum_{j \in V} d_{ij} x_{jk}, && i \in V, \ k \in K, \tag{41} \\
& z_{k\ell} \le x_{kk}, && k, \ell \in K, \ k < \ell, \tag{42} \\
& z_{k\ell} \le x_{\ell\ell}, && k, \ell \in K, \ k < \ell, \tag{43} \\
& u_{ik\ell} + u_{i\ell k} \le O_i z_{k\ell}, && i \in V, \ k, \ell \in K, \ k < \ell, \tag{44} \\
& x_{ik} \in \{0,1\}, && i \in V, \ k \in K, \tag{45} \\
& u_{ik\ell} \ge 0, && i \in V, \ k, \ell \in K, \tag{46} \\
& z_{k\ell} \in \{0,1\}, && k, \ell \in K, \ k < \ell. \tag{47}
\end{align}

The objective function represents the total cost for designing the network (hubs and hub edges) plus the total transportation cost. Constraints (39) ensure that all nodes are allocated to some hub (to themselves in the case of hubs). Constraints (40) refer to the capacity at the hubs. Note that these constraints also ensure that flow can only be sent to hubs that have actually been set up. Equalities (41) are the usual flow-divergence constraints ensuring a consistent flow routing through the network. Constraints (42) and (43) guarantee that if a hub edge is set up then both of its endpoints are hubs. Constraints (44) impose that the inter-hub flow is routed exclusively using the installed hub edges. Finally, (45)–(47) state the domain of the decision variables.
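To make the role of each term and constraint concrete, the following is a minimal Python sketch that evaluates objective (38) and checks constraint families (39)–(44) for a candidate solution. The three-node instance, the function names `objective_38` and `violated_families`, and the sparse dictionary encoding of the variables are illustrative assumptions, not part of the original model description; the sketch only verifies a given solution and does not solve the problem.

```python
from itertools import product


def objective_38(V, K, f, g, c, alpha, O, D, x, z, u):
    """Evaluate objective (38) term by term for a given solution (x, z, u)."""
    hub_setup = sum(f[k] * x[k, k] for k in K)
    edge_setup = sum(g[k, l] * z[k, l] for k, l in product(K, K) if k < l)
    collection = sum(O[i] * c[i][k] * x[i, k] for i, k in product(V, K))
    transfer = alpha * sum(c[k][l] * flow for (i, k, l), flow in u.items())
    distribution = sum(D[i] * c[k][i] * x[i, k] for i, k in product(V, K))
    return hub_setup + edge_setup + collection + transfer + distribution


def violated_families(V, K, d, q, O, x, z, u, tol=1e-9):
    """Return the numbers of the constraint families (39)-(44) that fail."""
    bad = set()
    for i in V:  # (39): every node is allocated to exactly one hub
        if abs(sum(x[i, k] for k in K) - 1) > tol:
            bad.add(39)
    for k in K:  # (40): hub capacity, only usable if the hub is open
        load = sum(O[i] * x[i, k] for i in V)
        load += sum(u.get((i, l, k), 0) for i, l in product(V, K))
        if load > q[k] * x[k, k] + tol:
            bad.add(40)
    for i, k in product(V, K):  # (41): flow divergence at hub k for origin i
        out_flow = sum(u.get((i, k, l), 0) for l in K if l != k)
        in_flow = sum(u.get((i, l, k), 0) for l in K if l != k)
        rhs = O[i] * x[i, k] - sum(d[i][j] * x[j, k] for j in V)
        if abs(out_flow - in_flow - rhs) > tol:
            bad.add(41)
    for k, l in product(K, K):
        if k < l:
            if z[k, l] > x[k, k]:  # (42)
                bad.add(42)
            if z[k, l] > x[l, l]:  # (43)
                bad.add(43)
            for i in V:  # (44): inter-hub flow needs an installed hub edge
                used = u.get((i, k, l), 0) + u.get((i, l, k), 0)
                if used > O[i] * z[k, l] + tol:
                    bad.add(44)
    return sorted(bad)


# Hypothetical toy instance (all numbers are made up for illustration).
V = [0, 1, 2]
K = [0, 1]
d = [[0, 2, 1], [1, 0, 3], [2, 1, 0]]      # d[i][j]: flow from i to j
c = [[0, 4, 3], [4, 0, 2], [3, 2, 0]]      # c[l][l2]: unit cost
f = {0: 10, 1: 12}
g = {(0, 1): 5}
q = {0: 20, 1: 20}
alpha = 0.5
O = [sum(row) for row in d]                # flow originated at each node
D = [sum(d[i][j] for i in V) for j in V]   # flow destined to each node

alloc = {0: 0, 1: 1, 2: 1}                 # node -> hub (single allocation)
x = {(i, k): int(alloc[i] == k) for i, k in product(V, K)}
z = {(0, 1): 1}
u = {(0, 0, 1): 3, (1, 1, 0): 1, (2, 1, 0): 2}

total = objective_38(V, K, f, g, c, alpha, O, D, x, z, u)
failed = violated_families(V, K, d, q, O, x, z, u)
print(total, failed)
```

Closing the hub edge in this instance (setting `z[(0, 1)]` to zero) while keeping the same flows makes the checker report family (44), which illustrates how those constraints tie inter-hub flow to installed hub edges.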
The multiple allocation version of the above problem allows a node to interact with more than one hub. For this reason, we need additional information to track the flow through the network. We keep using variables $z$ and $u$. However, we now redefine $x_{ik}$ as the amount of flow originated at node $i$ that is consolidated at hub $k$ and redistributed from the latter ($i \in V$, $k \in K$). Additionally, we need to consider binary hub-location variables, which we denote by $y_k$ according to the terminology we have been using, indicating whether node $k$ is selected as a hub ($k \in K$). Finally, we consider flow distribution variables $u'_{i\ell j}$ that, for $i, j \in V$ and $\ell \in K$, represent the amount of flow originated at node $i$ that reaches node $j$ from hub $\ell$. The multiple allocation hub location problem with hub network design decisions can be formulated mathematically as follows:

\begin{align}
\text{minimize} \quad & \sum_{k \in K} f_k y_k + \sum_{k \in K} \sum_{\ell \in K,\, k < \ell} g_{k\ell} z_{k\ell} + \sum_{i \in V} \sum_{k \in K} c_{ik} x_{ik} \notag \\
& \quad + \alpha \sum_{i \in V} \sum_{k \in K} \sum_{\ell \in K} c_{k\ell} u_{ik\ell} + \sum_{i \in V} \sum_{\ell \in K} \sum_{j \in V} c_{\ell j} u'_{i\ell j}, \tag{48} \\
\text{subject to} \quad & \sum_{k \in K} x_{ik} = O_i, && i \in V, \tag{49} \\
& \sum_{\ell \in K} u'_{i\ell j} = d_{ij}, && i, j \in V, \tag{50} \\
& \sum_{i \in V} x_{ik} + \sum_{i \in V} \sum_{\ell \in K} u_{i\ell k} \le q_k y_k, && k \in K, \tag{51} \\
& \sum_{\ell \in K,\, \ell \ne k} u_{ik\ell} - \sum_{\ell \in K,\, \ell \ne k} u_{i\ell k} = x_{ik} - \sum_{j \in V} u'_{ikj}, && i \in V, \ k \in K, \tag{52} \\
& x_{ik} \le O_i (1 - y_i), && i, k \in K, \ i \ne k, \tag{53} \\
& u'_{i\ell j} \le D_j (1 - y_j), && j, \ell \in K, \ j \ne \ell, \tag{54} \\
& z_{k\ell} \le y_k, && k, \ell \in K, \ k < \ell, \tag{55} \\
& z_{k\ell} \le y_\ell, && k, \ell \in K, \ k < \ell, \tag{56} \\
& u_{ik\ell} + u_{i\ell k} \le O_i z_{k\ell}, && i \in V, \ k, \ell \in K, \ k < \ell, \tag{57} \\
& y_k \in \{0,1\}, && k \in K, \tag{58} \\
& z_{k\ell} \in \{0,1\}, && k, \ell \in K, \ k < \ell, \tag{59} \\
& x_{ik} \ge 0, && i \in V, \ k \in K, \tag{60} \\
& u_{ik\ell} \ge 0, && i \in V, \ k, \ell \in K, \tag{61} \\
& u'_{i\ell j} \ge 0, && i, j \in V, \ \ell \in K. \tag{62}
\end{align}

Again, the objective function accounts for the total cost (network design and transportation). Constraints (49) ensure that all flow enters the hub network. Equalities (50) impose that, for all pairs of nodes $(i, j)$, all the flow from the origin to the destination arrives at the latter.
Inequalities (51) refer to the capacity of the open hubs. These constraints also ensure that flow can only be routed through hubs that have been set up. Constraints (52) ensure the correct flow routing through the network. These constraints together with (51) guarantee that $\sum_{i \in V} u'_{i\ell j} \le D_j y_\ell$, $j \in V$, $\ell \in K$, i.e., the consistency between positive flows $u'$ and the operationality of the hubs distributing that flow is ensured. Inequalities (53) ensure that if a node, say $i$, is a hub, then it cannot send its outbound flow to another hub, i.e., all hubs process their outbound flow and thus all the flow routed through the hub network is discounted in terms of cost. The following inequalities, (54), guarantee that all the flow destined to a hub is distributed via that hub, i.e., all hubs distribute their inbound flow. Constraints (55) and (56) impose that the endpoints of a hub edge are hubs that have been set up, whereas Constraints (57) state that all the flow shipped between hubs is routed via hub edges. Finally, (58)–(62) state the domain of the decision variables.

In the above models it is assumed that all potential hubs originate and are the destination of flow to and from other nodes. Nevertheless, it is straightforward to adapt the models if this is not the case.

The above models consider hub network design decisions, which again is a relevant feature in many practical applications, namely in Logistics. Nevertheless, much literature on hub location still assumes a fully-interconnected hub network or some specific topology for the hub network, which depends on the specific application (in transportation and telecommunications these assumptions often make sense). Full inter-hub connectivity allows simplifying the above models, an exercise that we do not present here. The interested reader can refer to Contreras (2021) and Contreras and O’Kelly (2019) for those details.
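As a rough illustration of what multiple allocation buys under full inter-hub connectivity, the sketch below enumerates hub sets on a toy instance and routes each origin–destination pair along its cheapest path of the form origin, first hub, second hub, destination, with the inter-hub leg discounted by the factor alpha. This is a deliberately simplified setting and not the model (48)–(62): it assumes uncapacitated hubs, free and complete hub edges (no hub-edge setup costs), and it does not enforce (53)–(54). The data and the helper names `route_cost`, `total_cost` and `best_hub_set` are hypothetical.

```python
from itertools import combinations, product


def route_cost(i, j, hubs, c, alpha):
    """Cheapest unit cost i -> k -> l -> j over open hubs (k may equal l)."""
    return min(c[i][k] + alpha * c[k][l] + c[l][j]
               for k, l in product(hubs, hubs))


def total_cost(hubs, d, c, f, alpha):
    """Hub setup cost plus transportation cost with per-pair cheapest routing."""
    n = len(d)
    setup = sum(f[k] for k in hubs)
    transport = sum(d[i][j] * route_cost(i, j, hubs, c, alpha)
                    for i in range(n) for j in range(n) if d[i][j] > 0)
    return setup + transport


def best_hub_set(K, d, c, f, alpha):
    """Brute-force search over all non-empty subsets of candidate hubs."""
    candidates = (h for r in range(1, len(K) + 1) for h in combinations(K, r))
    return min(candidates, key=lambda h: total_cost(h, d, c, f, alpha))


# Hypothetical toy instance (all numbers are made up for illustration).
K = [0, 1]                                 # candidate hubs
d = [[0, 2, 1], [1, 0, 3], [2, 1, 0]]      # d[i][j]: flow from i to j
c = [[0, 4, 3], [4, 0, 2], [3, 2, 0]]      # c[l][l2]: unit cost
f = {0: 10, 1: 12}                         # hub setup costs
alpha = 0.5                                # inter-hub discount factor

best = best_hub_set(K, d, c, f, alpha)
print(best, total_cost(best, d, c, f, alpha))
```

On this toy instance both candidate hubs are opened, and node 2 exchanges flow with node 0 through hub 0 while exchanging flow with node 1 through hub 1, which a single-allocation solution would not permit. The enumeration is exponential in the number of candidate hubs, so it is only meant for very small examples.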
The objective function of the above hub location problems is a cost-minimization one. Recently, in the context of distribution logistics, some authors have investigated profit-maximization hub location problems. This is the case in Alibeyg et al. (2016), Taherkhani and Alumur (2019), and Taherkhani et al. (2020). Nevertheless, this research stream is still largely unexplored.

2.4. Hub location-routing

In the hub location problems whose optimization models were just discussed, the nodes are served directly from the hubs. In practice, as for the CLRP and CLARP, we may face a setting calling for the nodes to be served as part of routes, each of which has its origin and destination at a hub. To simplify the exposition, we directly extend what we have discussed for the CLRP. Again we